Creating a prediction function for kmeans in R

I want to create a predict function that predicts which cluster an observation belongs to.

data(iris)
mydata=iris
m=mydata[1:4]
train=head(m,100)
xNew=head(m,10)

rownames(train)<-1:nrow(train)

norm_eucl=function(train)
  train/apply(train,1,function(x)sum(x^2)^.5)
m_norm=norm_eucl(train)

result=kmeans(m_norm,3,30)

predict.kmean <- function(cluster, newdata)
{
  simMat <- m_norm(rbind(cluster, newdata),
              sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
  unname(apply(simMat, 2, which.max))
}

## assign new data samples to exemplars
predict.kmean(m_norm, x[result$cluster, ], xNew)

Then I get the error:

Error in predict.kmean(m_norm, x[result$cluster, ], xNew) : 
  unused argument (xNew)

I understand that I am doing something wrong in the function, since I'm just learning how to write one, but I can't work out where exactly.

In fact, I want to adapt a similar function written for apcluster (I have seen a similar topic, but for apcluster):

predict.apcluster <- function(s, exemplars, newdata)
{
  simMat <- s(rbind(exemplars, newdata),
              sel=(1:nrow(newdata)) + nrow(exemplars))[1:nrow(exemplars), ]
  unname(apply(simMat, 2, which.max))
}

## assign new data samples to exemplars
predict.apcluster(negDistMat(r=2), x[apres@exemplars, ], xNew)

How can I do this for kmeans?

asked Nov 17 at 15:00

d-max

728

r k-means

1 Answer
accepted

Rather than trying to replicate something, let's come up with our own function. For a given vector x, we want to assign a cluster using some prior k-means output. Given how the k-means algorithm works, what we want is to find which cluster's center is closest to x. That can be done as

predict.kmeans <- function(x, newdata)
  apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))

That is, we go over newdata row by row, compute each row's squared Euclidean distance to every center, and pick the index of the smallest one. Then, e.g.,

head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3 
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE

which confirms that our prediction function assigns the training observations to the same clusters that kmeans did. Then also

predict(result, xNew / sqrt(rowSums(xNew^2)))
#  1  2  3  4  5  6  7  8  9 10 
#  2  2  2  2  2  2  2  2  2  2
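If you want to rerun the steps above from a clean session, here is a minimal self-contained sketch. The seed and the helper name normalize_rows are my own additions for illustration, not part of the original code. Since kmeans starts from random centers, the cluster labels you get may be permuted relative to the output shown above.

```r
set.seed(42)  # assumption: arbitrary seed, only for reproducibility

# Hypothetical helper (not in the original post): scale each row to unit length
normalize_rows <- function(d) d / sqrt(rowSums(d^2))

m     <- iris[1:4]
train <- head(m, 100)
xNew  <- head(m, 10)

train_norm <- normalize_rows(train)
result <- kmeans(train_norm, centers = 3, iter.max = 30)

# Method from the answer: index of the nearest center for each row of newdata
predict.kmeans <- function(x, newdata)
  apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))

pred_new <- predict(result, normalize_rows(xNew))

# The apply-based normalization from the question yields the same values
same_norm <- all.equal(train_norm,
                       train / apply(train, 1, function(x) sum(x^2)^.5))
```

Note that new observations have to go through the same normalization as the training data before they are compared with the centers, otherwise the distances are not on the same scale.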

Notice also that I'm simply calling predict rather than predict.kmeans. That is because result is of class kmeans, so the right method is chosen automatically. Also notice how I normalize the data in a vectorized manner, without using apply.
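To make the dispatch point concrete, and to connect it to the error in the question, here is a small sketch. The names fit, two_args, via_generic, via_direct and msg are hypothetical, chosen only for this illustration. As far as I can tell, two things went wrong in the original attempt: the method was named predict.kmean, which does not match the class name kmeans, and it was declared with two parameters but called with three arguments, which is what triggers "unused argument".

```r
set.seed(1)  # assumption: arbitrary seed
fit <- kmeans(iris[1:4], centers = 3)
class(fit)   # "kmeans"

# An S3 method must be named <generic>.<class>, here predict.kmeans
predict.kmeans <- function(x, newdata)
  apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))

via_generic <- predict(fit, iris[1:5, 1:4])         # dispatches on class(fit)
via_direct  <- predict.kmeans(fit, iris[1:5, 1:4])  # same function, called directly

# A function with two parameters, called with three arguments,
# fails with an "unused argument" error like the one in the question
two_args <- function(cluster, newdata) NULL
msg <- tryCatch(two_args(1, 2, 3), error = function(e) conditionMessage(e))
```

Both calls return the same assignments. The generic is simply the more convenient entry point once the method name matches the class.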

answered Nov 17 at 16:00

Julius Vainora

27k75877

I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49


