Creation prediction function for kmean in R
up vote
1
down vote
favorite
I want create predict function which predicts for which cluster, observation belong
data(iris)
mydata=iris
m=mydata[1:4]
train=head(m,100)
xNew=head(m,10)
rownames(train)<-1:nrow(train)
norm_eucl=function(train)
train/apply(train,1,function(x)sum(x^2)^.5)
m_norm=norm_eucl(train)
result=kmeans(m_norm,3,30)
predict.kmean <- function(cluster, newdata)
{
simMat <- m_norm(rbind(cluster, newdata),
sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.kmean(m_norm, x[result$cluster, ], xNew)
After i get the error
Error in predict.kmean(m_norm, x[result$cluster, ], xNew) :
unused argument (xNew)
i understand that i am making something wrong function, cause I'm just learning to do it, but I can't understand where exactly.
indeed i want adopt similar function of apcluster ( i had seen similar topic, but for apcluster)
predict.apcluster <- function(s, exemplars, newdata)
{
simMat <- s(rbind(exemplars, newdata),
sel=(1:nrow(newdata)) + nrow(exemplars))[1:nrow(exemplars), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.apcluster(negDistMat(r=2), x[apres@exemplars, ], xNew)
how to do it?
r k-means
add a comment |
up vote
1
down vote
favorite
I want create predict function which predicts for which cluster, observation belong
data(iris)
mydata=iris
m=mydata[1:4]
train=head(m,100)
xNew=head(m,10)
rownames(train)<-1:nrow(train)
norm_eucl=function(train)
train/apply(train,1,function(x)sum(x^2)^.5)
m_norm=norm_eucl(train)
result=kmeans(m_norm,3,30)
predict.kmean <- function(cluster, newdata)
{
simMat <- m_norm(rbind(cluster, newdata),
sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.kmean(m_norm, x[result$cluster, ], xNew)
After i get the error
Error in predict.kmean(m_norm, x[result$cluster, ], xNew) :
unused argument (xNew)
i understand that i am making something wrong function, cause I'm just learning to do it, but I can't understand where exactly.
indeed i want adopt similar function of apcluster ( i had seen similar topic, but for apcluster)
predict.apcluster <- function(s, exemplars, newdata)
{
simMat <- s(rbind(exemplars, newdata),
sel=(1:nrow(newdata)) + nrow(exemplars))[1:nrow(exemplars), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.apcluster(negDistMat(r=2), x[apres@exemplars, ], xNew)
how to do it?
r k-means
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I want create predict function which predicts for which cluster, observation belong
data(iris)
mydata=iris
m=mydata[1:4]
train=head(m,100)
xNew=head(m,10)
rownames(train)<-1:nrow(train)
norm_eucl=function(train)
train/apply(train,1,function(x)sum(x^2)^.5)
m_norm=norm_eucl(train)
result=kmeans(m_norm,3,30)
predict.kmean <- function(cluster, newdata)
{
simMat <- m_norm(rbind(cluster, newdata),
sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.kmean(m_norm, x[result$cluster, ], xNew)
After i get the error
Error in predict.kmean(m_norm, x[result$cluster, ], xNew) :
unused argument (xNew)
i understand that i am making something wrong function, cause I'm just learning to do it, but I can't understand where exactly.
indeed i want adopt similar function of apcluster ( i had seen similar topic, but for apcluster)
predict.apcluster <- function(s, exemplars, newdata)
{
simMat <- s(rbind(exemplars, newdata),
sel=(1:nrow(newdata)) + nrow(exemplars))[1:nrow(exemplars), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.apcluster(negDistMat(r=2), x[apres@exemplars, ], xNew)
how to do it?
r k-means
I want create predict function which predicts for which cluster, observation belong
data(iris)
mydata=iris
m=mydata[1:4]
train=head(m,100)
xNew=head(m,10)
rownames(train)<-1:nrow(train)
norm_eucl=function(train)
train/apply(train,1,function(x)sum(x^2)^.5)
m_norm=norm_eucl(train)
result=kmeans(m_norm,3,30)
predict.kmean <- function(cluster, newdata)
{
simMat <- m_norm(rbind(cluster, newdata),
sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.kmean(m_norm, x[result$cluster, ], xNew)
After i get the error
Error in predict.kmean(m_norm, x[result$cluster, ], xNew) :
unused argument (xNew)
i understand that i am making something wrong function, cause I'm just learning to do it, but I can't understand where exactly.
indeed i want adopt similar function of apcluster ( i had seen similar topic, but for apcluster)
predict.apcluster <- function(s, exemplars, newdata)
{
simMat <- s(rbind(exemplars, newdata),
sel=(1:nrow(newdata)) + nrow(exemplars))[1:nrow(exemplars), ]
unname(apply(simMat, 2, which.max))
}
## assign new data samples to exemplars
predict.apcluster(negDistMat(r=2), x[apres@exemplars, ], xNew)
how to do it?
r k-means
r k-means
asked Nov 17 at 15:00
d-max
728
728
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
up vote
2
down vote
accepted
Rather than trying to replicate something, let's come up with our own function. For a given vector x
, we want to assign a cluster using some prior k-means output. Given how k-means algorithm works, what we want is to find which cluster's center is closest to x
. That can be done as
predict.kmeans <- function(x, newdata)
apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))
That is, we go over newdata
row by row and compute the corresponding row's distance to each of the centers and find the minimal one. Then, e.g.,
head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE
which confirms that our predicting function assigned all the same clusters to the training observations. Then also
predict(result, xNew / sqrt(rowSums(xNew^2)))
# 1 2 3 4 5 6 7 8 9 10
# 2 2 2 2 2 2 2 2 2 2
Notice also that I'm calling simply predict
rather than predict.kmeans
. That is because result
is of class kmeans
and a right method is automatically chosen. Also notice how I normalize the data in a vectorized manner, without using apply
.
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
accepted
Rather than trying to replicate something, let's come up with our own function. For a given vector x
, we want to assign a cluster using some prior k-means output. Given how k-means algorithm works, what we want is to find which cluster's center is closest to x
. That can be done as
predict.kmeans <- function(x, newdata)
apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))
That is, we go over newdata
row by row and compute the corresponding row's distance to each of the centers and find the minimal one. Then, e.g.,
head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE
which confirms that our predicting function assigned all the same clusters to the training observations. Then also
predict(result, xNew / sqrt(rowSums(xNew^2)))
# 1 2 3 4 5 6 7 8 9 10
# 2 2 2 2 2 2 2 2 2 2
Notice also that I'm calling simply predict
rather than predict.kmeans
. That is because result
is of class kmeans
and a right method is automatically chosen. Also notice how I normalize the data in a vectorized manner, without using apply
.
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
add a comment |
up vote
2
down vote
accepted
Rather than trying to replicate something, let's come up with our own function. For a given vector x
, we want to assign a cluster using some prior k-means output. Given how k-means algorithm works, what we want is to find which cluster's center is closest to x
. That can be done as
predict.kmeans <- function(x, newdata)
apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))
That is, we go over newdata
row by row and compute the corresponding row's distance to each of the centers and find the minimal one. Then, e.g.,
head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE
which confirms that our predicting function assigned all the same clusters to the training observations. Then also
predict(result, xNew / sqrt(rowSums(xNew^2)))
# 1 2 3 4 5 6 7 8 9 10
# 2 2 2 2 2 2 2 2 2 2
Notice also that I'm calling simply predict
rather than predict.kmeans
. That is because result
is of class kmeans
and a right method is automatically chosen. Also notice how I normalize the data in a vectorized manner, without using apply
.
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
add a comment |
up vote
2
down vote
accepted
up vote
2
down vote
accepted
Rather than trying to replicate something, let's come up with our own function. For a given vector x
, we want to assign a cluster using some prior k-means output. Given how k-means algorithm works, what we want is to find which cluster's center is closest to x
. That can be done as
predict.kmeans <- function(x, newdata)
apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))
That is, we go over newdata
row by row and compute the corresponding row's distance to each of the centers and find the minimal one. Then, e.g.,
head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE
which confirms that our predicting function assigned all the same clusters to the training observations. Then also
predict(result, xNew / sqrt(rowSums(xNew^2)))
# 1 2 3 4 5 6 7 8 9 10
# 2 2 2 2 2 2 2 2 2 2
Notice also that I'm calling simply predict
rather than predict.kmeans
. That is because result
is of class kmeans
and a right method is automatically chosen. Also notice how I normalize the data in a vectorized manner, without using apply
.
Rather than trying to replicate something, let's come up with our own function. For a given vector x
, we want to assign a cluster using some prior k-means output. Given how k-means algorithm works, what we want is to find which cluster's center is closest to x
. That can be done as
predict.kmeans <- function(x, newdata)
apply(newdata, 1, function(r) which.min(colSums((t(x$centers) - r)^2)))
That is, we go over newdata
row by row and compute the corresponding row's distance to each of the centers and find the minimal one. Then, e.g.,
head(predict(result, train / sqrt(rowSums(train^2))), 3)
# 1 2 3
# 2 2 2
all.equal(predict(result, train / sqrt(rowSums(train^2))), result$cluster)
# [1] TRUE
which confirms that our predicting function assigned all the same clusters to the training observations. Then also
predict(result, xNew / sqrt(rowSums(xNew^2)))
# 1 2 3 4 5 6 7 8 9 10
# 2 2 2 2 2 2 2 2 2 2
Notice also that I'm calling simply predict
rather than predict.kmeans
. That is because result
is of class kmeans
and a right method is automatically chosen. Also notice how I normalize the data in a vectorized manner, without using apply
.
answered Nov 17 at 16:00
Julius Vainora
27k75877
27k75877
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
add a comment |
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
I am ashamed to ask you to help, because you have already helped me two times. But can you help in this topic? stackoverflow.com/questions/53359595/…
– d-max
Nov 18 at 9:49
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53352409%2fcreation-prediction-function-for-kmean-in-r%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown