Type: | Package |
Title: | A Modern K-Means (MKMeans) Clustering Algorithm |
Version: | 3.2 |
Date: | 2025-08-20 |
Depends: | methods, MASS |
Description: | A modern K-means clustering algorithm that works for data of any number of dimensions, places no limit on the number of clusters expected, and can start from any initial cluster centers. |
Collate: | AllClasses.R MKMeans.R C.f.R Dist.R |
License: | GPL-2 |
NeedsCompilation: | no |
Packaged: | 2025-08-20 13:52:24 UTC; Yarong |
Author: | Yarong Yang [aut, cre], Nader Ebrahimi [ctb], Yoram Rubin [ctb], Jacob Zhang [ctb] |
Maintainer: | Yarong Yang <Yi.YA_yaya@hotmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-08-20 14:40:07 UTC |
Modern K-Means (MKMeans) Clustering.
Description
A modern K-means clustering algorithm that works for data of any number of dimensions, places no limit on the number of clusters expected, and can start from any initial cluster centers.
Details
Package: | MKMeans |
Type: | Package |
Version: | 3.2 |
Date: | 2025-08-20 |
License: | GPL-2 |
Author(s)
Yarong Yang, Nader Ebrahimi, Yoram Rubin, and Jacob Zhang
References
Yarong Yang, Nader Ebrahimi, Yoram Rubin, and Jacob Zhang (2025). MKMeans: A Modern K-Means Clustering Algorithm. Technical report.
Examples
# Example 1:
# Generate 20 bivariate samples
x<-rnorm(20,0,1)
y<-rnorm(20,1,1)
data.test<-cbind(x,y)
# Conduct MKMeans analysis with K=3 and taking the first 3 samples as initial cluster centers
Res<-MKMeans(data.test,3,1,iteration=1000,tol=.95,type=1)
# Label each observation index with its cluster's color
Ress<-Res
names(Ress@Classes[[1]])<-rep("red",length(Res@Classes[[1]]))
names(Ress@Classes[[2]])<-rep("blue",length(Res@Classes[[2]]))
names(Ress@Classes[[3]])<-rep("green",length(Res@Classes[[3]]))
# Sort by observation index so the colors line up with the rows of data.test
Cols<-names(sort(c(Ress@Classes[[1]],Ress@Classes[[2]],Ress@Classes[[3]])))
plot(x,y,type="p",col=Cols,lwd=2)
points(Res@Centers,pch=15,col=c("red","blue","green"))
# Example 2:
library(MASS)
# Generate 10 bivariate normal samples
mu1 <- c(0, 0)
sigma1 <- matrix(c(1, 0.5, 0.5, 1), nrow=2)
SP1 <- mvrnorm(n=10, mu=mu1, Sigma=sigma1)
# Generate another 10 bivariate normal samples
mu2<-c(1,1)
sigma2<-matrix(c(1,0,0,1),nrow=2)
SP2<-mvrnorm(n=10,mu=mu2,Sigma=sigma2)
# Generate 10 more new bivariate normal samples
mu3<-c(2,2)
sigma3<-matrix(c(1,0.5,0.5,1),nrow=2)
SP3<-mvrnorm(n=10,mu=mu3,Sigma=sigma3)
# Combine the three groups of bivariate normal samples
data<-rbind(SP1,SP2,SP3)
# Conduct MKMeans analysis with K=4 and randomly picking four samples as initial cluster centers
Res<-MKMeans(data,4,data[sample(1:30,4),],iteration=1000,tol=.95,type=1)
names(Res@Classes[[1]])<-rep("red",length(Res@Classes[[1]]))
names(Res@Classes[[2]])<-rep("blue",length(Res@Classes[[2]]))
names(Res@Classes[[3]])<-rep("green",length(Res@Classes[[3]]))
names(Res@Classes[[4]])<-rep("black",length(Res@Classes[[4]]))
Cols<-names(sort(c(Res@Classes[[1]],Res@Classes[[2]],Res@Classes[[3]],Res@Classes[[4]])))
plot(data[,1],data[,2],type="p",pch=19,col=Cols,lwd=2,xlab="",ylab="")
points(Res@Centers,pch=5,col=c("red","blue","green","black"))
Finding the center of a cluster.
Description
Finds the center of a cluster.
Usage
C.f(dat, type)
Arguments
dat |
Numeric. A cluster matrix with each row being an observation. |
type |
Integer. The type of distance between observations. 1 for Euclidean distance. 2 for Manhattan distance. 3 for maximum deviation among dimensions. |
Value
A numeric vector giving the center of the cluster.
Author(s)
Yarong Yang
Examples
x<-rnorm(5,0,1)
y<-rnorm(5,1,1)
data<-cbind(x,y)
Res<-C.f(dat=data,type=1)
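How C.f defines the center for each distance type is not stated on this page. As a point of reference only, the sketch below shows two textbook choices in base R: the column-wise mean, which minimizes the sum of squared Euclidean distances, and the column-wise median, which minimizes the sum of Manhattan distances. The helper name center_sketch and its method argument are hypothetical and are not part of the package.

```r
# Hypothetical helper, not part of MKMeans: two textbook cluster centers.
center_sketch <- function(dat, method = c("mean", "median")) {
  method <- match.arg(method)
  dat <- as.matrix(dat)
  if (method == "mean") {
    colMeans(dat)            # minimizes the sum of squared Euclidean distances
  } else {
    apply(dat, 2, median)    # minimizes the sum of Manhattan distances
  }
}

# A small cluster of four bivariate observations with one outlying point
cluster <- cbind(x = c(0, 1, 2, 9), y = c(1, 1, 2, 4))
center_mean <- center_sketch(cluster, "mean")
center_median <- center_sketch(cluster, "median")
```

The outlying fourth row pulls the mean-based center further than the median-based one, which is one reason the choice of center usually follows the choice of distance.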
Finding the distance between two observations.
Description
Finds the distance between two observations.
Usage
Dist(x,y,type)
Arguments
x |
Numeric. A vector denoting an observation. |
y |
Numeric. A vector denoting an observation. |
type |
Integer. The type of distance between observations. 1 for Euclidean distance. 2 for Manhattan distance. 3 for maximum deviation among dimensions. |
Value
A numeric value giving the distance between x and y.
Examples
x<-rnorm(10,0,1)
y<-rnorm(10,1,1)
z<-rnorm(10,2,1)
data<-cbind(x,y,z)
Res<-Dist(data[1,],data[2,],type=1)
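The three values of type described in the Arguments can be written out in a few lines of base R. This is a minimal sketch under the assumption that type 3 means the largest absolute difference over all dimensions. The helper dist_sketch is hypothetical and is not the package's Dist.

```r
# Hypothetical helper, not the package's Dist(): the three distances in base R.
dist_sketch <- function(x, y, type = 1) {
  stopifnot(length(x) == length(y))
  d <- abs(x - y)                 # per-dimension absolute deviations
  switch(as.character(type),
         "1" = sqrt(sum(d^2)),    # Euclidean distance
         "2" = sum(d),            # Manhattan distance
         "3" = max(d),            # maximum deviation among dimensions
         stop("type must be 1, 2, or 3"))
}

a <- c(0, 0, 0)
b <- c(3, 4, 0)
d_euclid <- dist_sketch(a, b, type = 1)   # sqrt(3^2 + 4^2) = 5
d_manhat <- dist_sketch(a, b, type = 2)   # 3 + 4 = 7
d_max    <- dist_sketch(a, b, type = 3)   # max(3, 4, 0) = 4
```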
Class to contain the results from function MKMeans.
Description
The function MKMeans returns an object of class MKMean that contains the number of clusters, the center of each cluster, and the observations in each cluster.
Objects from the Class
new("MKMean",K=new("numeric"),Centers=new("matrix"),Classes=new("list"),Clusters=new("list"))
Slots
K
: An integer giving the number of clusters.
Centers
: A numeric matrix in which each row is the center of a cluster.
Classes
: A list of integer vectors giving the original indexes of the observations in each cluster.
Clusters
: A numeric list holding the observations in each cluster.
Author(s)
Yarong Yang
References
Yarong Yang, Nader Ebrahimi, Yoram Rubin, and Jacob Zhang (2025). MKMeans: A Modern K-Means Clustering Algorithm. Technical report.
Examples
showClass("MKMean")
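To illustrate how the slots fit together, the sketch below builds a stand-in S4 class with the same four slots and turns the Classes index lists into one cluster label per observation. The class name MKMeanSketch, the toy observations, and the variable cluster_id are hypothetical. Storing each element of Clusters as a matrix is likewise an assumption made for the illustration.

```r
library(methods)

# Hypothetical stand-in class with the slot layout listed above;
# it is not the MKMean class shipped with the package.
setClass("MKMeanSketch",
         representation(K = "numeric", Centers = "matrix",
                        Classes = "list", Clusters = "list"))

obs <- rbind(c(0, 0), c(0, 1), c(5, 5), c(6, 5), c(5, 6))
res <- new("MKMeanSketch",
           K = 2,
           Centers = rbind(c(0, 0.5), c(16 / 3, 16 / 3)),
           Classes = list(c(1L, 2L), c(3L, 4L, 5L)),
           Clusters = list(obs[1:2, , drop = FALSE],
                           obs[3:5, , drop = FALSE]))

# Turn the per-cluster index lists into one label per observation.
cluster_id <- integer(nrow(obs))
for (k in seq_along(res@Classes)) {
  cluster_id[res@Classes[[k]]] <- k
}
```

With a label vector of this kind, plotting colors can be picked by indexing, for example c("red","blue")[cluster_id].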
Modern K-Means clustering.
Description
A modern K-means clustering algorithm that works for data of any number of dimensions, places no limit on the number of clusters expected, and can start from any initial cluster centers.
Usage
MKMeans(data, K, initial, iteration, tol, type)
Arguments
data |
Numeric. An observation matrix with each row being an observation. |
K |
Integer. The number of clusters expected. |
initial |
Numeric. Either a matrix of initial centers with one center per row, or 1 to use the first K rows of the data matrix as the initial centers. |
iteration |
Integer. The maximum number of iterations allowed for the clustering process. |
tol |
Numeric. The minimum proportion of stable observations required to stop the clustering process. It should generally be greater than 0.5 for the results to be reliable. |
type |
Integer. The type of distance between observations. 1 for Euclidean distance. 2 for Manhattan distance. 3 for maximum deviation among dimensions. |
Value
An object of class MKMean.
Author(s)
Yarong Yang
References
Yarong Yang, Nader Ebrahimi, Yoram Rubin, and Jacob Zhang (2025). MKMeans: A Modern K-Means Clustering Algorithm. Technical report.
Examples
library(MASS)
# Generate 10 bivariate normal samples
mu1 <- c(0, 0)
sigma1 <- matrix(c(1, 0.5, 0.5, 1), nrow=2)
SP1 <- mvrnorm(n=10, mu=mu1, Sigma=sigma1)
# Generate another 10 bivariate normal samples
mu2<-c(1,1)
sigma2<-matrix(c(1,0,0,1),nrow=2)
SP2<-mvrnorm(n=10,mu=mu2,Sigma=sigma2)
# Generate 10 more new bivariate normal samples
mu3<-c(2,2)
sigma3<-matrix(c(1,0.5,0.5,1),nrow=2)
SP3<-mvrnorm(n=10,mu=mu3,Sigma=sigma3)
# Combine the three groups of bivariate normal samples
data<-rbind(SP1,SP2,SP3)
# Conduct MKMeans analysis with K=3 and randomly picking three samples as initial cluster centers
Res<-MKMeans(data,3,data[sample(1:30,3),],iteration=1000,tol=.95,type=1)
names(Res@Classes[[1]])<-rep("red",length(Res@Classes[[1]]))
names(Res@Classes[[2]])<-rep("blue",length(Res@Classes[[2]]))
names(Res@Classes[[3]])<-rep("green",length(Res@Classes[[3]]))
Cols<-names(sort(c(Res@Classes[[1]],Res@Classes[[2]],Res@Classes[[3]])))
plot(data[,1],data[,2],type="p",pch=19,col=Cols,lwd=2,xlab="",ylab="")
points(Res@Centers,pch=5,col=c("red","blue","green"))
# Compare the clustering results with the original samples
par(mfrow=c(1,2))
plot(data[,1],data[,2],type="p",pch=19,col=rep(c("sky blue","orange","purple"),rep(10,3)),
lwd=2,xlab="",ylab="",main="Original Data")
plot(data[,1],data[,2],type="p",pch=19,col=Cols,lwd=2,xlab="",ylab="",
main="MKMeans Clustering Results")
points(Res@Centers,pch=5,col=c("red","blue","green"))
# Conduct MKMeans analysis with K=4 and randomly picking four samples as initial cluster centers
Res<-MKMeans(data,4,data[sample(1:30,4),],iteration=1000,tol=.95,type=1)
names(Res@Classes[[1]])<-rep("red",length(Res@Classes[[1]]))
names(Res@Classes[[2]])<-rep("blue",length(Res@Classes[[2]]))
names(Res@Classes[[3]])<-rep("green",length(Res@Classes[[3]]))
names(Res@Classes[[4]])<-rep("black",length(Res@Classes[[4]]))
Cols<-names(sort(c(Res@Classes[[1]],Res@Classes[[2]],Res@Classes[[3]],Res@Classes[[4]])))
plot(data[,1],data[,2],type="p",pch=19,col=Cols,lwd=2,xlab="",ylab="")
points(Res@Centers,pch=5,col=c("red","blue","green","black"))
# Compare the clustering results with the original data
par(mfrow=c(1,2))
plot(data[,1],data[,2],type="p",pch=19,col=rep(c("sky blue","orange","purple"),rep(10,3)),
lwd=2,xlab="",ylab="",main="Original Data")
plot(data[,1],data[,2],type="p",pch=19,col=Cols,lwd=2,xlab="",ylab="",
main="MKMeans Clustering Results")
points(Res@Centers,pch=5,col=c("red","blue","green","black"))
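The roles of iteration and tol described in the Arguments can be pictured with a plain Lloyd-style K-means loop in base R. This is a sketch under stated assumptions, not the MKMeans source: it uses squared Euclidean distance only, takes the first K rows as initial centers, and assumes that "stable" means an observation kept the cluster it had in the previous pass. The function kmeans_sketch and its return fields are hypothetical.

```r
# Hypothetical sketch, not the MKMeans source: Lloyd-style K-means that stops
# once the share of observations keeping their cluster reaches `tol`,
# or after `iteration` passes, whichever comes first.
kmeans_sketch <- function(data, K, iteration = 100, tol = 0.95) {
  data <- as.matrix(data)
  centers <- data[seq_len(K), , drop = FALSE]   # first K rows as initial centers
  assign_old <- rep(0L, nrow(data))
  for (it in seq_len(iteration)) {
    # Squared Euclidean distance from every observation to every center
    d2 <- sapply(seq_len(K), function(k) {
      rowSums(sweep(data, 2, centers[k, ])^2)
    })
    assign_new <- max.col(-d2, ties.method = "first")   # nearest center per row
    stable <- mean(assign_new == assign_old)    # proportion of unchanged labels
    # Recompute each non-empty cluster's center as its column means
    for (k in seq_len(K)) {
      members <- assign_new == k
      if (any(members)) centers[k, ] <- colMeans(data[members, , drop = FALSE])
    }
    assign_old <- assign_new
    if (stable >= tol) break
  }
  list(centers = centers, cluster = assign_new, iterations = it)
}

# Two well-separated groups; rows 1 and 2 serve as the initial centers
toy <- rbind(c(0, 0), c(10, 10), c(0, 1), c(1, 0), c(10, 11), c(11, 10))
fit <- kmeans_sketch(toy, K = 2)
```

On this toy input the labels settle after the first pass, so the second pass finds every observation stable and the loop stops well before the iteration cap.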