This package provides a plug-in for fuzzy clustering analysis in R Commander (Rcmdr). Although it is a plug-in package, you can also run the analysis easily from the R command line/console.
This package includes Fuzzy C-Means and Gustafson-Kessel clustering. For more stable results, it offers ensemble clustering with a voting approach. It also provides validation indices for choosing the optimal number of clusters, MANOVA with the Pillai statistic, and biplots and radar plots for visualizing your objects.
Install this package first, then type library(Rcmdr) to launch the R Commander application. On the Tools menu, choose “load plugin” and select RcmdrPlugin.FuzzyClust. R Commander will then restart.
Insert your data and run the analysis from Statistics -> Dimensional -> Clustering -> Fuzzy Clustering.
fuzzy.CM()
performs fuzzy c-means analysis. For a full description of this function (parameter settings and return value), see ?fuzzy.CM
library(RcmdrPlugin.FuzzyClust)
data(iris)
fuzzy.CM(X=iris[,1:4],K = 3,m = 2,RandomNumber = 1234)->cl
## Call:
## fuzzy.CM(X = iris[, 1:4], K = 3, m = 2, RandomNumber = 1234)
##
## Objective Function: 60.50571
## fuzzifier: 2
## Centroid:
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## [1,] 5.888925 2.761067 4.363941 1.3973097
## [2,] 5.003966 3.414089 1.482815 0.2535461
## [3,] 6.775003 3.052380 5.646771 2.0535425
##
## Cluster Label:
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 1 3 1
## 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
## 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1
## 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
## 1 1 1 1 1 1 1 1 1 1 3 1 3 3 3 3 1 3
## 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
## 3 3 3 3 3 1 3 3 3 3 3 1 3 1 3 1 3 3
## 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
## 1 1 3 3 3 3 3 1 3 3 3 3 1 3 3 3 1 3
## 145 146 147 148 149 150
## 3 3 1 3 3 1
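To make the objective function, fuzzifier, and centroids in the output above more concrete, here is a minimal base-R sketch of the standard fuzzy c-means iteration. It is not the code behind fuzzy.CM(); the function name fcm_sketch, its arguments, and the stopping rule are assumptions made for illustration only.

```r
# Illustrative fuzzy c-means in base R (hypothetical helper, not the package's code).
fcm_sketch <- function(X, K, m = 2, max_iter = 100, tol = 1e-6, seed = 1234) {
  X <- as.matrix(X)
  n <- nrow(X)
  set.seed(seed)
  # Random initial membership matrix; every row sums to 1.
  U <- matrix(runif(n * K), n, K)
  U <- U / rowSums(U)
  for (iter in seq_len(max_iter)) {
    Um <- U^m
    # Centroids: membership-weighted means of the observations (K x p).
    V <- t(Um) %*% X / colSums(Um)
    # Squared Euclidean distance of every observation to every centroid (n x K).
    D2 <- sapply(seq_len(K), function(k) rowSums(sweep(X, 2, V[k, ])^2))
    D2 <- pmax(D2, .Machine$double.eps)  # guard against division by zero
    # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
    W <- D2^(-1 / (m - 1))
    U_new <- W / rowSums(W)
    converged <- max(abs(U_new - U)) < tol
    U <- U_new
    if (converged) break
  }
  # Objective evaluated with the final memberships and the last computed centroids.
  list(centroid = V,
       membership = U,
       label = max.col(U, ties.method = "first"),
       objective = sum(U^m * D2))
}

res <- fcm_sketch(iris[, 1:4], K = 3, m = 2)
round(res$centroid, 3)
table(res$label)
```

Because the cluster numbering depends on the random start, the labels from this sketch may be permuted relative to those printed by fuzzy.CM().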
fuzzy.GK()
performs Gustafson-Kessel (GK) clustering. The main difference between this method and fuzzy c-means (FCM) is the distance function: GK uses a distance based on each cluster's covariance matrix, whereas FCM uses the Euclidean distance. This function implements the modification of the GK algorithm proposed by Babuska (2002). For details and parameters, see ?fuzzy.GK
data(iris)
fuzzy.GK(X=iris[,1:4],K = 3,m = 2,RandomNumber = 1234,gamma=0)->cl
## Call:
## fuzzy.GK(X = iris[, 1:4], K = 3, m = 2, RandomNumber = 1234,
## gamma = 0)
##
## Objective Function: 31.52668
## fuzzifier: 2
## Centroid:
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## [1,] 6.127994 2.801893 4.510275 1.402073
## [2,] 5.014119 3.437941 1.465400 0.244071
## [3,] 6.397884 2.975170 5.304825 2.014691
##
## Cluster Label:
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1
## 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1
## 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
## 1 1 1 1 1 1 1 1 1 1 1 3 3 1 1 1 1 1
## 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
## 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 1 3 1
## 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
## 1 3 3 3 3 3 3 3 3 1 1 1 3 3 1 3 3 1
## 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
## 3 3 3 1 1 1 3 1 3 3 3 3 3 3 3 3 3 3
## 145 146 147 148 149 150
## 3 3 3 3 3 3
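The difference between the two distance functions mentioned for fuzzy.GK() can be sketched in base R. The example below computes, for a single cluster, the squared Euclidean distance used by FCM and a GK-style distance built from the fuzzy covariance matrix. The membership vector u is made up for illustration, and the sketch uses the textbook GK norm matrix without the adjustments from Babuska (2002), so it should not be read as the package's implementation.

```r
# Squared Euclidean distance (FCM) versus a GK-style distance for one cluster.
X <- as.matrix(iris[, 1:4])
p <- ncol(X)
m <- 2                                   # fuzzifier

set.seed(1)
u <- runif(nrow(X))                      # hypothetical memberships in this cluster
w <- u^m

v  <- colSums(w * X) / sum(w)            # membership-weighted centroid
Xc <- sweep(X, 2, v)                     # observations centred on the centroid

# Fuzzy covariance matrix of the cluster.
Fk <- crossprod(Xc * sqrt(w)) / sum(w)

# Norm-inducing matrix, scaled so that det(A) = 1 (fixed cluster volume).
A <- det(Fk)^(1 / p) * solve(Fk)

d2_euclid <- rowSums(Xc^2)               # FCM: same metric in every direction
d2_gk     <- rowSums((Xc %*% A) * Xc)    # GK: metric adapts to the cluster shape

head(cbind(d2_euclid, d2_gk))
```

The GK distance shrinks along directions in which the cluster is elongated, which is why GK can recover ellipsoidal clusters that FCM tends to split or merge.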
GK and FCM initialize the membership matrix randomly, so results can differ between runs. To stabilize the result, this package provides ensemble clustering with a sum rule voting approach. For details, see ?soft.vote.ensemble
soft.vote.ensemble(iris[,1:4],seed=3,method="FCM",K=3,m=2,core=1)->Cl
## Call:
## soft.vote.ensemble(data = iris[, 1:4], seed = 3, method = "FCM",
## K = 3, m = 2, core = 1)
##
## Objective Function: 60.50571
## fuzzifier: 2
## Centroid:
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## Clust 1 5.003966 3.414089 1.482815 0.2535461
## Clust 2 5.888927 2.761068 4.363945 1.3973114
## Clust 3 6.775005 3.052381 5.646774 2.0535438
##
## Cluster Label:
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [36] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [71] 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 3 3 3
## [106] 3 2 3 3 3 3 3 3 2 3 3 3 3 3 2 3 2 3 2 3 3 2 2 3 3 3 3 3 2 3 3 3 3 2 3
## [141] 3 3 2 3 3 3 2 3 3 2
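One way to picture sum rule voting is as follows: the membership matrices from several runs are first relabelled so that their clusters correspond, then summed, and each observation is assigned to the cluster with the largest total. The base-R sketch below follows that idea under stated assumptions. The helpers perms and sum_rule_vote are hypothetical, the alignment by exhaustive permutation search is only practical for small K, and soft.vote.ensemble() may work differently internally.

```r
# All permutations of a vector (small recursive helper, fine for small K).
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Sum rule voting over a list of n x K membership matrices.
sum_rule_vote <- function(memberships) {
  ref <- memberships[[1]]                # first run defines the cluster numbering
  candidates <- perms(seq_len(ncol(ref)))
  total <- ref
  for (U in memberships[-1]) {
    # Pick the column order that agrees best with the reference run.
    agreement <- sapply(candidates, function(p) sum(ref * U[, p]))
    best <- candidates[[which.max(agreement)]]
    total <- total + U[, best]
  }
  avg <- total / length(memberships)
  list(membership = avg, label = max.col(avg, ties.method = "first"))
}

# Toy example: the second "run" is the first with its clusters renumbered.
set.seed(3)
U1 <- matrix(runif(30), 10, 3)
U1 <- U1 / rowSums(U1)
U2 <- U1[, c(3, 1, 2)]
vote <- sum_rule_vote(list(U1, U2))
vote$label
```

In the toy example the two runs describe the same partition, so after alignment the voted memberships equal those of the first run.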
The hardest question in cluster analysis is how to validate the result. This package provides several indices that can be used to validate your result.
fuzzy.CM(X=iris[,1:4],K = 3,m = 2,RandomNumber = 1234)->cl
## Call:
## fuzzy.CM(X = iris[, 1:4], K = 3, m = 2, RandomNumber = 1234)
##
## Objective Function: 60.50571
## fuzzifier: 2
## Centroid:
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## [1,] 5.888925 2.761067 4.363941 1.3973097
## [2,] 5.003966 3.414089 1.482815 0.2535461
## [3,] 6.775003 3.052380 5.646771 2.0535425
##
## Cluster Label:
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
## 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 1 3 1
## 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
## 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1
## 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
## 1 1 1 1 1 1 1 1 1 1 3 1 3 3 3 3 1 3
## 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
## 3 3 3 3 3 1 3 3 3 3 3 1 3 1 3 1 3 3
## 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
## 1 1 3 3 3 3 3 1 3 3 3 3 1 3 3 3 1 3
## 145 146 147 148 149 150
## 3 3 1 3 3 1
validation.index(cl)
## Validation Index
## MPC Index : 0.6750958
## CE Index : 0.3954918
## XB Index : 0.1369082
## S Index : 0.1369082
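As a rough guide to reading two of these indices, the sketch below computes the modified partition coefficient (MPC) and the classification entropy (CE) from a membership matrix using their usual textbook definitions. The function names are hypothetical, and the formulas are assumed rather than taken from the source of validation.index(), so small differences from the package's values are possible.

```r
# Modified partition coefficient: 1 for a crisp partition, 0 for a uniform one.
mpc_index <- function(U) {
  K  <- ncol(U)
  pc <- sum(U^2) / nrow(U)               # partition coefficient
  1 - K / (K - 1) * (1 - pc)
}

# Classification entropy: 0 for a crisp partition, log(K) for a uniform one.
ce_index <- function(U) {
  terms <- ifelse(U > 0, U * log(U), 0)  # treat 0 * log(0) as 0
  -sum(terms) / nrow(U)
}

crisp   <- diag(3)                       # each observation fully in one cluster
uniform <- matrix(1 / 3, nrow = 5, ncol = 3)

c(MPC = mpc_index(crisp),   CE = ce_index(crisp))
c(MPC = mpc_index(uniform), CE = ce_index(uniform))
```

Under these definitions a higher MPC and a lower CE indicate a less fuzzy, better separated partition.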
To test whether the clusters differ significantly, use MANOVA. The Pillai statistic is chosen because it is robust to violations of the MANOVA assumptions.
checkManova(cl)
## Df Pillai approx F num Df den Df Pr(>F)
## factor(Cluster) 2 1.272997 63.47448 8 290 2.592269e-59
## Residuals 147 NA NA NA NA NA
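A table of the same shape can be produced with base R's manova() and summary(..., test = "Pillai"). The sketch below uses k-means labels as a stand-in for the fuzzy cluster labels, which is an assumption made only to keep the example self-contained. It is not necessarily how checkManova() is implemented, and the numbers will differ from the output above.

```r
# MANOVA with the Pillai statistic on stand-in cluster labels (base R only).
X <- as.matrix(iris[, 1:4])

set.seed(1234)
Cluster <- kmeans(X, centers = 3, nstart = 10)$cluster  # stand-in for fuzzy labels

fit    <- manova(X ~ factor(Cluster))
pillai <- summary(fit, test = "Pillai")

# Matrix with columns Df, Pillai, approx F, num Df, den Df, Pr(>F).
pillai$stats
```

A small Pr(>F) value indicates that the cluster mean vectors differ on at least one variable.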
Visualize your result with a biplot and a radar plot to make the clusters easier to interpret.
biploting(cl) -> biplotcluster
radar.plotting(cl) ->radarplot
## Using Cluster as id variables
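If you want a quick look at the same kind of information without the package, a rough base-R approximation is sketched below: a scatter plot of the first two principal components coloured by cluster, and a table of standardized cluster means, which is the kind of profile a radar plot typically displays. The k-means labels are a stand-in for the fuzzy cluster labels, and nothing here is taken from the implementation of biploting() or radar.plotting().

```r
# PCA scores coloured by cluster, plus per-cluster profiles (base R only).
X <- iris[, 1:4]

set.seed(1234)
cluster <- kmeans(X, centers = 3, nstart = 10)$cluster  # stand-in for fuzzy labels

pca <- prcomp(X, scale. = TRUE)
plot(pca$x[, 1:2], col = cluster, pch = 19,
     xlab = "PC1", ylab = "PC2", main = "Clusters on the first two PCs")

# Mean of each standardized variable within each cluster.
cluster_means <- aggregate(scale(X), by = list(Cluster = cluster), FUN = mean)
cluster_means
```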