mixedClust

Description

mixedClust is an R package for co-clustering heterogeneous data. The supported data types are:

* Categorical
* Quantitative
* Integer
* Ordinal
* Functional

Installation

# Fix the random seed so that the simulations below are reproducible
set.seed(5)
# Load the package
library(mixedClust)

Datasets

Under construction.

Simulation of heterogeneous data

The following code simulates a sample of heterogeneous data.

# 150 x 250 matrix of zeros that holds the simulated data
M <- matrix(0, nrow = 150, ncol = 250)

Simulation of categorical data

This snippet creates a sample of categorical data with 6 levels.
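A minimal sketch of such a simulation, using only base R. The block width (125 columns, inferred from d.list below), the 2 x 2 cluster layout, and the level probabilities are illustrative assumptions, not the package's own simulation code.

```r
set.seed(5)
n.row <- 150
n.col <- 125      # assumed width of the categorical block
n.levels <- 6

# Assumed latent structure: 2 row clusters x 2 column clusters
row.cl <- rep(1:2, each = 75)
col.cl <- rep(1:2, times = c(63, 62))

# Illustrative choice: each block has one dominant level (probability 0.8)
dominant <- matrix(c(1, 3, 4, 6), nrow = 2)
block.prob <- function(k) {
  p <- rep(0.04, n.levels)
  p[k] <- 0.8
  p
}

cat.block <- matrix(0L, n.row, n.col)
for (i in 1:2) {
  for (j in 1:2) {
    rows <- which(row.cl == i)
    cols <- which(col.cl == j)
    # Draw every cell of the block from the block's multinomial distribution
    cat.block[rows, cols] <- sample(n.levels, length(rows) * length(cols),
                                    replace = TRUE,
                                    prob = block.prob(dominant[i, j]))
  }
}
```

The resulting cat.block could then be copied into the first 125 columns of M.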

Simulation of quantitative data
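Quantitative features can be sketched as Gaussian blocks with one mean and one standard deviation per (row cluster, column cluster) pair. The block width (50 columns), the 2 x 2 layout, and the parameter values below are assumptions chosen for illustration.

```r
set.seed(5)
n.row <- 150
n.col <- 50       # assumed width of the quantitative block

# Assumed latent structure: 2 row clusters x 2 column clusters
row.cl <- rep(1:2, each = 75)
col.cl <- rep(1:2, each = 25)

# Illustrative block parameters, indexed as [row cluster, column cluster]
mu    <- matrix(c(0, 5, 10, -5), nrow = 2)
sigma <- matrix(c(1, 1, 2, 2), nrow = 2)

# Cluster pair of every cell, in column-major order (rows vary fastest)
idx <- cbind(rep(row.cl, times = n.col), rep(col.cl, each = n.row))

gauss.block <- matrix(rnorm(n.row * n.col, mean = mu[idx], sd = sigma[idx]),
                      nrow = n.row, ncol = n.col)
```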

Simulation of ordinal data

The BOS (Binary Ordinal Search) model is used to simulate ordinal data. This snippet creates a sample of ordinal data with 5 levels.
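As an illustration, BOS-style sampling can be written from scratch in base R. The helper rbos below is hypothetical (it is not a mixedClust function) and is a simplified sketch: at each step a breakpoint is drawn uniformly in the current interval, then with probability p the sub-interval closest to the position mu is kept, otherwise a sub-interval is picked with probability proportional to its size. The block width and the (mu, p) values are assumptions.

```r
set.seed(5)

# Hypothetical helper: draw n ordinal values in 1..m from a BOS-like process
rbos <- function(n, m, mu, p) {
  one <- function() {
    e <- 1:m
    for (step in seq_len(m - 1)) {
      if (length(e) == 1) break
      y <- e[sample.int(length(e), 1)]           # uniform breakpoint
      parts <- list(e[e < y], y, e[e > y])
      parts <- parts[lengths(parts) > 0]
      if (runif(1) < p) {
        # accurate comparison: keep the sub-interval closest to mu
        d <- sapply(parts, function(s) min(abs(s - mu)))
        e <- parts[[which.min(d)]]
      } else {
        # blind comparison: probability proportional to sub-interval size
        w <- lengths(parts)
        e <- parts[[sample.int(length(parts), 1, prob = w / sum(w))]]
      }
    }
    e[1]
  }
  replicate(n, one())
}

# Assumed layout: 2 row clusters x 3 column clusters, 75 ordinal columns
row.cl <- rep(1:2, each = 75)
col.cl <- rep(1:3, each = 25)
mu <- matrix(c(1, 5, 2, 4, 3, 1), nrow = 2)      # illustrative positions
ord.block <- matrix(0L, 150, 75)
for (i in 1:2) for (j in 1:3) {
  rows <- which(row.cl == i)
  cols <- which(col.cl == j)
  ord.block[rows, cols] <- rbos(length(rows) * length(cols),
                                m = 5, mu = mu[i, j], p = 0.8)
}
```

With p = 1 every comparison is accurate, so the draw always equals mu; lower values of p spread the distribution around it.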

Shuffling rows and columns
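One way to shuffle is sketched below. It assumes that columns are permuted only within each data-type block, so that the block start indices (d.list in the parameter section) remain valid, and that the shuffled matrix is stored as M1. The matrix M here is a stand-in filled with cell indices so that the permutation is visible.

```r
set.seed(5)

# Stand-in for the simulated data matrix
M <- matrix(seq_len(150 * 250), nrow = 150, ncol = 250)

# First column of each data-type block, and the matching last columns
d.list <- c(1, 126, 176)
ends <- c(d.list[-1] - 1, ncol(M))

# Permute all rows; permute columns separately inside each block
row.perm <- sample(nrow(M))
col.perm <- unlist(Map(function(s, e) s - 1 + sample(e - s + 1),
                       d.list, ends))

M1 <- M[row.perm, col.perm]
```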

Setting parameters

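nbSEM = 120        # total number of SEM iterations
nbSEMburn = 100    # number of burn-in iterations
nbindmini = 1      # minimum number of cells in a block
init = "kmeans"    # initialization method

kr = 2             # number of row clusters
kc = c(2, 2, 3)    # number of column clusters, one per data type
m = c(6, 5)        # number of levels: categorical (6), ordinal (5)
d.list <- c(1, 126, 176)   # first column index of each data type
distributions <- c("Multinomial", "Gaussian", "Bos")   # one distribution per data type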

Perform co-clustering

In this section, co-clustering is run on the simulated dataset with the mixedCoclust function.

res <- mixedCoclust(x = M1, myList = d.list,distrib_names = distributions,
                    kr = kr, kc = kc, m = m, init = init,nbSEM = nbSEM,
                    nbSEMburn = nbSEMburn, nbindmini = nbindmini)

The particular case of functional data

Functional data are supported by this package. However, they are supplied differently, since they cannot be represented by a simple matrix. Functional data must be stored in a three-dimensional array, functionalData, with dimensions:

* nrow = number of rows, which must equal the number of rows of the x data matrix.
* ncol = number of functional features.
* nslice = number of points per function (all functions must have the same number of points).

Then, functionalData is passed as an argument to the different functions (co-clustering, clustering, classification).
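The dimension constraints above can be checked before calling the package. The helper check_functional_dims is hypothetical (not part of mixedClust), and the sizes used are arbitrary.

```r
# Stand-ins for the data matrix and the functional array
x <- matrix(0, nrow = 150, ncol = 250)
functionalData <- array(0, dim = c(150, 10, 30))

# Hypothetical helper: verify the layout described above
check_functional_dims <- function(x, functionalData) {
  d <- dim(functionalData)
  stopifnot(length(d) == 3,      # three dimensions
            d[1] == nrow(x))     # same number of rows as x
  c(nrow = d[1], ncol = d[2], nslice = d[3])
}

dims <- check_functional_dims(x, functionalData)
```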

Simulation of functional data

The fda.usc package is used to simulate functional data.

The functionalData array is built:
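A minimal sketch of building such an array. To stay self-contained it simulates the curves with base R (noisy sine and cosine curves) rather than with fda.usc; the number of features, the number of points, and the cluster layout are assumptions.

```r
set.seed(5)
n.row <- 150      # must match the number of rows of the data matrix
n.feat <- 10      # assumed number of functional features
n.points <- 30    # assumed number of points per function
t.grid <- seq(0, 1, length.out = n.points)

# Assumed structure: two row clusters with different mean curves
row.cl <- rep(1:2, each = 75)
mean.curve <- list(function(t) sin(2 * pi * t),
                   function(t) cos(2 * pi * t))

functionalData <- array(0, dim = c(n.row, n.feat, n.points))
for (i in seq_len(n.row)) {
  for (j in seq_len(n.feat)) {
    # Third dimension holds the points of the curve for row i, feature j
    functionalData[i, j, ] <- mean.curve[[row.cl[i]]](t.grid) +
      rnorm(n.points, sd = 0.2)
  }
}
```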

Setting parameters

One limitation of functional data is that the kmeans algorithm cannot be used for initialization.

Performing co-clustering with functional data

References