Tune Gaussian Mixture Models - MATLAB & Simulink - MathWorks India
Tune Gaussian Mixture Models
This example shows how to determine the best Gaussian mixture model (GMM) fit by adjusting the
number of components and the component covariance matrix structure.
Load Fisher's iris data set. Consider the petal measurements as predictors.
load fisheriris;
X = meas(:,3:4);
[n,p] = size(X);
rng(1); % For reproducibility
figure;
plot(X(:,1),X(:,2),'.','MarkerSize',15);
title('Fisher''s Iris Data Set');
xlabel('Petal length (cm)');
ylabel('Petal width (cm)');
Suppose k is the number of desired components or clusters, and Σ is the covariance structure for all
components. Follow these steps to tune a GMM.
1. Choose a (k, Σ) pair, and then fit a GMM using the chosen parameter specification and the entire data set.
2. Estimate the AIC and BIC.
3. Repeat steps 1 and 2 until you exhaust all (k, Σ) pairs of interest.
4. Choose the fitted GMM that balances low AIC with simplicity.
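For reference, the two criteria in step 2 can be written out. This is a sketch in generic notation: \hat{L} is the maximized likelihood of the fitted GMM, m is the number of estimated parameters, and n is the number of observations. The symbols \hat{L} and m are introduced here for illustration and do not appear in the original example.

```latex
\mathrm{AIC} = -2\log\hat{L} + 2m,
\qquad
\mathrm{BIC} = -2\log\hat{L} + m\log n
```

Lower values are better for both criteria. Because the BIC penalty per parameter is \log n rather than 2, BIC penalizes extra parameters more heavily than AIC whenever n is at least 8.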
For this example, choose a grid of values for k that includes 2 and 3, and some surrounding numbers.
Specify all available choices for covariance structure. If k is too high for the data set, then the
estimated component covariances can be badly conditioned. Specify to use regularization to avoid badly
conditioned covariance matrices. Increase the number of EM algorithm iterations to 10000.
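To make "simplicity" concrete, the covariance choices can be compared by how many free parameters they imply. The count below is the standard one for a k-component GMM in p dimensions (k - 1 mixing proportions, kp mean entries, plus covariance entries). It is offered as a sketch, on the assumption that the software counts parameters the same way; the symbols m and c_Σ are introduced here for illustration.

```latex
m = (k - 1) + kp + c_{\Sigma},
\qquad
c_{\Sigma} =
\begin{cases}
k\,p(p+1)/2 & \text{full, unshared} \\
p(p+1)/2    & \text{full, shared} \\
kp          & \text{diagonal, unshared} \\
p           & \text{diagonal, shared}
\end{cases}
```

With the two petal predictors (p = 2) and k = 3, this gives m = 2 + 6 + 9 = 17 for the full, unshared structure and m = 2 + 6 + 2 = 10 for the diagonal, shared structure.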
http://in.mathworks.com/help/stats/tunegaussianmixturemodels.html
5/8/2016
k = 1:5;
nK = numel(k);
Sigma = {'diagonal','full'};
nSigma = numel(Sigma);
SharedCovariance = {true,false};
SCtext = {'true','false'};
nSC = numel(SharedCovariance);
RegularizationValue = 0.01;
options = statset('MaxIter',10000);
Fit the GMMs using all parameter combinations. Compute the AIC and BIC for each fit. Track the terminal
convergence status of each fit.
% Preallocation
gm = cell(nK,nSigma,nSC);
aic = zeros(nK,nSigma,nSC);
bic = zeros(nK,nSigma,nSC);
converged = false(nK,nSigma,nSC);
% Fit all models
for m = 1:nSC
    for j = 1:nSigma
        for i = 1:nK
            gm{i,j,m} = fitgmdist(X,k(i),...
                'CovarianceType',Sigma{j},...
                'SharedCovariance',SharedCovariance{m},...
                'RegularizationValue',RegularizationValue,...
                'Options',options);
            aic(i,j,m) = gm{i,j,m}.AIC;
            bic(i,j,m) = gm{i,j,m}.BIC;
            converged(i,j,m) = gm{i,j,m}.Converged;
        end
    end
end
allConverge = (sum(converged(:)) == nK*nSigma*nSC)
allConverge =
     1
gm is a cell array containing all of the fitted gmdistribution model objects. All of the fitting instances
converged.
Plot separate bar charts to compare the AIC and BIC among all fits. Group the bars by k.
figure;
bar(reshape(aic,nK,nSigma*nSC));
title('AIC For Various $k$ and $\Sigma$ Choices','Interpreter','latex');
xlabel('$k$','Interpreter','latex');
ylabel('AIC');
legend({'Diagonal-shared','Full-shared','Diagonal-unshared',...
    'Full-unshared'});
figure;
bar(reshape(bic,nK,nSigma*nSC));
title('BIC For Various $k$ and $\Sigma$ Choices','Interpreter','latex');
xlabel('$k$','Interpreter','latex'); % The bars are grouped by k, as in the AIC chart
ylabel('BIC');
legend({'Diagonal-shared','Full-shared','Diagonal-unshared',...
    'Full-unshared'});
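The comparison in the bar charts can also be made programmatically. The following sketch locates the smallest entry of a three-dimensional score array and converts it to subscripts with min and ind2sub. The array scores and its values are placeholders invented for illustration, not results from the iris fits; in the example above, the same two calls would be applied to aic or bic.

```matlab
% Placeholder scores (hypothetical values, NOT the iris results), shaped like
% aic and bic: nK-by-nSigma-by-nSC = 5-by-2-by-2.
scores = reshape(40:-1:21,5,2,2);
scores(3,2,2) = 0;   % force one entry to be the smallest

% Find the linear index of the minimum, then convert it to (i,j,m) subscripts.
[~,linIdx] = min(scores(:));
[iBest,jBest,mBest] = ind2sub(size(scores),linIdx);
```

Here iBest indexes k, jBest indexes Sigma, and mBest indexes SharedCovariance. Taking the raw minimum ignores the "balance with simplicity" consideration in step 4, so it is best read as a starting point rather than a final choice.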
According to the AIC and BIC values, the best model has 3 components and a full, unshared covariance
matrix structure.
Cluster the training data using the best fitting model. Plot the clustered data and the component ellipses.
gmBest = gm{3,2,2};
clusterX = cluster(gmBest,X);
kGMM = gmBest.NumComponents;
% Build a grid that extends 2 cm beyond the data range in each direction
d = 500;
x1 = linspace(min(X(:,1)) - 2,max(X(:,1)) + 2,d);
x2 = linspace(min(X(:,2)) - 2,max(X(:,2)) + 2,d);
[x1grid,x2grid] = meshgrid(x1,x2);
X0 = [x1grid(:) x2grid(:)];
mahalDist = mahal(gmBest,X0);
threshold = sqrt(chi2inv(0.99,2));
figure;
h1 = gscatter(X(:,1),X(:,2),clusterX);
hold on;
for j = 1:kGMM
    idx = mahalDist(:,j) <= threshold;
    % Lighten the cluster color for the shaded component region
    Color = h1(j).Color*0.75 - 0.5*(h1(j).Color - 1);
    h2 = plot(X0(idx,1),X0(idx,2),'.','Color',Color,'MarkerSize',1);
    uistack(h2,'bottom');
end
h3 = plot(gmBest.mu(:,1),gmBest.mu(:,2),'kx','LineWidth',2,'MarkerSize',10);
title('Clustered Data and Component Structures');
xlabel('Petal length (cm)');
ylabel('Petal width (cm)');
legend(h1,'Cluster 1','Cluster 2','Cluster 3','Location','NorthWest');
hold off
This data set includes labels. Determine how well gmBest clusters the data by comparing each prediction
to the true labels.
species = categorical(species);
Y = zeros(n,1);
Y(species == 'versicolor') = 1;
Y(species == 'virginica') = 2;
Y(species == 'setosa') = 3;
miscluster = Y ~= clusterX;
clusterError = sum(miscluster)/n
clusterError =
    0.0800
The best fitting GMM groups 8% of the observations into the wrong cluster.
cluster does not always preserve cluster order. That is, if you cluster several fitted gmdistribution
models, cluster might assign different cluster labels for similar components.
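Because cluster labels are arbitrary, a hard-coded mapping from species to cluster number only works for one particular labeling. One possible workaround, sketched here, is to map each cluster to the most common true label among its members before computing the error. The vectors trueLabels and clusterLabels are hypothetical toy data, not the iris results, and the majority-vote rule is an illustrative choice rather than part of the original example.

```matlab
% Hypothetical toy labels (not from the iris example)
trueLabels    = [1 1 1 2 2 2 3 3 3]';
clusterLabels = [2 2 2 3 3 1 1 1 1]';

% Map every cluster to the most frequent true label among its members
mapped = zeros(size(clusterLabels));
for c = unique(clusterLabels)'
    members = clusterLabels == c;
    mapped(members) = mode(trueLabels(members));
end

% Fraction of observations whose mapped label differs from the true label
errorRate = mean(mapped ~= trueLabels);
```

In this toy case one of the nine observations ends up with the wrong label. Note that majority voting can map two clusters to the same label, so it may understate the error when clusters are badly mixed.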
See Also
cluster | fitgmdist | gmdistribution
Related Examples
Cluster Data from Mixture of Gaussian Distributions
Cluster Gaussian Mixture Data Using Soft Clustering
More About
Gaussian Mixture Models
Clustering Using Gaussian Mixture Models