Tune Gaussian Mixture Models - MATLAB & Simulink - MathWorks India

This document discusses tuning Gaussian mixture models (GMMs) to cluster Iris flower data. It fits GMMs with varying numbers of components (k) and covariance structures, and selects the best model based on AIC and BIC scores. The best model has 3 components and a full, unshared covariance matrix structure. This model clusters the data with 92% accuracy compared to the true labels.

Uploaded by

sunshinesun49

5/8/2016

Tune Gaussian Mixture Models
This example shows how to determine the best Gaussian mixture model (GMM) fit by adjusting the number of components and the component covariance matrix structure.

Load Fisher's iris data set. Consider the petal measurements as predictors.
load fisheriris;
X = meas(:,3:4);
[n,p] = size(X);
rng(1); % For reproducibility
figure;
plot(X(:,1),X(:,2),'.','MarkerSize',15);
title('Fisher''s Iris Data Set');
xlabel('Petal length (cm)');
ylabel('Petal width (cm)');

Suppose k is the number of desired components or clusters, and Σ is the covariance structure for all components. Follow these steps to tune a GMM.
1. Choose a (k, Σ) pair, and then fit a GMM using the chosen parameter specification and the entire data set.
2. Estimate the AIC and BIC.
3. Repeat steps 1 and 2 until you exhaust all (k, Σ) pairs of interest.
4. Choose the fitted GMM that balances low AIC with simplicity.
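For reference, the two criteria in step 2 can be written out. The block below uses the standard definitions, which are assumed here to match what a fitted model reports: NlogL is the negative log-likelihood of the fit, m is the number of estimated parameters, and n is the number of observations. The parameter count in the last line is the usual one for k components with full, unshared covariances in p dimensions (mixing proportions, means, covariance entries) and is an illustration, not a statement about the toolbox internals.

```latex
\begin{aligned}
\mathrm{AIC} &= 2\,\mathrm{NlogL} + 2m,\\
\mathrm{BIC} &= 2\,\mathrm{NlogL} + m\ln n,\\
m_{\text{full, unshared}} &= (k-1) + kp + k\,\frac{p(p+1)}{2}.
\end{aligned}
```

With k = 3 and p = 2, the last line gives m = 2 + 6 + 9 = 17. Because ln n exceeds 2 once n is 8 or more, BIC penalizes each extra parameter more heavily than AIC on a data set of this size.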
For this example, choose a grid of values for k that includes 2 and 3, and some surrounding numbers. Specify all available choices for covariance structure. If k is too high for the data set, then the estimated component covariances can be badly conditioned. Specify to use regularization to avoid badly conditioned covariance matrices. Increase the number of EM algorithm iterations to 10000.
http://in.mathworks.com/help/stats/tunegaussianmixturemodels.html


k = 1:5;
nK = numel(k);
Sigma = {'diagonal','full'};
nSigma = numel(Sigma);
SharedCovariance = {true,false};
SCtext = {'true','false'};
nSC = numel(SharedCovariance);
RegularizationValue = 0.01;
options = statset('MaxIter',10000);
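The effect of the regularization setting can be illustrated with a small sketch. It assumes that the regularization value is added to the diagonal of each covariance estimate, which is how the option is commonly described. The matrix S is a made-up, nearly singular example and not an estimate from the iris data, and regValue is a hypothetical name chosen to mirror the setting above.

```matlab
% Hypothetical, nearly singular 2-by-2 covariance matrix (not from the iris fit)
S = [1 0.999999; 0.999999 1];
regValue = 0.01;                  % same magnitude as RegularizationValue above

% Assumption: regularization adds regValue to each diagonal element
Sreg = S + regValue*eye(2);

condBefore = cond(S);             % very large: S is badly conditioned
condAfter = cond(Sreg);           % several orders of magnitude smaller
fprintf('cond before: %.3g, after: %.3g\n',condBefore,condAfter);
```

A larger value stabilizes the fit further but also inflates every component's variance, so a small value such as 0.01 is a compromise.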
Fit the GMMs using all parameter combinations. Compute the AIC and BIC for each fit. Track the terminal convergence status of each fit.
% Preallocation
gm = cell(nK,nSigma,nSC);
aic = zeros(nK,nSigma,nSC);
bic = zeros(nK,nSigma,nSC);
converged = false(nK,nSigma,nSC);
% Fit all models
for m = 1:nSC
    for j = 1:nSigma
        for i = 1:nK
            gm{i,j,m} = fitgmdist(X,k(i),...
                'CovarianceType',Sigma{j},...
                'SharedCovariance',SharedCovariance{m},...
                'RegularizationValue',RegularizationValue,...
                'Options',options);
            aic(i,j,m) = gm{i,j,m}.AIC;
            bic(i,j,m) = gm{i,j,m}.BIC;
            converged(i,j,m) = gm{i,j,m}.Converged;
        end
    end
end
allConverge = (sum(converged(:)) == nK*nSigma*nSC)
allConverge =
     1
gm is a cell array containing all of the fitted gmdistribution model objects. All of the fitting instances converged.
Plot separate bar charts to compare the AIC and BIC among all fits. Group the bars by k.

figure;
bar(reshape(aic,nK,nSigma*nSC));
title('AIC For Various $k$ and $\Sigma$ Choices','Interpreter','latex');
xlabel('$k$','Interpreter','Latex');
ylabel('AIC');
legend({'Diagonal-shared','Full-shared','Diagonal-unshared',...
    'Full-unshared'});
figure;
bar(reshape(bic,nK,nSigma*nSC));
title('BIC For Various $k$ and $\Sigma$ Choices','Interpreter','latex');
xlabel('$k$','Interpreter','Latex'); % bars are grouped by k, as in the AIC chart
ylabel('BIC');
legend({'Diagonal-shared','Full-shared','Diagonal-unshared',...
    'Full-unshared'});
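Reading the minimum off a bar chart can be cross-checked programmatically. The sketch below locates the smallest entry of a three-dimensional criterion array and converts it back to (k, covariance type, shared flag) indices. The array bicDemo holds placeholder values chosen only to make the example self-contained; in the workflow above, the bic (or aic) array would take its place.

```matlab
% Placeholder criterion array with the same layout as bic: nK-by-nSigma-by-nSC
bicDemo = 10*ones(5,2,2);
bicDemo(3,2,2) = 1;               % artificial minimum, for illustration only

% Linear index of the smallest value, then back to subscripts
[~,idx] = min(bicDemo(:));
[iBest,jBest,mBest] = ind2sub(size(bicDemo),idx);

fprintf('Minimum at i=%d, j=%d, m=%d\n',iBest,jBest,mBest);
```

The subscripts index into k, Sigma, and SharedCovariance respectively. A lookup like this only reports the numerical minimum; step 4 of the procedure still calls for weighing it against model simplicity.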

According to the AIC and BIC values, the best model has 3 components and a full, unshared covariance matrix structure.
Cluster the training data using the best fitting model. Plot the clustered data and the component ellipses.
gmBest = gm{3,2,2};
clusterX = cluster(gmBest,X);
kGMM = gmBest.NumComponents;
d = 500;
x1 = linspace(min(X(:,1)) - 2,max(X(:,1)) + 2,d);
x2 = linspace(min(X(:,2)) - 2,max(X(:,2)) + 2,d);
[x1grid,x2grid] = meshgrid(x1,x2);
X0 = [x1grid(:) x2grid(:)];
mahalDist = mahal(gmBest,X0);
threshold = sqrt(chi2inv(0.99,2));
figure;
h1 = gscatter(X(:,1),X(:,2),clusterX);
hold on;
for j = 1:kGMM
    idx = mahalDist(:,j) <= threshold;
    % Lighten the cluster color so the region sits behind the points
    Color = h1(j).Color*0.75 - 0.5*(h1(j).Color - 1);
    h2 = plot(X0(idx,1),X0(idx,2),'.','Color',Color,'MarkerSize',1);
    uistack(h2,'bottom');
end
h3 = plot(gmBest.mu(:,1),gmBest.mu(:,2),'kx','LineWidth',2,'MarkerSize',10);
title('Clustered Data and Component Structures');
xlabel('Petal length (cm)');
ylabel('Petal width (cm)');
legend(h1,'Cluster 1','Cluster 2','Cluster 3','Location','NorthWest');
hold off
This data set includes labels. Determine how well gmBest clusters the data by comparing each prediction to the true labels.
species = categorical(species);
Y = zeros(n,1);
Y(species == 'versicolor') = 1;
Y(species == 'virginica') = 2;
Y(species == 'setosa') = 3;
miscluster = Y ~= clusterX;
clusterError = sum(miscluster)/n
clusterError =
    0.0800
The best fitting GMM groups 8% of the observations into the wrong cluster.
cluster does not always preserve cluster order. That is, if you cluster several fitted gmdistribution models, cluster might assign different cluster labels for similar components.
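Because cluster numbering is arbitrary, a hard-coded mapping from species to cluster index can break when the fit changes. One way to guard against this, sketched below under the assumption that the number of clusters is small, is to try every relabeling of the clusters and keep the one with the lowest error. The label vectors are toy values chosen for illustration, and all variable names are hypothetical.

```matlab
% Toy labels: same grouping, but the clusters are numbered differently
trueLabels = [1 1 2 2 3 3]';
clusterLabels = [2 2 3 3 1 1]';

P = perms(1:3);                   % every possible relabeling of 3 clusters
bestErr = 1;
for r = 1:size(P,1)
    relabeled = P(r,clusterLabels)';   % cluster c becomes label P(r,c)
    err = mean(relabeled ~= trueLabels);
    bestErr = min(bestErr,err);
end
fprintf('Lowest error over all relabelings: %.4f\n',bestErr);
```

With three clusters there are only six permutations, so the exhaustive search is cheap. For many clusters the number of permutations grows factorially and a different matching strategy would be needed.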

See Also
cluster | fitgmdist | gmdistribution

Related Examples

Cluster Data from Mixture of Gaussian Distributions

Cluster Gaussian Mixture Data Using Soft Clustering

More About
Gaussian Mixture Models

Clustering Using Gaussian Mixture Models
