Assignment 2 Final Report: Random Search Optimization and Meta Learning
University of Bedfordshire – Luton Campus
Undergraduate Year 2, Computer Science
Student: Diana Maria Frunza
Student ID: 1810587
Assignment 2: Random Search Optimization and Meta Learning
CIS006-2 - Concepts and Technologies of Artificial Intelligence
Table of Contents
INTRODUCTION
Facial recognition
Learning outcomes
Task
GETTING STARTED
FIRST TRY
EXPERIMENTS AND TESTS
FOLD 1 (fi = 1)
FOLD 2 (fi = 2)
FOLD 3 (fi = 3)
CONCLUSIONS
APPENDICES
INTRODUCTION
Facial recognition
Used mostly for security purposes and crime prevention, biometric facial recognition has helped individuals across the world keep their belongings safe, whether physical or digital: from expensive jewellery and goods to pictures and sensitive data on their devices. Passwords can be forgotten and access to recovery methods can be lost, but a person's facial features offer a lasting means of safekeeping.
Learning outcomes
The main outcomes of this assignment relate to a student's ability to solve a demanding problem with minimal investment, to justify choices and solutions with rational, well-founded arguments, and, last but not least, to analyse and compare the results and performance of an artefact.
Task
Optimising the structure and parameters of Artificial Neural Networks (ANNs) using one or more strategies is the main task of this assignment. The goal is to reach a performance score higher than the one achieved in the previous assignment (biometric facial recognition). This will be particularly challenging, considering that for the last assignment I managed to reach a performance score of 99%.
The available strategies are: Random Search, Meta Learning, Adaptive Boosting, and Cascade Correlation. Considering the current chaotic environment due to the COVID-19 pandemic, alternative topics were also offered. For this assignment, I chose to continue the work started in Assignment 1, and the optimisation strategy I chose to develop further is Random Search.
Most of the references used during the research phase of this assignment are online tutorials rather than books, owing to the fast-changing development of Artificial Intelligence and Machine Learning technologies and of the methods used to enhance the effectiveness of the final artefact.
MATLAB was used to run the scripts needed to experiment with the performance of the chosen strategy, but a few challenges had to be overcome before the actual experiments could start. Downloading the software was an easy step, as it is available free of charge at https://uk.mathworks.com/downloads/, but it required registering with the university's email address. At first I installed only the main MATLAB package, which proved insufficient as soon as I started running the code, so I ended up also installing the following add-ons: the Deep Learning Toolbox and the Statistics and Machine Learning Toolbox.
GETTING STARTED
In order to start experimenting, I had to create a MATLAB project, into which I inserted the default scripts provided. To be able to run these scripts, I had to set the pt variable's value to the actual path (an absolute path – in the same folder as the project and the script files) of the folder containing 1500 images of subjects for the facial recognition process.
There are 3 files containing the scripts, suggestively named call_im.m, call_fold_data.m and call_pw_annealing_main.m. In MATLAB's Command Window we have to use the following commands in this order:
>> call_im(); - reads the 1500 pictures stored in the Tr0 folder, saves the face data to im.mat and prepares it for the next step, 3-fold cross-validation.
>> D1=call_fold_data(); - computes the fold data and makes the variable D1 appear in MATLAB's Workspace.
>> call_pw_annealing_main(D1,fi); - runs the final script with the D1 data and a value which has to be set for the variable fi. The recommended value is 3, but it can also be 2 or 1.
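Put together, a complete Command Window session therefore looks like this (assuming the image folder path in call_im.m has already been set, as shown in the Appendices):

>> call_im();
>> D1 = call_fold_data();
>> fi = 3; % fold index: 3 (recommended), 2 or 1
>> call_pw_annealing_main(D1, fi);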
The variables which will be modified and analysed to find the optimal combination for a higher performance are listed below, with their default values; a sketch of the random-search idea applied to them follows the list:
nohn = 1; % nof hidden neurons
esize = 3; % ensemble size
nofcp = 80; % prior nof pc
minpc = 20; % minimal nof pc
av = 2; % annealing variance
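To make the strategy concrete, the following is a minimal sketch of random search over these five variables. It assumes a hypothetical helper run_trial(nohn,esize,nofcp,minpc,av,fi) that runs the provided scripts with the given settings and returns the performance score pf; the sampling ranges are only illustrative, based on the values tried in this report.

best_pf = 0;
for t = 1:30 % number of random trials
nohn = randi([1 5]); % nof hidden neurons
esize = randi([2 12]); % ensemble size
nofcp = randi([20 140]); % prior nof pc
minpc = randi([1 20]); % minimal nof pc
av = randi([2 15]); % annealing variance
pf = run_trial(nohn, esize, nofcp, minpc, av, 3);
if pf > best_pf % keep the best combination so far
best_pf = pf;
best = [nohn esize nofcp minpc av];
end
end
fprintf('best pf = %5.3f\n', best_pf);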
FIRST TRY
The first successful run of the scripts revealed a performance score of 0.802.
Validation error reached a maximum of 0.2121.
This was done using the default values and the recommended fi value of 3. The processing
time was considerably long, 2 minutes and 5 seconds.
EXPERIMENTS AND TESTS
The recommended value for the variable fi was 3, so the initial test was done using this value. Before starting the trials on the nohn, esize, nofcp, minpc and av variables, I will run the scripts once to see the results for fi values of 1 and 2 as well.
FOLD 1 (fi = 1)
1. With default set of nohn, esize, nofcp, minpc and av
Performance score pf = 0.82
Max validation error = 0.2059
Execution time = 3 minutes and 24 seconds
2. Nohn of 4 vs initial value 1
Performance score pf = 0.832
Max validation error = 0.1818
Execution time = 3 minutes and 20 seconds
3. Nohn of 3 vs initial value 1
Performance score pf = 0.858
Max validation error = 0.1515
Execution time = 3 minutes and 2 seconds
4. Nohn of 2 vs initial value 1
Performance score pf = 0.832
Max validation error = 0.1515
Execution time = 3 minutes and 17 seconds
Of the previous 4 tests, the best performance was reached using the value 3 for the nohn variable, so that is the value we will use for the rest of the FOLD 1 trials.
7. Esize of 6 vs initial value 3
Performance score pf = 0.844
Max validation error = 0.1818
Execution time = 8 minutes and 16 seconds
8. Esize of 4 vs initial value 3
Performance score pf = 0.843
Max validation error = 0.1818
Execution time = 5 minutes and 41 seconds
9. Esize of 2 vs initial value 3
Performance score pf = 0.852
Max validation error = 0.1818
Execution time = 2 minutes and 51 seconds
After 5 experiments with different values for the esize variable, we observed that the performance score gets higher as the value of the variable gets higher; however, so does the processing time. For the following tests we will therefore keep esize at a small value, say 2, and once all the other trials are done we will run the scripts one more time with an esize of 12 or even more (a sketch of how this trade-off can be measured follows).
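As a minimal sketch of how this performance/time trade-off can be measured, the loop below times a run for several esize values with MATLAB's tic/toc, again using the hypothetical run_trial helper from the earlier sketch:

for esize = [2 4 6 8 12]
tic; % start the timer
pf = run_trial(3, esize, 80, 20, 2, 1); % nohn=3, remaining values at their defaults, fold 1
t = toc; % elapsed seconds
fprintf('esize=%2i pf=%5.3f time=%6.1f s\n', esize, pf, t);
end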
Nofcp of 140 vs initial value 80
Performance score pf = 0.88
Max validation error = 0.1818
Execution time = 4 minutes and 33 seconds
Increasing the value of the nofcp variable revealed a performance increase as well: for a nofcp value of 20 the pf was 75.6%, while for a nofcp value of 140 the pf was 88%. We will continue the rest of the tests using a nofcp value of 140.
After testing 5 different values of the minpc variable, I observed that the highest performance is reached using a minimal value for minpc, therefore the value chosen for the upcoming trials is 1.
The performance scores obtained with different values of the av variable do not seem to depend on it; they appear to be random. I will therefore choose for av the value which resulted in the highest performance score, an av of 15. Up to this point, the 'ideal' values for the variables we work with are:
Nohn = 3; Esize = 12; Nofcp = 140; Minpc = 1; Av = 15
The trial will be run once again with these values, since we have used a small value for esize to minimise the time invested in the testing part.
Performance score pf = 0.908
Max validation error = 0.1818
Execution time = 31 minutes and 7 seconds
FOLD 2 (fi = 2)
24. With default set of nohn, esize, nofcp, minpc and av
Performance score pf = 0.828
Max validation error = 0.2353
Execution time = 3 minutes and 5 seconds
25. With the chosen values which had a high performance score on FOLD 1
Performance score pf = 0.908
Max validation error = 0.1471
Execution time = 40 minutes and 15 seconds
FOLD 3 (fi = 3)
26. With default set of nohn, esize, nofcp, minpc and av
Performance score pf = 0.802
Max validation error = 0.2121
Execution time = 2 minutes and 55 seconds
27. With the chosen values which had a high performance score on FOLD 1
Performance score pf = 0.888
Max validation error = 0.2121
Execution time = 38 minutes and 35 seconds
28. Using the values giving the best performance so far, but increasing esize to 20
nohn=3;
esize=20;
nofcp=100;
minpc=0.1;
av=15;
The best performance was achieved with the 29th try: a pf of 0.92. However, the run time was 63 minutes and 50 seconds, which is enormous, and a performance of 92% is not ideal; therefore, I will keep on testing different combinations.
Increasing or decreasing one value and getting a better performance score does not necessarily mean that, in combination with another variable change, it will also yield an improvement, which is why promising combinations have to be re-tested together.
29. The final values used were the following, with a fi value of 3:
nohn=3; % nof hidden neurons
esize=7; % ensemble size
nofcp=100; % prior nof pc
minpc=0.1; % min nof pc
av=15; % annealing variance
After a few more hours of running the scripts and experimenting with the variables' values, I reached a historical maximum performance score of 0.958.
30. Increasing the esize value to 20 after the last try – for research purposes only
I am experimenting one final time with a higher value for the esize variable, just to prove that it has an enormous influence on the performance score, but I will still choose as the final combination the one previously mentioned, because the running time is too long with a higher esize value. For an esize of 20, the performance score reached a historical maximum of 0.984.
CONCLUSIONS
Optimising the performance score of the facial recognition scripts which were provided was incredibly challenging and, above all, time consuming, because I used high values for the esize variable. The quality of the hardware also influences the execution time; I used a laptop with an i7 processor and 12 GB of RAM. The execution times were measured using MATLAB's Profiler, as sketched below. The final and most efficient version from a time/performance point of view was the one from the 29th trial (performance score of 0.958). A higher performance score (0.984) can easily be reached with a higher esize, but the execution time explodes.
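For reference, this is roughly how a run can be timed with the Profiler (tic/toc gives a simpler wall-clock measurement):

profile on % start collecting timing data
call_pw_annealing_main(D1, 3); % the run to be measured
profile viewer % open the per-function timing report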
APPENDICES
Scripts:
call_im.m
function call_im() % 9/03
% Variable pt defines the path to the image set
%
n = 32; % image size
m = 32;
npic = 50; % nof pictures per subject
pt='D:\UNIVERSITY - YEAR 2\Term 2\Concepts and Technologies of Artificial Intelligence\Assignment 2\Assignment2_Matlab_Project\Tr0\Tr0'; % set the path to the image set
files = dir(pt);
files([1,2]) = []; % remove the '.' and '..' directory entries
np = length(files);
noc = np/npic; % 30 persons
target1 = repmat(1:noc,npic,1);
target = reshape(target1,1,[]); % labels of 30 persons
data = zeros(n*m, np);
for i = 1:np
fn=[pt '/' files(i).name];
I = imread(fn);
I = imresize(I, [n, m]);
Id = double(I);
Id = reshape(Id,[],1);
data(:,i) = Id;
end
save im data target noc
return
call_fold_data.m
function D1=call_fold_data()
%
% Calculation of PCA for train and validation data
% for a given number (nf) of cross-validation folds
% and a given PCA cutoff (thresh) [see MATLAB PCA]
% Output D1 contains the results
%
nf = 3; % nof cross-validation folds
thresh = 1e-5; % a given PCA cutoff
load im.mat data target noc;
[X,n] = call_norm_brightness(data);
T = target;
D = repmat(struct('PC1',[],'T1',[],...
'PC2',[],'T2',[]),nf,1);
for i = 1:nf
fprintf('.fold=%i \n',i)
I2 = i:nf:n; % validation image indexes
I1 = 1:n;
I1(I2) = []; % train image indexes
X1 = X(:,I1);
T1 = T(I1);
X2 = X(:,I2);
T2 = T(I2);
% PCA for training
[X1p,Coef] = processpca(X1, thresh);
[X1p,PS] = mapstd(X1p);
% PCA for validation
X2p = processpca('apply',X2,Coef);
X2p = mapstd('apply',X2p,PS);
D(i) = struct('PC1',X1p,'T1',T1,'PC2',X2p,'T2',T2);
end
D1 = struct('D',D,'noc',noc);
return
call_pw_annealing_main.m (excerpt)
function call_print(pf,E2,NE,fi)
global Cb
fprintf('.fold=%i, Pf=%5.3f\n',fi,pf)
figure(1)
plot(E2)
title(sprintf('Validation errors on fold %i \n',fi))
grid
xlabel('Pairwise ANNs')
ylabel('Error')
fprintf('.ANNs with largest validation error:\n')
[Me,Ie]=sort(E2,'descend');
for i=1:20
ci=Ie(i);
fprintf(' %2i: %3i (%2i/%2i) %5.3f,acpr=%3.1f,nopc=%3.0f \n',...
i,ci,Cb(ci,:),Me(i),NE(ci).accr,NE(ci).nopc)
end
return
function [Y2,E2,NE1]=call_bin_ANN_train(D1)
% Y2 are PWANN outcomes on validation set
% E2 are validation errors
% NE1 are PWANN
% NES are settings
global nobc noc nohn esize Av minpc nofcp
nohn=3; % nof hidden neurons
esize=7; % ensemble size
nofcp=100; % prior nof pc
minpc=0.1; % min nof pc
av=15; % annealing variance
NES=struct('nohn',nohn,'esize',esize,'nofcp',nofcp,'minpc',minpc,'av',av);
Av1=av*(1:5); % Annealing variance of pc
Av=[-Av1, Av1];
X1=D1.PC1;
T1=D1.T1;
X2=D1.PC2;
T2=D1.T2;
n2 = size(X2,2);
Y2=zeros(n2,nobc); % outcomes of ANNs on validation set
E2=zeros(nobc,1); % train error of binary ANNs
Ac=zeros(nobc,1); % acceptance rates
NE1=repmat(struct('NE',{},'accr',0,'nopc',0),nobc,1);
warning ('off','NNET:Obsolete');
ic=0;
for i1=1:noc-1
I1=T1==i1; % mask of i1-th person images
n1 = sum(I1);
for i2=i1+1:noc
I2=T1==i2; % mask of i2-th person images
n2=sum(I2);
T=[ones(1,n1) -1*ones(1,n2)]; % targets for binary ANN
X=[X1(:,I1) X1(:,I2)];
[NE,accr,nopc]=call_tr_ANN_ens(X,T);
ic=ic+1;
Ac(ic)=accr;
Y2(:,ic)=call_ts_ANN_ens(NE,X2);
% val_error:
J1=T2==i1;
J2=T2==i2;
v1=sum(J1);
v2=sum(J2);
Tv=[ones(1,v1), -1*ones(1,v2)];
Yv=call_ts_ANN_ens(NE,[X2(:,J1) X2(:,J2)]);
E2(ic)=mean(sign(Yv)~=Tv');
NE1(ic)=struct('NE',NE,'accr',accr,'nopc',nopc);
end
end
return
function [NE,acr,nopc]=call_tr_ANN_ens(X,T)
% train ANN ensemble
global nohn esize Av minpc nofcp
lenAv=length(Av);
Ac=zeros(esize,2); % acceptance
NE=repmat(struct('net',{},'nofin',0),esize,1);
maxpc=size(X,1); % max nof pc
Nc=zeros(esize,1);
nofc=nofcp; % prior on nof pc
lik=-Inf;
for i=1:esize
v1=nofc+Av(randi(lenAv)); % proposed nof pc
v1=min(v1,maxpc);
v1=max(v1,minpc);
V=1:v1;
net=newff(minmax(X(V,:)),[nohn 1],{'tansig' 'tansig'},'trainlm');
net.trainParam.show = NaN;
net.trainParam.epochs = 50;
net.trainParam.showWindow = false;
net=train(net,X(V,:),T);
Y=sim(net,X(V,:));
lik1=call_lik(Y,T);
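% Metropolis-style acceptance: the proposed nof pc is accepted with probability min(1,r)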
r=exp(lik1-lik);
if rand < r % accept proposal
ac=1;
lik=lik1;
nofc=v1;
else
ac=0;
lik=lik1;
end
Nc(i)=nofc;
NE(i)=struct('net',net,'nofin',nofc);
Ac(i,:)=[ac,lik];
end
I=Ac(:,1)==0; % indexes of rejected ann
NE(I)=[];
acr=mean(Ac(:,1));
nopc=mean(Nc(~I));
return
function lik=call_lik(Y0,T0)
% Likelihood (-Inf, 0) of ANN, Netlab book page 125 Eq 4.16, t={0,1}
% Y0 in (-1,1) are ANN outcomes, T0={1,-1} are the labels of classes 1 and 2
% lik = sum(log(1-y)) over class -1 plus sum(log(y)) over class +1, with y=(Y0+1)/2
I=T0==-1; % indexes of images of second class
Y=(Y0+1)/2;
lik0=sum(log(1-Y(I)));
lik1=sum(log(Y(~I)));
lik=lik0+lik1;
return
function Ye=call_ts_ANN_ens(NE,X)
% test ANN ensemble on X for a given # components
% Ye is ensemble output
ntest=size(X,2);
sizeNE=size(NE,2);
Y=zeros(ntest,sizeNE);
for i=1:sizeNE
net=NE(i).net;
V=1:NE(i).nofin;
Y(:,i)=sim(net,X(V,:));
end
Ye=single(sum(Y,2));
return
function Yi=call_pw_ANN(Yb)
% output layer for PWANNs
% Yi are predicted classes
global Cb noc
n2 = size(Yb,1); % nof validation images
Y = zeros(n2,noc); % outcomes
for c = 1:noc
I1 = Cb(:,1) == c; % +1 outputs
I2 = Cb(:,2) == c; % -1 outputs
B1 = sum(Yb(:,I1),2);
B2 = sum(Yb(:,I2),2);
Y(:,c) = B1 - B2;
end
[~,Yi]=max(Y,[],2);
return