Assign 3 Datamining
Based on the data and code given in Example_1, find the multivariate multi-output regression
coefficients (weights) for the 'insects' and 'weather' data. Convert all categorical variables to
one-hot encoded multivariate data and perform the required regression. This is a coding
assignment. Report the training accuracy.
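The one-hot conversion asked for above can be sketched as follows. This is a minimal sketch using base MATLAB only; the column `outlook` and its values are hypothetical and are not taken from the assignment data. The comparison relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Hypothetical categorical column (not taken from the assignment data)
outlook = {'sunny'; 'rainy'; 'overcast'; 'sunny'};

% unique returns the sorted category names and, for each row,
% the position of that row's category in the sorted list
[categories, ~, idx] = unique(outlook);

% Compare each row's category index with 1..K to get an N-by-K 0/1 matrix
% (implicit expansion, assumed MATLAB R2016b or later)
onehot = double(idx == (1:numel(categories)));

disp(categories')   % column order of the one-hot matrix
disp(onehot)
```

Each categorical column can be encoded this way and the resulting blocks concatenated side by side with any numeric columns to form the regression matrix.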
Example_1.
Data mining: seek regularity/pattern in data
1. For all patients whose 'BP' is 'High', Drug 'A' is assigned.
2. For all patients whose 'BP' is 'Low', Drug 'B' is assigned.
3. For patients whose 'BP' is 'Normal', both Drug 'A' and Drug 'B' are assigned, but
a) if 'BP' is 'Normal' and 'Age' is less than 40, Drug 'A' is assigned;
b) if 'BP' is 'Normal' and 'Age' is greater than 40, Drug 'B' is assigned.
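The rules above can be sketched as a small MATLAB script. The patient values are hypothetical, and treating an age of exactly 40 as Drug 'B' is an assumption, since the rules do not cover that case.

```matlab
% Hypothetical patients (not from the example data)
bp  = {'High', 'Low', 'Normal', 'Normal'};
age = [50 30 35 62];

drug = cell(size(bp));
for k = 1:numel(bp)
    switch bp{k}
        case 'High'
            drug{k} = 'A';          % rule 1
        case 'Low'
            drug{k} = 'B';          % rule 2
        case 'Normal'
            if age(k) < 40
                drug{k} = 'A';      % rule 3a
            else
                drug{k} = 'B';      % rule 3b (assumption: age == 40 also falls here)
            end
    end
end

disp(drug)
```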
Replacing decision trees with multivariate multi-output regression
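One way to state this idea, using the notation of the example code (feature matrix A, one-hot target matrix y written here as Y, weights w written here as W): the weights are a least-squares solution obtained with the pseudoinverse, and the predicted class of a sample is the output column with the largest score. The symbols N (number of samples), d (number of features), and c (number of classes) are introduced here only for illustration.

```latex
A \in \mathbb{R}^{N \times d}, \quad Y \in \mathbb{R}^{N \times c}, \qquad
W = A^{+} Y \in \arg\min_{W} \lVert A W - Y \rVert_F^{2}, \qquad
\hat{c}_i = \arg\max_{j \in \{1,\dots,c\}} (A W)_{ij}
```

When A does not have full column rank, A^{+} Y is the minimum-norm solution among all least-squares minimizers.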
A = [1 0 20 1 0 0;
1 0 20 1 0 0;
y = [1 0; 0 1; 1 0; 0 1; 1 0; 1 0; 0 1; 0 1; 0 1; 1 0; 0 1; 1 0];
% yd = target variable expressed as the position (column) of the 1 in each row of y.
% This makes comparison with the predicted output easy.
yd= [1 2 1 2 1 1 2 2 2 1 2 1];
x=A(:,3);
xmin=15;
xmax=80;
x=(x-xmin)./xmax; % scale age into [0, 1]; dividing by xmax (not xmax-xmin) maps ages 15..80 to 0..0.8125
A(:,3)=x;
% learned weights = regression coefficients; one column per output class
w=pinv(A)*y
w = 6×2
0.4451 0.0549
0.3445 0.1555
-1.5679 1.5679
0.8353 -0.3353
0.1508 -0.1508
-0.0457 0.5457
D=A*pinv(A)*y; % predicted scores for each class (same as A*w)
[val,index]=max(D'); % max operates column-wise, hence the transpose
% check whether the model is working or not
% 'index' gives the predicted class label
index
index = 1×12
1 2 1 2 1 1 2 2 2 1 2 1
yd
yd = 1×12
1 2 1 2 1 1 2 2 2 1 2 1
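Since the assignment asks for the training accuracy, the comparison of `index` and `yd` above can be turned into a single number. A minimal sketch, with both vectors copied from the output above so that it runs on its own; the variable name `accuracy` is introduced here for illustration:

```matlab
% Values copied from the example output
index = [1 2 1 2 1 1 2 2 2 1 2 1];   % predicted class labels
yd    = [1 2 1 2 1 1 2 2 2 1 2 1];   % target class labels

% Fraction of training samples whose predicted label matches the target
accuracy = mean(index == yd);
fprintf('Training accuracy: %.2f%%\n', 100*accuracy);
```

For the example data the two vectors agree in every position, so the reported training accuracy is 100%.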
Data on Insects
Weather data
Dataset for Practice
https://waikato.github.io/weka-wiki/datasets/
• A gzip'ed tar containing ordinal, real-world datasets donated by Professor Arie Ben David (datasets-arie_ben_david.tar.gz, 11,348 Bytes)
• A zip file containing 19 multi-class (1-of-n) text datasets donated by Dr George Forman (19MclassTextWc.zip, 14,084,828 Bytes)
• A bzip'ed tar file containing the Reuters21578 dataset split into separate files according to the ModApte split (reuters21578-ModApte.tar.bz2, 81,745,032 Bytes)
• A zip file containing 41 drug design datasets formed using the Adriana.Code software donated by Dr Mehmet Fatih Amasyali (Drug-datasets.zip, 11,376,153 Bytes)
• A zip file containing 80 artificial datasets generated from the Friedman function donated by Dr. M. Fatih Amasyali (Yildiz Technical University) (Friedman-datasets.zip, 5,802,204 Bytes)
• A zip file containing a new, image-based version of the classic iris data, with 50 images for each of the three species of iris. The images have size 600x600. Please see the ARFF file for further information (iris_reloaded.zip, 92,267,000 Bytes). After expanding into a directory using your jar utility (or an archive program that handles tar-archives/zip files in case of the gzip'ed tars/zip files), these datasets may be used with Weka.
• Protein datasets made available by Associate Professor Shuiwang Ji when he was a PhD student at Louisiana State University.
• Kent Ridge Biomedical Data Set Repository, which was put together by Professor Jinyan Li and Dr Huiqing Liu while they were at the Institute for Infocomm Research, Singapore.
• Repository for Epitope Datasets (RED), maintained by Professor Yasser El-Manzalawy when he was at Iowa State University.
https://www.youtube.com/watch?time_continue=542&v=TF1yh5PKaqI&feature=emb_logo
https://in.mathworks.com/matlabcentral/fileexchange/33128-parallel-distributed-processing-of-weka-algorithms-in-matlab
https://in.mathworks.com/matlabcentral/fileexchange/58675-wekalab-bridging-weka-and-matlab?s_tid=FX_rc1_behav
https://forum.image.sc/t/running-trainable-weka-segmentation-from-matlab-using-imagej-matlab/3766/2
https://github.com/NicholasMcCarthy/wekalab
https://e-archivo.uc3m.es/rest/api/core/bitstreams/7e742952-cad2-4681-b0cc-cb86b14c9ae1/content
https://medium.com/@j622amilah/exploring-the-java-weka-machine-learning-library-48e842b88307
https://blogs.mathworks.com/pick/2017/11/20/getelevations/