
UNIVERSITY OF MUMBAI

Teacher’s Reference Manual


M. Sc (Information Technology)
(Choice Based Credit System with effect from the academic year
2019 – 2020)

PSIT1P1
Research in Computing
Practical
Table of Contents

Practical 1
  A. Write a program for obtaining descriptive statistics of data.
  B. Import data from different data sources (from Excel, CSV, MySQL, SQL Server, Oracle to R/Python/Excel).
Practical 2
  A. Design a survey form for a given case study, collect the primary data and analyse it.
  B. Perform suitable analysis of given secondary data.
Practical 3
  A. Perform testing of hypothesis using one-sample t-test.
  B. Perform testing of hypothesis using two-sample t-test.
  C. Perform testing of hypothesis using paired t-test.
Practical 4
  A. Perform testing of hypothesis using chi-squared goodness-of-fit test.
  B. Perform testing of hypothesis using chi-squared test of independence.
Practical 5
  Perform testing of hypothesis using Z-test.
Practical 6
  A. Perform testing of hypothesis using one-way ANOVA.
  B. Perform testing of hypothesis using two-way ANOVA.
  C. Perform testing of hypothesis using multivariate ANOVA (MANOVA).
Practical 7
  A. Perform random sampling for the given data and analyse it.
  B. Perform stratified sampling for the given data and analyse it.
Practical 8
  Compute different types of correlation.
Practical 9
  A. Perform linear regression for prediction.
  B. Perform polynomial regression for prediction.
Practical 10
  A. Perform multiple linear regression.
  B. Perform logistic regression.
List of supporting files
Practical 1:
A. Write a program for obtaining descriptive statistics of data.
################################################################
#Practical 1A: Write a python program on descriptive statistics analysis.
################################################################
import pandas as pd
#Create a Dictionary of series
d = {'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
#Create a DataFrame
df = pd.DataFrame(d)
print(df)
print('############ Sum ########## ')
print (df.sum())
print('############ Mean ########## ')
print (df.mean())
print('############ Standard Deviation ########## ')
print (df.std())
print('############ Descriptive Statistics ########## ')
print (df.describe())

Output:

M. Sc. [Information Technology] SEMESTER ~ I Teacher’s Reference Manual
Using Excel
Go to File Menu → Options → Add-Ins → Select Analysis ToolPak → Press OK


Select the data range from the excel worksheet.


Output:

B. Import data from different data sources (from Excel, csv,
mysql, sql server, oracle to R/Python/Excel)
SQLite:
################################################################
# -*- coding: utf-8 -*-
################################################################
import sqlite3 as sq
import pandas as pd
################################################################
Base='C:/VKHCG'
sDatabaseName=Base + '/01-Vermeulen/00-RawData/SQLite/vermeulen.db'
conn = sq.connect(sDatabaseName)
################################################################
sFileName='C:/VKHCG/01-Vermeulen/01-Retrieve/01-EDS/02-Python/Retrieve_IP_DATA.csv'
print('Loading :',sFileName)
IP_DATA_ALL_FIX=pd.read_csv(sFileName,header=0,low_memory=False)
IP_DATA_ALL_FIX.index.names = ['RowIDCSV']
sTable='IP_DATA_ALL'
print('Storing :',sDatabaseName,' Table:',sTable)
IP_DATA_ALL_FIX.to_sql(sTable, conn, if_exists="replace")
print('Loading :',sDatabaseName,' Table:',sTable)
TestData=pd.read_sql_query("select * from IP_DATA_ALL;", conn)
print('################')
print('## Data Values')
print('################')
print(TestData)
print('################')
print('## Data Profile')
print('################')
print('Rows :',TestData.shape[0])
print('Columns :',TestData.shape[1])
print('################')
print('### Done!! ############################################')

MySQL:
Open MySql
Create a database “DataScience”
Create a python file and add the following code:
################ Connection With MySQL ######################
import mysql.connector

conn = mysql.connector.connect(host='localhost',
                               database='DataScience',
                               user='root',
                               password='root')
if conn.is_connected():
    print('###### Connection With MySQL Established Successfully ##### ')
else:
    print('Not Connected -- Check Connection Properties')

Microsoft Excel
##################Retrieve-Country-Currency.py
################################################################
# -*- coding: utf-8 -*-
################################################################
import os
import pandas as pd
################################################################
Base='C:/VKHCG'
################################################################
sFileDir=Base + '/01-Vermeulen/01-Retrieve/01-EDS/02-Python'
#if not os.path.exists(sFileDir):
#os.makedirs(sFileDir)
################################################################
CurrencyRawData = pd.read_excel('C:/VKHCG/01-Vermeulen/00-RawData/Country_Currency.xlsx')
sColumns = ['Country or territory', 'Currency', 'ISO-4217']
CurrencyData = CurrencyRawData[sColumns]
CurrencyData.rename(columns={'Country or territory': 'Country', 'ISO-4217': 'CurrencyCode'}, inplace=True)
CurrencyData.dropna(subset=['Currency'], inplace=True)
CurrencyData['Country'] = CurrencyData['Country'].map(lambda x: x.strip())
CurrencyData['Currency'] = CurrencyData['Currency'].map(lambda x: x.strip())
CurrencyData['CurrencyCode'] = CurrencyData['CurrencyCode'].map(lambda x: x.strip())
print(CurrencyData)
print('~~~~~~ Data from Excel Sheet Retrieved Successfully ~~~~~~~ ')
################################################################

sFileName=sFileDir + '/Retrieve-Country-Currency.csv'
CurrencyData.to_csv(sFileName, index = False)
################################################################

OUTPUT:

Practical 2:
A. Design a survey form for a given case study, collect the
primary data and analyse it

Case 1:

A researcher wants to conduct a survey in colleges from Mumbai, Thane and Navi Mumbai on the use of ICT in higher education. The survey focuses on access to and use of ICT in teaching and learning, as well as on attitudes towards the use of ICT in teaching and learning.
Design a questionnaire addressed to teachers that seeks information about the target class, their experience using ICT for teaching, access to ICT infrastructure, support available, ICT-based activities and material used, obstacles to the use of ICT in teaching, learning activities with the target class, their skills and attitudes to ICT, and some personal background information.
Arrange the questions in the following groups:
1. Information about the target class you teach
2. Experience with ICT for teaching
3. ICT access for teaching
4. Support to teachers for ICT use
5. ICT based activities and material used for teaching
6. Obstacles to using ICT in teaching and learning
7. Learning activities with the target class
8. Teacher skills
9. Teacher opinions and attitudes
10. Personal background information

Case 2:
A research agency wants to study the perception of app-based taxi services in Mumbai, Thane and Navi Mumbai. The survey focuses on customers’ attitudes towards app-based taxi services as well as towards regular taxi cabs.
Design a questionnaire that seeks information about the target taxi service, the customer’s experience using taxi services, access, support available, obstacles and some personal background information, with the following objectives:
1. To find out the customer satisfaction towards app-based taxi services.
2. To find the level of convenience and comfort with app-based taxi services.
3. To know their opinion about the tariff system and promptness of service.

4. To ascertain the customer view towards the driver behaviour and courtesy.
5. To provide inputs to enhance the services to delight the customers.
6. To examine relationship between service quality factors and taxi
passenger satisfaction.
7. To suggest better regulations for transportation authorities regarding
customer protection and effective monitoring of taxi services.
Case 3:
A popular electronic store want to conduct a survey to develop awareness of branded
laptop baseline estimates and determine popularity of different company’s laptop. It
suggests steps to be initiated or strengthened in the field of demand in a region. The key
indicators are among the general population, demand branded laptop and the problem
users.

The objectives of this particular study are:-

1. To know the preferences of different types of branded laptops by students and


professionals.
2. To study which factor influence for choosing different types of branded
laptops.
3. To know about the level of satisfaction towards different types of branded
laptops.
4. To identify the perception of consumers towards the laptop positioning
strategy.
5. To know the consumer preference towards laptop in the present era.

Use the collected data for analysis.

B. Perform analysis of given secondary data.
Steps in Secondary Data Analysis
1. Determine your research question – Knowing exactly what you are looking for.
2. Locating data – Knowing what is out there and whether you can gain access to it.
A quick Internet search, possibly with the help of a librarian, will reveal a wealth
of options.
3. Evaluating relevance of the data – Considering things like the data’s original
purpose, when it was collected, population, sampling strategy/sample, data
collection protocols, operationalization of concepts, questions asked, and
form/shape of the data.
4. Assessing credibility of the data – Establishing the credentials of the original
researchers, searching for full explication of methods including any problems
encountered, determining how consistent the data is with data from other sources,
and discovering whether the data has been used in any credible published research.
5. Analysis – This will generally involve a range of statistical processes.

Example: Analyze the given Population Census Data for Planning and Decision
Making by using the size and composition of populations.

Put the cursor in cell B22, click AutoSum, and press Enter. This will calculate the total population. Then copy the formula across row 22 through cell D22.

To calculate the percent of males in cell E4, enter the formula =-1*100*B4/$D$22 (the factor -1 makes the male bars extend to the left of the pyramid). Copy the formula in cell E4 down to cell E21.

To calculate the percent of females in cell F4, enter the formula =100*C4/$D$22.
Copy the formula in cell F4 down to cell F21.

To build the population pyramid, we need to choose a horizontal bar chart with two
series of data (% male and % female) and the age labels in column A as the Category
X-axis labels. Highlight the range A3:A21, hold down the CTRL key and highlight the
range E3:F21
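The same percentage columns can be computed in pandas before charting. The sketch below uses a small hypothetical age table — the real worksheet’s column names and row ranges may differ:

```python
import pandas as pd

# Hypothetical census data: one row per age group (the real sheet has rows 4-21)
census = pd.DataFrame({
    'AgeGroup': ['0-4', '5-9', '10-14'],
    'Male':     [5200, 4800, 4500],
    'Female':   [5000, 4700, 4600],
})

# Grand total of the population, the equivalent of cell $D$22
total = census['Male'].sum() + census['Female'].sum()

# Mirror the Excel formulas: male percentages are negated so their
# bars extend to the left of the pyramid (=-1*100*B4/$D$22)
census['PctMale'] = -100 * census['Male'] / total
census['PctFemale'] = 100 * census['Female'] / total
print(census)
```

The two percentage columns can then be plotted as a horizontal bar chart, just as in the Excel steps above.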


Under the Insert tab, under horizontal bar charts, select Clustered Bar chart.

Put the tip of your mouse arrow on the Y-axis (vertical axis) so it says “Category Axis”, right-click and choose Format Axis.

Choose Axis options tab and set the major and minor tick mark type to None, Axis
labels to Low, and click OK.

Click on any of the bars in your pyramid, right-click and select “Format Data Series”. Set the Overlap to 100 and the Gap Width to 0. Click OK.

Practical 3:
A. Perform testing of hypothesis using one sample t-test.
One-sample t-test: The one-sample t-test determines whether the sample mean is statistically different from a known or hypothesised population mean. The one-sample t-test is a parametric test.
Program Code:
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 16 18:01:46 2019
@author: Ahtesham Shaikh
"""
from scipy.stats import ttest_1samp
import numpy as np

ages = np.genfromtxt('ages.csv')
print(ages)
ages_mean = np.mean(ages)
print(ages_mean)
tset, pval = ttest_1samp(ages, 30)
print('p-values - ', pval)

if pval < 0.05:  # alpha value is 0.05
    print("we are rejecting the null hypothesis")
else:
    print("we are accepting the null hypothesis")

Output:

B. Write a program for t-test comparing two means for
independent samples.
The t distribution provides a good way to perform one sample tests on the mean when
the population variance is not known provided the population is normal or the sample
is sufficiently large so that the Central Limit Theorem applies.

Two Sample t Test


Example: A college Principal informed classroom teachers that some of their students showed unusual potential for intellectual gains. One month later, the students identified to teachers as having potential for unusual intellectual gains showed significantly greater gains in performance on a test said to measure IQ than did students who were not so identified. Below are the data for the students:
Experimental Comparison
35 2
40 27
12 38
15 31
21 1
14 19
46 1
10 34
28 3
48 1
16 2
30 3
32 2
48 1
31 2
22 1
12 3
39 29
19 37
25 2
Mean 27.15 11.95
Sd 12.51 14.61

Experimental Data
To calculate the mean, go to cell A22 and type =SUM(A2:A21)/20
To calculate the standard deviation, go to cell A23 and type =STDEV(A2:A21)

Comparison Data
To calculate the mean, go to cell B22 and type =SUM(B2:B21)/20

To calculate the standard deviation, go to cell B23 and type =STDEV(B2:B21)

To find the t-test statistics, go to Data → Data Analysis.

To calculate the t statistic, go to cell E20 and type

=(A22-B22)/SQRT((A23*A23)/COUNT(A2:A21)+(B23*B23)/COUNT(B2:B21))

Now go to cell E21 and type

=IF(E20<E12,"H0 is Accepted", "H0 is Rejected and H1 is Accepted")

Our calculated value is larger than the tabled value at alpha = .01, so we reject the null hypothesis and accept the alternative hypothesis, namely, that the difference in gain scores is likely the result of the experimental treatment and not the result of chance variation.
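As a cross-check, the same comparison can be run in Python with SciPy’s built-in independent-samples test, using the gain scores from the table above (a sketch; `ttest_ind` reports a two-sided p-value):

```python
from scipy import stats

# Gain scores from the table above
experimental = [35, 40, 12, 15, 21, 14, 46, 10, 28, 48,
                16, 30, 32, 48, 31, 22, 12, 39, 19, 25]
comparison = [2, 27, 38, 31, 1, 19, 1, 34, 3, 1,
              2, 3, 2, 1, 2, 1, 3, 29, 37, 2]

# Independent two-sample t-test (pooled variance, two-sided p-value)
t_stat, p_val = stats.ttest_ind(experimental, comparison)
print('t =', t_stat, 'p =', p_val)
if p_val < 0.01:
    print('Reject H0 at alpha = .01: the gain-score difference is significant')
else:
    print('Fail to reject H0')
```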

Output:

Using Python

import numpy as np
from scipy import stats
from numpy.random import randn

N = 100  # sample size of each group
#a = [35,40,12,15,21,14,46,10,28,48,16,30,32,48,31,22,12,39,19,25]
#b = [2,27,31,38,1,19,1,34,3,1,2,1,3,1,2,1,3,29,37,2]
a = 5 * randn(N) + 50
b = 5 * randn(N) + 51
var_a = a.var(ddof=1)
var_b = b.var(ddof=1)

# pooled standard deviation (equal sample sizes)
s = np.sqrt((var_a + var_b)/2)
t = (a.mean() - b.mean())/(s*np.sqrt(2/N))

df = 2*N - 2
# two-sided p-value from the t distribution
p = 2 * (1 - stats.t.cdf(abs(t), df=df))

print("t = " + str(t))
print("p = " + str(p))
if p < 0.05:
    print('Means of the two distributions are different and significant')
else:
    print('Means of the two distributions are the same and not significant')

Output:

Practice Questions:

Example 1: We have to test whether the height of men in the population is different from the height of women in general. So we take a sample from the population and use the t-test to see if the result is significant.

H0 – Heights of men and women are the same

H1 – Heights of men and women are different


Men Women
181 160
169 150
160 160
170 175
175 160
158 170
152 160
172 150
160 155
175 162
180 165
170 148
165 159
180 163
155 170
159 178
163 180
171 156
182 164
150 167

Example 2: Design a survey form to get the grades of students who have passed B. Sc. IT and B. Sc. CS from the same University. Perform a t-test to test the given hypothesis:
H0 – Scores of students in the two courses are the same.
H1 – Scores of students are different.
Example 3: Collect sample data on the use of online food ordering apps to compare whether the usage is equal or different.

C. Perform testing of hypothesis using paired t-test.
The paired sample t-test is also called the dependent sample t-test. It is a univariate test that tests for a significant difference between two related variables. An example of this is if you were to collect the blood pressure of an individual before and after some treatment, condition, or time point. The data set contains blood pressure readings before and after an intervention. These are the variables “bp_before” and “bp_after”.

The hypothesis being tested is:

• H0 - The mean difference between sample 1 and sample 2 is equal to 0.

• H1 - The mean difference between sample 1 and sample 2 is not equal to 0.

Program Code:
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 16 19:49:23 2019

@author: MyHome
"""

from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("blood_pressure.csv")
print(df[['bp_before','bp_after']].describe())

# First let's check for any significant outliers in each of the variables.
df[['bp_before', 'bp_after']].plot(kind='box')
# This saves the plot as a png file
plt.savefig('boxplot_outliers.png')

# Make a histogram of the differences between the two scores.
df['bp_difference'] = df['bp_before'] - df['bp_after']
df['bp_difference'].plot(kind='hist', title='Blood Pressure Difference Histogram')
# Again, this saves the plot as a png file
plt.savefig('blood pressure difference histogram.png')
stats.probplot(df['bp_difference'], plot= plt)
plt.title('Blood pressure Difference Q-Q Plot')
plt.savefig('blood pressure difference qq plot.png')
# Shapiro-Wilk normality test on the differences, then the paired t-test
print(stats.shapiro(df['bp_difference']))
print(stats.ttest_rel(df['bp_before'], df['bp_after']))

Output:

A paired sample t-test was used to analyze the blood pressure before and after the intervention, to test if the intervention had a significant effect on the blood pressure. The blood pressure before the intervention was higher (156.45 ± 11.39 units) compared to the blood pressure post intervention (151.36 ± 14.18 units); there was a statistically significant decrease in blood pressure (t(119) = 3.34, p = 0.0011) of 5.09 units.

Practical 4:
A. Perform testing of hypothesis using chi-squared goodness-of-fit test.
Problem
A system administrator needs to upgrade the computers for his division. He wants to know what sort of computer system his workers prefer. He gives three choices: Windows, Mac, or Linux. Test the hypothesis that an equal percentage of the population prefers each type of computer system.
System O Ei
Windows 20 33.33%
Mac 60 33.33%
Linux 20 33.33%
H0 : The population distribution of the variable is the same as the proposed distribution
HA : The distributions are different
To calculate the chi-squared value for Windows, go to cell D2 and type =((B2-C2)*(B2-C2))/C2
To calculate the chi-squared value for Mac, go to cell D3 and type =((B3-C3)*(B3-C3))/C3
To calculate the chi-squared value for Linux, go to cell D4 and type =((B4-C4)*(B4-C4))/C4

Go to cell D5 and type =SUM(D2:D4)

To get the table value for chi-square for α = 0.05 and dof = 2, go to cell D7 and type =CHIINV(0.05,2)
At cell D8 type =IF(D5>D7, "H0 Rejected","H0 Accepted")
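The same goodness-of-fit test can be run in Python; a sketch using `scipy.stats.chisquare` with the observed counts from the problem (expected counts are an equal split of the 100 workers):

```python
from scipy import stats

# Observed preferences: Windows, Mac, Linux
observed = [20, 60, 20]
# H0: equal preference, so each expected count is 100/3
expected = [100/3, 100/3, 100/3]

chi2, p = stats.chisquare(observed, f_exp=expected)
print('chi-squared =', chi2, 'p =', p)
if p < 0.05:
    print('H0 Rejected: the preferences are not equally distributed')
else:
    print('H0 Accepted')
```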

Output:

Practice Questions:

1. The Mobile Association of Mumbai conducted a survey in 2019 and determined that 60% of users have only one SIM card, 28% have two SIM cards and 12% have three or more. Suppose that you have decided to conduct your own survey and have collected data showing that, out of 129 mobile users, 73 had one SIM and 38 had two SIMs. Determine whether your data supports the results of the association’s study. (Use a significance level of 0.05.)
2. In a debate, Geeta told Pankaj that the reason her car insurance is less expensive is that metro city drivers get in more accidents than rural area drivers. According to her study, in metro cities drivers are held responsible in 65% of accidents. If Pankaj does some research of his own and discovers that 46 out of the 85 accidents he investigates involve rural area drivers, does his data support or refute Geeta’s hypothesis?
H0 – Metro city drivers are more responsible for accidents than rural area drivers.
O Ei
Rural Drivers 46 65%
Metro Drivers 39 35%
Total 85 100

B. Perform testing of hypothesis using chi-squared test of independence.
In a study to understand the performance of the M. Sc. IT Part-1 class, a college selects a random sample of 100 students. Each student was asked the grade obtained in B. Sc. IT. The sample is given below:
Sr. No  Roll No  Student's Name  Gen  Grade  |  Sr. No  Roll No  Student's Name  Gen  Grade
1 1 Gaborone m O 62 3 Maun f O
2 2 Francistown m O 63 7 Tete f O
3 5 Niamey m O 64 9 Chimoio f O
4 13 Maxixe m O 65 11 Pemba f O
5 16 Tema m O 66 14 Chibuto f O
6 17 Kumasi m O 67 25 Mampong f O
7 34 Blida m O 68 36 Tlemcen f O
8 35 Oran m O 69 40 Adrar f O
9 38 Saefda m O 70 41 Tindouf f O
10 42 Constantine m O 71 46 Skikda f O
11 43 Annaba m O 72 47 Ouargla f O
12 45 Bejaefa m O 73 10 Matola f D
13 48 Medea m O 74 20 Legon f D
14 49 Djelfa m O 75 21 Sunyani f D
15 50 Tipaza m O 76 72 Teenas f D
16 51 Bechar m O 77 73 Kouba f D
17 54 Mostaganem m O 78 75 HussenDey f D
18 55 Tiaret m O 79 77 Khenchela f D
19 56 Bouira m O 80 82 HassiBahbah f D
20 59 Tebessa m O 81 84 Baraki f D
21 61 El Harrach m O 82 91 Boudouaou f D
22 62 Mila m O 83 95 Tadjenanet f D
23 65 Fouka m O 84 4 Molepolole f C
24 66 El Eulma m O 85 8 Quelimane f C
25 68 SidiBel Abbes m O 86 23 Bolgatanga f C
26 69 Jijel m O 87 58 Mohammadia f C
27 70 Guelma m O 88 83 Merouana f C
28 85 Khemis El Khechna m O 89 24 Ashaiman f B
29 87 Bordj El Kiffan m O 90 76 N'gaous f B
30 88 Lakhdaria m O 91 90 Bab El Oued f B
31 6 Maputo m D 92 92 BordjMenael f B
32 12 Lichinga m D 93 93 Ksar El Boukhari f B
33 15 Ressano Garcia m D 94 74 Reghaa f A
34 19 Accra m D 95 78 Cheria f A
35 27 Wa m D 96 79 Mouzaa f A
36 28 Navrongo m D 97 80 Meskiana f A
37 37 Mascara m D 98 81 Miliana f A
38 44 Batna m D 99 94 Sig f A
39 57 El Biar m D 100 99 Kadiria f A
40 60 Boufarik m D
41 63 OuedRhiou m D
42 64 Souk Ahras m D
43 71 Dar El Befda m D
44 86 Birtouta m D
45 18 Takoradi m C
46 22 Cape Coast m C
47 29 Kwabeng m C
48 30 Algiers m C
49 31 Laghouat m C
50 39 Relizane m C
51 52 Setif m C
52 53 Biskra m C
53 67 Kolea m C
54 100 AefnFakroun m C
55 26 Nima m B
56 32 TiziOuzou m B
57 33 Chlef m B

58 89 M'sila m A
59 96 Heliopolis m A
60 97 Berrouaghia m A
61 98 Sougueur m A

Null Hypothesis - H0 : The performance of girl students is the same as that of boy students.

Alternate Hypothesis - H1 : The performance of boy and girl students is different.

Open the Excel workbook.

          O     A    B    C     D    Total
Girls    11     7    5    5    11     39    6.075
Boys     30     4    3   10    14     61    6.075
Total    41    11    8   15    25    100   12.150
Ei     20.5   5.5    4  7.5  12.5     50
Prepare a contingency table as shown above.
To calculate Girls Students with ‘O’ Grade
Go to Cell N6 and type =COUNTIF($J$2:$K$40,"O")

To calculate Girls Students with ‘A’ Grade


Go to Cell O6 and type =COUNTIF($J$2:$K$40,"A")

To calculate Girls Students with ‘B’ Grade


Go to Cell P6 and type =COUNTIF($J$2:$K$40,"B")

To calculate Girls Students with ‘C’ Grade


Go to Cell Q6 and type =COUNTIF($J$2:$K$40,"C")

To calculate Girls Students with ‘D’ Grade


Go to Cell R6 and type =COUNTIF($J$2:$K$40,"D")

To calculate Boys Students with ‘O’ Grade


Go to Cell N7 and type =COUNTIF($D$2:$E$62,"O")

To calculate Boys Students with ‘A’ Grade


Go to Cell O7 and type =COUNTIF($D$2:$E$62,"A")

To calculate Boys Students with ‘B’ Grade


Go to Cell P7 and type =COUNTIF($D$2:$E$62,"B")
To calculate Boys Students with ‘C’ Grade
Go to Cell Q7 and type =COUNTIF($D$2:$E$62,"C")


To calculate Boys Students with ‘D’ Grade


Go to Cell R7 and type =COUNTIF($D$2:$E$62,"D")

To calculated the expected value Ei


Go to Cell N9 and type =N8/2
Go to Cell O9 and type =O8/2
Go to Cell P9 and type =P8/2
Go to Cell Q9 and type =Q8/2
Go to Cell R9 and type =R8/2

Go to Cell S6 and calculate total girl students =SUM(N6:R6)

Go to Cell S7 and calculate total boy students =SUM(N7:R7)

Now Calculate
Go to cell T6 and type
=SUM((N6-$N$9)^2/$N$9,(O6-$O$9)^2/$O$9,(P6-$P$9)^2/$P$9,(Q6-Q$9)^2/$Q$9,
(R6-$R$9)^2/$R$9)
Go to cell T7 and type
=SUM((N7-$N$9)^2/$N$9,(O7-$O$9)^2/$O$9,(P7-$P$9)^2/$P$9,(Q7-Q$9)^2/$Q$9,
(R7-$R$9)^2/$R$9)
To get the table value go to cell T11 and type =CHIINV(0.05,4)
Go to cell O13 and type =IF(T8>=T11,"H0 is Rejected","H0 is Accepted")
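The worksheet result can be cross-checked in Python with `scipy.stats.chi2_contingency`, using the observed counts from the contingency table above. Note that this function derives expected counts from the row and column totals, which differs slightly from the equal-split expected values (column total / 2) used in the worksheet:

```python
from scipy import stats

# Observed counts from the contingency table (grades O, A, B, C, D)
girls = [11, 7, 5, 5, 11]
boys = [30, 4, 3, 10, 14]

chi2, p, dof, expected = stats.chi2_contingency([girls, boys])
print('chi-squared =', chi2, 'dof =', dof, 'p =', p)
if p < 0.05:
    print('H0 is Rejected: performance differs by gender')
else:
    print('H0 is Accepted: no significant difference')
```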

Using Python
import numpy as np
import pandas as pd
import scipy.stats as stats

np.random.seed(10)
stud_grade = np.random.choice(a=["O","A","B","C","D"],
                              p=[0.20, 0.20, 0.20, 0.20, 0.20], size=100)
stud_gen = np.random.choice(a=["Male","Female"], p=[0.5, 0.5], size=100)
mscpart1 = pd.DataFrame({"Grades":stud_grade, "Gender":stud_gen})
print(mscpart1)
stud_tab = pd.crosstab(mscpart1.Grades, mscpart1.Gender, margins=True)
# crosstab orders categories alphabetically, so label accordingly
stud_tab.columns = ["Female", "Male", "row_totals"]
stud_tab.index = ["A", "B", "C", "D", "O", "col_totals"]
observed = stud_tab.iloc[0:5, 0:2]
print(observed)
expected = np.outer(stud_tab["row_totals"][0:5],
                    stud_tab.loc["col_totals"][0:2]) / 100
print(expected)
chi_squared_stat = (((observed-expected)**2)/expected).sum().sum()
print('Calculated : ', chi_squared_stat)

crit = stats.chi2.ppf(q=0.95, df=4)
print('Table Value : ', crit)

if chi_squared_stat >= crit:
    print('H0 is Rejected')
else:
    print('H0 is Accepted')

Output :

Practice Questions
1. Anita claims that girls take more normal and filter-applied selfies than boys, but Karan does not agree with her, so he conducts a survey and collects the following data. Would it be correct to say that he should reject Anita’s claim that gender affects the tendency to take selfies?
H0 - Gender does not affect the tendency to take selfies
Normal Selfie Apply Filter Total
Female 72 489 561
Male 48 530 578
TOTAL 120 1019 1139

2. Ketan claims that single people prefer different pizzas than married people do. Ketan’s brother Anand doesn’t think that is true, so he conducts some research of his own, and collects the data below.

H0: Marital status and pizza type are not associated.

H1: Marital status and pizza type are associated.

Practical 5:
Perform testing of hypothesis using Z-test.
Use a Z test if:
• Your sample size is greater than 30. Otherwise, use a t test.
• Data points should be independent from each other. In other words, one data point
isn’t related or doesn’t affect another data point.
• Your data should be normally distributed. However, for large sample sizes (over
30) this doesn’t always matter.
• Your data should be randomly selected from a population, where each item has
an equal chance of being selected.
• Sample sizes should be equal if at all possible.
H0 - Blood pressure has a mean of 156 units
Program Code for one-sample Z test.
from statsmodels.stats import weightstats as stests
import pandas as pd
from scipy import stats
df = pd.read_csv("blood_pressure.csv")
df[['bp_before','bp_after']].describe()
print(df)
ztest ,pval = stests.ztest(df['bp_before'], x2=None, value=156)
print(float(pval))

if pval<0.05:
print("reject null hypothesis")
else:
print("accept null hypothesis")
Output:

Two-sample Z test - In a two-sample z-test, similar to the t-test, we check two independent data groups and decide whether the sample means of the two groups are equal or not.

H0 : the difference between the means of the two groups is 0

H1 : the difference between the means of the two groups is not 0

# -*- coding: utf-8 -*-


"""
Created on Mon Dec 16 20:42:17 2019
@author: MyHome
"""
import pandas as pd
from statsmodels.stats import weightstats as stests
df = pd.read_csv("blood_pressure.csv")
df[['bp_before','bp_after']].describe()
print(df)

ztest, pval = stests.ztest(df['bp_before'], x2=df['bp_after'], value=0, alternative='two-sided')
print(float(pval))

if pval<0.05:
print("reject null hypothesis")
else:
print("accept null hypothesis")

Practical 6:
A. Perform testing of hypothesis using One-way ANOVA.
ANOVA Assumptions
• The dependent variable (SAT scores in our example) should be continuous.
• The independent variables (districts in our example) should be two or more
categorical groups.
• There must be different participants in each group with no participant being in
more than one group. In our case, each school cannot be in more than one
district.
• The dependent variable should be approximately normally distributed for each
category.
• Variances of each group are approximately equal.

From our data exploration, we can see that the average SAT scores are quite different for each district. Since we have five different groups, we cannot use the t-test; we use the one-way ANOVA test instead.
H0 - There are no significant differences between the groups' mean SAT scores.
µ1 = µ2 = µ3 = µ4 = µ5
H1 - There is a significant difference between the groups' mean SAT scores.
If there is at least one group with a significant difference with another group, the null
hypothesis will be rejected.
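Before applying the test to the SAT data, the mechanics can be seen on a toy example: scipy.stats.f_oneway takes any number of groups and returns the F statistic and p-value (hypothetical values, with one clearly separated group):

```python
from scipy import stats

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]
c = [10, 11, 12, 13]   # far from a and b, so p should be well below 0.05

f_stat, p_val = stats.f_oneway(a, b, c)
print(f_stat, p_val)
```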
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

data = pd.read_csv("scores.csv")
data.head()
data['Borough'].value_counts()

# There is no total score column; we have to create it.
# In addition, find the mean score of each district across all schools.
data['total_score'] = data['Average Score (SAT Reading)'] + \
data['Average Score (SAT Math)'] + \
data['Average Score (SAT Writing)']

data = data[['Borough', 'total_score']].dropna()
x = ['Brooklyn', 'Bronx', 'Manhattan', 'Queens', 'Staten Island']
district_dict = {}

# Assigns each district's test score series to a dictionary key
for district in x:
    district_dict[district] = data[data['Borough'] == district]['total_score']

y = []
yerror = []
# Assigns the mean score and 95% confidence limit to each district
for district in x:
    y.append(district_dict[district].mean())
    yerror.append(1.96*district_dict[district].std()/np.sqrt(district_dict[district].shape[0]))
    print(district + '_std : {}'.format(district_dict[district].std()))

sns.set(font_scale=1.8)
fig = plt.figure(figsize=(10,5))
ax = sns.barplot(x=x, y=y, yerr=yerror)  # newer seaborn versions require keyword arguments
ax.set_ylabel('Average Total SAT Score')
plt.show()

# Perform 1-way ANOVA
print(stats.f_oneway(
    district_dict['Brooklyn'], district_dict['Bronx'],
    district_dict['Manhattan'], district_dict['Queens'],
    district_dict['Staten Island']
))

districts = ['Brooklyn', 'Bronx', 'Manhattan', 'Queens', 'Staten Island']

ss_b = 0
for d in districts:
    ss_b += district_dict[d].shape[0] * \
            np.sum((district_dict[d].mean() - data['total_score'].mean())**2)

ss_w = 0
for d in districts:
    ss_w += np.sum((district_dict[d] - district_dict[d].mean())**2)


msb = ss_b/4              # df between = k - 1 = 4 (five districts)
msw = ss_w/(len(data)-5)  # df within = N - k
f = msb/msw
print('F_statistic: {}'.format(f))

ss_t = np.sum((data['total_score']-data['total_score'].mean())**2)
eta_squared = ss_b/ss_t
print('eta_squared: {}'.format(eta_squared))

Output:

Since the resulting p-value is less than 0.05, the null hypothesis is rejected and we
conclude that there is a significant difference between the SAT scores of the districts.

Using Excel
H0 - There are no significant differences between the subjects' mean SAT scores.
µ1 = µ2 = µ3
H1 - There is a significant difference between the subjects' mean SAT scores.
To perform ANOVA go to Data tab -> Data Analysis

Input Range: $S$1:$U$436 (select the columns to be analyzed as groups)

Output Range: $K$453:$S$465 (can be any range)

Anova: Single Factor

SUMMARY
Groups Count Sum Average Variance
Average Score (SAT Math) 375 162354 432.944 5177.144
Average Score (SAT Reading) 375 159189 424.504 3829.267
Average Score (SAT Writing) 375 156922 418.4587 4166.522

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 39700.57 2 19850.28 4.520698 0.01108 3.003745
Within Groups 4926677 1122 4390.977

Total 4966377 1124

Since the resulting p-value is less than 0.05, the null hypothesis (H0) is rejected and we
conclude that there is a significant difference between the SAT scores for each subject.
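As a cross-check, the F statistic and p-value in the Excel table above can be recomputed from its SS and df columns with scipy (the numbers below are copied from the table):

```python
from scipy import stats

ss_between, df_between = 39700.57, 2
ss_within, df_within = 4926677, 1122

ms_between = ss_between / df_between   # 19850.28 in the table
ms_within = ss_within / df_within      # 4390.977 in the table
f = ms_between / ms_within
p = stats.f.sf(f, df_between, df_within)   # upper-tail probability of F
print(f, p)  # should match F = 4.520698 and P-value = 0.01108
```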

B. Perform testing of hypothesis using Two-way ANOVA.
Program Code:
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
from statsmodels.graphics.factorplots import interaction_plot
import matplotlib.pyplot as plt
from scipy import stats

def eta_squared(aov):
    aov['eta_sq'] = 'NaN'
    aov['eta_sq'] = aov[:-1]['sum_sq']/sum(aov['sum_sq'])
    return aov

def omega_squared(aov):
    mse = aov['sum_sq'].iloc[-1]/aov['df'].iloc[-1]  # positional [-1] on a Series is deprecated; use .iloc
    aov['omega_sq'] = 'NaN'
    aov['omega_sq'] = (aov[:-1]['sum_sq'] - (aov[:-1]['df']*mse)) / (sum(aov['sum_sq']) + mse)
    return aov

datafile = "ToothGrowth.csv"
data = pd.read_csv(datafile)
fig = interaction_plot(data.dose, data.supp, data.len,
colors=['red','blue'], markers=['D','^'], ms=10)
N = len(data.len)
df_a = len(data.supp.unique()) - 1
df_b = len(data.dose.unique()) - 1
df_axb = df_a*df_b
df_w = N - (len(data.supp.unique())*len(data.dose.unique()))
grand_mean = data['len'].mean()
#Sum of Squares A – supp
ssq_a = sum([(data[data.supp ==l].len.mean()-grand_mean)**2 for l in data.supp])
# Sum of Squares B - dose
ssq_b = sum([(data[data.dose == l].len.mean()-grand_mean)**2 for l in data.dose])
#Sum of Squares Total
ssq_t = sum((data.len - grand_mean)**2)
vc = data[data.supp == 'VC']
oj = data[data.supp == 'OJ']
vc_dose_means = [vc[vc.dose == d].len.mean() for d in vc.dose]

oj_dose_means = [oj[oj.dose == d].len.mean() for d in oj.dose]
ssq_w = sum((oj.len - oj_dose_means)**2) +sum((vc.len - vc_dose_means)**2)
ssq_axb = ssq_t-ssq_a-ssq_b-ssq_w
ms_a = ssq_a/df_a #Mean Square A
ms_b = ssq_b/df_b #Mean Square B
ms_axb = ssq_axb/df_axb #Mean Square AXB
ms_w = ssq_w/df_w
f_a = ms_a/ms_w
f_b = ms_b/ms_w
f_axb = ms_axb/ms_w
p_a = stats.f.sf(f_a, df_a, df_w)
p_b = stats.f.sf(f_b, df_b, df_w)
p_axb = stats.f.sf(f_axb, df_axb, df_w)
results = {'sum_sq': [ssq_a, ssq_b, ssq_axb, ssq_w],
           'df': [df_a, df_b, df_axb, df_w],
           'F': [f_a, f_b, f_axb, 'NaN'],
           'PR(>F)': [p_a, p_b, p_axb, 'NaN']}
columns = ['sum_sq', 'df', 'F', 'PR(>F)']

aov_table1 = pd.DataFrame(results, columns=columns,
                          index=['supp', 'dose', 'supp:dose', 'Residual'])
formula = 'len ~ C(supp) + C(dose) + C(supp):C(dose)'
model = ols(formula, data).fit()
aov_table = anova_lm(model, typ=2)
eta_squared(aov_table)
omega_squared(aov_table)
print(aov_table.round(4))
res = model.resid
fig = sm.qqplot(res, line='s')
plt.show()

Output:


Using Excel:
Go to Data tab -> Data Analysis

Input Range - $A$1:$C$61

Rows Per Sample – 30 (because 30 patients are given each dose)
Alpha – 0.05
Output Range - $F$1:$M$24

Output:
Anova: Two-Factor With Replication

SUMMARY len dose Total


1
Count 30 30 60
Sum 508.9 35 543.9
Average 16.96333 1.166667 9.065
Variance 68.32723 0.402299 97.22333

31
Count 30 30 60
Sum 619.9 35 654.9
Average 20.66333 1.166667 10.915
Variance 43.63344 0.402299 118.2854

Total
Count 60 60
Sum 1128.8 70
Average 18.81333 1.166667
Variance 58.51202 0.39548
ANOVA
Source of
Variation SS df MS F P-value F crit
Sample 102.675 1 102.675 3.642079 0.058808 3.922879
Columns 9342.145 1 9342.145 331.3838 8.55E-36 3.922879
Interaction 102.675 1 102.675 3.642079 0.058808 3.922879
Within 3270.193 116 28.19132
Total 12817.69 119

Read the P-value column in the ANOVA Source of Variation table at the bottom of the
output. The p-value for the dose factor (the Columns row, 8.55E-36) is far below our
significance level, so the dose effect is statistically significant. On the other hand,
the Sample and Interaction effects are not significant because their p-values (0.0588)
are greater than our significance level. Because the interaction effect is not
significant, we can focus on the main effect of the dose alone.
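The critical value and the dose p-value in the table can be cross-checked against scipy's F distribution (values copied from the table):

```python
from scipy import stats

# F crit at alpha = 0.05 with df = (1, 116), as in the table rows
f_crit = stats.f.ppf(0.95, 1, 116)

# p-value for the "Columns" (dose) row, F = 331.3838
p_dose = stats.f.sf(331.3838, 1, 116)
print(f_crit, p_dose)  # f_crit should match the table's 3.922879
```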

C. Perform testing of hypothesis using MANOVA.
MANOVA is the acronym for Multivariate Analysis of Variance. When analyzing data,
we may encounter situations where we have multiple response variables (dependent
variables). Like ANOVA, MANOVA also has some assumptions; before performing
MANOVA we have to check that the following assumptions are satisfied:
• The samples, while drawing, should be independent of each other.
• The dependent variables are continuous in nature and the independent variables
are categorical.
• The dependent variables should follow a multivariate normal distribution.
• The population variance-covariance matrices of each group are the same, i.e. the
groups are homogeneous.
Code:
import pandas as pd
from statsmodels.multivariate.manova import MANOVA
df = pd.read_csv('iris.csv', index_col=0)
df.columns = df.columns.str.replace(".", "_", regex=False)
df.head()
print('~~~~~~~~ Data Set ~~~~~~~~')
print(df)
maov = MANOVA.from_formula('Sepal_Length + Sepal_Width + \
Petal_Length + Petal_Width ~ Species', data=df)
print('~~~~~~~~ MANOVA Test Result ~~~~~~~~')
print(maov.mv_test())
Output:

Excel:
Go to http://www.real-statistics.com/free-download/

1. Download Real Statistics Resource Pack

Or

http://www.real-statistics.com/wp-content/uploads/2019/11/XRealStats.xlam

Install the add-in in Excel. Select File > Options > Add-Ins and click the Go button at the
bottom of the window (see Figure 1).

Add-ins -> Analysis Pack -> Go


Click on browse and select XrealStats file (previously downloaded).

Select the following Add-Ins. Click OK.


Now create an excel sheet with following data.

A study was conducted to see the impact of social-economic class (rich, middle, poor) and gender
(male, female) on kindness and optimism, using a sample of 24 people based on the data in Figure
1.

Press ctrl-m to open Real Statistics menu.

Select the data excluding column names. Select a cell for output.

Output:

Practical 7:
A. Perform the Random sampling for the given data and
analyse it.
Example 1: From a population of 10 women and 10 men as given in the table in Figure
1 on the left below, create a random sample of 6 people for Group 1 and a periodic
sample consisting of every 3rd woman for Group 2.

You need to run the sampling data analysis tool twice, once to create Group 1 and again
to create Group 2. For Group 1 you select all 20 population cells as the Input Range and
Random as the Sampling Method with 6 for the Random Number of Samples. For Group
2 you select the 10 cells in the Women column as Input Range and Periodic with Period
3.
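Before the Excel walkthrough, the same two samples can be sketched in Python with a hypothetical population of 10 women and 10 men:

```python
import numpy as np

rng = np.random.default_rng(42)
women = np.array([f"W{i}" for i in range(1, 11)])
men = np.array([f"M{i}" for i in range(1, 11)])
population = np.concatenate([women, men])

# Group 1: simple random sample of 6 people, without replacement
group1 = rng.choice(population, size=6, replace=False)

# Group 2: periodic sample, every 3rd woman (W3, W6, W9)
group2 = women[2::3]
print(group1, group2)
```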

Open existing excel sheet with population data


Sample Sheet looks as given below:

Set Cell O1 = Male and Cell O2 = Female


To generate a random sample for male students from given population go to Cell O1
and type
=INDEX(E$2:E$62,RANK(B2,B$2:B$62))
Drag the formula down to the desired number of cells to select the random sample.

Now, to generate a random sample for female students go to cell P1 and type
=INDEX(K$2:K$40,RANK(H2,H$2:H$40))
Drag the formula down to the desired number of cells to select the random sample.

Output:

B. Perform the Stratified sampling for the given data and
analyse it.
Suppose we are to carry out a hypothetical housing quality survey across Lagos state,
Nigeria, looking at a total of 5000 houses (hypothetically). We don't just go to one local
government and select 5000 houses; rather, we ensure that the 5000 houses are
representative of all 20 local government areas Lagos state comprises. This is called
stratified sampling. The population is divided into homogeneous strata and the right
number of instances is sampled from each stratum to guarantee that the test set (which
in this case is the 5000 houses) is representative of the overall population. If we used
random sampling, there would be a significant chance of bias in the survey results.
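Proportional allocation (deciding how many houses to sample from each stratum) can be sketched as follows, with hypothetical local government area sizes:

```python
# Hypothetical number of houses per local government area (stratum)
strata_sizes = {"LGA_A": 120000, "LGA_B": 80000, "LGA_C": 50000}
total = sum(strata_sizes.values())
sample_size = 5000

# Each stratum contributes in proportion to its share of the population
allocation = {name: round(sample_size * n / total) for name, n in strata_sizes.items()}
print(allocation)  # the per-stratum counts sum back to 5000
```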

Program Code:
import pandas as pd
import numpy as np

import matplotlib
import matplotlib.pyplot as plt

plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12

import seaborn as sns
color = sns.color_palette()
sns.set_style('darkgrid')

import sklearn
from sklearn.model_selection import train_test_split

housing = pd.read_csv('housing.csv')
print(housing.head())
print(housing.info())

#creating a heatmap of the attributes in the dataset

correlation_matrix = housing.corr(numeric_only=True)  # skip the categorical column
plt.subplots(figsize=(8,6))
sns.heatmap(correlation_matrix, center=0, annot=True, linewidths=.3)

corr = housing.corr(numeric_only=True)
print(corr['median_house_value'].sort_values(ascending=False))

sns.histplot(housing.median_income, kde=True)  # distplot is deprecated in newer seaborn
plt.show()

output:


There's a ton of information we can mine from the heatmap above: a couple of strongly
positively correlated features and a couple of negatively correlated ones. Take a look
at the small bright box right in the middle of the heatmap, from total_rooms on the
y-axis down to households, and note how bright the box is (these attributes are highly
positively correlated). Also note that median_income is the feature most correlated with
the target, median_house_value.

From the image above, we can see that most median incomes are clustered between
$20,000 and $50,000, with some outliers going far beyond $60,000, making the
distribution skew to the right.

Practical 8:
Write a program for computing different types of correlation.
Positive Correlation:
Let’s take a look at a positive correlation. Numpy implements a corrcoef() function that
returns a matrix of correlations of x with x, x with y, y with x and y with y. We’re
interested in the values of correlation of x with y (so position (1, 0) or (0, 1)).
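On deterministic data the off-diagonal entries of that matrix are exactly ±1, which makes the positions easy to verify:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
r_pos = np.corrcoef(x, 2 * x + 1)[0, 1]   # perfect positive linear relation
r_neg = np.corrcoef(x, -x)[0, 1]          # perfect negative linear relation
print(r_pos, r_neg)
```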
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
# 1000 random integers between 0 and 50
x = np.random.randint(0, 50, 1000)
# Positive correlation with some noise
y = x + np.random.normal(0, 10, 1000)
print(np.corrcoef(x, y))
plt.style.use('ggplot')  # matplotlib itself is not imported, so use plt.style

plt.scatter(x, y)
plt.show()

Output:

Negative Correlation:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
# 1000 random integers between 0 and 50

x = np.random.randint(0, 50, 1000)

# Negative Correlation with some noise


y = 100 - x + np.random.normal(0, 5, 1000)

print(np.corrcoef(x, y))
plt.scatter(x, y)
plt.show()

Output:

No/Weak Correlation:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
x = np.random.randint(0, 50, 1000)
y = np.random.randint(0, 50, 1000)
print(np.corrcoef(x, y))
plt.scatter(x, y)
plt.show()
Output:

Practical 9:
A. Write a program to Perform linear regression for prediction.
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 16 21:56:32 2019
@author: MyHome
"""
import math
import quandl  # the package name is lowercase in current releases
import numpy as np
import pandas as pd
from sklearn import preprocessing, svm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from matplotlib import style
import datetime

style.use('ggplot')
df = quandl.get("WIKI/GOOGL")
df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
df['HL_PCT'] = (df['Adj. High'] - df['Adj. Low']) / df['Adj. Close'] * 100.0
df['PCT_change'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100.0

df = df[['Adj. Close', 'HL_PCT', 'PCT_change', 'Adj. Volume']]


forecast_col = 'Adj. Close'
df.fillna(value=-99999, inplace=True)
forecast_out = int(math.ceil(0.01 * len(df)))
df['label'] = df[forecast_col].shift(-forecast_out)

X = np.array(df.drop(columns=['label']))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

df.dropna(inplace=True)
y = np.array(df['label'])

from sklearn.model_selection import train_test_split  # cross_validation was removed from sklearn
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = LinearRegression(n_jobs=-1)
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)

forecast_set = clf.predict(X_lately)
df['Forecast'] = np.nan

last_date = df.iloc[-1].name
last_unix = last_date.timestamp()
one_day = 86400
next_unix = last_unix + one_day

for i in forecast_set:
next_date = datetime.datetime.fromtimestamp(next_unix)
next_unix += 86400
df.loc[next_date] = [np.nan for _ in range(len(df.columns)-1)]+[i]

df['Adj. Close'].plot()
df['Forecast'].plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()

Output:

B. Perform polynomial regression for prediction.
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x, m_y = np.mean(x), np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {} b_1 = {}".format(b[0], b[1]))


    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
Output:
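Note that the code above fits a straight line from closed-form coefficients; a genuinely polynomial fit can be sketched with np.polyfit (degree 2, hypothetical noise-free data):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = 2 * x**2 + 3 * x + 1           # exact quadratic, no noise

# polyfit returns coefficients highest power first
coeffs = np.polyfit(x, y, deg=2)
print(coeffs)  # recovers [2, 3, 1] up to numerical precision
```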

Practical 10:
A. Write a program for multiple linear regression analysis.
Step #1: Data Pre Processing
a) Importing The Libraries.
b) Importing the Data Set.
c) Encoding the Categorical Data.
d) Avoiding the Dummy Variable Trap.
e) Splitting the Data set into Training Set and Test Set.

Step #2: Fitting Multiple Linear Regression to the Training set


Step #3: Predicting the Test set results.
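At its core, the fitting step in Step #2 reduces to ordinary least squares; a minimal sketch with np.linalg.lstsq on hypothetical noise-free data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 2))            # two explanatory variables
true_coef = np.array([3.0, -2.0])
y = X @ true_coef + 5.0            # intercept 5, no noise

# Prepend a column of ones so the intercept is estimated too
X_design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta)  # [intercept, coef1, coef2]
```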

import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

def generate_dataset(n):
    x = []
    y = []
    random_x1 = np.random.rand()
    random_x2 = np.random.rand()
    for i in range(n):
        x1 = i
        x2 = i/2 + np.random.rand()*n
        x.append([1, x1, x2])
        y.append(random_x1 * x1 + random_x2 * x2 + 1)
    return np.array(x), np.array(y)

x, y = generate_dataset(200)
mpl.rcParams['legend.fontsize'] = 12
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in matplotlib 3.6
ax.scatter(x[:, 1], x[:, 2], y, label='y', s=5)
ax.legend()
ax.view_init(45, 0)
plt.show()
def mse(coef, x, y):
    return np.mean((np.dot(x, coef) - y)**2)/2

def gradients(coef, x, y):
    return np.mean(x.transpose()*(np.dot(x, coef) - y), axis=1)

def multilinear_regression(coef, x, y, lr, b1=0.9, b2=0.999, epsilon=1e-8):
    prev_error = 0
    m_coef = np.zeros(coef.shape)
    v_coef = np.zeros(coef.shape)
    moment_m_coef = np.zeros(coef.shape)
    moment_v_coef = np.zeros(coef.shape)
    t = 0
    while True:
        error = mse(coef, x, y)
        if abs(error - prev_error) <= epsilon:
            break
        prev_error = error
        grad = gradients(coef, x, y)
        t += 1
        m_coef = b1 * m_coef + (1-b1)*grad
        v_coef = b2 * v_coef + (1-b2)*grad**2
        moment_m_coef = m_coef / (1-b1**t)
        moment_v_coef = v_coef / (1-b2**t)

        # Adam-style update: epsilon belongs inside the denominator
        delta = ((lr / (moment_v_coef**0.5 + 1e-8)) *
                 (b1 * moment_m_coef + (1-b1)*grad/(1-b1**t)))
        coef = np.subtract(coef, delta)
    return coef

coef = np.array([0, 0, 0])
c = multilinear_regression(coef, x, y, 1e-1)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in matplotlib 3.6
ax.scatter(x[:, 1], x[:, 2], y, label='y', s=5, color="dodgerblue")
ax.scatter(x[:, 1], x[:, 2], c[0] + c[1]*x[:, 1] + c[2]*x[:, 2],
           label='regression', s=5, color="orange")
ax.view_init(45, 0)
ax.legend()

plt.show()
Output:

B. Perform logistic regression analysis.
Logistic regression is a classification method built on the same concept as linear
regression. With linear regression, we take a linear combination of explanatory
variables plus an intercept term to arrive at a prediction.
In this example we will use a logistic regression model to predict survival.
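Concretely, logistic regression passes that linear combination through the sigmoid function to produce a probability between 0 and 1; a minimal sketch:

```python
import numpy as np

def sigmoid(t):
    # Maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-t))

# At t = 0 the model is maximally uncertain: probability 0.5
p = sigmoid(0.0)
print(p)
```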

Program Code:
import os
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import scipy.stats as stats
from sklearn import linear_model
from sklearn import preprocessing
from sklearn import metrics

matplotlib.style.use('ggplot')
plt.figure(figsize=(9,9))

def sigmoid(t):  # Define the sigmoid function
    return (1/(1 + np.e**(-t)))

plot_range = np.arange(-6, 6, 0.1)

y_values = sigmoid(plot_range)

# Plot curve
plt.plot(plot_range, # X-axis range
y_values, # Predicted values
color="red")
titanic_train = pd.read_csv("titanic_train.csv") # Read the data
char_cabin = titanic_train["Cabin"].astype(str) # Convert cabin to str
new_Cabin = np.array([cabin[0] for cabin in char_cabin]) # Take first letter

titanic_train["Cabin"] = pd.Categorical(new_Cabin) # Save the new cabin var

# Impute median Age for NA Age values


new_age_var = np.where(titanic_train["Age"].isnull(),  # Logical check
                       28,                             # Value if check is true
                       titanic_train["Age"])           # Value if check is false

titanic_train["Age"] = new_age_var

label_encoder = preprocessing.LabelEncoder()

# Convert Sex variable to numeric


encoded_sex = label_encoder.fit_transform(titanic_train["Sex"])

# Initialize logistic regression model


log_model = linear_model.LogisticRegression()

# Train the model


log_model.fit(X = pd.DataFrame(encoded_sex),
y = titanic_train["Survived"])

# Check trained model intercept


print(log_model.intercept_)

# Check trained model coefficients


print(log_model.coef_)

# Make predictions
preds = log_model.predict_proba(X= pd.DataFrame(encoded_sex))
preds = pd.DataFrame(preds)
preds.columns = ["Death_prob", "Survival_prob"]

# Generate table of predictions vs Sex


pd.crosstab(titanic_train["Sex"], preds.loc[:, "Survival_prob"])  # .ix was removed from pandas

# Convert more variables to numeric


encoded_class = label_encoder.fit_transform(titanic_train["Pclass"])
encoded_cabin = label_encoder.fit_transform(titanic_train["Cabin"])

train_features = pd.DataFrame([encoded_class,
encoded_cabin,
encoded_sex,
titanic_train["Age"]]).T

# Initialize logistic regression model


log_model = linear_model.LogisticRegression()


# Train the model


log_model.fit(X = train_features ,
y = titanic_train["Survived"])

# Check trained model intercept


print(log_model.intercept_)

# Check trained model coefficients


print(log_model.coef_)

# Make predictions
preds = log_model.predict(X= train_features)

# Generate table of predictions vs actual


pd.crosstab(preds,titanic_train["Survived"])

log_model.score(X = train_features ,
y = titanic_train["Survived"])

metrics.confusion_matrix(y_true=titanic_train["Survived"], # True labels


y_pred=preds) # Predicted labels

# View summary of common classification metrics


print(metrics.classification_report(y_true=titanic_train["Survived"],
y_pred=preds) )

# Read and prepare test data


titanic_test = pd.read_csv("titanic_test.csv") # Read the data

char_cabin = titanic_test["Cabin"].astype(str) # Convert cabin to str

new_Cabin = np.array([cabin[0] for cabin in char_cabin]) # Take first letter

titanic_test["Cabin"] = pd.Categorical(new_Cabin) # Save the new cabin var

# Impute median Age for NA Age values


new_age_var = np.where(titanic_test["Age"].isnull(), # Logical check
28, # Value if check is true
titanic_test["Age"]) # Value if check is false

titanic_test["Age"] = new_age_var

# Convert test variables to match model features


encoded_sex = label_encoder.fit_transform(titanic_test["Sex"])
encoded_class = label_encoder.fit_transform(titanic_test["Pclass"])
encoded_cabin = label_encoder.fit_transform(titanic_test["Cabin"])

test_features = pd.DataFrame([encoded_class,
encoded_cabin,encoded_sex,titanic_test["Age"]]).T

# Make test set predictions


test_preds = log_model.predict(X=test_features)

# Create a submission for Kaggle


submission = pd.DataFrame({"PassengerId":titanic_test["PassengerId"],
"Survived":test_preds})

# Save submission to CSV


submission.to_csv("tutorial_logreg_submission.csv",
index=False) # Do not save index values


Output:
The table shows that the model predicted a survival chance of roughly 19% for males
and 73% for females.

For the Titanic competition, accuracy is the scoring metric used to judge the
competition, so we don't have to worry too much about other metrics.

The table above shows the classes our model predicted vs. the true values of the
Survived variable.

This logistic regression model has an accuracy score of 0.75598, which is actually
worse than the accuracy of the simplistic "women survive, men die" model (0.76555).

Example 2:
The dataset is related to direct marketing campaigns (phone calls) of a Portuguese
banking institution. The classification goal is to predict whether the client will
subscribe (1/0) to a term deposit (variable y). The dataset provides the bank customers’
information. It includes 41,188 records and 21 fields.

Input variables

1. age (numeric)
2. job : type of job (categorical: “admin”, “blue-collar”, “entrepreneur”,
“housemaid”, “management”, “retired”, “self-employed”, “services”, “student”,
“technician”, “unemployed”, “unknown”)
3. marital : marital status (categorical: “divorced”, “married”, “single”,
“unknown”)
4. education (categorical: “basic.4y”, “basic.6y”, “basic.9y”, “high.school”,
“illiterate”, “professional.course”, “university.degree”, “unknown”)
5. default: has credit in default? (categorical: “no”, “yes”, “unknown”)
6. housing: has housing loan? (categorical: “no”, “yes”, “unknown”)
7. loan: has personal loan? (categorical: “no”, “yes”, “unknown”)
8. contact: contact communication type (categorical: “cellular”, “telephone”)
9. month: last contact month of year (categorical: “jan”, “feb”, “mar”, …, “nov”,
“dec”)
10. day_of_week: last contact day of the week (categorical: “mon”, “tue”, “wed”,
“thu”, “fri”)
11. duration: last contact duration, in seconds (numeric). Important note: this
attribute highly affects the output target (e.g., if duration=0 then y=’no’). The
duration is not known before a call is performed, also, after the end of the call, y
is obviously known. Thus, this input should only be included for benchmark
purposes and should be discarded if the intention is to have a realistic predictive
model

12. campaign: number of contacts performed during this campaign and for this client
(numeric, includes last contact)
13. pdays: number of days that passed by after the client was last contacted from a
previous campaign (numeric; 999 means client was not previously contacted)
14. previous: number of contacts performed before this campaign and for this client
(numeric)
15. poutcome: outcome of the previous marketing campaign (categorical: “failure”,
“nonexistent”, “success”)
16. emp.var.rate: employment variation rate — (numeric)
17. cons.price.idx: consumer price index — (numeric)
18. cons.conf.idx: consumer confidence index — (numeric)
19. euribor3m: euribor 3 month rate — (numeric)
20. nr.employed: number of employees — (numeric)

Predict variable (desired target):


y — has the client subscribed a term deposit?
(binary: “1”, means “Yes”, “0” means “No”)

Program Code:
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 16 22:24:44 2019
@author: MyHome
"""
import pandas as pd
import numpy as np
from sklearn import preprocessing
import matplotlib.pyplot as plt
plt.rc("font", size=14)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import seaborn as sns
sns.set(style="white")
sns.set(style="whitegrid", color_codes=True)
data = pd.read_csv('bank.csv', header=0)
data = data.dropna()
print(data.shape)
print(list(data.columns))
data['education'].unique()
data['education']=np.where(data['education'] =='basic.9y', 'Basic', data['education'])
data['education']=np.where(data['education'] =='basic.6y', 'Basic', data['education'])

data['education']=np.where(data['education'] =='basic.4y', 'Basic', data['education'])
data['education'].unique()
data['y'].value_counts()

sns.countplot(x='y', data=data, palette='hls')
plt.savefig('Practical10B-plot.jpeg')  # save before show, or the saved figure is blank
plt.show()

count_no_sub = len(data[data['y']==0])
count_sub = len(data[data['y']==1])
pct_of_no_sub = count_no_sub/(count_no_sub+count_sub)
print("percentage of no subscription is", pct_of_no_sub*100)
pct_of_sub = count_sub/(count_no_sub+count_sub)
print("percentage of subscription", pct_of_sub*100)

data.groupby('y').mean(numeric_only=True)
data.groupby('job').mean(numeric_only=True)
data.groupby('marital').mean(numeric_only=True)
data.groupby('education').mean(numeric_only=True)

########### Purchase Frequency for Job Title


pd.crosstab(data.job,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Job Title')
plt.xlabel('Job')
plt.ylabel('Frequency of Purchase')
plt.savefig('purchase_fre_job')

###################### Marital Status vs Purchase


table=pd.crosstab(data.marital,data.y)
table.div(table.sum(1).astype(float), axis=0).plot(kind='bar', stacked=True)
plt.title('Stacked Bar Chart of Marital Status vs Purchase')
plt.xlabel('Marital Status')
plt.ylabel('Proportion of Customers')
plt.savefig('marital_vs_pur_stack')

############ Education vs Purchase


table=pd.crosstab(data.education,data.y)
table.div(table.sum(1).astype(float), axis=0).plot(kind='bar', stacked=True)
plt.title('Stacked Bar Chart of Education vs Purchase')

plt.xlabel('Education')
plt.ylabel('Proportion of Customers')
plt.savefig('edu_vs_pur_stack')
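The stacked bars above plot row-wise proportions built with table.div; pd.crosstab can produce the same proportions directly via normalize='index'. A hedged sketch on toy data (column names chosen to mirror the bank frame):

```python
import pandas as pd

# Toy data: marital status vs. a binary outcome.
df = pd.DataFrame({'marital': ['single', 'single', 'married', 'married'],
                   'y':       [0, 1, 0, 0]})

# normalize='index' makes each row sum to 1, matching the
# table.div(table.sum(1).astype(float), axis=0) step used above.
prop = pd.crosstab(df['marital'], df['y'], normalize='index')
print(prop.loc['single', 1])  # 0.5
```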

########### Purchase Frequency for Day of Week


pd.crosstab(data.day_of_week,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Frequency of Purchase')
plt.savefig('pur_dayofweek_bar')

############ Purchase Frequency for Month


pd.crosstab(data.month,data.y).plot(kind='bar')
plt.title('Purchase Frequency for Month')
plt.xlabel('Month')
plt.ylabel('Frequency of Purchase')
plt.savefig('pur_fre_month_bar')

########### Age Purchase frequency pattern


data.age.hist()
plt.title('Histogram of Age')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.savefig('hist_age')

Output: -

percentage of no subscription is 88.73458288821988
percentage of subscription 11.265417111780131

Our classes are imbalanced: the ratio of no-subscription to subscription instances is 89:11.

• The average age of customers who bought the term deposit is higher than that of the customers who didn’t.
• The pdays (days since the customer was last contacted) is understandably lower for the customers who bought it. The lower the pdays, the better the memory of the last call and hence the better the chances of a sale.
• Surprisingly, campaigns (the number of contacts or calls made during the current campaign) is lower for customers who bought the term deposit.
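Given the 89:11 imbalance noted above, a plain logistic fit tends to favour the majority class. One common mitigation, shown here as an illustrative sketch on synthetic data rather than the manual’s prescribed next step, is class_weight='balanced' in scikit-learn’s LogisticRegression (which the script already imports):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bank features/target, with rare positives.
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 1.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# class_weight='balanced' reweights samples inversely to class frequency,
# so the minority (subscriber) class is not drowned out by the majority.
clf = LogisticRegression(class_weight='balanced')
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```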

The frequency of purchase of the deposit depends a great deal on the job title. Thus, the job title can be a good predictor of the outcome variable.

The marital status does not seem a strong predictor of the outcome variable.

Education seems a good predictor of the outcome variable.

Day of week may not be a good predictor of the outcome.

Month might be a good predictor of the outcome variable.

Most of the customers of the bank in this dataset are in the age range of 30–40.

Sr.  Practical
No   No     Name of the Practical                                       File Names

1)   1 A    Write a program for obtaining descriptive statistics
            of data.
2)     B    Import data from different data sources (from Excel,        As required in Data
            csv, mysql, sql server, oracle to R/Python/Excel)           Science Practical
3)   2 A    Design a survey form for a given case study, collect
            the primary data and analyse it
4)     B    Perform suitable analysis of given secondary data.
5)   3 A    Perform testing of hypothesis using one sample t-test.      ages.csv
6)     B    Perform testing of hypothesis using two sample t-test.
7)     C    Perform testing of hypothesis using paired t-test.          blood_pressure.csv
8)   4 A    Perform testing of hypothesis using chi-squared             Students_Score.xlsx
            goodness-of-fit test.
9)     B    Perform testing of hypothesis using chi-squared Test
            of Independence
10)  5      Perform testing of hypothesis using Z-test.                 blood_pressure.csv
11)  6 A    Perform testing of hypothesis using one-way ANOVA.          scores.csv, scores.xlsx
12)    B    Perform testing of hypothesis using two-way ANOVA.          ToothGrowth.csv
13)    C    Perform testing of hypothesis using multivariate            iris.csv
            ANOVA (MANOVA).
14)  7 A    Perform the Random sampling for the given data and          Students_Score.xlsx
            analyse it.
15)    B    Perform the Stratified sampling for the given data          housing.csv
            and analyse it.
16)  8      Compute different types of correlation.
17)  9 A    Perform linear regression for prediction.
18)    B    Perform polynomial regression for prediction.
19)  10 A   Perform multiple linear regression.
20)     B   Perform Logistic regression.                                titanic_train.csv, bank.csv


Dear Teacher,
Please send your valuable feedback and contribution to make this manual more
effective.

Feel free to connect with us at:


[email protected]
[email protected]
[email protected]
