Description: Hint: Perform Steps As Mentioned Below

This document describes analyzing IQ data from several datasets: 1. Load 10000 IQ scores and recalculate the mean and standard deviation, printing the results. 2. Using a normal distribution, calculate the percentage of scores between values from a test file and print the result. 3. Read sample data files specified in the test file, test if the sample mean equals the population mean, and print "Reject" or "Accept".

Uploaded by

Anish Kumar

DESCRIPTION

Consider an automobile data set with values such as the name of the car model, its
mileage, number of cylinders, number of gears, and so on.

Here’s a preview of the data under consideration:

The data is in the file mtcars.csv, located at /data/training/mtcars.csv

Write Python code to calculate the difference between the mean 10-fold
cross-validation scores of ridge regression and lasso regression, each with
alpha set to 1.0.

Hint: Perform steps as mentioned below:

- Load data
- Use all the columns except ‘mpg’ & ‘model’ as predictors (x)
- Use ‘mpg’ as the response column (y)
- Perform Lasso & Ridge regression on this data. Use all the rows as
  training data for performing regression.
- Perform cross validation on both the regression results for the above x & y
  with cv=10 and default scoring
- Calculate and print the difference by subtracting the mean score of
  ridge from the mean score of lasso regression
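
The steps above can be sketched roughly as follows. This is only one possible reading of the task, not a reference solution: it assumes mtcars.csv has columns literally named `mpg` and `model` and that every other column is numeric. The function names `lasso_minus_ridge` and `write_difference` are hypothetical helpers introduced here for illustration.

```python
import pandas as pd
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score


def lasso_minus_ridge(df: pd.DataFrame, cv: int = 10) -> float:
    """Mean CV score of lasso minus mean CV score of ridge, rounded to 2 decimals."""
    # Assumption: predictors are every column except 'mpg' and 'model'.
    x = df.drop(columns=["mpg", "model"])
    y = df["mpg"]

    # Fit both models on all rows, as the hint asks.
    ridge = Ridge(alpha=1.0).fit(x, y)
    lasso = Lasso(alpha=1.0).fit(x, y)

    # Default scoring for regressors is R^2; cross_val_score refits a clone per fold.
    ridge_scores = cross_val_score(ridge, x, y, cv=cv)
    lasso_scores = cross_val_score(lasso, x, y, cv=cv)
    return round(float(lasso_scores.mean() - ridge_scores.mean()), 2)


def write_difference(in_path: str, out_path: str) -> None:
    """Hypothetical helper: read the CSV and write the value in the first row."""
    diff = lasso_minus_ridge(pd.read_csv(in_path))
    with open(out_path, "w") as fh:
        fh.write(f"{diff}\n")
```

With the paths given in this assignment, the call would be `write_difference("/data/training/mtcars.csv", "/code/output/output.csv")`.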

Input Format:

Read the input file /data/training/mtcars.csv

Output Format:

- Perform the operations described above and write the difference in the
  mean scores, as stated above, to a file named output.csv, which should be
  present at the location /code/output/output.csv
- output.csv should contain the value rounded to 2 decimals in the first row

Sample Output:

Example: output.csv will have data looking like this:

Iris Data - (Assignment 4 - Question 3)
Subject: Machine Learning / AI
Points: 5

DESCRIPTION
Question:

Perform logistic regression on iris data set as follows:

1. Load iris data set from sklearn.datasets

- Hint: To load the dataset, use:

from sklearn import datasets

iris = datasets.load_iris()
x = iris.data
y = iris.target

- To perform logistic regression, use the function from sklearn.linear_model
  with default values for its parameters

2. Perform cross validation on this model for the specified x & y values with cv
as 5 and scoring as accuracy.

- Hint: Use the function cross_val_score

- This generates accuracy scores, one for each of the 5 folds
- Print the mean accuracy score rounded to 2 decimal places
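
A minimal sketch of the two steps above, assuming `LogisticRegression` is the intended function from `sklearn.linear_model` and `cross_val_score` comes from `sklearn.model_selection`. With default parameters, some scikit-learn versions may emit a convergence warning on this data; the scores are still returned.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load predictors and response exactly as in the hint.
iris = datasets.load_iris()
x = iris.data
y = iris.target

# Logistic regression with default parameter values.
model = LogisticRegression()

# One accuracy score per fold, five folds in total.
scores = cross_val_score(model, x, y, cv=5, scoring="accuracy")

mean_accuracy = round(float(scores.mean()), 2)
print(mean_accuracy)
```

The exact value can differ slightly between scikit-learn versions because the default solver has changed over time.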

Input Format:

- Refer to the starter code provided in the CODE section to load the data
  and set the predictors & response variables.

Output Format:

- Write the value of mean accuracy score in a file named output.csv which
  should be present at the location /code/output/output.csv
- Write the value rounded to 2 decimal places in the first row

Sample Output:

Example: output.csv will have data looking like this:

DATASETS

EXECUTION TIME LIMIT

Default.

IQ Data - (Assignment 4 - Question 1)
Subject: Machine Learning / AI
Points: 15

DESCRIPTION
The IQ data set containing 10000 data points is present at the location
(/data/training/iqdata.csv)
The data set contains only the IQ values of people who participated in the survey
across the world in a single column without header.

Here's a preview of the data under consideration:

It contains IQ values with the below specifications:

- The average IQ is around 110
- There are a few super-intelligent people whose IQ is 192
- There are a few people with an IQ as low as 34
- The standard deviation is around 20
- The data points follow a normal distribution

Based on this data, create Python programs to perform the required analysis as
described below:

1. Load the 10000 point data into a 1-D array. Then recalculate
its mean & standard deviation to obtain their exact values. Print these two
values.

- Hint: Use functions from the numpy library

- Note: These two values are calculated for the entire data in all the cases
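
Step 1 might look like the sketch below. It assumes the file holds one numeric value per line with no header, as described, and it uses NumPy's default population standard deviation (`ddof=0`). The assignment does not say which variant is expected, so that choice is an assumption. `population_stats` is a hypothetical helper name.

```python
import numpy as np


def population_stats(path):
    """Load a single-column, header-less file and return (data, mean, std)."""
    # ndmin=1 keeps the result 1-D even if the file has a single value.
    data = np.loadtxt(path, ndmin=1)
    mean = float(np.mean(data))
    # Assumption: population standard deviation (ddof=0), NumPy's default.
    sd = float(np.std(data))
    return data, mean, sd
```

For this assignment the call would be `population_stats("/data/training/iqdata.csv")`, followed by printing the two values.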

2. Calculate what percentage of people should have an IQ value between two
values specified in the /data/training/testcaseiq.txt

- Hint: Since the data follows a normal distribution, use an appropriate
  function of norm from the scipy.stats library
- Using this function, calculate the probability of an IQ score being smaller
  than the upper value specified in the testcaseiq.txt
- Similarly, calculate the probability of an IQ score being smaller than
  the lower value specified in the testcaseiq.txt
- Subtract the above two values to calculate the probability of IQ score
  falling between the lower and upper values
- Finally, print the result as percentage without the % sign
- Do this for all the testcases provided in testcaseiq.txt
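
The subtraction described above can be sketched with `norm.cdf`, which is presumably the "appropriate function" the hint refers to. The helper name `percent_between` is hypothetical, and the example call uses the approximate mean (110) and standard deviation (20) quoted in the description rather than the exact recalculated values.

```python
from scipy.stats import norm


def percent_between(lower, upper, mean, sd):
    """Percentage of a normal(mean, sd) population with lower < IQ < upper."""
    p_upper = norm.cdf(upper, loc=mean, scale=sd)  # P(IQ < upper)
    p_lower = norm.cdf(lower, loc=mean, scale=sd)  # P(IQ < lower)
    return round(float((p_upper - p_lower) * 100), 3)


# Illustrative values only: 110 and 20 are the "around" figures from the text.
print(percent_between(80, 140, mean=110, sd=20))  # about 86.639
```

In the actual task, `mean` and `sd` would be the values recalculated from iqdata.csv in step 1.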

3. Samples drawn from this data are stored in different files such
as iqsample1.csv, iqsample2.csv and so on. Read the name of the file
(<file_name>) from testcaseiq.txt and then read the corresponding file
from /data/training/<file_name>.csv

Consider a Null hypothesis that the mean of the sample is equal to the
population mean of the above 10000 point data set. Test and decide whether the
hypothesis can be accepted or rejected based on the p-value as:
- If p-value < 0.05, print as "Reject"
- Else print as "Accept"
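
The assignment does not name the statistical test. One common choice for comparing a sample mean with a known population mean is a one-sample t-test, so the sketch below assumes `scipy.stats.ttest_1samp`. A z-test using the known population standard deviation would be another reasonable reading. `decide_hypothesis` is a hypothetical helper name.

```python
from scipy.stats import ttest_1samp


def decide_hypothesis(sample, population_mean, alpha=0.05):
    """Return "Reject" if the p-value is below alpha, otherwise "Accept"."""
    # Assumption: two-sided one-sample t-test against the population mean.
    result = ttest_1samp(sample, popmean=population_mean)
    return "Reject" if result.pvalue < alpha else "Accept"
```

Here `sample` would be the values loaded from the sample file and `population_mean` the mean of the full 10000-point data set.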

Input Format:

- The first file to be read will be iqdata.csv, which contains the data as
  mentioned above. This file is in .csv format and is present at the location
  (/data/training/iqdata.csv)
- The second file to be read is testcaseiq.txt which is present at
  (/data/training/testcaseiq.txt)
- testcaseiq.txt has the following lines:
  - The first line contains the number of test cases T
  - From the second line, every set of three lines contains the lower
    value of the desired IQ range, the upper value of the desired IQ range and
    the name of the file containing samples to be used in the calculation of
    Null Hypothesis testing, such as iqsample1
  - Then read the sample data from (/data/training/iqsample1.csv)

Output Format:

- For each test case T, create an output file, output1.csv, output2.csv, ...,
  outputn.csv where n represents the test case number
- outputn.csv should be present at the location
  (/code/output/outputn.csv). This file should consist of the values
  for Mean and Standard Deviation on two separate rows, both values rounded
  to 2 decimal places.

Note: These two values are calculated for the entire data in all the cases

- The third line should contain the percentage value such as 34.567 of
  people with IQ in the specified range. The value should be rounded to 3
  decimal places
- The fourth line should contain the result of the Null Hypothesis in the
  format stated above
- outputn.csv should consist of the values on four separate rows one
  below the other
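
The four-row layout described above could be produced as in the sketch below. The helper names `format_output` and `write_output` are hypothetical, and fixed-width formatting (which keeps trailing zeros, e.g. `110.00`) is an assumption, since the statement only says "rounded".

```python
import os


def format_output(mean, sd, percent, decision):
    """Four rows: mean (2 dp), standard deviation (2 dp), percentage (3 dp), decision."""
    return f"{mean:.2f}\n{sd:.2f}\n{percent:.3f}\n{decision}\n"


def write_output(case_number, mean, sd, percent, decision, out_dir="/code/output"):
    """Write output<case_number>.csv into out_dir and return its path."""
    path = os.path.join(out_dir, f"output{case_number}.csv")
    with open(path, "w") as fh:
        fh.write(format_output(mean, sd, percent, decision))
    return path
```

Calling `write_output(1, mean, sd, percent, decision)` once per test case, with the case number starting at 1, yields output1.csv, output2.csv and so on.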

Sample Test Cases:

testcaseiq.txt contains the following data:

2
80
140
iqsample1
70
120
iqsample2
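
A file laid out like the sample above could be parsed as in the following sketch. `parse_testcases` is a hypothetical helper. It assumes each value sits on its own line and that the sample file name is given without the `.csv` extension, as in the example.

```python
def parse_testcases(text):
    """Return a list of (lower, upper, sample_path) tuples from testcaseiq.txt content."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    count = int(lines[0])  # first line: number of test cases T
    cases = []
    for i in range(count):
        # Each test case occupies three consecutive lines.
        lower, upper, name = lines[1 + 3 * i : 4 + 3 * i]
        cases.append((float(lower), float(upper), f"/data/training/{name}.csv"))
    return cases


sample = "2\n80\n140\niqsample1\n70\n120\niqsample2\n"
print(parse_testcases(sample))
# [(80.0, 140.0, '/data/training/iqsample1.csv'), (70.0, 120.0, '/data/training/iqsample2.csv')]
```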

Sample Output:
Example: output1.csv will have data looking like this:

DATASETS

- Training dataset

EXECUTION TIME LIMIT

Default.
