CAE1 - 2 - Set1 Key

This document contains details of a data science examination including course outcomes, exam questions, and explanations of key data science concepts. It begins with listing the course outcomes being assessed. The exam then consists of three parts - Part A contains 5 short answer questions testing basic recall of data science terms and concepts. Part B has two longer answer questions explaining data preparation processes and the data science process in detail. Part C is a single long answer question explaining levels of measurement, types of variables, and providing examples.

Uploaded by

JANILA J.

Reg.No.

Mar Ephraem College of Engineering and Technology, Elavuvilai
B.E. DEGREE CONTINUOUS ASSESSMENT EXAMINATION I – September 2022
Third Semester
Department of Computer Science and Engineering
CS3352 Foundations of Data Science

Time: 1.30 hrs. Maximum: 50 marks

Course Outcomes (COs) for Assessment in this Examination
CO1 Define the data science process
CO2 Describe different types of data description for data science process
CL - Cognitive Level; R - Remember; Un - Understand; Ap - Apply; An - Analyze; Ev - Evaluate; Cr - Create

PART A – (5 X 2 = 10 marks)

1. List the benefits of data science. R CO1
Data science benefits organizations in the commercial, human resources, financial,
government, non-governmental, and education sectors.
2. What do you mean by unstructured data? R CO1
Unstructured data refers to datasets (typically large collections of files)
that are not stored in a structured database format. Unstructured data has an
internal structure, but that structure is not predefined through data models.
It may be human generated or machine generated, in a textual or a non-textual format.
3. What is statistics? R CO2
Statistics is a set of mathematical methods and tools that enable us to answer important
questions about data.
4. List the types of data based on statistical analysis R CO2
Nominal data.
Ordinal data.
Discrete data.
Continuous data.

5. Give the difference between descriptive statistics and inferential statistics. R CO2
Descriptive statistics summarize the features of the raw data that has actually
been collected. Inferential statistics, on the other hand, draw inferences about a
population by using a sample drawn from that population.
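
The distinction can be illustrated with a minimal Python sketch. The sample values are hypothetical, and the 1.96 critical value assumes a normal approximation (a t critical value would be more appropriate for a sample this small).

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 10 students drawn from a larger population.
sample = [160, 165, 170, 172, 168, 175, 158, 180, 169, 173]

# Descriptive statistics: summarize the data we actually have.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: use the sample to estimate the population mean.
# Approximate 95% confidence interval using the normal critical value 1.96
# (an assumption made for simplicity).
std_error = stdev / math.sqrt(len(sample))
ci_low = mean - 1.96 * std_error
ci_high = mean + 1.96 * std_error

print(f"mean={mean:.1f}, median={median:.1f}, stdev={stdev:.2f}")
print(f"approx. 95% CI for population mean: ({ci_low:.1f}, {ci_high:.1f})")
```

The first three values describe only the ten observations; the interval is a statement about the unseen population, which is what makes it inferential.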

PART B – (2 X 13 = 26 marks)

6.a Explain in detail about data preparation process Un CO1
Data preparation steps

The specifics of the data preparation process vary by industry, organization, and
need, but the workflow remains largely the same.

1. Gather data
The data preparation process begins with finding the right data. This can come
from an existing data catalog or data sources can be added ad-hoc.

2. Discover and assess data

After collecting the data, it is important to discover each dataset. This step is
about getting to know the data and understanding what has to be done before the
data becomes useful in a particular context.

3. Cleanse and validate data

Cleaning up the data is traditionally the most time-consuming part of the data
preparation process, but it’s crucial for removing faulty data and filling in gaps.
Important tasks here include:

• Removing extraneous data and outliers
• Filling in missing values
• Conforming data to a standardized pattern
• Masking private or sensitive data entries

Once data has been cleansed, it must be validated by testing for errors in the data
preparation process up to this point. Often, an error in the system will become
apparent during this validation step and will need to be resolved before moving
forward.
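
The four cleansing tasks and the validation check can be sketched in a few lines of Python. The records, field names, age limits, and the choice of the median as a fill value are all hypothetical assumptions made for illustration, not part of the original answer.

```python
import re
import statistics

# Hypothetical raw records; field names and values are illustrative only.
raw = [
    {"name": "Asha", "phone": "98765 43210", "age": 21},
    {"name": "Ravi", "phone": "9876543211", "age": None},   # missing value
    {"name": "Meena", "phone": "98765-43212", "age": 22},
    {"name": "Test", "phone": "9876543213", "age": 230},    # implausible outlier
]

def cleanse(records, min_age=0, max_age=120):
    # 1. Remove outliers: drop rows whose age is outside a plausible range.
    kept = [r for r in records
            if r["age"] is None or min_age <= r["age"] <= max_age]
    # 2. Fill missing values with the median of the known ages.
    known = [r["age"] for r in kept if r["age"] is not None]
    fill = statistics.median(known)
    cleaned = []
    for r in kept:
        # 3. Conform to a standard pattern: keep digits only.
        digits = re.sub(r"\D", "", r["phone"])
        # 4. Mask sensitive data: hide all but the last four digits.
        masked = "*" * (len(digits) - 4) + digits[-4:]
        cleaned.append({
            "name": r["name"],
            "phone": masked,
            "age": fill if r["age"] is None else r["age"],
        })
    return cleaned

def validate(records):
    # Validation: every record must have an age and a 10-character phone field.
    return all(r["age"] is not None and len(r["phone"]) == 10 for r in records)

cleaned = cleanse(raw)
print(cleaned)
print("valid:", validate(cleaned))
```

Running validation as a separate function mirrors the text: cleansing changes the data, while validation only checks that the result satisfies the expected rules.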

4. Transform and enrich data

Data transformation is the process of updating the format or value entries in
order to reach a well-defined outcome, or to make the data more easily
understood by a wider audience. Enriching data refers to adding and connecting
data with other related information to provide deeper insights.
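
As a rough sketch of the difference between the two operations, the Python example below first transforms hypothetical order records (changing date and amount formats) and then enriches them from a lookup table. The record layout, the `city_lookup` table, and the function names are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical order records and a related lookup table.
orders = [
    {"order_id": 1, "date": "05/09/2022", "amount": "1,250.50", "city_code": "NGL"},
    {"order_id": 2, "date": "17/09/2022", "amount": "980", "city_code": "CHN"},
]
city_lookup = {"NGL": "Nagercoil", "CHN": "Chennai"}

def transform(order):
    # Transform: convert the date to ISO format and the amount to a float.
    iso_date = datetime.strptime(order["date"], "%d/%m/%Y").date().isoformat()
    amount = float(order["amount"].replace(",", ""))
    return {**order, "date": iso_date, "amount": amount}

def enrich(order, lookup):
    # Enrich: attach the city name from the related table.
    return {**order, "city": lookup.get(order["city_code"], "unknown")}

prepared = [enrich(transform(o), city_lookup) for o in orders]
print(prepared)
```

Transformation changes how existing values are represented; enrichment adds a new field that was not in the original record.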

5. Store data

Once prepared, the data can be stored or channeled into a third-party application,
such as a business intelligence tool, clearing the way for processing and
analysis to take place.

7.a Explain in detail about data science process. Un CO1

Any data science task must go through certain steps in order to derive useful
results from the data at hand.
• Data Collection – After formulating the problem statement, the main task is
to collect data that can help in analysis and manipulation.
Sometimes data is collected by performing a survey, and at other times
it is collected by scraping.
• Data Cleaning – Most real-world data is not structured and requires
cleaning and conversion into structured data before it can be used for any
analysis or modeling.
• Exploratory Data Analysis – This is the step in which we try to find the
hidden patterns in the data at hand. We also analyze the factors
that affect the target variable and the extent to which they do so. How the
independent features are related to each other, and what can be done to
achieve the desired results, can also be answered in this
step. It also gives a direction in which to work when
starting the modeling process.
• Model Building – Many machine learning algorithms and
techniques have been developed that can identify complex patterns
in the data, a task that would be very tedious for a human.
• Model Deployment – After a model is developed and gives satisfactory results on
the holdout or real-world dataset, we deploy it and monitor its
performance. This is the stage where what was learned from the data
is applied in real-world applications and use cases.
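
The steps above can be sketched end to end in plain Python. This is only an illustrative sketch: the data is synthetic (an assumed linear relationship between hours studied and exam score), the model is a simple least-squares line, and the holdout check stands in for the evaluation done before deployment.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# 1. Data collection: synthetic (hours studied, exam score) pairs stand in
#    for survey or scraped data. The rule score = 5*hours + 40 + noise is an
#    assumption made purely for illustration.
data = [(h, 5 * h + 40 + random.gauss(0, 3)) for h in range(1, 11) for _ in range(5)]
data.append((4, None))  # a deliberately broken record

# 2. Data cleaning: drop records with missing values.
clean = [(x, y) for x, y in data if y is not None]

def mean(values):
    return sum(values) / len(values)

# 3. Exploratory data analysis: Pearson correlation between the two columns.
def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

xs, ys = zip(*clean)
correlation = pearson(xs, ys)

# 4. Model building: fit y = slope*x + intercept by least squares on a training split.
random.shuffle(clean)
split = int(0.8 * len(clean))
train, holdout = clean[:split], clean[split:]

def fit_line(points):
    px, py = zip(*points)
    mx, my = mean(px), mean(py)
    slope = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x in px)
    return slope, my - slope * mx

slope, intercept = fit_line(train)

# 5. Before deployment: measure the mean absolute error on the holdout set.
mae = mean([abs(y - (slope * x + intercept)) for x, y in holdout])

print(f"correlation={correlation:.2f}, slope={slope:.2f}, "
      f"intercept={intercept:.2f}, holdout MAE={mae:.2f}")
```

Keeping a holdout split separate from the training data is what allows the final check to approximate how the model would behave on unseen, real-world data.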

PART C – (1 X 14 = 14 marks)
8.a Explain in detail about levels of measurement and types of variables Un CO2
In ascending order of precision, the four different levels of measurement are:

Nominal – Latin for name only (Republican, Democrat, Green, Libertarian)

Ordinal – Think ordered levels or ranks (small – 8 oz, medium – 12 oz, large – 32 oz)

Interval – Equal intervals among levels (1 dollar to 2 dollars is the same interval
as 88 dollars to 89 dollars)

Ratio – Let the “o” in ratio remind you of a zero in the scale (Day 0, day 1, day
2, day 3, …)
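
One way to see why the level matters is to look at which summary is meaningful at each level. The Python sketch below uses hypothetical observations; the temperature data is an assumed interval-scale example that does not appear in the answer above.

```python
import statistics

# Hypothetical observations at each level of measurement.
party = ["Green", "Democrat", "Republican", "Democrat"]     # nominal
cup_size = ["small", "large", "medium", "medium", "small"]  # ordinal
temp_c = [10.0, 20.0, 30.0]                                 # interval (assumed example)
days_elapsed = [2, 4, 8]                                    # ratio (true zero)

# Nominal: only counting and the mode are meaningful.
most_common_party = statistics.mode(party)

# Ordinal: order is meaningful, so a median can be found by rank.
rank = {"small": 0, "medium": 1, "large": 2}
ordered = sorted(cup_size, key=rank.__getitem__)
median_size = ordered[len(ordered) // 2]

# Interval: differences are meaningful, so a mean can be computed,
# but 20 degrees C is not "twice as hot" as 10 degrees C.
equal_steps = (temp_c[1] - temp_c[0]) == (temp_c[2] - temp_c[1])
mean_temp = statistics.mean(temp_c)

# Ratio: a true zero makes ratios meaningful (4 days is twice 2 days).
ratio = days_elapsed[1] / days_elapsed[0]

print(most_common_party, median_size, equal_steps, mean_temp, ratio)
```

Each level supports every operation of the levels before it plus one more, which is why the list runs from least to most precise.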

Types of variable:
Categorical variables. A categorical variable (also called a qualitative variable)
refers to a characteristic that cannot be quantified. ...
Nominal variables. ...
Ordinal variables. ...
Numeric variables. ...
Continuous variables. ...
Discrete variables.
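
The classification above can be sketched as a small Python helper. The function name `variable_type`, the `ordered_levels` parameter, and the rule itself are hypothetical; in particular, treating whole-number columns as discrete is only a rule of thumb, since a continuous measurement can happen to take whole values.

```python
def variable_type(values, ordered_levels=None):
    """Classify a column with a simple, hypothetical rule of thumb."""
    # Numeric variables: whole-number values are treated as discrete,
    # anything else as continuous.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        if all(float(v).is_integer() for v in values):
            return "numeric / discrete"
        return "numeric / continuous"
    # Categorical variables: ordinal if the caller supplies an ordering,
    # otherwise nominal.
    if ordered_levels is not None and all(v in ordered_levels for v in values):
        return "categorical / ordinal"
    return "categorical / nominal"

print(variable_type([3, 0, 2]))                     # e.g. number of siblings
print(variable_type([61.5, 70.2, 58.0]))            # e.g. weight in kg
print(variable_type(["low", "high"], ["low", "medium", "high"]))
print(variable_type(["red", "blue"]))
```

The ordering has to be supplied by the caller because nothing in the labels themselves says that "low" comes before "high"; that knowledge belongs to the person who defined the variable.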

Prepared by        Verified by
