Data Science - Unit 1 MDM

The document outlines a course on Introduction to Data Science at Jawaharlal Nehru Engineering College, detailing its objectives, outcomes, and content structure. It emphasizes the importance of data science in various sectors, the role of Python in data analysis, and the significance of exploratory data analysis (EDA) techniques. The course aims to equip students with the necessary skills to analyze and interpret data effectively using statistical and machine learning concepts.


Jawaharlal Nehru Engineering College,

Chh. Sambhajinagar

Data Science

Sandip S. Kankal
Assistant Professor, CSE,
JNEC, Chh. Sambhajinagar
Course: MDM in Data Science
Semester III
• Course code: CSE21MDL201
• Course name: Introduction to Data Science
• Course category: MDM
• Credits: 2
• Teaching scheme: L-2hrs/week
• Evaluation scheme: CA–60, ESE–40
Pre-requisite
• Basics of any programming language

Course Objectives:
• To provide the knowledge and expertise needed to become a proficient data scientist.
• To demonstrate an understanding of the statistics and machine learning concepts that are vital for data science.
• To critically evaluate data visualisations based on their design and their use for communicating stories from data.
Course Outcomes:
• At the end of the course, the students will be able to:
CO1: Explain how data is collected, managed, and stored for data science.
CO2: Understand the key concepts in data science, including their real-world applications and the toolkit used by data scientists.
CO3: Understand the different tools and languages used for data science.
Contents
• Unit 1: Introduction to Data Science
• Unit 2: Feature Generation & Extraction
• Unit 3: Data Visualization
• Unit 4: Applications & Tools used in Data Science
Unit 1: Introduction to Data Science
• Introduction to Data Science
• Different Sectors using Data Science
• Purpose & Components of Python in Data Science
• Data Analytics Process
• Knowledge Check
• EDA
• EDA – Quantitative technique
• EDA – Graphical Technique
• Data Analytics Conclusion & Predictions
Data Science
• Data + Science
• Data: factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation.
• A set of values of qualitative or quantitative variables.
• Raw facts and figures.
• E.g., the number of visitors to a website in one month, or individual satisfaction scores on a customer service survey.
Data Science
• Data + Science
• Science: a rigorous, systematic discipline that builds and organizes knowledge in the form of testable hypotheses and predictions about the world.
• Science consists of observing the world: watching, listening, and recording.
Data Science
• We live in a world that’s drowning in data

Data All Around
• Lots of data is being collected and warehoused:
– Web data, e-commerce
– Financial transactions, bank/credit transactions
– Online trading and purchasing
– Social networks
What is Data?
• A collection of raw facts, figures, numbers, or observations.
• Could be anything from website visitor statistics to customer feedback survey results.
• Think of data as the building blocks of information.
• On its own, a single data point might seem insignificant, but when combined and analyzed, it reveals valuable insights.
Data vs Information
• Data is the raw material, while information is what
we derive from that data.

Data vs Information
• Data is unorganised and unrefined facts; information comprises processed, organised data presented in a meaningful context.
• Data is an individual unit that contains raw material which does not carry any specific meaning; information is a group of data that collectively carries a logical meaning.
• Data does not depend on information; information depends on data.
• Raw data alone is insufficient for decision making; information is sufficient for decision making.
• An example of data is a student's test score; the average score of a class is information derived from the given data.
How Much Data Do We Have?

Big Data
• Big data refers to extremely large and diverse collections of structured, unstructured, and semi-structured data that continue to grow exponentially over time.
Big Data
Big Data is any data that is expensive to manage and hard to extract value from:
– Volume
• The size of the data
– Velocity
• The latency of data processing relative to the growing demand for interactivity
– Variety and Complexity
• The diversity of sources, formats, quality, and structures
Big Data Growth

Data Science
• Data science is the practice of mining large data sets of
raw data, both structured and unstructured, to identify
patterns and extract actionable insight from them.
• This is an interdisciplinary field, and the foundations
of data science include statistics, inference, computer
science, predictive analytics, machine learning
algorithm development, and new technologies to gain
insights from big data.

What is Data Science?
• Data science is the study of data to extract
meaningful insights for business.
• It's a multidisciplinary field that combines principles
and practices from mathematics, statistics, artificial
intelligence (AI), and computer engineering to
analyze large amounts of data.
• “data+science” refers to the scientific study of data.

Data Science
• Data science enables businesses to process huge amounts of
structured and unstructured big data to detect patterns.
• This in turn allows companies to increase efficiencies, manage
costs, identify new market opportunities, and boost their
market advantage.
• Asking a personal assistant like Alexa or Siri for a
recommendation demands data science.
• So does operating a self-driving car, using a search engine that
provides useful results, or talking to a chatbot for customer
service.
• These are all real-life applications for data science.
Big Data & Data Science
• Data comes from various sources, such as online
purchases, multimedia forms, instruments,
financial logs, sensors, text files, and others.
• Data might be unstructured, semi-structured, or
structured.
• This is all “big data,” and putting it to good use is a
pressing job of the 21st century.

Big Data & Data Science
• Data science is not one tool, skill, or method.
• Instead, it is a scientific approach that uses applied
statistical and mathematical theory and computer tools to
process big data.
• The foundations of data science combine the
interdisciplinary strengths of data cleansing, intelligent data
capture techniques, and data mining and programming.
• The result is the data scientist’s ability to capture, maintain,
and prepare big data for intelligent analysis.
Different Sectors using Data Science

Purpose of Python in Data Science
• Data science is a domain that deals with the collection,
analysis and interpretation of data, specifically for
business purposes.
• It involves statistics, machine learning, artificial
intelligence and database systems techniques altogether.
• Python is one of the most popular programming
languages used in data science owing to its simplicity and
flexibility.

Purpose of Python in Data Science
• Python is one of the most popular programming languages
• It provides simplicity and flexibility
• It uses an elegant syntax, hence programs are easier to read
• Large standard library and community support
• Python is an open-source and expressive language
• GUI support
• Rich ecosystem for statistics, machine learning, and artificial intelligence
Purpose of Python in Data Science
• In terms of application areas, Data scientists prefer
Python for the following modules:
• Data Analysis
• Data Visualizations
• Machine Learning
• Deep Learning
• Image processing
• Computer Vision
• Natural Language Processing (NLP)
Components of Python in Data Science
• Python has libraries with large collections of mathematical functions and analytical tools.
• Pandas - This library is used for structured data operations, like importing CSV files, creating DataFrames, and preparing data.
• NumPy - This is a mathematical library. It has a powerful N-dimensional array object, linear algebra, Fourier transforms, etc.
• Matplotlib - This library is used for visualization of data.
• SciPy - This library has linear algebra modules, along with other scientific computing routines.
• Seaborn - A statistical data visualization library built on top of Matplotlib.
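• For illustration, a minimal sketch of how these libraries work together (the tiny DataFrame below is made up for this example):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Pandas: build a small structured dataset (a DataFrame)
df = pd.DataFrame({"city": ["A", "B", "C", "D"],
                   "visitors": [120, 340, 95, 210]})
print(df.describe())                      # summary statistics

# NumPy: element-wise mathematics on a column
df["log_visitors"] = np.log(df["visitors"])

# Seaborn / Matplotlib: visualise the data
sns.barplot(x="city", y="visitors", data=df)
plt.title("Visitors per city")
plt.show()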
Data Analysis
• Data analysis is the process of collecting, transforming,
cleaning and modeling data with the goal of discovering
required information
• A simple example of data analysis: whenever we make a decision in our day-to-day life, we think about what happened last time or what will happen if we choose a particular option.
• This is nothing but analyzing our past or future and making decisions based on it.
Data Analytics
• Data analytics is the process of using tools,
technologies, and processes to convert raw data
into insights that can help solve problems and
identify trends.
• It can help businesses improve decision-making,
shape processes, and grow.

Analytics Types
• Descriptive analytics - what happened
• Diagnostic analytics - why it happened
• Predictive analytics - what is likely to happen
• Prescriptive analytics - what should be done about it
Data Analytics Process

Why EDA?
• EDA is analyzing data using visual techniques.
• It is used to discover patterns or trends, or to check assumptions, with the help of statistical summaries & graphical tools
• To check for mistakes
• Checking assumptions
• Selection of appropriate models
• Determining relationships between variables
• Exploratory Data Analysis is a data analytics process to understand the data in depth and learn its different characteristics, often with visual means.
• This allows you to get a better feel of your data and find useful patterns in it.
Key aspects of EDA include:
• Distribution of Data: Examining the distribution of data points to understand their range,
central tendencies (mean, median), and dispersion (variance, standard deviation).
• Graphical Representations: Utilizing charts such as histograms, box plots, scatter plots, and
bar charts to visualize relationships within the data and distributions of variables.
• Outlier Detection: Identifying unusual values that deviate from other data points. Outliers can
influence statistical analyses and might indicate data entry errors or unique cases.
• Correlation Analysis: Checking the relationships between variables to understand how they
might affect each other. This includes computing correlation coefficients and creating
correlation matrices.
• Handling Missing Values: Detecting and deciding how to address missing data points, whether
by imputation or removal, depending on their impact and the amount of missing data.
• Summary Statistics: Calculating key statistics that provide insight into data trends and
nuances.
• Testing Assumptions: Many statistical tests and models assume the data meet certain
conditions (like normality or homoscedasticity). EDA helps verify these assumptions.
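• As a small illustration of the "missing values" and "summary statistics" aspects above, a sketch with made-up values (the column names are illustrative only):

import numpy as np
import pandas as pd

df = pd.DataFrame({"age":    [25, 32, np.nan, 41, 38],
                   "income": [42000, 58000, 51000, np.nan, 60000]})

print(df.isnull().sum())     # number of missing values per column
print(df.describe())         # count, mean, std, min, quartiles, max

# One possible treatment: fill numeric gaps with the column median
df_filled = df.fillna(df.median(numeric_only=True))
# Alternative: drop any row that contains a missing value
df_dropped = df.dropna()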
Techniques
• Most of the EDA techniques are graphical in
nature with few quantitative techniques
• EDA – Quantitative
– Descriptive Statistics (Mean, Median, Mode, Variance,
Std deviation, Range)
• EDA – Graphical
– Histogram, Scatterplot, Bar chart, Line Chart, Boxplot
etc.

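• The quantitative techniques listed above (mean, median, mode, variance, standard deviation, range) map directly onto Pandas calls; a sketch on a made-up sample of test scores:

import pandas as pd

scores = pd.Series([56, 72, 72, 64, 81, 90, 67])

print("Mean:    ", scores.mean())
print("Median:  ", scores.median())
print("Mode:    ", scores.mode().tolist())   # mode() may return several values
print("Variance:", scores.var())             # sample variance
print("Std dev: ", scores.std())
print("Range:   ", scores.max() - scores.min())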
Types of EDA
• Univariate non-graphical (quantitative)
• Univariate graphical
• Multivariate non-graphical (quantitative)
• Multivariate graphical
EDA
• Univariate analysis is a statistical method that examines
one variable at a time to summarize or describe it, and to
look for patterns in the data.
• Bivariate data involves two different variables, and the
analysis of this type of data focuses on understanding the
relationship or association between these two variables.
• Multivariate data refers to datasets where each
observation or sample point consists of multiple
variables or features.
Analysis
• Examples:
• Studying the heights of players
• Analyzing the sale of ice creams based on the temperature outside
• Analyzing revenue based on expenditure
How to perform EDA?
• This involves exploring the dataset in three ways:
• Summarizing the dataset using descriptive statistics
• Visualizing the dataset using charts
• Normalizing the dataset
Univariate Analysis
• Histograms: Used to visualize the distribution of a variable.
• Box plots: Useful for detecting outliers and understanding the spread
and skewness of the data.
• Bar charts: Employed for categorical data to show the frequency of
each category.
• Summary statistics: Calculations like mean, median, mode, variance,
and standard deviation that describe the central tendency and dispersion
of the data.

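• A sketch of these univariate techniques with Seaborn/Matplotlib on made-up data (variable names are illustrative):

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.DataFrame({"height_cm": [160, 172, 168, 181, 175, 169, 190, 158],
                   "team":      ["A", "A", "B", "B", "A", "B", "A", "B"]})

sns.histplot(df["height_cm"], bins=5)        # histogram: distribution of one variable
plt.show()
sns.boxplot(x=df["height_cm"])               # box plot: spread, skewness, outliers
plt.show()
df["team"].value_counts().plot(kind="bar")   # bar chart: frequency of each category
plt.show()
print(df["height_cm"].describe())            # summary statistics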
Bivariate Analysis
• Scatter Plots: A scatter plot helps visualize the relationship between two continuous
variables.
• Correlation Coefficient: This statistical measure (often Pearson’s correlation coefficient
for linear relationships) quantifies the degree to which two variables are related.
• Cross-tabulation: Also known as contingency tables, cross-tabulation is used to analyze
the relationship between two categorical variables. It shows the frequency distribution of
categories of one variable in rows and the other in columns, which helps in
understanding the relationship between the two variables.
• Line Graphs: In the context of time series data, line graphs can be used to compare two
variables over time. This helps in identifying trends, cycles, or patterns that emerge in
the interaction of the variables over the specified period.
• Covariance: Covariance is a measure used to determine how much two random variables
change together. However, it is sensitive to the scale of the variables, so it’s often
supplemented by the correlation coefficient for a more standardized assessment of the
relationship.
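• A sketch of these bivariate measures (scatter plot, correlation coefficient, covariance, cross-tabulation) on made-up data:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"temperature": [18, 22, 25, 28, 31, 34],
                   "ice_creams":  [20, 35, 50, 70, 90, 110],
                   "weekend":     ["no", "no", "yes", "yes", "no", "yes"],
                   "sold_out":    ["no", "no", "no", "yes", "yes", "yes"]})

plt.scatter(df["temperature"], df["ice_creams"])     # scatter plot of two continuous variables
plt.xlabel("Temperature")
plt.ylabel("Ice creams sold")
plt.show()

print(df["temperature"].corr(df["ice_creams"]))      # Pearson correlation coefficient
print(df["temperature"].cov(df["ice_creams"]))       # covariance (sensitive to scale)
print(pd.crosstab(df["weekend"], df["sold_out"]))    # contingency table for two categorical variables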
Multivariate Analysis
• Pair plots: Visualize relationships across several variables
simultaneously to capture a comprehensive view of
potential interactions.
• Principal Component Analysis (PCA): A dimensionality
reduction technique used to reduce the dimensionality of
large datasets, while preserving as much variance as
possible.

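• A sketch of these multivariate techniques on made-up data; note that the PCA step below uses scikit-learn, an assumed extra dependency not listed earlier in this deck:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA   # assumed extra dependency (scikit-learn)

df = pd.DataFrame({"hp":     [120, 150, 200, 90, 300, 250],
                   "weight": [1100, 1250, 1400, 950, 1700, 1500],
                   "mpg":    [38, 32, 27, 45, 18, 22],
                   "price":  [15000, 18000, 26000, 12000, 55000, 40000]})

sns.pairplot(df)              # pairwise scatter plots and histograms for all variables
plt.show()

# In practice the features are usually standardised before PCA
standardised = (df - df.mean()) / df.std()
pca = PCA(n_components=2)                 # keep 2 principal components
components = pca.fit_transform(standardised)
print(pca.explained_variance_ratio_)      # share of variance kept by each component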
Tools for EDA
• Python Libraries
• Pandas: Provides functions for data manipulation and analysis, including data structure handling and time series functionality.
• Matplotlib: A plotting library for creating static, interactive, and animated visualizations in Python.
• Seaborn: Built on top of Matplotlib, it provides a high-level interface for drawing attractive and informative statistical graphics.
• Plotly: An interactive graphing library for making interactive plots; it offers more sophisticated visualization capabilities.
• R Packages
• ggplot2: Part of the tidyverse, it's a powerful tool for making complex plots from data in a data frame.
• dplyr: A grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges.
• tidyr: Helps to tidy your data. Tidying your data means storing it in a consistent form that matches the semantics of the dataset with the way it is stored.
Steps in EDA
• Data Collection: involves gathering relevant data for analysis. Data can be
collected from various sources, including public datasets, surveys, and databases.
• Data Cleaning: This step involves checking for missing data, errors, and
outliers. The data is cleaned by removing duplicates, correcting data entry
errors, and filling in missing values.
• Data Visualization: This step involves creating visualizations to identify
patterns and relationships in the data. Common visualization techniques include
scatter plots, histograms, and box plots.
• Data Transformation: This step involves transforming the data to make it more
suitable for analysis. This can include normalization, scaling, and
standardization.
• Data Modeling: This step involves creating models to describe the relationships
between variables. Models can be simple, such as linear regression, or complex,
such as decision trees or neural networks.
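• The Data Transformation step above (normalization, scaling, standardization) can be sketched with plain Pandas on a made-up DataFrame:

import pandas as pd

df = pd.DataFrame({"price": [12000, 18000, 26000, 55000],
                   "hp":    [90, 120, 200, 300]})

# Min-max normalization: rescale each column to the range [0, 1]
normalized = (df - df.min()) / (df.max() - df.min())

# Standardization (z-score): zero mean, unit standard deviation per column
standardized = (df - df.mean()) / df.std()

print(normalized)
print(standardized)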
EDA Process
• STEP 1: Import libraries

# importing libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

• STEP 2: Read .csv file

df = pd.read_csv("../input/cardataset/data.csv")

• STEP 3: Display first 5 rows and last 5 rows of dataset

# To display the top 5 rows
df.head(5)
# To display the bottom 5 rows
df.tail(5)
EDA Process
#import libraries
import pandas as pd
import numpy as np
import seaborn as sns                 #visualisation
import matplotlib.pyplot as plt       #visualisation
%matplotlib inline
sns.set(color_codes=True)

#Load dataset
df = pd.read_csv("C:/Users/input/cardataset/data.csv")

# To display the top 5 rows
df.head(5)
df.tail(5)    # To display the bottom 5 rows

#check types of data
df.dtypes

#Renaming the columns
df = df.rename(columns={"Engine HP": "HP", "Engine Cylinders": "Cylinders",
                        "Transmission Type": "Transmission", "Driven_Wheels": "Drive Mode",
                        "highway MPG": "MPG-H", "city mpg": "MPG-C", "MSRP": "Price"})
df.head(5)

#Dropping the duplicate rows
df.shape
duplicate_rows_df = df[df.duplicated()]
print("number of duplicate rows: ", duplicate_rows_df.shape)
EDA Process
#Now let us remove the duplicate data because it's ok to remove them.
df.count()      # Used to count the number of rows
#So seen above there are 11914 rows and we are removing 989 rows of duplicate data.
df = df.drop_duplicates()
df.head(5)
df.count()

#Detecting Outliers
sns.boxplot(x=df['Price'])
sns.boxplot(x=df['HP'])
sns.boxplot(x=df['Cylinders'])

#Removing outliers with the IQR rule
#(On recent pandas versions these quantile/comparison operations require numeric columns; select them with df.select_dtypes('number') first.)
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
print(IQR)
df = df[~((df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))).any(axis=1)]
df.shape
EDA Process
#Plot different features against one another (scatter), against frequency (histogram)

#Histogram
#A histogram shows the frequency of occurrence of a variable's values within intervals.
df.Make.value_counts().nlargest(40).plot(kind='bar', figsize=(10,5))
plt.title("Number of cars by make")
plt.ylabel('Number of cars')
plt.xlabel('Make');

#Heatmaps
#A heat map is a type of plot which is useful when we need to find how the variables depend on one another.
#One of the best ways to find the relationship between the features is a heat map of the correlation matrix.
#In the heat map below we can see that the price feature depends mainly on the Engine Size, Horsepower, and Cylinders.
plt.figure(figsize=(10,5))
c = df.corr()    # on recent pandas versions use df.corr(numeric_only=True) if non-numeric columns remain
sns.heatmap(c, cmap="BrBG", annot=True)
c
EDA Process
#Scatterplot
#We generally use scatter plots to find the correlation between two variables
# Here the scatter plots are plotted between Horsepower and Price and we can see the plot below.
#With the plot given below, we can easily draw a trend line.
#These features provide a good scattering of points.

fig, ax = plt.subplots(figsize=(10,6))
ax.scatter(df['HP'], df['Price'])
ax.set_xlabel('HP')
ax.set_ylabel('Price')
plt.show()

Data Analytics Conclusion & Prediction

Knowledge Check
• Data is
• A. A set of values of qualitative or quantitative variables
• B. factual information or raw facts
• C. Both A & B
• D. None of the above
Knowledge Check
• Select one of the following where data is being collected
• A. Education
• B. Business
• C. Healthcare
• D. All of the above

Knowledge Check
Information is
• A. processed, organized, and structured data
• B. sufficient for decision making
• C. data that carries logical meaning
• D. All of the above
Knowledge Check
• Identify the types of analytics
• A. Descriptive
• B. Diagnostic
• C. Predictive
• D. All of the above

Knowledge Check
EDA is
• A. used to discover patterns or trends
• B. to check assumptions with statistical summaries & graphical tools
• C. analyzing data using visual techniques
• D. All of the above
References
• https://www.rudderstack.com/learn/data-analytics/data-analytics-processes/
• https://dev.to/yankho817/exploratory-data-analysis-edaultimate-guide-174d
