What Is Data Science

Data science uses machine learning algorithms to analyze vast amounts of data from different sources to find patterns and make business decisions. The data science lifecycle involves 5 stages: data acquisition, data preparation, data examination, data analysis, and communicating results. Key prerequisites for data science include machine learning, modeling, statistics, programming (especially Python and R), and understanding databases. Python is well-suited for machine learning due to its simplicity, extensive libraries for tasks like deep learning, platform independence, and large supportive community.

Uploaded by

chandana kiran

STUDY MATERIAL

DATA SCIENCE AND MACHINE LEARNING

What Is Data Science?

Data science is the domain of study that deals with vast volumes of data using modern tools and
techniques to find unseen patterns, derive meaningful information, and make business decisions.
Data science uses complex machine learning algorithms to build predictive models.

The data used for analysis can come from many different sources and be presented in various
formats.

The Data Science Lifecycle

Now that you know what data science is, let us turn to the data science lifecycle. It consists
of five distinct stages, each with its own tasks:

1. Capture: Data Acquisition, Data Entry, Signal Reception, Data Extraction. This stage involves
gathering raw structured and unstructured data.

2. Maintain: Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data
Architecture. This stage covers taking the raw data and putting it in a form that can be used.

3. Process: Data Mining, Clustering/Classification, Data Modeling, Data Summarization. Data
scientists take the prepared data and examine its patterns, ranges, and biases to determine how
useful it will be in predictive analysis.

4. Analyze: Exploratory/Confirmatory, Predictive Analysis, Regression, Text Mining,
Qualitative Analysis. Here is the real meat of the lifecycle. This stage involves performing the
various analyses on the data.

5. Communicate: Data Reporting, Data Visualization, Business Intelligence, Decision Making.
In this final step, analysts prepare the analyses in easily readable forms such as charts, graphs,
and reports.
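
The five stages above can be sketched end to end in a few lines. This is only a minimal illustration using Python's standard library; the sensor readings, column names, and function names are hypothetical and do not come from the original material.

```python
import csv
import io
import statistics

# Hypothetical raw input: temperature readings with one missing value.
RAW_CSV = """sensor,temperature
a,21.5
a,22.0
b,
b,19.0
b,20.0
"""


def capture(text):
    """Stage 1 (Capture): read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def maintain(rows):
    """Stage 2 (Maintain): drop incomplete rows and convert types."""
    return [
        {"sensor": r["sensor"], "temperature": float(r["temperature"])}
        for r in rows
        if r["temperature"].strip()
    ]


def process(rows):
    """Stage 3 (Process): group the cleaned readings by sensor."""
    groups = {}
    for r in rows:
        groups.setdefault(r["sensor"], []).append(r["temperature"])
    return groups


def analyze(groups):
    """Stage 4 (Analyze): summarize each group with its mean."""
    return {sensor: statistics.mean(vals) for sensor, vals in groups.items()}


def communicate(summary):
    """Stage 5 (Communicate): format the result as a small text report."""
    return [f"sensor {s}: mean {m:.2f}" for s, m in sorted(summary.items())]


report = communicate(analyze(process(maintain(capture(RAW_CSV)))))
print("\n".join(report))
```

In a real project each stage would be far larger (a data warehouse instead of a string, charts instead of text lines), but the flow of data from one stage to the next stays the same.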

Prerequisites for Data Science

Here are some of the technical concepts you should know before you start learning data
science.

1. Machine Learning

Machine learning is the backbone of data science. Data scientists need a solid grasp of ML in
addition to a basic knowledge of statistics.

2. Modeling

Mathematical models enable you to make quick calculations and predictions based on what you
already know about the data. Modeling is also a part of Machine Learning and involves
identifying which algorithm is the most suitable to solve a given problem and how to train these
models.

3. Statistics

Statistics is at the core of data science. A solid grasp of statistics can help you extract more
intelligence and obtain more meaningful results.

4. Programming

Some level of programming is required to execute a successful data science project. The most
common programming languages are Python and R. Python is especially popular because it is
easy to learn and offers many libraries for data science and ML.

5. Databases

A capable data scientist needs to understand how databases work, how to manage them, and how
to extract data from them.
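
Several of the prerequisites above can be seen working together in one small example. The sketch below is an assumption-laden illustration, not a method from the original material: the table name, column names, and numbers are made up, and the model is a plain least-squares line, one of the simplest predictive models.

```python
import sqlite3
import statistics

# Hypothetical data: advertising spend and resulting sales, both in thousands.
rows = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# Databases: store the records in an in-memory SQLite database and query them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns (spend REAL, sales REAL)")
conn.executemany("INSERT INTO campaigns VALUES (?, ?)", rows)
data = conn.execute("SELECT spend, sales FROM campaigns ORDER BY spend").fetchall()
conn.close()

spend = [r[0] for r in data]
sales = [r[1] for r in data]

# Statistics: describe the data before modeling it.
mean_spend = statistics.mean(spend)
mean_sales = statistics.mean(sales)
stdev_sales = statistics.stdev(sales)

# Modeling: fit sales = slope * spend + intercept by ordinary least squares.
covariance_sum = sum((x - mean_spend) * (y - mean_sales) for x, y in zip(spend, sales))
variance_sum = sum((x - mean_spend) ** 2 for x in spend)
slope = covariance_sum / variance_sum
intercept = mean_sales - slope * mean_spend


def predict(x):
    """Predict sales for a given spend using the fitted line."""
    return slope * x + intercept


print(f"mean sales = {mean_sales}, sample stdev = {stdev_sales:.3f}")
print(f"fitted line: sales = {slope} * spend + {intercept}")
print(f"predicted sales at spend 5.0: {predict(5.0)}")
```

The same division of labor carries over to larger projects: a database supplies the data, descriptive statistics check it, and a model turns it into predictions.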

Python has many advantages over other languages for machine learning. For example, Python:

• is easy to learn
• is compatible across platforms
• has clear, readable code
• is fast to develop in
• has extensive libraries
• is object-oriented
• is open-source and free
• is a high-level language
• has built-in data structures
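
A few of these points, in particular clear code and built-in data structures, are easiest to see in a short example. The sentence being analyzed is made up for illustration; everything used here ships with Python itself.

```python
from collections import Counter

text = "data science uses data to find patterns in data"

# Built-in list: split the sentence into words.
words = text.split()

# Built-in set: the distinct words.
unique_words = set(words)

# Counter (a dict subclass from the standard library): word frequencies.
counts = Counter(words)

# Built-in dict, created with a comprehension: length of each distinct word.
lengths = {word: len(word) for word in unique_words}

print(counts.most_common(1))
print(sorted(unique_words))
print(lengths["patterns"])
```

No third-party package and no explicit type declarations are needed, which is part of what makes the language quick to pick up.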

4 reasons why Python is the best language for Machine Learning

1. Simplicity and consistency

AI algorithms and machine learning models are complex predictive technologies, and Python helps
keep them manageable. Its clear code and many machine learning-specific libraries let developers
shift their focus from the language to the algorithms. Python is also easy to learn, consistent,
and intuitive. These qualities help explain its popularity: in one developer survey, Python
ranked third among the most popular technologies, chosen by 48.24% of developers.

2. Variety of libraries and frameworks

Python has a vast collection of libraries and frameworks for machine learning. For example:

• NumPy provides arrays and matrices, along with linear algebra routines that operate on them.
• Keras is a deep learning API that runs on top of TensorFlow and makes fast experimentation
possible.
• TensorFlow is a free, open-source library for ML and AI that focuses on training and running
deep neural networks.
• Matplotlib is a library for creating static, animated, and interactive visualizations in
Python.
• Seaborn is a Python data visualization library, built on Matplotlib, for drawing attractive,
high-quality statistical graphics.
• PyTorch is an open-source ML library used to build computer vision and natural language
processing applications.
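
As a small taste of what these libraries offer, the sketch below uses NumPy's arrays and linear algebra routines to fit a straight line. It assumes NumPy is installed, and the data points are hypothetical values chosen to lie exactly on y = 2x + 1; the other libraries in the list are not shown.

```python
import numpy as np

# Hypothetical observations that lie exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix: one column for x and a column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Least-squares solution of A @ [slope, intercept] = y.
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The whole fit is a single library call on arrays, with no explicit loops, which illustrates how such libraries let the code stay focused on the algorithm.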

3. Platform independence

Software developed with Python can be built and run on multiple operating systems, for instance
Linux, Windows, macOS, and Solaris. This makes machine learning development in Python a lot more
convenient, which is one reason developers enjoy using Python to build ML apps.
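
One practical side of this portability is that the standard library hides most operating system differences. The following sketch is illustrative only; the directory and file names are hypothetical.

```python
import platform
from pathlib import Path

# Name of the operating system the script is running on,
# for example "Linux", "Windows", or "Darwin" (macOS).
system = platform.system()

# pathlib joins path segments with the separator of the current platform,
# so the same line works unchanged on every operating system.
model_path = Path("models") / "classifier" / "weights.bin"

print(f"running on {system}")
print(f"model file: {model_path}")
print(f"path parts: {model_path.parts}")
```

The script itself never mentions a path separator or an operating system name, so it does not need a separate version per platform.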

4. Great community

Just as JavaScript has its community of enthusiasts, so does Python, and it is huge. You can find
help there with almost any development question, and when you ask something you can usually
expect support and answers.
