
CS3352

FOUNDATIONS OF DATA SCIENCE


Unit 1: Uses of Data Science
Applications of Data Science
⚫ In the healthcare industry, physicians use Data Science to analyze data from
wearable trackers to ensure their patients’ well-being and make vital
decisions. Data Science also enables hospital managers to reduce waiting
time and enhance care.
⚫ Retailers use Data Science to enhance customer experience and retention.
⚫ Data Science is widely used in the banking and finance sectors for fraud
detection and personalized financial advice.
⚫ Transportation providers use Data Science to enhance the transportation
journeys of their customers. For instance, Transport for London maps
customer journeys, offers personalized transportation details, and manages
unexpected circumstances using statistical data.
⚫ Construction companies use Data Science for better decision making by
tracking activities, including average time for completing tasks,
materials-based expenses, and more.
⚫ Data Science enables tapping and analyzing the massive amounts of data from
manufacturing processes that have gone untapped so far.
⚫ With Data Science, one can analyze massive graphical data, temporal data, and
geospatial data to draw insights. It also helps in seismic interpretation and reservoir
characterization.
⚫ Data Science helps firms leverage social media content to obtain real-time
media content usage patterns. This enables the firms to create target
audience-specific content, measure content performance, and recommend
on-demand content.
⚫ Data Science helps study utility consumption in the energy and utility domain. This
study allows for better control of utility use and enhanced consumer feedback.
⚫ Financial institutions use data science to predict stock markets, determine the risk
of lending money, and learn how to attract new clients for their services.
⚫ Many governmental organizations not only rely on internal data scientists to
discover valuable information, but also share their data with the public.
⚫ Nongovernmental organizations (NGOs) are also no strangers to using data. They
use it to raise money and defend their causes. The World Wildlife Fund (WWF), for
instance, employs data scientists to increase the effectiveness of their fundraising
efforts.
⚫ Universities use data science not only in their research but also to enhance
the study experience of their students. The rise of massive open online courses
(MOOCs) produces a lot of data, which allows universities to study how this type
of learning can complement traditional classes.
Facets of Data – Way of Representation
⚫ Structured
⚫ Unstructured
⚫ Natural language
⚫ Machine-generated
⚫ Graph-based
⚫ Audio, video, and images
⚫ Streaming
Examples
⚫ Audio, image, and video: Gaana, YouTube, Instagram
⚫ Streaming data: live matches

Structured Data
⚫ Structured data is data that depends on a data model and resides in a fixed
field within a record. As such, it's often easy to store structured data in
tables within databases or Excel files.
⚫ SQL, or Structured Query Language, is the preferred way to manage and query
data that resides in databases.
Unstructured Data
⚫ Unstructured data is data that isn't easy to fit into a data model because
the content is context-specific or varying.
⚫ The files don't have specific columns or a fixed format to identify specific
things.
⚫ The thousands of different languages make this more difficult.
⚫ E.g. email, text files.
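The structured-data points above (fixed fields, tables, SQL) can be sketched in a few lines. This is only an illustrative example using Python's built-in sqlite3 module; the table name, column names, and rows are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database; the "patients" table and its rows are made up
# purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO patients (name, age) VALUES (?, ?)",
    [("Asha", 34), ("Ravi", 51), ("Meena", 47)],
)

# Every record has the same fixed fields, so a query can address
# them by name and filter or sort on them directly.
rows = conn.execute(
    "SELECT name, age FROM patients WHERE age > ? ORDER BY age", (40,)
).fetchall()
print(rows)
conn.close()
```

An email body or a free-text file has no such fixed fields, which is why the same kind of query cannot be written against unstructured data without first extracting structure from it.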
NLP
⚫ Natural language is a special type of unstructured data; it's challenging to
process because it requires knowledge of specific data science techniques and
linguistics.
⚫ The natural language processing community has had success in entity
recognition, topic recognition, summarization, text completion, and sentiment
analysis.
⚫ It's ambiguous by nature. The concept of meaning itself is questionable here.
Have two people listen to the same conversation. Will they get the same
meaning? The meaning of the same words can vary when coming from someone upset
or joyous.
Machine Generated
⚫ Machine-generated data is data automatically created by a computer.
⚫ The analysis of machine data relies on highly scalable tools, due to its high
volume and speed. Examples of machine data are web server logs, call detail
records, and network event logs.
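As a small illustration of machine-generated data, the sketch below parses a few web server log lines and counts the HTTP status codes. The log lines are invented, and the regular expression assumes the Common Log Format; real logs may use a different layout, and real volumes would call for the scalable tools mentioned above rather than an in-memory list.

```python
import re
from collections import Counter

# Hypothetical log lines, assumed to follow the Common Log Format.
LOG_LINES = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 153',
    '192.0.2.1 - - [10/Oct/2023:13:56:02 +0000] "POST /login HTTP/1.1" 200 512',
]

# Named groups turn each raw text line into a record with fixed fields.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Skip lines that do not match instead of failing on them.
records = [r for r in map(parse_line, LOG_LINES) if r is not None]
status_counts = Counter(r["status"] for r in records)
print(status_counts)
```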
Graph
⚫ Graph or network data is, in short, data that focuses on the relationship or
adjacency of objects.
⚫ The graph structures use nodes, edges, and properties to represent and store
graphical data.
⚫ Graph-based data is a natural way to represent social networks, and its
structure allows you to calculate specific metrics such as the influence of a
person and the shortest path between two people.
⚫ E.g. a graph that connects business colleagues via LinkedIn.
⚫ Graph databases are used to store graph-based data and are queried with
specialized query languages such as SPARQL.
Audio, Image, and Video
⚫ Tasks that are trivial for humans, such as recognizing objects in pictures,
turn out to be challenging for computers.
⚫ High-speed cameras at stadiums will capture ball and athlete movements to
calculate in real time, for example, the path taken by a defender relative to
two baselines (FIFA VAR).
⚫ E.g. autonomous cars.
Streaming Data
⚫ The data flows into the system when an event happens instead of being loaded
into a data store in a batch.
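The graph metrics mentioned above (shortest path between two people, influence of a person) can be sketched on a tiny social network. This is a minimal example under stated assumptions: the people and friendships are hypothetical, the graph is stored as a plain adjacency list rather than in a graph database, shortest paths are found with breadth-first search, and the number of direct connections is used as a very rough stand-in for influence.

```python
from collections import deque

# Hypothetical social network: each person (node) maps to the list of
# people they are directly connected to (edges).
FRIENDS = {
    "Anu": ["Bala", "Chitra"],
    "Bala": ["Anu", "Deepa"],
    "Chitra": ["Anu", "Deepa"],
    "Deepa": ["Bala", "Chitra", "Esha"],
    "Esha": ["Deepa"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each node was first reached
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbour in graph.get(person, []):
            if neighbour not in previous:
                previous[neighbour] = person
                if neighbour == goal:
                    # Walk back from the goal to the start.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None


path = shortest_path(FRIENDS, "Anu", "Esha")
# Number of direct connections per person, a crude proxy for influence.
degree = {person: len(links) for person, links in FRIENDS.items()}
most_connected = max(degree, key=degree.get)
print(path, most_connected)
```

Breadth-first search is enough here because every edge counts the same; weighted relationships would need a different algorithm such as Dijkstra's.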
Data Science Process
