Unit 1: Data Analytics
1. Security
2. Digital Advertising
3. E-Commerce
4. Publishing
5. Massively Multiplayer Online Games
6. Backend Services and Messaging
7. Project Management & Collaboration
8. Real-time Monitoring Services
9. Live Charting and Graphing
10. Group and Private Chat
PO8 : Ethics
PO10 : Communication
CO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 2 2 2 3 3 - - - - - - -
CO2 3 2 3 2 3 - - - - - - -
CO3 3 2 3 2 3 - - - - - - -
CO4 3 2 3 2 3 - - - - - - -
CO5 3 2 3 3 3 - - - - - - -
Program Specific Outcomes (PSOs)
S. No. | PSO Description
CO-PSOs Mapping
CO  PSO1 PSO2 PSO3 PSO4
CO1  3    -    -    -
CO2  3    2    -    -
CO3  3    3    -    -
CO4  3    3    -    -
CO5  3    3    -    -
Program Educational Objectives (PEOs)
PEO | Description
Data analytics (DA) is the process of examining data sets in order to find trends and draw
conclusions about the information they contain. Increasingly, data analytics is done with
the aid of specialized systems and software.
What is Data Science?
Data Science is a combination of multiple disciplines that uses statistics, data analysis, and machine learning to analyze data and to extract knowledge and insights from it.
Data Science is used in many industries in the world today, e.g. banking, consultancy, healthcare, and manufacturing.
Big Data
Big data is a term related to extracting meaningful insights by analyzing the huge amounts of complex, variously formatted data generated at high speed, which cannot be handled or processed by traditional systems.
Data that is very large in size is called big data. Normally we work on data of size MB (Word documents, Excel sheets) or at most GB (movies, code), but data at the petabyte scale (10^15 bytes) is called big data. It is estimated that almost 90% of today's data has been generated in the past three years.
Sources of Big Data
• Social networking sites: Facebook, Google and LinkedIn all generate huge amounts of data on a day-to-day basis, as they have billions of users worldwide.
• E-commerce sites: Sites like Amazon, Flipkart and Alibaba generate huge volumes of logs from which users' buying trends can be traced.
• Weather stations: Weather stations and satellites produce very large amounts of data, which are stored and processed to forecast the weather.
• Telecom companies: Telecom giants like Airtel and Vodafone study user trends and publish their plans accordingly; for this they store the data of millions of users.
• Share market: Stock exchanges across the world generate huge amounts of data through their daily transactions.
5 Vs of Big Data:
The 5 V's of big data -- velocity, volume, value, variety and veracity -- are the five main and innate characteristics of big data. Knowing the 5 V's lets data scientists derive more value from their data while also allowing their organizations to become more customer-centric.
Earlier this century, big data was talked about in terms of the three V's -- volume, velocity and variety. Over time, two more V's -- value and veracity -- were added to help data scientists more effectively articulate and communicate the important characteristics of big data. In some cases, a sixth V -- variability -- is also included.
Types of Digital Data
Data classification is the process of organizing data into relevant categories so that it can be used and applied more efficiently. Classification makes it easy for users to retrieve data. It is important for data security and compliance, for meeting different business or personal objectives, and because data must be easily retrievable within a specific period of time.
We can further divide Data into three categories:
1. Structured Data
2. Unstructured Data
3. Semi Structured Data
Structured Data:
Structured data is created using a fixed schema and is maintained in a tabular format. The elements in structured data are addressable, which makes it effective to analyze. It covers all data that can be stored in a SQL database in tabular form. Because the schema is fixed, this is the simplest kind of data to manage and process.
Examples:
Relational data, geo-location records, credit card numbers, addresses, etc.
Consider relational data as an example: suppose you have to maintain a record of students for a university, including each student's name, ID, address and email. To store the student records, you would use a relational schema and table, as sketched below.
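A minimal sketch of such a schema using Python's built-in sqlite3 module; the table and column names (students, student_id, name, address, email) and the sample rows are illustrative assumptions, not taken from the original slide.

```python
import sqlite3

# In-memory database for illustration; a real system would use a file or a server-based RDBMS.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Fixed schema: every row has the same, predefined columns (structured data).
cur.execute("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        address    TEXT,
        email      TEXT
    )
""")

# Insert a couple of illustrative records.
cur.executemany(
    "INSERT INTO students (student_id, name, address, email) VALUES (?, ?, ?, ?)",
    [
        (1, "Asha Verma", "Delhi", "asha@example.edu"),
        (2, "Rahul Singh", "Mumbai", "rahul@example.edu"),
    ],
)
conn.commit()

# Because the schema is fixed, the data is directly queryable with SQL.
for row in cur.execute("SELECT student_id, name, email FROM students"):
    print(row)
```

Because the schema is declared up front, every record can be validated, indexed and queried in the same way, which is exactly what makes structured data easy to analyze.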
Unstructured Data:
Unstructured data is data that does not follow a pre-defined schema or any organized format. This kind of data does not fit well into a relational database, because a relational database expects data in a pre-defined, organized form. Unstructured data is nevertheless very important in the big data domain, and there are many platforms for managing and storing it, such as NoSQL databases.
Examples –
Word, PDF, text, media logs, etc.
Semi-Structured Data:
Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing it can be stored in a relational database, although this is hard for certain kinds of semi-structured data; semi-structured formats mainly exist to ease storage and exchange.
Example –
XML data
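As a small illustration, the snippet below parses a tiny XML fragment with Python's standard xml.etree.ElementTree module; the element and attribute names (students, student, id) and the sample values are made-up assumptions.

```python
import xml.etree.ElementTree as ET

# Semi-structured: the data carries tags that describe its own organization,
# but no fixed relational schema is enforced up front.
xml_data = """
<students>
    <student id="1">
        <name>Asha Verma</name>
        <email>asha@example.edu</email>
    </student>
    <student id="2">
        <name>Rahul Singh</name>
    </student>
</students>
"""

root = ET.fromstring(xml_data)
for student in root.findall("student"):
    # Fields may be present or absent per record, unlike a rigid table row.
    name = student.findtext("name")
    email = student.findtext("email", default="(no email)")
    print(student.get("id"), name, email)
```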
Evolution of Data Science
The field of data science has transformed greatly in recent times. Initially, it relied mostly on statistical analysis and data mining to draw conclusions from structured data. However, as big data technologies have expanded and increased the amount of unstructured information from sources such as social networks and sensors, data science, which deals with these massive data sets, has become even more in demand.
Modern technologies in machine learning and artificial intelligence have empowered data scientists. They can now readily develop complex models for tasks such as predictive analytics, natural language processing (NLP) and computer vision. This has created data science roles across the healthcare, finance, retail and manufacturing industries, where it is used for personalized medicine, fraud detection, product recommendations and predicting when machines need maintenance.
The evolution of Data Science over the years has taken form in many phases. It all
started with statistics: simple statistical models have been used to collect, analyze and manage data since the early 1800s. These principles underwent various refinements over time until the rise of the digital age. Once computers were introduced as
mainstream public devices, there was a shift in the industry to the Digital Age. A flood
of data and digital information was created. This resulted in the statistical practices and
models getting computerized giving rise to digital analytics. Then came the rise of the
internet that exponentially grew the data available giving rise to what we know as Big
Data. This explosion of information available to the masses gave rise to the need for
expertise to process, manage, analyze and visualize this data for the purpose of decision
making through the use of various models. This gave birth to the term Data Science.
Datafication
• Social platforms such as Facebook and Instagram collect and monitor data about our friendships in order to market products and services to us and to provide surveillance services to agencies, which in turn changes our behavior; the promotions we see daily on social media are also the result of this monitored data. In this model, data is used to redefine how content is created, with datafication informing the content itself rather than only the recommendation systems.
• However, there are other industries where datafication process is actively used:
• Insurance: Data used to update risk profile development and business models.
• Banking: Data used to establish trustworthiness and likelihood of a person paying back a
loan.
• Human resources: Data used to identify, for example, employees' risk-taking profiles.
• Hiring and recruitment: Data used to replace personality tests.
• Social science research: Datafication replaces sampling techniques and restructures the
manner in which social science research is performed.
Skill Set Needed
Data Science Life Cycle
Phase 1—Discovery: Before you begin the project, it is important to understand the various
specifications, requirements, priorities and required budget. You must possess the ability to ask the right
questions. Here, you assess if you have the required resources present in terms of people, technology,
time and data to support the project. In this phase, you also need to frame the business problem and
formulate initial hypotheses (IH) to test.
Phase 2—Data preparation: In this phase, you require an analytical sandbox in which you can perform analytics for the entire duration of the project. You need to explore, preprocess and condition data prior to modeling. Further, you will perform ETLT (extract, transform, load, transform) to get data into the sandbox.
You can use R (or Python) for data cleaning, transformation and visualization. This will help you spot outliers and establish relationships between variables. Once you have cleaned and prepared the data, it's time to do exploratory analytics on it; a small sketch of this step follows.
Phase 3—Model planning: Here, you will determine the methods and techniques to draw the
relationships between variables. These relationships will set the base for the algorithms which you will
implement in the next phase. You will apply Exploratory Data Analytics (EDA) using various statistical
formulas and visualization tools.
Phase 4—Model building: In this phase, you will develop datasets for training and testing purposes. You will consider whether your existing tools will suffice for running the models or whether you need a more robust environment (such as fast, parallel processing). You will analyze various learning techniques like classification, association and clustering to build the model; a small sketch follows.
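A minimal model-building sketch in Python with scikit-learn, assuming a generic classification task; the synthetic dataset and the choice of a decision tree are illustrative, not prescribed by the life cycle itself.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic dataset standing in for the data prepared in Phase 2.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# Develop separate datasets for training and testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Build a classification model on the training set.
model = DecisionTreeClassifier(max_depth=4, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out test set before moving to Phase 5.
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```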
Phase 5—Operationalize: In this phase, you deliver final reports, briefings, code and technical documents. In addition, sometimes a pilot project is implemented in a real-time production environment. This provides a clear picture of the performance and other related constraints on a small scale before full deployment.
Phase 6—Communicate results: Now it is important to evaluate whether you have achieved the goal you planned in the first phase. So, in this last phase, you identify all the key findings, communicate them to the stakeholders, and determine whether the results of the project are a success or a failure based on the criteria developed in Phase 1.
Data Science Tools and Technologies
The following data science technology stack is in demand and can prove necessary for a successful career in data science and technology.
1. Amazon Web Services (AWS)
Amazon Web Services (AWS) is a cloud provider. It is a cloud-based service that allows users to access virtual servers, offered through Amazon Elastic Compute Cloud (EC2) instances. Instances can run frameworks such as Apache Spark on Amazon Linux and provide access to other services that may be used for data processing.
2. Amazon Machine Learning (AML)
Amazon Machine Learning is a specialized ML service inside AWS that can be used to develop ML models with
predictive capabilities.
3. Text Mining
Nearly 80% of all data in the world is unstructured, making text mining a crucial analysis and processing method. It is the practice of extracting useful information and finding patterns in large amounts of textual material by uncovering previously unrecognized relationships and trends (a small sketch appears after the list of tools below).
4. Scikit-learn: Scikit-learn is the go-to Python library for machine learning. It provides a consistent interface to common algorithms, including regression, classification, clustering and dimensionality reduction. It is optimized for performance and widely used by data scientists.
5. Seaborn: Seaborn is a powerful data visualization library built on top of Matplotlib. It comes with a range of well-designed default themes and is particularly useful when working with pandas DataFrames. With Seaborn, you can create clear and expressive visualizations quickly and easily.
6. Jupyter Notebooks: Jupyter Notebook is a popular open-source web application that allows data scientists to create shareable documents combining live code, visualizations, equations and text explanations. It is great for exploratory analysis, collaboration and reporting.
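A tiny text-mining sketch in Python using only the standard library: it tokenizes a few made-up documents and counts term frequencies, one of the simplest ways to surface patterns in unstructured text (the documents and stop-word list are illustrative assumptions).

```python
import re
from collections import Counter

# Illustrative unstructured documents.
docs = [
    "Big data analytics helps businesses make better decisions.",
    "Text mining extracts useful information from unstructured text.",
    "Data science combines statistics, analytics, and machine learning.",
]

# A tiny, made-up stop-word list; real pipelines use much larger ones.
stop_words = {"and", "from", "the", "make", "helps"}

# Tokenize, lowercase, and drop stop words.
tokens = []
for doc in docs:
    tokens += [w for w in re.findall(r"[a-z]+", doc.lower()) if w not in stop_words]

# Term frequencies reveal recurring themes across the documents.
for word, count in Counter(tokens).most_common(5):
    print(word, count)
```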
Types of Data Analysis
1. Descriptive Analysis
Goal — Describe or summarize a set of data
Description:
• The very first analysis performed on a dataset
• Generates simple summaries about samples and measurements
• Uses common descriptive statistics (measures of central tendency, variability, frequency, position, etc.)
Example:
Take the COVID-19 statistics page on Google: the line graph is a pure summary of the cases/deaths, a presentation and description of the infected population of a particular country.
Summary:
Descriptive analysis is the first step in analysis, where you summarize and describe the data you have using descriptive statistics; its result is a simple presentation of your data (see the sketch below).
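A minimal descriptive-analysis sketch in Python with pandas, summarizing a made-up series of daily case counts (the numbers are illustrative, not real COVID-19 data).

```python
import pandas as pd

# Illustrative daily case counts for one week (made-up numbers).
cases = pd.Series([120, 135, 128, 150, 160, 155, 149], name="daily_cases")

# Descriptive statistics: central tendency, variability, position.
print(cases.describe())          # count, mean, std, min, quartiles, max
print("Median:", cases.median())
print("Total for the week:", cases.sum())
```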
2. Exploratory Analysis
Goal — Examine or explore the data and find relationships between variables that were previously unknown
Description:
• EDA helps you discover relationships between measures in your data; these relationships are not evidence of causation, as the phrase "correlation doesn't imply causation" reminds us
• Useful for discovering new connections, forming hypotheses, and driving design planning and data collection
Example:
Climate change is an increasingly important topic, as the global temperature has gradually risen over the years. One example of EDA on climate change is to take the rise in temperature over the years, say 1950 to 2020, together with measures of human activity and industrialization, and form relationships from the data; for example, the increasing number of factories, cars on the road and airplane flights correlates with the temperature increase.
Summary:
EDA explores data to find relationships between measures and tell us that they exist, without establishing the cause. These relationships can be used to formulate hypotheses (a small sketch follows).
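A minimal EDA sketch in Python with pandas, loosely following the climate example above; all numbers are made up for illustration.

```python
import pandas as pd

# Made-up yearly figures, loosely following the climate example above.
df = pd.DataFrame({
    "year":            [1950, 1970, 1990, 2010, 2020],
    "temp_anomaly_c":  [-0.02, 0.03, 0.45, 0.72, 1.02],
    "factories_k":     [110, 180, 310, 520, 640],
    "flights_million": [0.3, 0.5, 1.2, 2.9, 4.0],
})

# Look for previously unknown relationships between measures.
print(df.corr().round(2))
# Strong positive correlations suggest hypotheses to test further,
# but correlation alone does not establish causation.
```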
3. Inferential Analysis
Goal — Use a small sample of data to infer things about a larger population
The goal of statistical modeling itself is to use a small amount of information to extrapolate and generalize to a larger group.
Description:
• Uses the sample to estimate a value for the population and gives a measure of uncertainty (e.g. a standard error) for the estimate
• The accuracy of the inference depends heavily on the sampling scheme: if the sample isn't representative of the population, the generalization will be inaccurate (see the Central Limit Theorem)
Example:
The idea of inferring about the population at large from a smaller sample is quite intuitive; many statistics you see in the media and on the internet are inferential, i.e. predictions about an event based on a small sample. For example, consider a psychology study on the benefits of sleep involving 500 people: at follow-up, the participants who slept 7–9 hours reported better overall attention and well-being, while those with less or more sleep suffered reduced attention and energy. This report from 500 people is just a tiny portion of the roughly 7 billion people in the world, and is thus an inference about the larger population.
Summary:
Inferential analysis extrapolates and generalizes information about a larger group from a smaller sample to generate analyses and predictions (a small sketch follows).
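A minimal inferential sketch in Python: estimating a population mean from a small sample and attaching an approximate 95% confidence interval. The sleep-hours sample is made up, and the z-value of 1.96 is a simplification (a t-value would give a slightly wider interval for a sample this small).

```python
import math
import statistics

# Made-up sample of nightly sleep hours from 20 study participants.
sample = [7.1, 6.8, 7.5, 8.0, 6.5, 7.2, 7.9, 6.9, 7.4, 7.8,
          6.7, 7.3, 7.6, 8.1, 6.6, 7.0, 7.7, 7.2, 6.8, 7.5]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample standard deviation
se = sd / math.sqrt(n)               # standard error of the mean

# Approximate 95% confidence interval for the population mean.
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"Estimated population mean: {mean:.2f} h (95% CI {low:.2f}-{high:.2f} h)")
```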
4. Predictive Analysis
Goal — Use historical or current data to find patterns and make predictions about the future
Description:
• The accuracy of the predictions depends on the input variables
• Accuracy also depends on the type of model; a linear model might work well in some cases and poorly in others
• Using one variable to predict another does not denote a causal relationship
Example:
The 2020 US election was a popular topic, and many prediction models were built to predict the winning candidate. FiveThirtyEight produced a well-known 2016 election forecast and was back at it again in 2020. Prediction analysis for an election requires input variables such as historical polling data, trends and current polling data in order to get a good prediction. Something as large as an election wouldn't use just a linear model, but a complex model with careful tuning to best serve its purpose.
Summary:
Predictive analysis takes data from the past and present to make predictions about the future (a toy sketch follows).
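A toy predictive sketch in Python with NumPy: fitting a simple linear trend to made-up historical poll numbers and extrapolating one step ahead. This is not how FiveThirtyEight's models work; it only illustrates the idea of learning a pattern from past data to predict the future.

```python
import numpy as np

# Made-up historical poll shares (%) over several months.
months = np.array([1, 2, 3, 4, 5, 6])
support = np.array([44.0, 44.8, 45.5, 46.1, 46.4, 47.0])

# Fit a simple linear trend to the historical data.
slope, intercept = np.polyfit(months, support, deg=1)

# Predict next month's value from the fitted trend.
next_month = 7
prediction = slope * next_month + intercept
print(f"Predicted support in month {next_month}: {prediction:.1f}%")
```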
5. Causal Analysis
Goal — Look at the cause-and-effect relationships between variables, focusing on finding the cause of a correlation.
Description:
• To find the cause, you have to question whether the observed correlations driving your conclusion are valid; just looking at the surface of the data won't help you discover the hidden mechanisms underlying the correlations
• Applied in randomized studies focused on identifying causation
• Considered the gold standard in data analysis: in scientific studies the cause of a phenomenon is extracted and singled out, like separating the wheat from the chaff
Challenges:
Good data is hard to find and requires expensive research and studies. These studies are analyzed in aggregate (over multiple groups), and the observed relationships are just average effects (means) for the whole population, meaning the results might not apply to every individual.
Example: Say you want to test a new drug that improves human strength and focus. To do that, you perform randomized controlled trials of the drug: you compare the sample of candidates receiving the new drug against candidates receiving a mock control (placebo), run a few tests of strength and overall focus and attention, and observe how the drug affects the outcome.
Summary: Causal analysis is about finding the causal relationship between variables: change one variable and see what happens to another.
6. Mechanistic Analysis
Goal — Understand the exact changes in variables that lead to changes in other variables
Description:
Applied in the physical and engineering sciences, in situations that require high precision and leave little room for error (the only noise in the data is measurement error)
Designed to understand a biological or behavioral process, the pathophysiology of a disease, or the mechanism of action of an intervention (definition by the NIH)
Example:
Many graduate-level research projects and complex topics are suitable examples, but to put it simply: say an experiment is done to simulate safe and effective nuclear fusion to power the world. A mechanistic analysis of the study would entail a precise balance of controlled and manipulated variables, with highly accurate measurements of both the variables and the desired outcomes. It is this intricate and meticulous modus operandi (strategy) toward such big topics that allows for scientific breakthroughs and the advancement of society.
Summary:
Mechanistic analysis is in some ways a form of predictive analysis, but modified to tackle studies that require high precision and meticulous methodologies in the physical or engineering sciences.
Need for Data Science
• With the help of data science technology, we can convert the massive amount of raw and unstructured data into
meaningful insights.
• Data science technology is being adopted by various companies, whether big brands or startups. Google, Amazon, Netflix, etc., which handle huge amounts of data, use data science algorithms to improve the customer experience.
• Data science is helping to automate transportation, for example through self-driving cars, which are the future of transportation.
• Data science can help with different kinds of predictions, such as surveys, elections, flight ticket confirmation, etc.
Analysis Vs Analytics Vs Reporting
• Analytics and reporting can help a business improve operational efficiency and production in several ways. Analytics is the process of making decisions based on the data presented, while reporting is used to make complicated information easier to understand. Let's discuss analytics vs reporting.
• Analytics and reporting are often treated as the same thing. Although both take data as input and present it in charts, graphs or dashboards, they have several key differences.
Analytics is the technique of examining data and reports to obtain actionable insights that can be used to understand and improve business performance. With analytics, business users can gain insights from data, recognize trends and make better decisions.
On the one hand, analytics is about finding value or deriving new information from data to help you decide. This can be performed either manually or mechanically. Next-generation analytics uses new technologies like AI and machine learning to make predictions about the future based on past and present data.
The steps involved in data analytics are as follows:
• Developing a data hypothesis
• Data collection and transformation
• Creating analytical research models to analyze and provide insights
• Utilization of data visualization, trend analysis, deep dives and other tools
• Making decisions based on data and insights
On the other hand, reporting is the process of presenting data from numerous sources clearly and
simply. The procedure is always carefully set out to report correct data and avoid misunderstandings.
Today’s reporting applications offer cutting-edge dashboards with advanced data visualization features.
Companies produce a variety of reports, such as financial reports, accounting reports, operational reports, market studies and more. This makes it easier to quickly see how each function is performing.
Big Data Eco-System
A big data ecosystem refers to the infrastructure used to handle massive volumes of both structured and unstructured data whose size or type is beyond the ability of a traditional relational database. It is used to capture, manage and process data with low latency.
Nearly every successful business relies on quick, agile decisions to stay competitive. This is only achieved when the enterprise properly deals with its data storage, processing and visualization.
• The big data ecosystem comprises massive functional components with various enabling tools. Its capabilities are not only about computing and storing big data, but also include the advantages of a systematic platform and the potential of big data analytics. Hence, according to the solutions proposed in the reviewed literature and to big data capabilities, the maturity of big data ecosystem applications can be categorized into three stages:
•Stage 1: Proposing a big data framework and platform;
•Stage 2: Harvesting cloud computing capacity for big data computing and storage;
•Stage 3: Analysing big data with various algorithms for the applications (prediction, fault detection,
optimization etc.).
Big Data Importance and Applications
1. Data Management
2. Data Mining
3. Hadoop
4. In-Memory Analytics
5. Predictive Analytics
6. Text Mining
Why is big data analytics important?
1. Reduced cost
2. Quick decision making
3. New products and features
Applications of Big Data
Future of Data Science
Data science has evolved a long way from statistics: with simple statistical models, organizations have collected, managed and analyzed data since the 19th century. Later, once computers emerged, the digital era began generating massive amounts of data. The internet brought an explosion of data, and the need to manage big data has driven the growth of data science.
Data scientists' skills help organizations make informed business decisions through effective data management. Data science technologies power personalized healthcare systems, targeted advertising, risk and fraud detection, airline route management, financial applications and many other processes across industries.
The future of data science is uncertain; however, it will certainly bring further innovation to business processes as technology evolves. Let us look at the top 10 predictions for data science.
1. The tasks of Data Scientists hired to augment business processes could be automated soon.
2. Data Science will incorporate concepts from various fields like sociology and psychology– it will soon become
interdisciplinary.
3. Social Media and other online platforms will become the source for the collection of more data.
4. Data Science will help businesses predict the consumer behavior.
5. Data Science is moving into an era of becoming a team activity. The focus is shifting from merely creating a model to what you will use it for once you have built it.
6. Data Science will grow more conscious of the increased cybersecurity threats.
7. Data Scientists will face a growing Cloud Computing prevalence.
8. Coding and AI skills will become more essential, and data scientists need to be more business-minded.
9. Data scientists' jobs will become more operationalized, with advanced tools to capture their workflows and train enterprises on best practices.
10. Data Scientists will get the opportunity to initiate a “quantum leap”.
Data science is moving forward powerfully, with many emerging trends that help organizations thrive. However, these changes will lead organizations to look for candidates with advanced data science skills. To make the most of this demand and win opportunities, data science certifications can be a great pick: with a certification from an expert program provider, you can build all the skills necessary to contribute to this revolution in data science.
Applications of Data Science in Various Fields
Healthcare: Data science can identify and predict disease and personalize healthcare recommendations.
Transportation: Data science can optimize shipping routes in real-time.
Sports: Data science can accurately evaluate athletes’ performance.
Government: Data science can prevent tax evasion and predict incarceration rates.
E-commerce: Data science can automate digital ad placement.
Gaming: Data science can improve online gaming experiences.
Social media: Data science can create algorithms to pinpoint compatible partners.
Fintech: Data science can help create credit reports and financial profiles, run accelerated underwriting and create predictive models based on historical payroll data.
Use-Cases of Data Science
Netflix
• Netflix initially started as a DVD rental service in 1998. It mostly relied on third-party postal services to deliver its DVDs to users, which resulted in heavy losses that Netflix mitigated with the introduction of its online streaming service in 2007.
• In order to make this happen, Netflix invested in many algorithms to provide a flawless movie experience to its users. One such algorithm is the recommendation system that Netflix uses to provide suggestions to users.
• A recommendation system understands the needs of its users and suggests various cinematographic products accordingly.
• A recommendation system is a platform that provides its users with various content based on their preferences and likings. It takes information about the user as input.
• This information can be in the form of past usage of a product or the ratings given to the product. The system then processes this information to predict how much the user would rate or prefer the product. A recommendation system makes use of a variety of machine learning algorithms.
Another important role that a recommendation system plays today is to search for similarity between different products. In the case of Netflix, the recommendation system searches for movies that are similar to the ones you have watched or have liked previously.
• This is an important method for scenarios that involve cold start. In cold start, the company does not have much of the user data available to generate recommendations.
• Therefore, based on the movies that are watched, Netflix provides recommendations of the films that share a degree of similarity. There are two main types of Recommendation Systems –
• 1. Content-based recommendation systems
• In a content-based recommendation system, background knowledge of the products and customer information are taken into consideration. Based on the content that you have viewed on Netflix, it provides you with similar suggestions.
• For example, if you have watched a film in the sci-fi genre, the content-based recommendation system will suggest similar films of the same genre (a toy sketch follows).
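A toy content-based recommendation sketch in Python: each title is described by a set of genre tags, and unwatched titles are ranked by tag overlap with what the user has watched. The titles, tags and the Jaccard similarity measure are illustrative assumptions, not Netflix's actual catalogue or algorithm.

```python
# Toy catalogue: each (made-up) title is described by a set of genre tags.
catalog = {
    "Space Odyssey": {"sci-fi", "drama"},
    "Galaxy Raiders": {"sci-fi", "action"},
    "Laugh Riot": {"comedy"},
    "Robot Dawn": {"sci-fi", "thriller"},
}

def jaccard(a: set, b: set) -> float:
    """Similarity between two tag sets: intersection over union."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Build the user's content profile from the titles they have watched.
watched = {"Space Odyssey"}
profile = set().union(*(catalog[title] for title in watched))

# Score every unwatched title by its similarity to the profile.
scores = {
    title: jaccard(profile, tags)
    for title, tags in catalog.items()
    if title not in watched
}
for title, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{title}: {score:.2f}")
```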
Next among the data science use cases is Uber. Uber is a popular smartphone application that allows you to book a cab. Uber makes extensive use of big data: it has to maintain a large database of drivers, customers and several other records.
It is therefore rooted in big data and uses it to derive insights and provide the best services to its users. Uber shares the big data principle with crowdsourcing: registered drivers in an area can serve anyone who wants to go somewhere.
As mentioned above, Uber maintains a database of drivers. Therefore, whenever you hail a cab, Uber matches your profile with the most suitable driver. What differentiates Uber from other cab companies is that it charges you based on the time taken to cover the distance, not just the distance itself.
It calculates the time taken through various algorithms that also use data on traffic density and weather conditions.
Uber uses data science to calculate its surge pricing: when fewer drivers are available for more riders, the price of the ride goes up. This happens during a scarcity of drivers in a given area.
However, if the demand for Uber rides is low, Uber charges a lower rate. This dynamic pricing is rooted in big data and uses data science to calculate fares based on these parameters (a toy sketch follows).
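A toy dynamic-pricing sketch in Python illustrating the demand/supply idea described above. This is not Uber's actual algorithm; the base fare, the multiplier cap and the sample numbers are all made-up assumptions.

```python
def surge_multiplier(ride_requests: int, available_drivers: int,
                     cap: float = 3.0) -> float:
    """Toy surge model: the price rises with the demand/supply ratio, up to a cap."""
    if available_drivers == 0:
        return cap
    ratio = ride_requests / available_drivers
    return min(max(1.0, ratio), cap)

base_fare = 120.0  # illustrative base fare for a fixed-length trip
for requests, drivers in [(40, 50), (90, 30), (200, 20)]:
    fare = base_fare * surge_multiplier(requests, drivers)
    print(f"requests={requests}, drivers={drivers}, fare={fare:.2f}")
```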
• Facebook is a social media leader in the world today. With billions of users around the world, Facebook uses large-scale quantitative research through data science to gain insights into people's social interactions.
• Facebook has become a hub of innovation where it has been using advanced techniques in data science to study user behavior and gain insights to improve their product. Facebook makes use of advanced technology in data science called deep learning.
• Using deep learning, Facebook makes use of facial recognition and text analysis. In facial recognition, Facebook uses powerful neural networks to classify faces in the photographs. It uses its own text understanding engine called “DeepText” to understand user sentences.
• It also uses DeepText to understand people's interests and to align photographs with text.
• However, more than being a social media platform, Facebook is more of an advertisement corporation. It uses deep learning for targeted advertising. Using this, it decides what kind of advertisements the users should view.
• It uses the insights gained from the data to cluster users based on their preferences and provides them with the advertisements that appeal to them.
• Since its inception, Amazon has been working hard to make itself a customer-centric platform. Amazon heavily relies on predictive analytics to increase customer satisfaction. It does so through a personalized recommendation system.
• This recommendation system is a hybrid type that also involves collaborative filtering which is comprehensive in nature. Amazon analyzes the historical purchases of the user to recommend more products.
• This also comes through the suggestions that are drawn from other users who use similar products or provide similar ratings.
• Amazon has an anticipatory shipping model that uses big data for predicting the products that are most likely to be purchased by its users. It analyzes the pattern of your purchases and sends products to your nearest warehouse which you may utilize in the future.
• Amazon also optimizes the prices on its websites by keeping in mind various parameters like the user activity, order history, prices offered by the competitors, product availability, etc. Using this method, Amazon provides discounts on popular items and earns profits on less popular items.
• Another area that every e-commerce platform addresses is fraud detection. Amazon has its own novel methods and algorithms to detect fraudulent sellers and fraudulent purchases.
• Beyond its online platform, Amazon has been optimizing the packaging of products in its warehouses and increasing the efficiency of its packaging lines using data collected from workers.