Bda Ap
FOR
ACADEMIC YEAR
2023-24
1 Preamble / Introduction
2 Prerequisites
3 Objectives and Outcomes
4 Syllabus
   1. JNTU/R20-CMREC
   2. GATE
5 List of Expert Details (Local/National/International, with contact details, profile link, blogs and their research contribution towards the subject)
6 Journals with min 5 ref paper
7 Subject - Lesson plan
8 Suggested Books (prescribed and references)
9 Websites for self-learning resources like www.geeksforgeeks.org, www.schools.com, Coursera, edX, Udemy, Khan Academy, NPTEL etc., along with registration procedures
10 Question Banks
   1. JNTU model papers
   2. GATE
11 Two case study presentations with Project/Product/Model/Prototypes/Industrial applications
12 Assignment Questions / Innovative Assignment sets
13 List of topics for student seminars with guidelines
14 STEP/Course material in softcopy
15 Expert Lectures with topics & schedules (if any)
1. COURSE INTRODUCTION: Big Data Analytics
Big Data is a collection of data that is huge in volume and still growing over time.
It is data of such size and complexity that traditional data management tools cannot store or process it efficiently.
Big data analytics is the often complex process of examining big data to uncover information that supports
business decisions.
2. PREREQUISITES:
1. SQL
2. Any one programming language
Analyze Big Data frameworks like Hadoop and NoSQL to efficiently store and
Design algorithms to solve data-intensive problems using the MapReduce paradigm
Design and implement Big Data analytics using Pig and Spark to solve data
L T P C: 3 0 0 3
Unit – I
Data Management (NOS 2101): Design Data Architecture and manage the data for analysis,
understand various sources of data like sensors/signals/GPS etc. Data Management,
Data Quality (noise, outliers, missing values, duplicate data) and Data Preprocessing. Export
all the data onto the Cloud, e.g. AWS/Rackspace etc.
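The data quality issues listed in Unit I (missing values, duplicate data) can be illustrated with a minimal sketch. It uses only the Python standard library; the sensor readings, the field layout and the choice of median imputation are assumptions made for illustration and are not prescribed by the syllabus.

```python
from statistics import median

# Hypothetical sensor readings: (sensor_id, timestamp, temperature).
# None marks a missing value; one row is an exact duplicate.
readings = [
    ("s1", 1, 21.5),
    ("s1", 2, None),
    ("s1", 2, None),
    ("s1", 3, 22.1),
    ("s1", 4, 21.9),
]

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

def fill_missing(rows):
    """Replace missing temperatures with the median of the known values."""
    known = [value for _, _, value in rows if value is not None]
    fallback = median(known)
    return [(sid, ts, fallback if value is None else value)
            for sid, ts, value in rows]

cleaned = fill_missing(drop_duplicates(readings))
for row in cleaned:
    print(row)
```

Median imputation is only one possible choice; depending on the data, dropping incomplete rows or interpolating between neighbouring readings may be more appropriate.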
Unit – II
Big Data Tools (NOS 2101): Introduction to Big Data tools like Hadoop, Spark,
Impala etc., Data ETL process, identify gaps in the data and follow up for decision
making.
Provide Data/Information in Standard Formats (NOS 9004): Introduction,
Knowledge Management, Standardized Reporting & Compliances, Decision Models,
course conclusion. Assessment.
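The ETL (extract, transform, load) process and the idea of identifying gaps in the data can be sketched in a few lines. This is a toy illustration using the Python standard library (csv, io, sqlite3), not Hadoop, Spark or Impala; the CSV content, the table name weather and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the Bhopal row has a gap (no temperature).
RAW_CSV = """city,temp_c
Hyderabad,31.5
Bhopal,
Chennai,33.0
"""

def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert types and record rows with gaps for follow-up."""
    clean, gaps = [], []
    for row in rows:
        if row["temp_c"] == "":
            gaps.append(row["city"])
        else:
            clean.append((row["city"], float(row["temp_c"])))
    return clean, gaps

def load(rows):
    """Load: write the cleaned rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE weather (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO weather VALUES (?, ?)", rows)
    conn.commit()
    return conn

clean_rows, gaps = transform(extract(RAW_CSV))
conn = load(clean_rows)
row_count = conn.execute("SELECT COUNT(*) FROM weather").fetchone()[0]
print("loaded rows:", row_count)
print("cities needing follow-up:", gaps)
```

Keeping the gap list separate from the loaded rows makes the follow-up step explicit instead of silently discarding incomplete records.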
Unit – III
Big Data Analytics: Run descriptive statistics to understand the nature of the available data,
collate all the data sources to meet the business requirement, run descriptive statistics for all
the variables and observe the data ranges, outlier detection and elimination.
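Descriptive statistics and outlier elimination can be sketched with the standard library's statistics module. The sample values are made up, and the 1.5 x IQR rule used here is one common convention for flagging outliers, assumed for illustration rather than taken from the syllabus.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical measurements for one variable; 95 is an obvious outlier.
values = [12, 14, 15, 15, 16, 17, 18, 95]

# Descriptive statistics: observe the centre and range of the data.
summary = {
    "min": min(values),
    "max": max(values),
    "mean": mean(values),
    "median": median(values),
    "stdev": stdev(values),
}

# Outlier detection with the interquartile range (IQR) rule.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [v for v in values if v < low or v > high]
kept = [v for v in values if low <= v <= high]

print(summary)
print("outliers:", outliers)
print("kept:", kept)
```

Comparing the mean with the median before and after elimination shows how strongly a single extreme value can distort the mean.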
Unit – IV
Unit – V
(NOS 9004) Data Visualization (NOS 2101): Prepare the data for visualization,
use tools like Tableau, QlikView and D3, draw insights out of the visualization tool. Product
Implementation
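Preparing data for visualization usually means aggregating raw records into the labels and values a chart expects. The sketch below does this in plain Python; the sales records and region names are hypothetical, and the resulting lists could be handed to any charting tool.

```python
from collections import defaultdict

# Hypothetical raw records: (region, sale amount).
sales = [
    ("North", 120.0),
    ("South", 80.0),
    ("North", 60.0),
    ("East", 95.0),
    ("South", 40.0),
]

def totals_by_category(rows):
    """Sum amounts per category and sort from largest to smallest."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

chart_data = totals_by_category(sales)
labels = [category for category, _ in chart_data]
values = [total for _, total in chart_data]

print(labels)
print(values)
```

Sorting before plotting makes the largest categories easy to spot in a bar chart.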
TEXT BOOK
1. Student’s Handbook for Associate Analytics.
REFERENCE BOOKS
1. Introduction to Data Mining, Tan, Steinbach and Kumar, Addison Wesley, 2006.
2. Data Mining Analysis and Concepts, M. Zaki and W. Meira (the authors have
kindly made an online version available):
http://www.dataminingbook.info/uploads/book.pdf
5. EXPERT DETAILS:
The Expert Details which have been mentioned below are only a few of the
eminent ones known Internationally, Nationally and Locally. There are a few
others known as well.
International :
1.
2.
National:
1. Prof. D.Lakshmi,
Professor,
Dept. of CSE
VIT Bhopal - INDIA
2. Ms.Palak Gupta
Assistant Professor
Regional:
6. JOURNALS:
International:
1. International Journal of Data Science and Analytics, volumes and issues (springer.com)
2. International Journal of Big Data and Analytics in Healthcare (IJBDAH), ISSN 2379-738X, 2379-7371
4. International Journal of Data Science and Analytics: impact factor, indexing, acceptance rate
National:
1. Journal of Big Data, call for papers: Big Data in Human Behaviour Research: a Contextual Turn? (springeropen.com)
2. Frontiers, Segmentation of Multi-Regional Skeletal Muscle in Abdominal CT Image for Cirrhotic Sarcopenia Diagnosis (frontiersin.org)
3. JBD, A Survey of Machine Learning for Big Data Processing (techscience.com)
4. Big Data and Information Analytics (aimspress.com)
5. International Journal of Big Data Management (IJBDM), Inderscience Publishers
7. SUBJECT (LESSON) PLAN:
S.NO | Topic (Syllabus) / Sub-Topic | No. of Lectures Required | Suggested Books | Method of Teaching
UNIT – I
1 | Data Management (NOS 2101): Design Data Architecture and manage the data for analysis | L1 | T1, R2 | M4
7 | Introduction, Workplace Safety | L7 | T1, R3 | M1
UNIT – II
UNIT – IV
33 | Seminar | L33 | -- | M5
34 | Test | L34 | -- | M5
UNIT – V
43 | Revision-3 | L43 | T1 | M5
METHODS OF TEACHING:
TEXT BOOKS:
R1. Introduction to Data Mining, Tan, Steinbach and Kumar, Addison Wesley, 2006.
R2. Data Mining Analysis and Concepts, M. Zaki and W. Meira (the authors have kindly made
an online version available):
http://www.dataminingbook.info/uploads/book.pdf
9. Websites
Do not confine yourself to the list of websites mentioned here alone. Be cognizant and keep yourself abreast of
the others too.
The given list is not exhaustive.
MOOC - NPTEL
https://nptel.ac.in/courses/
GeeksforGeeks
https://www.geeksforgeeks.org/
Guru99
https://www.guru99.com/
https://www.bernardmarr.com/img/bigdata-case-studybook_final.pdf
https://techvidvan.com/tutorials/top-10-big-data-case-studies/
https://nap.nationalacademies.org/read/23654/chapter/5#37
3. GE
General Electric, a literal powerhouse of a corporation involved in virtually every area of industry,
has been laying the foundations of what it grandly calls the Industrial Internet for some time now.
But what exactly is it? Here’s a basic overview of the ideas which they are hoping will transform industry, and
how it’s all built around big data.
If you’ve heard about the Internet of Things, which I’ve written about previously, a simple way to think of the
industrial internet is as a subset of that, which includes all the data-gathering, communicating and analysis
done in industry.
In essence, the idea is that all the separate machines and tools which make an industry possible will be
“smart”: connected, data-enabled and constantly reporting their status to each other in ways as creative as
their engineers and data scientists can devise. This will increase efficiency by allowing every aspect of an
industrial operation to be monitored and tweaked for optimal performance, and reduce downtime, since
machinery will break down less often if we know exactly the best time to replace a worn part.
Data is behind this transformation, specifically the new tools that technology is giving us to record and
analyse every aspect of a machine’s operation. And GE is certainly not data poor: according to Wikipedia, its
2005 tax return extended across 24,000 pages when printed out. Pioneering is also deeply ingrained in its
corporate culture: it was established by Thomas Edison and was the first private company in the
world to own its own computer system, in the 1960s. So of all the industrial giants of the pre-online world, it
isn’t surprising that they are blazing a trail into the brave new world of big data.
GE generates power at its plants which is used to drive the manufacturing that goes on in its factories, and its
financial divisions enable the multi-million transactions involved when they are bought and sold. With fingers
in this many pies, it’s clearly in the position to generate, analyse and act on a great deal of data.
Sensors embedded in their power turbines, jet engines and hospital scanners will collect the data; it’s
estimated that one typical gas turbine will generate 500 GB of data every day. And if that data can be used to
improve efficiency by just 1% across five of the key sectors that they sell to, those sectors stand to make
combined savings of $300 billion. With those kinds of savings within sight, it isn’t surprising that GE
is investing heavily. In 2012 they announced $1 billion was being invested over four
years in their state-of-the-art analytics centre in San Ramon, California, in order to attract pioneering data
talent to lay the software foundations of the Industrial Internet.
In aviation, they are aiming to improve fuel economy, lower maintenance costs, reduce delays and
cancellations, and optimize flight scheduling, while also improving safety. Abu Dhabi-based Etihad Airways
was the first to deploy their Taleris Intelligent Operations technology, developed in partnership with
Accenture. Huge amounts of data are recorded from every aircraft and every aspect of ground operations,
reported in real time and targeted specifically at recovering from disruption and returning to regular
schedule.
And last year it launched its Hadoop-based database system to allow its industrial customers to
move their data to the cloud. It claims it has built the first infrastructure which is solid enough to meet the
demands of big industry, and which works with its GE Predictivity service to allow real-time automated analysis.
This means machines can order new parts for themselves and expensive downtime is minimized; GE estimates that
its contractors lose an average of $8 million per year due to unplanned downtime.
Green industries are benefitting too. Its 22,000 wind turbines across the globe are rigged with sensors which
stream constant data to the cloud, which operators can use to remotely fine-tune the pitch, speed, and direction
the blades are facing, to capture as much of the energy from the wind as possible. Each
turbine will speak to others around it, too, allowing automated responses such as adapting their behaviour to
mimic more efficient neighbours, and pooling of resources (e.g. wind speed monitors) if the device on one
turbine should fail.
Their data gathering extends into homes too: millions are fitted with their smart meters,
which record data on power consumption that is analysed together with weather and even social media
data to predict when power cuts or shortages will occur.
GE has come further and faster into the world of big
data than most of its old-school tech competitors. It’s clear they believe the financial incentive is there:
chairman and CEO Jeff Immelt estimates that they could add $10 trillion to $15 trillion to the world’s economy
over the next two decades. In industry, where everything including resources is finite, efficiency is of utmost
importance, and GE are demonstrating with the Industrial Internet that they believe big data is the key to
unlocking its potential.