BigData Lecture1 Overview

This document provides an overview of a Big Data course. The course will cover topics such as why Big Data is becoming important, the business drivers for Big Data, Big Data infrastructure, in-memory computation, streaming analysis, advanced analytics on Big Data, Big Data systems and technologies including HDFS, HIVE, HBase, Hadoop, Spark, and machine learning with Big Data. The course will also cover Big Data algorithms like matrix multiplication, PageRank, and streaming algorithms. It will focus on data engineering rather than analytics techniques. The instructor provides tips for the course and references related books and materials.

Uploaded by

Mubarak Begum
Copyright
© All Rights Reserved

Big Data Course Overview

Animesh Giri
Department of Computer Science and Engineering
[email protected]
Overview of the course

• Big Data – why is it becoming important?

• What is data? And what is Big Data?

• The business drivers for Big Data

• Why is it different?
How is this course distinct from other related courses?
The focus is not on analysis

[Figure: Data Set, Model, Obtain answer to query (How many liked? How many didn't like?)]
What is this skill set generally called?
Typical Data Flow Cycle

Data Engineering
Overview of the course

• Big Data Introduction

• Big Data Infrastructure

• In Memory Computation

• Streaming Analysis

• Advanced Analytics on Big Data


Overview of the course

• Big Data Systems and Technologies


• Technologies: usage and design
• Storing data: HDFS
• Extracting information: Hive, HBase
• Computing with Big Data: Hadoop, Spark, Spark Streaming
• Hadoop ecosystem (source: https://www.edureka.co/blog/hadoop-ecosystem)
• Machine learning with Big Data: MLlib
• Computation models
• Batch and interactive processing
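
As a rough illustration of the batch computation model behind Hadoop-style processing, the sketch below counts words in three phases (map, shuffle, reduce) using plain Python on a single machine. It does not use Hadoop or Spark; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(word_count(["big data", "Big Data systems"]))
```

In a real cluster the map and reduce phases would run in parallel on different nodes and the shuffle would move data over the network; here everything runs in one process to keep the idea visible.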
Overview of the course

• Big Data Algorithms


• Matrix Multiplication

• PageRank (source: https://en.wikipedia.org/wiki/PageRank)

• Streaming algorithms

• ML algorithms
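
To give a flavour of one algorithm from this list, here is a minimal single-machine sketch of PageRank by power iteration. It assumes a damping factor of 0.85 and a fixed number of iterations, and it spreads the rank of pages with no outgoing links evenly over all pages; these choices, and the function name `pagerank`, are assumptions for illustration rather than the formulation used in the course.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank equally among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, score in pagerank(graph).items():
        print(page, round(score, 4))
```

On large graphs the same update is usually expressed as a repeated matrix-vector multiplication distributed across machines, which is why it sits next to matrix multiplication in the list above.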
What this course is not about

• Analytics techniques: these are the focus of a separate course
• This course is meant to complement an analytics course
Course tips

• Please come to class prepared; read the previous lecture notes

• Programming assignments are intensive

• Start early; do not underestimate the complexity
• Installing the software is itself a learning experience

• Plagiarism policy
• Copying is not permitted; both donor and recipient lose full marks for the assignment

Books and references

“Big Data Analytics”, Raj Kamal, Preeti Saxena, 1st Edition, McGraw Hill Education, 2019

“Big Data Simplified”, Sourabh Mukherjee, Amit Kumar Das, Sayan Goswami, 1st Edition, Pearson, 2019

“Mining of Massive Datasets”, Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman, Cambridge University Press, 2014 (for the algorithms)
THANK YOU

Animesh Giri
Department of Computer Science and Engineering
[email protected]
+91 80 6618 6603
