CSE 511
Note: The outline below is subject to modifications and updates.
Learning Outcomes
• Differentiate among major data models such as relational, spatial, and NoSQL
• Perform queries (e.g., SQL) and analytics tasks in state-of-the-art database systems
• Apply leading-edge techniques to design/tune distributed and parallel database systems
• Utilize existing NoSQL database systems as appropriate for specified cases
• Perform database operations (e.g., selection, projection, join, and groupby) in state-of-the-art cluster computing systems such as Hadoop/Spark
• Perform scalable data processing operations (e.g., selection, projection, join, and groupby) in cloud computing environments, including Amazon AWS
Course Content
Instruction
• Video Lectures
• Other Videos
• Readings
• Interactive Learning Objects
• Live office hours
• Webinars
Assessments
• Practice activities and quizzes (auto-graded)
• Practice assignments (instructor- or peer-reviewed)
• Team and/or individual project(s) (instructor-graded)
• Final exam (graded)
Technology Requirements
Hardware
• Standard with major OS
Learning Objectives
1.1 Explain Data Models and Data Processing concepts
1.2 Utilize Relational Model and Relational Algebra
1.3 Utilize SQL query language
• Unit Introduction
• Module 1: Big Data and Data Processing
• Introduction to Data and Data Processing
• Database Management Systems
• Data Models
• Module 2: Basic Data Concepts
• Database Systems - What and Why?
• Database Management Systems
• Data Model
• Database Design: Entity Relationship Model to Relational Model
• Entity Relationship Model
• ER to Relational Model
• Assignment: Create a Movie Database
• Relational Model and Relational Algebra
• Relational Data Model
• Relational Algebra: Query Language
• Query Language: Union
• Query Language: Difference
• Query Language: Cartesian Product
• Query Language: Selection
• Query Language: Projection
• Query Language: Intersection
• Query Language: θ-Join
• SQL Query Language:
• Part 1: SQL Query Language
• Part 2: SQL Query Language
• Assignment: SQL Query for Movie Recommendation
Learning Objectives
2.1 Recognize major data storage layouts
2.2 Identify major indexing schemes in Database Systems
• Unit Introduction
• Module 1: Major Storage Layouts
• Introduction to Data Storage
• Alternative File Organizations
• Module 2: Major Indexing Schemes in Database Systems
• Hash-based Indexes
• Index Classification
Learning Objectives
3.1 Examine the ACID properties
3.2 Explain Transactions and Concurrency Control concepts
3.3 Describe how recovery from failures happens in database systems
• Unit Introduction
• Module 1: ACID Properties
• Principles of Transactions: ACID Properties
• Module 2: Concurrency Control Concepts
• Concurrency Control
• Module 3: Lock-based Concurrency Control and Recovery from Failures
• Lock-Based Concurrency Control
• Database Recovery
Learning Objectives
4.1 Describe data fragmentation and replication models
4.2 Describe the components of a distributed database
4.3 Apply skills learned to complete an assignment using data partitioning
• Unit Introduction
• Module 1: Distributed Databases: Why, What?
• Why Distribution?
• Module 2: Data Fragmentation and Replication Model
• Introduction to Fragmentation
• Introduction to Replication
• Assignment: Data Fragmentation
Learning Objectives
• Unit Introduction
• Module 1: NoSQL Database Systems
• Key-Value Stores
• Graph Databases
• Document Databases
• Module 2: Big Data Analytics Systems
• Intro to Map-Reduce / Spark
• Data Analytics in Map-Reduce / Spark
• Graph Processing Engines
• Module 3: Data Processing on Modern HW
Learning Objectives
7.1 Explain data processing in the cloud
7.2 Evaluate service models
7.3 Evaluate deployment models
• Unit Introduction
• Module 1: Introduction to Cloud Computing
• Introduction to Cloud Computing
• Module 2: Service Models
• Service Models
• Module 3: Deployment Models
• Deployment Models
Learning Objectives
8.1 Explain AWS
• Unit Introduction
• Module 1: Amazon Web Services
• Introduction to Amazon Web Services
• AWS Computing
• AWS Storage
• AWS Queueing Services
• Module 2: Build an Elastic Cloud Application
• AWS Interfaces
• Auto-Scaling
• Module 3: Build a MapReduce Cloud Application
• Scalable Data Processing
• AWS Security
As the prototype for a New American University, ASU pursues research that contributes to the
public good, and ASU assumes major responsibility for the economic, social and cultural vitality
of the communities that surround it. Recognizing the university’s groundbreaking initiatives,
partnerships, programs and research, U.S. News and World Report has named ASU the
most innovative university in all three years the category has existed.
The innovation ranking is due at least in part to a more than 80 percent improvement in ASU’s
graduation rate in the past 15 years, the fact that ASU is the fastest-growing research university
in the country and the emphasis on inclusion and student success that has led to more than 50
percent of the school’s in-state freshmen coming from minority backgrounds.
Mohamed Sarwat is an Assistant Professor of Computer Science and the director of the
Data Systems (DataSys) lab at Arizona State University (ASU). He is also an affiliate member
of the Center for Assured and Scalable Data Engineering (CASCADE). Before joining ASU,
Mohamed obtained his MSc and PhD degrees in computer science from the University of
Minnesota. His research interest lies in the broad area of data management systems.
Ming Zhao is an associate professor of the ASU School of Computing, Informatics, and
Decision Systems Engineering. Before joining ASU, he was an associate professor of the
School of Computing and Information Sciences (SCIS) at Florida International University.
He directs the Research Laboratory for Virtualized Infrastructure, Systems, and Applications
(VISA). His research interests are in distributed/cloud computing, big data, high-performance
computing, autonomic computing, virtualization, storage systems and operating systems.