Question Bank Big Data Analytics
Unit-I
1. Big data and its importance
2. Characteristics (the 5 Vs) of big data
3. Types of big data
4. Difference between traditional data and big data
5. Challenges of big data
6. Big data analytics and classification
7. Big data technologies
8. Big data applications (all the applications)
Unit-II
1. Hadoop architecture
2. Explain how data is analyzed with Hadoop using map and reduce.
3. Explain Hadoop Ecosystem.
4. Explain how Data is read using the Java FileSystem API.
5. Anatomy of file read.
6. Anatomy of file writes.
7. Difference between traditional databases and Hadoop
8. List the various Hadoop filesystems.
9. Assumptions and goals of Hadoop design.
10. Define failover and fencing in Hadoop.
11. Replication policy used in Hadoop design
12. What is safemode?
Unit-III
1. Why NoSQL is used.
2. Difference between SQL and NoSQL
3. Types of NoSQL databases
4. Database impedance mismatch
5. Polyglot persistence
6. What is auto-sharding?
7. Aggregate data model with example
8. What are the key-value and document data models?
9. Graph databases and relationships
10. What is a schemaless database?
11. Explain distribution models
12. Explain sharding
13. Explain partitioning and combining in MapReduce
Unit-IV
1. Explain the working of the MapReduce paradigm
2. MapReduce architecture
3. Anatomy of a YARN MapReduce job run
4. Anatomy of a classic MapReduce job run
5. Failures in classic/YARN MapReduce
6. Explain MapReduce input formats
7. Shuffle and sort in MapReduce
8. JobTracker, TaskTracker
9. NodeManager, ResourceManager
10. Application master failure in YARN MapReduce
11. Scheduling and types of schedulers
12. Speculative execution
13. Types of input formats for Hadoop
14. Types of output formats for Hadoop
15. Benefits of MapReduce
Unit-V
1. Explain Hive architecture
2. Hive metastore configuration.
3. Hive comparison with traditional databases.
4. Difference between SQL and HQL.
5. What are Hive partitions and buckets?
6. Sorting and aggregating in Hive.
7. Hive UDF and UDAF with examples.
8. Working (workflow) of Hive UDAF.
9. Pig vs. Hive, Pig vs. SQL.
10. Apache Pig data processing operators.
11. Apache Pig REGISTER and DEFINE statements
12. Pig Latin built-in functions.
13. Types of functions in Apache Pig.
14. Apache Pig UDFs
15. Grouping and joining, combining and splitting in Pig Latin.