Big Data Analytics: Snapshot of Class Lab and DataCamp Course
Assignment
Snapshot of Class Lab and DataCamp Course
A = sc.parallelize(range(3))
print(A)
A = sc.parallelize(range(4))
print(A)
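Printing an RDD only shows its description; the elements stay distributed until an action such as collect() brings them to the driver. Roughly what the print calls above are expected to show (assuming the usual PySpark shell where sc is already defined):
# print(A) -> an RDD description such as "PythonRDD[...] at RDD at PythonRDD.scala:..."
# A.collect() -> the actual elements, e.g. [0, 1, 2, 3] for range(4)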
Lines = sc.parallelize(['you are my sunshine','my only sunshine','you make me happy'])
print(Lines)
A = sc.parallelize(range(4))
print(A)
L = A.collect()
print(type(L))
print(L)
sc.parallelize(range(4)).collect()
sc.parallelize(range(4)).count()
A = sc.parallelize(range(4))
A.reduce(lambda x,y: x+y)
A = sc.parallelize(range(4)).map(lambda x: (x,x*x))
A.collect()
words = ['this','is','the','best','mac','ever']
wordRDD = sc.parallelize(words)
wordRDD.reduce(lambda w, v: w if len(w) < len(v) else v)
B = sc.parallelize([1,3,5,2])
B.reduce(lambda x,y: x-y)
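Note that reduce() assumes the supplied function is commutative and associative; subtraction is neither, so the value of the expression above depends on how the elements are partitioned. A minimal illustrative sketch (B1 and B2 are names introduced here, and a local SparkContext sc is assumed):
B1 = sc.parallelize([1, 3, 5, 2], 1)     # force a single partition
print(B1.reduce(lambda x, y: x - y))     # ((1-3)-5)-2 = -9
B2 = sc.parallelize([1, 3, 5, 2], 2)     # two partitions, e.g. [1,3] and [5,2]
print(B2.reduce(lambda x, y: x - y))     # partial results -2 and 3 are combined, typically giving -5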
A = sc.parallelize([(1,3),(4,100),(1,-5),(3,2)])
A.reduceByKey(lambda x,y: x*y).collect()
A = sc.parallelize([(1,3),(4,100),(1,-5),(3,2)])
A.countByKey()
A = sc.parallelize([(1,3),(4,100),(1,-5),(3,2)])
A.lookup(3)
A = sc.parallelize([(1,3),(4,100),(1,-5),(3,2)])
A.collectAsMap()
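Because collectAsMap() returns an ordinary Python dict, duplicate keys collapse: key 1 occurs twice in A, so only one of its two values survives. A quick check, assuming the same A as above (m is an illustrative name introduced here):
m = A.collectAsMap()
print(len(m))    # 3 entries rather than 4 pairs
print(m[1])      # either 3 or -5; which value wins should not be relied upon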
words = ['this','is','the','best','mac','ever']
wordRDD = sc.parallelize(words)
def LargerThan(x, y):
    # Return the longer of two words; break ties by lexicographic order
    if len(x) > len(y):
        return x
    elif len(y) > len(x):
        return y
    else:
        return x if x > y else y
wordRDD.reduce(LargerThan)
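Since LargerThan prefers the longer word and breaks ties lexicographically, the reduce above should return 'this'; unlike the subtraction example, this function is associative and commutative, so the result does not depend on partitioning. A quick sanity check without Spark, using the plain Python list defined above:
from functools import reduce
print(reduce(LargerThan, words))    # expected: 'this'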
# Print the version of SparkContext
print("The version of Spark Context in the PySpark shell is", sc.versi
on )
# Print the Python version of SparkContext
Big Data Exercises Snapshots
print("The Python version of Spark Context in the PySpark shell is", s
c.pythonVer )
# Print the master of SparkContext
print("The master of Spark Context in the PySpark shell is", sc.master
)
# Create a Python range of numbers from 1 to 99 (range() excludes the stop value)
numb = range(1, 100)
# Load the list into PySpark
spark_data = sc.parallelize(numb)
# Load a local file into PySpark shell
lines = sc.textFile(file_path)
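file_path here is a variable supplied by the DataCamp exercise environment. To reproduce the step locally, any readable text file works; the path below is only a hypothetical placeholder:
# file_path = "file:///tmp/sample.txt"    # hypothetical local path, adjust as needed
# lines = sc.textFile(file_path)
# print(lines.take(2))                    # peek at the first two lines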
# Print my_list in the console
print("Input list is", my_list)
# Square all numbers in my_list
squared_list_lambda = list(map(lambda x: x ** 2, my_list))
# Print the result of the map function
print("The squared numbers are", squared_list_lambda)
# Print my_list2 in the console
print("Input list is:", my_list2)
# Filter numbers divisible by 10
filtered_list = list(filter(lambda x: (x%10 == 0), my_list2))
# Print the numbers divisible by 10
print("Numbers divisible by 10 are:", filtered_list)
# Create an RDD from a list of words
RDD = sc.parallelize(["Spark", "is", "a", "framework", "for", "Big Dat
a processing"])
# Print out the type of the created object
print("The type of RDD is", type(RDD))
# Print the file_path
print("The file_path is", file_path)
# Create a fileRDD from file_path
fileRDD = sc.textFile(file_path)
# Check the type of fileRDD
print("The file type of fileRDD is", type(fileRDD))
# Check the number of partitions in fileRDD
print("Number of partitions in fileRDD is", fileRDD.getNumPartitions()
)
# Create a fileRDD_part from file_path with 5 partitions
fileRDD_part = sc.textFile(file_path, minPartitions = 5)
# Check the number of partitions in fileRDD_part
print("Number of partitions in fileRDD_part is", fileRDD_part.getNumPa
rtitions())
# Create map() transformation to cube numbers
cubedRDD = numbRDD.map(lambda x: x ** 3)
# Collect the results
numbers_all = cubedRDD.collect()
# Print the numbers from numbers_all
for numb in numbers_all:
    print(numb)
# Filter the fileRDD to select lines with Spark keyword
fileRDD_filter = fileRDD.filter(lambda line: 'Spark' in line)
# How many lines are there in fileRDD?
print("The total number of lines with the keyword Spark is", fileRDD_f
ilter.count() )
# Print the first four lines of fileRDD
for line in fileRDD_filter.take(4):
    print(line)
# Create PairRDD Rdd with key value pairs
Rdd = sc.parallelize([(1,2),(3,4),(3,6),(4,5)])
# Apply reduceByKey() operation on Rdd
Rdd_Reduced = Rdd.reduceByKey(lambda x, y: x + y)
# Iterate over the result and print the output
for num in Rdd_Reduced.collect():
    print("Key {} has {} Counts".format(num[0], num[1]))
# Sort the reduced RDD with the key by descending order
Rdd_Reduced_Sort = Rdd_Reduced.sortByKey(ascending=False)
# Iterate over the result and print the output
for num in Rdd_Reduced_Sort.collect():
print("Key {} has {} Counts".format(num[0], num[1]))
# Transform the rdd with countByKey()
total = Rdd.countByKey()
# What is the type of total?
print("The type of total is", type(total))
# Iterate over the total and print the output
for k, v in total.items():
    print("key", k, "has", v, "counts")
# Create a baseRDD from the file path
baseRDD = sc.textFile(file_path)
# Split the lines of baseRDD into words
splitRDD = baseRDD.flatMap(lambda x: x.split(" "))
# Count the total number of words
print("Total number of words in splitRDD:", splitRDD.count())
# Convert the words to lower case and remove the stop words listed in stop_words
splitRDD_no_stop = splitRDD.filter(lambda x: x.lower() not in stop_words)
# Create a tuple of the word and 1
splitRDD_no_stop_words = splitRDD_no_stop.map(lambda w: (w, 1))
# Count the number of occurrences of each word
resultRDD = splitRDD_no_stop_words.reduceByKey(lambda x, y: x + y)
# Display the first 10 words and their frequencies
for word in resultRDD.take(10):
    print(word)
# Swap the keys and values
resultRDD_swap = resultRDD.map(lambda x: (x[1], x[0]))
# Sort the keys in descending order
resultRDD_swap_sort = resultRDD_swap.sortByKey(ascending=False)
# Show the top 10 most frequent words and their frequencies
for word in resultRDD_swap_sort.take(10):
    print("{} has {} counts".format(word[1], word[0]))
# Create a list of tuples
sample_list = [('Mona',20), ('Jennifer',34),('John',20), ('Jim',26)]
# Create a RDD from the list
rdd = sc.parallelize(sample_list)
# Create a PySpark DataFrame
names_df = spark.createDataFrame(rdd, schema=['Name', 'Age'])
# Check the type of names_df
print("The type of names_df is", type(names_df))
# Create a DataFrame from file_path
people_df = spark.read.csv(file_path, header=True, inferSchema=True)
# Check the type of people_df
print("The type of people_df is", type(people_df))
# Print the first 10 observations
people_df.show(10)
# Count the number of rows
print("There are {} rows in the people_df DataFrame.".format(people_df
.count()))
# Count the number of columns and their names
print("There are {} columns in the people_df DataFrame and their names
are {}".format(len(people_df.columns ), people_df.columns ))
# Select name, sex and date of birth columns
people_df_sub = people_df.select('name', 'sex', 'date of birth')
# Print the first 10 observations from people_df_sub
people_df_sub.show(10)
# Remove duplicate entries from people_df_sub
people_df_sub_nodup = people_df_sub.dropDuplicates()
# Count the number of rows
print("There were {} rows before removing duplicates, and {} rows afte
r removing duplicates".format(people_df_sub.count(), people_df_sub_nod
up.count()))
# Filter people_df to select females
people_df_female = people_df.filter(people_df.sex == "female")
# Filter people_df to select males
people_df_male = people_df.filter(people_df.sex == "male")
# Count the number of rows
print("There are {} rows in the people_df_female DataFrame and {} rows
in the people_df_male DataFrame".format(people_df_female.count(), peop
le_df_male.count()))
# Create a temporary table "people"
people_df.createOrReplaceTempView("people")
# Construct a query to select the names of the people from the temporary table "people"
query = '''SELECT name FROM people'''
# Assign the result of Spark's query to people_df_names
people_df_names = spark.sql(query)
# Print the top 10 names of the people
people_df_names.show(10)
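The same names can be obtained with the DataFrame API instead of SQL; a one-line sketch equivalent to the query above:
people_df.select('name').show(10)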
# Filter the people table to select female sex
people_female_df = spark.sql('SELECT * FROM people WHERE sex=="female"')
# Filter the people table DataFrame to select male sex
people_male_df = spark.sql('SELECT * FROM people WHERE sex=="male"')
# Count the number of rows in both DataFrames
print("There are {} rows in the people_female_df and {} rows in the pe
ople_male_df DataFrames".format(people_female_df.count(), people_male_
df.count()))
# Check the column names of names_df
print("The column names of names_df are", names_df.columns )
# Convert to Pandas DataFrame
df_pandas = names_df.toPandas()
# Create a horizontal bar plot
df_pandas.plot(kind='barh', x='Name', y='Age', colormap='winter_r')
plt.show()
# Load the DataFrame
fifa_df = spark.read.csv(file_path, header=True, inferSchema=True)
# Check the schema of columns
fifa_df.printSchema()
# Show the first 10 observations
fifa_df.show(10)
# Print the total number of rows
print("There are {} rows in the fifa_df DataFrame".format(fifa_df.coun
t()))
# Create a temporary view of fifa_df
fifa_df.createOrReplaceTempView('fifa_df_table')
# Construct the "query"
query = '''SELECT Age FROM fifa_df_table WHERE Nationality == "Germany"'''
# Apply the SQL "query"
fifa_df_germany_age = spark.sql(query)
# Generate basic statistics
fifa_df_germany_age.describe().show()
# Convert fifa_df_germany_age to a Pandas DataFrame
fifa_df_germany_age_pandas = fifa_df_germany_age.toPandas()
# Plot the 'Age' density of Germany Players
fifa_df_germany_age_pandas.plot(kind='density')
plt.show()
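toPandas() pulls every row of the Spark DataFrame onto the driver, so for a large table it is safer to aggregate or sample in Spark first; a hedged sketch (the 0.1 fraction is an arbitrary choice introduced here):
sampled_pandas = fifa_df_germany_age.sample(False, 0.1).toPandas()
sampled_pandas.plot(kind='density')
plt.show()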