Apply Hadoop for Creating a Recommendation System
Submitted by,
DIVYALAKSHMI .K
DHARUSHYAN N
GANESH ARAVIND A
OVERVIEW
• Our project, the Recommendation System Using
Hadoop, provides personalized recommendations
to improve the user experience.
• Developed with Python and Hadoop, it processes
large datasets efficiently, using the PySpark and
PyFlink libraries for data handling.
• Users can implement features like collaborative
filtering, content-based filtering, and hybrid models
to tailor recommendations, making it a valuable tool
for businesses aiming to optimize user engagement
and customer satisfaction.
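To make the idea of collaborative filtering concrete, here is a minimal sketch in plain Python. It is a user-based variant that scores unseen items by the similarity-weighted ratings of other users. The toy `ratings` data and the function names `cosine` and `recommend` are assumptions made for illustration, not part of the project code.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 2.0, "D": 5.0},
    "carol": {"B": 5.0, "C": 1.0, "D": 2.0},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=2):
    """Predict ratings for items the user has not rated yet."""
    scores = defaultdict(float)
    weights = defaultdict(float)
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score unseen items
            scores[item] += sim * rating
            weights[item] += sim
    # Similarity-weighted average rating per candidate item
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Calling `recommend("alice", ratings)` returns the items Alice has not rated, ordered by predicted rating. A content-based or hybrid model would replace or combine the similarity function with item features.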
Pre-requisites:
• MACHINE LEARNING
• PySpark and PyFlink
• HADOOP
• Data Preprocessing
How Are We Going to Build This?
TECHNOLOGIES
Hadoop Cluster:
• To begin, you'll need a running Hadoop cluster with access to
the Hadoop Distributed File System (HDFS) for data storage
and Hadoop MapReduce for data processing.
Python Environment:
• A Python development environment is needed to code the
recommendation system. Python works with Hadoop through
Hadoop Streaming and through libraries such as PySpark.
Hadoop Streaming:
• Hadoop Streaming lets you write MapReduce jobs in any
language that can read from standard input and write to
standard output, such as Python. It's handy for customizing
recommendation algorithms.
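As an illustration of a Hadoop Streaming job, the sketch below computes the average rating per item. It assumes input lines of the form `user,item,rating`. That format and the `map` / `reduce` command-line argument are assumptions for this example, not something the project specifies.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Emit 'item<TAB>rating' for each valid 'user,item,rating' line."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue  # skip malformed lines
        _user, item, rating = parts
        try:
            value = float(rating)
        except ValueError:
            continue  # skip headers and non-numeric ratings
        yield f"{item}\t{value}"


def reducer(lines):
    """Average the ratings per item.

    Hadoop Streaming sorts mapper output by key before the reduce
    step, so all lines for one item arrive next to each other.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for item, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [float(v) for _, v in group]
        yield f"{item}\t{sum(values) / len(values):.2f}"


# Run as "script.py map" or "script.py reduce" (hypothetical convention).
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in {"map", "reduce"}:
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

On a cluster, the same script would be passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options, with `-input` and `-output` pointing at HDFS paths. Locally, the two steps can be chained with a sort in between to check the logic.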
Libraries
• PySpark:
• PySpark is the Python API for Apache Spark,
which can run on a Hadoop cluster and read
data from HDFS. It provides APIs for
distributed data processing, enabling
efficient data manipulation.
• PyFlink:
• PyFlink is the Python API for Apache Flink
and complements Hadoop. It supports both
stream and batch processing, making it
suitable for real-time recommendations.
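One way PySpark could be used here is through the ALS (alternating least squares) recommender in `pyspark.ml`. The sketch below is an assumption about how the pieces might fit together, not the project's actual code. The helper names `parse_rating` and `train_als`, the column names, and the hyperparameter values are all chosen for illustration.

```python
def parse_rating(line):
    """Turn 'user,item,rating' into (int, int, float); None for bad lines."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return None
    try:
        return int(parts[0]), int(parts[1]), float(parts[2])
    except ValueError:
        return None


def train_als(spark, rows, top_n=3):
    """Fit ALS on (user, item, rating) rows and return top-N items per user.

    `spark` is an existing SparkSession; `rows` is a list of tuples.
    """
    # Imported here so the parsing helper above works without PySpark installed.
    from pyspark.ml.recommendation import ALS

    df = spark.createDataFrame(rows, ["user", "item", "rating"])
    als = ALS(
        userCol="user",
        itemCol="item",
        ratingCol="rating",
        rank=5,          # illustrative values, not tuned
        maxIter=5,
        regParam=0.1,
        coldStartStrategy="drop",  # drop predictions for unseen users/items
    )
    model = als.fit(df)
    return model.recommendForAllUsers(top_n)
```

With PySpark installed, a session can be created with `SparkSession.builder.appName("recs").getOrCreate()` and the result inspected with `train_als(spark, rows).show()`. On a Hadoop cluster, the rows would more likely come from a file in HDFS read through `spark.read.csv`.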
SYSTEM ARCHITECTURE
• The architecture of the recommendation
system consists of multiple layers, including
data collection, preprocessing, model training,
and recommendation generation.
• Data is ingested and stored in HDFS, where
MapReduce jobs perform preprocessing and
analysis to create recommendation models.
• The final output is delivered as
recommendations that are personalized for
each user.
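The four layers described above can be sketched as a small in-memory pipeline. This is a single-machine illustration under stated assumptions: the input format `user,item`, the function names, and the item co-occurrence model are chosen for the example and stand in for the HDFS storage and MapReduce jobs of the real system.

```python
from collections import Counter, defaultdict
from itertools import combinations


def collect(raw_lines):
    """Data collection: parse 'user,item' interaction lines."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) == 2 and all(parts):
            yield parts[0], parts[1]


def preprocess(events):
    """Preprocessing: group items per user, dropping duplicate events."""
    baskets = defaultdict(set)
    for user, item in events:
        baskets[user].add(item)
    return dict(baskets)


def train(baskets):
    """Model training: count how often two items share a user."""
    cooccur = defaultdict(Counter)
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return cooccur


def recommend(user, baskets, cooccur, top_n=2):
    """Recommendation generation: rank unseen items by co-occurrence."""
    seen = baskets.get(user, set())
    scores = Counter()
    for item in seen:
        for other, count in cooccur.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]
```

Each function maps onto one layer, so in a Hadoop deployment `preprocess` and `train` are the natural candidates to become MapReduce jobs over data stored in HDFS.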