Daeeh Module
DATA ANALYTICS IN
EDUCATION, ENTERTAINMENT
& HOSPITALITY
MODULE 1
PREPARED BY:
MS. PRATIBHA SAJWAN
WHAT IS DATA ANALYTICS
• DATA ANALYTICS IS THE PROCESS OF COLLECTING, ORGANIZING, AND STUDYING DATA TO FIND USEFUL
INFORMATION, UNDERSTAND WHAT’S HAPPENING, AND MAKE BETTER DECISIONS. IN SIMPLE WORDS, IT
HELPS PEOPLE AND BUSINESSES LEARN FROM DATA: WHAT WORKED IN THE PAST, WHAT IS
HAPPENING NOW, AND WHAT MIGHT HAPPEN IN THE FUTURE.
IMPORTANCE
INTRODUCTION
• DEFINITION: THE PROCESS OF EXAMINING DATA SETS TO DRAW CONCLUSIONS ABOUT THE INFORMATION
THEY CONTAIN.
• DATA ANALYSIS IS NOT A NEW CONCEPT. EVEN IN ANCIENT TIMES, PEOPLE USED RUDIMENTARY FORMS
OF DATA ANALYSIS TO RECORD ASTRONOMICAL OBSERVATIONS, TRACK GOODS, AND MAINTAIN CENSUS
DATA. HOWEVER, IT WAS ONLY WITH THE ADVENT OF COMPUTERS IN THE MID-20TH CENTURY THAT
DATA ANALYTICS BEGAN TO TAKE ON A FORM WE WOULD RECOGNIZE TODAY.
• THE ADVENT OF DATABASES IN THE 1960S AND 1970S MARKED A SIGNIFICANT LEAP FORWARD IN
DATA ANALYTICS. THIS ALLOWED ORGANIZATIONS TO STORE AND MANAGE LARGE AMOUNTS OF DATA
DIGITALLY. THE 1980S SAW THE DEVELOPMENT OF MORE ADVANCED STATISTICAL SOFTWARE, AND BY
THE 1990S, DATA MINING TECHNIQUES WERE BEING USED TO EXTRACT INSIGHTS FROM LARGE
DATASETS.
THE PRESENT: THE AGE OF BIG DATA AND
MACHINE LEARNING
• IN THE 21ST CENTURY, DATA ANALYTICS HAS ENTERED A NEW ERA CHARACTERIZED BY TWO KEY
DEVELOPMENTS: THE EXPLOSION OF BIG DATA AND THE ADVENT OF MACHINE LEARNING.
• WITH THE ADVENT OF THE INTERNET, SOCIAL MEDIA, AND SMART DEVICES, WE ARE NOW GENERATING
DATA AT AN UNPRECEDENTED RATE. THIS HAS GIVEN RISE TO "BIG DATA," CHARACTERIZED BY ITS
VOLUME, VARIETY, AND VELOCITY. DATA ANALYTICS HAS HAD TO ADAPT TO HANDLE THIS DELUGE OF
INFORMATION.
• CONCURRENTLY, MACHINE LEARNING HAS TAKEN DATA ANALYTICS TO NEW HEIGHTS. MACHINE
LEARNING, A SUBSET OF ARTIFICIAL INTELLIGENCE, INVOLVES TEACHING COMPUTERS TO LEARN FROM
DATA AND MAKE PREDICTIONS OR DECISIONS WITHOUT BEING EXPLICITLY PROGRAMMED TO DO SO.
THIS HAS ENABLED MORE ADVANCED FORMS OF DATA ANALYTICS, SUCH AS PREDICTIVE ANALYTICS
AND PRESCRIPTIVE ANALYTICS.
THE FUTURE: TOWARDS AN AI-DRIVEN WORLD
• LOOKING AHEAD, IT'S CLEAR THAT THE EVOLUTION OF DATA ANALYTICS IS FAR FROM OVER. SEVERAL
EMERGING TRENDS SUGGEST THE
AUTOMATED MACHINE LEARNING
• AUTOMATED MACHINE LEARNING (AUTO ML) REPRESENTS THE NEXT STEP IN THE EVOLUTION OF
MACHINE LEARNING. IT INVOLVES
• EFFICIENCY OF EXPERTS.
REAL-TIME ANALYTICS
• WITH THE RISE OF INTERNET OF THINGS (IOT) DEVICES AND OTHER TECHNOLOGIES THAT GENERATE
CONTINUOUS STREAMS OF DATA, REAL-TIME ANALYTICS IS BECOMING INCREASINGLY IMPORTANT. THIS
INVOLVES ANALYZING DATA IN NEAR-REAL TIME TO GAIN INSTANT INSIGHTS.
DATA PRIVACY AND ETHICS
• AS DATA ANALYTICS BECOMES MORE PERVASIVE, ISSUES OF DATA PRIVACY AND ETHICS BECOME INCREASINGLY
IMPORTANT. FUTURE DEVELOPMENTS IN DATA ANALYTICS WILL NEED TO CONSIDER THESE ISSUES, BALANCING THE
BENEFITS OF DATA ANALYSIS WITH THE NEED
• THE EVOLUTION OF DATA ANALYTICS FROM SIMPLE STATISTICS TO ADVANCED MACHINE LEARNING TECHNIQUES
HAS BEEN A REMARKABLE JOURNEY. WITH THE ADVENT OF BIG DATA AND ARTIFICIAL INTELLIGENCE, DATA
ANALYTICS IS POISED TO CONTINUE ITS TRANSFORMATIVE IMPACT ON BUSINESSES AND SOCIETY. AS WE LOOK TO
THE FUTURE, WE CAN EXPECT FURTHER INNOVATIONS AND CHALLENGES IN THIS EXCITING FIELD.
INTRODUCTION TO DATA PROCESSING
• DATA PROCESSING, THE CONVERSION OF RAW DATA INTO MEANINGFUL INFORMATION, IS PIVOTAL IN
TODAY'S INFORMATION-DRIVEN WORLD.
• DATA PROCESSING IS VITAL ACROSS VARIOUS SECTORS, FROM BUSINESS AND SCIENCE TO
REAL-TIME APPLICATIONS, SHAPING THE WAY WE INTERPRET AND UTILIZE INFORMATION.
WHAT IS DATA PROCESSING?
• PROCESSED DATA REFERS TO THE REFINED AND FINALIZED FORM OF DATA, WITH ITS ASSOCIATED
ATTRIBUTES, AFTER IT HAS UNDERGONE THE VARIOUS PROCESSING STEPS.
STAGES OF DATA PROCESSING
• COLLECTION
• PREPARATION
• INPUT
• DATA PROCESSING
• DATA OUTPUT
• DATA STORAGE
COLLECTION
• THE PROCESS BEGINS WITH THE COLLECTION OF RAW DATA FROM VARIOUS SOURCES. THE STAGE
ESTABLISHES THE FOUNDATION FOR SUBSEQUENT PROCESSING, ENSURING A COMPREHENSIVE POOL OF
DATA RELEVANT TO THE INTENDED ANALYSIS. IT COULD INCLUDE SURVEYS, SENSORS, DATABASES, OR
ANY OTHER MEANS OF GATHERING RELEVANT INFORMATION.
PREPARATION
• DATA PREPARATION FOCUSES ON ORGANIZING, CLEANING, AND FORMATTING RAW DATA.
IRRELEVANT INFORMATION IS FILTERED OUT, ERRORS ARE CORRECTED, AND THE DATA IS STRUCTURED
IN A WAY THAT FACILITATES EFFICIENT ANALYSIS DURING SUBSEQUENT STAGES OF PROCESSING.
INPUT
• DURING THE DATA INPUT STAGE, THE PREPARED DATA IS ENTERED INTO A COMPUTER SYSTEM. THIS
CAN BE ACHIEVED THROUGH MANUAL ENTRY OR AUTOMATED METHODS, DEPENDING ON THE NATURE OF
THE DATA AND THE SYSTEMS IN PLACE.
DATA PROCESSING
• THE CORE OF DATA PROCESSING INVOLVES MANIPULATING AND ANALYZING THE PREPARED DATA.
OPERATIONS SUCH AS SORTING, SUMMARIZING, CALCULATING, AND AGGREGATING ARE PERFORMED TO
EXTRACT MEANINGFUL INSIGHTS AND PATTERNS.
DATA OUTPUT
• THE RESULTS OF DATA PROCESSING ARE PRESENTED IN A COMPREHENSIBLE FORMAT DURING THE DATA
OUTPUT STAGE. THIS COULD INCLUDE REPORTS, CHARTS, GRAPHS, OR OTHER VISUAL REPRESENTATIONS
THAT FACILITATE UNDERSTANDING AND DECISION-MAKING BASED ON THE ANALYZED DATA.
DATA STORAGE
• THE FINAL STAGE ENTAILS STORING THE PROCESSED DATA FOR FUTURE REFERENCE AND ANALYSIS.
THIS IS CRUCIAL FOR MAINTAINING A HISTORICAL RECORD, ENABLING EFFICIENT RETRIEVAL, AND
SUPPORTING ONGOING OR FUTURE DATA-RELATED INITIATIVES. PROPER DATA STORAGE ENSURES THE
LONGEVITY AND ACCESSIBILITY OF VALUABLE INFORMATION.
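The six stages above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the survey data, the column names (`guest`, `rating`), and the function names are hypothetical, an in-memory CSV string stands in for a real data source, and a JSON string stands in for a database.

```python
import csv
import io
import json
from statistics import mean

# Hypothetical raw survey export; the column names are illustrative only.
RAW_CSV = """guest,rating
Asha,4
Ben,
Chen,5
Asha,4
Dev,abc
"""

def collect(source: str) -> list[dict]:
    """Collection: read raw records from a source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(source)))

def prepare(rows: list[dict]) -> list[dict]:
    """Preparation: drop missing or non-numeric ratings and exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        rating = (row["rating"] or "").strip()
        key = (row["guest"], rating)
        if not rating.isdigit() or key in seen:
            continue
        seen.add(key)
        cleaned.append({"guest": row["guest"], "rating": int(rating)})
    return cleaned

def process(rows: list[dict]) -> dict:
    """Processing: summarize the prepared data."""
    ratings = [r["rating"] for r in rows]
    return {"count": len(ratings), "average_rating": mean(ratings)}

def output(summary: dict) -> str:
    """Output: present the result in a readable form."""
    return (f"{summary['count']} valid responses, "
            f"average rating {summary['average_rating']:.2f}")

def store(summary: dict) -> str:
    """Storage: serialize the result (a JSON string stands in for a database)."""
    return json.dumps(summary)

rows = collect(RAW_CSV)     # collection
clean = prepare(rows)       # preparation; loading into Python lists is the input step
summary = process(clean)    # processing
report = output(summary)    # output
stored = store(summary)     # storage
print(report)
```

Keeping each stage as its own function mirrors the list above and makes it easy to swap one stage (for example, a different data source) without touching the others.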
TYPES OF DATA PROCESSING
MANUAL DATA PROCESSING
• IN THIS TYPE, DATA IS PROCESSED BY HUMANS WITHOUT THE USE OF MACHINES OR ELECTRONIC
DEVICES. IT INVOLVES TASKS SUCH AS MANUAL CALCULATIONS, SORTING, AND RECORDING, MAKING IT
A TIME-CONSUMING PROCESS.
MECHANICAL DATA PROCESSING
• THIS TYPE UTILIZES MECHANICAL DEVICES, SUCH AS PUNCH CARDS OR MECHANICAL CALCULATORS, TO
PROCESS DATA. WHILE MORE EFFICIENT THAN MANUAL PROCESSING, IT LACKS THE SPEED AND
CAPABILITIES OF ELECTRONIC METHODS.
ELECTRONIC DATA PROCESSING
• ELECTRONIC DATA PROCESSING (EDP) INVOLVES THE USE OF COMPUTERS TO PROCESS AND ANALYZE
DATA. IT SIGNIFICANTLY ENHANCES SPEED AND ACCURACY COMPARED TO MANUAL AND MECHANICAL
METHODS, MAKING IT A FUNDAMENTAL SHIFT IN DATA PROCESSING.
WHAT IS REAL-TIME DATA PROCESSING?
• REAL-TIME DATA PROCESSING IS A TECHNIQUE OF INSTANTANEOUS INGESTION, TRANSFORMATION,
STORAGE, AND ANALYSIS OF DATA AS SOON AS IT IS GENERATED. IT IS A PREFERRED METHOD FOR
PERFORMING FASTER DATA OPERATIONS WITH LATENCY IN THE RANGE OF MILLISECONDS. YOU CAN
LEVERAGE REAL-TIME DATA PROCESSING FOR ACCELERATED DATA ANALYSIS AND BUSINESS INSIGHTS
GENERATION.
• THE FIRST STEP IN REAL-TIME DATA PROCESSING IS INSTANT DATA INGESTION. THIS IS DONE BY
COLLECTING DATA FROM VARIOUS SOURCES, SUCH AS SERVER LOGS, IOT DEVICES, OR SOCIAL MEDIA
FEEDS. YOU CAN INGEST THIS DATA WITH THE HELP OF DATA STREAMING TOOLS LIKE APACHE KAFKA
OR AMAZON KINESIS.
DATA PROCESSING
• HERE, THE INGESTED DATA IS AGGREGATED, CLEANED, TRANSFORMED, AND ENRICHED. THIS CONVERTS
IT INTO A FORMAT SUITABLE FOR OTHER DATA SYSTEMS AND APPLICATIONS.
DATA STORAGE
• AFTER PROCESSING, YOU CAN STORE YOUR DATA IN DESTINATION SYSTEMS SUCH AS RELATIONAL
DATABASES, STREAMING PLATFORMS, OR IN-MEMORY DATABASES.
DATA DISTRIBUTION
• YOU CAN DISTRIBUTE THE PROCESSED DATA ACROSS VARIOUS SYSTEMS TO MAKE IT ACCESSIBLE FOR
DOWNSTREAM OPERATIONS.
BENEFITS OF REAL-TIME DATA PROCESSING
• YOU CAN USE THIS INFORMATION TO SPEEDILY CHANGE YOUR PRODUCT FEATURES IN ACCORDANCE
WITH THE PREFERENCES OF YOUR TARGET CUSTOMERS.
• THIS GIVES YOU AN EDGE OVER YOUR COMPETITORS, CONTRIBUTING TO BUSINESS GROWTH.
ENHANCED DATA QUALITY
• DURING REAL-TIME DATA MOVEMENT, YOU CAN QUICKLY DISCOVER DISCREPANCIES IN YOUR
DATASETS.
• EARLY DETECTION HELPS YOU REMOVE ERRORS IMMEDIATELY AND IMPROVE YOUR DATA QUALITY.
• MOREOVER, THE ERRORS ARE CAUGHT CLOSER TO THEIR SOURCE IN REAL-TIME DATA FLOW. THIS
HELPS IDENTIFY THE ROOT CAUSE AND RESOLVE INACCURACIES IN DATA AT AN EARLY STAGE.
ELEVATED CUSTOMER EXPERIENCE
• IN AN ENTERPRISE, REAL-TIME DATA PROCESSING CAN HELP YOU SIGNIFICANTLY IMPROVE THE
CUSTOMER SERVICE EXPERIENCE.
• YOU CAN EVALUATE YOUR CUSTOMER DATA PROMPTLY AND IDENTIFY THE PAIN POINTS DISCOURAGING
PEOPLE FROM BUYING YOUR PRODUCTS OR SERVICES.
• SUCH EVALUATIONS HELP YOU IMPROVE YOUR PRODUCT DEVELOPMENT AND MARKETING STRATEGIES
TO INCREASE CUSTOMER ENGAGEMENT AND REVENUE.
INCREASED DATA SECURITY
• YOU CAN QUICKLY DETECT FRAUD OR SECURITY BREACHES BY MONITORING AND PROCESSING YOUR
DATA IN REAL-TIME.
• THIS IS ESPECIALLY USEFUL IN THE FINANCE SECTOR OR STOCK MARKETS. YOU CAN ALSO USE
REAL-TIME PROCESSING TO IDENTIFY EARLY SIGNS OF NEGATIVE TRENDS THAT COULD IMPACT MARKET
OR STOCK PRICES.
• THIS ALLOWS YOU TO TAKE PREVENTIVE ACTIONS BEFOREHAND TO MINIMIZE POTENTIAL LOSSES.
REAL-TIME VS. BATCH VS. NEAR REAL-TIME
DATA PROCESSING
Feature | Real-time Processing | Batch Processing
Definition | Real-time processing involves instantaneous data flow and operations. | Here, you have to process data flows and operations in batches.
Latency Range | It has a latency range of a few milliseconds. | It has a latency range from several hours to days.
Complexity | Real-time processing is a complex process that requires technical expertise and infrastructure. | It is the easiest of the three processing techniques.
• THE SPEED LAYER ENABLES THE DISTRIBUTED PROCESSING OF REAL-TIME DATA USING STREAM
PROCESSING TOOLS LIKE APACHE KAFKA OR APACHE STORM. THE SERVING LAYER ALLOWS YOU TO
UNIFY THE OUTPUTS OF BOTH BATCH AND SPEED LAYERS. IT ACTS AS AN INTERMEDIARY BETWEEN THE
END-USER AND THE PROCESSED DATA. THE SERVING LAYER USES A DATABASE SUCH AS APACHE
CASSANDRA OR MONGODB TO STORE PROCESSED DATA AND QUERY ENGINES LIKE APACHE HIVE TO
ENABLE YOU TO QUERY THE DATA.
2. KAPPA
• KAPPA ARCHITECTURE IS SIMPLER THAN LAMBDA ARCHITECTURE AND CAN HANDLE REAL-TIME
PROCESSING IN A MORE STREAMLINED WAY. IT CONSISTS OF ONLY ONE STREAMING LAYER. THIS LAYER
USES TOOLS LIKE KAFKA STREAMS OR APACHE FLINK TO INGEST AND PROCESS DATA AND STORE IT IN A
DATABASE LIKE APACHE CASSANDRA.
3. DELTA
• THE DELTA ARCHITECTURE ENABLES YOU TO COMBINE AND STREAMLINE BOTH LAMBDA AND KAPPA'S
STORAGE AND PROCESSING CAPABILITIES THROUGH THE MICRO-BATCHING TECHNIQUE. THIS TECHNIQUE
IS AN INTERMEDIARY APPROACH AND FORMS THE BASIS OF DATA PROCESSING IN MANY MODERN DATA
LAKES, LIKE DELTA LAKE.
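The micro-batching idea mentioned above can be illustrated with a small sketch: a continuous stream is cut into small batches, and each batch is processed as a unit. This is not how any particular data lake implements it; the function name `micro_batches`, the size-based batching rule, and the sample readings are assumptions made for illustration (real systems often batch by time window instead).

```python
from itertools import islice
from typing import Iterable, Iterator

def micro_batches(events: Iterable[int], batch_size: int) -> Iterator[list[int]]:
    """Group a continuous stream of events into small fixed-size batches."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # stream exhausted
        yield batch

# Hypothetical stream of sensor readings.
stream = [3, 5, 2, 8, 6, 1, 7]

# Each micro-batch is aggregated as soon as it is full (or the stream ends).
batch_totals = [sum(batch) for batch in micro_batches(stream, batch_size=3)]
print(batch_totals)  # [10, 15, 7]
```

Smaller batches move the behaviour closer to real-time processing, while larger batches move it closer to classic batch processing.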
CHALLENGES WITH PROCESSING DATA IN REAL-TIME
• 1. SCALABILITY
• IT CAN BE CHALLENGING TO PROCESS LARGE VOLUMES OF DATA IN REAL-TIME COMING FROM SOURCES SUCH AS
SOCIAL MEDIA FEEDS OR FINANCIAL TRANSACTIONS. THIS CAN OVERLOAD YOUR INFRASTRUCTURE,
REDUCING PROCESSING EFFICIENCY. MOREOVER, YOUR INFRASTRUCTURE MAY NOT SCALE FLEXIBLY WITH
AN INCREASE OR DECREASE IN DATA VOLUME, RESULTING IN OVER- OR UNDERUTILIZATION OF ITS RESOURCES.
• 2. DATA QUALITY
• REAL-TIME DATA COMES FROM VARIOUS SOURCES IN DIFFERENT FORMATS, WHICH MAY OR MAY NOT BE
COMPATIBLE WITH EACH OTHER. REAL-TIME PROCESSING CAN ALSO SUFFER TECHNICAL GLITCHES, SOMETIMES
RESULTING IN MISSING DATA. ALL THESE IRREGULARITIES CONTRIBUTE TO THE DETERIORATION OF DATA QUALITY.
• 3. COMPLEXITY
• IN REAL-TIME DATA PROCESSING, MULTIPLE TASKS, SUCH AS INGESTION, CLEANING, TRANSFORMATION, AND
LOADING, ARE PERFORMED SIMULTANEOUSLY. AS A RESULT, PROPER RESOURCE ALLOCATION FOR EACH OF
THESE TASKS BECOMES A CHALLENGE. HISTORICAL ANALYSIS ALSO BECOMES DIFFICULT IN REAL-TIME
PROCESSING AS MORE EMPHASIS IS PLACED ON CURRENT DATA. THIS AFFECTS THE FRAMING OF LONG-TERM
STRATEGIES IN BUSINESS.
• 4. SECURITY
• WHILE PROCESSING YOUR DATA IN REAL-TIME, THERE ARE RISKS OF UNAUTHORIZED ACCESS AND DATA
BREACHES, BECAUSE SECURITY PROTOCOLS CAN BE COMPROMISED IN THE RUSH TO ACHIEVE FASTER PROCESSING.
TO AVOID THIS, YOU SHOULD SECURE THE DATA PIPELINES THROUGH ROLE-BASED ACCESS
CONTROL, AUTHENTICATION, AND ENCRYPTION.
• 5. COSTS
• REAL-TIME DATA PROCESSING REQUIRES SPECIALIZED HARDWARE AND SOFTWARE INFRASTRUCTURE. THIS
INFRASTRUCTURE IS COSTLY TO BUY AND MAINTAIN, SO PROCESSING DATA IN REAL-TIME MAY CAUSE FINANCIAL
STRAIN.
USE CASES FOR REAL-TIME DATA PROCESSING
• 1. FINANCE
• REAL-TIME DATA PROCESSING IS USEFUL FOR DETECTING SUSPICIOUS TRANSACTIONS AND FRAUD IN THE
FINANCE SECTOR. IT ALSO FACILITATES REAL-TIME ANALYSIS OF STOCK MARKET TRENDS TO ENABLE YOU TO
TAKE PRECAUTIONARY ACTIONS BEFORE POTENTIAL MONETARY LOSS.
• 2. E-COMMERCE
• REAL-TIME PROCESSING IN E-COMMERCE CAN HELP YOU COMPREHENSIVELY ANALYZE CUSTOMER BEHAVIOR
DATA. IT HELPS YOU QUICKLY RELATE PURCHASING BEHAVIOR, SEARCH HISTORY, AND PREFERENCES TO GIVE
BETTER PRODUCT RECOMMENDATIONS. YOU CAN ALSO MANAGE YOUR INVENTORY BY CONTINUOUSLY
ANALYZING SALES AND INVENTORY DATA.
• 3. HEALTHCARE
• PROCESSING HEALTH DATA RECORDS OF A PATIENT IN REAL TIME HELPS IN THE EARLY DIAGNOSIS AND
TREATMENT OF SERIOUS DISEASES. IT ALSO HELPS IN MONITORING AND CONTROLLING PUBLIC
OUTBREAKS OF VIRUSES. YOU CAN ALSO USE REAL-TIME DATA PROCESSING IN INVENTORY
MANAGEMENT OF PHARMACEUTICALS AND MEDICAL EQUIPMENT TO ENSURE THAT THERE ARE NO GAPS
IN DEMAND AND SUPPLY.
• 4. COMMUNICATION
• REAL-TIME DATA PROCESSING HAS MADE SWIFT, LOW-LATENCY COMMUNICATION POSSIBLE. THIS IS
VITAL FOR CUSTOMER SERVICE, WHERE YOU CAN GET SOLUTIONS FOR YOUR PROBLEMS ALMOST INSTANTLY
THROUGH TECHNOLOGIES LIKE CHATBOTS. REAL-TIME DATA PROCESSING IS INTEGRAL TO STREAMING
SERVICES, HELPING DELIVER CONTENT WITH MINIMAL BUFFERING. IT IS ALSO AN IMPORTANT PART OF SHARED
WORKSPACES, ENABLING REAL-TIME INTERACTION AND COLLABORATION OF TEAMS FOR PROJECT
COMPLETION.
REAL-TIME DATA PROCESSING TOOLS
• STORAGE,
• PROCESSING
• ANALYSIS FUNCTIONALITY
• REAL-TIME DATA INGESTION TOOLS
• REAL-TIME DATA INGESTION TOOLS ENABLE YOU TO COLLECT DATA AS SOON AS IT IS GENERATED FROM
VARIOUS SOURCES. APACHE KAFKA, APACHE NIFI, AMAZON KINESIS, AND WAVEFRONT ARE SOME
EXAMPLES OF REAL-TIME DATA INGESTION TOOLS.
• STREAM PROCESSING FRAMEWORKS
• STREAM PROCESSING INVOLVES REAL-TIME PROCESSING OF DATA STREAMS. THESE STREAMS COLLECT
AND SEND DATA RECORDS INTO MULTIPLE SYSTEMS SIMULTANEOUSLY. APACHE SPARK, APACHE
STORM, APACHE SAMZA, AND APACHE FLINK ARE SOME EXAMPLES OF STREAM PROCESSING
FRAMEWORKS.
• REAL-TIME DATA STORAGE
• THE REAL-TIME DATA STORAGE PROCESS INVOLVES CAPTURING AND STORING DATA IMMEDIATELY
AFTER ITS GENERATION WITH LOW LATENCY. THERE ARE SOME PLATFORMS THAT FACILITATE THIS,
SUCH AS APACHE CASSANDRA, AMAZON DYNAMODB, FIREBASE, AND MONGODB.
• REAL-TIME ANALYTICS
• REAL-TIME ANALYTICS INVOLVES INSTANT DATA ANALYSIS TO GENERATE FASTER INSIGHTS FOR QUICK
DECISION-MAKING. MANY TOOLS, SUCH AS GOOGLE CLOUD DATAFLOW, AZURE STREAM ANALYTICS,
STREAMSQL, AND IBM STREAM ANALYTICS, FACILITATE REAL-TIME ANALYTICS.
DATA ANALYTICS IN
EDUCATION, ENTERTAINMENT
& HOSPITALITY
MODULE 2
DATA CREATION TOOLS
• MOCKAROO: CREATE REALISTIC MOCK DATA (NAMES, EMAILS, DATES, ETC.) IN FORMATS LIKE CSV,
SQL, JSON.
• GENERATEDATA.COM: ONLINE TOOL TO CREATE CUSTOMIZABLE FAKE DATA SETS FOR TESTING AND
TRAINING.
• DATABENE BENERATOR: JAVA-BASED TOOL FOR GENERATING TEST DATA WITH RULES AND
CONSTRAINTS.
• REDGATE SQL DATA GENERATOR: FOR CREATING DATA FOR SQL SERVER DATABASES WITH
REAL-LIKE PATTERNS.
SYNTHETIC DATA GENERATION TOOLS
(ESPECIALLY FOR ML/AI MODELS)
• CHATGPT / GPT-4O: GENERATE LARGE VOLUMES OF SYNTHETIC TEXT DATA (EMAILS, ARTICLES,
DIALOGUES).
• DATA ANALYTICS MAINTAINS THE EFFICIENCY BY TRACKING SUPPLY CHAIN METRICS, THUS SAVING
LIVES AND COSTS.
• MOODLE
• BLACKBOARD
• CANVAS
• GOOGLE CLASSROOM
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• POWERSCHOOL
• SKYWARD
• INFINITE CAMPUS
• ASPEN
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• TABLEAU
• POWER BI
• D3.JS
• EXCEL
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• LEARNING ANALYTICS PLATFORMS
• KNEWTON: KNEWTON IS AN ADAPTIVE LEARNING TECHNOLOGY PROVIDER THAT MAKES IT POSSIBLE FOR
OTHERS TO BUILD ADAPTIVE LEARNING APPLICATIONS. IN 2016, THE COMPANY ALSO BEGAN DEVELOPING
COURSEWARE FOR HIGHER EDUCATION CLASSES USING CONTENT FROM EDUCATIONAL COMPANIES AND OPEN
EDUCATIONAL RESOURCES.
• EAB NAVIGATE: THE EAB NAVIGATE SYSTEM IS AN ONLINE TOOL TO CONNECT STUDENTS TO FACULTY,
STAFF, AND CAMPUS RESOURCES. THROUGH THIS TOOL, USERS CAN RAISE EARLY ALERTS FOR STUDENTS,
VIEW THE PROGRESS FOR GROUPS OF STUDENTS, CREATE A KIOSK SYSTEM, INITIATE PROACTIVE APPOINTMENT
CAMPAIGNS, AND INNOVATIVELY HELP STUDENTS GRADUATE IN A TIMELY MANNER.
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• KAHOOT!
• QUIZLET
• EDMODO
• SOCRATIVE
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• RAPIDMINER
• WEKA
• KNIME
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• IBM SPSS
• SAS ANALYTICS
• RAPID INSIGHT
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• DREAMBOX
• SMART SPARROW
• ALEKS
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• APACHE HADOOP
• AMAZON REDSHIFT
• GOOGLE BIGQUERY
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• LEARNING LOCKER
• WATERSHED LRS
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN EDUCATION
• TENSORFLOW
• KERAS
• PYTORCH
HTTPS://WWW.YOUTUBE.COM/WATCH?V=JGXP1EE4WMS
• DATA-DRIVEN MARKETING: IDENTIFYING THE MOST EFFECTIVE RECRUITMENT STRATEGIES AND CHANNELS.
• DEMAND FORECASTING: USING HISTORICAL DATA TO PREDICT DEMAND FOR COURSES AND PROGRAMS.
• COURSE SCHEDULING: OPTIMIZING COURSE SCHEDULES TO REDUCE CONFLICTS AND IMPROVE CLASSROOM
UTILIZATION.
• CURRICULUM DEVELOPMENT: ANALYZING STUDENT PERFORMANCE DATA TO IDENTIFY GAPS AND IMPROVE
CURRICULA.
OPERATIONAL EFFICIENCY IN EDUCATIONAL
INSTITUTIONS THROUGH DATA ANALYTICS
• RESOURCE MANAGEMENT
• ADMINISTRATIVE PROCESSES
• BUSINESS INTELLIGENCE (BI) TOOLS: USING BI TOOLS LIKE TABLEAU, POWER BI, AND
GOOGLE DATA STUDIO FOR DATA VISUALIZATION AND REPORTING.
• MACHINE LEARNING MODELS: IMPLEMENTING MACHINE LEARNING ALGORITHMS FOR
PREDICTIVE AND PRESCRIPTIVE ANALYTICS.
OPERATIONAL EFFICIENCY IN EDUCATIONAL
INSTITUTIONS THROUGH DATA ANALYTICS
• VISUALIZATION: CREATE DASHBOARDS AND REPORTS TO VISUALIZE KEY METRICS AND INSIGHTS.
• ACTIONABLE INSIGHTS: DERIVE ACTIONABLE INSIGHTS TO INFORM DECISION-MAKING AND
POLICY FORMULATION.
• DEFINE OBJECTIVES
• INTEGRATE DATA FROM THE LMS, SIS, AND ROOM BOOKING SYSTEMS INTO A
CENTRALIZED DATA WAREHOUSE.
EXAMPLE OF DATA ANALYTICS FOR OPERATIONAL
EFFICIENCY
• DATA CLEANING:
• DATA ANALYSIS:
• DESCRIPTIVE ANALYTICS: ANALYZE HISTORICAL DATA TO IDENTIFY TRENDS IN COURSE ENROLLMENTS, PEAK USAGE
TIMES FOR CLASSROOMS, AND COMMON SCHEDULING CONFLICTS.
• PREDICTIVE ANALYTICS: USE MACHINE LEARNING MODELS TO FORECAST FUTURE ENROLLMENT FOR EACH COURSE.
• MODEL: TRAIN A REGRESSION MODEL USING HISTORICAL ENROLLMENT DATA.
• OPTIMIZATION ALGORITHMS: APPLY OPTIMIZATION ALGORITHMS TO GENERATE AN OPTIMAL SCHEDULE.
• ALGORITHM: USE LINEAR PROGRAMMING OR OTHER OPTIMIZATION TECHNIQUES.
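The predictive step above (training a regression model on historical enrollment) can be sketched with a simple least-squares line. This is a minimal illustration, not a production model: the enrollment figures are made up, the helper `fit_line` is hypothetical, and a real forecast would likely use more features and a library such as scikit-learn.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical enrollment for one course.
years = [2020, 2021, 2022, 2023]
enrollment = [100, 110, 120, 130]

slope, intercept = fit_line(years, enrollment)

# Forecast next year's enrollment from the fitted trend.
forecast_2024 = slope * 2024 + intercept
print(round(forecast_2024))
```

The forecast per course could then feed the optimization step, for example as the seat demand that a scheduling model has to satisfy.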
EXAMPLE OF DATA ANALYTICS FOR OPERATIONAL
EFFICIENCY
• VISUALIZATION:
• ACTIONABLE INSIGHTS:
• IMPLEMENTATION:
• HTML: CREATE AN INDEX.HTML FILE THAT PROVIDES A FORM FOR INPUTTING COURSES, CLASSROOMS, AND
TIME SLOTS, WITH A BUTTON THAT TRIGGERS THE OPTIMIZATION PROCESS.
• CSS: CREATE A STYLES.CSS FILE THAT STYLES THE FORM AND RESULT DISPLAY FOR BETTER APPEARANCE.
MODULE 3
CONTENT RECOMMENDATION SYSTEM
• THIS DATA CAN COME FROM VARIOUS SOURCES LIKE SOCIAL MEDIA, STREAMING
SERVICES, BOX OFFICE SALES, AND VIEWER RATINGS.
• DATA SOURCES: SOCIAL MEDIA, STREAMING PLATFORMS, ONLINE REVIEWS, BOX OFFICE RECEIPTS,
TICKET SALES, AND AUDIENCE DEMOGRAPHICS.
• STREAMING SERVICES LIKE NETFLIX USE BIG DATA TO ANALYZE VIEWING PATTERNS AND PREFERENCES.
• THEY TRACK WHAT SHOWS ARE WATCHED, WHEN, AND FOR HOW LONG, AND USE THIS DATA TO
RECOMMEND CONTENT, DECIDE ON RENEWALS OR CANCELLATIONS, AND PLAN NEW PRODUCTIONS.
IMPACT OF DATA ANALYTICS IN ENTERTAINMENT
• DATA ANALYTICS HAS TRANSFORMED THE ENTERTAINMENT INDUSTRY BY ENABLING MORE INFORMED
DECISION-MAKING, ENHANCING USER EXPERIENCES, AND OPTIMIZING OPERATIONS.
• THE IMPACT IS EVIDENT IN AREAS SUCH AS CONTENT CREATION, MARKETING, AUDIENCE ENGAGEMENT,
AND REVENUE GENERATION.
IMPACT OF DATA ANALYTICS IN ENTERTAINMENT
• CONTENT CREATION: DATA ANALYTICS HELPS CREATORS UNDERSTAND WHAT TYPES OF CONTENT
RESONATE WITH AUDIENCES, LEADING TO MORE SUCCESSFUL PRODUCTIONS.
• DATA ANALYTICS WAS USED TO DETERMINE THE POPULARITY OF DIFFERENT TYPES OF CONTENT DURING
DIFFERENT TIMES OF THE YEAR, LEADING TO STRATEGIC SCHEDULING OF NEW RELEASES TO MAXIMIZE
VIEWERSHIP.
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN ENTERTAINMENT
• VARIOUS TOOLS AND SYSTEMS ARE EMPLOYED IN THE ENTERTAINMENT INDUSTRY TO GATHER,
PROCESS, AND ANALYZE DATA.
• THESE TOOLS RANGE FROM DATA MANAGEMENT PLATFORMS TO ADVANCED ANALYTICS AND
VISUALIZATION TOOLS.
EXISTING TOOLS AND SYSTEMS USED FOR DATA
ANALYTICS IN ENTERTAINMENT
• APACHE HADOOP: FOR PROCESSING LARGE DATA SETS ACROSS DISTRIBUTED COMPUTING
ENVIRONMENTS.
• GOOGLE ANALYTICS: FOR TRACKING AND ANALYZING WEBSITE AND APP TRAFFIC.
• TABLEAU CAN BE USED BY A MOVIE STUDIO TO CREATE DASHBOARDS THAT VISUALIZE BOX OFFICE
PERFORMANCE, AUDIENCE DEMOGRAPHICS, AND SOCIAL MEDIA SENTIMENT, HELPING EXECUTIVES MAKE
DATA-DRIVEN DECISIONS.
• DEMOGRAPHIC ANALYSIS MEANS COLLECTING AND EXAMINING FACTORS SUCH AS AGE, GENDER
IDENTITY, LOCATION, EDUCATION, AND INCOME. THIS IS USUALLY THE FIRST AND EASIEST STEP
IN ANALYZING YOUR CUSTOMER BASE AND HELPS REVEAL THE ISSUES RELEVANT TO YOUR AUDIENCE AND A
STYLE THEY CAN RELATE TO.
• HTTPS://WWW.SEMRUSH.COM/BLOG/SOCIAL-MEDIA-ANALYTICS-TOOLS/
• HTTP://VARIANCEEXPLAINED.ORG/R/TRUMP-TWEETS/
WHY USE SOCIAL MEDIA ANALYTICS TOOLS?
• 1. CUSTOMER FEEDBACK ANALYSIS – AMAZON & FLIPKART
• ANALYZING CUSTOMER REVIEWS TO DETERMINE SATISFACTION LEVELS WITH PRODUCTS.
• HELPS IN RECOMMENDING SIMILAR PRODUCTS OR IMPROVING EXISTING ONES.
• 2. SOCIAL MEDIA MONITORING – TWITTER/X & FACEBOOK
• COMPANIES MONITOR BRAND MENTIONS TO UNDERSTAND PUBLIC OPINION DURING PRODUCT LAUNCHES
OR PR EVENTS.
• GANs (Generative Adversarial Networks): Generate highly realistic images, videos, and art.
• Transformers (e.g., GPT, BERT): Generate text content, translate languages, write scripts.
• VAEs (Variational Autoencoders): Used in image generation and compression.
• RNNs and LSTMs: For generating sequences like music and speech.
EXAMPLES BY CONTENT TYPE
Content Type | Deep Learning Application | Examples
Video | Deepfake, Video Synthesis Models | Synthesia – AI avatars, virtual news anchors
MODULE 4
SEGMENTATION
• IN ANALYTICS, THIS HELPS IDENTIFY PATTERNS OR STRUCTURES (E.G., SEGMENTING STUDENTS INTO
PERFORMANCE GROUPS).
THRESHOLDING
• GLOBAL THRESHOLDING: A FIXED THRESHOLD VALUE IS APPLIED TO CONVERT DATA INTO BINARY (0
OR 1).
• ADAPTIVE THRESHOLDING: DIFFERENT THRESHOLDS ARE USED FOR DIFFERENT PARTS OF THE DATASET
OR IMAGE.
• OTSU’S METHOD: FINDS THE OPTIMAL THRESHOLD VALUE BY MINIMIZING INTRA-CLASS VARIANCE
(COMMONLY USED FOR GRAYSCALE IMAGES).
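As a rough sketch of Otsu's method as described above, the function below tries every candidate threshold and keeps the one with the smallest weighted intra-class variance, then applies it as a global threshold. The function name `otsu_threshold`, the tiny pixel list, and the brute-force search are assumptions for illustration; image libraries provide faster, histogram-based implementations.

```python
def otsu_threshold(pixels: list[int], levels: int = 256) -> int:
    """Return the threshold t that minimizes the weighted intra-class variance.

    Pixels <= t form one class and pixels > t form the other.
    Assumes every pixel value lies in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)

    best_t, best_score = 0, float("inf")
    for t in range(levels - 1):
        low, high = hist[: t + 1], hist[t + 1:]
        w0, w1 = sum(low), sum(high)
        if w0 == 0 or w1 == 0:
            continue  # one class would be empty
        mu0 = sum(i * c for i, c in enumerate(low)) / w0
        mu1 = sum((i + t + 1) * c for i, c in enumerate(high)) / w1
        var0 = sum(c * (i - mu0) ** 2 for i, c in enumerate(low)) / w0
        var1 = sum(c * ((i + t + 1) - mu1) ** 2 for i, c in enumerate(high)) / w1
        score = (w0 * var0 + w1 * var1) / total  # weighted intra-class variance
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical grayscale values: a dark cluster and a bright cluster.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
threshold = otsu_threshold(pixels)

# Global thresholding with the chosen value: convert each pixel to 0 or 1.
binary = [1 if p > threshold else 0 for p in pixels]
print(threshold, binary)
```

For this sample, the chosen threshold falls between the dark and bright clusters, so the dark pixels map to 0 and the bright pixels map to 1.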
EXAMPLE USE CASES
• A METHOD USED TO IDENTIFY AND LABEL CONNECTED REGIONS IN BINARY DATA (LIKE GROUPING
PIXELS OF THE SAME COLOR).
• AGGLOMERATIVE: START WITH INDIVIDUAL ELEMENTS AND GROUP THEM STEP-BY-STEP (BOTTOM-UP
APPROACH).
• DIVISIVE: START WITH THE ENTIRE DATASET AND SPLIT IT RECURSIVELY (TOP-DOWN APPROACH).
• USES IF-ELSE RULES TO SEGMENT DATA BASED ON SPECIFIC CONDITIONS (E.G., AGE > 25 AND SALARY
< 50K).
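The rule-based approach above translates directly into if-else code. The sketch below uses the condition from the example (age > 25 and salary < 50K); the segment labels, the extra rules for the remaining cases, and the sample customers are hypothetical.

```python
def segment(customer: dict) -> str:
    """Assign a segment label using explicit if-else rules."""
    if customer["age"] > 25 and customer["salary"] < 50_000:
        return "over_25_lower_income"
    elif customer["age"] > 25:
        return "over_25_higher_income"
    else:
        return "25_or_under"

# Hypothetical customer records.
customers = [
    {"name": "A", "age": 30, "salary": 40_000},
    {"name": "B", "age": 42, "salary": 75_000},
    {"name": "C", "age": 22, "salary": 30_000},
]

segments = {c["name"]: segment(c) for c in customers}
print(segments)
```

Because the rules are explicit, each assignment is easy to explain, which is the main appeal of this kind of segmentation compared with learned clusters.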
EXAMPLE USE CASES
• DIVIDES (SPLITS) DATA OR AN IMAGE INTO REGIONS RECURSIVELY AND THEN MERGES SIMILAR
REGIONS.
• COMMONLY USED IN IMAGE ANALYSIS WHERE REGIONS WITH SIMILAR PROPERTIES ARE GROUPED
TOGETHER.
MOTION-BASED SEGMENTATION
• CANNY EDGE DETECTION: MULTI-STEP PROCESS TO DETECT EDGES WITH NOISE FILTERING.
• HOUGH TRANSFORM: DETECTS STRAIGHT LINES BY MAPPING EDGE POINTS INTO A POLAR-COORDINATE PARAMETER SPACE AND LOOKING FOR PEAKS.
• USES NEURAL NETWORKS TO AUTOMATICALLY DETECT AND SEGMENT COMPLEX PATTERNS IN IMAGES
OR VIDEOS.
• MASK R-CNN: DETECTS OBJECTS AND CREATES PIXEL-WISE MASKS FOR EACH OBJECT.
EXAMPLE USE CASES
• HOSPITALITY: DETECTING OBJECTS (E.G., BEDS OR FURNITURE) IN ROOM IMAGES FOR INVENTORY
MANAGEMENT.
REVENUE PER AVAILABLE ROOM (REVPAR)
• REVPAR MEASURES THE AVERAGE REVENUE GENERATED FROM ALL AVAILABLE ROOMS, WHETHER
OCCUPIED OR NOT.
• IT REFLECTS BOTH OCCUPANCY RATE AND ROOM PRICING, MAKING IT A CRITICAL METRIC TO ASSESS
THE OVERALL FINANCIAL HEALTH OF A HOTEL.
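RevPAR is commonly computed either as room revenue divided by available rooms, or as average daily rate (ADR) multiplied by occupancy rate; the two give the same number. The short sketch below shows both, using made-up figures and hypothetical function names.

```python
def revpar(room_revenue: float, available_rooms: int) -> float:
    """RevPAR = total room revenue / number of available rooms."""
    return room_revenue / available_rooms

def revpar_from_adr(adr: float, occupancy_rate: float) -> float:
    """RevPAR = average daily rate * occupancy rate (as a fraction, 0 to 1)."""
    return adr * occupancy_rate

# Hypothetical night: 100 rooms available, 80 sold, 8000 in room revenue.
available, sold, revenue = 100, 80, 8000.0

adr = revenue / sold          # average rate paid per occupied room
occupancy = sold / available  # share of available rooms that were sold

direct = revpar(revenue, available)
via_adr = revpar_from_adr(adr, occupancy)
print(direct, via_adr)
```

The second form makes the point in the bullet above explicit: RevPAR rises if either the price per room or the occupancy rate rises.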
MODULE 5
BUSINESS INTELLIGENCE AND ITS FORMATION
• BUSINESS INTELLIGENCE (BI) REFERS TO THE PROCESSES AND TOOLS USED TO ANALYZE DATA AND
PRESENT ACTIONABLE INSIGHTS FOR DECISION-MAKING.
• STEPS IN BI FORMATION:
• DATA COLLECTION: FROM INTERNAL (PMS, CRM) AND EXTERNAL SOURCES (MARKET TRENDS).
• DATA CLEANING & INTEGRATION: REMOVING INCONSISTENCIES AND MERGING DATASETS.
• DATA ANALYSIS: APPLYING STATISTICAL AND PREDICTIVE TECHNIQUES TO EXTRACT INSIGHTS.
• REPORT GENERATION & VISUALIZATION: PRESENTING INSIGHTS USING DASHBOARDS AND REPORTS.
EXAMPLE USE CASES
• A HOTEL CHAIN COLLECTS BOOKING DATA FROM MULTIPLE PROPERTIES, INTEGRATES IT INTO A
CENTRALIZED SYSTEM, AND CREATES DASHBOARDS TO VISUALIZE REVENUE TRENDS, SEASONAL
BOOKING PATTERNS, AND GUEST PREFERENCES. THIS ALLOWS MANAGEMENT TO ADJUST ROOM
PRICES DYNAMICALLY AND OPTIMIZE MARKETING STRATEGIES.
IMPACT OF DATA ANALYTICS ON BUSINESS
INTELLIGENCE
• FASTER REPORTING: AUTOMATED ANALYTICS TOOLS ENHANCE THE SPEED OF BI, REDUCING
DEPENDENCY ON MANUAL DATA HANDLING.
• A RESTAURANT COLLECTS DATA THROUGH ITS POINT-OF-SALE (POS) SYSTEM, CAPTURING DETAILS
LIKE ORDER HISTORY, PEAK HOURS, AND POPULAR MENU ITEMS. IT ALSO GATHERS CUSTOMER
FEEDBACK THROUGH ONLINE REVIEWS, HELPING IDENTIFY AREAS FOR IMPROVEMENT.
DATA VISUALIZATION TECHNIQUES FOR EFFECTIVE
COMMUNICATION
• DATA VISUALIZATION HELPS CONVERT RAW DATA INTO MEANINGFUL VISUALS FOR EASIER
UNDERSTANDING AND DECISION-MAKING.
• COMMON TECHNIQUES:
• BAR CHARTS & LINE GRAPHS: FOR TREND ANALYSIS.
• HEAT MAPS: FOR COMPARING METRICS ACROSS MULTIPLE DIMENSIONS.
• PIE CHARTS: FOR SHOWING PROPORTIONS.
• DASHBOARDS: INTERACTIVE DASHBOARDS PROVIDE REAL-TIME DATA INSIGHTS.
• TOOLS: TABLEAU, POWER BI, GOOGLE DATA STUDIO.
EXAMPLE USE CASES
• A THEME PARK USES A DASHBOARD WITH HEATMAPS TO VISUALIZE VISITOR DENSITY ACROSS
DIFFERENT ATTRACTIONS. THE PARK MANAGEMENT USES THESE INSIGHTS TO ADJUST STAFF
ALLOCATIONS AND REDUCE WAIT TIMES AT POPULAR RIDES.
CHALLENGES IN IMPLEMENTING A BI PROGRAM
• DATA QUALITY ISSUES: POOR OR INCOMPLETE DATA CAN LEAD TO INACCURATE INSIGHTS.
• INTEGRATION PROBLEMS: COMBINING DATA FROM DIFFERENT SYSTEMS IS COMPLEX.
• USER RESISTANCE: EMPLOYEES MAY RESIST USING NEW BI TOOLS WITHOUT PROPER TRAINING.
• HIGH COSTS: INITIAL SETUP AND LICENSING FOR BI TOOLS CAN BE EXPENSIVE.
• SECURITY RISKS: DATA LEAKS AND BREACHES ARE POTENTIAL CONCERNS IF GOVERNANCE ISN’T
STRONG.
EXAMPLE USE CASES
• A HOSPITAL FACES INTEGRATION CHALLENGES WHEN MERGING PATIENT RECORDS FROM DIFFERENT
DEPARTMENTS INTO A SINGLE BI SYSTEM. THIS REQUIRES ADDITIONAL EFFORT IN DATA CLEANING
AND MAPPING TO ENSURE SMOOTH OPERATIONS AND REPORTING.
DATA GOVERNANCE AND SECURITY
• A BANK IMPLEMENTS A DATA GOVERNANCE POLICY TO ENSURE CUSTOMER DATA IS ENCRYPTED AND
ONLY ACCESSIBLE TO AUTHORIZED PERSONNEL. REGULAR AUDITS AND COMPLIANCE CHECKS ENSURE
THAT THE BANK FOLLOWS REGULATIONS LIKE GDPR.
FUTURE OF DATA ANALYTICS AND BUSINESS
INTELLIGENCE
• AI-DRIVEN BI: FUTURE BI SYSTEMS WILL BE MORE PREDICTIVE AND PRESCRIPTIVE, LEVERAGING
AI AND MACHINE LEARNING.
• REAL-TIME ANALYTICS: BI WILL INCREASINGLY OFFER REAL-TIME INSIGHTS TO RESPOND TO
DYNAMIC MARKET CONDITIONS.
• NATURAL LANGUAGE PROCESSING (NLP): USERS WILL INTERACT WITH BI TOOLS USING VOICE
COMMANDS OR NATURAL LANGUAGE QUERIES.
• SELF-SERVICE BI: BUSINESS USERS WILL GAIN THE ABILITY TO CREATE THEIR OWN REPORTS AND
DASHBOARDS WITHOUT TECHNICAL EXPERTISE.
• EDGE ANALYTICS: DATA ANALYSIS WILL SHIFT CLOSER TO THE DATA SOURCE (E.G., IOT DEVICES),
REDUCING LATENCY AND ENHANCING EFFICIENCY.
EXAMPLE USE CASES