The document provides an introduction to data science, defining it as a combination of statistics, data analysis, and machine learning to extract insights from data. It outlines the role of data scientists, prerequisites for entering the field, and various applications of data science across industries. Additionally, it discusses the evolution of data, tools used in data science, and the challenges faced by professionals in the field.
Introduction to Data Science
INTRODUCTION TO DATA SCIENCE
“DATA RULES THE WORLD”
Agenda
• What is Data Science?
• Who is a Data Scientist? – Types of Job Roles
• Prerequisites for Data Science
• Real-time Environment
• How Does Data Science Work?
• Tools for Data Science
• Applications, Advantages, Challenges
• What is Data? – Types of Data

What is Data Science?
• Data science combines multiple disciplines:
  1. Statistics
  2. Data analysis
  3. Machine learning
• It uses them to analyze data and to extract knowledge and insights from it.

Why Data Science?
• Glassdoor has ranked data science as its top profession.
• High-paying job role.
• Impactful problem solving.
• Example – a commercial business that wants to maximize its sales.

Who is a Data Scientist?
• Data scientist is one of the most in-demand jobs of 2024.
• A data scientist is a professional who works with enormous amounts of data to produce compelling business insights by applying various tools, techniques, methodologies, and algorithms.

Real-time Environment
• Data science can be applied in nearly every part of a business where data is available.
• Some examples:
  – Stock markets
  – Industry
  – Politics
  – Logistics companies
  – E-commerce
  – Healthcare

Data, Algorithm, Insight
• Data – e.g., names, ages, test scores, and extracurricular activities.
• Algorithm – step-by-step instructions.
• Insight – a meaningful understanding of the result of a data analysis.

What is Data?
• Data can be text, observations, figures, images, numbers, graphs, or symbols.
• For example:
  – Weights
  – Addresses
  – Ages
  – Names
  – Temperatures
  – Dates
  – Distances
• Data is a raw form of knowledge and, on its own, doesn't carry any significance or purpose.

Types of Data
• Two types of data:
  – Categorical data
    Examples – marital status, hair color, political party
  – Numerical data
    Discrete
    Continuous

Data Evolution
• Example – Retail Industry
• (Pre-2000) Retail businesses primarily relied on traditional data sources, such as sales records and customer feedback forms.
• (Mid-2000s) With the rise of e-commerce and the internet, retailers started to collect vast amounts of data from online transactions.
• (Mid-2000s) Power of data analytics
• (Late 2000s) Personalization and customer experience
• (Present) Machine learning and AI

Data Evolution
• 1920s–1950s – Statistics: work with data relied on statistics across multiple domains (e.g., agriculture, biology, social science).
• 1960s – Computers: organizations and governments started using computers to store and analyze data.
• 1970s – Databases: the DBMS and the concept of data warehousing allowed more efficient storage and retrieval of large datasets.

Prerequisites for Data Science
• Non-Technical Skills
  – Curiosity
  – Critical thinking
  – Communication skills
• Technical Skills
  – Programming language (Python or R)
  – Machine learning
  – Statistics
  – Databases

How Does Data Science Work?
• Identify the business problem.
• Ask the right questions to understand the problem.
• Explore and collect the data related to that problem.
• Transform the data into a standardized format.
• Clean the data: remove erroneous values.
• Handle missing values with statistical techniques.
• Analyze the data with data visualization.
• Build a model for the specific dataset.
• Deploy the model and use it to make decisions about the business problem.
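The middle steps of the workflow above (transform, clean, handle missing values, analyze, build a model) can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method. The advertising-spend and sales figures are made up, the helper names (`standardize`, `clean`, `impute_median`, `fit_line`) are hypothetical, median imputation stands in for "a statistical technique", and a straight-line least-squares fit stands in for "the model". Only the standard library is used.

```python
from statistics import mean, median

# Hypothetical raw records for a business that wants to maximize sales.
# Values arrive as messy strings; all numbers are invented for illustration.
raw_records = [
    {"ad_spend": "10", "sales": "25"},
    {"ad_spend": " 20 ", "sales": "45"},
    {"ad_spend": "30", "sales": None},    # missing value
    {"ad_spend": "40", "sales": "85"},
    {"ad_spend": "-5", "sales": "15"},    # erroneous value: negative spend
    {"ad_spend": "50", "sales": "105"},
]


def to_number(value):
    """Convert a raw field to float, or None when it is missing."""
    if value is None or str(value).strip() == "":
        return None
    return float(str(value).strip())


def standardize(records):
    """Transform step: bring every field into one numeric format."""
    return [{key: to_number(val) for key, val in rec.items()} for rec in records]


def clean(records):
    """Clean step: drop rows whose ad spend is missing or negative."""
    return [r for r in records if r["ad_spend"] is not None and r["ad_spend"] >= 0]


def impute_median(records, column):
    """Missing-value step: fill gaps in `column` with the column median."""
    fill = median(r[column] for r in records if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def fit_line(xs, ys):
    """Model step: ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


rows = impute_median(clean(standardize(raw_records)), "sales")
xs = [r["ad_spend"] for r in rows]
ys = [r["sales"] for r in rows]

# Analyze step: a quick numeric summary (a chart would normally go here).
print("rows kept:", len(rows), "| mean sales:", mean(ys))

slope, intercept = fit_line(xs, ys)
predicted = slope * 60 + intercept  # estimated sales at a hypothetical spend of 60
print("slope:", slope, "| intercept:", intercept, "| prediction:", predicted)
```

In practice, libraries such as pandas and scikit-learn would usually replace these hand-written helpers, and the analysis step would use plots rather than a printed summary.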
Tools for Data Science
• Jupyter Notebook
• Python IDE
• RStudio
• Microsoft Power BI
• Tableau
• SQL
• Excel

Challenges
• Finding and accessing the right data
• Data quality and cleaning
• Handling large volumes of data
• Balancing data security
• Communicating results to non-technical stakeholders

Applications
• Search engine – Google
• Transport – Uber
• Finance – credit card fraud detection
• E-commerce – Amazon
• Healthcare – early disease detection
• Autocomplete – email, text editors
• Image recognition – Facebook

Road-Map to Data Science
• Domain Knowledge
• Mathematics Foundation
  » Statistics and probability
• Computer Science
  » Programming language – Python or R
  » Databases – SQL and MongoDB
  » Machine learning
  » Deep learning
• Communication
  » Data visualization – Tableau, Power BI dashboards