Assignment

Uploaded by

Sami Ullah

Name

Noshaba Muneer

Registration No

Math 211101076(6B)

Course:

Data Science

Khawaja Fareed University of Engineering and Information Technology


What is Data Science?

• Data Science is a blend of various tools, algorithms, and machine learning principles with
the goal of discovering hidden patterns in raw data. But how is this different from
what statisticians have been doing for years?
• Data Science is primarily used to make decisions and predictions by means of
predictive causal analytics, prescriptive analytics (predictive plus decision science), and
machine learning.

What is Big Data?

Big data refers to extremely large and diverse collections of structured, unstructured, and
semistructured data that continue to grow exponentially over time. Let’s break down what this
means:

Three V’s of Big Data:

1. Volume: Big data encompasses a vast volume of information. It’s not just a few
gigabytes; we’re talking about terabytes, petabytes, or even exabytes of data.

2. Velocity: Data is generated at an incredibly high speed. Think of social media posts,
sensor data, financial transactions, and more pouring in rapidly.

3. Variety: Big data comes in various formats: text, images, videos, logs, sensor readings,
and more. It’s not neatly organized; it’s a mix of structured and unstructured data.
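
To give a feel for the scales named under Volume, the short sketch below converts a byte count into the largest fitting unit. It assumes decimal (SI) units, where each step up is a factor of 1000; the function name `human_size` and the sample values are illustrative and not part of the original notes.

```python
# Decimal (SI) units, from bytes up to exabytes.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_size(num_bytes: float) -> str:
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at the largest unit we know about.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000

print(human_size(5 * 10**9))   # a few gigabytes
print(human_size(2 * 10**15))  # petabyte scale
```

Each step from gigabytes to terabytes to petabytes multiplies the amount of data by a thousand, which is why big data usually calls for different storage and processing tools than ordinary files do.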

Sources of Big Data

• Social Media: Comments, posts, tweets, and interactions on platforms like Facebook,
Twitter, and Instagram.
• Internet of Things (IoT): Data from connected devices, sensors, wearables, and smart
appliances.
• Business Transactions: Sales records, customer orders, invoices, and financial data.
• Web Logs: Information from web servers, user behavior, and website visits.
• Scientific Research: Genomic data, climate models, and simulations.
• Healthcare: Electronic health records, medical imaging, and patient data.

Challenges and Opportunities:

• Handling Complexity: Big data is messy, and separating valuable insights from noise
can be challenging.
• Storage and Processing: Storing and analyzing massive datasets requires specialized
infrastructure and tools.
• Data Privacy and Security: Protecting sensitive information is crucial.
• Business Insights: Big data analytics can reveal patterns, trends, and correlations that
drive business decisions.

Use Cases

• Personalization: Recommending products, movies, or music based on user preferences.
• Healthcare: Predictive analytics for disease outbreaks, personalized treatments, and drug
discovery.
• Finance: Fraud detection, risk assessment, and algorithmic trading.
• Smart Cities: Optimizing traffic flow, energy consumption, and waste management.
• Scientific Research: Climate modeling, particle physics, and genomics.

Data science vs. Data engineering:

• Data science is the computational science of extracting meaningful insights from raw data
and then effectively communicating those insights to generate value.
• Data engineering, on the other hand, is an engineering domain dedicated to building
and maintaining systems that overcome data processing and data handling problems for
applications that consume, process, and store large volumes, varieties, and velocities of
data.
• Data science focuses on extracting insights from data using statistical and machine
learning techniques, while data engineering involves designing and maintaining the
infrastructure to collect, store, and process data efficiently.
• Both roles are crucial for successful data-driven decision-making within an organization.
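
The division of labor described above can be sketched in a few lines. This is only a toy illustration using the Python standard library: the sample rows, the `ingest` function (standing in for the engineering side), and the `average_by_region` function (standing in for the science side) are all hypothetical names and data chosen for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw input, as it might arrive from a transaction log.
RAW_ROWS = ["north,120.50", "south,80.00", "north,not_a_number", "south,95.25", ""]

def ingest(rows):
    """Engineering side: parse and validate rows, dropping malformed ones."""
    clean = []
    for row in rows:
        parts = row.split(",")
        if len(parts) != 2:
            continue  # wrong number of fields
        region, amount = parts
        try:
            clean.append((region.strip(), float(amount)))
        except ValueError:
            continue  # amount is not numeric
    return clean

def average_by_region(records):
    """Science side: summarize the cleaned records into an insight."""
    grouped = defaultdict(list)
    for region, amount in records:
        grouped[region].append(amount)
    return {region: mean(values) for region, values in grouped.items()}

records = ingest(RAW_ROWS)
print(average_by_region(records))
```

In a real organization the engineering step would be a pipeline feeding a database or data lake rather than a single function, but the hand-off is the same: analysis can only be trusted once the data has been collected and cleaned reliably.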

Who can use data science?

• You
• Your organization
• Your employer
• Anyone who has a bit of understanding and training can begin using data insights to
improve their lives, their careers, and the well-being of their businesses.

Structured vs. Unstructured:

Structured data is data that is categorized and stored in a file according to a particular
format description, whereas unstructured data is free-form content that follows no fixed
format and takes many forms. Examples of unstructured data include:

• Website links
• Emails
• Twitter responses
• Product reviews
• Pictures/images
• Written text on various platforms
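
The contrast between the two kinds of data can be seen in a small sketch. It assumes a made-up CSV table of product ratings as the structured example and a made-up product review as the unstructured one; none of the sample data comes from the original notes.

```python
import csv
import io

# Structured: every row follows the same declared format (the header row).
structured = io.StringIO("product,rating\nkettle,4\ntoaster,2\n")
rows = list(csv.DictReader(structured))
ratings = {row["product"]: int(row["rating"]) for row in rows}

# Unstructured: a free-form product review with no fixed fields.
review = "Bought the toaster last week. It burns everything, not happy."
# Without a declared format, even a simple question needs ad hoc text handling.
mentions_toaster = "toaster" in review.lower()

print(ratings)
print(mentions_toaster)
```

The structured rows can be queried by field name right away, while the review has to be searched or interpreted as raw text before it yields any answer.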
