
AD3491-Unit 1

This document provides an introduction to data science, covering the need for data science, characteristics of big data, and the data science process. It defines key concepts such as structured and unstructured data, data repositories, and the benefits and applications of data science across various sectors. Additionally, it outlines data science tools and real-time applications, emphasizing the importance of data in modern analysis and decision-making.

Uploaded by

Mahalakshmi S


AD3491 FUNDAMENTALS OF DATA SCIENCE AND ANALYTICS UNIT 1

UNIT I – INTRODUCTION TO DATA SCIENCE

SYLLABUS:
Need for data science – benefits and uses – facets of data – data science
process – setting the research goal – retrieving data – cleansing,
integrating, and transforming data – exploratory data analysis – build the
models – presenting and building applications.

PART A
1. What is Big Data?
 Big data is data of such huge volume, high velocity, and wide variety that it
cannot be processed by traditional processing systems.
 It is characterized by the 7 Vs: velocity, variety, volume, variability,
visualization, value, and veracity.

2. What are the characteristics of Big Data?

 Velocity - refers to the speed at which data is generated and processed.
 Volume - refers to the amount of data.
 Value - refers to the benefits that the organization derives from the data.
 Variety - refers to the different types of data.
 Veracity - refers to the accuracy of the data.
 Validity - refers to the relevance of the data for the intended purpose.
 Volatility - refers to how frequently the data changes.
 Visualization - refers to showing the insights generated from big data
through visual representations such as charts and graphs.

3. Define Data Science.

 Data science is the field of study of data that uses modern scientific techniques,
statistical methods, and algorithms to derive insights from huge volumes of data
and to create business and IT strategies.
 It deals with where the data comes from, what it represents, and the ways in
which it can be transformed into valuable inputs and resources.

4. What are the benefits and uses of Big Data?

Big data and data science are used by:
 Commercial companies
 Human resource professionals
 Financial institutions
 Governmental organizations
 Nongovernmental organizations (NGOs)
 Universities

PREPARED BY: Mrs.S.MAHALAKSHMI AP/AI&DS 1


5. List out the Facets of data.

The facets of data are categorized as follows:
 Structured
 Unstructured
 Natural language
 Machine-generated
 Graph-based
 Audio, video, and images
 Streaming

6. Define Structured data.

 Structured data is data that depends on a data model and resides in a fixed
field within a record.
 It’s easy to store structured data in tables within databases or Excel files.
 SQL, or Structured Query Language, is the preferred way to manage and
query data that resides in databases.
 Example: Excel files

7. Define unstructured data

 Unstructured data is data that isn’t easy to fit into a data model because the
content is context-specific or varying.
 Example: Email

8. What is Machine Generated Data?

 Machine-generated data is information that’s automatically created by a
computer, process, application, or other machine without human intervention.
 The analysis of machine data relies on highly scalable tools, due to its high
volume and speed.
 Examples: web server logs, call detail records, network event logs, and telemetry

9. What is Streaming Data?

 Streaming data flows into the system continuously as events
happen, instead of being loaded into a data store in a batch.
 Examples - “What’s trending” on Twitter, live sporting or music events, and the
stock market.

10. Define graph-based or network data.

 “Graph” here refers to mathematical graph theory.
 In graph theory, a graph is a mathematical structure used to model pairwise
relationships between objects.
 Graph or network data is data that focuses on the relationship or
adjacency of objects.
 Graph structures use nodes, edges, and properties to represent and store
graph data.
 Graph databases are used to store graph-based data and are queried with
specialized query languages such as SPARQL.
 Example: social media websites
o For instance, on LinkedIn you can see who you know at which
company.
o Your follower list on Twitter is another example of graph-based data.

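11. List out the steps in the data science process.

The data science process typically consists of six steps:
1. Setting the research goal
2. Retrieving data
3. Data preparation
4. Data exploration
5. Data modeling
6. Presentation and automation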

12. What is meant by a project charter?

 All the information related to the research goal is best collected in a project
charter.
 A project charter requires teamwork, and the input covers at least the following:
o A clear research goal
o The project mission and context
o How to perform analysis
o What resources to use
o Proof that it’s an achievable project, or proof of concepts
o Deliverables and a measure of success
o A timeline

13. How is data retrieved in the data science process?

 The second step is to collect data by finding suitable data and getting
access to the data from the data owner.
 Data can also be delivered by third-party companies and take many forms
ranging from Excel spreadsheets to different types of databases.
 The result is data in its raw form, which probably needs polishing and
transformation before it becomes usable.

14. What is a data repository?

 A data repository is also known as a data library or data archive.
 A data repository is a large database infrastructure, made up of several databases,
that collects, manages, and stores data sets for data analysis, sharing, and
reporting.
 Examples: database, data warehouse, data mart, data lake.

15. Differentiate between a data warehouse and a data mart.

Aspect        | Data Warehouse                                          | Data Mart
Data stored   | A large amount of data collected from different sources | Only the specific data from the data warehouse that the company requires for analysis
Focus         | All departments in an organization                      | A specific group
Design        | The design process is complicated                       | Easy to design
Data handling | Takes a long time                                       | Takes a short time
Size          | 100 GB to 1 TB+                                         | Less than 100 GB

16. Define Data Lake.

 A data lake is a large data repository that stores unstructured data that is
classified and tagged with metadata.

17. What is Exploratory Data Analysis (EDA)?

 Data exploration is concerned with building a deeper understanding of the data
to know how variables interact with each other, the distribution of the data, and
whether there are outliers.


18. Define Data Modeling.

 Building a model is an iterative process that involves selecting the variables for
the model, executing the model, and model diagnostics.
 Model building consists of the following main steps:
o Selection of a modeling technique and variables to enter in the model
o Execution of the model
o Diagnosis and model comparison

19. Define the brushing and linking technique.

 With brushing and linking, you can combine and link different graphs and tables
so changes in one graph are automatically transferred to the other
graphs.

20. What are a histogram and a boxplot?

 In a histogram, a variable is cut into discrete categories, and the number of
occurrences in each category is summed up and shown in the graph.
 A boxplot doesn’t show how many observations are present but does offer an
impression of the distribution within categories.
 It can show the maximum, minimum, median, and other characterizing
measures at the same time.
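
As a rough illustration of the measures a boxplot summarizes, the sketch below computes a five-number summary with Python's standard library. The `ages` list is made-up sample data, and the "inclusive" quartile method is only one of several conventions in use.

```python
import statistics

# Hypothetical sample data, used only for illustration.
ages = [21, 23, 24, 25, 27, 29, 30, 34, 41]

# statistics.quantiles with n=4 returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

# The values a boxplot typically draws: whisker ends, box edges, and median line.
five_number_summary = {
    "minimum": min(ages),
    "q1": q1,
    "median": median,
    "q3": q3,
    "maximum": max(ages),
}
print(five_number_summary)
```

Note that the summary says nothing about how many observations there are, which matches the point made above about boxplots.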

21. Define the presentation and automation step in the data science process.

 The final step is presenting the results to the business.
 These results can take many forms, ranging from presentations to research reports.
 Sometimes the execution of the process needs to be automated because the business will
use the insights gained in another project or enable an operational process to use the
outcome from the model.

22. Discuss the three sub-phases of data preparation.

 Data preparation transforms the data from a raw form into data that’s directly usable in
your models.
 This phase consists of three sub-phases:

i) Data cleansing removes false values from a data source and
inconsistencies across data sources.
ii) Data integration enriches data sources by combining information from
multiple data sources.
iii) Data transformation ensures that the data is in a suitable format for use in
your models.

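23. Define common errors that occur during cleansing data.

 Data entry errors
 Redundant whitespace
 Impossible values (caught with sanity checks)
 Outliers
 Missing values
 Deviations from a code book
 Different units of measurement
 Different levels of aggregation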

24. Define outlier.

 An outlier is an observation that seems to be distant from other observations or,
more specifically, one observation that follows a different logic or generative
process than the other observations.
 The easiest way to find outliers is to use a plot or a table with the minimum
and maximum values.

PART B
1. Give the description about data science and its applications, also
discuss the benefits and uses of Data Science and Big Data.

Contents
 Big Data
 Data Science
 Benefits and Uses:
1. Commercial Companies
2. Human Resource Professionals
3. Financial Institutions
4. Government Organizations
5. Non-governmental organizations (NGOs)
6. Universities
 Data Science Tools
 Real Time Applications of Data Science

Data
 Data is a collection of discrete values that convey information,
describing quantity, quality, facts, and statistics.

Big data
 Big data is data of such huge volume, high velocity, and wide variety that it
cannot be processed by traditional processing systems.
 It is characterized by the 7 Vs: velocity, variety, volume,
variability, visualization, value, and veracity.

Data science
 Data science is the field of study of data that uses modern scientific
techniques, statistical methods, and algorithms to derive insights from
huge volumes of data and to create business and IT strategies.
 It deals with where the data comes from, what it represents, and the ways
in which it can be transformed into valuable inputs and resources.

Benefits and uses of data science

1. Commercial Companies
 Commercial companies use data science to gain insights into their
customers, processes, staff, competition, and products.
 Many companies use data science to offer customers a better user
experience, cross-sell, up-sell, and personalize their offerings.
 Example:
o Google AdSense - collects data from internet users so relevant commercial
messages can be matched to the person browsing the internet.
o MaxPoint - example of real-time personalized advertising.
2. Human Resource Professionals
 Human resource professionals use people analytics and text mining to
screen candidates, monitor the mood of employees, and study informal
networks among co-workers.
3. Financial Institutions
 Financial institutions use data science to predict stock markets, determine the
risk of lending money, and learn how to attract new clients for their services.
4. Government Organizations
 Governmental organizations are also aware of data’s value.
 Example:
o Data.gov is the home of the US Government’s open data.
5. Non-governmental organizations (NGOs)
 Non-governmental organizations (NGOs) use it to raise money and
defend their causes.
 Example:
o The World Wildlife Fund (WWF) employs data scientists to increase the
effectiveness of its fundraising efforts.
o DataKind is a data scientist group that devotes its time to the benefit
of mankind.
6. Universities
 Universities use data science in their research to enhance the study
experience of their students.
 Example:
o The rise of massive open online courses (MOOCs) produces a lot of data
that universities can study.
o Examples of MOOC providers: Coursera, Udacity, and edX.

Data Science Tools

1. SAS - statistical analysis
2. Apache Spark - batch processing and stream processing
3. BigML - machine learning algorithms
4. MATLAB - mathematical and numerical computing
5. Tableau - data visualization software
6. Jupyter - notebooks for writing and running code, for example in Python
7. Matplotlib - library for plotting and visualization in Python
8. NLTK - natural language processing
9. TensorFlow - machine learning library
10. NumPy - numerical Python for data analysis
11. SciPy - scientific Python for scientific and technical computations
12. pandas - used for data analysis

Real Time Applications of Data Science

 Fraud and Risk Detection
 Healthcare
o Medical Image Analysis
o Medical Drug Development
o Virtual Assistance for patients and customer support
 Internet Search
 Targeted Advertising
 Website Recommendation
 Speech Recognition
 Gaming
 Augmented Reality
 Robotics

2. List and explain the facets of data or different types of data or categories of data.

Contents
1. Structured
2. Unstructured
3. Natural Language
4. Machine-generated
5. Graph-based
6. Audio, video, and images
7. Streaming

 Categories of data:
1. Structured data
 Structured data is data that depends on a data model and resides in a fixed
field within a record.
 It’s easy to store structured data in tables within databases or Excel files.

 SQL, or Structured Query Language, is the preferred way to manage and
query data that resides in databases.
Example: Refer Figure 1.1
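
To make the idea of fixed fields and SQL querying concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `employees` table, its columns, and the rows are hypothetical and are not taken from Figure 1.1.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (name, age) VALUES (?, ?)",
    [("Asha", 31), ("Ravi", 45), ("Meena", 28)],
)

# Because every record has the same fixed fields, a query can filter on them directly.
rows = conn.execute(
    "SELECT name, age FROM employees WHERE age > ? ORDER BY age", (30,)
).fetchall()
print(rows)
conn.close()
```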

2. Unstructured data
 Unstructured data is data that isn’t easy to fit into a data model because the
content is context-specific or varying.
 Example - regular email. (Figure 1.2).

 In Figure 1.2, although the email contains structured elements such as the sender, title, and
body text, it’s a challenge to find the number of people who have written an
email complaint about a specific employee, because so many ways exist to
refer to a person.

3. Natural language
 Natural language is a special type of unstructured data; it’s challenging to
process because it requires knowledge of specific data science techniques and
linguistics.
 The natural language processing community has had success in entity recognition,
topic recognition, summarization, text completion, and sentiment analysis, but
models trained in one domain don’t generalize well to other domains.
4. Machine-generated data
 Machine-generated data is information that’s automatically created by a
computer, process, application, or other machine without human intervention.
 The analysis of machine data relies on highly scalable tools, due to its high
volume and speed.
 Examples - web server logs, call detail records, network event logs, and telemetry
(Figure 1.3).

 The machine data in figure 1.3 would fit nicely in a classic table-structured
database.
 This isn’t the best approach for highly interconnected or “networked” data,
where the relationships between entities have a valuable role to play.
5. Graph-based or network data
 “Graph” here refers to mathematical graph theory.
 In graph theory, a graph is a mathematical structure used to model pairwise
relationships between objects.
 Graph or network data is data that focuses on the relationship or
adjacency of objects.
 Graph structures use nodes, edges, and properties to represent and store
graph data.
 Graph-based data is a natural way to represent social networks, and its
structure allows you to calculate specific metrics such as the influence of a person
and the shortest path between two people.
 Example: graph-based data can be found on many social media websites, such
as the follower list on Twitter (figure 1.4).

 Graph databases are used to store graph-based data and are queried with
specialized query languages such as SPARQL.
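
The "shortest path between two people" metric mentioned above can be sketched in plain Python with a breadth-first search over an adjacency list. The names and connections are invented for illustration, and a real graph database would answer this through its own query language rather than code like this.

```python
from collections import deque

# Hypothetical "who knows whom" network stored as an adjacency list.
network = {
    "Anu": ["Bala", "Chitra"],
    "Bala": ["Anu", "Deepa"],
    "Chitra": ["Anu", "Deepa"],
    "Deepa": ["Bala", "Chitra", "Elan"],
    "Elan": ["Deepa"],
}

def shortest_path(graph, start, goal):
    """Breadth-first search: return one shortest path as a list of nodes, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

path = shortest_path(network, "Anu", "Elan")
print(path)
```

Nodes are the people, edges are the "knows" relationships, and the length of the returned path minus one is the number of hops between the two people.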
6. Audio, image, and video
 Audio, image, and video are data types that pose specific challenges to a
data scientist.
 Tasks that are trivial for humans, such as recognizing objects in pictures, turn
out to be challenging for computers.
 High-speed cameras at stadiums will capture ball and athlete movements to
calculate in real time, for example, the path taken by a defender relative to
two baselines.
 Recently a company called DeepMind succeeded at creating an algorithm
that’s capable of learning how to play video games.
 This algorithm takes the video screen as input and learns to interpret everything
via a complex process of deep learning.
 This prompted Google to buy the company for their own Artificial Intelligence
(AI) development plans.
7. Streaming data
 Streaming data flows into the system continuously as events happen,
instead of being loaded into a data store in a batch.
 Examples - “What’s trending” on Twitter, live sporting or music events, and
the stock market.
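
The difference between streaming and batch handling can be sketched as follows. The generator `event_stream` is a hypothetical stand-in for a live feed, and the prices are made up; the point is only that a statistic is updated as each event arrives.

```python
def event_stream():
    """Hypothetical stand-in for a live feed; yields one event at a time."""
    for price in [101.0, 102.5, 99.0, 103.5]:
        yield price

# Streaming style: update a running statistic as each event arrives,
# without first collecting all events into a batch.
count = 0
running_mean = 0.0
for price in event_stream():
    count += 1
    running_mean += (price - running_mean) / count
    print(f"event {count}: price={price}, running mean={running_mean:.2f}")
```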

3. Explain in detail the data science process with examples.

Content:

 The data science process – An Overview

Figure 1.5: Steps of Data Science Process

 The data science process typically consists of six steps, as shown in
figure 1.5.
1. Setting the research goal

 The first step of this process is defining a research goal by creating a
project charter.
 A project charter requires teamwork, and input covers at least the
following:
o A clear research goal
o The project mission and context
o How to perform analysis
o What data and resources to use
o Proof that it’s an achievable project, or proof of concepts
o Deliverables and a measure of success
o A timeline
2. Retrieving data

 The second step is to collect data by finding suitable data and getting access
to the data from the data owner.
 Start with data stored within the company
o The data can be stored in official data repositories such as databases,
data marts, data warehouses, and data lakes maintained by a team of IT
professionals.
o The primary goal of a database is data storage, while a data warehouse is
designed for reading and analyzing that data.
o A data mart is a subset of the data warehouse and geared toward
serving a specific business unit.
o While data warehouses and data marts are home to preprocessed data, data
lakes contain data in its natural or raw format, which probably needs
polishing and transformation before it becomes usable.
 Don’t be afraid to shop around
o Many companies specialize in collecting valuable information.
o Data can also be delivered by third-party companies and take many
forms ranging from Excel spreadsheets to different types of databases.
Refer Table 1.2


Table 1.2 – Open Data Sites

 Do data quality checks to prevent problems later

o Expect to spend a good portion of your project time doing data
correction and cleansing, sometimes up to 80%.

3. Data preparation

 Data collection is an error-prone process; this phase enhances the
quality of the data and prepares it for use in subsequent steps.
 This phase consists of three sub-phases:
1. Data cleansing - Data cleansing is a sub-process of the data science process that
removes false values from a data source and inconsistencies across data
sources.
Types of errors
 Interpretation error – taking a value in the data for granted.
Example: accepting that a person’s age is greater than 300 years.
 Inconsistencies – between data sources or against standardized values.
Example: putting “Female” in one table and “F” in another.

Common Errors
Table 1.3 – Common Errors

1. Data Entry Errors

 Data collection and data entry are error-prone processes.
 They often require human intervention, and because humans are
only human, they make typos or lose their concentration for a
second and introduce an error into the chain.
 Example: a small number of known typos can be corrected with simple rules.
    if x == "Godo":
        x = "Good"
    if x == "Bade":
        x = "Bad"
2. Redundant Whitespace
Whitespaces tend to be hard to detect but cause errors. Fixing
redundant whitespaces is luckily easy enough in most programming
languages. They all provide string functions that will remove the
leading and trailing whitespaces.
Example:
In Python, the strip() string method removes leading and trailing
whitespace.
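
A minimal sketch of why redundant whitespace matters and how `str.strip()` fixes it; the `raw_values` list is invented for illustration.

```python
# Hypothetical raw values containing redundant leading/trailing whitespace.
raw_values = ["  Good", "Bad  ", " Good ", "Bad"]

# str.strip() returns a copy with leading and trailing whitespace removed.
cleaned = [value.strip() for value in raw_values]

# Before cleaning, the same category is counted as several distinct strings.
print(sorted(set(raw_values)))
print(sorted(set(cleaned)))
```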
3. Impossible Values And Sanity Checks
Sanity checks are another valuable type of data check. They test a value
against physically or theoretically impossible values.
Example: sanity checks can be directly expressed with rules:
    check = 0 <= age <= 120
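
The rule above can be applied to a whole column to list the suspect values. This is a small sketch; the `ages` list and the helper name `is_plausible_age` are assumptions made for the example.

```python
# Hypothetical age values; 300 and -4 are impossible for a person.
ages = [25, 300, 47, -4, 120]

def is_plausible_age(age):
    """Sanity-check rule from the text: 0 <= age <= 120."""
    return 0 <= age <= 120

# Collect the values that fail the sanity check so they can be reviewed.
suspect = [age for age in ages if not is_plausible_age(age)]
print(suspect)
```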
4. Outliers
An outlier is an observation that seems to be distant from other observations or,
more specifically, one observation that follows a different logic or
generative process than the other observations. The easiest way to find outliers
is to use a plot or a table with the minimum and maximum values. An
example is shown in figure 1.6.

Figure 1.6: Distribution plots are helpful in detecting
outliers and helping you understand the variable.
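
A table with the minimum and maximum values can be produced in a few lines. The sketch below uses made-up measurements; the 1.5 × IQR rule in the second half is a common rule of thumb added here as an assumption, not something stated in these notes.

```python
import statistics

# Hypothetical measurements; 250.0 looks distant from the rest.
values = [9.8, 10.1, 10.4, 9.9, 10.0, 250.0, 10.2]

# A small min/median/max table is often enough to spot a suspicious value.
summary = {
    "min": min(values),
    "median": statistics.median(values),
    "max": max(values),
}
print(summary)

# Rule of thumb (an assumption here): flag points more than
# 1.5 * IQR below the first quartile or above the third quartile.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]
print(outliers)
```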

5. Dealing With Missing Values

Missing values aren’t necessarily wrong, but you still need to handle them
separately.

Table 1.4 An overview of techniques to handle missing data
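
Since the contents of Table 1.4 are not reproduced here, the sketch below shows two widely used techniques as an assumption about what such an overview usually contains: omitting the incomplete observations, and imputing a static value such as the mean. The `heights_cm` data is hypothetical.

```python
import statistics

# Hypothetical observations; None marks a missing value.
heights_cm = [172.0, None, 165.0, 180.0, None, 159.0]

# Technique 1: omit the observations with missing values.
observed = [h for h in heights_cm if h is not None]

# Technique 2: impute a static value, here the mean of the observed values.
mean_height = statistics.mean(observed)
imputed = [h if h is not None else mean_height for h in heights_cm]

print(observed)
print(imputed)
```

Omitting loses information from the dropped rows, while imputing keeps every row but can distort the distribution, so the choice depends on the model and on how much data is missing.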

6. Deviations From A Code Book

 Detecting errors in larger data sets against a code book or against
standardized values can be done with the help of set operations.
 A code book is a description of your data, a form of metadata.
 It contains things such as the number of variables per observation, the
number of observations, and what each encoding within a variable
means.
 (For instance “0” equals “negative”, “5” stands for “very positive”.)
 A code book also tells you the type of data you’re looking at: is it hierarchical,
graph, or something else?
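
The set operations mentioned above can be sketched with Python's built-in `set` type. The code book and the observed values are hypothetical.

```python
# Hypothetical code book: the only encodings allowed for a survey answer.
code_book = {"0", "1", "2", "3", "4", "5"}

# Hypothetical observed values from the data set.
observed_values = ["0", "5", "3", "7", "2", "five", "5"]

# Set difference: values that appear in the data but not in the code book.
deviations = set(observed_values) - code_book
print(sorted(deviations))
```

Working on the set of distinct values rather than on every row keeps the check cheap even for large data sets.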
7. Different Units Of Measurement
 When integrating two data sets, pay attention to their
respective units of measurement.
 For example, when studying gasoline prices around the
world, you gather data from different data providers.
 Some data sets can contain prices per gallon and others can contain prices
per liter.
 A simple conversion will do the trick in this case.
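
A minimal sketch of such a conversion, assuming the gallon in question is the US liquid gallon (the notes do not say which gallon is meant). The records and the helper `price_per_liter` are hypothetical.

```python
LITERS_PER_US_GALLON = 3.785411784  # exact definition of the US liquid gallon

# Hypothetical price records from two providers using different units.
prices = [
    {"country": "A", "price": 3.80, "unit": "gallon"},
    {"country": "B", "price": 1.10, "unit": "liter"},
]

def price_per_liter(record):
    """Return the price expressed per liter, whatever the source unit."""
    if record["unit"] == "gallon":
        return record["price"] / LITERS_PER_US_GALLON
    if record["unit"] == "liter":
        return record["price"]
    raise ValueError(f"unknown unit: {record['unit']}")

# After normalization, both records can be compared directly.
normalized = {r["country"]: round(price_per_liter(r), 3) for r in prices}
print(normalized)
```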
8. Different Levels Of Aggregation
 Having different levels of aggregation is similar to having different
types of measurement.

 An example of this would be a data set containing data per week
versus one containing data per work week.
 This type of error is generally easy to detect, and summarizing (or the
inverse, expanding) the data sets will fix it.
 After cleaning the data errors, combine information from different data
sources.
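
Summarizing to a common level of aggregation can be sketched as below, where hypothetical daily figures are rolled up to ISO weeks so they can be combined with a weekly data set. The dates and amounts are invented.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales; the other data set is assumed to be weekly.
daily_sales = [
    (date(2024, 1, 1), 10),
    (date(2024, 1, 2), 12),
    (date(2024, 1, 8), 7),
    (date(2024, 1, 9), 9),
]

# Summarize to ISO (year, week) so both data sets share one level of aggregation.
weekly_sales = defaultdict(int)
for day, amount in daily_sales:
    iso_year, iso_week, _ = day.isocalendar()
    weekly_sales[(iso_year, iso_week)] += amount

print(dict(weekly_sales))
```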

Correct errors as early as possible

 Data should be cleansed when acquired for many reasons:
o Decision-makers may make costly mistakes when they rely on information based on
incorrect data from applications that fail to correct the faulty data.
o If errors are not corrected early on in the process, the cleansing will
have to be done for every project that uses that data.
o Data errors may point to a business process that isn’t working as
designed.
o Data errors may point to defective equipment, such as broken
transmission lines and defective sensors.
o Data errors can point to bugs in software or in the integration of
software that may be critical to the company.

Combining data from different data sources

The different ways of combining data
 The first operation is joining: enriching an observation from one table with
information from another table.
 The second operation is appending or stacking: adding the
observations of one table to those of another table.
1. Joining Tables
 Joining tables allows you to combine the information of one observation
found in one table with the information found in another table.
 To join tables, use variables that represent the same object in both
tables, such as a date or a country name.
 These common fields are known as keys.
 When these keys also uniquely define the records in the table, they
are called primary keys.


Example:

Figure 1.6: Joining two tables on the Item and Region keys

In figure 1.6, both tables contain the client name, and this makes it
easy to enrich the client expenditures with the region of the client.
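
A join on a shared key can be sketched in plain Python. The two small tables below are hypothetical and only mimic the idea of enriching client expenditures with a region; they are not the data from the figure.

```python
# Hypothetical tables, each a list of records; "client" is the shared key.
expenses = [
    {"client": "John Doe", "item": "Coca-Cola", "month": "January"},
    {"client": "Jackie Qi", "item": "Pepsi-Cola", "month": "January"},
]
regions = [
    {"client": "John Doe", "region": "NY"},
    {"client": "Jackie Qi", "region": "NC"},
]

# Index the second table by its key, then enrich each expense record.
region_by_client = {row["client"]: row["region"] for row in regions}
joined = [
    {**row, "region": region_by_client.get(row["client"])}
    for row in expenses
]
print(joined)
```

This behaves like a left join: every expense row is kept, and a client without a matching region would get `None`.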

2. Appending or stacking:
 Appending or stacking tables is effectively adding observations from
one table to another table.
 The equivalent operation in set theory would be the union, and this is
also the command in SQL, the common language of relational
databases.
 Other set operators are also used in data science, such as set difference
and intersection.
Example:

Figure 1.7: Appending tables

As figure 1.7 shows, appending data from tables is a common operation
but requires an equal structure in the tables being appended.
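
The SQL union mentioned above can be tried with Python's built-in `sqlite3` module. The table names and rows are hypothetical; `UNION ALL` is used so that every row of both tables is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables with an identical structure.
for table in ("sales_january", "sales_february"):
    conn.execute(f"CREATE TABLE {table} (client TEXT, item TEXT)")
conn.execute("INSERT INTO sales_january VALUES ('John Doe', 'Coca-Cola')")
conn.execute("INSERT INTO sales_february VALUES ('Jackie Qi', 'Pepsi-Cola')")

# UNION ALL stacks the rows of both tables; plain UNION would also drop duplicates.
stacked = conn.execute(
    "SELECT client, item FROM sales_january "
    "UNION ALL "
    "SELECT client, item FROM sales_february"
).fetchall()
print(stacked)
conn.close()
```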


3. View
 Views are a kind of virtual table.
 You can create a view by selecting fields from one or more tables
present in the database.
 A view can contain either all the rows of a table or specific rows
based on a certain condition.

Figure 1.8: Views
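
A view can be sketched with `sqlite3` as follows. The `sales` table, the `ny_sales` view, and the rows are hypothetical; the example shows that a view stores a query rather than a copy of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table.
conn.execute("CREATE TABLE sales (client TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("John Doe", "NY", 120.0), ("Jackie Qi", "NC", 80.0), ("Sam Lee", "NY", 45.5)],
)

# A view selects specific fields and rows; it stores the query, not the rows.
conn.execute(
    "CREATE VIEW ny_sales AS SELECT client, amount FROM sales WHERE region = 'NY'"
)

# Rows inserted later into the base table show up when the view is queried.
conn.execute("INSERT INTO sales VALUES ('Ana Roy', 'NY', 10.0)")
ny_rows = conn.execute(
    "SELECT client, amount FROM ny_sales ORDER BY client"
).fetchall()
print(ny_rows)
conn.close()
```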

Data transformation
 Certain models require their data to be in a certain shape.
 Data transformation ensures that the data is in a suitable format for use in data
models.
 Relationships between an input variable and an output variable aren’t always linear.
 In such cases, taking the log of the independent variables can simplify the
estimation problem dramatically.
Example – Refer Figure 1.9

Figure 1.9: Transforming x to log x makes the relationship between x and y
linear (right), compared with the non-log x (left).
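
The effect described in the caption can be checked numerically. The sketch below builds synthetic data in which y depends linearly on log x (an assumption made for illustration) and compares the Pearson correlation before and after the transformation; `pearson` is a small helper written for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic data, assumed for illustration: y depends linearly on log(x).
xs = [1, 2, 4, 8, 16, 32, 64]
ys = [3.0 * math.log(x) + 2.0 for x in xs]

log_xs = [math.log(x) for x in xs]
r_raw = pearson(xs, ys)      # correlation of y with x
r_log = pearson(log_xs, ys)  # correlation of y with log(x)
print(round(r_raw, 3), round(r_log, 3))
```

A correlation close to 1 after the transformation indicates that a straight line now describes the relationship, which is what a linear model needs.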

4. Data exploration or EDA (Exploratory Data Analysis)

o Data exploration is concerned with building a deeper understanding of the
data to know how variables interact with each other, the distribution of the
data, and whether there are outliers.
o The visualization techniques used in this phase range from simple line graphs
or histograms, to more complex diagrams such as Sankey and network graphs.

 Graphs: - Simple and Combined Graphs

Figure 1.11 shows, from top to bottom, a bar chart, a line plot, and a
distribution plot, which are some of the graphs used in exploratory analysis.

 Brushing and linking.

With brushing and linking, you can combine and link different graphs and
tables so changes in one graph are automatically transferred to the other
graphs.

Figure 1.11 - Graphs used in exploratory analysis

Histogram
 In a histogram, a variable is cut into discrete categories, and the number of
occurrences in each category is summed up and shown in the graph.

Figure 1.12 - Example Histogram

 Example – Figure 1.12 shows the number of people in each
5-year age group.
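
The binning behind such a histogram can be sketched with the standard library. The `ages` list is hypothetical and is not the data behind Figure 1.12; the text output is only a crude stand-in for the plotted bars.

```python
from collections import Counter

# Hypothetical ages of respondents.
ages = [3, 7, 12, 14, 21, 22, 23, 38, 41, 44]

BIN_WIDTH = 5  # 5-year intervals

# Map each age to the lower edge of its bin, then count occurrences per bin.
counts = Counter((age // BIN_WIDTH) * BIN_WIDTH for age in ages)

# Print one bar per non-empty bin, using '#' characters as the bar height.
for lower in sorted(counts):
    print(f"{lower:>2}-{lower + BIN_WIDTH - 1:<2} {'#' * counts[lower]}")
```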

7 pages
IV AI-DS AD3491 FDSA QB Unit1
No ratings yet
IV AI-DS AD3491 FDSA QB Unit1
5 pages
Unit 2 DW&DM Notes Mr. Rohit Pratap Singh
No ratings yet
Unit 2 DW&DM Notes Mr. Rohit Pratap Singh
33 pages
AD3491 - Unit 1 - Introduction To Data Science Important Questions 2 Marks With Answer - 3-8
No ratings yet
AD3491 - Unit 1 - Introduction To Data Science Important Questions 2 Marks With Answer - 3-8
6 pages
DTS 201 Lecture Note
No ratings yet
DTS 201 Lecture Note
24 pages
Unit 1
No ratings yet
Unit 1
14 pages
A004 Project Report
No ratings yet
A004 Project Report
4 pages
Assignment 1
No ratings yet
Assignment 1
1 page
Question Bank With Answers
No ratings yet
Question Bank With Answers
103 pages
Ixs8h l8mgc
No ratings yet
Ixs8h l8mgc
40 pages
Ad3491-FDA Unit 1 Question Bank
No ratings yet
Ad3491-FDA Unit 1 Question Bank
8 pages
Data Science (Quick Guide) For College Exams
No ratings yet
Data Science (Quick Guide) For College Exams
34 pages
DS 3-Marks Semeseter Suggestion
No ratings yet
DS 3-Marks Semeseter Suggestion
54 pages
2 Marks With Answers
No ratings yet
2 Marks With Answers
39 pages
Foundation of Data Science (BSC)
No ratings yet
Foundation of Data Science (BSC)
64 pages
Fds Question Bank
No ratings yet
Fds Question Bank
116 pages
Foundation of Data Science (BSC) 1
No ratings yet
Foundation of Data Science (BSC) 1
64 pages
2 Marks Foundations of Data Science
No ratings yet
2 Marks Foundations of Data Science
13 pages
Dpa-Set - A
No ratings yet
Dpa-Set - A
29 pages
cs3352 Foundation of Data Science
No ratings yet
cs3352 Foundation of Data Science
117 pages
Q1. Explain Data Science Process Along With Detailed Diagram
No ratings yet
Q1. Explain Data Science Process Along With Detailed Diagram
7 pages
Data Science Unit 01
No ratings yet
Data Science Unit 01
19 pages