STD 10 Chap 4 Data Merging Notes

The document discusses data merging, which is the process of combining multiple data sets into a single data frame for analysis. It explains different types of data joins (one-to-one, one-to-many, many-to-many), the concepts of primary and foreign keys, and provides definitions and interpretations of Z-scores, percentiles, quartiles, and deciles. Additionally, it includes formulas and examples to illustrate these concepts.

Uploaded by

kushalthakut809

Std 10

Chap 4: Data merging

Q1) What is data merging?

Ans: Data merging is the process of combining two or more data sets into a single data
frame. This process is necessary when we have raw data stored in multiple files or data
tables that we want to analyse all in one go.

Q2) Explain the 3 categories of data joins.

Ans: The categories of data joins are as follows:

1. One to One Joins

2. One to Many Joins

3. Many to Many Joins

One To One Joins

A one to one join is one of the simplest join techniques. In this type of join, each
row in one table is linked to a single row in another table using a “key” column.

For example, in a company database, each employee has only one Employee ID, and each
Employee ID is assigned to only one employee.
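
The one to one join described above can be sketched in plain Python. This is only an illustration: the table contents, names and salary figures are made up, and dictionaries stand in for data tables keyed by Employee ID.

```python
# Hypothetical tables, each keyed by Employee ID (all values are invented).
employees = {101: "Asha", 102: "Ravi", 103: "Meera"}
salaries = {101: 52000, 102: 48000, 103: 61000}

# One to one join: every Employee ID matches exactly one row in each table,
# so the merged result has one row per employee.
merged = [
    {"employee_id": emp_id, "name": name, "salary": salaries[emp_id]}
    for emp_id, name in employees.items()
    if emp_id in salaries
]

for row in merged:
    print(row)
```

Because the key is unique on both sides, the merged table has exactly as many rows as each input table.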

One To Many Joins

In a one to many join, one record in a table can be related to one or many records in
another table.

For example, one student can be issued multiple books by the school library.

Many To Many Joins

A many to many relationship is said to occur when multiple records in one table are related
to multiple records in another table. For example, a many to many relationship exists
between students and courses. A student can register for multiple courses, and a course can
have multiple students.

Q3) What are primary keys and foreign keys?

Ans: Primary keys serve as unique identifiers for each row in a database table. Foreign keys
link data in one table to the data in another table.

A foreign key column in a table points to a column with unique values in another table (often
the primary key column) to create a way of cross-referencing the two tables.
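
As a rough sketch of how a foreign key cross-references two tables, the example below uses plain Python. The table names (`students`, `loans`), the column name `student_id` and all values are assumptions made up for illustration; it also happens to be a one to many join, since one student can appear in several loan rows.

```python
# Hypothetical tables. The dictionary key of `students` plays the role of the
# primary key; `student_id` in each loan row is a foreign key pointing to it.
students = {1: "Anil", 2: "Bina"}
loans = [
    {"book": "Algebra", "student_id": 1},
    {"book": "Physics", "student_id": 1},
    {"book": "History", "student_id": 2},
]

# Follow the foreign key from each loan row back to the matching student row.
merged = [
    {"student": students[loan["student_id"]], "book": loan["book"]}
    for loan in loans
]

for row in merged:
    print(row)
```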

Q4) What is data merging?

Ans : In Data Science, data merging is the process of combining two or more data sets into a
single data frame. This process is necessary when we have raw data stored in multiple files
or data tables that we want to analyse all in one go.

Q5) What is a join table? In which data join category is a join table used, and why?

Ans : Every record in a join table contains match fields that hold the values of the
primary keys of the two tables it joins. A join table is used to join tables that
have a many to many relationship. Since it is not easy to merge tables that have a many to
many relationship directly, a third table, i.e. the join table, is used to break it into two
one to many relationships.
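
A minimal sketch of a join table, using the students and courses example from Q2. The table names, IDs and the `enrolments` list are hypothetical; each enrolment row holds the primary key of one student and the primary key of one course.

```python
# Hypothetical tables keyed by their primary keys.
students = {1: "Anil", 2: "Bina"}
courses = {"C1": "Maths", "C2": "Science"}

# Join table: one row per (student primary key, course primary key) pair.
# students -> enrolments is one to many, and courses -> enrolments is one to many.
enrolments = [(1, "C1"), (1, "C2"), (2, "C1")]

# Merge through the join table by looking up both keys of every row.
merged = [
    (students[student_id], courses[course_id])
    for student_id, course_id in enrolments
]

print(merged)
```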

Q6) What is a Z-score?

Ans : A Z-score describes the position of a point in terms of its distance from the mean,
measured in standard deviation units. The z-score is positive if the value lies above
the mean and negative if the value lies below the mean.

Q7) Write down Z-score formula.

Ans : The mathematical formula for calculating the z-score is as follows:

Z = (x - μ) / σ

Where,
x = raw score
μ = population mean
σ = population standard deviation
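
The formula can be tried out with Python's standard `statistics` module. The list of scores below is made up for illustration, and `pstdev` is used because the formula refers to the population standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical raw scores (invented for this sketch).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(scores)       # population mean
sigma = pstdev(scores)  # population standard deviation


def z_score(x):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mu) / sigma


for x in (9, 5, 2):
    print(x, z_score(x))
```

For these scores the mean is 5 and the population standard deviation is 2, so a raw score of 9 lies 2 standard deviations above the mean and a raw score of 2 lies 1.5 standard deviations below it.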

Q8) How is the Z-score interpreted?

Ans : A positive z-score tells us that the raw score is higher than the mean. For
example, if the z-score is equal to +2, the score is 2 standard deviations above the mean.
A negative z-score tells us that the score is below the mean. For example, if a z-score is
equal to -3, the score is 3 standard deviations below the mean. If the z-score is equal to 0,
the score is equal to the mean.

Q9) What is meant by percentile? Explain it using example.

Ans : The percentile of a value can be defined as the percentage of the total ordered
observations that are at or below it.

Consider the following data set: [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]

To find the percentile for element 22, we follow the steps below:

1. Sort the dataset in ascending order.

[10, 12, 13, 15, 16, 17, 20, 22, 23, 24]

2. The number of values at or below the element 22 is 8, and the total number of
elements in the dataset is 10.

3. By the definition, 8 / 10 = 80 percent of the values are at or below the element 22.

4. Thus, the element 22 is at the 80th percentile.
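
The steps above can be sketched in Python using the same data set. The function name `percentile_of` is just a label chosen for this example. Sorting is not needed for the calculation itself, because counting the values at or below the element gives the same result in any order.

```python
data = [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]


def percentile_of(value, values):
    """Percentage of observations that are at or below `value`."""
    at_or_below = sum(1 for v in values if v <= value)
    return 100 * at_or_below / len(values)


print(sorted(data))             # step 1: the ordered data set
print(percentile_of(22, data))  # 80.0
```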

Q10) What is meant by quartiles? Explain IQR and its application.

Ans : Quartiles partition an ordered dataset into four equal parts, with one-fourth of the
data values in each part. The cut points are the first quartile Q1 (25%), the second
quartile Q2 (50%, the median) and the third quartile Q3 (75%).

The interquartile range (IQR) can be defined as a measure of the middle 50% of the values when
ordered from lowest to highest. The interquartile range can be calculated by subtracting
the first quartile (Q1) from the third quartile (Q3).

IQR = Q3 – Q1

An important application of quartiles is in temperature ranges for the day as reported in a
weather report. In the presence of irregularities (outliers) in the data, the IQR is more
robust and is a better representation of the amount of spread in the data.
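
As an illustration, the quartiles and IQR of the data set from Q9 can be computed with `statistics.quantiles` from the Python standard library. Note the assumption: its default "exclusive" method places the quartiles at positions based on (n + 1) and interpolates between neighbouring values, and other textbook conventions can give slightly different quartile values.

```python
from statistics import quantiles

data = [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]

# quantiles(..., n=4) returns the three cut points Q1, Q2 and Q3.
q1, q2, q3 = quantiles(data, n=4)

iqr = q3 - q1  # IQR = Q3 - Q1

print("Q1 =", q1)
print("Q2 =", q2)
print("Q3 =", q3)
print("IQR =", iqr)
```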

Q11) Explain deciles.

Ans : Deciles divide the ordered data into ten equal parts, marked at the 10th, 20th, 30th,
40th, 50th, 60th, 70th, 80th, 90th and 100th percentiles. The higher the place in the
decile ranking, the higher is the overall ranking.

The mathematical formula to calculate the ith decile is:

Di = [i * (n + 1) / 10]th data value

Where n is the number of data values in the sample and i is the number of the decile.

For example:

1st Decile, D1 = [1 * (n + 1) / 10]th data value

2nd Decile, D2 = [2 * (n + 1) / 10]th data value

and so on.
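
A small Python sketch of this formula, applied to the data set from Q9. One assumption goes beyond the notes: when the position i * (n + 1) / 10 is not a whole number, the function interpolates linearly between the two neighbouring ordered values. The function name `decile` is chosen only for this example.

```python
def decile(values, i):
    """Return the i-th decile (i = 1..9) using the i * (n + 1) / 10 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    # Split the position into its whole part and its remaining tenths.
    whole, tenths = divmod(i * (n + 1), 10)
    if whole < 1:        # position falls before the first value
        return ordered[0]
    if whole >= n:       # position falls at or after the last value
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    # Assumed convention: linear interpolation between neighbouring values.
    return lower + (tenths / 10) * (upper - lower)


data = [10, 12, 15, 17, 13, 22, 16, 23, 20, 24]
for i in (1, 5, 9):
    print("D%d =" % i, decile(data, i))
```

With n = 10, D1 sits at position 1.1 (just above the first value) and D5 sits at position 5.5, halfway between the 5th and 6th ordered values, which is the median.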
