DAL Oral Question Bank

Uploaded by jackiejamessjj

Q. What is Data Analysis:

Data analysis is the process of examining, transforming, and arranging raw data in a specific way to
generate useful information from it.

Q. What are the types of Data Analytics:

There are four major types of data analytics:

• Descriptive analytics
• Diagnostic analytics
• Predictive analytics
• Prescriptive analytics

Descriptive analytics
• Descriptive analytics is the conventional form of data analysis.
• It seeks to provide a depiction or “summary view” of facts and figures in
an understandable format.
Diagnostic analytics
• Diagnostic analytics is a form of advanced analytics that examines data
or content to answer the question “Why did it happen?”
Predictive analytics
• Predictive analytics forecasts future trends from historical and current data.
Prescriptive analytics
• A set of techniques that indicate the best course of action.
• It tells us what decision to make to optimize the outcome.

Q. Which Python packages are used for Data Science:

A package is a collection of Python modules.

• NumPy
• SciPy
• pandas
• statsmodels
• Matplotlib
• Seaborn
• Plotly
• Bokeh
• scikit-learn
• Keras

Q. Explain in short about basic libraries in Python:

NumPy–Numerical Python: NumPy is a Python library used for working with arrays. It also
has functions for linear algebra, Fourier transforms, and matrices.
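
As a minimal sketch of the three areas just mentioned, the snippet below builds arrays, solves a small linear system, and takes a Fourier transform. The matrix, vector, and signal values are made up for illustration.

```python
import numpy as np

# A 2-D array (matrix) and a 1-D array (vector); values are illustrative only.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Linear algebra: solve the system A @ x = b for x.
x = np.linalg.solve(A, b)

# Matrices: matrix product of A with its transpose.
product = A @ A.T

# Fourier transform of a short signal.
signal = np.array([1.0, 0.0, -1.0, 0.0])
spectrum = np.fft.fft(signal)

print("Solution x:", x)
print("A times A transposed:\n", product)
print("Spectrum:", spectrum)
```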

Pandas–Data frame Python: pandas is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures and
operations for manipulating numerical tables and time series.

Matplotlib–Visualization: Matplotlib is a comprehensive library for creating static,
animated, and interactive visualizations in Python. Matplotlib makes easy things easy and
hard things possible. It can produce publication-quality plots and interactive figures that
can zoom, pan, and update.

Sklearn–Machine Learning: Scikit-learn is a free machine learning library for Python. It
features various algorithms such as support vector machines, random forests, and k-nearest neighbours.

Seaborn–Statistical Graphics: Seaborn is a library for making statistical graphics in Python.

• It builds on top of Matplotlib and integrates closely with pandas data structures.

• Seaborn helps you explore and understand your data. Its plotting functions operate on
dataframes and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots. Its dataset-oriented,
declarative API lets you focus on what the different elements of your plots
mean, rather than on the details of how to draw them.

Q. Explain different plots:

• A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or
scatter diagram) is a type of plot or mathematical diagram that uses Cartesian coordinates
to display values of typically two variables for a set of data.

• A histogram is a graphical representation that organizes a group of data points into
user-specified ranges. Similar in appearance to a bar graph, the histogram condenses
a data series into an easily interpreted visual by taking many data points and grouping
them into logical ranges or bins.

• A box plot is a visual representation of groups of numerical data
through their quartiles. It is also used to detect outliers in a data set. It
captures the summary of the data efficiently with a simple box and whiskers and
allows us to compare easily across groups. A box plot summarizes sample data using the
25th, 50th, and 75th percentiles, also known as the lower
quartile, median, and upper quartile.
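
The numbers behind a histogram and a box plot can be computed directly, which may help when explaining them orally. This is only a sketch: the sample values and bin edges are made up, and the 1.5 × IQR outlier rule is a common convention that these notes do not state.

```python
import numpy as np

# Made-up sample; the value 30 is an obvious outlier.
data = np.array([2, 3, 3, 4, 5, 5, 6, 7, 8, 30])

# Histogram: count how many points fall into each user-specified bin.
counts, edges = np.histogram(data, bins=[0, 5, 10, 35])

# Box plot summary: lower quartile, median, and upper quartile.
q1, median, q3 = np.percentile(data, [25, 50, 75])

# Assumed outlier rule: points more than 1.5 * IQR outside the box.
iqr = q3 - q1
outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]

print("Bin counts:", counts)
print("Q1, median, Q3:", q1, median, q3)
print("Outliers:", outliers)
```

Matplotlib's `hist` and `boxplot` functions draw these same quantities as pictures.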

Q. What is the statement in Python to print:

# Importing NumPy
import numpy as np

# Print a 3x3 matrix of all zeros
print(np.zeros((3, 3)))

# Print a 2x2 matrix of all ones
print(np.ones((2, 2)))

# Print the 3x3 identity matrix
print(np.eye(3))

# Example array (an assumption; the original does not show how c is defined)
c = np.array([[1, 2, 3], [4, 5, 6]])

# Print the shape of the array
print("Shape of array: ", c.shape)

# Print the size (total number of elements) of the array
print("Size of array: ", c.size)

# Print the type of the elements in the array
print("Array stores elements of type: ", c.dtype)

# Importing pandas
import pandas as pd

# Reading data from a CSV file
df = pd.read_csv('data.csv')

# Exporting data to CSV with pandas
df.to_csv('export.csv')

Q. What is Pandas DataFrame:

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns. A pandas DataFrame can be created from various inputs such as lists,
dictionaries, Series, NumPy ndarrays, or another DataFrame.
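
A short sketch of each input type listed above. The column names and values are hypothetical.

```python
import numpy as np
import pandas as pd

# From a dictionary of lists: the keys become column names.
df_dict = pd.DataFrame({"name": ["Asha", "Ravi"], "marks": [82, 74]})

# From a list of lists: column names are supplied separately.
df_list = pd.DataFrame([["Asha", 82], ["Ravi", 74]], columns=["name", "marks"])

# From a Series: the Series name becomes the single column name.
df_series = pd.Series([82, 74], name="marks").to_frame()

# From a NumPy ndarray: each row of the array becomes a row of the table.
df_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["a", "b"])

# From another DataFrame: copy() gives an independent table.
df_copy = df_dict.copy()

print(df_dict)
print(df_dict.shape)  # (number of rows, number of columns)
```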

Q. What is Data Wrangling:

Data wrangling is the process of cleaning, structuring, and enriching raw data into a desired
format so that better decisions can be made in less time.

Data wrangling in Python deals with the following tasks:

1. Data exploration: The data is studied, analyzed, and understood, often by
visualizing it.
2. Dealing with missing values: Large datasets often contain missing values (NaN). They are
handled by replacing them with the mean or the mode (the most frequent value) of the column,
or simply by dropping the rows that contain a NaN value.
3. Reshaping data: The data is manipulated according to the requirements;
new data can be added or existing data can be modified.
4. Filtering data: Datasets sometimes contain unwanted rows or columns, which
need to be removed or filtered out.
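
The four tasks above can be sketched with pandas as follows. The small table, its column names, and the choice of which rows to keep are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with missing values.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", "Mumbai", None, "Pune"],
    "unused": [0, 0, 0, 0],
})

# 1. Data exploration: summary statistics and a count of missing values.
summary = df.describe()
missing = df.isna().sum()

# 2. Missing values: fill numeric gaps with the mean, text gaps with the mode.
clean = df.copy()
clean["age"] = clean["age"].fillna(clean["age"].mean())
clean["city"] = clean["city"].fillna(clean["city"].mode()[0])
# Alternative: drop every row that contains a missing value.
dropped = df.dropna()

# 3. Reshaping: add a new column derived from existing data.
clean["age_group"] = np.where(clean["age"] >= 30, "30+", "under 30")

# 4. Filtering: remove an unwanted column and keep only selected rows.
filtered = clean.drop(columns=["unused"])
filtered = filtered[filtered["city"] == "Pune"]

print(missing)
print(filtered)
```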

Q. What is Population?

A population is the pool or collection of elements or individuals from which we draw a statistical
sample for a study. It is the entire group about which we want to draw a conclusion. The
number of elements or individuals in a population is called the population size.

Q. What is Sample?

A sample is a subset of the population. It is the specific group from which you collect data. The
number of elements or individuals in a sample is called the sample size. The process of
selecting a sample is called sampling.

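Q. What are the different sampling techniques? Explain any one:

Simple Random Sampling (SRS): every member of the population has an equal chance of being selected.
Stratified Sampling: the population is divided into groups (strata) and a random sample is drawn from each.
Cluster Sampling: the population is divided into clusters, and whole clusters are selected at random.
Systematic Sampling: every k-th member is selected from an ordered list, starting from a chosen point.
Convenience Sampling: members who are easiest to reach are selected.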
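
Three of these techniques can be sketched with pandas. The population table, the department split, and the sample sizes below are hypothetical.

```python
import pandas as pd

# Hypothetical population of 100 students in two departments.
population = pd.DataFrame({
    "student_id": range(1, 101),
    "dept": ["CS"] * 60 + ["IT"] * 40,
})

# Simple random sampling: every row has the same chance of selection.
srs = population.sample(n=10, random_state=0)

# Stratified sampling: draw 10% from each department separately.
stratified = population.groupby("dept").sample(frac=0.1, random_state=0)

# Systematic sampling: take every 10th row, starting from the first.
systematic = population.iloc[::10]

print(srs)
print(stratified["dept"].value_counts())
print(systematic["student_id"].tolist())
```

Stratified sampling keeps the 60:40 department ratio in the sample, which simple random sampling does not guarantee.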

Q. What is Hypothesis:
A hypothesis is a statement about a population that we want to verify on the basis of the
information contained in a sample. E.g., “Messi is the best captain”.

Q. Different Terminologies in Hypothesis Testing:

1. Population: A population is the pool or collection of elements or individuals from which
we draw a statistical sample for a study.
2. Sample: A sample is a subset of the population. It is the specific group from which you
collect data.
3. Parameter:
A summary description of a fixed characteristic of the target population, e.g., mean,
variance, standard deviation.
4. Sampling Distribution:
The distribution of a statistic obtained from a large number of samples drawn from a specific population.
5. Standard Error:
The standard deviation of a sampling distribution. For the sample mean it equals the population
SD divided by the square root of the sample size (σ/√n). It measures how much a sample statistic
varies from sample to sample, whereas SD measures the spread of individual data values.

6. Null Hypothesis H0:

The hypothesis that the event won't happen. It is assumed to be true before we collect
data. If the null hypothesis is not rejected, no changes will be made.
Ex: If the hypothesis is that “the consumption of a particular medicine reduces the chances of
cardiac arrest”, the null hypothesis will be “the consumption of the medicine doesn’t reduce the
chances of cardiac arrest.”

7. Alternate Hypothesis H1:

The hypothesis that the event will happen. This is what we want to support with our
collected data. Rejecting the null hypothesis leads to accepting the alternative
hypothesis.
Ex: If a researcher assumes that the bearing capacity of a bridge is more than 10 tons,
then the hypotheses for this study will be
Null hypothesis H0: μ = 10 tons
Alternative hypothesis H1: μ > 10 tons

8. Simple Hypothesis:
A hypothesis that completely specifies the distribution of the population.

9. Composite Hypothesis:
A hypothesis that does not completely specify the distribution of the population.

10. Type-1 Error:

This error occurs when the sample results lead to the rejection of the null hypothesis when it
is in fact true. It is equivalent to a false positive: you reject a true null hypothesis.

11. Type-2 Error:

This error occurs when the sample results lead to the null hypothesis not being rejected when it
is in fact false. It is equivalent to a false negative: you fail to reject a false null hypothesis.

12. Level of Significance (α):

The probability of making a Type-1 error. Alpha is the maximum probability of a Type-1 error
that we are willing to accept.
For a 95% confidence level, the value of alpha is 0.05. This means that there is a 5%
probability that we will reject a true null hypothesis.
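
To make the link between α and the Type-1 error concrete, here is a small simulation sketch using only the Python standard library. It repeatedly samples from a population where H0 is true and counts how often a right-tailed Z test rejects it. The mean, SD, sample size, and number of trials are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(0)                   # fixed seed so the run is repeatable
alpha = 0.05                     # level of significance
mu0, sigma, n = 10.0, 2.0, 30    # H0 mean, known population SD, sample size (assumed)

standard_error = sigma / math.sqrt(n)          # SD of the sample mean
z_critical = NormalDist().inv_cdf(1 - alpha)   # right-tailed cut-off

trials = 20_000
rejections = 0
for _ in range(trials):
    # Draw a sample from a population in which H0 is actually true.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / standard_error
    if z > z_critical:
        rejections += 1          # rejecting a true H0 is a Type-1 error

type1_rate = rejections / trials
print(f"Observed Type-1 error rate: {type1_rate:.3f}")
```

The observed rate should come out close to α, since α is by definition the probability of rejecting a true null hypothesis.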

Q. Which are the different hypothesis tests?

1. Z Test

2. T Test

3. ANOVA (analysis of variance)
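
A hedged sketch of the three tests. The measurements are made up, the population SD for the Z test is assumed to be known, and SciPy (listed among the packages earlier) is used for the T test and ANOVA. The Z test is computed by hand with the standard library.

```python
import math
from statistics import NormalDist, mean

from scipy import stats

# Hypothetical measurements (made up for illustration).
group_a = [10.2, 10.8, 11.1, 10.5, 10.9, 11.3]
group_b = [9.8, 10.1, 10.0, 9.9, 10.4, 10.2]
group_c = [10.0, 10.3, 10.1, 10.6, 10.2, 10.4]

# Z test: one sample, population SD assumed known; H0: mu = 10, H1: mu > 10.
sigma, mu0 = 0.5, 10.0
z = (mean(group_a) - mu0) / (sigma / math.sqrt(len(group_a)))
z_p = 1 - NormalDist().cdf(z)

# T test: compare the means of two independent groups.
t_result = stats.ttest_ind(group_a, group_b)

# ANOVA: compare the means of three or more groups.
f_result = stats.f_oneway(group_a, group_b, group_c)

alpha = 0.05
for name, p in [("Z", z_p), ("T", t_result.pvalue), ("ANOVA", f_result.pvalue)]:
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{name} test: p = {p:.4f} -> {decision}")
```

As a rule of thumb, the Z test suits large samples with a known population SD, the T test suits small samples with an unknown SD, and ANOVA extends the comparison to more than two groups.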

Q. What is Machine Learning:

Machine learning is a branch of artificial intelligence (AI) and computer science
which focuses on the use of data and algorithms to imitate the way that humans
learn, gradually improving its accuracy.

Classification of Machine Learning

Machine learning implementations are classified into three major categories,
depending on the nature of the learning “signal” or “response” available to the
learning system:
1. Supervised learning: The algorithm learns from example data and
associated target responses, which can be numeric values or string
labels such as classes or tags, in order to later predict the correct
response when given new examples.
2. Unsupervised learning: The algorithm learns from plain
examples without any associated response, leaving the algorithm to
determine the data patterns on its own.
3. Reinforcement learning: The algorithm is presented with examples
that lack labels, as in unsupervised learning. However, each example
can be accompanied by positive or negative feedback according to
the solution the algorithm proposes.

Categorizing on the basis of required output

1. Classification: Inputs are divided into two or more classes, and
the learner must produce a model that assigns unseen inputs to one or
more (multi-label classification) of these classes. This is typically tackled
in a supervised way. Spam filtering is an example of classification, where
the inputs are email (or other) messages and the classes are “spam” and
“not spam”.
2. Regression: Also a supervised problem, in which the
outputs are continuous rather than discrete.
3. Clustering: A set of inputs is to be divided into groups. Unlike in
classification, the groups are not known beforehand, making this typically
an unsupervised task.
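
The three output types can be sketched with scikit-learn. The feature values, labels, and choice of algorithms (k-nearest neighbours, linear regression, k-means) are illustrative assumptions, not methods prescribed by these notes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Classification: labelled inputs, discrete output ("spam" / "not spam").
# Hypothetical features: [number of links, number of exclamation marks].
X_cls = np.array([[8, 9], [7, 6], [9, 8], [0, 1], [1, 0], [0, 0]])
y_cls = np.array(["spam", "spam", "spam", "not spam", "not spam", "not spam"])
classifier = KNeighborsClassifier(n_neighbors=3).fit(X_cls, y_cls)
predicted_class = classifier.predict([[6, 7]])[0]

# Regression: labelled inputs, continuous output (the data follow y = 2x).
X_reg = np.array([[1.0], [2.0], [3.0], [4.0]])
y_reg = np.array([2.0, 4.0, 6.0, 8.0])
regressor = LinearRegression().fit(X_reg, y_reg)
predicted_value = regressor.predict([[5.0]])[0]

# Clustering: no labels; the algorithm forms the groups itself.
X_clu = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                  [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
cluster_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_clu)

print("Predicted class:", predicted_class)
print("Predicted value:", predicted_value)
print("Cluster ids:", cluster_ids)
```

Note that the cluster ids are arbitrary labels: only which points share an id is meaningful.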
