DAL Oral Question Bank
Descriptive Analytics
• Descriptive Analytics is the conventional form of data analysis
• It seeks to provide a depiction or “summary view” of facts and figures in
an understandable format
Diagnostic analytics
• Diagnostic Analytics is a form of advanced analytics which examines data
or content to answer the question “Why did it happen?”
Predictive analytics
• Predictive analytics helps to forecast future trends based on current and historical data
Prescriptive analytics
• A set of techniques that indicate the best course of action
• It tells what decision to make to optimize the outcome
NumPy (Numerical Python): NumPy is a Python library used for working with arrays. It also
has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
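As a minimal sketch of the areas named above (arrays, linear algebra, matrices, and the Fourier transform), the following uses made-up values; the variable names are illustrative only.

```python
import numpy as np

# A 2x2 matrix and a vector (made-up values for illustration).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Linear algebra: solve the system A @ x = b for x.
x = np.linalg.solve(A, b)

# Matrices: matrix product of A with its transpose.
product = A @ A.T

# Fourier transform of a short constant signal.
signal = np.ones(4)
spectrum = np.fft.fft(signal)

print("Solution x:", x)
print("A @ A.T:\n", product)
print("FFT of constant signal:", spectrum)
```

For a constant signal, all of the energy ends up in the first (zero-frequency) component of the spectrum.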
Pandas (data frames in Python): pandas is a software library written for the Python programming
language for data manipulation and analysis. In particular, it offers data structures and
operations for manipulating numerical tables and time series.
Seaborn: Seaborn is a Python data visualization library. It builds on top of matplotlib and
integrates closely with pandas data structures.
Seaborn helps you explore and understand your data. Its plotting functions operate on
dataframes and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots. Its
dataset-oriented, declarative API lets you focus on what the different elements of your
plots mean, rather than on the details of how to draw them.
A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or
scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates
to display values for typically two variables for a set of data.
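A scatter plot of two variables could be drawn roughly as follows. This is a sketch using matplotlib; the data (hours studied vs. exam score) and the axis labels are invented for illustration.

```python
import matplotlib.pyplot as plt

# Two variables measured on the same six observations (made-up data).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 75]

fig, ax = plt.subplots()
ax.scatter(hours, scores)          # one point per (hours, score) pair
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title("Exam score vs. hours studied")
```

Calling plt.show() afterwards displays the figure in an interactive session, and fig.savefig("scatter.png") writes it to a file.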
A box plot is a visual representation of groups of numerical data
through their quartiles. It is also used to detect outliers in a data set. It
captures the summary of the data efficiently with a simple box and whiskers and
allows us to compare easily across groups. A box plot summarizes sample data using
the 25th, 50th and 75th percentiles. These percentiles are also known as the lower
quartile, median and upper quartile.
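The numbers behind a box plot can be computed directly. The sketch below uses made-up data and assumes the common 1.5 x IQR rule for flagging outliers, which is a convention and not something stated in these notes.

```python
import numpy as np

# Made-up sample with one obviously extreme value.
data = np.array([12, 15, 14, 10, 18, 16, 13, 45])

# Lower quartile, median and upper quartile (25th, 50th, 75th percentiles).
q1, median, q3 = np.percentile(data, [25, 50, 75])

# Interquartile range and the conventional 1.5 * IQR fences (assumed rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are treated as outliers.
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("Q1:", q1, "Median:", median, "Q3:", q3)
print("Outliers:", outliers)
```

With matplotlib, plt.boxplot(data) draws the corresponding box and whiskers from the same quartiles.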
# Printing type of elements in array
print("Array stores elements of type: ", c.dtype)
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns. A pandas DataFrame can be created from various inputs: lists,
dictionaries, Series, NumPy ndarrays, or another DataFrame.
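Each of the inputs listed above can be passed to the pd.DataFrame constructor. The following is a small sketch; the names, ages and column labels are invented for illustration.

```python
import numpy as np
import pandas as pd

# From a list of rows (column names supplied separately).
df_list = pd.DataFrame([["Asha", 21], ["Ravi", 23]], columns=["name", "age"])

# From a dictionary mapping column names to values.
df_dict = pd.DataFrame({"name": ["Asha", "Ravi"], "age": [21, 23]})

# From a Series (becomes a single-column DataFrame named after the Series).
df_series = pd.DataFrame(pd.Series([21, 23], name="age"))

# From a NumPy ndarray.
df_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["a", "b"])

# From another DataFrame (copy=True makes an independent copy).
df_copy = pd.DataFrame(df_dict, copy=True)

print(df_dict)
```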
Data wrangling is the process of cleaning, structuring and enriching raw data into a desired
format for better decision making in less time.
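The three steps named above (cleaning, structuring, enriching) might look like this in pandas. This is only a sketch: the raw data, the column names and the pass mark of 80 are all assumptions made for the example.

```python
import pandas as pd

# Raw data with stray whitespace, a duplicate row and a missing value (made up).
raw = pd.DataFrame({
    "Name": [" asha", "Ravi ", "Meena", "Ravi "],
    "marks": ["78", "85", None, "85"],
})

# Cleaning: tidy the names, then drop duplicate rows and rows with missing marks.
clean = raw.copy()
clean["Name"] = clean["Name"].str.strip().str.title()
clean = clean.drop_duplicates().dropna(subset=["marks"])

# Structuring: consistent column names and a numeric dtype for marks.
clean = clean.rename(columns={"Name": "name"})
clean["marks"] = clean["marks"].astype(int)

# Enriching: add a derived column (pass mark of 80 is an assumed threshold).
clean["passed"] = clean["marks"] >= 80

print(clean)
```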
Q. What is population?
Population is a pool or collection of elements or individuals from which we draw a statistical
sample for a study. It is the entire group about which we want to draw a conclusion.
Q. What is Sample?
It is a subset of the population. It is the specific group from which you collect data. The
number of elements or individuals in a sample is called the sample size. The process of
selecting a sample from the population is called sampling.
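The relationship between a population, a sample and the sample size can be illustrated with a short sketch. The population of 1000 numbered individuals and the sample size of 50 are arbitrary choices for the example.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the example is repeatable

# Population: the entire group we want to draw a conclusion about.
population = list(range(1, 1001))

# Sample: a subset of the population, drawn at random without replacement.
sample = random.sample(population, k=50)
sample_size = len(sample)

print("Population size:", len(population))
print("Sample size:", sample_size)
print("Population mean:", mean(population))
print("Sample mean:", mean(sample))
```

The sample mean is used as an estimate of the population mean, and it will generally differ slightly from it.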
Q. What is a Hypothesis?
It is a statement about a population that we want to verify on the basis of information
contained in a sample. E.g., "Messi is the best captain."
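8. Simple Hypothesis
A hypothesis that completely specifies the distribution of the population, e.g. "the mean of
a normal population with known variance is exactly 50".
9. Composite Hypothesis
A hypothesis that does not completely specify the distribution of the population, e.g. "the
mean is greater than 50".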
1. Z Test
2. T Test
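As a rough sketch of how the two test statistics differ: a one-sample Z test is typically used when the population standard deviation is known, and a one-sample T test when it must be estimated from the sample. The sample values, the hypothesised mean mu0 and the known sigma below are all assumptions made for this example.

```python
import math
from statistics import NormalDist, mean, stdev

sample = [52, 55, 49, 51, 58, 54, 53, 50, 56, 52]  # made-up observations
mu0 = 50.0    # hypothesised population mean (assumed)
sigma = 3.0   # population standard deviation, assumed known for the Z test

n = len(sample)
x_bar = mean(sample)

# Z test: uses the known population standard deviation.
z = (x_bar - mu0) / (sigma / math.sqrt(n))
p_z = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

# T test: uses the sample standard deviation, with n - 1 degrees of freedom.
s = stdev(sample)
t = (x_bar - mu0) / (s / math.sqrt(n))
dof = n - 1

print("z statistic:", round(z, 3), "two-sided p-value:", round(p_z, 4))
print("t statistic:", round(t, 3), "degrees of freedom:", dof)
```

The t statistic is compared against a t distribution with n - 1 degrees of freedom; a library such as SciPy can supply the corresponding p-value.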