Previous QP
SECTION-A
I. Answer any four questions. Each question carries two marks. (4x2=8)
1. Define the term Data Analytics.
Data analytics refers to the process of examining, cleaning, transforming, and modeling data to uncover
useful information, draw conclusions, and support decision-making. It involves using statistical,
computational, and machine learning techniques to analyze large datasets and extract valuable insights.
2. Name any four data visualization tools used.
Tableau, Power BI, QlikView, Google Charts, D3.js.
3. Explain the term Normal Distribution.
Normal distribution is a symmetric, bell-shaped probability distribution that represents the distribution of
a continuous random variable. The majority of the values cluster around the mean, and the frequency
of values decreases as they move away from the mean. It is characterized by its mean (µ) and standard
deviation (σ), and many statistical methods assume that the data follow a normal distribution.
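As a rough illustration of how the mean and standard deviation fix the shape of the curve, the sketch below uses Python's standard-library statistics.NormalDist. The values µ = 100 and σ = 15 are assumed purely for the example and do not come from the question paper.

```python
from statistics import NormalDist

# Assumed example parameters (not from the question paper).
mu, sigma = 100.0, 15.0
dist = NormalDist(mu=mu, sigma=sigma)

def prob_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_1sd = prob_within(1)  # roughly 68% of values
within_2sd = prob_within(2)  # roughly 95% of values
within_3sd = prob_within(3)  # roughly 99.7% of values

print(round(within_1sd, 4), round(within_2sd, 4), round(within_3sd, 4))
```

The three probabilities do not depend on the particular µ and σ chosen, which is why most values cluster near the mean for any normal distribution.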
4. Define the following events
a. Mutually exclusive: Two events are mutually exclusive if they cannot occur at the same time. If one
event happens, the other cannot. For example, rolling a 3 and rolling a 5 on a single throw of a die.
b. Equally likely: Two events are equally likely if they have the same probability of occurring. For
example, flipping a fair coin, where heads and tails have equal chances of occurring.
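The two definitions can be checked on the die and coin examples with a small sketch. The helper name prob is hypothetical, and the die and the coin are assumed to be fair.

```python
from fractions import Fraction

# Sample space for one throw of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

rolling_3 = {3}
rolling_5 = {5}

# Mutually exclusive: the events share no outcome, so P(A or B) = P(A) + P(B).
mutually_exclusive = rolling_3.isdisjoint(rolling_5)
p_3_or_5 = prob(rolling_3 | rolling_5)

# Equally likely: heads and tails of a fair coin each have probability 1/2.
coin = {"H", "T"}
p_heads = Fraction(1, len(coin))
p_tails = Fraction(1, len(coin))

print(mutually_exclusive, p_3_or_5, p_heads == p_tails)
```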
5. What is power query?
Power Query is a data connection technology in Microsoft Power BI, Excel, and other Microsoft tools. It
allows users to discover, connect, combine, and refine data from various sources. With Power Query,
users can automate the process of extracting and transforming data before using it for analysis or
reporting.
6. What are Filters in Power BI?
Filters in Power BI are used to restrict the data that appears in visualizations, reports, or dashboards. They
allow users to narrow down data based on certain conditions, such as filtering by a specific time
period, region, or product category. Filters can be applied at different levels: visual level, page level,
and report level.
SECTION-B
II. Answer any four questions. Each question carries five marks. (4x5=20)
7. Write a note on Data Analytics Life Cycle.
• Data Collection: The first step in data analytics is gathering data from various sources, including
internal sources like databases and spreadsheets, and external sources such as social media and market
research. The data should be relevant to the business problem and of high quality.
• Data Cleaning and Preprocessing: After data collection, it must be cleaned and preprocessed to remove
errors and inconsistencies. This includes removing duplicates, filling in missing values, and correcting
errors. Preprocessing may also involve transforming the data into a suitable format for analysis, such
as converting categorical variables into numerical ones.
• Data Transformation: Data transformation involves converting data into a format suitable for analysis,
which may include scaling, normalizing, and applying mathematical functions.
• Data Analysis: Data is analyzed using statistical and computational techniques, including descriptive
statistics (e.g., mean, standard deviation) and inferential statistics (e.g., hypothesis testing, regression
analysis).
• Interpretation and Reporting: Interpret the results to derive actionable insights and make informed
decisions. This involves understanding how the analysis impacts the business problem or question
being addressed.
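The cleaning, transformation, and analysis stages above can be sketched in a few lines of plain Python. The records and the field names (customer_id, monthly_spend) are hypothetical, and filling missing values with the mean is only one of several possible choices.

```python
from statistics import mean, stdev

# Hypothetical raw records: (customer_id, monthly_spend); None marks a missing value.
raw = [(1, 120.0), (2, None), (2, None), (3, 80.0), (4, 200.0), (4, 200.0)]

# Cleaning: drop exact duplicates while keeping the original order.
deduped = list(dict.fromkeys(raw))

# Preprocessing: fill missing spend with the mean of the observed values.
observed = [spend for _, spend in deduped if spend is not None]
fill_value = mean(observed)
cleaned = [(cid, spend if spend is not None else fill_value) for cid, spend in deduped]

# Transformation: min-max scale spend into the range [0, 1].
spends = [spend for _, spend in cleaned]
lo, hi = min(spends), max(spends)
scaled = [(s - lo) / (hi - lo) for s in spends]

# Analysis: simple descriptive statistics on the cleaned data.
summary = {"mean": mean(spends), "std_dev": stdev(spends)}
print(cleaned, scaled, summary)
```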
8. Define Hypothesis. Explain the purpose of ANOVA in Hypothesis testing.
A hypothesis is a claim or belief. Hypothesis testing is a statistical process of either rejecting or retaining a
claim, belief, or association related to a business context, product, service, process, etc.
1. Null Hypothesis (H₀): Assumes there is no effect, difference, or relationship between variables.
2. Alternative Hypothesis (H₁): Suggests that there is an effect, difference, or relationship between
variables.
ANOVA (Analysis of Variance) is a statistical technique used to compare the means of three or more
groups to determine if there are significant differences between them. It evaluates whether the
observed variations among group means are due to random chance or a true effect.
1. Testing for Differences: ANOVA tests the null hypothesis that all group means are equal (H₀: µ₁ = µ₂
= µ₃ ... = µₖ). If rejected, it suggests at least one group mean differs significantly.
2. Identifying Variance Sources: It partitions the total variance in the data into two components:
o Within-group variance: Variability due to differences within each group.
o Between-group variance: Variability due to differences between the group means.
3. Reducing Error: ANOVA helps detect significant group differences while controlling for Type I
errors (false positives) that can arise when conducting multiple t-tests.
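The partition into between-group and within-group variance can be sketched for a one-way ANOVA as follows. The three groups of scores are assumed data chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean

# Hypothetical scores for three groups (assumed data, for illustration only).
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 9.0, 10.0],
]

k = len(groups)                          # number of groups
n = sum(len(g) for g in groups)          # total number of observations
grand_mean = mean(x for g in groups for x in g)

# Between-group sum of squares: spread of the group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)        # mean square between groups
ms_within = ss_within / (n - k)          # mean square within groups
f_statistic = ms_between / ms_within

print(ss_between, ss_within, f_statistic)
```

The F statistic is then compared with the critical value of the F distribution with (k - 1, n - k) degrees of freedom. A large F suggests that at least one group mean differs.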
2. Data Collection
8. Deployment
Bayes' Theorem is a fundamental concept in probability theory that describes the relationship between
conditional probabilities. It allows us to update the probability of a hypothesis based on new evidence.
Statement of Bayes' Theorem
For two events A and B with P(B)>0, Bayes' Theorem is stated as:
P(A∣B)=P(B∣A)⋅P(A)/P(B)
Where:
• P(A∣B): The probability of event A occurring given that B has occurred (posterior probability).
• P(B∣A): The probability of event B occurring given that A has occurred (likelihood).
• P(A): The probability of event A occurring (prior probability).
• P(B): The probability of event B occurring (marginal probability).
P(A∣B) = P(A∩B) / P(B)   (1)
P(B∣A) = P(A∩B) / P(A)   (2)
P(A∩B) = P(B∣A)P(A)   (3)
Step 3: Substitute equation (3) into equation (1)
P(A∣B) = P(B∣A)P(A) / P(B)
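A small numeric sketch shows how the theorem updates a prior into a posterior. The events and all three input probabilities are assumed for illustration and are not part of the question paper.

```python
# Assumed illustrative numbers (not from the question paper):
# A = "email is spam", B = "email contains the word 'offer'".
p_a = 0.20              # prior P(A)
p_b_given_a = 0.60      # likelihood P(B|A)
p_b_given_not_a = 0.10  # P(B|not A)

# Marginal P(B) via the law of total probability.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_a_given_b = p_b_given_a * p_a / p_b

print(round(p_b, 2), round(p_a_given_b, 2))
```

Under these assumed numbers, observing the evidence B raises the probability of A from the prior 0.20 to a posterior of 0.60.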
11. The owner of Maumee Ford-solvo wants to study the relationship between the age of a car and its
selling price. Listed below is a random sample of 10 used cars sold at the dealership during last year.
Age (years)           9    7    11   12   8    7     8    11   10   12
Selling Price ($000)  8.1  6.0  3.6  4.0  5.0  10.0  7.8  8.6  8.0  6.0
Calculate the correlation coefficient between car's age and its sale price.
Step 2: Calculate the Deviations
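The deviation-based calculation of the correlation coefficient can be sketched as follows, using the ten (age, price) pairs exactly as listed in the question. The variable names are illustrative.

```python
from math import sqrt

age = [9, 7, 11, 12, 8, 7, 8, 11, 10, 12]
price = [8.1, 6.0, 3.6, 4.0, 5.0, 10.0, 7.8, 8.6, 8.0, 6.0]  # in $000

n = len(age)
mean_age = sum(age) / n
mean_price = sum(price) / n

# Deviations of each observation from its mean.
dev_age = [x - mean_age for x in age]
dev_price = [y - mean_price for y in price]

# Sums of squared deviations and of cross-products.
s_xx = sum(dx * dx for dx in dev_age)
s_yy = sum(dy * dy for dy in dev_price)
s_xy = sum(dx * dy for dx, dy in zip(dev_age, dev_price))

# Pearson correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))
```

With the data as printed here, the means are 9.5 years and 6.71 ($000), and r comes out at roughly -0.43, a moderate negative relationship: older cars tend to sell for less.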
Advantages:
1. User-Friendly Interface: Power BI has a visually appealing and intuitive interface that makes it
accessible for users with varying levels of technical expertise.
2. Integration with Microsoft Products: It seamlessly integrates with other Microsoft tools like Excel,
Azure, and SharePoint, making it easier for organizations already using Microsoft services.
3. Data Connectivity: Power BI supports a wide range of data sources, including databases, cloud
services, and spreadsheets, allowing users to consolidate data from multiple sources.
4. Real-Time Data Access: Users can access real-time data, enabling timely decision-making and
insights.
5. Customizable Dashboards and Reports: Users can create personalized dashboards and reports that
can be easily shared across the organization.
6. Advanced Analytics: It offers powerful analytics capabilities, including DAX (Data Analysis
Expressions) for complex calculations and machine learning integrations.
7. Collaboration Features: Power BI facilitates collaboration through sharing options and integration
with Microsoft Teams, enhancing teamwork and communication.
8. Cost-Effective: For small to medium-sized businesses, Power BI can be a cost-effective solution
compared to other analytics tools.
SECTION-C
III. Answer any four questions. Each question carries eight marks. (4x8=32)
13. With an example explain the different types of analytics.
1. Descriptive Analytics
Purpose:
Example:
• Insights:
o Total sales for the year: $5 million.
o Top-performing product: Sneakers (20% of total sales).
o Sales distribution by region: East Coast accounted for 40%.
Tools: Dashboards, reports, and data visualization tools like Tableau or Power BI.
2. Diagnostic Analytics
Purpose:
Example:
• Analysis:
o Found that a competitor launched a similar product at a lower price.
o Marketing campaigns had lower engagement due to inadequate targeting.
3. Predictive Analytics
Purpose:
Example:
• Insights:
o Sales for Q4 are expected to grow by 10% due to holiday promotions.
o Customers who purchased electronics are 70% likely to buy accessories.
4. Prescriptive Analytics
Purpose:
Example:
• Recommendation:
o Implement Route A to reduce fuel costs by 15%.
o Schedule deliveries during non-peak hours to save time.
14. With a case study, explain how analytics has helped the food industry to improve its business.
Background:
Domino’s Pizza, one of the world’s largest pizza delivery chains, faced challenges in delivery efficiency,
customer satisfaction, and predicting demand. By leveraging analytics, Domino’s transformed its
operations and marketing strategies to achieve significant growth.
Implementation of Analytics:
4. Customer-Centric Insights
Results:
Conclusion:
Domino’s Pizza effectively used analytics to address critical business challenges, resulting in improved
operational efficiency, customer satisfaction, and profitability. The case highlights the transformative
power of analytics in the food industry.
15. Define regression. Find the two regression equations for the data of 10 students in two subjects
given below
English 75 80 93 65 87 71 98 68 89 77
Economics 82 78 86 72 91 80 95 72 89 74
Regression is a statistical method used to model and analyze the relationship between two or more variables. It
helps determine the equation that best describes how a dependent variable (e.g., Economics scores) changes
in response to changes in an independent variable (e.g., English scores).
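The two regression equations for the given marks can be worked out from the deviation sums, as in the sketch below. English is taken as X and Economics as Y, and the variable names are illustrative.

```python
english = [75, 80, 93, 65, 87, 71, 98, 68, 89, 77]      # X
economics = [82, 78, 86, 72, 91, 80, 95, 72, 89, 74]    # Y

n = len(english)
mean_x = sum(english) / n
mean_y = sum(economics) / n

# Sums of squared deviations and of cross-products.
s_xx = sum((x - mean_x) ** 2 for x in english)
s_yy = sum((y - mean_y) ** 2 for y in economics)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(english, economics))

# Regression of Y on X: Y - mean_y = b_yx * (X - mean_x)
b_yx = s_xy / s_xx
a_yx = mean_y - b_yx * mean_x

# Regression of X on Y: X - mean_x = b_xy * (Y - mean_y)
b_xy = s_xy / s_yy
a_xy = mean_x - b_xy * mean_y

print(f"Y on X: Y = {a_yx:.3f} + {b_yx:.4f} X")
print(f"X on Y: X = {a_xy:.3f} + {b_xy:.4f} Y")
```

For this data the means are 80.3 (English) and 81.9 (Economics), and the regression coefficients are approximately b_yx = 0.655 and b_xy = 1.209. Their product, about 0.79, is the squared correlation coefficient.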
16.
a) What are the various types of refresh options provided in Power BI? (3+5)
Power BI offers multiple data refresh options to ensure reports and dashboards display up-to-date
information. The main types are:
1. Manual Refresh:
o Triggered by the user in the Power BI service or desktop.
o Suitable for ad-hoc data updates.
2. Scheduled Refresh:
o Automatically refreshes data at predefined intervals (e.g., daily, hourly).
o Configured in the Power BI service and requires a Power BI Gateway for on-premises data.
3. Real-Time (Automatic) Refresh:
o Allows data to update in real-time by connecting to streaming datasets.
o Suitable for scenarios like live dashboards showing stock prices or IoT data.
4. On-Demand Refresh:
o Triggered using APIs for specific use cases, such as programmatically refreshing datasets when certain
events occur.
5. DirectQuery or Live Connection:
o Data is queried directly from the source in real time, so no scheduled refresh is required.
o Useful for large datasets that cannot be imported into Power BI.
Power BI is composed of several key components that work together to create a complete data analytics
solution:
1. Datasets:
o Collections of data imported or connected to Power BI from sources like SQL, Excel, or APIs.
o Example: Sales data for a year.
2. Reports:
o A collection of visualizations, such as charts, tables, and graphs, displayed on multiple pages.
o Example: A report showing monthly sales trends, top products, and customer demographics.
3. Dashboards:
o A single-page, real-time view of key metrics and insights, pulling data from multiple reports.
o Example: An executive dashboard summarizing company performance.
4. Visualizations:
o Graphical representations of data like bar charts, line charts, and pie charts.
o Example: A bar chart showing product-wise revenue distribution.
5. Tiles:
o A single visualization in a dashboard, pinned from a report or dataset.
o Example: A KPI tile showing total sales.
6. Power BI Service:
o A cloud-based platform where users can share, collaborate, and publish reports and dashboards.
o Example: Sharing a sales performance report with a team.
7. Power BI Desktop:
o A Windows application for creating reports and models.
o Example: Building a sales forecasting model.
8. Dataflows:
o Used for data preparation and transformation within the Power BI service.
o Example: Creating reusable datasets for customer analytics.
17.
a) What is the purpose of COUNT, COUNTA, COUNTBLANK and COUNTIF in Excel? (4+4)
• COUNT:
• COUNTA:
• Purpose: Counts the number of non-blank cells, regardless of data type (numbers, text, or formulas).
• Example: =COUNTA(A1:A10) counts cells with any data (text, numbers, or logical values).
• COUNTBLANK:
• COUNTIF:
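The general behaviour of the four functions can be imitated in Python to show how they differ. This is only a sketch: the list cells is a hypothetical worksheet range, None stands for an empty cell, and the condition ">5" is an assumed criterion.

```python
# Hypothetical worksheet range; None stands for an empty cell.
cells = [10, "apple", None, 3.5, "", True, 42, None, "pear", 7]

def is_number(value):
    """True for int/float cells; booleans are excluded, as COUNT ignores them in a range."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# Like =COUNT(range): cells that contain numbers.
count = sum(1 for c in cells if is_number(c))

# Like =COUNTA(range): cells that are not empty (any data type).
counta = sum(1 for c in cells if c is not None)

# Like =COUNTBLANK(range): empty cells, including empty-text results.
countblank = sum(1 for c in cells if c is None or c == "")

# Like =COUNTIF(range, ">5"): cells that meet a condition.
countif_gt_5 = sum(1 for c in cells if is_number(c) and c > 5)

print(count, counta, countblank, countif_gt_5)
```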
18.
a) Differentiate between Dashboard and Reports. (4+4)
b) Explain the different visualisation techniques used for spatial data.
Spatial Data:
Map Visualization
Description: A simple geographical map visualization where data points are plotted using latitude and
longitude coordinates or by geographic regions (like country, state, or city).
Use Cases: Displaying precise locations of points, such as customer addresses, store locations, or
event locations.
Filled map
Description: A choropleth map that fills geographical areas (e.g., countries, states, or districts) with
colors based on the values in your dataset.
Use Cases: Visualizing data intensity or value distribution across predefined geographical regions.
Shape map
Description: A custom visual in Power BI that allows you to create custom shapes or regions and
assign data to them, similar to a filled map but not restricted to geographical locations.
Use Cases: Floor plans (e.g., visualizing sales or occupancy in different parts of a retail store or
building).
Bubble map
Description: A map visualization where bubbles (circles) are placed on specific geographic locations,
with the size of the bubble representing the magnitude of a particular value (e.g., population, sales, or
revenue).
Use Cases: Displaying distribution or intensity of a value across different locations.
Scatter plot
Description: A scatter plot can visualize geographical data when latitude and longitude are plotted on
a Cartesian plane, but more commonly it is used to represent the relationship between two numeric
values.
Use Cases: Great for identifying correlations or relationships between two data points, such as
comparing sales vs. customer satisfaction in different regions.
Tree maps
Description: A tree map visualizes hierarchical data as a set of nested rectangles. Each rectangle's
size is proportional to a value, making it easy to compare parts of a whole.
Use Cases: While not inherently geographical, tree maps can be used to represent hierarchies of
regions (e.g., country > state > city) and compare metrics within those regions.
Custom visuals
Description: Power BI allows you to integrate a variety of custom visuals designed for more advanced
or specialized spatial data visualization. Some custom visuals cater specifically to geographical data.
Use Cases: Mapbox: Advanced custom maps, including 3D maps, heat maps, contour maps, and
satellite imagery.