Artificial Intelligence & BA - Practicals Assignments

Uploaded by

girijamma.ha
Copyright: © All Rights Reserved

Job Role - AI-Business Intelligence Analyst

Practical Questions and Assignments

Module 1: AI-Business Intelligence Analyst - An Introduction


1. Explain the Evolution of AI and problems that can be solved by AI.
2. What is Big Data, and why is it important for businesses?
3. What are the 7 V's of Big Data?
4. Differentiate between structured, semi-structured, and unstructured data.
5. How has AI improved healthcare outcomes in recent years?
6. How can AI and Big Data contribute to solving environmental issues?
7. What is image processing, and how is it used in social media apps?
8. Explain how computer vision helps in self-driving cars.
9. What is robotics, and how are robots used in modern manufacturing?
10. How does Natural Language Processing (NLP) help virtual assistants understand spoken commands?
11. Can you give an example of how AI in healthcare uses image processing for diagnosis?
12. How do AI-driven recommendations enhance user experiences in online platforms?

Module 2: Basic Statistical Concepts


Problem 1:

A small company recorded the monthly sales figures (in thousands of dollars) for the past 10 months. The sales
figures are as follows:

22, 30, 25, 30, 28, 32, 29, 30, 26, 27

Tasks:

1. Calculate the Mean: Find the average monthly sales.

2. Determine the Median: Identify the middle value of the monthly sales data when arranged in ascending
order.

3. Find the Mode: Determine the most frequently occurring sales figure.

4. Interpret the Results: Based on your calculations, which measure of central tendency provides the best
insight into the company's typical monthly sales?
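
A minimal Python sketch of the three calculations in Problem 1, using only the standard-library `statistics` module. The variable names are illustrative and not part of the assignment.

```python
import statistics

# Monthly sales (in thousands of dollars) from Problem 1
sales = [22, 30, 25, 30, 28, 32, 29, 30, 26, 27]

mean_sales = statistics.mean(sales)      # arithmetic average
median_sales = statistics.median(sales)  # middle of the sorted values
mode_sales = statistics.mode(sales)      # most frequently occurring value

print(mean_sales, median_sales, mode_sales)
```

For this data the mean is 27.9, the median is 28.5, and the mode is 30. The mean and median are close, which suggests the data are not strongly skewed, so either is a reasonable summary of typical monthly sales.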

For Private Circulation Only


Problem 2:

A researcher collected data on the number of hours studied and the corresponding test scores for a sample of 8
students. The data is as follows:

Hours Studied    Test Score
2                55
3                60
5                70
4                65
6                75
7                80
8                85
9                90

Tasks:

1. Calculate Pearson’s Correlation Coefficient: Determine the strength and direction of the linear relationship
between hours studied and test scores.

2. Perform Regression Analysis Using the Method of Least Squares:

o Find the equation of the best-fit line in the form Y = a + bX, where Y is the test
score and X is the number of hours studied.

o Interpret the slope and intercept of the line.
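
One way to check a hand calculation for Problem 2 is to compute the sums of squares directly. The sketch below uses plain Python; the names `sxx`, `syy`, and `sxy` are illustrative.

```python
from math import sqrt

hours = [2, 3, 5, 4, 6, 7, 8, 9]
scores = [55, 60, 70, 65, 75, 80, 85, 90]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))

r = sxy / sqrt(sxx * syy)   # Pearson's correlation coefficient
b = sxy / sxx               # least-squares slope
a = mean_y - b * mean_x     # least-squares intercept

print(f"r = {r:.3f}, Y = {a:.1f} + {b:.1f}X")
```

The eight points lie exactly on the line Y = 45 + 5X, so r = 1. The slope says each additional hour of study is associated with 5 more points, and the intercept of 45 is the predicted score at zero hours.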

Problem 3:

A data analyst is working with a dataset that contains information about the number of hours studied and the
corresponding exam scores of students. The dataset includes the following columns:

• Hours Studied (continuous variable)


• Exam Score (continuous variable, for linear regression)
• Passed Exam (binary variable, 1 if the student passed and 0 if failed, for logistic regression)
The dataset is as follows:



Hours Studied    Exam Score    Passed Exam
2                55            0
3                60            0
5                70            1
4                65            1
6                75            1
7                80            1
8                85            1
9                90            1

Tasks:

1. Linear Regression: Fit a linear regression model to predict Exam Score based on Hours Studied. Provide the
regression equation and interpret the coefficients.

2. Logistic Regression: Fit a logistic regression model to predict the probability of passing the exam based on
Hours Studied. Provide the logistic regression equation and interpret the coefficients.

3. Ridge Regression: Apply ridge regression to the same linear regression problem. Explain how ridge
regression modifies the standard linear regression model.

4. Lasso Regression: Apply lasso regression to the same linear regression problem. Discuss how
lasso regression impacts feature selection compared to ridge regression.
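
For tasks 1, 3, and 4, the effect of the two penalties can be seen without any library, because with a single predictor both have closed-form slopes. The sketch below is an illustration under stated assumptions: the penalty strength `lam` is a hypothetical value, the intercept is left unpenalized, and the objectives are the ones written in the comments. Logistic regression (task 2) is not covered here.

```python
hours = [2, 3, 5, 4, 6, 7, 8, 9]
scores = [55, 60, 70, 65, 75, 80, 85, 90]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n
sxx = sum((x - mean_x) ** 2 for x in hours)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))

lam = 1.0  # hypothetical penalty strength, chosen only for illustration

b_ols = sxy / sxx              # ordinary least squares: minimizes SSE
b_ridge = sxy / (sxx + lam)    # ridge: minimizes SSE + lam * b**2
# lasso: minimizes SSE / 2 + lam * |b|, which soft-thresholds sxy
shrunk = max(abs(sxy) - lam, 0.0)
b_lasso = (1 if sxy >= 0 else -1) * shrunk / sxx

for name, b in [("OLS", b_ols), ("ridge", b_ridge), ("lasso", b_lasso)]:
    a = mean_y - b * mean_x    # the intercept is not penalized
    print(f"{name}: Score = {a:.3f} + {b:.3f} * Hours")
```

Both penalties pull the slope below the least-squares value of 5. Ridge shrinks the slope but never makes it exactly zero, whereas the lasso slope becomes exactly zero once `lam` reaches `abs(sxy)`, which is why lasso can drop features. Libraries such as Scikit-learn scale the penalty term differently, so their coefficients will not match these numbers for the same nominal penalty.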

Problem 4:

**One-Sample t-Test:**

- A company claims that their light bulbs last an average of 1000 hours. A sample of 25 light bulbs has a mean
lifespan of 980 hours with a standard deviation of 50 hours. Test the claim at a 0.05 significance level. What
are the null and alternative hypotheses, and what is the conclusion?
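
A short sketch of the test statistic for Problem 4, assuming a two-sided alternative (H0: mean = 1000, H1: mean ≠ 1000). The critical value is hard-coded from a standard t-table rather than computed.

```python
from math import sqrt

mu0 = 1000    # claimed mean lifespan (hours)
x_bar = 980   # sample mean
s = 50        # sample standard deviation
n = 25        # sample size

t_stat = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Two-sided critical value t(0.025, 24), taken from a standard t-table
t_crit = 2.064

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Here t = -2.0, and |t| is just below 2.064, so the two-sided test does not reject the claim at the 0.05 level. A one-sided alternative (mean < 1000) uses the critical value 1.711 and would reject, so the choice of alternative should be stated in the answer.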

Problem 5:

**Two-Sample t-Test:**

- Two different teaching methods are tested for their effectiveness on student performance. Group A (n=30) has
a mean test score of 75 with a standard deviation of 8, and Group B (n=30) has a mean test score of 70 with a
standard deviation of 10. Test if there is a significant difference between the two groups' test scores at a 0.01
significance level. What are the null and alternative hypotheses, and what is the conclusion?
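
A sketch of the pooled two-sample t statistic for Problem 5, assuming equal population variances and a two-sided alternative (H0: the group means are equal). The critical value is an approximate table value.

```python
from math import sqrt

n_a, mean_a, sd_a = 30, 75, 8    # Group A
n_b, mean_b, sd_b = 30, 70, 10   # Group B

# Pooled variance (assumes the two populations share a common variance)
sp2 = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
se = sqrt(sp2 * (1 / n_a + 1 / n_b))
t_stat = (mean_a - mean_b) / se
df = n_a + n_b - 2

# Approximate two-sided critical value t(0.005, 58) from a t-table
t_crit = 2.663

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is about 2.14, which is below the critical value, so under these assumptions the difference is not significant at the 0.01 level.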



Problem 6:

**Chi-Square Test for Independence:**

- A survey is conducted to examine the relationship between gender (male, female) and preference for a new
product (like, dislike). The following results are obtained:

Gender/Product    Like    Dislike
Male              30      10
Female            25      15

Test whether gender and product preference are independent at a 0.05 significance level. What are the null and
alternative hypotheses, and what is the conclusion?
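
The chi-square statistic for Problem 6 can be sketched in plain Python as follows. This version does not apply the Yates continuity correction, and the critical value is hard-coded from a chi-square table.

```python
observed = [[30, 10],   # Male:   like, dislike
            [25, 15]]   # Female: like, dislike

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count under independence
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi2_crit = 3.841  # chi-square critical value for df = 1 at alpha = 0.05

reject_h0 = chi2 > chi2_crit
print(f"chi2 = {chi2:.4f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is about 1.45 with 1 degree of freedom, below 3.841, so the data do not provide evidence against independence of gender and product preference at the 0.05 level.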

Problem 7:

**ANOVA (Analysis of Variance):**

- A researcher wants to compare the effectiveness of three different diets on weight loss. The weight loss (in
pounds) after 6 weeks for each diet group is as follows:

- Diet A: [3, 4, 2, 5, 4]

- Diet B: [6, 7, 5, 6, 7]

- Diet C: [8, 9, 8, 10, 9]

Perform an ANOVA test at a 0.05 significance level to determine if there are significant differences in weight
loss between the three diet groups. What are the null and alternative hypotheses, and what is the conclusion?
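
A minimal one-way ANOVA sketch for Problem 7, computing the between-group and within-group sums of squares by hand. The critical value is hard-coded from an F-table.

```python
groups = {
    "A": [3, 4, 2, 5, 4],
    "B": [6, 7, 5, 6, 7],
    "C": [8, 9, 8, 10, 9],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in groups.values())
ss_within = sum((v - sum(g) / len(g)) ** 2
                for g in groups.values() for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

f_crit = 3.885  # F(0.05; 2, 12) from an F-table
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.3f}, df = ({df_between}, {df_within}), reject H0: {reject_h0}")
```

F is about 37.56, far above the critical value, so the null hypothesis of equal mean weight loss across the three diets is rejected at the 0.05 level.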

Problem 8:

**Paired Sample t-Test:**

- A health study measures blood pressure before and after a treatment on 12 subjects. The blood pressure
readings before treatment are:

[120, 115, 130, 125, 140, 135, 125, 130, 120, 110, 140, 125]

And the readings after treatment are:

[115, 110, 125, 120, 130, 125, 120, 125, 115, 105, 130, 120]



Test if there is a significant change in blood pressure due to the treatment at a 0.05 significance level. What
are the null and alternative hypotheses, and what is the conclusion?
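
A sketch of the paired t statistic for Problem 8, working on the before-minus-after differences and assuming a two-sided alternative (H0: mean difference = 0). The critical value is hard-coded from a t-table.

```python
from math import sqrt
from statistics import mean, stdev

before = [120, 115, 130, 125, 140, 135, 125, 130, 120, 110, 140, 125]
after = [115, 110, 125, 120, 130, 125, 120, 125, 115, 105, 130, 120]

# Per-subject change in blood pressure
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))  # stdev uses n - 1
df = n - 1

t_crit = 2.201  # two-sided critical value t(0.025, 11) from a t-table
reject_h0 = abs(t_stat) > t_crit
print(f"mean diff = {mean(diffs)}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The mean reduction is 6.25 and t is about 9.57, well above 2.201, so the change in blood pressure is significant at the 0.05 level.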

Problem 9:

**Z-Test for Proportions:**

- In a survey, 60 out of 200 respondents reported they prefer online shopping over in-store shopping. The
company claims that 30% of the population prefers online shopping. Test the company's claim at a 0.05
significance level. What are the null and alternative hypotheses, and what is the conclusion?
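
For Problem 9, a one-sample z-test for a proportion can be sketched with the standard-library `NormalDist` class, assuming a two-sided alternative (H0: p = 0.30).

```python
from math import sqrt
from statistics import NormalDist

x, n = 60, 200   # respondents preferring online shopping, sample size
p0 = 0.30        # claimed population proportion

p_hat = x / n
se = sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < 0.05
print(f"p_hat = {p_hat}, z = {z:.3f}, p-value = {p_value:.3f}")
```

The sample proportion equals the claimed 30% exactly, so z = 0 and the p-value is 1. The data give no reason to reject the company's claim.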

Problem 10:

**Regression Coefficient Significance:**

- In a simple linear regression analysis, the estimated regression equation is Y = 2 + 3X. The standard error of
the slope coefficient (3) is 0.5. Test the significance of the slope at a 0.05 significance level. What are the null
and alternative hypotheses, and what is the conclusion?
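
A brief sketch for Problem 10 (H0: slope = 0, H1: slope ≠ 0). The problem does not give the sample size, so the degrees of freedom are unknown; as an assumption, the code compares against the large-sample critical value 1.96.

```python
slope = 3.0
se_slope = 0.5

# Test statistic for H0: slope = 0
t_stat = slope / se_slope

# Sample size is not given, so n - 2 degrees of freedom cannot be computed.
# Assumption: use the large-sample (normal) critical value.
crit = 1.96

reject_h0 = abs(t_stat) > crit
print(f"t = {t_stat:.1f}, reject H0: {reject_h0}")
```

The statistic is 6.0. It also exceeds the exact two-sided t critical value for any sample with at least 2 degrees of freedom (4.303 at df = 2), so the slope is significant at the 0.05 level in any realistic case.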

Problem 11:

Normal Distribution

1. Question:

o The heights of adult women in a certain city are normally distributed with a mean of 65 inches and
a standard deviation of 3 inches. What is the probability that a randomly selected woman from this
city is taller than 68 inches?

Problem 12:

Poisson Distribution

A call center receives an average of 4 calls per hour. What is the probability that exactly 3 calls are received in a given
hour?

Problem 13:

Exponential Distribution

The lifespan of a certain type of battery is exponentially distributed with a mean lifespan of 500 hours. What is the
probability that a battery lasts more than 600 hours?



Problem 14:

Bernoulli Distribution

A factory produces light bulbs, and each light bulb has a 90% chance of passing the quality control test. What is the
probability that a single randomly selected light bulb passes the test?

Problem 15:

Binomial Distribution

In a factory, 5% of items are defective. If a quality inspector randomly selects 10 items, what is the probability that
exactly 2 of them are defective?

Problem 16:

Uniform Distribution

An employee’s work shift starts at a random time uniformly distributed between 9 AM and 5 PM. What is the
probability that the shift starts after 2 PM?
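
The six probabilities in Problems 11 to 16 each follow from a single formula. The sketch below evaluates them with the Python standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial
from statistics import NormalDist

# Problem 11: P(height > 68) for Normal(mean=65, sd=3)
p_normal = 1 - NormalDist(mu=65, sigma=3).cdf(68)

# Problem 12: P(X = 3) for Poisson with rate 4 calls per hour
p_poisson = exp(-4) * 4**3 / factorial(3)

# Problem 13: P(T > 600) for an exponential lifetime with mean 500 hours
p_exponential = exp(-600 / 500)

# Problem 14: P(pass) for a single Bernoulli trial
p_bernoulli = 0.90

# Problem 15: P(X = 2) for Binomial(n=10, p=0.05)
p_binomial = comb(10, 2) * 0.05**2 * 0.95**8

# Problem 16: start time uniform on [9, 17] (24-hour clock); P(start > 14)
p_uniform = (17 - 14) / (17 - 9)

for name, p in [("normal", p_normal), ("poisson", p_poisson),
                ("exponential", p_exponential), ("bernoulli", p_bernoulli),
                ("binomial", p_binomial), ("uniform", p_uniform)]:
    print(f"{name}: {p:.4f}")
```

Rounded to four decimals, the results are approximately 0.1587, 0.1954, 0.3012, 0.9000, 0.0746, and 0.3750.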

Module 3: Statistical Tools and Usage (Jupyter Notebook / R Programming)

Jupyter Notebook (Python)

1. **Basic Arithmetic and Functions:**


How do you calculate the mean of a list of numbers `[10, 20, 30, 40, 50]` in a Jupyter Notebook using Python? Write
the code to perform this calculation.

2. **Using Libraries:**

Import the `numpy` library in a Jupyter Notebook and create a NumPy array with the values `[1, 2, 3, 4, 5]`. How do
you find the standard deviation of this array using `numpy`?

3. **Data Analysis with Pandas:**

Load the following dataset into a Pandas DataFrame in a Jupyter Notebook:



data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35]}

Display the DataFrame and calculate the average age.

4. **Plotting with Matplotlib:**

Create a simple line plot using Matplotlib in a Jupyter Notebook.

Plot the points `(1, 2)`, `(2, 4)`, and `(3, 6)`.

5. **Basic Data Frame Operations:**

Given the following Pandas DataFrame:

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

What is the code to calculate the sum of each column?
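
A possible notebook cell covering Jupyter questions 1, 2, 3, and 5 above (the plotting question is left out). It assumes `numpy` and `pandas` are installed; the names `people` and `column_sums` are illustrative.

```python
import numpy as np
import pandas as pd

# 1. Mean of a plain Python list
numbers = [10, 20, 30, 40, 50]
mean_value = sum(numbers) / len(numbers)

# 2. Standard deviation of a NumPy array
arr = np.array([1, 2, 3, 4, 5])
std_population = np.std(arr)      # ddof=0 by default (population SD)
std_sample = np.std(arr, ddof=1)  # sample SD, divides by n - 1

# 3. Build a DataFrame and average one column
people = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                       'Age': [25, 30, 35]})
average_age = people['Age'].mean()

# 5. Sum of each column
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
column_sums = df.sum()

print(mean_value, std_population, std_sample, average_age)
print(people)
print(column_sums)
```

Note that `np.std` returns the population standard deviation unless `ddof=1` is passed, whereas R's `sd()` and pandas' `Series.std()` default to the sample version, so answers can differ between tools.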

R Programming

1. **Basic Arithmetic and Functions:**

How do you calculate the mean of a vector `c(10, 20, 30, 40, 50)` in R? Write the code to perform this calculation.

2. **Using Libraries:**

Install and load the `ggplot2` library in R. Write the code to create a simple scatter plot of `x = c(1, 2, 3)` and `y = c(2,
4, 6)` using `ggplot2`.

3. **Data Analysis with Data Frames:**

Create a data frame in R with the following data:



Name <- c("Alice", "Bob", "Charlie")
Age <- c(25, 30, 35)
df <- data.frame(Name, Age)

Display the data frame and calculate the average age.

4. **Plotting with Base R:**

Create a bar plot in R for the vector `c(5, 10, 15)` with the labels `c("A", "B", "C")`.

5. **Basic Data Frame Operations:**

Given the following data frame in R:

df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))

What is the code to calculate the sum of each column?

Data Visualisation using Python / R

1. **Bar Plot:**

Create a bar plot in Jupyter Notebook using Matplotlib for the following data:

categories = ['A', 'B', 'C']
values = [10, 15, 7]

Label the x-axis as "Categories" and the y-axis as "Values". What is the code to generate this bar plot?

2. **Histogram:**

Generate a histogram of the following data in Jupyter Notebook using Matplotlib:

data = [5, 7, 8, 5, 6, 7, 9, 10, 6, 7]

Use 5 bins and label the x-axis as "Value" and the y-axis as "Frequency". What is the code?

3. **Pie Chart:**

Create a pie chart in Jupyter Notebook using Matplotlib for the following data:

sizes = [20, 30, 50]
labels = ['A', 'B', 'C']

Add a title "Distribution of Categories". What is the code to generate this pie chart?
4. **Box Plot:**

Using the following data, create a box plot in Jupyter Notebook with Matplotlib:

data = [5, 7, 8, 6, 5, 7, 9, 10, 6, 7]

Label the y-axis as "Values". What is the code to generate this box plot?
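
The four charts above can be sketched together in one Matplotlib figure. Placing them in a 2x2 grid is a layout choice made here for brevity; each question could equally be answered with its own `plt.figure()`.

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# 1. Bar plot
axes[0, 0].bar(['A', 'B', 'C'], [10, 15, 7])
axes[0, 0].set_xlabel("Categories")
axes[0, 0].set_ylabel("Values")

# 2. Histogram with 5 bins
axes[0, 1].hist([5, 7, 8, 5, 6, 7, 9, 10, 6, 7], bins=5)
axes[0, 1].set_xlabel("Value")
axes[0, 1].set_ylabel("Frequency")

# 3. Pie chart with a title
axes[1, 0].pie([20, 30, 50], labels=['A', 'B', 'C'])
axes[1, 0].set_title("Distribution of Categories")

# 4. Box plot
axes[1, 1].boxplot([5, 7, 8, 6, 5, 7, 9, 10, 6, 7])
axes[1, 1].set_ylabel("Values")

fig.tight_layout()
# In Jupyter the figure renders when the cell finishes;
# in a plain script, call plt.show() to display it.
```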

Module 4: Business Requirements Analysis


Gathering Business Requirements (Solve any TWO)

1. Question:

o You are working on a new software project and need to gather requirements from various
stakeholders. What is one common method you could use to collect detailed business
requirements, and how does it help ensure that all stakeholder needs are considered?

2. Question:

o During a project kickoff meeting, you are using interviews to gather business requirements. What
are two key questions you might ask a stakeholder to understand their needs and expectations
from the project?

3. Question:

o You decide to use surveys to collect business requirements from a large group of stakeholders.
What is one advantage of using surveys over interviews for gathering requirements, and what is one
potential limitation?

Mapping Requirements to Team Capabilities (Solve any TWO)

1. Question:

o After gathering business requirements, you need to map these requirements to your delivery team's
capabilities. What is one approach you can use to assess whether your team has the necessary skills
to meet the requirements?

2. Question:

o You have identified a requirement that requires advanced data analytics capabilities, but your
delivery team is currently lacking expertise in this area. What is one approach you could take
to address this skills gap before proceeding with the project?



3. Question:

o When mapping requirements to team capabilities, you find that some requirements might be
beyond your team's current technical skills. What is one strategy you could use to handle
these requirements while keeping the project on track?

Module 5: Importing Data


1. For a project, you are tasked with capturing sales data from an online store and customer feedback from
social media. Outline a method for collecting these two types of data and explain any potential challenges
you might face.

2. A dataset is stored in an Excel file with multiple sheets, and another dataset is available in a cloud-based SQL
database. Describe how you would import these datasets into a Pandas DataFrame in Python and an R data
frame. Include any relevant libraries or functions.

3. You have data stored in a public CSV file available online and another dataset in a private SQL database.
Demonstrate the steps to import these datasets into a data frame in Python using Pandas and R using
readr. What code or functions would you use?

4. You are preparing data for analysis and need to organize and map metadata to understand the context
and structure of your data better. Explain the process of mapping metadata for a dataset that includes
columns like "Date", "Sales", and "Region". How would you document the metadata to support data
analysis?

5. You are performing data profiling on a dataset to assess its quality. The dataset contains columns such as
"Customer ID", "Purchase Amount", and "Transaction Date". Describe the steps you would take to
evaluate the quality of this data and identify any potential issues, such as missing values or inconsistencies.
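
The import steps in questions 2 and 3 and the profiling checks in question 5 can be sketched in Pandas as below. Everything in the example is a stand-in: the CSV text replaces a real online file, an in-memory SQLite table replaces the private SQL database, and the rows are hypothetical values invented only to trigger each check.

```python
import io
import sqlite3

import pandas as pd

# Stand-in for a public CSV; with a real file, pass its URL or path to pd.read_csv
csv_text = io.StringIO(
    "Customer ID,Purchase Amount,Transaction Date\n"
    "C001,120.50,2024-01-05\n"
    "C002,,2024-01-06\n"
    "C001,120.50,2024-01-05\n"
    "C003,75.00,not a date\n"
)
purchases = pd.read_csv(csv_text)

# Stand-in for a private SQL database: a hypothetical in-memory SQLite table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regions (customer_id TEXT, region TEXT)")
conn.executemany("INSERT INTO regions VALUES (?, ?)",
                 [("C001", "North"), ("C002", "South")])
regions = pd.read_sql_query("SELECT * FROM regions", conn)
conn.close()

# Basic profiling: missing values, duplicate rows, unparseable dates
missing_per_column = purchases.isna().sum()
duplicate_rows = int(purchases.duplicated().sum())
parsed_dates = pd.to_datetime(purchases["Transaction Date"], errors="coerce")
invalid_dates = int(parsed_dates.isna().sum())

print(missing_per_column)
print("duplicate rows:", duplicate_rows, "invalid dates:", invalid_dates)
```

For the Excel case in question 2, `pd.read_excel(path, sheet_name=None)` returns a dictionary with one DataFrame per sheet. A real cloud database would need a driver-specific connection in place of `sqlite3`.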

Module 6: Pre-processing Data

1. Analyzing Unprocessed Data for Anomalies

You have a DataFrame df with columns Name, Age, and Salary. The Age column should be numeric, but some entries
are text, and the Salary column contains missing values. Write a Python script using Pandas to identify:

• Any missing values in the Salary column.

• Incorrect data types in the Age column.

o What code would you use to discover these anomalies?



2. Cleaning Unprocessed Data

Given the DataFrame df with columns Age (which has some non-numeric values) and Salary (which has missing
values), apply the following cleaning steps:

• Convert the Age column to numeric values, coercing errors to NaN.

• Remove rows where the Salary column is missing.

o Write a Python script to perform these data cleaning tasks using Pandas. What code would you use?

3. Normalizing Datasets and Validation

You have a DataFrame df with columns Height (in cm) and Weight (in kg). To prepare the data for analysis, you
need to normalize these columns to a range between 0 and 1 using Min-Max scaling. After normalization, perform a
basic validation to ensure that the normalized values fall within the expected range.

o Write a Python script using Scikit-learn and Pandas to normalize the data and validate the
results. What code would you use?
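
The three pre-processing tasks above can be sketched as one script. The rows in both DataFrames are hypothetical sample values, and the small tolerance in the range check is an added safeguard against floating-point rounding.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical sample rows with the problems described above
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dia"],
    "Age": [25, "thirty", 35, "n/a"],
    "Salary": [50000, None, 62000, 58000],
})

# 1. Discover anomalies
print(df.dtypes)                                      # Age shows up as object
missing_salary = int(df["Salary"].isna().sum())
age_numeric = pd.to_numeric(df["Age"], errors="coerce")
bad_age_rows = df[age_numeric.isna() & df["Age"].notna()]

# 2. Clean: coerce Age to numeric, drop rows with missing Salary
df["Age"] = age_numeric
cleaned = df.dropna(subset=["Salary"])

# 3. Normalize Height and Weight to [0, 1] and validate the result
body = pd.DataFrame({"Height": [150, 160, 170, 180],
                     "Weight": [50, 60, 72, 90]})
scaled = pd.DataFrame(MinMaxScaler().fit_transform(body), columns=body.columns)
tol = 1e-9  # allow for floating-point rounding at the boundaries
in_range = bool(((scaled >= -tol) & (scaled <= 1 + tol)).all().all())

print(missing_salary, len(bad_age_rows), len(cleaned), in_range)
```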

Module 7: Exploring Data

Dimension Reduction Techniques

1. Question:

o You have a dataset df with multiple features. Apply Principal Component Analysis
(PCA) to reduce the dimensionality of the dataset to 2 principal components. Write a Python
script using Scikit-learn to perform PCA and display the variance explained by each principal
component. What code would you use?

2. Question:

o Using the same dataset df, apply Linear Discriminant Analysis (LDA) to reduce the dimensionality
of the dataset to 2 components. Assume that df has a target variable target for classification. Write
a Python script using Scikit-learn to perform LDA and project the data onto the 2 components.
What code would you use?

3. Question:

o You have a dataset df with non-negative values. Apply Non-negative Matrix
Factorization (NMF) to reduce the dimensionality of the dataset to 3 components. Write a Python
script using Scikit-learn to perform NMF and extract the resulting components. What code would
you use?
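
The three techniques above can be sketched side by side. The questions do not specify a dataset, so this example assumes Scikit-learn's bundled iris data as a stand-in for `df` and `target`; the NMF settings (`init`, `max_iter`, `random_state`) are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import NMF, PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in dataset: four non-negative numeric features, three classes
iris = load_iris(as_frame=True)
df = iris.data
target = iris.target

# 1. PCA to 2 components, with the share of variance each one explains
pca = PCA(n_components=2)
X_pca = pca.fit_transform(df)
print("Explained variance ratio:", pca.explained_variance_ratio_)

# 2. LDA to 2 components (supervised, so it needs the class labels)
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(df, target)

# 3. NMF to 3 components (requires non-negative input)
nmf = NMF(n_components=3, init="nndsvda", max_iter=1000, random_state=0)
W = nmf.fit_transform(df)   # sample weights, shape (n_samples, 3)
H = nmf.components_         # component loadings, shape (3, n_features)

print(X_pca.shape, X_lda.shape, W.shape, H.shape)
```

LDA can produce at most one fewer component than the number of classes, so two components is the maximum for a three-class target.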



Evaluating Correlations Using Graphical Techniques

4. Question:

o After applying dimension reduction techniques to your dataset, you want to evaluate the
correlations between different data points. Create a scatter plot of the first two principal
components obtained from PCA. Write a Python script using Matplotlib to create the scatter plot
and label the axes as "Principal Component 1" and "Principal Component 2". What code would
you use?

5. Question:

o Using the reduced dataset from PCA or LDA, perform clustering using K-means and visualize the
clusters in a scatter plot. Write a Python script using Scikit-learn and Matplotlib to apply K-means
clustering with 3 clusters and plot the results. What code would you use to create this scatter plot
with cluster centroids and labels?
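
Questions 4 and 5 can be sketched together: project the data with PCA, cluster the projection with K-means, and plot both. As before, the bundled iris data is an assumed stand-in for the unspecified dataset, and the styling values are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
X_pca = PCA(n_components=2).fit_transform(X)

# K-means with 3 clusters on the two principal components
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_pca)
centroids = kmeans.cluster_centers_

fig, ax = plt.subplots()
# Points colored by their assigned cluster
ax.scatter(X_pca[:, 0], X_pca[:, 1], c=labels, cmap="viridis", s=20)
# Cluster centroids drawn as large red crosses
ax.scatter(centroids[:, 0], centroids[:, 1], c="red", marker="X", s=200,
           label="Centroids")
ax.set_xlabel("Principal Component 1")
ax.set_ylabel("Principal Component 2")
ax.legend()
# In a plain script, call plt.show() to display the figure.
```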

Module 8: Creating Visualizations


Representing Outcomes through Visualizations

1. Question:

o Create a visualization to represent the results of a data analysis using Python. The dataset includes
columns for Category, Sales, and Profit. Use Matplotlib or Seaborn to create a bar chart that shows
the total sales and profit for each category. What code would you use to generate this
visualization?

2. Question:

o You need to present a sales performance dashboard using Tableau. Describe the steps you
would take to:

• Import a dataset into Tableau.

• Create a line chart to show the trend of sales over time.

• Add filters to allow users to view sales data by different regions.

o What are the key actions and settings you would use to build this dashboard?

Performing Version Control and Maintaining Reports

3. Question:

o You are managing multiple versions of a report and need to use version control. Explain how
you would use Git to:

• Initialize a repository for your report documents.

• Commit changes to a report after each update.


• Push the changes to a remote repository.

o What commands would you use to accomplish these tasks?

4. Question:

o You want to maintain your data analysis reports in a knowledge base for easy access and
collaboration. Describe the process of setting up a knowledge base using a platform like Confluence
or SharePoint. Include steps for:

• Creating a space or folder for your reports.

• Uploading and organizing your reports.

• Ensuring that team members can access and contribute to the reports.

o What key features would you use to manage and share the reports effectively?

Module 9: Manage and plan work requirements


In order to maintain an organized work area and apply effective time management principles, you need to manage
your daily tasks efficiently. Outline a strategy you would use to:

• Organize your workspace to maximize productivity.
• Prioritize your tasks and manage your time effectively.
• Use any tools or techniques to track and complete your tasks on time.

How would you measure the effectiveness of your time management and organization strategies?

Module 10: Communication and collaboration with colleagues


1. Question:

o You need to present a project update to your team using oral, written, and nonverbal
communication skills. Describe how you would:

• Prepare and deliver an effective oral presentation.

• Create a written report summarizing the project update.

• Use non-verbal communication techniques to enhance your message during the presentation.

o How would you ensure that your communication is clear and effectively conveys your thoughts and
ideas?



2. Question:

o As a team leader, you are responsible for demonstrating professional behavior and providing
effective mentorship to your team. Explain how you would:

• Model professional behavior in the workplace.
• Mentor a junior team member to help them improve their skills and performance.
• Foster a collaborative and supportive team environment.
o What specific actions would you take to ensure that your mentorship is effective and contributes to
the overall success of the team?

Module 11: Workplace data management


• You are tasked with analyzing customer data stored in a CRM database. Describe how you would:

o Perform rule-based analysis to extract meaningful insights from the data.
o Format the data into the required types/forms for analysis.
o Identify any anomalies in the data, such as missing or inconsistent entries.
o Evaluate the information and knowledge management systems used to store and retrieve the data.
o Apply information confidentiality guidelines to ensure the data is handled securely.

• Use the CRM database to record new information and extract existing customer information for your analysis.

• What steps and tools would you use to complete this task efficiently and securely?

Module 12: Relationship management at the workplace


Describe a scenario where a conflict arises between two business units within an organization. Outline two different
approaches to manage and resolve this conflict. Additionally, demonstrate two methods to build and maintain
healthy relationships across these business units.

Module 13: Client relationship management


Imagine you are the project manager for a new software development project.

1. Gathering Client Requirements:

o Describe two methods you would use to gather requirements from the client. Explain why you chose
these methods and how they would help ensure a thorough understanding of the client's needs.

2. Managing Client Expectations:

o Outline two approaches you would take to manage client expectations, including how you would
prioritize their requirements and set performance expectations. Provide examples of how these
approaches would be implemented in the project.

3. Effective Communication and Relationship Building:

o Demonstrate how you would maintain effective communication and build a good working
relationship with the client throughout the project. Provide two specific strategies or techniques you
would use and explain why they are important.



Module 14: Persuasive Communication
Question:

You are leading a team tasked with implementing a new project management software in your organization. Some
team members are resistant to change and skeptical about the new software's benefits.

Using Evidence to Support Arguments:

Present an argument in favor of the new software and provide at least two pieces of evidence to support your
argument. Explain why this evidence is compelling and how it addresses the team's concerns.

Framing Goals by Finding Common Ground:

Describe how you would frame the goal of implementing the new software in a way that finds common ground with
your skeptical team members. What shared objectives or benefits would you highlight to gain their support?

Applying Visual and Verbal Communication Techniques:

Demonstrate how you would use both visual and verbal communication techniques to influence your team's
perspectives and encourage them to embrace the new software. Provide specific examples of the techniques you
would use and explain their effectiveness.

Module 15: Inclusive and environmentally sustainable workplaces


You are responsible for improving sustainability practices and promoting inclusivity within your organization.

1. Segregation of Waste:

o Describe the steps you would take to practice the segregation of recyclable, non-recyclable, and
hazardous waste in your office. Provide specific examples of how you would ensure compliance
among employees.

2. Energy Resource Use Optimization and Conservation:

o Demonstrate two different methods you would implement to optimize and conserve energy
resources in the workplace. Explain how these methods contribute to overall energy efficiency and
sustainability.

3. Inclusive Communication:

o Demonstrate how you would communicate essential information in a manner that is inclusive of all
genders and sensitive to persons with disabilities (PwD). Provide specific examples of communication
techniques or tools you would use to ensure everyone is informed and included.

************************

