
INTERNSHIP REPORT

A report submitted in partial fulfillment of the requirements for the Award of


Degree of

BACHELOR OF ENGINEERING
in
COMPUTER ENGINEERING
by
Bhure Sangmeshwar Dattatray
Under Supervision of
Mr. Amol Pawar
Sumago Infotech Pvt. Ltd., Pune
(Duration: 26th December 2024 to 6th February 2025)

DEPARTMENT OF COMPUTER ENGINEERING

Jaihind College of Engineering, Kuran


A/p Kuran, Tal. Junnar, Dist. Pune - 410511, Maharashtra, India
2024-2025
DEPARTMENT OF COMPUTER ENGINEERING
Jaihind College of Engineering, Kuran
A/p Kuran, Tal. Junnar, Dist. Pune - 410511, Maharashtra, India

CERTIFICATE
............

This is to certify that the “Internship Report” submitted by Bhure Sangmeshwar Dattatray is the work done by the candidate and submitted during the 2024–2025 academic year, in partial fulfillment of the requirements for the award of the degree of BACHELOR OF ENGINEERING in COMPUTER ENGINEERING, at Sumago Infotech Pvt. Ltd., Pune.

Prof. Mande P.A. Prof. Mandlik S.Y.


Internship Guide Internship Coordinator
JCOE, Kuran. JCOE, Kuran.

Dr. A. A. Khatri Dr. D. J. Garkal


HOD Principal
JCOE, Kuran. JCOE, Kuran.
ACKNOWLEDGEMENT

First, I would like to thank Mr. Amol Pawar, HR Head of Sumago Infotech Pvt. Ltd., Pune, for giving me the opportunity to do an internship within the organization.
I would also like to thank all the people who worked along with me at Sumago Infotech Pvt. Ltd., Pune; with their patience and openness they created an enjoyable working environment.
It is indeed with a great sense of pleasure and an immense sense of gratitude that I acknowledge the help of these individuals.
I am highly indebted to Director Dr. Galhe Sir and Principal Dr. D. J. Garkal for the facilities provided to accomplish this internship.
I would like to thank my Head of the Department, Dr. A. A. Khatri, for his constructive criticism throughout my internship.
I would like to thank Prof. S. Y. Mandlik for the support and advice in getting and completing the internship in the above-said organization.
I am extremely grateful to my department staff members and friends who helped me in the successful completion of this internship.

Bhure Sangmeshwar Dattatray


ABSTRACT

This project focuses on the development of an interactive Drug Sales Report Dashboard using Microsoft Power BI, aimed at analyzing and visualizing pharmaceutical sales data. Traditional methods like spreadsheets or static reports often limit analytical depth and user engagement. This dashboard addresses those limitations by offering powerful visualizations, real-time filtering, and interactive elements that help stakeholders derive meaningful business insights.
The dashboard is structured into three core pages:
Sales Average Analysis

Total Revenue Overview

Country-wise Sales Distribution

It incorporates advanced features like slicers, custom buttons, and multiple visual charts (bar, pie, line, etc.), enabling users to explore drug sales trends across different markets and timeframes. This project exemplifies how Business Intelligence tools can transform raw sales data into actionable insights, benefiting pharma analysts, marketers, and strategic decision-makers.

Organisation Information:
Since its establishment, Sumago Infotech has constantly grown and expanded. With four years of extensive research and market exposure, we have earned a reputation for delivering top-quality solutions on time and within budget, resulting in long-term customer relationships. While designing and developing a product we take care of the client's requirements, but at the same time we suggest changes, if required, according to the latest technologies.
Smart Tech Software is a team of software developers working in the fields of application development, barcode solutions, web development, mobile apps, e-commerce, and educational project development.
We specialize in web design and software development. Whether you are looking to upgrade your website to be compatible with mobiles and tablets, or you do not yet have a website, we deliver the best solution for the best value for money.

Programs and opportunities:


At its core, SMART TECH SOFTWARE SOLUTION operates in three specific domains, namely Software Development, Website Design and Development, and Geographic Information Services. We also offer our services in building E-Commerce solutions and Search Engine Optimization (SEO). This ground-up approach helps us deliver not only the solution to our clients but also add value to their investments.

Methodologies:
We follow a structured methodology for our projects, which starts from designing the solution and goes through to the implementation phase. A well-planned project reduces the time to deliver the project and any additional ad-hoc costs to our clients; hence we dedicate the majority of our time to understanding our clients' business and gathering requirements. This ground-up approach helps us deliver not only the solution to our clients but also add value to your investments.

Key parts of the report:


Under each division we further provide specific industry solutions on focused domains with cutting-edge technologies.

Benefits of the Company/Institution through our report:


Under each division we further provide specific industry solutions on focused domains with cutting-edge technologies. We emphasize building relationships with our clients by delivering projects on time and within budget.

INDEX

Acknowledgement

Index

1 Learning Objectives/Internship Objectives

2 INTRODUCTION
  2.1 Module Description

3 SYSTEM ANALYSIS

4 SOFTWARE REQUIREMENTS SPECIFICATIONS

5 TECHNOLOGY

6 Mini Project
  6.1 Overview
  6.2 Product
  6.3 ProductView

7 SCREENSHOT
  7.1 Home Page
  7.2 Customer Page
  7.3 Trend Page
  7.4 Tooltip Page

8 CONCLUSION

BIBLIOGRAPHY
Chapter 1

Learning Objectives/Internship Objectives

• To understand the practical application of Python and Power BI in real-world data analytics.

• To emphasize the skills already possessed in this area and the interest in learning more.

• To learn how to clean, manipulate, and visualize datasets.

• To gain hands-on experience with BI tools and dashboards.

• To explore interactive reporting features using slicers, filters, and custom visuals.

• To use the internship to build a resume and develop skills that can be highlighted for future jobs. When applying for a training internship, emphasizing any special skills or talents that set an applicant apart improves the chance of landing the position.

Chapter 2

INTRODUCTION

The pharmaceutical industry produces a large volume of sales data from various countries
and drug categories. A meaningful analysis of this data can offer insights into market behavior,
drug performance, and regional growth trends. However, static methods of data reporting often
make it difficult to explore such data deeply.
The Drug Sales Report Dashboard was created using Power BI, which is a powerful Business
Intelligence (BI) tool by Microsoft. This dashboard enables users to:
- Monitor drug-wise and region-wise sales performance.
- Analyze average sales across months/years.
- Visualize country-wise contributions to total revenue.
- Filter and navigate using slicers and interactive buttons.

2.1 Module Description:

1. Sales Average Module

This module is designed around the concept of trend and pattern analysis. It focuses on the average sales of each drug across time (monthly/yearly) and region. Users can compare the performance of multiple drugs using:
- Bar and line charts for visualizing average sales trends.
- Slicers for filtering data by drug type, country, or time.
- Interactive KPIs that highlight changes over time.
This module helps in understanding product popularity and identifying underperforming drugs, enabling companies to tailor their strategies effectively.


2. Total Revenue Module

The revenue analysis module is based on financial analytics and KPI tracking. It highlights the overall revenue generated from various drugs, countries, and time periods. Key components include:
- Dynamic cards showing Total Revenue, Highest-Earning Drug, and Top-Selling Country.
- Stacked bar and column charts representing revenue distribution across categories.
- Trend visuals for identifying sales spikes and dips.
This module supports strategic planning by providing insights into the most profitable drugs and regions.

3. Country-wise Sales Module

This module incorporates geospatial analysis and regional performance metrics. It helps visualize how drug sales are distributed across different countries using:
- Map visuals (filled and bubble maps) showing revenue per country.
- Region-based comparisons using data tables and heat maps.
- Interactive slicers to filter by country, drug, or time.
It supports decision-making for global supply chain optimization and targeted marketing.

4. User Interaction and Navigation Module

Enhancing user experience and flexibility, this module integrates interactive elements such as:
- Navigation buttons for switching between pages.
- Slicers and filters for customizing views based on product, date, region, etc.
- Tooltips and drill-through options to provide deeper data exploration.
This layer turns the dashboard into an exploratory tool rather than a static report, promoting user-driven analysis.

5. Visual Design and Storytelling Module

Following best practices of business intelligence and data storytelling, this module ensures:
- Use of consistent color schemes and an intuitive layout.
- Visual hierarchy to emphasize important metrics.
- Charts that highlight trends, outliers, and key comparisons.
The goal is to communicate complex data in a clear, meaningful, and visually appealing way that aids interpretation and action.
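
The dashboard itself is built in Power BI, but the same aggregations can be prototyped in Python. The following is a minimal pandas sketch, illustrative only, assuming a hypothetical CSV export named drug_sales.csv with columns Drug, Country, Date, Units, and Revenue (these names are not from the original data model).

```python
import pandas as pd

# Hypothetical export of the sales data; column names are assumed for illustration.
sales = pd.read_csv("drug_sales.csv", parse_dates=["Date"])

# Page 1: average monthly units sold per drug (trend and pattern analysis).
monthly = sales.assign(Month=sales["Date"].dt.to_period("M"))
avg_sales = monthly.groupby(["Drug", "Month"])["Units"].mean()

# Page 2: total revenue and top performers (financial KPIs).
total_revenue = sales["Revenue"].sum()
top_drug = sales.groupby("Drug")["Revenue"].sum().idxmax()
top_country = sales.groupby("Country")["Revenue"].sum().idxmax()

# Page 3: country-wise revenue distribution (regional analysis).
by_country = sales.groupby("Country")["Revenue"].sum().sort_values(ascending=False)

print(total_revenue, top_drug, top_country)
print(by_country.head())
```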

Chapter 3

SYSTEM ANALYSIS

Requirement Analysis
Existing System:
Traditional sales reporting methods rely heavily on Excel sheets and static charts, which limit
real-time interactivity and exploration. Users are often unable to drill down into granular data
or view customized reports.

Disadvantages of the Existing Systems

1. Lack of dynamic interaction

2. Time-consuming manual updates

3. Difficult to identify patterns and trends

Proposed System

The proposed system addresses these issues through a modern, Power BI-based dashboard that:
- Provides interactive visual reports.
- Supports real-time data exploration.
- Enhances analytical depth via an intuitive UI and filters.
- Automates data updates, reducing manual reporting effort.

Chapter 4

SOFTWARE REQUIREMENTS SPECIFICATIONS

System configurations
The software requirements specification is produced at the culmination of the analysis task. The function and performance allocated to software as part of system engineering are refined by establishing a complete information description, a detailed functional description, an indication of performance and design constraints, appropriate validation criteria, and other information pertinent to the requirements.
Software Requirements:

• Operating system : Windows 10 Ultimate.

• Coding Language : PYTHON.

• Tools : Visual Studio Professional, Notepad++, Jupyter Notebook, Power BI

• Database : SQL

Hardware Requirement:

• System : Pentium IV 2.4 GHz.

• Hard Disk : 1TB.

• RAM : 4 GB.

Chapter 5

TECHNOLOGY

PYTHON
Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.
Python is dynamically type-checked and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. It is often described as a "batteries included" language due to its comprehensive standard library.
Python is a multi-paradigm programming language. Object-oriented programming and structured programming are fully supported, and many of their features support functional programming and aspect-oriented programming (including metaprogramming and metaobjects). Many other paradigms are supported via extensions, including design by contract and logic programming. Python is often referred to as a "glue language" because it can seamlessly integrate components written in other languages.
Python uses dynamic typing and a combination of reference counting and a cycle-detecting garbage collector for memory management. It uses dynamic name resolution (late binding), which binds method and variable names during program execution.
Its design offers some support for functional programming in the Lisp tradition. It has filter, map, and reduce functions; list comprehensions, dictionaries, sets, and generator expressions. The standard library has two modules (itertools and functools) that implement functional tools borrowed from Haskell and Standard ML.
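
As a small illustration (not taken from the original report), the snippet below exercises the functional-style features just mentioned on made-up monthly sales figures.

```python
from functools import reduce
from itertools import accumulate

monthly_sales = [120, 95, 180, 210, 160]

high_months = list(filter(lambda s: s > 150, monthly_sales))   # filter
doubled = list(map(lambda s: s * 2, monthly_sales))            # map
total = reduce(lambda a, b: a + b, monthly_sales)              # reduce (functools)
running = list(accumulate(monthly_sales))                      # itertools
big_squares = [s ** 2 for s in monthly_sales if s > 100]       # list comprehension

print(high_months, doubled, total, running, big_squares)
```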
Its core philosophy is summarized in the Zen of Python (PEP 20), which includes aphorisms
such as:

• Beautiful is better than ugly.

• Explicit is better than implicit.

• Simple is better than complex.

• Complex is better than complicated.

• Readability counts.

However, Python features have at times been criticized for violating these principles and adding unnecessary language bloat. The usual response is that the Zen of Python is a guideline rather than a rule. The addition of some new features has been so controversial that Guido van Rossum resigned as Benevolent Dictator for Life following the vitriol over the addition of the assignment expression operator in Python 3.8.
Python is meant to be an easily readable language. Its formatting is visually uncluttered, and it often uses English keywords where other languages use punctuation. Unlike many other languages, it does not use curly brackets to delimit blocks, and semicolons after statements are allowed but rarely used. It has fewer syntactic exceptions and special cases than C or Pascal.

Indentation
Python uses whitespace indentation, rather than curly brackets or keywords, to delimit blocks. An increase in indentation comes after certain statements; a decrease in indentation signifies the end of the current block. Thus, the program's visual structure accurately represents its semantic structure. This feature is sometimes termed the off-side rule. Some other languages use indentation this way, but in most, indentation has no semantic meaning. The recommended indent size is four spaces.
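
A tiny illustrative sketch of the off-side rule (the example values are made up):

```python
def classify(sales):
    if sales > 1000:       # the indented lines below belong to this if-branch
        label = "high"
    else:
        label = "low"
    return label           # dedenting ends the if/else block

print(classify(1500))      # -> high
```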

Libraries of Python
Python is the language that has gained preference in data analytics due to its simplicity, versatility, and a very powerful ecosystem of libraries. Whether you are dealing with large datasets, conducting statistical analysis, or visualizing insights, it has a very wide range of libraries to facilitate the process. From data manipulation using Pandas to the sophisticated application of machine learning through Scikit-learn, these libraries make the extraction of meaningful insights more efficient for analysts and data scientists. This section highlights the Python libraries used for data analytics, making the data-driven decision-making process that much easier.


Numpy Lib:
NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data.
An array in NumPy is a table of elements (usually numbers), all of the same type, indexed by a tuple of positive integers. In NumPy, the number of dimensions of an array is called the rank of the array, and the tuple of integers giving the size of the array along each dimension is known as its shape. The array class in NumPy is called ndarray. Elements in NumPy arrays are accessed using square brackets, and arrays can be initialized from nested Python lists.
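
A short illustrative example of these ideas (rank, shape, dtype, and indexing), using made-up values:

```python
import numpy as np

# A rank-2 ndarray initialized from nested Python lists.
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr.ndim)          # number of dimensions (rank) -> 2
print(arr.shape)         # size along each dimension -> (2, 3)
print(arr.dtype)         # all elements share one type
print(arr[1, 2])         # access with square brackets -> 6
print(arr.mean(axis=0))  # column-wise means -> [2.5 3.5 4.5]
```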

Pandas Lib:
Pandas is a powerful, open-source Python library designed for data manipulation and analysis. It was created by Wes McKinney in 2008 and is built on top of the NumPy library. Pandas is well suited for working with tabular data, such as spreadsheets or SQL tables, and is an essential tool for data analysts, scientists, and engineers.

Pandas provides two primary data structures:

1. Series: A one-dimensional labeled array capable of holding data of any type (integer,
string, float, Python objects, etc.). It is similar to a column in an Excel sheet.

2. DataFrame: A two-dimensional data structure with labeled axes (rows and columns), similar to a table in a database or an Excel sheet.
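
A brief illustrative example of both structures, using made-up values:

```python
import pandas as pd

# Series: a one-dimensional labeled array, like a single spreadsheet column.
revenue = pd.Series([250.0, 410.5, 130.0], index=["Jan", "Feb", "Mar"])

# DataFrame: a two-dimensional table with labeled rows and columns.
df = pd.DataFrame({
    "Drug": ["A", "B", "A"],
    "Country": ["India", "USA", "India"],
    "Revenue": [250.0, 410.5, 130.0],
})

print(revenue["Feb"])                       # label-based access -> 410.5
print(df.groupby("Drug")["Revenue"].sum())  # simple tabular aggregation
```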

Matplot Lib:

Matplotlib is a powerful and widely used data visualization library in Python. It allows users to create static, interactive, and animated visualizations. Built on top of NumPy, it is efficient for handling large datasets and provides a variety of plots such as line charts, bar charts, histograms, scatter plots, and more.
Pyplot is a module within Matplotlib that provides a MATLAB-like interface for making plots. It simplifies the process of adding plot elements such as lines, images, and text to the axes of the current figure.

Steps in Pyplot:

• Import Matplotlib: Start by importing matplotlib.pyplot as plt.

• Create Data: Prepare your data in the form of lists or arrays.

• Plot Data: Use plt.plot() to create the plot.

• Customize Plot: Add titles, labels, and other elements using methods like plt.title(),
plt.xlabel(), and plt.ylabel().

• Display Plot: Use plt.show() to display the plot, as shown in the sketch below.
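
A minimal sketch following these five steps, with made-up monthly sales figures:

```python
import matplotlib.pyplot as plt          # Step 1: import

months = ["Jan", "Feb", "Mar", "Apr"]    # Step 2: create data
sales = [120, 95, 180, 210]

plt.plot(months, sales, marker="o")      # Step 3: plot data

plt.title("Monthly Drug Sales")          # Step 4: customize
plt.xlabel("Month")
plt.ylabel("Units Sold")

plt.show()                               # Step 5: display
```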

EDA (Exploratory Data Analysis)

Exploratory Data Analysis (EDA) is an important first step in data science projects. It involves looking at and visualizing data to understand its main features, find patterns, and discover how different parts of the data are connected.
EDA helps to spot any unusual data or outliers and is usually done before starting more detailed statistical analysis or building models. This section discusses what Exploratory Data Analysis (EDA) is and the steps to perform it.
Why is Exploratory Data Analysis Important?
Exploratory Data Analysis (EDA) is important for several reasons, especially in the context of data science and statistical modeling. Here are some of the key reasons why EDA is a critical step in the data analysis process:
- It helps to understand the dataset, showing how many features there are, the type of data in each feature, and how the data is spread out, which helps in choosing the right methods for analysis.
- It helps to identify hidden patterns and relationships between different data points, which help us in model building.
- It allows you to spot errors or unusual data points (outliers) that could affect your results.
- Insights obtained from EDA help you decide which features are most important for building models and how to prepare them to improve performance.
- By understanding the data, EDA helps in choosing the best modeling techniques and adjusting them for better results.

Types of Exploratory Data Analysis
There are various sorts of EDA strategies based on the nature of the data. Depending on the number of columns we are analyzing, we can divide EDA into three types: univariate, bivariate, and multivariate.

1. Univariate Analysis

Univariate analysis focuses on studying one variable to understand its characteristics. It helps describe the data and find patterns within a single feature. Common methods include histograms to show data distribution, box plots to detect outliers and understand data spread, and bar charts for categorical data. Summary statistics like mean, median, mode, variance, and standard deviation help describe the central tendency and spread of the data.
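
For illustration, a univariate look at a made-up numeric feature could be sketched as follows:

```python
import pandas as pd
import matplotlib.pyplot as plt

s = pd.Series([23, 25, 27, 24, 26, 90, 25, 28])  # toy numeric feature

print(s.describe())          # count, mean, std, quartiles
print(s.mode().iloc[0])      # most frequent value
print(s.var(), s.std())      # spread

s.plot(kind="hist", bins=5, title="Distribution")            # histogram
plt.show()
s.plot(kind="box", title="Box plot flags 90 as an outlier")  # box plot
plt.show()
```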

2. Bivariate Analysis

Bivariate analysis focuses on exploring the relationship between two variables to find connections, correlations, and dependencies. It's an important part of exploratory data analysis that helps understand how two variables interact. Some key techniques used in bivariate analysis include scatter plots, which visualize the relationship between two continuous variables; the correlation coefficient, which measures how strongly two variables are related, commonly using Pearson's correlation for linear relationships; and cross-tabulation, or contingency tables, which show the frequency distribution of two categorical variables and help understand their relationship.
Line graphs are useful for comparing two variables over time, especially in time series data, to identify trends or patterns. Covariance measures how two variables change together, though it's often supplemented by the correlation coefficient for a clearer, more standardized view of the relationship.
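
A small illustrative sketch, using made-up figures, of a scatter plot and Pearson's correlation between two continuous variables:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "marketing_spend": [10, 20, 30, 40, 50],
    "units_sold":      [110, 190, 320, 370, 520],
})

# Pearson correlation coefficient between the two continuous variables.
r = df["marketing_spend"].corr(df["units_sold"], method="pearson")
print(f"Pearson r = {r:.2f}")

# Scatter plot to visualize the relationship.
df.plot.scatter(x="marketing_spend", y="units_sold")
plt.show()
```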

3. Multivariate Analysis

Multivariate analysis examines the relationships between two or more variables in the dataset. It aims to understand how variables interact with one another, which is crucial for most statistical modeling techniques. It includes techniques such as pair plots, which show the relationships between multiple variables at once, helping to see how they interact. Another technique is Principal Component Analysis (PCA), which reduces the complexity of large datasets by simplifying them while keeping the most important information.
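
As an illustrative sketch (on randomly generated data, not the project dataset), a pair-plot style view and PCA could look like this:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"])

# Pair-plot style view of all pairwise relationships.
pd.plotting.scatter_matrix(df, figsize=(6, 6))
plt.show()

# PCA: compress four features into two components that keep most of the variance.
pca = PCA(n_components=2)
components = pca.fit_transform(df)
print(pca.explained_variance_ratio_)   # share of variance kept by each component
```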

Steps for Performing Exploratory Data Analysis

Performing Exploratory Data Analysis (EDA) involves a series of steps designed to help you understand the data you are working with, uncover underlying patterns, identify anomalies, test hypotheses, and ensure the data is clean and suitable for further analysis.

Step 1: Understand the Problem and the Data

The first step in any data analysis project is to clearly understand the problem you’re
trying to solve and the data you have. This involves asking key questions such as:

1. What is the business goal or research question?

2. What are the variables in the data and what do they represent?

3. What types of data (numerical, categorical, text, etc.) do you have?

4. Are there any known data quality issues or limitations?

5. Are there any domain-specific concerns or restrictions?

By thoroughly understanding the problem and the data, you can better plan your analysis, avoid wrong assumptions, and ensure accurate conclusions.

Step 2: Import and Inspect the Data

After clearly understanding the problem and the data, the next step is to import the data
into your analysis environment (like Python, R, or a spreadsheet tool). At this stage, it’s
crucial to examine the data to get an initial understanding of its structure, variable types,
and potential issues.
Here's what you can do:
1. Load the data into your environment carefully to avoid errors or truncations.
2. Examine the size of the data (number of rows and columns) to understand its complexity.
3. Check for missing values and see how they are distributed across variables, since missing data can impact the quality of your analysis.
4. Identify data types for each variable (numerical, categorical, etc.), which will help in the next steps of data manipulation and analysis.
5. Look for errors or inconsistencies, such as invalid values, mismatched units, or outliers, which could signal deeper issues with the data.
By completing these tasks, you'll be prepared to clean and analyze the data more effectively.
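
A minimal illustrative sketch of this inspection step, assuming a hypothetical file named drug_sales.csv:

```python
import pandas as pd

df = pd.read_csv("drug_sales.csv")   # hypothetical file name

print(df.shape)          # number of rows and columns
print(df.dtypes)         # data type of each variable
print(df.head())         # first rows, to spot obvious inconsistencies
print(df.isna().sum())   # missing values per column
print(df.describe())     # quick check for invalid values and outliers
```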


Step 3: Handle Missing Data

Missing data is common in many datasets and can significantly affect the quality of your
analysis. During Exploratory Data Analysis (EDA), it’s important to identify and handle
missing data properly to avoid biased or misleading results.
Here’s how to handle it:
1. Understand the patterns and possible reasons for missing data. Is it missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR)? Knowing this helps decide how to handle the missing data.
2. Decide whether to remove missing data (listwise deletion) or impute (fill in) the missing values. Removing data can lead to biased outcomes, especially if the missing data isn't MCAR.
3. Imputing values helps preserve data but should be done carefully.
4. Use appropriate imputation methods like mean/median imputation, regression imputation, or machine learning techniques like KNN or decision trees, based on the data's characteristics.
5. Consider the impact of missing data. Even after imputing, missing data can cause uncertainty and bias, so interpret the results with caution.
6. Properly handling missing data improves the accuracy of your analysis and prevents misleading conclusions.
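
For illustration, a simple sketch of deletion versus imputation on a toy DataFrame (the column names are made up):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "price":  [10.0, np.nan, 12.5, 11.0],
    "region": ["EU", "US", None, "EU"],
})

dropped = df.dropna()                                        # listwise deletion
df["price"] = df["price"].fillna(df["price"].median())      # median imputation
df["region"] = df["region"].fillna(df["region"].mode()[0])  # mode for a categorical

print(dropped.shape)
print(df.isna().sum().sum())   # 0 missing values remain after imputation
```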

Step 4: Explore Data Characteristics

After addressing missing data, the next step in EDA is to explore the characteristics of your data by examining the distribution, central tendency, and variability of your variables, as well as identifying any outliers or anomalies. This helps in selecting appropriate analysis methods and spotting potential data issues. You should calculate summary statistics like mean, median, mode, standard deviation, skewness, and kurtosis for numerical variables. These provide an overview of the data's distribution and help identify any irregular patterns or issues.
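
A quick illustrative sketch of these summary statistics on a toy numerical variable:

```python
import pandas as pd

s = pd.Series([4, 5, 5, 6, 7, 8, 40])    # toy numerical variable

print(s.mean(), s.median(), s.mode()[0])  # central tendency
print(s.std())                            # spread
print(s.skew(), s.kurtosis())             # shape: skewness and kurtosis
```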


Step 5: Perform Data Transformation

Data transformation is an essential step in EDA because it prepares your data for accurate analysis and modeling. Depending on your data's characteristics and analysis needs, you may need to transform it to ensure it's in the right format. Common transformation techniques include:
1. Scaling or normalizing numerical variables (e.g., min-max scaling or standardization).
2. Encoding categorical variables for machine learning (e.g., one-hot encoding or label encoding).
3. Applying mathematical transformations (e.g., logarithmic or square root) to correct skewness or non-linearity.
4. Creating new variables from existing ones (e.g., calculating ratios or combining variables).
5. Aggregating or grouping data based on specific variables or conditions.
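
An illustrative sketch of the first three techniques on a toy DataFrame (column names are made up):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "revenue": [100.0, 250.0, 4000.0],
    "country": ["India", "USA", "India"],
})

# Min-max scaling of a numerical variable to the [0, 1] range.
df["revenue_scaled"] = (df["revenue"] - df["revenue"].min()) / (
    df["revenue"].max() - df["revenue"].min()
)

# One-hot encoding of a categorical variable.
df = pd.get_dummies(df, columns=["country"])

# Log transform to reduce skewness.
df["revenue_log"] = np.log1p(df["revenue"])

print(df)
```
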
Step 6: Visualize Data Relationship

Visualization is a powerful tool in the EDA process, helping to uncover relationships between variables and identify patterns or trends that may not be obvious from summary statistics alone. For categorical variables, create frequency tables, bar plots, and pie charts to understand the distribution of categories and identify imbalances or unusual patterns. For numerical variables, generate histograms, box plots, violin plots, and density plots to visualize distribution, shape, spread, and potential outliers. To explore relationships between variables, use scatter plots, correlation matrices, or statistical tests like Pearson's correlation coefficient or Spearman's rank correlation.
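
A small illustrative sketch of a correlation matrix across several made-up numeric variables:

```python
import pandas as pd

df = pd.DataFrame({
    "price": [10, 12, 9, 15, 11],
    "units": [200, 180, 220, 150, 190],
    "ads":   [5, 7, 4, 9, 6],
})

print(df.corr())                     # Pearson correlation matrix (default)
print(df.corr(method="spearman"))    # Spearman rank correlation
```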

Step 7: Handling Outliers

Outliers are data points that significantly differ from the rest of the data, often caused
by errors in measurement or data entry. Detecting and handling outliers is important
because they can skew your analysis and affect model performance. You can identify
outliers using methods like interquartile range (IQR), Z-scores, or domain-specific rules.
Once identified, outliers can be removed or adjusted depending on the context. Properly
managing outliers ensures your analysis is accurate and reliable.
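
For illustration, the IQR rule mentioned above could be sketched as follows on a toy series:

```python
import pandas as pd

s = pd.Series([21, 23, 22, 25, 24, 120])   # 120 looks suspicious

q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = s[(s < lower) | (s > upper)]
cleaned = s[(s >= lower) & (s <= upper)]   # or cap/adjust instead of dropping
print(outliers.tolist(), cleaned.tolist())
```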


Step 8: Communicate Findings and Insights

The final step in EDA is to communicate your findings clearly. This involves summarizing your analysis, pointing out key discoveries, and presenting your results in a clear and engaging way. Clearly state the goals and scope of your analysis. Provide context and background to help others understand your approach. Use visualizations to support your findings and make them easier to understand. Highlight key insights, patterns, or anomalies discovered. Mention any limitations or challenges faced during the analysis. Suggest next steps or areas that need further investigation.

[Figure: Steps for Performing Exploratory Data Analysis]
Chapter 6

Mini Project

AMAZON DASHBOARD IN POWER BI

6.1 Overview


6.2 Product

6.3 ProductView

Chapter 7

SCREENSHOT

7.1 Home Page


7.2 Customer Page

7.3 Trend Page


7.4 Tooltip Page

Chapter 8

CONCLUSION

The development of the Drug Sales Report Dashboard using Microsoft Power BI represents a major evolution from traditional, static sales reporting to a modern, dynamic, and interactive business intelligence solution. By harnessing Power BI's powerful data visualization capabilities, the dashboard offers real-time insights, customizable views, and deep analytical exploration of pharmaceutical sales data across various regions and timeframes.
This system addresses the key limitations of conventional reporting methods by enhancing data accessibility, automating performance tracking, and simplifying complex datasets through intuitive visual storytelling. It enables users, from business analysts and pharmaceutical sales managers to marketing strategists, to explore, filter, and interpret sales data with ease. With features such as slicers, drill-throughs, interactive charts, and KPIs, the dashboard improves user engagement and supports effective decision-making. It not only helps identify trends and outliers but also empowers stakeholders to act on data-driven insights for improving drug performance, regional sales strategy, and business growth.
Overall, the Drug Sales Report Dashboard demonstrates how modern BI tools like Power BI can transform raw sales data into actionable knowledge, enhancing operational efficiency, competitive intelligence, and strategic planning in the pharmaceutical sector.

Bibliography

• Kaggle Open Datasets for Sales and Marketing. https://www.kaggle.com/datasets. Accessed 10 Apr. 2025.

• Few, Stephen. Now You See It: Simple Visualization Techniques for Quantitative Analysis. Analytics Press, 2009.

• McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, 2017.

• Microsoft Power BI Documentation. Microsoft Docs. https://learn.microsoft.com/en-us/power-bi/. Accessed 10 Apr. 2025.

• Python Official Documentation. Python Software Foundation. https://docs.python.org/3/. Accessed 10 Apr. 2025.
