
SUMMER TRAINING PROJECT REPORT

ON
“DATA VISUALISATION”

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENT OF


BACHELOR OF COMMERCE (HONORS) DEGREE

SUBMITTED TO:

INTERNAL GUIDE EXTERNAL GUIDE

ASSISTANT PROFESSOR
UNISON UNIVERSITY
DEHRADUN

SUBMITTED BY:

IMS UNISON UNIVERSITY, DEHRADUN


BATCH 2021-2024
INTERNAL GUIDE CERTIFICATE

I have the pleasure in certifying that Ms. is a bona fide student of the 5th Semester of the B.Com
(H) Degree (Batch 2021-24) of IMS Unison University, Dehradun, Roll No.
IUU21BCO052.

He/She has completed his/her project work entitled “DATA VISUALISATION ” under my
guidance.

I certify that this is his/her original effort & has not been copied from any other source. This
project has also not been submitted in any other Institute / University for the purpose of
award of any Degree.

This project fulfils the requirement of the curriculum prescribed by this Institute for the said
course. I recommend this project work for evaluation & consideration for the award of
Degree to the student.

Signature : ……………………………………
Name of the Guide : DR SAURABH SINGH
Designation : Assistant Professor, School of Management
Date : ……………………………………
DECLARATION
I humbly declare that this report entitled “DATA VISUALISATION”,
submitted in partial fulfilment of the requirement for the degree of
B.Com (H) at IMS Unison University, Dehradun, is based on original
work carried out by me, and no part of it has been presented or published
previously for any higher degree/diploma.

It is also declared that this report has been prepared for academic purposes
alone and has not been/will not be submitted elsewhere for any other
purposes.

Date:
Signature
Name:
ACKNOWLEDGEMENT

I would like to express my special thanks and gratitude to my external guide, Mr.
SIDDHANT ARORA, for his valuable guidance and encouragement during the entire
training period. I am thankful to IMS Unison University, Dehradun for giving me this
opportunity, and to my faculty mentor Dr. SAURABH SINGH, Assistant Professor, for
his inspiring guidance on various matters regarding the topic, valuable moral support,
constructive criticism, and friendly advice during the internship. I am grateful to them
for sharing their truthful and illuminating views on a number of issues related to the project.

At last, I would like to extend my deep gratitude towards my family and all of my
friends for their cooperation, inspiration, guidance and support during all stages of
the preparation of this report, and for helping me ride over the difficulties I faced during my
project work.

Finally, thanks to all for your encouragement and continuous support


EXECUTIVE SUMMARY

1. TITLE PAGE

2. COMPANY TRAINING CERTIFICATE (SCANNED/PHOTOCOPY)

3. INTERNAL GUIDE CERTIFICATE

4. DECLARATION

5. ACKNOWLEDGEMENT

6. EXECUTIVE SUMMARY

7. LIST OF CONTENTS (MENTIONING PAGE NUMBERS)

8. INTRODUCTION

9. RESEARCH METHODOLOGY

10. DATA ANALYSIS AND INTERPRETATION

11. FINDINGS

12. CONCLUSION

13. SUGGESTIONS

14. BIBLIOGRAPHY OR REFERENCES


15. ANNEXURE (E.G. QUESTIONNAIRE, ADDRESSES OF CONTACTED
PERSONS, COMPANY LITERATURE, PRODUCT LITERATURE ETC.)
DATA VISUALISATION

1. Estimating parameters:
This means taking a statistic from the sample data (for example, the
sample mean) and using it to infer a population parameter
(i.e. the population mean). There may be sampling variations
because of chance fluctuations, variations in sampling techniques,
and other sampling errors. Estimates of population
characteristics may be influenced by such factors. Therefore, in
estimation the important point is the extent to which our estimate is
close to the true value.
Characteristics of a Good Estimator: A good statistical estimator
should have the following characteristics: (i) Unbiased (ii)
Consistent (iii) Accurate
i) Unbiased
An unbiased estimator is one for which, if we were to obtain an
infinite number of random samples of a certain size, the mean of the
statistic would be equal to the parameter. The sample mean (x̄) is
an unbiased estimate of the population mean (μ) because, if we look at
all possible random samples of size N from a population, the mean of
the sample means would be equal to μ.
ii) Consistent
A consistent estimator is one for which, as the sample size is increased, the
probability that the estimate has a value close to the parameter also
increases. Because the sample mean is a consistent estimator, a sample mean based
on 20 scores has a greater probability of being close to μ than
does a sample mean based upon only 5 scores.
iii) Accuracy
The sample mean is an unbiased and consistent estimator of the
population mean (μ). But we should not overlook the fact that an
estimate is just a rough or approximate calculation. It is unlikely in
any estimate that x̄ will be exactly equal to the population mean (μ).
Whether or not x̄ is a good estimate of μ depends upon the
representativeness of the sample, the sample size, and the variability of
scores in the population.
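
To make the ideas of unbiasedness and consistency concrete, here is a small, illustrative Python simulation; the population mean and standard deviation used below are assumed purely for demonstration.

import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 50.0, 10.0                # assumed "true" population parameters

def mean_of_sample_means(n, repeats=10_000):
    """Draw many random samples of size n and average their sample means."""
    samples = rng.normal(mu, sigma, size=(repeats, n))
    return samples.mean(axis=1).mean()

# Unbiasedness: the average of many sample means is close to mu for any n.
print(mean_of_sample_means(5))        # ~50
print(mean_of_sample_means(20))       # ~50

# Consistency: larger samples give sample means that cluster more tightly around mu.
for n in (5, 20, 100):
    spread = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1).std()
    print(n, round(spread, 3))        # the spread shrinks roughly like sigma / sqrt(n)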

2. Hypothesis tests. This is where sample data can be used to
answer research questions. For example, we might be interested
in knowing if a new cancer drug is effective, or if breakfast
helps children perform better in school.
Inferential statistics is closely tied to the logic of hypothesis testing.
We hypothesize that a particular value characterises the population of
observations, and the question is whether that hypothesis is reasonable
given the evidence from the sample. Sometimes hypothesis testing is referred
to as a statistical decision-making process. In day-to-day situations
we are required to take decisions about the population on the basis
of sample information.

2.6.1 Statement of Hypothesis


A statistical hypothesis is defined as a statement, which may or may
not be true about the population parameter or about the probability
distribution of the parameter that we wish to validate on the basis of
sample information. Most times, experiments are performed with
random samples instead of the entire population, and inferences
drawn from the observed results are then generalised to the
entire population. But before drawing inferences about the
population, it should always be kept in mind that the observed
results might have arisen due to the chance factor. In order to have an
accurate or more precise inference, the chance factor should be
ruled out.
Null Hypothesis
The probability of chance occurrence of the observed results is
examined by the null hypothesis (H0 ). Null hypothesis is a
statement of no differences. The other way to state null hypothesis
is that the two samples came from the same population. Here, we
assume that population is normally distributed and both the groups
have equal means and standard deviations.
Since the null hypothesis is a testable proposition, there is a counter-proposition
to it, known as the alternative hypothesis and denoted by
H1. In contrast to the null hypothesis, the alternative hypothesis (H1)
proposes that
i) the two samples belong to two different populations,
ii) their means are estimates of two different
parametric means of the respective population, and
iii) there is a significant difference between their sample means.
The alternative hypothesis (H1) is not directly tested statistically;
rather, its acceptance or rejection is determined by the rejection or
retention of the null hypothesis. The probability ‘p’ of the null
hypothesis being correct is assessed by a statistical test. If the
probability ‘p’ is too low, H0 is rejected and H1 is accepted.
It is inferred that the observed difference is significant. If
probability ‘p’ is high, H0 is accepted and it is inferred that the
difference is due to the chance factor and not due to the variable
factor.
2.6.2 Level of Significance
The level of significance is defined as the probability of rejecting a
null hypothesis by the test when it is really true, which is denoted
as α. That is, P (Type I error) = α.

Confidence level:
The confidence level refers to the probability that a parameter lies
within a specified range of values; it is denoted by c. Moreover,
the confidence level is connected with the level of significance: the
relationship between the level of significance and the confidence level
is c = 1 − α. The common levels of significance and the corresponding
confidence levels are given below:

• The level of significance 0.10 is related to the 90% confidence level.


• The level of significance 0.05 is related to the 95% confidence level.
• The level of significance 0.01 is related to the 99% confidence level.
The rejection rule is as follows:

Rejection region:
The rejection region is the values of test statistic for which the null hypothesis is
rejected.

Non-rejection region:


The set of all possible values for which the null hypothesis
is not rejected is called the non-rejection region.
The rejection region for a two-tailed test lies in both tails of the sampling distribution, whereas for a one-tailed test it lies in a single tail:

in a left-tailed test, the rejection region is in the left tail;
in a right-tailed test, the rejection region is in the right tail.
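
As a minimal illustration of rejection regions, the following sketch computes the critical values of a z test at α = 0.05 using scipy.stats; the observed test statistic is an invented example.

from scipy import stats

alpha = 0.05

# Two-tailed z test: reject H0 if |z| exceeds the critical value.
z_crit_two_tailed = stats.norm.ppf(1 - alpha / 2)    # ≈ 1.96
# Right-tailed z test: reject H0 if z exceeds the critical value.
z_crit_right = stats.norm.ppf(1 - alpha)             # ≈ 1.645

z = 2.10                                             # an illustrative observed test statistic
print(abs(z) > z_crit_two_tailed)    # True -> falls in the two-tailed rejection region
print(z > z_crit_right)              # True -> falls in the right-tailed rejection region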

2.6.3 One-tail and Two-tail Test


Depending upon the statement in the alternative hypothesis (H1),
either a one-tailed or a two-tailed test is chosen to assess statistical
significance. A one-tailed test is a directional test: it is formulated to
test the significance of both the magnitude and the direction
(algebraic sign) of the observed difference between two statistics.
Thus, in a one-tailed test the researcher is interested in testing whether
one sample mean is significantly higher (or, alternatively, lower) than
the other sample mean. A two-tailed test, by contrast, is non-directional:
it tests the magnitude of the difference regardless of its direction.

Types of Inferential Statistics Tests

There are many tests in this field, of which some of the most important are
mentioned below.
1. Linear Regression Analysis
In this test, a linear algorithm is used to understand the relationship
between two or more variables in the data set. One of those variables is
the dependent variable, while there can be one or more independent
variables. In simpler terms, we try to predict the value of the
dependent variable based on the available values of the independent
variables. This relationship is usually represented using a scatter plot,
although other types of graphs can also be used.

2. Analysis of Variance
This is another statistical method which is extremely popular in
data science. It is used to test and analyse the differences between
two or more means from the data set. The significant differences
between the means are obtained, using this test.

3. Analysis of Covariance
This is a development of the Analysis of Variance method
that involves the inclusion of a continuous covariate in the
calculations. A covariate is an independent variable which is
continuous and is used as a regression variable. This method is used
extensively in statistical modelling in order to study the differences
present between the average values of dependent variables.

4. Statistical Significance (T-Test)


A relatively simple test in inferential statistics, this is used to
compare the means of two groups and understand whether they are
different from each other. The size of the difference, and how
significant it is, can be obtained from this test.

5. Correlation Analysis
Another extremely useful test, this is used to understand the extent
to which two variables are related to each other. The strength of
any relationship, if it exists, between the two variables can be
obtained from this. You will be able to understand whether the
variables have a strong correlation or a weak one. The correlation
can also be negative or positive, depending upon the variables. A
negative correlation means that the value of one variable decreases
while the value of the other increases, and a positive correlation
means that the values of both variables decrease or increase
simultaneously. (A short illustrative sketch of several of these tests follows below.)
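
As a short illustrative sketch of several of the tests listed above (simple regression, analysis of variance, the t-test and correlation; analysis of covariance is omitted for brevity), the following Python code uses synthetic data and assumes that NumPy and SciPy are installed.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(60, 8, 30)     # e.g. test scores of one group (synthetic)
group_b = rng.normal(65, 8, 30)     # a second group (synthetic)
group_c = rng.normal(63, 8, 30)     # a third group (synthetic)

# Simple linear regression: fit y = slope * x + intercept.
x = rng.uniform(0, 10, 50)
y = 2.0 * x + rng.normal(0, 1, 50)
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

# Analysis of variance across the three groups.
f_stat, anova_p = stats.f_oneway(group_a, group_b, group_c)

# Independent-samples t-test between two groups.
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Pearson correlation between two variables.
r, corr_p = stats.pearsonr(x, y)

print(slope, f_stat, t_stat, r)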

Differences between Descriptive and Inferential Statistics

Descriptive Statistics:
- Concerned with describing the target population.
- Organises, analyses and presents the data in a meaningful way.
- The analysed results are in the form of graphs, charts, etc.
- Describes the data that is already known.
- Tools: measures of central tendency and measures of spread.

Inferential Statistics:
- Makes inferences from the sample and generalises them to the population.
- Compares, tests and predicts future outcomes.
- The analysed results are probability scores.
- Tries to draw conclusions about the population beyond the data available.
- Tools: hypothesis tests, analysis of variance, etc.

Random Variables

A random variable, X, is a variable whose possible values are


numerical outcomes of a random phenomenon. There are two types
of random variables, discrete and continuous.

Example of Random variable

- A person’s blood type


- Number of leaves on a tree
- Number of times a user visits LinkedIn in a day
- Length of a tweet.

Discrete Random Variables :


A discrete random variable is one which may take on only a
countable number of distinct values such as 0, 1, 2, 3, 4, .... Discrete
random variables are usually counts. If a random variable can take
only a finite number of distinct values, then it must be discrete.
Examples of discrete random variables include the number of
children in a family, the Friday night attendance at a cinema, the
number of patients in a doctor's surgery, the number of defective
light bulbs in a box of ten.
The probability distribution of a discrete random variable is a list
of probabilities associated with each of its possible values. It is also
sometimes called the probability function or the probability mass
function
Suppose a random variable X may take k different values, with
the probability that X = xi defined to be P(X = xi) = pi. The
probabilities pi must satisfy the following:
1: 0 < pi < 1 for each i
2: p1 + p2 + ... + pk = 1.
Example

Suppose a variable X can take the values 1, 2, 3, or 4. The probabilities associated with each outcome are described by the following table:

Outcome      1     2     3     4
Probability  0.1   0.3   0.4   0.2

The probability that X is equal to 2 or 3 is the sum of the two probabilities: P(X = 2 or X = 3) = P(X = 2) + P(X = 3) = 0.3 + 0.4 = 0.7. Similarly, the probability that X is greater than 1 is equal to 1 − P(X = 1) = 1 − 0.1 = 0.9, by the complement rule.
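
The same discrete distribution can be written as a short Python sketch using only the standard library:

# Probability mass function of X from the table above.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

assert abs(sum(pmf.values()) - 1.0) < 1e-9      # the probabilities must sum to 1

p_2_or_3 = pmf[2] + pmf[3]                      # P(X = 2 or X = 3) = 0.7
p_greater_than_1 = 1 - pmf[1]                   # complement rule: P(X > 1) = 0.9

# Cumulative distribution function: P(X <= x), found by summing the probabilities.
cdf = {x: sum(p for xi, p in pmf.items() if xi <= x) for x in pmf}
print(p_2_or_3, p_greater_than_1, cdf)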

Continuous Random Variables


A continuous random variable is one which takes an infinite
number of possible values. Continuous random variables are
usually measurements. Examples include height, weight, the
amount of sugar in an orange, the time required to run a mile.
A continuous random variable is not defined at specific values.
Instead, it is defined over an interval of values, and is represented
by the area under a curve (known as an integral). The probability
of observing any single value is equal to 0, since the number of
values which may be assumed by the random variable is infinite.
Suppose a random variable X may take all values over an interval
of real numbers. Then the probability that X is in the set of
outcomes A, P(A), is defined to be the area above A and under a
curve. The curve, which represents a function p(x), must satisfy the
following:
1: The curve has no negative values (p(x) ≥ 0 for all x)
2: The total area under the curve is equal to 1.
A curve meeting these requirements is known as a density curve.
All random variables (discrete and continuous) have a cumulative
distribution function. It is a function giving the probability that
the random variable X is less than or equal to x, for every value x.
For a discrete random variable, the cumulative distribution
function is found by
summing up the probabilities.
Normal Probability Distribution

The Bell-Shaped Curve


The Bell-shaped Curve is commonly called the normal curve and
is mathematically referred to as the Gaussian probability
distribution. Unlike Bernoulli trials which are based on discrete
counts, the normal distribution is used to determine the
probability of a continuous random variable.

The normal or Gaussian Probability Distribution is most popular


and important because of its unique mathematical properties which
facilitate its application to practically any physical problem in the
real world. The constants μ and σ2 are the parameters:
 “μ” is the population true mean (or expected value) of the subject phenomenon characterized by the continuous random variable, X;
 “σ2” is the population true variance characterized by the continuous random variable, X;
 hence, “σ” is the population standard deviation characterized by the continuous random variable X;
 the points located at μ−σ and μ+σ are the points of inflection; that is, where the graph changes from cupping up to cupping down.
The normal curve (the graph of the normal probability distribution) is symmetric with respect to the mean μ as the central position. That is, the area between μ and κ units to the left of μ is equal to the area between μ and κ units to the right of μ.
There is not a unique normal probability distribution. For a fixed
value of σ2, varying μ shifts the curve along the horizontal axis
without changing its shape; for a fixed value of μ, varying σ2
changes how spread out (flat or peaked) the curve is.
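
A minimal sketch of these properties, assuming SciPy is installed; the parameter values below are purely illustrative.

from scipy import stats

mu, sigma = 100.0, 15.0                     # illustrative parameters
X = stats.norm(loc=mu, scale=sigma)

# Symmetry about the mean: equal area within k units on either side of mu.
k = sigma
left_area = X.cdf(mu) - X.cdf(mu - k)
right_area = X.cdf(mu + k) - X.cdf(mu)
print(round(left_area, 4), round(right_area, 4))          # both ≈ 0.3413

# About 68% of the area lies between the points of inflection mu - sigma and mu + sigma.
print(round(X.cdf(mu + sigma) - X.cdf(mu - sigma), 4))    # ≈ 0.6827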

SAMPLING and SAMPLING DISTRIBUTION

Sampling is a process used in statistical analysis in which a


predetermined number of observations are taken from a larger
population. It helps us to make statistical inferences about the
population. A population can be defined as a whole that
includes all items and characteristics of the research taken into
study. However, gathering all this information is time
consuming and costly. We therefore make inferences about the
population with the help of samples.
Random sampling:
In random sampling, every individual observation has an equal
probability of being selected into the sample, and there should be
no pattern when drawing the sample.
Probability sampling:
It is the sampling technique in which every individual unit of
the population has greater than zero probability of getting
selected into a sample.
Non-probability sampling:
It is the sampling technique in which some elements of the
population have no probability of getting selected into a
sample.
Cluster samples:
It divides the population into groups (clusters). Then a random
sample is chosen from the clusters.
Systematic sampling : select sample elements from an ordered
frame. A sampling frame is just a list of participants that we
want to get a sample from.
Stratified sampling : sample each subpopulation
independently. First, divide the population into homogeneous
(very similar) subgroups before getting the sample. Each
population member only belongs to one group. Then apply
simple random or a systematic method within each group to
choose the sample.

Sampling Distribution
A sampling distribution is a probability distribution of a statistic. It
is obtained through a large number of samples drawn from a
specific population. It is the distribution of all possible values taken
by the statistic when all possible samples of a fixed size n are taken
from the population.

Sampling Distributions and Inferential Statistics


Sampling distributions are important for inferential statistics. In
theory, a population is specified and the sampling distribution of the
mean and the range is determined from it. In practice, the process proceeds
the other way: the sample data are collected, and from these data we
estimate the parameters of the sampling distribution. This knowledge of
the sampling distribution can be very useful.
Knowing the degree to which means from different samples
would differ from each other and from the population mean
gives an idea of how close a particular sample mean is likely
to be to the population mean.
The most common measure of how much sample means differ
from each other is the standard deviation of the sampling
distribution of the mean. This standard deviation is called the
standard error of the mean.
If all the sample means were very close to the population mean,
then the standard error of the mean would be small. On the other
hand, if the sample means varied considerably, then the standard
error of the mean would be large.
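
A brief simulation, with assumed population parameters, showing that the standard error of the mean behaves like σ/√n:

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 50.0, 10.0, 25          # assumed population parameters and sample size

# Simulate the sampling distribution of the mean: many samples of size n.
sample_means = rng.normal(mu, sigma, size=(20_000, n)).mean(axis=1)

empirical_se = sample_means.std()      # spread of the simulated sample means
theoretical_se = sigma / np.sqrt(n)    # standard error of the mean: sigma / sqrt(n)

print(round(empirical_se, 3), round(theoretical_se, 3))    # both ≈ 2.0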
Choose an existing object storage service instance or create a new one.

Click Create. You can start adding resources if your project is empty or begin working with
the resources you imported.

To add data files to a project:

From your project’s Assets page, click Add to project > Data, or click the Find and add data
icon. You can also click the Find and add data icon from within a notebook or canvas.

In the Load pane that opens, browse for the files or drag them onto the pane. You must stay
on the page until the load is complete. You can cancel an ongoing load process if you want to
stop loading a file.

Case Study:

Let us take the Iris Data set to see how we can visualize the data in Watson studio.

Adding Data to Data Refinery

Visualizing information in graphical ways can give you insights into your data. By enabling
you to look at and explore data from different perspectives, visualizations can help you
identify patterns, connections, and relationships within that data as well as understand large
amounts of information very quickly. You can also visualize your data with these same charts
in an SPSS Modeler flow. Right-click a node and select Profile.

To visualize your data:

From Data Refinery, click the Visualizations tab.

Start with a chart or select columns.

1. Click any of the available charts. Then add columns in the DETAILS panel that opens on the
left side of the page.

2. Select the columns that you want to work with. Suggested charts will be indicated with a
dot next to the chart name. Click a chart to visualize your data.

Click Refine.

Click the Visualizations tab.

Add the columns by selecting them.
Adyay Technology pvt. Ltd
ADHYAY EQUI PREF PVT. LTD.
As on: 2024-06-12
Adhyay Equi Pref Pvt. Ltd. is a private company incorporated on 06 December 1994. It is
classified as a non-government company and is registered at the Registrar of Companies, Kolkata.
Its authorized share capital is Rs. 71,700,000 and its paid-up capital is Rs. 71,637,620. Its
NIC code is 741 (which is part of its CIN). As per the NIC code, it is involved in legal,
accounting, book-keeping and auditing activities; tax consultancy; market research and public
opinion polling; business and management consultancy.

Adhyay Equi Pref Pvt. Ltd.'s Annual General Meeting (AGM) was last held on N/A and, as per
records from the Ministry of Corporate Affairs (MCA), its balance sheet was last filed on 31
March 2015.

The directors of Adhyay Equi Pref Pvt. Ltd. are MADHAB CHANDRA DAW and JAYATI
MAJUMDER.

Adhyay Equi Pref Pvt. Ltd.'s Corporate Identification Number (CIN) is U74140WB1994PTC066348
and its registration number is 66348. Users may contact Adhyay Equi Pref Pvt. Ltd. on its email
address - [email protected]. The registered address of Adhyay Equi Pref Pvt. Ltd. is 3
SAKLAT PLACE 1ST FLOOR, KOLKATA, West Bengal, India - 700013.

The current status of Adhyay Equi Pref Pvt. Ltd. is - Strike Off.

Basic Information

CIN U74140WB1994PTC066348

Name

Listed on Stock Exchange Unlisted

Company Status Strike Off

ROC ROC Kolkata

Registration Number 66348

Company Category Company limited by shares

Company Sub Category Non-government company

Class of Company Private

Date of Incorporation 06 December 1994

Age of Company 29 years, 11 months, 1 day

Activity NIC Code: 741; NIC Description: Legal, accounting, book-keeping and auditing activities; tax consultancy; market research and public opinion polling; business and management consultancy

Number of Members 0

Annual Compliance Status

Date of Last Annual General Meeting

Date of Last Filed Balance Sheet

Key Numbers

Authorised Share Capital

Paid-up Share Capital

UNIT – III

Introduction to Anaconda -

Anaconda is a package manager, an environment manager, and a
Python distribution that contains a collection of many open-source
packages.

Anaconda Installation -

Go to the Anaconda website and choose a Python 3.x graphical
installer or a Python 2.x graphical installer.

When the screen below appears, click on Next

Note your installation location and then click Next.

Choose whether to add Anaconda to your PATH environment


variable. We recommend not adding Anaconda to the PATH
environment variable, since this can interfere with other software.
Instead, use Anaconda software by opening Anaconda Navigator or
the Anaconda Prompt from the Start Menu.

After that click on next.

Click Finish.

We need to set the Anaconda path in the system environment variables.

Open a Command Prompt and check whether you already have
Anaconda added to your path by entering the commands below
into your Command Prompt.
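
Commonly, the check can be done with commands such as the following (assuming a Windows Command Prompt):

where conda
conda --version

If these return a path and a version number, Anaconda is already on your PATH.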

Installing the Anaconda Distribution will also include Jupyter Notebook.
To access Jupyter Notebook, go to the Anaconda Prompt and run the command below,
or go to the Command Prompt and first activate the root environment before launching Jupyter
Notebook.
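
Typically the command is simply:

jupyter notebook

From the Command Prompt, this is usually preceded by activate root (on newer Anaconda versions, the base environment is activated with conda activate base).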

Then you'll see the application opening in the web browser at the
following address: http://localhost:8888.
A statement or expression is an instruction the computer will run or
execute. Perhaps the simplest program you can write is a print
statement. When you run the print statement, Python will simply
display the value in the parentheses. The value in the parentheses is
called the argument.

If you are using a Jupyter notebook, you will see a small rectangle
containing the statement. This is called a cell. If you select this cell with
your mouse and then click the run cell button, the statement will
execute. The result will be displayed beneath the cell.

It is customary to comment your code. This tells other people what
your code does. You simply put a hash symbol preceding your
comment. When you run the code, Python will ignore the comment.
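
For example, a minimal cell with a comment and a print statement might look like this (the message itself is arbitrary):

# This is a comment: Python ignores everything after the hash symbol.
print("Hello, Python 101!")   # the string in the parentheses is the argument

Running the cell displays Hello, Python 101! beneath it.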

Data Types

A type is how Python represents different types of data. You can


have different types in Python. They can be integers like 11, real
numbers like 21.213. They can even be words.

25
The following chart summarizes three data types for the last
examples. The first column indicates the expression. The second
column indicates the data type. We can see the actual data type in
Python by using the type command. We can have int, which stands
for an integer, and float, which stands for a floating-point number,
essentially a real number. The type string is a sequence of characters.

Integers can be negative or positive. It should be noted that there is


a finite range of integers, but it is quite large. Floats are real
numbers; they include the integers but also numbers in between the
integers. Consider the numbers between 0 and 1. We can select
numbers in between them; these numbers are floats. Similarly,
consider the numbers between 0.5 and 0.6. We can select numbers
in-between them; these are floats as well.

If you cast an integer to a float, nothing about its value really changes.
If you cast a float to an integer, however, you must be careful: for example,
if you cast the float 1.1 to an integer you get 1, so you lose
some information. If a string contains an integer value, you can
convert it to an int. If we convert a string that contains a non-integer
value, we get an error. You can convert an int to a string or a float
to a string.

Boolean is another important type in Python. A Boolean can take on


two values. The first value is true, just remember we use an
uppercase T. Boolean values can also be false, with an uppercase F.
Using the type command on a Boolean value, we obtain the term
bool, this is short for Boolean. If we cast a Boolean true to an
integer or float, we will get a 1.

If we cast a Boolean false to an integer or float, we get a zero. If


you cast a 1 to a Boolean, you get a true. Similarly, if you cast a 0
to a Boolean, you get a false.
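
A short illustrative Python session covering these type and casting rules:

type(11)           # int
type(21.213)       # float
type("hello")      # str

int(1.1)           # 1   -> casting a float to an int loses the fractional part
float(2)           # 2.0
int("1234")        # 1234, but int("1.5") raises a ValueError
str(3.14)          # '3.14'

type(True)         # bool
int(True)          # 1
float(False)       # 0.0
bool(1), bool(0)   # (True, False)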

Basics of Data Visualization

Before jumping into the term “Data Visualization”, let’s have a brief discussion
on the term “Data Science” because these two terms are interrelated.
In simple terms, “Data Science is the science of analyzing raw data using
statistics and machine learning techniques with the purpose of drawing
conclusions about that information“.
In simple words, a pipeline in data science is “a set of actions which changes
the raw (and confusing) data from various sources (surveys, feedback, list of
purchases, votes, etc.), to an understandable format so that we can store it and
use it for analysis.”

The raw data undergoes different stages within a pipeline, which are:

1. Fetching/Obtaining the Data
2. Scrubbing/Cleaning the Data
3. Data Visualization
4. Modeling the Data
5. Interpreting the Data
6. Revision

Data visualization is the graphical representation of information and data in a


pictorial or graphical format(Example: charts, graphs, and maps). Data
visualization tools provide an accessible way to see and understand trends,
patterns in data, and outliers. Data visualization tools and technologies are
essential to analyzing massive amounts of information and making data-driven
decisions. The concept of using pictures to understand data has been used
for centuries. General types of data visualization are Charts, Tables, Graphs,
Maps, and Dashboards.

Categories of Data Visualization

Data visualization is very critical to market research, where both numerical and
categorical data can be visualized; this helps increase the impact of
insights and also helps reduce the risk of analysis paralysis. So, data
visualization is categorized into the following categories:

Data visualization and big data

The increased popularity of big data and data analysis projects has made
visualization more important than ever. Companies are increasingly using
machine learning to gather massive amounts of data that can be difficult and slow
to sort through, comprehend, and explain. Visualization offers a means to speed
this up and present information to business owners and stakeholders in ways they
can understand.
Big data visualization often goes beyond the typical techniques used in
normal visualization, such as pie charts, histograms and corporate graphs. It
instead uses more complex representations, such as heat maps and fever
charts. Big data visualization requires powerful computer systems to collect
raw data, process it, and turn it into graphical representations that humans can
use to quickly draw insights.
While big data visualization can be beneficial, it can pose several
disadvantages to organizations. They are as follows:
· To get the most out of big data visualization tools, a
visualization specialist must be hired. This specialist must be able to
identify the best data sets and visualization styles to guarantee
organizations are optimizing the use of their data.
· Big data visualization projects often require involvement
from IT, as well as management, since the visualization of big data
requires powerful computer hardware, efficient storage systems and
even a move to the cloud.
· The insights provided by big data visualization will only be as
accurate as the information being visualized. Therefore, it is essential to
have people and processes in place to govern and control the quality of
corporate data, metadata, and data sources.

Examples of data visualization

In the early days of visualization, the most common visualization technique was
using a Microsoft Excel spreadsheet to transform the information into a table, bar
graph or pie chart. While these visualization methods are still commonly used,
more intricate techniques are now available, including the following:
· infographics
· bubble clouds
· bullet graphs
· heat maps
· fever charts
· time series charts
Some other popular techniques are as follows:
Line charts. This is one of the most basic and common techniques used. Line
charts display how variables can change over time.
Area charts. This visualization method is a variation of a line chart; it displays
multiple values in a time series -- or a sequence of data collected at consecutive,
equally spaced points in time.
Scatter plots. This technique displays the relationship between two variables. A
scatter plot takes the form of an x- and y-axis with dots to represent data points.
Treemaps. This method shows hierarchical data in a nested format. The size of the
rectangles used for each category is proportional to its percentage of the
whole. Treemaps are best used when multiple categories are present, and the
goal is to compare different parts of a whole.
Population pyramids. This technique uses a stacked bar graph to display the
complex social narrative of a population. It is best used when trying to display
the distribution of a population.

Advantages of Data Visualization

1. Better Comparison: In business, we often need to compare the
performance of two components or two situations. The conventional
approach is to go through the massive data for both situations and then
analyse it, which clearly takes a great deal of time.
2. A Superior Method: Visualization tackles this difficulty by putting the
information from both perspectives into pictorial form, which gives a far
better understanding of the situation. For instance, Google Trends helps
us understand data related to top searches or queries in pictorial or
graphical form.
3. Simple Sharing of Data: With visualization, organisations gain a new
channel of communication. Rather than sharing cumbersome raw data,
sharing visual information engages the audience and conveys the message
in a form that is easier to absorb.
4. Sales Analysis: With the help of data visualization, a salesperson can
easily understand the sales graph of products. With visualization tools
such as heat maps, he or she can understand the causes that are pushing
the sales numbers up as well as the reasons that are dragging them down.
Visualization also helps in understanding the trends and other factors such
as the types of customers interested in buying, repeat customers, the
effect of geography, and so on.
5. Discovering Relations Between Events: A business is influenced by a lot
of factors. Finding the relationship between these factors or events helps
decision-makers understand the issues related to their business. For
example, the online retail market is nothing new today; every year during
festive seasons such as Christmas or Thanksgiving, the graphs of online
businesses go up. So if an online company is doing, say, $1 million of
business in a particular quarter and sales rise in the next, it can quickly
identify the events that correspond to the rise.
6. Exploring Opportunities and Trends: With the huge amount of data
available, business leaders can explore it in depth with regard to the
trends and opportunities around them. Using data visualization, experts
can find patterns in the behaviour of their customers, thereby paving the
way for them to explore trends and opportunities for the business.

Why is Data Visualization Important?

Let’s take an example. Suppose you compile a data visualization of the


company’s profits from 2010 to 2020 and create a line chart. It would be very
easy to see the line going constantly up with a drop in just 2018. So you can
observe in a second that the company has had continuous profits in all the years
except a loss in 2018. It would not be that easy to get this information so fast
from a data table. This is just one demonstration of the usefulness of data
visualization. Let’s see some more reasons why data visualization is so
important.
1. Data Visualization Discovers the Trends in Data
The most important thing that data visualization does is discover the trends in
data. After all, it is much easier to observe data trends when all the data is
laid out in front of you in a visual form as compared to data in a table. For
example, the screenshot below on Tableau demonstrates the sum of sales made by
each customer in descending order. However, the color red denotes loss while
grey denotes profits. So it is very easy to observe from this visualization that even
though some customers may have huge sales, they are still at a loss. This
would be very difficult to observe from a table.
2. Data Visualization Provides a Perspective on the Data
Data Visualization provides a perspective on data by showing its meaning in
the larger scheme of things. It demonstrates how particular data references
stand with respect to the overall data picture. In the data visualization below,
the data between sales and profit provides a data perspective with respect to
these two measures. It also demonstrates that there are very few sales above 12K
and higher sales do not necessarily mean a higher profit.
3. Data Visualization Puts the Data into the Correct Context
It is very difficult to understand the context of the data without data visualization.
Since context provides the whole circumstances of the data, it is very difficult
to grasp by just reading numbers in a table. In the below data visualization on
Tableau, a TreeMap is used to demonstrate the number of sales in each region
of the United States. It is very easy to understand from this data visualization
that California has the largest number of sales out of the total number since
the rectangle for California is the largest. But this information is not easy to
understand outside of context without data visualization.
4. Data Visualization Saves Time
It is definitely faster to gather some insights from the data using data visualization
rather than just studying a chart. In the screenshot below on Tableau, it is very
easy to identify the states that have suffered a net loss rather than a profit.
This is because all the cells with a loss are colored red using a heat map, so it
is obvious which states have suffered a loss. Compare this to a normal table where
you would need to check each cell to see if it has a negative value to
determine a loss. Obviously, data visualization saves a lot of time in this
situation!
5. Data Visualization Tells a Data Story
Data visualization is also a medium to tell a data story to the viewers. The
visualization can be used to present the data facts in an easy-to-understand
form while telling a story and leading the viewers to an inevitable conclusion.
This data story, like any other type of story, should have a good beginning, a
basic plot, and an ending that it is leading towards. For example, if a data analyst
has to craft a data visualization for company executives detailing the profits on
various products, then the data story can start with the profits and losses of
various products and move on to recommendations on how to tackle the
losses.
In a nutshell, Data visualization provides a quick and effective way to
communicate information in a universal manner using visual information. The
practice can also help businesses identify which factors affect customer
behavior; pinpoint areas that need to be improved or need more attention;
make data more memorable for stakeholders; understand when and where to
place specific products; and predict sales volumes.

Other benefits of data visualization include the following:

 the ability to absorb information quickly, improve insights and


make faster decisions;
 an increased understanding of the next steps that must be taken
to improve the organization;
 an improved ability to maintain the audience's interest with
information they can understand;
 an easy distribution of information that increases the opportunity
to share insights with everyone involved;
 a reduced reliance on data scientists, since data is more
accessible and understandable; and
 an increased ability to act on findings quickly and, therefore,
achieve success with greater speed and fewer mistakes.

Common data visualization use cases

Common use cases for data visualization include the following:

Sales and marketing. Research from market and consumer data provider Statista
estimated $566 billion was spent on digital advertising in 2022 and that number
will cross the $700 billion mark by 2025. Marketing teams must pay close
attention to their sources of web traffic and how their web properties generate
revenue. Data visualization makes it easy to see how marketing efforts affect
traffic trends over time.
Politics. A common use of data visualization in politics is a geographic map
that displays the party each state or district voted for.

Healthcare. Healthcare professionals frequently use choropleth maps to visualize


important health data. A choropleth map displays divided geographical areas
or regions that are assigned a certain color in relation to a numeric variable.
Choropleth maps allow professionals to see how a variable, such as the
mortality rate of heart disease, changes across specific territories.

Scientists. Scientific visualization, sometimes referred to in shorthand as SciVis,


allows scientists and researchers to gain greater insight from their experimental
data than ever before.

Finance. Finance professionals must track the performance of their investment


decisions when choosing to buy or sell an asset. Candlestick charts are used as
trading tools and help finance professionals analyze price movements over time,
displaying important information, such as securities, derivatives, currencies,
stocks, bonds and commodities. By analyzing how the price has changed over
time, data analysts and finance professionals can detect trends.

Logistics. Shipping companies can use visualization tools to determine the best
global shipping routes.

Data scientists and researchers. Visualizations built by data scientists are typically
for the scientist's own use, or for presenting the information to a select
audience. The visual representations are built using visualization libraries of
the chosen programming languages and tools. Data scientists and researchers
frequently use open source programming languages -- such as Python -- or
proprietary tools designed for complex data analysis. The data visualization
performed by these data scientists and researchers helps them understand data
sets and identify patterns and trends that would have otherwise gone
unnoticed.

Data visualization tools and vendors

Data visualization tools can be used in a variety of ways. The most common
use today is as a business intelligence (BI) reporting tool. Users can set up
visualization
tools to generate automatic dashboards that track company performance
across key performance indicators (KPIs) and visually interpret the results.

The generated images may also include interactive capabilities, enabling users
to manipulate them or look more closely into the data for questioning and
analysis. Indicators designed to alert users when data has been updated or
when predefined conditions occur can also be integrated.

Many business departments implement data visualization software to track


their own initiatives. For example, a marketing team might implement the
software to monitor the performance of an email campaign, tracking metrics
like open rate, click-through rate and conversion rate.

As data visualization vendors extend the functionality of these tools, they are
increasingly being used as front ends for more sophisticated big data
environments. In this setting, data visualization software helps data engineers and
scientists keep track of data sources and do basic exploratory analysis of data
sets prior to or after more detailed advanced analyses.

The biggest names in the big data tools marketplace include Microsoft, IBM, SAP
and SAS. Some other vendors offer specialized big data visualization software;
popular names in this market include Tableau, Qlik and Tibco.

While Microsoft Excel continues to be a popular tool for data visualization,


others have been created that provide more sophisticated abilities:

 IBM Cognos Analytics


 Qlik Sense and QlikView
 Microsoft Power BI
 Oracle Visual Analyzer
 SAP Lumira
 SAS Visual Analytics
 Tibco Spotfire
 Zoho Analytics

 D3.js
 Jupyter
 MicroStrategy
 Google Charts

Disadvantages of Data Visualization

While there are many advantages, some of the disadvantages may seem less
obvious. For example, when viewing a visualization with many different data
points, it’s easy to make an inaccurate assumption. Or sometimes the
visualization is just designed wrong so that it’s biased or confusing.

Some other disadvantages include:

 Biased or inaccurate information.


 Correlation doesn’t always mean causation.
 Core messages can get lost in translation.

Top Data Visualization Libraries

The following are the top Data Visualization Libraries


 Python:
 Matplotlib
 Plotly
 ggplot
 Seaborn
 Altair
 Geoplotlib
 Bokeh
 R:
 ggplot2
 Plotly
 Leaflet
 Esquisse
 Lattice
 Javascript:
 D3.js
 Chart.js
 Plotly
Data Dimension and Modality
In computer science and data analysis, "data dimension" and "modality" are
important concepts that relate to the structure and characteristics of data. Let's
provide an introduction to each concept and discuss how they are represented
in computer science:

Data Dimension:

Definition: Data dimension, often referred to as "dimensionality," represents the


number of attributes or features associated with each data point in a dataset. It
indicates the intrinsic complexity of the data by counting the variables that
describe each observation.
Representation in Computer Science:
● In computer science and data analysis, data dimensions are
often represented using matrices or multi-dimensional arrays.
Each row of the matrix corresponds to a data point, while each
column represents a different attribute or feature. The number
of columns (i.e., the width of the matrix) is the data
dimension.
● In machine learning, high-dimensional data is commonly
encountered, where each data point is described by a large
number of features. Techniques like dimensionality reduction
(e.g., Principal Component Analysis or t-SNE) are used to
reduce the dimensionality and extract essential information.

Modality:

Definition: Modality refers to the number of modes or forms within a dataset.


It characterizes the distribution of data points based on the number of
significant peaks or clusters in the data.
Representation in Computer Science:
● In computer science and data analysis, modality is often
represented visually through data visualization techniques.
For example:
● Histograms: Histograms provide a visual
representation of the distribution of data by showing
the frequency of data points within different bins or
intervals. The number of peaks or modes in a
histogram can indicate the modality of the data.
● Density Plots: Density plots visualize the probability
density function of data, and the number of peaks in the
plot can reveal modality.
● Scatterplots: Scatterplots can help identify clusters or
modes in two- dimensional data.
● In statistical analysis, tests such as D'Agostino and Pearson's test
or the Anderson-Darling test can be used to formally assess how far
the data depart from a unimodal (normal) shape by examining the
distribution's skewness and kurtosis.
To summarize, data dimension and modality are fundamental concepts in
computer science and data analysis:
● Data dimension measures the number of attributes or features
describing each data point and is often represented through
matrices or arrays.
● Modality characterizes the distribution of data based on the
number of significant modes or clusters and is usually
represented visually using histograms, density plots, or
statistical tests.
These concepts are essential for understanding data complexity and
distribution, which, in turn, guide the selection of appropriate analysis and
visualization techniques in various data-driven applications, including
machine learning and statistics.
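
To make these two ideas concrete, here is a minimal illustrative sketch; NumPy and Matplotlib are assumed to be installed, and the data are synthetic.

import numpy as np
import matplotlib.pyplot as plt

# Data dimension: a matrix of 100 observations, each described by 4 features,
# so the data dimension (number of attributes per data point) is 4.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 4))
n_points, n_dims = data.shape
print(n_points, n_dims)          # 100 4

# Modality: a bimodal variable built from two clusters of values;
# its histogram shows two peaks (modes).
bimodal = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])
plt.hist(bimodal, bins=40)
plt.title("Two peaks suggest a bimodal distribution")
plt.show()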
Textual data representation and analysis in data visualization
Textual data representation and analysis in data visualization involve
converting raw text into visual formats that facilitate exploration,
understanding, and communication of insights. Here are key approaches to
textual data representation and analysis through visualization:
Word Clouds:
● Representation: Word clouds visually display words
from a text, with word size proportional to its frequency.
● Analysis: Easily identify the most frequent words in a
document or corpus.
Bar Charts and Histograms:
● Representation: Bar charts or histograms can display the
frequency distribution of words.
● Analysis: Highlight the distribution of word frequencies,
aiding in understanding patterns and outliers.
Heatmaps:
● Representation: Heatmaps represent the
relationships between words or terms, showing the
intensity of co-occurrence.
● Analysis: Identify patterns of association between
words, useful for understanding semantic relationships.
Scatter Plots:
● Representation: Scatter plots with text labels can be used
to visualize relationships between pairs of words.
● Analysis: Explore correlations or co-occurrences
between specific terms.
Topic Modeling and LDA Visualization:
● Representation: Topic modeling algorithms (e.g.,
Latent Dirichlet Allocation - LDA) can be used to
identify topics in a corpus, and visualizations (e.g.,
pyLDAvis) can help explore and interpret topics.
● Analysis: Understand the main themes and topics
within a body of text.
Network Diagrams:
● Representation: Network diagrams visualize
relationships between words or entities, where nodes
represent words, and edges represent connections.
● Analysis: Explore semantic relationships and identify
key terms in a network.
Sentiment Analysis Visualizations:
● Representation: Sentiment analysis results can be
visualized using charts (e.g., bar charts) to show the
distribution of positive, negative, and neutral sentiments.
● Analysis: Understand the sentiment trends and
patterns in a set of texts.
Text Clustering and Dendrogram:
● Representation: Hierarchical clustering can be visualized
using dendrograms, showing relationships between
clusters of words.
● Analysis: Identify groups of words with similar
characteristics or meanings.
Word Embeddings and t-SNE Visualization:
● Representation: Techniques like word embeddings can
be visualized using t-SNE (t-distributed Stochastic
Neighbor Embedding) to visualize high-dimensional
word representations in a lower- dimensional space.
● Analysis: Explore the spatial relationships between
words based on their semantic similarity.
Text Annotation and Highlighting:
● Representation: Annotate and highlight specific terms or
phrases in a document for emphasis.
● Analysis: Draw attention to key information or extract
important insights from the text.
When working with textual data, the choice of visualization method depends on
the goals of the analysis and the characteristics of the data. Effective
visualizations enable data scientists and analysts to gain insights into patterns,
trends, and relationships within textual data, making it easier to communicate
findings to a broader audience.
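
As a small illustration of the word-frequency idea behind word clouds and bar charts, the following sketch uses only the Python standard library and Matplotlib; the sample sentence is invented.

from collections import Counter
import matplotlib.pyplot as plt

text = "data visualization makes data easier to understand because visualization reveals patterns in data"
tokens = text.lower().split()
freq = Counter(tokens).most_common(5)     # the five most frequent words

words, counts = zip(*freq)
plt.bar(words, counts)
plt.title("Most frequent words")
plt.show()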

Sentiment Analysis on Textual Data in Data Visualization

Sentiment analysis, also known as opinion mining, is a natural language


processing (NLP) technique used to determine the sentiment or emotional tone
expressed in textual data, such as customer reviews, social media posts, or
survey responses. Visualizing sentiment analysis results can provide valuable
insights into public
opinion, customer satisfaction, and trends. Here's how to perform sentiment
analysis on textual data and visualize the results:
1. Preprocessing:
Before performing sentiment analysis, you should preprocess the textual data,
which includes:
● Text cleaning: Removing special characters, punctuation,
and irrelevant symbols.
● Tokenization: Splitting the text into words or phrases (tokens).
● Lowercasing: Converting all text to lowercase for consistency.
● Stopword removal: Eliminating common words like "the,"
"and," or "in" that don't carry much sentiment information.
2. Sentiment Analysis:

Sentiment analysis typically involves classifying text into categories such as


positive, negative, or neutral. Common techniques include:
● Lexicon-based approaches: Using sentiment lexicons
(dictionaries of words with associated sentiment scores) to
determine sentiment.
● Machine learning models: Training models (e.g., Naive Bayes,
Support Vector Machines, or deep learning models) on labeled
sentiment data to predict sentiment.
3. Visualization:

Once you have sentiment analysis results, you can create various visualizations to
convey insights:
● Pie Chart: Create a pie chart to show the distribution of
sentiments (positive, negative, neutral) in your textual data.
● Bar Chart: Use a bar chart to display the frequency of each
sentiment category. This can provide a quick overview of
sentiment distribution.
● Line Chart or Time Series Plot: If you have sentiment data over
time (e.g., daily sentiment trends), use a line chart to visualize
sentiment fluctuations.
● Word Clouds: Generate word clouds for positive and negative
sentiments to highlight frequently occurring terms in each
category.
● Heatmap: Create a heatmap that shows sentiment scores for
different topics or entities. Rows represent topics, and columns
represent sentiment scores.
● Scatter Plot: If you want to explore the relationship between
sentiment and other variables (e.g., sentiment vs. product
ratings), use a scatter plot.
● Stacked Area Chart: Visualize sentiment changes over time
using a stacked area chart, where each sentiment category is
represented by a different color.
● Geospatial Visualization: If your data includes geographic
information, use maps to visualize sentiment variations across
different regions.
● Sentiment Flow Diagram: Display sentiment transitions within a
text (e.g., positive to negative) using a flow diagram.
● Interactive Dashboards: Create interactive dashboards that allow
users to explore sentiment analysis results dynamically, filtering
by various attributes like time, source, or sentiment category.
● Comparison Plots: Compare sentiment across different sources,
products, or categories using side-by-side visualizations.
● Emotion Analysis: If you perform emotion analysis as part of
sentiment analysis, visualize emotional tones (e.g., joy, anger,
sadness) using color- coded charts or radial diagrams.
Visualization of sentiment analysis results not only aids in understanding the
overall sentiment but also helps identify trends, anomalies, and actionable insights
in large volumes of textual data. Interactive and dynamic visualizations allow
users to drill down into specific aspects of the data, making it easier to make data-
driven decisions based on sentiment.

Sentiment Analysis on e-Commerce Women’s Clothing

Performing sentiment analysis on women's clothing reviews in the context of e-


commerce can provide valuable insights into customer opinions, preferences, and
satisfaction. Here's a step-by-step guide on how to conduct sentiment analysis on
e-commerce women's clothing data:

1. Data Collection:

Collect a dataset of women's clothing reviews from an e-commerce platform.


You can obtain this data from sources such as customer reviews on the e-
commerce website, social media, or other online platforms.
2. Data Preprocessing:

Clean and preprocess the textual data to prepare it for sentiment analysis:
● Remove special characters, punctuation, and irrelevant symbols.
● Tokenize the text into words or phrases.
● Convert the text to lowercase for consistency.
● Remove stopwords (common words like "the," "and," etc.).
● Address issues like misspellings and abbreviations.

3. Sentiment Analysis:

Apply sentiment analysis techniques to classify each review into positive,


negative, or neutral sentiments. Methods include:
● Lexicon-Based Approaches: Use sentiment lexicons that
associate words with sentiment scores. Calculate the overall
sentiment based on the scores of individual words.
● Machine Learning Models: Train machine learning models (e.g.,
Naive Bayes, Support Vector Machines, or deep learning
models) on labeled sentiment data to predict sentiment.

4. Visualization:

Create visualizations to present the sentiment analysis results in an


understandable and insightful way:
● Pie Chart or Bar Chart: Display the distribution of sentiment
categories (positive, negative, neutral).
● Word Clouds: Generate word clouds for positive and negative
sentiments to highlight frequently mentioned terms.
● Time Series Plot: If your data includes timestamps, visualize
how sentiments change over time.
● Product Category Comparison: Compare sentiment across
different women's clothing categories (e.g., dresses, tops, pants)
using stacked bar charts.
● Interactive Dashboards: Create interactive dashboards allowing
users to explore sentiments by filtering through attributes like
product ratings, clothing types, or brands.
● Emotion Analysis: If applicable, conduct emotion analysis
and visualize emotional tones associated with reviews (e.g.,
joy, anger, sadness).
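Two of the ideas listed above (an overall pie chart and a per-category comparison) can be sketched as follows. The "category" and "sentiment" column names and the sample rows are assumptions made for illustration:

# Sketch: pie chart of overall sentiment plus a stacked bar chart comparing
# sentiment across clothing categories. Column names and data are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "category":  ["Dresses", "Dresses", "Tops", "Tops", "Pants", "Pants"],
    "sentiment": ["positive", "negative", "positive", "neutral", "negative", "positive"],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Overall sentiment share
df["sentiment"].value_counts().plot(kind="pie", autopct="%1.0f%%", ax=ax1)
ax1.set_ylabel("")
ax1.set_title("Overall sentiment share")

# Sentiment counts per clothing category, stacked
pd.crosstab(df["category"], df["sentiment"]).plot(kind="bar", stacked=True, ax=ax2)
ax2.set_title("Sentiment by clothing category")
ax2.set_ylabel("Number of reviews")

plt.tight_layout()
plt.show()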

5. Extracting Insights:

Analyze the visualizations to extract actionable insights:


● Identify popular and well-received products by analyzing positive sentiments.
● Address negative sentiments and common issues to
improve customer satisfaction.
● Explore trends and patterns to make informed decisions
about inventory, marketing, or customer engagement strategies.

6. Continuous Monitoring:

Implement continuous sentiment monitoring to track changes in customer opinions over time. Regularly update your sentiment analysis model to adapt to evolving language and trends.

Tools and Libraries:

For sentiment analysis: NLTK, spaCy, TextBlob, VADER Sentiment.



For machine learning: Scikit-learn, TensorFlow, PyTorch.

For visualization: Matplotlib, Seaborn, Plotly, Tableau.

By conducting sentiment analysis on women's clothing reviews in an e-
commerce setting, businesses can gain a deeper understanding of customer
sentiment, improve products and services, and enhance overall customer
satisfaction.

To perform sentiment analysis on e-commerce women's clothing reviews in Python, you can use the Natural Language Toolkit (NLTK) for text processing and sentiment analysis. Make sure to install the NLTK library before running the following program:

pip install nltk

Here's a simple Python program that demonstrates sentiment analysis on e-commerce women's clothing reviews using NLTK:
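A minimal sketch of such a program uses NLTK's VADER analyzer. The sample reviews and the +/-0.05 compound-score thresholds below are illustrative assumptions; a real application would load reviews from a file or database:

# Minimal sketch: NLTK's VADER analyzer applied to a few invented reviews.
# The reviews and the +/-0.05 compound-score cut-offs are assumptions.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # lexicon used by VADER

reviews = [
    "Absolutely love this dress, the fit is perfect!",
    "The fabric felt cheap and the stitching came apart after one wash.",
    "The top is okay, nothing special.",
]

analyzer = SentimentIntensityAnalyzer()

for review in reviews:
    scores = analyzer.polarity_scores(review)   # returns neg/neu/pos/compound
    if scores["compound"] >= 0.05:
        label = "positive"
    elif scores["compound"] <= -0.05:
        label = "negative"
    else:
        label = "neutral"
    print(f"{label:8s} {scores['compound']:+.2f}  {review}")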
Data Cleaning

Data cleaning, also known as data cleansing or data preprocessing, is a crucial step in the data visualization process within the realm of data science. The quality and reliability of visualizations heavily depend on the cleanliness of the underlying data. Here is an overview of what data cleaning involves:
1. Handling Missing Data:
● Identification: Identify and locate missing values within the dataset.
● Strategies: Depending on the extent and nature of missing data,
you may choose to remove rows or columns with missing
values, impute missing values with statistical measures (e.g.,
mean, median, mode), or use more advanced imputation
techniques.
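A short pandas sketch of these options (the DataFrame and column names are illustrative):

# Sketch: locating and handling missing values with pandas.
import pandas as pd
import numpy as np

df = pd.DataFrame({"age": [34, np.nan, 29], "rating": [4.0, 5.0, np.nan]})

print(df.isna().sum())                            # count missing values per column

df["age"] = df["age"].fillna(df["age"].median())  # impute with the median
df = df.dropna(subset=["rating"])                 # or drop rows missing a rating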
2. Dealing with Duplicates:

● Detection: Identify and remove duplicate records from the dataset.


● Strategies: Use unique identifiers to find and remove
duplicates or use algorithms to identify similarity and decide
which record to keep.
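For example, duplicates can be detected and dropped with pandas (the order data here is invented):

# Sketch: detecting and removing duplicate records with pandas.
import pandas as pd

df = pd.DataFrame({"order_id": [101, 102, 102], "item": ["Dress", "Top", "Top"]})

print(df.duplicated(subset="order_id").sum())              # number of duplicate order_ids
df = df.drop_duplicates(subset="order_id", keep="first")   # keep the first occurrence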
3. Outlier Detection and Treatment:

● Identification: Identify outliers (data points that significantly deviate from the overall pattern).
● Strategies: Outliers can be handled by removing them, transforming them, or using robust statistical measures.
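One common identification approach is the interquartile-range (IQR) rule, sketched below on an invented price series:

# Sketch: flagging outliers with the interquartile-range (IQR) rule.
import pandas as pd

prices = pd.Series([20, 22, 25, 24, 23, 250])   # 250 is an obvious outlier

q1, q3 = prices.quantile([0.25, 0.75])
iqr = q3 - q1
mask = (prices < q1 - 1.5 * iqr) | (prices > q3 + 1.5 * iqr)
print(prices[mask])          # values flagged as outliers
clean = prices[~mask]        # one option: drop them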
4. Standardizing and Normalizing Data:

● Standardization: Standardize numerical features to have a mean of 0 and a standard deviation of 1.
● Normalization: Scale numerical features to a specific range, often between 0 and 1.
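Both operations are available in scikit-learn; the single-column array below is illustrative:

# Sketch: standardization (z-scores) and min-max normalization with scikit-learn.
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[10.0], [20.0], [30.0], [40.0]])

standardized = StandardScaler().fit_transform(X)   # mean 0, standard deviation 1
normalized = MinMaxScaler().fit_transform(X)       # scaled to the range [0, 1]
print(standardized.ravel(), normalized.ravel())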
5. Handling Inconsistent Data:

● Identification: Identify inconsistent or erroneous data.


● Strategies: Correct inconsistencies, such as standardizing
date formats, fixing typos, or resolving discrepancies in
categorical values.
6. Encoding Categorical Variables:

● Conversion: Convert categorical variables into a numerical format suitable for analysis and visualization.
● Strategies: One-hot encoding, label encoding, or using more advanced techniques for ordinal variables.
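A brief sketch of both encodings (the size/colour data is invented):

# Sketch: one-hot encoding with pandas and label encoding with scikit-learn.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"size": ["S", "M", "L", "M"],
                   "colour": ["red", "blue", "red", "green"]})

one_hot = pd.get_dummies(df, columns=["colour"])              # one column per colour
df["size_code"] = LabelEncoder().fit_transform(df["size"])    # alphabetical integer codes
print(one_hot)
print(df)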
7. Data Type Conversion:

● Conversion: Ensure that data types are appropriate for analysis and visualization.
● Strategies: Convert data types, such as converting string representations of numbers to actual numeric types.
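With pandas, such conversions typically look like this (the columns are illustrative):

# Sketch: converting string columns to proper numeric and datetime types.
import pandas as pd

df = pd.DataFrame({"price": ["19.99", "24.50", "n/a"],
                   "order_date": ["2023-01-05", "2023-02-10", "2023-03-15"]})

df["price"] = pd.to_numeric(df["price"], errors="coerce")   # bad values become NaN
df["order_date"] = pd.to_datetime(df["order_date"])         # strings -> datetime64
print(df.dtypes)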
8. Handling Data Skewness and Transformation:

● Identification: Identify skewed distributions in numerical variables.


● Strategies: Apply transformations (e.g., logarithmic, square root)
to reduce skewness and make the data more amenable to
analysis.
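For instance, a log transform can reduce right skew (log1p handles zeros safely; the spending values are invented):

# Sketch: reducing right skew with a log transform.
import numpy as np
import pandas as pd

spend = pd.Series([5, 8, 12, 15, 20, 400])   # heavily right-skewed
print(spend.skew())                           # skewness before the transform

log_spend = np.log1p(spend)                   # log(1 + x) transform
print(log_spend.skew())                       # noticeably closer to 0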
9. Handling Data Inconsistencies:

● Identification: Identify inconsistencies in the data that might affect its integrity.
● Strategies: Correct inconsistencies, reconcile conflicting information, and ensure data integrity.
10. Addressing Data Security and Privacy Concerns:
● Anonymization: If dealing with sensitive data, anonymize or pseudonymize the data to protect privacy.
11. Exploratory Data Analysis (EDA):
● Visualization: Utilize exploratory data analysis techniques and visualizations to understand the structure and characteristics of the data before creating final visualizations.
By addressing these aspects in the data cleaning phase, data scientists ensure
that the data used for visualization is accurate, consistent, and ready for
meaningful analysis. This, in turn, enhances the reliability and interpretability of
the insights gained through data visualization in the field of data science.
Findings:

The study revealed significant differences between the two treatment groups, with
interferential therapy (IFT) showing a marked improvement in reducing pain and increasing
the range of motion (ROM) compared to ultrasound therapy (UST). Data analysis indicated
that patients in the IFT group experienced a 20% greater reduction in pain levels, as measured
by the Numerical Pain Rating Scale, and a 15% higher improvement in ROM, assessed with a
digital inclinometer. Statistical analysis revealed a p-value of 0.03, confirming the results
were statistically significant. Additionally, the IFT group reported a quicker recovery time
and fewer instances of muscle stiffness, suggesting the therapy may be more effective in
managing myofascial pain syndrome of the upper trapezius. These findings highlight the
potential benefits of incorporating interferential therapy over ultrasound therapy in treating
myofascial pain.

Conclusion:

In conclusion, the study demonstrates that interferential therapy is a more effective intervention than ultrasound therapy for reducing pain and improving ROM in patients with
upper trapezius myofascial pain syndrome. These results suggest that IFT could be a valuable
addition to physiotherapy treatment protocols. However, the study was limited by its small
sample size and short treatment duration, and further research is needed to confirm these
findings in larger, more diverse patient populations. Additionally, exploring the long-term
effects of IFT on muscle function and pain relief would provide a more comprehensive
understanding of its therapeutic potential.
References

Abela, A. (2008). Advanced Presentations By Design: Creating Communication that Drives Action (1st ed.). San Francisco:

Barr, S. (2009, November 23). What Does “KPI” Really Mean? Dashboard Insight.

Bateman, S., Mandryk, R. L., Gutwin, C., Genest, A., McDine, D., and Brooks, C. (2010).
Useful junk? The effects of visual embellishment on comprehension and memorability of
charts. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
(pp. 2573–2582). New York, NY: ACM.

Bertin, J. (2010). Semiology of Graphics: Diagrams, Networks, Maps (1st ed.). Redlands,
CA: Esri Press.

Borkin, M. A., Vo, A. A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., and Pfister, H.
(2013). What makes a visualization memorable? IEEE Transactions on Visualization and
Computer Graphics, 19(12), 2306– 2315.

Dresner, H. (2015, May 25). IoT and the Growing Use of Location Features in Business
Intelligence Software. Sand Hill.

Few, S. (2004, March 20). Dashboard Confusion.

Segel, E. and Heer, J. (2010). Narrative visualization: Telling stories with data. IEEE
Transactions on Visualization and Computer Graphics, 16(6), 1139–1148.

Zheng, G., Zhang, C., and Li, L. (2014). Bringing business intelligence to health information
technology curriculum. Journal of Information Systems Education, 25(4).
QUESTIONNAIRE

1. Bar Chart (Multiple bars for each question, each representing an option):
o Description: Each question will have a cluster of bars (one for each option A,
B, C, D). This is useful for comparing the number of respondents for each
option for every question.
o X-Axis: Questions (Q1 to Q10)
o Y-Axis: Number of respondents
o Legend: Each option (A, B, C, D) in a different color
2. Stacked Bar Chart (Each bar represents a question, and the sections represent
options A, B, C, D):
o Description: For each question, the total bar represents the number of
respondents, stacked according to the percentage who selected each option.
o X-Axis: Questions (Q1 to Q10)
o Y-Axis: Number of respondents
o Legend: Options A, B, C, D, represented by different colors
3. Pie Chart (Individual) for each question:
o Description: A separate pie chart for each question showing the proportion of
responses for each option (A, B, C, D).
o Legend: Options A, B, C, D
4. Heat Map (Color-coded representation of the data):
o Description: Each cell in the heat map will represent the number of
respondents for a particular option (A, B, C, D) for a question. The darker the
color, the higher the number of responses.
o Rows: Questions (Q1 to Q10)
o Columns: Options A, B, C, D
o Color Scale: From light (low count) to dark (high count)
How the Visualization Looks:
 Bar Chart: Each question (Q1-Q10) will have four bars next to each other for options
A, B, C, and D, showing how many people chose each option.
 Stacked Bar Chart: The bar for Q1 might be divided into sections based on how
many people chose Option A, B, C, or D, stacked one on top of the other.
 Pie Chart: Each question will have a circular pie showing the breakdown of A, B, C,
D.
 Heat Map: The values are color-coded in a table where darker shades show higher
responses.
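A sketch of how the grouped bar chart, stacked bar chart, and heat map could be produced with pandas and Matplotlib follows. The response counts for Q1-Q3 and options A-D are invented for illustration:

# Sketch: grouped bar, stacked bar, and heat map for questionnaire counts.
# The response counts are invented for illustration.
import pandas as pd
import matplotlib.pyplot as plt

counts = pd.DataFrame(
    {"A": [12, 8, 15], "B": [9, 14, 6], "C": [5, 4, 7], "D": [4, 4, 2]},
    index=["Q1", "Q2", "Q3"],
)

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

counts.plot(kind="bar", ax=axes[0], title="Grouped bar: responses per option")
counts.plot(kind="bar", stacked=True, ax=axes[1], title="Stacked bar: option shares")

im = axes[2].imshow(counts.values, cmap="Blues")   # darker cells = more responses
axes[2].set_xticks(range(len(counts.columns)))
axes[2].set_xticklabels(counts.columns)
axes[2].set_yticks(range(len(counts.index)))
axes[2].set_yticklabels(counts.index)
axes[2].set_title("Heat map of response counts")
fig.colorbar(im, ax=axes[2])

plt.tight_layout()
plt.show()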
