
International Conference on Innovations in Power and Advanced Computing Technologies [i-PACT2017]

Data Analysis Using Box and Whisker Plot for Lung Cancer

Chandrasegar Thirumalai, IEEE Member,
School of Information Technology and Engineering,
VIT University, Vellore, India.
[email protected]

Vignesh M
MS Software Engineering,
School of Information Technology and Engineering,
VIT University, Vellore, India.
[email protected]

Balaji R
MS Software Engineering,
School of Information Technology and Engineering,
VIT University, Vellore, India.
[email protected]

978-1-5090-5682-8/17/$31.00 ©2017 IEEE

Abstract— In statistical analysis we start from a collection of data and analyze it according to our requirements. Statistical analysis covers the collection, organization, analysis, and presentation of data, and it helps reveal underlying patterns, relationships, and trends among data samples. The R system for statistical computing is an environment for data analysis and graphics. Here we implement the boxplot method and the control chart method for a lung cancer dataset. With the help of a boxplot, we can easily relate samples to one another and find the outliers.

Keywords: Data analysis, Lung Cancer, Decision making

I. INTRODUCTION

We have taken a lung cancer dataset of 12 primary attributes, as shown in Tables I and II.

TABLE I. DATA SET OF 1ST PART OF LUNG CANCER ATTRIBUTES.

Age  Smoking status  Years smoked  Average per day  Gender  Grade
68   Smoker          10            15               Male    UG
77   Former Smoker   15            10               Male    PG
68   Non Smoker      0             0                Male    PG
71   Smoker          27            10               Male    Nil
74   Smoker          10            5                Male    Nil
51   Smoker          10            3                Female  UG
54   Former Smoker   14            6                Female  PG
50   Non Smoker      0             0                Female  Nil
60   Smoker          5             5                Male    UG
54   Smoker          12            5                Male    PG
54   Non Smoker      0             0                Male    UG
56   Former Smoker   12            12               Male    Nil
87   Smoker          10            10               Male    PG
45   Non Smoker      0             0                Male    PG
76   Former Smoker   25            12               Male    UG

To analyze the relevant data of the lung cancer dataset, we have applied the boxplot data analysis method, which is shown in Section III. A boxplot is a data analysis method used to summarize the samples. With a boxplot, we can easily compare different datasets. A boxplot is also called a box and whisker plot.

TABLE II. DATA SET OF 2ND PART OF LUNG CANCER ATTRIBUTES.

Race              Height  Weight  Family history  COPD  Year  Cancer
Asian             175     85      No              Yes   2000  Yes
Asian             180     90      Yes             Yes   2001  Yes
Asian             182     57      Yes             No    2002  No
American Indian   170     80      Yes             Yes   2003  Yes
African American  182     85      No              Yes   2000  No
White             170     60      Yes             Yes   2002  No
Latin             175     65      No              No    2003  No
Asian             178     59      Yes             No    2004  No
American Indian   187     70      No              No    2005  No
American Indian   187     54      Yes             Yes   2002  No
American Indian   187     56      Yes             Yes   2003  No
Asian             187     58      Yes             Yes   2001  Yes
Asian             185     89      Yes             Yes   2003  Yes
Asian             185     84      No              Yes   2002  No
Asian             185     74      No              Yes   2004  Yes

These datasets are used as the input for computing the Pearson correlation [6], [11], [14], [16], [22], [24]. Nowadays, enormous amounts of data are recorded by banks, and exploring them requires complex computation. We performed a product metric analysis on the given dataset. From the data analysis [8], [9], [12], [17], [18], [20] we can decide which attributes should be kept
and which attributes can be removed. For instance, in the Pearson method, if the value of r is more than 0.5 the attributes are considered strongly related, and if it is below 0.3 the attributes are weakly related.

Some of the earlier techniques for evaluating decisions on the basis of the relationship between attributes are Spearman [11], the Analytical Hierarchical Process (AHP) [7], [13], [15], and the Traveling Salesman Problem (TSP) [26]. Sensitive information among various components [19], [21], [28], [30], [32], [34], [36] of the bank stock model is managed by recent secured systems [23], [25], [27], [29], [31], [33], [35], [37].

In the boxplot method, the input dataset is split into quartiles. A boxplot has a minimum value, lower quartile, median, upper quartile, and maximum value. It contains one box, which runs from the lower quartile to the upper quartile; the difference between the upper quartile and the lower quartile is the length of the box. Inside the box, one vertical line is drawn at the median of the dataset. The median of the lower half of the samples is called the "lower quartile" and the median of the upper half is called the "upper quartile". Outside the box, two more vertical lines are drawn: the one near the upper quartile is called the upper whisker and the one near the lower quartile is called the lower whisker, as shown in Fig. 1. The easiest way to find the quartiles is to first sort the data and take the minimum and maximum values as the lower bound and upper bound, respectively. The lower quartile, median, and upper quartile can then be found using the method in Section II.

Fig. 1. Box Plot Attributes.

II. DATA ANALYSIS

A. Box Plot

Step 1: Sort the data on a primary attribute.

Step 2: Calculate the median.

Step 3: Calculate the quartiles.
Quartiles: Q1 (25th percentile), Q3 (75th percentile)
Inter-quartile range: IQR = Q3 – Q1
Five-number summary: min, Q1, M, Q3, max
Boxplot: the ends of the box are the quartiles, the median is marked, whiskers are added, and outliers are plotted individually.

Step 4: Calculate the outliers: values more than 1.5 x IQR below Q1 or above Q3.

III. CALCULATION AND DISCUSSIONS

The following sample dataset is used to show how the boxplot method works.

TABLE III. SAMPLE DATASET OF 1ST PART BETWEEN AGE 25 TO 45.

Age  Smoking status  Years smoked  Average per day  Gender  Grade
25   Smoker          12            15               Male    Nil
21   Non Smoker      0             0                Male    Nil
22   Former Smoker   5             2                Male    Nil
28   Smoker          10            8                Female  PG
35   Smoker          7             3                Male    PG
18   Former Smoker   8             2                Female  PG
19   Non Smoker      0             0                Female  PG
40   Smoker          12            6                Male    PG
45   Smoker          45            4                Female  PG
23   Smoker          2             5                Male    PG

TABLE IV. SAMPLE DATASET OF 2ND PART BETWEEN AGE 25 TO 45.

Race   Height  Weight  Family history  Cancer  Year
Asian  180     75      Yes             Yes     2005
Asian  178     80      No              No      2004
Asian  165     78      No              No      2005
Asian  178     79      Yes             No      2004
Asian  189     75      Yes             Yes     2003
Asian  175     80      Yes             No      2005
Asian  148     79      No              Yes     2005
Asian  168     72      Yes             No      2003
Asian  189     85      No              No      2004
Asian  168     69      No              No      2005

Fig. 2. Smokers by Ages.

The above boxplot shows that, compared with former smokers and nonsmokers, smokers have a higher chance of being affected by lung cancer, because the boxplot for smokers has a higher median. With respect to the age attribute, people aged 25 to 40 have a high chance of the disease.
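
The steps of Section II.A can be sketched in a few lines of Python (the paper works in R; Python is used here only for illustration). This is a minimal sketch, assuming the median-of-halves rule described in the Introduction for Q1 and Q3 and at least two samples. The function names five_number_summary and outliers are hypothetical, and other quartile conventions (see [2]) can give slightly different values. The inputs are the Age and Years smoked columns of Table III.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)            # Step 1: sort the data
    n = len(data)
    half = n // 2
    lower = data[:half]              # samples below the median position
    upper = data[half + n % 2:]      # samples above it (middle value skipped if n is odd)
    # Steps 2 and 3: median of all samples, then median of each half
    return data[0], median(lower), median(data), median(upper), data[-1]


def outliers(values, k=1.5):
    """Step 4: values more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Columns copied from Table III
ages = [25, 21, 22, 28, 35, 18, 19, 40, 45, 23]
years_smoked = [12, 0, 5, 10, 7, 8, 0, 12, 45, 2]

print(five_number_summary(ages))          # min, Q1, M, Q3, max for Age
print(outliers(ages))
print(five_number_summary(years_smoked))
print(outliers(years_smoked))
```

On these ten samples the Age column gives Q1 = 21, M = 24, and Q3 = 35 with no outliers. For Years smoked the value 45 lies above the upper fence Q3 + 1.5 x IQR = 27, so a boxplot would plot it individually.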

Fig. 3. Cancer in Years (2003 – 2005).

The above boxplot compares the years 2004 and 2005. In 2004 the chance of being affected by cancer is very low, because the median lies near the lower quartile, whereas in 2005 the chance of getting cancer is higher, because the median is near the upper quartile. This is easy to see from the boxplot.

IV. NUMERICAL RESULT ANALYSIS

A. Boxplot for cancer in years

Fig. 4. Box plot for Cancer in Years.

In Fig. 4, the median shows that the number of people affected by cancer is higher when compared with nonsmokers.

B. Boxplot for all the attributes in the dataset

Fig. 5. Boxplot for Overall Attributes.

Fig. 5 shows the boxplot for all attributes, with outliers.

C. Boxplot for Smoking status based on Age

Fig. 6. Boxplot for Smokers by Ages.

Fig. 6 shows that, between the ages of 60 and 80, smokers and former smokers have a higher chance of getting cancer than nonsmokers.

D. Boxplot for Smoking status based on Year

Fig. 7. Boxplot for Smoking Status of the Peoples (2000 – 2005).

The number of smokers is higher in 2004 than in the other years from 2000 to 2005. Former smokers also have a lower chance of getting lung cancer than nonsmokers and smokers.

E. GG plot for Smoking Status vs Years Smoked

Fig. 8. GG plot for Smoking Status vs Years Smoked.

Fig. 8 shows that the average number of cigarettes per day is higher for smokers than for former smokers and nonsmokers. Here, the maximum average consumption per day is 20 and the minimum is 0.
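
A grouped boxplot such as the one in Section IV.C compares one box per smoking status, so the quantity being compared is the per-group median. The sketch below illustrates that grouping on the 15 rows of Table I only. It is an assumption that the published figures were drawn from a larger sample, so these numbers need not match Fig. 6, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import median

# (Age, Smoking status) pairs copied from Table I
rows = [
    (68, "Smoker"), (77, "Former Smoker"), (68, "Non Smoker"),
    (71, "Smoker"), (74, "Smoker"), (51, "Smoker"),
    (54, "Former Smoker"), (50, "Non Smoker"), (60, "Smoker"),
    (54, "Smoker"), (54, "Non Smoker"), (56, "Former Smoker"),
    (87, "Smoker"), (45, "Non Smoker"), (76, "Former Smoker"),
]

# Collect the ages belonging to each smoking status
ages_by_status = defaultdict(list)
for age, status in rows:
    ages_by_status[status].append(age)

# One median per group: the line drawn inside each box
group_medians = {status: median(ages) for status, ages in ages_by_status.items()}

for status, med in group_medians.items():
    print(f"{status}: n={len(ages_by_status[status])}, median age={med}")
```

For Table I this gives a median age of 68 for smokers, 66 for former smokers, and 52 for nonsmokers.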

F. 3D plot for Lung Cancer

Fig. 9. The 3D plot of Lung Cancer.

Fig. 9 shows cancer, years smoked, and smoking status.

G. Scatterplot for Lung cancer

Fig. 10. Scatter plot of Year, Smoking Status, Years Smoked, and Age.

Fig. 11. Lung Cancer Causes Options.

From the scatterplot diagram in Fig. 11, we can easily see the relationships between the attributes. Here we have four attributes and four columns: the first column is for year, the second for smoking status, the third for years smoked, and the fourth for age.

H. 3D Scatterplot

Fig. 12. 3D Scatter plot of Year, Years Smoked, and Age.

Fig. 13. Lung Cancer Chances.

Fig. 13 shows the chance of getting lung cancer for all the attributes in the dataset. The first of these, age, shows that people between the ages of 55 and 90 who have smoking habits have a high chance of lung cancer.

I. Control chart

Fig. 14. C Chart for Cancer over a period of Years.

In Fig. 14, the upper control limit for age is 1563 (15.63), the control limit (center line) is 1448 (14.48), and the lower control limit is 1334 (13.34). Cancer symptoms can mostly be identified between the ages of 13 and 15.
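
For reference, a c chart conventionally places its center line at the mean count and its control limits three standard deviations away, where the standard deviation of a count is taken as the square root of the mean and the lower limit is floored at zero. The sketch below illustrates that convention. The paper does not state how Fig. 14 was computed, so treating the reported center value 1448 as the mean count is an assumption, and the function names and the short list of counts are hypothetical.

```python
import math


def limits_from_center(c_bar):
    """Return (LCL, CL, UCL) for a c chart with mean count c_bar."""
    spread = 3 * math.sqrt(c_bar)          # three-sigma band for count data
    return max(0.0, c_bar - spread), float(c_bar), c_bar + spread


def c_chart_limits(counts):
    """Compute c chart limits from a list of per-sample counts."""
    return limits_from_center(sum(counts) / len(counts))


# Center value reported for Fig. 14, used here as the assumed mean count
lcl, cl, ucl = limits_from_center(1448)
print(round(lcl, 1), cl, round(ucl, 1))

# Hypothetical counts, only to show the call with raw data
example_counts = [4, 2, 3, 5, 1, 3]
print(c_chart_limits(example_counts))
```

With a center of 1448 the formula gives limits of about 1333.8 and 1562.2, close to the 1334 and 1563 reported for Fig. 14.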

J. Control Chart

Fig. 15. C Chart of 300 Samples of Smokers for Lung Cancer Cause.

In the above control chart, the upper control limit is 18 and the control limit is 2.9. It is the control chart for all the data in the dataset. In the dataset of 300 samples, some people have a high chance of getting lung cancer.

V. CONCLUSION

The purpose of this paper is to use the "box and whisker plot" method to visualize the samples of the dataset; from those results we can easily find relationships between the attributes. From the boxplot method, we learned at which ages smokers and former smokers are most likely to get lung cancer. With the help of these boxplot results, we can build a system that takes some input from the user and predicts whether the person has any chance of getting cancer.

REFERENCES
[1] Kampstra, Peter. "Beanplot: A boxplot alternative for visual comparison of distributions." Journal of Statistical Software 28, no. 1 (2008): 1-9.
[2] Frigge, Michael, David C. Hoaglin, and Boris Iglewicz. "Some implementations of the boxplot." The American Statistician 43, no. 1 (1989): 50-54.
[3] Benjamini, Yoav. "Opening the Box of a Boxplot." The American Statistician 42, no. 4 (1988): 257-262.
[4] Hubert, Mia, and Ellen Vandervieren. "An adjusted boxplot for skewed distributions." Computational Statistics & Data Analysis 52, no. 12 (2008): 5186-5201.
[5] Thriumani, Reena, et al. "Cancer detection using an electronic nose: A preliminary study on detection and discrimination of cancerous cells." Biomedical Engineering and Sciences (IECBES), 2014 IEEE Conference on. IEEE, 2014.
[6] Hauke J., Kossowski T., Comparison of values of Pearson's and Spearman's correlation coefficient on the same sets of data. Quaestiones Geographicae 30(2), Bogucki Wydawnictwo Naukowe, Poznań 2011, pp. 87–93, 3 figs, 1 table. DOI 10.2478/v10117-011-0021-1, ISBN 978-83-62662-62-3, ISSN 0137-477X.
[7] Piovani J.I., 2008. The historical construction of correlation as a conceptual and operative instrument for empirical research. Quality & Quantity 42: 757–777.
[8] P. Dhavachelvan, Chandra Segar T, K. Satheskumar, "Evaluation of SOA Complexity Metrics Using Weyuker's Axioms," IEEE International Advance Computing (IACC), India, pp. 2325 – 2329, March 2009
[9] Halstead Metric for Intelligence, Effort, Time predictions, DOI:10.13140/RG.2.2.17988.42881
[10] Fisher R.A., 1921. On the "probable error" of a coefficient of correlation deduced from a small sample. Metron 1: 3–32.
[11] Spearman C.E, 1904b. General intelligence, objectively determined and measured. American Journal of Psychology 15: 201–293.
[12] Software metric Numerical Data analysis using Box plot and control chart methods, VIT University, DOI:10.13140/RG.2.2.27422.95041
[13] Vaishnavi B, Karthikeyan J, Kiran Yarrakula, Chandrasegar Thirumalai, "An Assessment Framework for Precipitation Decision Making Using AHP", International Conference on Electronics and Communication Systems (ICECS), IEEE & 978-1-4673-7832-1, Feb. 2016
[14] Griffith D.A., 2003. Spatial autocorrelation and spatial filtering. Springer, Berlin.
[15] Chandrasegar Thirumalai, Senthilkumar M, "An Assessment Framework of Intuitionistic Fuzzy Network for C2B Decision Making", International Conference on Electronics and Communication Systems (ICECS), IEEE & 978-1-4673-7832-1, Feb. 2016
[16] Rodgers J.L. & Nicewander W.A., 1988. Thirteen ways to look at the correlation coefficient. The American Statistician 42 (1): 59–66.
[17] F. Fioravanti, P. Nesi, "A method and tool for assessing object-oriented projects and metrics management," Journal of Systems and Software, Volume 53, Issue 2, 31 August 2000, Pages 111-136
[18] Galton F., 1875. Statistics by intercomparison. Philosophical Magazine 49: 33–46
[19] Chandrasegar Thirumalai, Viswanathan P, "Diophantine based Asymmetric Cryptomata for Cloud Confidentiality and Blind Signature applications," JISA, Elsevier, 2017.
[20] Galton F., 1877. Typical laws of heredity. Proceedings of the Royal Institution 8: 282–301.
[21] Chandrasegar Thirumalai, Sathish Shanmugam, "Multi-key distribution scheme using Diophantine form for secure IoT communications," IEEE IPACT 2017.
[22] Galton F., 1888. Co-relations and their measurement, chiefly from anthropometric data. Proceedings of the Royal Society of London 45: 135–145.
[23] Chandrasegar Thirumalai, Senthilkumar M, "Spanning Tree approach for Error Detection and Correction," IJPT, Vol. 8, Issue No. 4, Dec-2016, pp. 5009-5020.
[24] Galton F., 1890. Kinship and correlation. North American Review 150: 419–431.
[25] Chandrasegar Thirumalai, Senthilkumar M, "Secured E-Mail System using Base 128 Encoding Scheme," International journal of pharmacy and technology, Vol. 8 Issue 4, Dec. 2016, pp. 21797-21806.
[26] M.Senthilkumar, T.Chandrasegar, M.K. Nallakaruppan, S.Prasanna, "A Modified and Efficient Genetic Algorithm to Address a Travelling Salesman Problem," in International Journal of Applied Engineering Research, Vol. 9 No. 10, 2014, pp. 1279-1288
[27] Nallakaruppan, M.K., Senthil Kumar, M., Chandrasegar, T., Suraj, K.A., Magesh, G., "Accident avoidance in railway tracks using Adhoc wireless networks," 2014, IJAER, 9 (21), pp. 9551-9556.
[28] T Chandra Segar, R Vijayaragavan, "Pell's RSA key generation and its security analysis," in Computing, Communications and Networking Technologies (ICCCNT) 2013, pp. 1-5
[29] Chandrasegar Thirumalai, Senthilkumar M, Vaishnavi B, "Physicians Medicament using Linear Public Key Crypto System," in International conference on Electrical, Electronics, and Optimization Techniques, ICEEOT, IEEE & 978-1-4673-9939-5, March 2016.
[30] Chandrasegar Thirumalai, "Physicians Drug encoding system using an Efficient and Secured Linear Public Key Cryptosystem (ESLPKC)," International journal of pharmacy and technology, Vol. 8 Issue 3, Sep. 2016, pp. 16296-16303
[31] E Malathy, Chandra Segar Thirumalai, "Review on non-linear set associative cache design," IJPT, Dec-2016, Vol. 8, Issue No.4, pp. 5320-5330
[32] "DDoS: Survey Of Traceback Methods", International Joint Journal Conference in Engineering 2009, ISSN 1797-9617.
[33] Chandrasegar Thirumalai, Senthilkumar M, Silambarasan R, Carlos Becker Westphall, "Analyzing the strength of Pell's RSA," IJPT, Vol. 8 Issue 4, Dec. 2016, pp. 21869-21874.
[34] Chandramowliswaran N, Srinivasan.S and Chandra Segar T, "A Novel scheme for Secured Associative Mapping" The International J. of Computer Science and Applications (TIJCSA) & India, TIJCSA Publishers & 2278-1080, Vol. 1, No 5 / pp. 1-7 / July 2012
[35] Chandrasegar Thirumalai, "Review on the memory efficient RSA variants," International Journal of Pharmacy and Technology, Vol. 8 Issue 4, Dec. 2016, pp. 4907-4916.
[36] Vinothini S, Chandra Segar Thirumalai, Vijayaragavan R, Senthil Kumar M, "A Cubic based Set Associative Cache encoded mapping," International Research Journal of Engineering and Technology (IRJET), Volume: 02 Issue: 02 May -2015
[37] Chandrasegar Thirumalai, Himanshu Kar, "Memory Efficient Multi Key (MEMK) generation scheme for secure transportation of sensitive data over Cloud and IoT devices," IEEE IPACT 2017.
