Lab 8

This lab report describes exploratory data analysis tasks performed on the diamonds dataset using R and ggplot2. The visualizations include bar plots of the cut distribution and of cut broken down by color, histograms of carat weight overall and in a price-filtered subset, a scatter plot of carat versus price, and box plots of price by cut. Calculations on a test vector with NA values demonstrate the effect of keeping or removing missing data.

Uploaded by

Roaster Guru

#Lab - 8 Exploration

1. The first 6 rows of the diamonds data set and its structure are given below. Using this
data set, do the following tasks with the ggplot2 package:
[Figure: first 6 rows and structure of the diamonds data set]
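
A minimal sketch of how the preview mentioned in the task could be regenerated, assuming the ggplot2 package (which ships the diamonds data set) is installed. The variable name first_rows is introduced here only for illustration.

```r
library(ggplot2)  # provides the diamonds data set

# First six rows, as mentioned in the task statement
first_rows <- head(diamonds)
print(first_rows)

# Column types and a short preview of each variable
str(diamonds)
```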

a. Study the distribution of the quality of the cut (cut).

Code
library(ggplot2)
ggplot(diamonds, aes(x = cut)) + geom_bar()
Output
Using the ggplot2 package and the diamonds dataset, a bar plot was generated. This
plot showcased the frequency of each diamond cut category, offering a visual
representation of the distribution of cuts within the dataset.
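
The heights of the bars can also be read as numbers. The sketch below is one possible way to tabulate them with base R; the names cut_counts and cut_share are illustrative and not part of the original lab.

```r
library(ggplot2)

# Count the diamonds in each cut category; geom_bar() draws these counts
cut_counts <- table(diamonds$cut)
print(cut_counts)

# The same counts expressed as proportions of the whole data set
cut_share <- prop.table(cut_counts)
print(round(cut_share, 3))
```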

b. Study the distribution of the weight of the diamond (carat).

Code
ggplot(diamonds, aes(x = carat)) + geom_histogram(binwidth = 0.5)
Output
To visualize the distribution of diamond carat weights, a histogram was generated
using the geom_histogram function. A binwidth of 0.5 was specified to create the
plot, enabling us to examine the frequency of carat weights within distinct ranges.
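
As a rough illustration of what "frequency within distinct ranges" means, the sketch below counts carat values in 0.5-wide intervals without plotting. It assumes bins that start at 0; geom_histogram() positions its bins differently by default, so the plotted bars may group values slightly differently unless boundary = 0 is set. The names breaks and bin_counts are illustrative.

```r
library(ggplot2)

# Interval edges 0, 0.5, 1.0, ... up to the first edge at or above the maximum carat
breaks <- seq(0, ceiling(max(diamonds$carat) / 0.5) * 0.5, by = 0.5)

# Number of diamonds whose carat falls into each interval
bin_counts <- table(cut(diamonds$carat, breaks = breaks))
print(bin_counts)
```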
c. Study the distribution of the weight of the diamond (carat) when the price (price)
is more than $6000.
Code
ggplot(subset(diamonds, price > 6000), aes(x = carat)) +
  geom_histogram(binwidth = 0.5)

Output

To analyze only the diamonds priced above 6000, the subset function was used to
filter the dataset on the price condition. The filtered data was then passed to
ggplot to draw a histogram of carat values, showing how carat weight is
distributed among the higher-priced diamonds.
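
The filtering step can be inspected on its own before plotting. This is a small sketch using the same subset() call as the lab code; the name expensive is illustrative.

```r
library(ggplot2)

# Keep only the diamonds priced above 6000
expensive <- subset(diamonds, price > 6000)

# How many diamonds remain, and how their carat weights are spread
print(nrow(expensive))
print(summary(expensive$carat))
```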
d. Study the relationship between the diamond’s weight (carat) and its price (price).
Code
ggplot(diamonds, aes(x = carat, y = price)) + geom_point()
Output

By creating a scatter plot, we aimed to investigate the correlation between diamond
carat weight and price. Each diamond in the dataset is represented by one point,
positioned by its carat weight (x axis) and its price (y axis). The plot shows the
overall association between these two variables.
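
The association seen in the scatter plot can also be summarized numerically. The sketch below is an addition to the lab, not part of the original task; it computes two standard correlation coefficients with base R's cor().

```r
library(ggplot2)

# Pearson correlation: strength of the linear relationship
r_pearson <- cor(diamonds$carat, diamonds$price)

# Spearman rank correlation: does not assume the relationship is linear
r_spearman <- cor(diamonds$carat, diamonds$price, method = "spearman")

print(c(pearson = r_pearson, spearman = r_spearman))
```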
e. Study the relationship between the quality of the cut (cut) and the diamond color
(color).
Code
ggplot(diamonds, aes(x = cut, fill = color)) + geom_bar()
Output

Using a bar plot, we visualized the distribution of diamond cuts based on their color.
Each cut category was represented by a separate bar, and within each bar, segments
were used to indicate different colors. This plot provided a clear visualization of the
combination of cut and color within the dataset, allowing for a better understanding of
how these attributes are distributed among the diamonds.
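
The segments of the stacked bars correspond to a two-way table of cut against color. A minimal sketch with base R follows; the names cut_by_color and color_share are illustrative.

```r
library(ggplot2)

# Cross-tabulate cut (rows) against color (columns)
cut_by_color <- table(diamonds$cut, diamonds$color)
print(cut_by_color)

# Share of each color within each cut, so every row sums to 1
color_share <- prop.table(cut_by_color, margin = 1)
print(round(color_share, 3))
```

Plotting the shares instead of the counts is possible by passing position = "fill" to geom_bar(), which makes cuts of different sizes easier to compare.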

f. Study the relationship between the quality of the cut (cut) and the price (price).
Code
ggplot(diamonds, aes(x = cut, y = price)) + geom_boxplot()
Output
To examine the distribution of prices across the diamond cuts, a box plot was
created. For each cut category the plot shows the median, the first and third
quartiles (the box), whiskers that extend to the most extreme values within 1.5
times the interquartile range of the box, and any points beyond the whiskers as
individual outliers. Together these show how prices vary among the different cuts.
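
The quartiles drawn by the box plot can be computed directly. This sketch uses base R's split(), sapply(), and quantile(); the name price_quartiles is illustrative.

```r
library(ggplot2)

# Quartiles of price for each cut: rows are 25%, 50%, 75%; columns are cuts
price_quartiles <- sapply(split(diamonds$price, diamonds$cut),
                          quantile, probs = c(0.25, 0.5, 0.75))
print(price_quartiles)

# Interquartile range per cut (the height of each box)
print(price_quartiles["75%", ] - price_quartiles["25%", ])
```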

2. Create a new vector with the following data: 1,2,3,4,NA,6,7,8,NA,NA. NA means ‘Not
Available’ / Missing Values. Use min, max, and mean functions to get the minimum,
maximum, and average, respectively for this vector. Try using the argument
na.rm=TRUE with these three functions and re-print the results.
Code
vec <- c(1, 2, 3, 4, NA, 6, 7, 8, NA, NA)

# without removing NA values
cat("Min:", min(vec), "\n")
cat("Max:", max(vec), "\n")
cat("Mean:", mean(vec), "\n")

# with NA values removed
cat("Min:", min(vec, na.rm = TRUE), "\n")
cat("Max:", max(vec, na.rm = TRUE), "\n")
cat("Mean:", mean(vec, na.rm = TRUE), "\n")
Output

Given a vector named vec, which includes numeric values as well as missing values represented
as NA, calculations were performed on this vector as follows:

a. Initially, without removing the NA values, the minimum, maximum, and mean of the
vector were computed using the min, max, and mean functions, respectively. Because the
vector contains NA values, each of the three functions returns NA.

b. Next, by excluding the NA values using the na.rm = TRUE argument, the same calculations
were repeated. This allowed for the determination of the minimum, maximum, and mean values
considering only the available numeric values in the vector, excluding the NA values.
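
The two cases can be checked side by side. The sketch below repeats the calls on the same vector and notes the value each one returns; n_missing is an extra name added here for illustration.

```r
vec <- c(1, 2, 3, 4, NA, 6, 7, 8, NA, NA)

# Any NA in the input makes the result NA by default
min(vec)   # NA
max(vec)   # NA
mean(vec)  # NA

# With na.rm = TRUE the NA entries are dropped before computing
min(vec, na.rm = TRUE)   # 1
max(vec, na.rm = TRUE)   # 8
mean(vec, na.rm = TRUE)  # 31 / 7, about 4.43

# Number of missing entries in the vector
n_missing <- sum(is.na(vec))  # 3
```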

Conclusion
In summary, this analysis encompassed the visualization of the diamonds dataset using different
plots, enabling us to gain valuable insights into the distribution of cuts, carat weights, prices, and
color combinations. Furthermore, calculations were conducted on a vector containing NA values,
with and without their removal, to compare statistical summaries. These exploratory tasks
facilitated a comprehensive examination and analysis of the dataset, revealing patterns,
relationships, and summary statistics pertaining to diamonds.
