Dev 3

DATA EXPLORATION AND VISUALISATION LAB MANUAL

Uploaded by

953623243008
Copyright
© All Rights Reserved

3. WORKING WITH NUMPY ARRAYS, PANDAS DATA FRAMES AND BASIC PLOTS USING MATPLOTLIB

import pandas as pd
df = pd.read_csv('/content/data set.csv')
df.head()
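Before plotting, it can help to check which columns are numeric and to pull them out as NumPy arrays. The sketch below is only an illustration: `sample` is a hypothetical stand-in for the CSV (the column names follow this manual, the values are made up), and `year_span` is an assumed derived quantity, not something the dataset defines.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: column names follow the manual,
# values are invented for illustration only.
sample = pd.DataFrame({
    "country": ["ind", "usa", "gbr", "ind"],
    "rank": [120, 45, 78, 300],
    "firstyr": [1998, 1985, 2001, 2010],
    "lastyr": [2020, 2021, 2019, 2022],
})

# Column dtypes show which columns are numeric and which hold text
print(sample.dtypes)

# Keep only the numeric column names
numeric_cols = sample.select_dtypes(include="number").columns.tolist()

# Convert columns to NumPy arrays for element-wise arithmetic
ranks = sample["rank"].to_numpy()
year_span = sample["lastyr"].to_numpy() - sample["firstyr"].to_numpy()

print(numeric_cols)
print(np.mean(ranks), year_span)
```

With the real data, the same calls would be applied to `df` instead of `sample`.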

LINE CHART:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))

# Plot each column against the row index. If 'country' holds text,
# Matplotlib treats its values as categories rather than numbers.
plt.plot(df.index, df['country'], label='country', marker='o', linestyle='-')
plt.plot(df.index, df['rank'], label='rank', marker='o', linestyle='-')

plt.xlabel('Sample')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()

BAR CHART:
plt.figure(figsize=(10, 10))

# Plot 'firstyr' as a bar chart
plt.bar(df.index, df['firstyr'], label='firstyr')

# Plot 'lastyr' as a semi-transparent bar chart at the same positions
plt.bar(df.index, df['lastyr'], label='lastyr', alpha=0.7)
plt.xlabel('Sample')
plt.ylabel('Value')
plt.xticks(df.index)
plt.legend()
plt.grid(True)
plt.show()

SCATTER PLOT:
plt.figure(figsize=(20, 6))
plt.scatter(df['inst_name'], df['rank'], marker='o')

plt.title('Scatter Plot of inst_name vs rank')
plt.xlabel('inst_name')
plt.ylabel('rank')
plt.show()

AREA OR STACKED PLOT:


plt.figure(figsize=(10, 10))

plt.fill_between(df.index, df['authfull'], alpha=0.5, label='authfull')


plt.xlabel('Sample')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()
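
The cell above fills the area under a single column. For the stacked variant named in the heading, one possible approach is Matplotlib's `stackplot`, which layers several numeric series on top of each other. This is a minimal sketch on made-up arrays (`series_a` and `series_b` are hypothetical, not columns of the dataset), and it assumes the series are numeric.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical numeric series, invented for illustration
x = np.arange(5)
series_a = np.array([1, 2, 3, 2, 1])
series_b = np.array([2, 2, 1, 3, 4])

fig, ax = plt.subplots(figsize=(8, 5))

# Each series is drawn on top of the previous one, so the upper edge
# of the plot shows the running total series_a + series_b
layers = ax.stackplot(x, series_a, series_b,
                      labels=["series_a", "series_b"], alpha=0.6)

ax.set_xlabel("Sample")
ax.set_ylabel("Value")
ax.legend(loc="upper left")
ax.grid(True)
```

In a notebook the figure renders at the end of the cell; in a script, call `plt.show()` afterwards.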
PIE CHART:
import pandas as pd
import matplotlib.pyplot as plt

# Count how often each distinct 'rank' value occurs
category_counts = df['rank'].value_counts()
labels = category_counts.index
sizes = category_counts.values
plt.figure(figsize=(8, 8))
plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
plt.title('Distribution of rank values')
plt.axis('equal')
plt.show()
POLAR CHART:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

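attributes = ['authfull','inst_name','country']
n = len(attributes)

# Take the first row and repeat its first value so the outline closes
values = df[attributes].values[0]
values = np.concatenate((values, [values[0]]))

# One evenly spaced angle per attribute, with the first angle repeated to match
angles = np.linspace(0, 2*np.pi, n, endpoint=False).tolist()
angles += angles[:1]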

plt.figure(figsize=(6, 6))
plt.polar(angles, values, marker='o', linewidth=2)
plt.fill(angles, values, 'b', alpha=0.1)

plt.xticks(angles[:-1], attributes)
plt.title('Polar Chart')
plt.show()

HISTOGRAM:
import pandas as pd
import matplotlib.pyplot as plt
column_name = 'rank'
plt.figure(figsize=(6, 6))
plt.hist(df[column_name], bins=20, color='violet', edgecolor='black')
plt.title(f'Histogram of {column_name}')
plt.xlabel(column_name)
plt.ylabel('Frequency')
plt.show()

TABLE CHART:
# In a notebook, a bare DataFrame expression is rendered as a table
df

JUSTIFICATION:
Not all datasets are appropriate for every type of chart. Choosing the right chart depends on
the nature of the data and the message you wish to convey. Below is a summary of common
chart types and their ideal data applications:

1. Bar Charts: Ideal for comparing categories or groups, bar charts are effective for
showing the distribution of categorical data or comparing values across categories.
Examples include clustered and stacked bar charts.

2. Line Charts: Best suited for displaying trends over time, line charts are typically
used with time series data to show how variables change over continuous intervals
(e.g., days, months, or years).

3. Pie Charts: Useful for illustrating how different categories contribute to a whole, pie
charts show the percentage breakdown of a dataset. They work best when there are a
limited number of categories.

4. Scatter Plots: Used to depict the relationship between two variables, scatter plots are
great for identifying correlations, patterns, outliers, or clusters in the data.

5. Histograms: Designed for showing the distribution of a continuous variable,
histograms display the frequency of data points within specified ranges, making them
useful for understanding large datasets.

6. Area Plots: While area plots can be applied to various datasets, they are particularly
good for displaying cumulative data. The filled area emphasizes the magnitude of
changes over time or between categories.

7. Tables: Tables display raw data in a structured form and can accommodate any
dataset, as they merely represent the data without additional analysis or visualization
elements.
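
The clustered bar chart mentioned in item 1 can be sketched by shifting each group of bars half a bar width to either side of a shared category position. The categories and values below are hypothetical and serve only to illustrate the layout.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical categories and values, invented for illustration
categories = ["A", "B", "C"]
first = np.array([3, 5, 2])
second = np.array([4, 1, 6])

positions = np.arange(len(categories))  # one slot per category
width = 0.4                             # width of each individual bar

fig, ax = plt.subplots(figsize=(8, 5))

# Offset the two groups so the bars sit side by side instead of overlapping
bars_first = ax.bar(positions - width / 2, first, width, label="first")
bars_second = ax.bar(positions + width / 2, second, width, label="second")

ax.set_xticks(positions)
ax.set_xticklabels(categories)
ax.set_ylabel("Value")
ax.legend()
```

For a stacked bar chart instead, the second call would keep the same positions and pass `bottom=first` so the second group starts where the first ends.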

Each of these chart types serves a distinct purpose and is best suited for specific kinds of data
analysis. Selecting the most appropriate chart will enhance data interpretation and
communication.
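
As a rough illustration of matching chart type to data, the sketch below suggests a chart from a column's dtype. The function name `suggest_chart`, its rules, and the `max_pie_categories` threshold are all assumptions made for this example; they condense the list above into a rule of thumb and are not a standard API.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def suggest_chart(series, max_pie_categories=6):
    """Return a rule-of-thumb chart suggestion for one column (hypothetical helper)."""
    # Dates usually belong on the x-axis of a line chart
    if is_datetime64_any_dtype(series):
        return "line chart"
    # Numeric values: look at their distribution first
    if is_numeric_dtype(series):
        return "histogram"
    # Text or categorical values: a pie chart only for a few categories
    if series.nunique() <= max_pie_categories:
        return "pie chart"
    return "bar chart"


print(suggest_chart(pd.Series([1.0, 2.5, 3.2])))
print(suggest_chart(pd.Series(["a", "b", "a"])))
```

The threshold of six categories is arbitrary; the point is only that the column type narrows the set of sensible charts before any plotting is done.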
