DV - QB - Solution

Data visualization notes

Uploaded by

Kamini Salunkhe

Data Visualization

QB Solution
Q1. 7 Stages of Data Visualization

Why Planning Is Key for Data Display


Different types of data need different ways to be shown. How you use the data affects how
you should display it. While there are many easy tools for making graphics, complex data
for specialized purposes needs special attention. This book will explain how to choose the
right way to visualize your data based on its characteristics.

1. Information Overload
"Information overload" means being overwhelmed by too much information. Today,
computers are incredibly powerful and cheap, allowing us to analyse large data sets
without needing a research lab. Over the past decade, computer graphics have also
improved, thanks to gaming technology, making it easier and cheaper to create detailed
and interactive visualizations.

2. Data Collection
We’re getting better at gathering data, but we struggle with making the most of it. Much of
the data available online isn’t used effectively because it isn’t visualized well. Despite
collecting a lot of data, we often can’t answer important questions quickly. We need to
improve how we understand and communicate this information.

3. Thinking About Data


Often, we don’t think deeply about what data means. When personal data is lost or stolen,
we tend to think about the immediate concerns, like credit card theft, rather than the
broader implications.

4. Data is Always Changing


Data isn’t static. It changes constantly, so we need to create visualizations that adapt to
new information. This might involve using animations or interactive elements to show how
data evolves over time.

5. What Is the Question?


As data collection has become easier, we often forget the original reason for gathering the
data. This can lead to confusion about how to visualize it. Good data visualization starts
with a clear question: why was the data collected, what’s interesting about it, and what
stories can it tell?

6. Combining Disciplines
Creating effective visualizations requires knowledge from various fields like statistics,
graphic design, and data analysis. Each field has its own methods, but they often work in
isolation. To create meaningful visualizations, these disciplines need to be integrated.

7. The Visualization Process

Understanding data involves several steps:


1. Acquire: Get the data from a file or online.
2. Parse: Organize and categorize the data.
3. Filter: Keep only the data that’s important.
4. Mine: Use statistics to find patterns.
5. Represent: Choose a visual model like a bar graph.
6. Refine: Improve the visualization for clarity and appeal.
7. Interact: Add features to explore the data interactively.
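
The seven steps above can be sketched as a small pipeline. This is only an illustrative sketch, not a reference implementation: the rainfall figures, the function names, and the text bar chart are assumptions made up for the example, and the "Interact" step is reduced to letting the caller choose a year.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a downloaded file.
RAW_CSV = """city,year,rainfall_mm
Mumbai,2023,2400
Indore,2023,950
Jaipur,2023,
Nashik,2023,700
Mumbai,2022,2200
"""

def acquire():
    # 1. Acquire: obtain the raw data (here, from an in-memory string).
    return io.StringIO(RAW_CSV)

def parse(source):
    # 2. Parse: give the raw text structure (one dict per row).
    return list(csv.DictReader(source))

def filter_rows(rows, year):
    # 3. Filter: keep rows for the chosen year that actually have a value.
    return [r for r in rows if r["year"] == year and r["rainfall_mm"]]

def mine(rows):
    # 4. Mine: convert to numbers and compute a simple statistic.
    values = {r["city"]: int(r["rainfall_mm"]) for r in rows}
    return values, statistics.mean(list(values.values()))

def represent(values, width=40):
    # 5. Represent: pick a basic visual model, here a text bar chart.
    top = max(values.values())
    return {city: "#" * round(v / top * width) for city, v in values.items()}

def refine(bars, values):
    # 6. Refine: sort by value and align labels so the chart reads clearly.
    ordered = sorted(bars, key=values.get, reverse=True)
    pad = max(len(city) for city in ordered)
    return [f"{city.ljust(pad)} {bars[city]} {values[city]}" for city in ordered]

def build_chart(year):
    # 7. Interact: the caller chooses which year to explore.
    rows = filter_rows(parse(acquire()), year)
    values, mean = mine(rows)
    return refine(represent(values), values), mean

lines, mean = build_chart("2023")
print("\n".join(lines))
print("mean:", mean)
```

Calling build_chart("2022") instead re-runs the same steps on a different slice of the data, which is the idea behind the Interact step.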

Q2. Difference between Data Visualization and Infographics


Q3. Code of data frame

import pandas as pd
import numpy as np

# Create a DataFrame from a dictionary of columns
data = {
    'Name': ['Akash', 'Priya', 'Rithesh', 'Neha', 'Rahul'],
    'Age': [25, 23, 19, 20, 26],
    'City': ['New Delhi', 'Mumbai', 'Indore', 'Nashik', 'Jaipur']
}
df = pd.DataFrame(data)
df

# Display the first few rows
df.head()

# Display basic information about the DataFrame
df.info()

# Summary statistics of the numerical column
df.describe()
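
As a possible follow-up to the inspection calls above, the sketch below shows a few other common DataFrame operations (column selection, filtering, sorting, and an aggregate) on the same records. The variable names and the age threshold of 22 are arbitrary choices for illustration.

```python
import pandas as pd

# Same illustrative records as in the DataFrame above.
df = pd.DataFrame({
    "Name": ["Akash", "Priya", "Rithesh", "Neha", "Rahul"],
    "Age": [25, 23, 19, 20, 26],
    "City": ["New Delhi", "Mumbai", "Indore", "Nashik", "Jaipur"],
})

# Select a single column (returns a Series).
ages = df["Age"]

# Keep only the rows that satisfy a condition (boolean indexing).
older_than_22 = df[df["Age"] > 22]

# Sort by a column and compute a simple aggregate.
youngest_first = df.sort_values("Age")
mean_age = df["Age"].mean()

print(older_than_22)
print(youngest_first.head(2))
print("Mean age:", mean_age)
```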

Q4. Classification of Digital Data

To classify digital data based on structure, it can be divided into three main categories:
Structured, Semi-Structured, and Unstructured data. Here is an overview of each type:

1. Structured Data
• Definition: Structured data is highly organized and formatted into rows and
columns. It follows a predefined schema, making it easily searchable and
analyzable.
• Characteristics:
o Data is stored in relational databases (e.g., MySQL, Oracle).
o Fields are clearly defined with specific data types (e.g., integer, string, date).
o Querying and processing can be done using SQL.
• Examples: Spreadsheets, databases with customer information, inventory
records.
• Advantages: Easy to manage, query, and analyze using standard tools.
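
A minimal sketch of structured data, assuming Python's built-in sqlite3 module in place of MySQL or Oracle. The customers table, its columns, and the rows are hypothetical; the point is that the schema is fixed in advance and the data is queried with SQL.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema is declared up front: every row has the same typed fields.
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " city TEXT NOT NULL,"
    " joined DATE)"
)
conn.executemany(
    "INSERT INTO customers (name, city, joined) VALUES (?, ?, ?)",
    [
        ("Akash", "New Delhi", "2024-01-15"),
        ("Priya", "Mumbai", "2024-02-03"),
        ("Neha", "Mumbai", "2024-03-21"),
    ],
)

# Because the structure is known, a standard SQL query can summarize it.
rows = conn.execute(
    "SELECT city, COUNT(*) FROM customers GROUP BY city ORDER BY city"
).fetchall()
print(rows)
conn.close()
```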
2. Semi-Structured Data
• Definition: Semi-structured data has some organization but lacks a fixed schema.
It may include tags or metadata to identify elements within the data.
• Characteristics:
o Data is stored in flexible formats like JSON or XML, or in NoSQL databases.
o It is more adaptable than structured data and can evolve over time.
• Examples: JSON files, XML files, NoSQL databases, email messages (where the
subject and sender are structured, but the message body is not).
• Advantages: Allows flexibility and can store diverse types of information without
strict schema requirements.
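
A small sketch of semi-structured data, using Python's standard json module. The two records are invented for illustration: both are tagged with keys, but they do not share the same set of fields, which is what "lacks a fixed schema" means in practice.

```python
import json

# Hypothetical records: same collection, different fields per record.
raw = """[
  {"id": 1, "name": "Akash", "email": "akash@example.com"},
  {"id": 2, "name": "Priya", "tags": ["premium"], "address": {"city": "Mumbai"}}
]"""

records = json.loads(raw)

# The keys act as tags, so the set of fields can be discovered at read time.
fields_seen = sorted({key for rec in records for key in rec})

# Missing fields must be handled explicitly, since no schema guarantees them.
cities = [rec.get("address", {}).get("city") for rec in records]

print(fields_seen)
print(cities)
```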

3. Unstructured Data
• Definition: Unstructured data lacks any predefined format or organization. It can
come in a variety of forms and is more challenging to process.
• Characteristics:
o Includes diverse types of data such as text, images, audio, and video.
o Requires advanced techniques like machine learning or natural language
processing (NLP) to analyze.
• Examples: Images, videos, social media posts, PDFs, audio recordings.
• Advantages: Contains a wealth of valuable insights, particularly for qualitative
analysis, but is difficult to analyze without specialized tools.
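
To illustrate why unstructured data needs extra processing, the sketch below takes a made-up product review and imposes a minimal structure on it by counting words. This is only a toy tokenizer built on the standard library, far simpler than the machine learning or NLP techniques mentioned above.

```python
import re
from collections import Counter

# Hypothetical free-text review: no fields, no schema, just text.
review = "Great phone, great battery. Battery lasts two days!"

# Lowercase the text and split it into word tokens.
tokens = re.findall(r"[a-z']+", review.lower())

# Counting tokens turns the text into something that can be tabulated.
counts = Counter(tokens)

print(tokens)
print(counts.most_common(3))
```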

• Conclusion:
Structured data is highly organized and easy to manage but rigid, while semi-structured
data provides more flexibility and adaptability. Unstructured data, though harder to
process, often holds the most diverse and potentially valuable information. Each type of
data serves different purposes depending on the needs of an organization.

Q5. Reading the data from different files

# Mount Google Drive (works only inside Google Colab)
from google.colab import drive

drive.mount('/content/drive')

Read the Excel file into a DataFrame

import pandas as pd

# Path to your Excel file
file_path = '/content/drive/MyDrive/sales_data.xlsx'

# Read the named sheet into a DataFrame
# (reading .xlsx files requires the openpyxl package)
df = pd.read_excel(file_path, sheet_name='SaleData')

# Display the first few rows of the DataFrame
print(df.head())

Read the CSV file into a DataFrame

import pandas as pd

# Path to your CSV file
file_path = '/content/drive/MyDrive/data.csv'

# Read the CSV file into a DataFrame
df = pd.read_csv(file_path)

# Display the first few rows of the DataFrame
print(df.head())

Read the JSON file into a DataFrame

import pandas as pd

# Path to your JSON file
file_path = '/content/drive/MyDrive/sample.json'

# Read the JSON file into a DataFrame
df = pd.read_json(file_path)

# Display the entire DataFrame as a string
print(df.to_string())

Read data from XML file into DataFrame

# Importing pandas
import pandas as pd

# Path to your XML file
file_path = '/content/drive/MyDrive/sample3.xml'

# Read the XML file into a DataFrame
# (the default parser requires the lxml package)
df = pd.read_xml(file_path)

# Display the first few rows of the DataFrame
print(df.head())
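
The readers above all expect files on a mounted Google Drive. To try the same calls without Drive, one option is to pass in-memory text wrapped in io.StringIO, as in this sketch. The product data and variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical file contents held in strings instead of on disk.
csv_text = "product,units\nPen,10\nBook,4\n"
json_text = '[{"product": "Pen", "units": 10}, {"product": "Book", "units": 4}]'

# read_csv and read_json accept file-like objects as well as paths.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv.head())
print(df_json.head())
```

Both calls produce a DataFrame with the same two columns, so the code that follows the read step does not need to know which file format the data came from.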

---Harshada Patil