1. What is Pandas? Name the two main data structures in Pandas.
Pandas is an open-source Python library used for data manipulation and analysis. The two main
data structures are Series and DataFrame.
2. Differentiate between Series and DataFrame with an example.
A Series is a one-dimensional labeled array that holds a single column of values, while a
DataFrame is a two-dimensional labeled table in which each column is a Series. Example:
Series:
s = pd.Series([10, 20, 30])
DataFrame:
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
3. Write Python code to create a Series from a list, a dictionary, and a NumPy array.
import numpy as np
import pandas as pd
From list:
pd.Series([1, 2, 3])
From dictionary (the keys become the index):
pd.Series({'a': 1, 'b': 2})
From NumPy array:
pd.Series(np.array([4, 5, 6]))
4. How can you change the index of a Pandas Series?
Pass the index parameter when creating the Series, or assign a new list of labels of the same length to .index:
s.index = ['a', 'b', 'c']
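The two approaches named above, plus rename() for changing only some labels, could look like the following. This is a minimal sketch; the values and labels are illustrative only.

```python
import pandas as pd

# Option 1: set the labels at construction time with the index parameter.
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

# Option 2: overwrite all labels in place; the new list must have the same length.
s.index = ['a', 'b', 'c']

# Option 3: rename() maps selected labels and returns a new Series.
renamed = s.rename({'a': 'first'})

print(s.index.tolist())        # ['a', 'b', 'c']
print(renamed.index.tolist())  # ['first', 'b', 'c']
```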
5. Explain the use of head() and tail() methods in Pandas.
head() returns the first n rows and tail() returns the last n rows of a DataFrame or Series. Both default to n=5; pass a number to change it, e.g. df.head(10).
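A short sketch of head() and tail() on a ten-row DataFrame; the column name n is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'n': range(1, 11)})  # ten rows holding 1..10

first_five = df.head()          # default: first 5 rows
last_three = df.tail(3)         # explicit count: last 3 rows
all_but_last_two = df.head(-2)  # negative n: everything except the last 2 rows

print(first_five['n'].tolist())  # [1, 2, 3, 4, 5]
print(last_three['n'].tolist())  # [8, 9, 10]
```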
6. What are the key features of Pandas?
Key features include:
- Fast and efficient DataFrame object
- Tools for reading/writing data
- Handling missing data
- Data alignment and reshaping
7. Write a Python program to perform basic operations on Series (addition, subtraction).
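import pandas as pd
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])
print(s1 + s2)  # Addition: element-wise, aligned on the index
print(s1 - s2)  # Subtraction: element-wise, aligned on the index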
8. What is the difference between iloc[] and loc[]?
iloc[] selects by integer position (0-based) and excludes the stop of a slice; loc[] selects by index or column label and includes both ends of a slice.
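The difference is easiest to see when the index is not the default 0, 1, 2. The sketch below uses a made-up three-row DataFrame and selects the same two rows both ways.

```python
import pandas as pd

df = pd.DataFrame({'Marks': [90, 85, 70]}, index=['a', 'b', 'c'])

# loc: label-based; the slice 'a':'b' includes both end labels.
by_label = df.loc['a':'b', 'Marks']

# iloc: position-based; the slice 0:2 covers positions 0 and 1 only.
by_position = df.iloc[0:2, 0]

print(by_label.tolist())     # [90, 85]
print(by_position.tolist())  # [90, 85]
```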
9. How do you check for null values in a DataFrame?
Use df.isnull() (alias: df.isna()) to get a True/False mask of missing values; df.isnull().sum() counts the nulls in each column.
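A small sketch of these checks on an illustrative DataFrame with missing values in both a numeric and a text column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan, 3], 'B': ['x', None, None]})

mask = df.isnull()                      # True wherever a value is missing
per_column = df.isnull().sum()          # number of missing values per column
any_missing = df.isnull().values.any()  # single True/False for the whole frame

print(per_column['A'], per_column['B'])  # 1 2
```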
10. Write a program to create a DataFrame from a dictionary of lists.
import pandas as pd
data = {'Name': ['A', 'B'], 'Marks': [90, 85]}  # each key becomes a column
df = pd.DataFrame(data)
print(df)
11. What are the different ways to read data into a DataFrame in Pandas?
Ways include: read_csv(), read_excel(), read_json(), read_sql(), read_html()
12. Explain the use of read_csv() and to_csv() with an example.
pd.read_csv('data.csv') reads a CSV file into a DataFrame.
df.to_csv('output.csv') writes a DataFrame to a CSV file; pass index=False to leave out the index column.
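The round trip can be sketched without touching the disk by writing to an in-memory text buffer. Using io.StringIO in place of a file name is an assumption made here so the example is self-contained; with real files you would pass paths such as 'output.csv' and 'data.csv' instead.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'], 'Marks': [90, 85]})

# Write the DataFrame as CSV text; index=False omits the 0, 1, ... index column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Read the same CSV text back into a new DataFrame.
restored = pd.read_csv(io.StringIO(csv_text))

print(restored['Marks'].tolist())  # [90, 85]
```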
13. How can you display only specific columns from a DataFrame?
Use column names: df[['col1', 'col2']]
14. What does the describe() function return in a DataFrame?
By default it returns summary statistics for the numeric columns: count, mean, std, min, the 25%, 50%, and 75% percentiles, and max.
15. Write a Python program to read a CSV file and display basic statistics.
import pandas as pd
df = pd.read_csv('data.csv')
print(df.describe())
16. How can you handle missing data in Pandas?
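Use fillna() to replace missing values, dropna() to remove the rows or columns that contain them, or interpolate() to estimate them from neighbouring values.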
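The three methods give different results on the same data. A minimal sketch on an illustrative Series with two gaps:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

filled = s.fillna(0)            # replace each missing value with 0
dropped = s.dropna()            # remove the missing entries entirely
interpolated = s.interpolate()  # default linear interpolation between neighbours

print(filled.tolist())        # [1.0, 0.0, 3.0, 0.0, 5.0]
print(dropped.tolist())       # [1.0, 3.0, 5.0]
print(interpolated.tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]
```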
17. Write code to sort a DataFrame based on values of a particular column.
df.sort_values(by='column_name', ascending=True)  # returns a new sorted DataFrame; use ascending=False for descending order
18. What is the use of the groupby() function? Give an example.
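groupby() splits the rows into groups that share a value in one or more columns, so that an aggregation (mean, sum, count, etc.) can be applied to each group.
Example: df.groupby('column').mean(numeric_only=True)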
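A concrete sketch of grouping and aggregating, using a made-up marks table. Selecting the Marks column before aggregating keeps the text column Name out of the calculation.

```python
import pandas as pd

df = pd.DataFrame({
    'Class': ['X', 'X', 'Y', 'Y'],
    'Name': ['A', 'B', 'C', 'D'],
    'Marks': [90, 80, 70, 60],
})

# One aggregation per group.
mean_marks = df.groupby('Class')['Marks'].mean()

# Several aggregations at once; the result has one column per function.
summary = df.groupby('Class')['Marks'].agg(['min', 'max'])

print(mean_marks['X'], mean_marks['Y'])  # 85.0 65.0
```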
19. Explain the difference between drop() and del.
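drop() is a method that removes rows or columns and returns a new object, leaving the original unchanged unless inplace=True is passed. del is a Python statement that deletes a single column in place (del df['col']); it cannot remove rows.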
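The contrast can be sketched on a small illustrative DataFrame: drop() leaves df untouched, while del changes it.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

without_b = df.drop(columns=['B'])  # new DataFrame; df still has A, B, C
without_row0 = df.drop(index=[0])   # drop() can also remove rows by label

del df['C']                         # removes column C from df itself

print(list(without_b.columns))  # ['A', 'C']
print(list(df.columns))         # ['A', 'B']
```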
20. How can you filter rows in a DataFrame using conditions?
Use boolean indexing: df[df['column'] > value]. Combine conditions with & (and), | (or), and ~ (not), wrapping each condition in parentheses.
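A brief sketch of single, combined, and negated conditions; the table and the threshold of 80 are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'Marks': [90, 85, 60],
    'Class': ['X', 'Y', 'X'],
})

high = df[df['Marks'] > 80]                                # one condition
high_in_x = df[(df['Marks'] > 80) & (df['Class'] == 'X')]  # both must hold
not_high = df[~(df['Marks'] > 80)]                         # negated condition

print(high['Name'].tolist())       # ['A', 'B']
print(high_in_x['Name'].tolist())  # ['A']
print(not_high['Name'].tolist())   # ['C']
```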
21. What is Matplotlib? Why is it used?
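Matplotlib is a Python library for creating static, interactive, and animated plots. It is used to visualize data (line plots, bar charts, pie charts, histograms, etc.) so that trends and comparisons are easier to see than in raw numbers.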
22. Differentiate between plot() and bar() functions in Matplotlib.
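plot() draws a line plot that connects the data points, which suits continuous data or trends; bar() draws one rectangular bar per category, which suits comparing discrete values.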
23. Write a Python program to draw a line plot using Matplotlib.
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [4, 5, 6])  # x values, then y values
plt.show()
24. How do you add labels and title to a plot?
Use plt.xlabel(), plt.ylabel(), and plt.title()
25. What is the purpose of the legend() function?
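legend() adds a box to the plot that maps each line, bar, or marker to its name. The names come from the label argument passed to each plotting call, e.g. plt.plot(x, y, label='Sales').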
26. Write a program to create a bar chart for student marks in 5 subjects.
import matplotlib.pyplot as plt
subjects = ['Math', 'Sci', 'Eng', 'Hist', 'Geo']
marks = [90, 80, 85, 70, 75]
plt.bar(subjects, marks)  # one bar per subject
plt.show()
27. How can you plot multiple lines on the same graph?
Call plt.plot() multiple times before plt.show().
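The answers to questions 24, 25, and 27 can be combined in one short sketch: two plt.plot() calls on the same axes, with axis labels, a title, and a legend. The data and label text are made up, and the 'Agg' backend is selected only so the sketch runs without a display; in an interactive session you would drop that line and finish with plt.show().

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = [1, 2, 3]

plt.figure()
plt.plot(x, [4, 5, 6], label='Series A', color='blue', linestyle='-')
plt.plot(x, [6, 5, 4], label='Series B', color='red', linestyle='--')

plt.xlabel('x')
plt.ylabel('y')
plt.title('Two lines on one graph')
plt.legend()  # uses the label= text given to each plot() call

# Inspect the current axes to confirm what was drawn.
ax = plt.gca()
line_count = len(ax.get_lines())
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```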
28. Explain the parameters of plt.plot().
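x, y: the data points; label: the name shown in the legend; color: the line color; linestyle: the line style (e.g. '-', '--', ':'); marker: the symbol drawn at each point; linewidth: the line thickness.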
29. What is the role of show() and savefig() in Matplotlib?
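show() displays the plot in a window (or inline in a notebook); savefig() saves the plot to a file such as a PNG or PDF. In a script, call savefig() before show(), because the figure may be empty once the window has been closed.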
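A minimal sketch of savefig(). Saving to an in-memory io.BytesIO buffer and selecting the 'Agg' backend are assumptions made so the example runs anywhere without creating files or opening a window; normally you would pass a file name such as 'plot.png'.

```python
import io

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [4, 5, 6])

# savefig() accepts a file name or a file-like object; format picks the file type.
buffer = io.BytesIO()
plt.savefig(buffer, format='png', dpi=100)
png_bytes = buffer.getvalue()

plt.close()  # release the figure once it has been saved
```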
30. Write a program to create a pie chart showing percentage distribution of expenses.
import matplotlib.pyplot as plt
expenses = [300, 200, 150]
labels = ['Rent', 'Food', 'Transport']
plt.pie(expenses, labels=labels, autopct='%1.1f%%')  # autopct prints each slice's percentage
plt.show()