
Viva Answers

The document provides answers to common viva questions related to data science, covering topics such as the definition of data science, the use of Python and its libraries like NumPy and Pandas, and various data visualization techniques. It explains key concepts such as arrays, standard distribution curves, scatter plots, and DataFrames. Additionally, it highlights the importance of libraries and aggregation functions in data manipulation and analysis.

Answers to Viva Questions:

1. What is meant by data science?

Data science involves analyzing and interpreting complex data using scientific methods, algorithms, and systems to gain insights and make decisions.

2. Why use Python in data science?

Python is versatile, has extensive libraries (e.g., NumPy, pandas), is easy to learn, and is well-suited for data manipulation, visualization, and machine learning.

3. What is meant by NumPy? Purpose of NumPy?

NumPy (Numerical Python) is a library for numerical computation. It provides a multi-dimensional array type (ndarray) and fast mathematical operations that act on whole arrays.
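
As a minimal sketch of the two features named above (multi-dimensional arrays and mathematical operations on them), the snippet below builds a small 2-D array and applies two operations to it. The values and variable names are made up for illustration.

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns (illustrative values).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

shape = matrix.shape               # dimensions of the array: (rows, columns)
doubled = matrix * 2               # element-wise multiplication, no explicit loop
column_sums = matrix.sum(axis=0)   # add up the values down each column
```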

4. Pandas definition:

Pandas is a Python library used for data manipulation and analysis. It offers data structures such as Series (one-dimensional) and DataFrame (two-dimensional) for handling structured data.

5. Define CSV file:

A CSV (Comma-Separated Values) file stores tabular data as plain text, where each line represents a row, and fields are separated by commas.
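
The row-and-field layout described above can be seen by parsing a small piece of CSV text with Python's standard csv module. This is only a sketch: the text below stands in for a real file, and the column names and values are hypothetical.

```python
import csv
import io

# In-memory text standing in for a CSV file (hypothetical contents).
# The first line is the header; each later line is one row.
csv_text = "name,score\nAsha,91\nRavi,78\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)   # one dict per data row, keyed by the header names

for row in rows:
    # Fields are read as strings, so numbers must be converted explicitly.
    print(row["name"], int(row["score"]))
```

For a file on disk, the same reader can be given a file object opened with open(path, newline="").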

6. What are libraries in Python?

Libraries are collections of pre-written code (functions, classes) that perform specific tasks, such as NumPy for numerical computation or matplotlib for visualization.


7. Basics of arrays:

An array is a collection of elements of the same type, stored at contiguous memory locations, allowing efficient data processing.

8. Why use arrays in data science?

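Arrays enable faster processing and more efficient manipulation of large datasets than Python lists, because an operation is applied to the whole array at once (vectorization) rather than element by element in a Python loop.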
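
To make the comparison with lists concrete, the sketch below performs the same calculation both ways. The data is hypothetical, and the example shows only the difference in style; any speed advantage becomes noticeable on much larger inputs.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]   # hypothetical values

# Plain list: the multiplication runs once per element in a Python-level loop.
list_result = [p * 1.5 for p in prices]

# NumPy array: a single expression is applied to every element at once.
array_result = np.array(prices) * 1.5
```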

9. Standard distribution curve:

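A bell-shaped curve representing a normal distribution. It is symmetric about the mean: most data points lie near the mean, and fewer lie farther away. The standard normal distribution is the special case with mean 0 and standard deviation 1.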
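
A quick way to check the "most points near the mean" property is to draw random samples and count how many fall within one standard deviation. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1); the seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)       # fixed seed so the run is repeatable
samples = rng.standard_normal(100_000)    # draws from a normal with mean 0, sd 1

# Fraction of samples within one standard deviation of the mean.
within_one_sd = np.mean(np.abs(samples) < 1.0)
print(round(float(within_one_sd), 2))     # close to the theoretical value of about 0.68
```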

10. Scatter plot (why, when, where):

A scatter plot visualizes the relationship between two numeric variables by drawing each observation as a point. It is used in exploratory data analysis to identify trends, correlations, clusters, or outliers.
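
A minimal scatter plot in matplotlib might look like the sketch below. The data (hours studied against exam score) is invented for illustration, and the "Agg" backend is selected only so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; omit this line to open a window
import matplotlib.pyplot as plt

# Hypothetical data: one point per student.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 68, 70, 79]

fig, ax = plt.subplots()
points = ax.scatter(hours, scores)   # each (hours, score) pair becomes one marker
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
```

In an interactive session, plt.show() would display the figure; fig.savefig with a file name would write it to disk.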

11. Contour plot:

A graphical representation of a 3D surface in two dimensions, showing lines of constant value.
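
As a sketch of "lines of constant value", the code below evaluates a simple bowl-shaped surface on a grid and asks matplotlib to draw the contour lines at four chosen heights. The surface z = x^2 + y^2 and the level values are assumptions picked for illustration.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Grid of (x, y) points covering the square from -2 to 2.
x = np.linspace(-2, 2, 50)
y = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(x, y)

Z = X**2 + Y**2         # height of the surface at each grid point

fig, ax = plt.subplots()
# Each contour line joins the points where Z equals one of the given levels.
contours = ax.contour(X, Y, Z, levels=[0.5, 1.0, 2.0, 4.0])
```

For this surface each contour is a circle centred on the origin, since every point at the same distance from the centre has the same height.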

12. Matplotlib library (define, why, when):

Matplotlib is a Python library for creating static, interactive, and animated visualizations. It is used for plotting graphs and analyzing data.

13. Histogram:

A graphical representation of the distribution of numeric data: the value range is divided into intervals (bins), and the height of each bar shows how many values fall in that bin.
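
The counting behind a histogram can be sketched with NumPy alone, without drawing anything. The values and the choice of four bins below are illustrative assumptions.

```python
import numpy as np

values = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # hypothetical data

# Split the range from min to max into 4 equal-width bins and count
# how many values land in each one.
counts, edges = np.histogram(values, bins=4)

print(counts.tolist())   # number of values per bin
print(edges.tolist())    # bin boundaries (one more edge than there are bins)
```

Passing the same data to matplotlib's hist function would draw these counts as bars.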

14. Aggregation function:


Functions like sum(), mean(), or count() that summarize multiple data points into a single value.
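
The sketch below applies these aggregation functions with pandas, first to a whole column and then per group. The table, its column names, and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sales records: two for department A, one for department B.
sales = pd.DataFrame({
    "dept": ["A", "A", "B"],
    "amount": [10, 20, 30],
})

total = sales["amount"].sum()      # one value summarizing the whole column
average = sales["amount"].mean()

# The same aggregations computed separately for each department.
summary = sales.groupby("dept")["amount"].agg(["sum", "mean", "count"])
print(summary)
```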

15. Basics of visualization tools or packages:

Tools like matplotlib, seaborn, and Plotly are used to create graphs, charts, and interactive visualizations to interpret data effectively.

16. DataFrame:

A 2D data structure in pandas, similar to a table, with labeled rows and columns.
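
A small DataFrame with labeled rows and columns could be built as in the sketch below. The row labels, column names, and values are made up for illustration.

```python
import pandas as pd

# Columns are given as a dict; the index supplies the row labels.
df = pd.DataFrame(
    {"age": [21, 23], "city": ["Pune", "Delhi"]},
    index=["s1", "s2"],
)

age_s2 = df.loc["s2", "age"]   # look up one cell by row label and column label
cities = df["city"]            # select a whole column (a pandas Series)
print(df)
```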

17. Linear recursion:

Recursion in which a function makes at most one recursive call per invocation, so the calls form a single chain that ends at a base case (for example, computing a factorial).
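
Factorial is a common textbook illustration of linear recursion, chosen here as an example rather than taken from the notes: each call makes exactly one further call until the base case is reached.

```python
def factorial(n: int) -> int:
    """Return n! for a non-negative integer n using linear recursion."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return 1                      # base case: stops the chain of calls
    return n * factorial(n - 1)       # exactly one recursive call per invocation


print(factorial(5))   # 5 * 4 * 3 * 2 * 1 = 120
```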

18. Basemap:

A matplotlib toolkit used for plotting geographical data and creating maps. New projects often use Cartopy for the same purpose.
