Day 3 - Notes Interview Questions
Question - 3
In Python, what is the primary purpose of the iloc method when working with Pandas
DataFrames?
Interview Question - 1
Scenario: You work for a popular food delivery app. The company
wants to improve the accuracy of delivery time estimates provided to
customers. Currently, delivery times are often inaccurate due to
various factors such as traffic, restaurant preparation time, and
delivery distance.
India’s Most Affordable Pay After Placement Data Analytics Course
+91-7880-113-112 Contact or Fill the Form in the Description
Day - 3 10 Days Python Data Analytics Interview Class
1) What data cleaning and preprocessing steps would you perform on
the collected data to ensure its quality and reliability for analysis?
Descriptive Statistics: Calculate summary statistics for delivery times, distances, and other relevant
variables to get an overview of the data distribution.
Correlation Analysis: Use correlation coefficients to measure the strength and direction of relationships
between delivery times and factors like traffic, preparation time, and distance.
Data Visualization: Create visualizations such as scatter plots, histograms, or heatmaps to visualize patterns
and trends in the data, especially how delivery times vary with different factors.
Time-Series Analysis: Analyze delivery time trends over time to identify any seasonality or temporal
patterns.
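A minimal sketch of the descriptive statistics and correlation steps, assuming pandas is available. The column names and values are hypothetical and deliberately simple (delivery time is an exact linear function of distance), so they do not represent real delivery data.

```python
import pandas as pd

# Hypothetical delivery records; column names and values are made up for illustration.
deliveries = pd.DataFrame({
    "distance_km": [1.0, 2.0, 3.0, 4.0, 5.0],
    "prep_time_min": [10, 12, 9, 15, 11],
    "delivery_time_min": [15.0, 20.0, 25.0, 30.0, 35.0],
})

# Descriptive statistics: count, mean, std, min, quartiles, max per column.
summary = deliveries.describe()

# Correlation analysis: pairwise Pearson correlation between numeric columns.
corr = deliveries.corr()
distance_corr = corr.loc["distance_km", "delivery_time_min"]

print(summary)
print(f"distance vs delivery time correlation: {distance_corr:.2f}")
```

On real data the correlation would be well below 1, and a scatter plot of the same two columns is a useful companion check.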
Word Clouds: Create word clouds to visualize frequently mentioned keywords or phrases in user feedback,
highlighting areas of concern or praise.
Feature Extraction: Extract valuable insights from user feedback by identifying common themes, such as
complaints about late deliveries or positive comments about accurate time estimates.
Quantitative Metrics: Use quantitative metrics like Net Promoter Score (NPS) or Customer Satisfaction
Score (CSAT) to quantify user satisfaction and track improvements over time.
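As a rough illustration of the quantitative metrics item, Net Promoter Score is conventionally the percentage of promoters (ratings 9 to 10) minus the percentage of detractors (ratings 0 to 6) on a 0 to 10 scale. The helper name and the sample ratings below are hypothetical.

```python
def net_promoter_score(ratings):
    """Return NPS for 0-10 ratings: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, not real customer data.
sample_ratings = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]
nps = net_promoter_score(sample_ratings)
print(nps)  # 5 promoters, 3 detractors, 10 responses -> 20.0
```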
Interview Question - 2
Data Alignment: Pandas automatically aligns time series data based on timestamps, ensuring that data
points are correctly matched, even when dealing with missing or irregular intervals.
Data Transformation: You can easily perform common time series operations like shifting, differencing,
and rolling calculations using Pandas, simplifying data preparation for analysis.
Data Visualization: Pandas integrates seamlessly with data visualization libraries like Matplotlib and
Seaborn, enabling the creation of informative time series plots and charts.
Integration with Other Libraries: You can seamlessly integrate Pandas with other data analysis and
machine learning libraries like NumPy, Scikit-Learn, and Statsmodels, allowing for more advanced time
series modeling and forecasting.
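The alignment and transformation points above can be sketched as follows. This is a small illustration with made-up values; the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical daily average delivery times in minutes.
idx = pd.date_range("2024-01-01", periods=5, freq="D")
delivery_minutes = pd.Series([30.0, 32.0, 31.0, 35.0, 40.0], index=idx)

# Data transformation: shifting, differencing, and rolling calculations.
previous_day = delivery_minutes.shift(1)                   # value from the day before
daily_change = delivery_minutes.diff()                     # change versus the day before
rolling_mean = delivery_minutes.rolling(window=3).mean()   # 3-day moving average

# Data alignment: arithmetic matches on timestamps, not on position.
# Dates missing from either series produce NaN in the result.
adjustment = pd.Series(
    [1.0, 2.0], index=pd.to_datetime(["2024-01-02", "2024-01-04"])
)
aligned_sum = delivery_minutes + adjustment

print(aligned_sum)
```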
Interview Question - 3
1. Data Types:
Python Lists: Python lists can contain elements of different data types. For example,
a single list can hold integers, floats, strings, and even other lists.
NumPy Arrays: NumPy arrays are homogeneous, meaning they store elements of the
same data type. This homogeneity allows for efficient memory storage and optimized
numerical operations.
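A short sketch of the data type difference, assuming NumPy is installed. The variable names are illustrative only.

```python
import numpy as np

# A list can mix types freely.
mixed_list = [1, 2.5, "three", [4, 5]]
list_types = [type(x).__name__ for x in mixed_list]

# An array has a single dtype; mixed numeric input is upcast to a common type.
int_array = np.array([1, 2, 3])
upcast_array = np.array([1, 2.5, 3])

print(list_types)
print(int_array.dtype, upcast_array.dtype)
```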
Python lists and NumPy arrays, key differences:
2. Performance:
Python Lists: Lists are not optimized for numerical operations and can be slow on large
datasets. Each element is a separate Python object, and arithmetic over a list runs
element by element in the interpreter, which makes mathematical calculations relatively slow.
NumPy Arrays: NumPy arrays are highly efficient for numerical computations. They are
implemented in C and provide low-level memory optimizations. This makes NumPy arrays
significantly faster than Python lists for numerical operations, especially on large
datasets.
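One way to see the difference is to time the same computation both ways. This is a rough sketch rather than a rigorous benchmark; the array size and function names are arbitrary, and the actual timings depend on the machine.

```python
import timeit

import numpy as np

n = 100_000
values_list = list(range(n))
values_array = np.arange(n, dtype=np.int64)  # int64 so the squares do not overflow


def square_list():
    # Element-by-element loop executed by the Python interpreter.
    return [v * v for v in values_list]


def square_array():
    # Vectorized multiplication executed in compiled code.
    return values_array * values_array


list_seconds = timeit.timeit(square_list, number=10)
array_seconds = timeit.timeit(square_array, number=10)
print(f"list: {list_seconds:.4f}s  array: {array_seconds:.4f}s")
```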
3. Size:
Python Lists: Lists are dynamic, which means you can change their size by appending,
inserting, or removing elements. They do not have a fixed size.
NumPy Arrays: NumPy arrays have a fixed size upon creation, and you cannot change
their size without creating a new array. This fixed size is useful for memory optimization
and efficient data storage.
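The size difference can be illustrated like this. It is a minimal sketch with made-up values; note that `np.append` returns a new array and leaves the original unchanged.

```python
import numpy as np

# A list grows and shrinks in place.
orders = [12, 15]
orders.append(18)
orders.remove(12)

# An array keeps its size; np.append builds and returns a new array.
arr = np.array([12, 15])
grown = np.append(arr, 18)

print(orders)            # the same list object, modified
print(arr.shape, grown)  # arr is still length 2
```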
4. Functionality:
Python Lists: Python lists have few built-in functions for numerical operations. The + and *
operators concatenate and repeat a list rather than doing element-wise arithmetic, so
numerical work needs explicit loops or comprehensions, which are not as optimized as NumPy functions.
NumPy Arrays: NumPy provides a wide range of mathematical and statistical functions
that are optimized for arrays. It enables vectorized operations, broadcasting, and
element-wise computations, making it a powerful tool for scientific and numerical
computing.
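A small sketch contrasting list operators with NumPy's vectorized operations and broadcasting. The values and variable names are illustrative.

```python
import numpy as np

times = [30, 45, 60]

# On a list, * repeats and + concatenates.
list_doubled = times * 2
list_added = times + [5]

# On an array, the same operators work element by element.
arr = np.array(times)
arr_doubled = arr * 2
arr_shifted = arr + 5  # the scalar is broadcast to every element

# Broadcasting a 1-D row across each row of a 2-D array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
row_offsets = np.array([10, 20, 30])
broadcast_sum = matrix + row_offsets

# Built-in statistical functions.
mean_time = arr.mean()

print(list_doubled, arr_doubled)
print(broadcast_sum)
```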
5. Syntax and Convenience:
Python Lists: Python lists are part of the core Python language and are easy to create and
manipulate. They are suitable for general-purpose programming tasks.
NumPy Arrays: NumPy arrays require importing the NumPy library, which adds an extra
step. However, they provide extensive functionality and performance benefits for
numerical tasks.
Interview Question - 4
3) Using a loop:
You can manually iterate through the larger string and count occurrences of
the substring. Here's an example:
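The original example is not present in these notes, so the following is one possible sketch of the loop approach. The function name is hypothetical. This version counts overlapping matches, which differs from the built-in `str.count()` method (that one counts non-overlapping matches only).

```python
def count_occurrences(text, sub):
    """Count how many times sub appears in text, including overlapping matches."""
    if not sub:
        raise ValueError("substring must not be empty")
    count = 0
    # Slide a window of len(sub) characters across text, one position at a time.
    for start in range(len(text) - len(sub) + 1):
        if text[start:start + len(sub)] == sub:
            count += 1
    return count


print(count_occurrences("banana", "ana"))  # overlapping matches at index 1 and 3
print("banana".count("ana"))               # non-overlapping count
```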
Interview Question - 5
You can efficiently create a dictionary from two lists—one containing keys and the
other containing values—using the dict() constructor and the zip() function in
Python.
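For instance, with two hypothetical lists of equal length:

```python
keys = ["name", "city", "orders"]
values = ["Asha", "Pune", 12]

# zip() pairs each key with the value at the same position;
# dict() turns those pairs into a dictionary.
customer = dict(zip(keys, values))
print(customer)
```

If the lists differ in length, `zip()` stops at the shorter one, so extra keys or values are silently dropped.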
Using Dictionary Comprehension:
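A sketch of the comprehension form, using hypothetical lists. It is mainly useful when keys or values need to be transformed or filtered while the dictionary is built.

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

# Plain pairing, equivalent to dict(zip(keys, values)).
paired = {k: v for k, v in zip(keys, values)}

# Transforming values while building the dictionary.
squared = {k: v ** 2 for k, v in zip(keys, values)}

print(paired, squared)
```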
Interview Question - 6
Axis Labels: You can label the x and y axes using plt.xlabel() and plt.ylabel() functions. For
example:
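The original example is missing from these notes; a minimal sketch with made-up data might look like this. The non-interactive "Agg" backend is chosen here only so that the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; omit this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical data points.
distance_km = [1, 2, 3, 4]
delivery_min = [15, 22, 27, 35]

fig, ax = plt.subplots()
ax.scatter(distance_km, delivery_min)

# plt.xlabel() and plt.ylabel() label the current axes.
plt.xlabel("Distance (km)")
plt.ylabel("Delivery time (minutes)")

fig.savefig("delivery_scatter.png")
```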
Legends:
When you have multiple data series or categories on the same plot,
you can add a legend to differentiate them. Seaborn often handles
legends automatically, but you can customize them using plt.legend().
For example:
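The original example is missing from these notes; one possible sketch using plain Matplotlib and made-up data follows. Each series gets a `label`, and `plt.legend()` builds the legend from those labels.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; omit this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical average delivery times per hour of the evening.
hours = [18, 19, 20, 21]
weekday_min = [28, 33, 35, 30]
weekend_min = [32, 40, 42, 36]

fig, ax = plt.subplots()
ax.plot(hours, weekday_min, label="Weekday")
ax.plot(hours, weekend_min, label="Weekend")

# Customize the legend with a title and a fixed position.
legend = plt.legend(title="Order day", loc="upper left")

fig.savefig("delivery_legend.png")
```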
Question - 1
What is the purpose of broadcasting in NumPy?