
Questions For Preparation

question for job prep

Uploaded by

Hrishikesh Bele

SQL Questions

Q1: SQL Query for Joining Two Tables Using Inner Join

Syntax:

```sql

SELECT a.column1, b.column2

FROM table1 a

INNER JOIN table2 b

ON a.common_column = b.common_column;

```

Q2: Display Records Present in One Table but Not in Another Using Left Join

Syntax:

```sql

SELECT a.*

FROM table1 a

LEFT JOIN table2 b

ON a.common_column = b.common_column

WHERE b.common_column IS NULL;

```

Q3: Display Students Who Scored Above the Class Average (Using Partition By)

Table Structure:

Columns: `student_name, class, total_marks, id`

SQL Query:

```sql

SELECT student_name, class, total_marks

FROM (

SELECT student_name, class, total_marks,

AVG(total_marks) OVER (PARTITION BY class) AS class_avg

FROM students
) AS subquery

WHERE total_marks > class_avg;

```

Q4: Perform Pivot operation in SQL (Using CASE WHEN)

```sql

SELECT
    PRODUCT,
    SUM(CASE WHEN YEAR = 2021 THEN SALES ELSE 0 END) AS [2021 SALES],
    SUM(CASE WHEN YEAR = 2022 THEN SALES ELSE 0 END) AS [2022 SALES]
FROM
    SALES
GROUP BY
    PRODUCT;

```

Q5: COALESCE & NVL functions?
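
One way to illustrate this question is a small sketch run through Python's built-in `sqlite3` module, so it needs no database server. The `contacts` table and its columns are hypothetical. `COALESCE` is standard SQL and returns its first non-NULL argument; `NVL` is Oracle's two-argument equivalent and is not available in SQLite, so the sketch uses SQLite's `IFNULL` as a stand-in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for illustration.
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("Asha", "111", None), ("Ben", None, "222"), ("Chen", None, None)],
)

# COALESCE takes any number of arguments and returns the first non-NULL one.
# IFNULL is SQLite's two-argument form, standing in for Oracle's NVL(expr, default).
rows = conn.execute(
    """
    SELECT name,
           COALESCE(mobile, landline, 'n/a') AS best_phone,
           IFNULL(mobile, 'n/a')             AS mobile_or_default
    FROM contacts
    ORDER BY name
    """
).fetchall()

for row in rows:
    print(row)
```

The practical difference to mention in an interview: `COALESCE` accepts a list of fallbacks, while `NVL` accepts exactly one.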

Q6: RANK, DENSE_RANK, and ROW_NUMBER Functions?
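
The difference between the three functions shows up only when there are ties. Below is a minimal sketch using Python's built-in `sqlite3` module (window functions need SQLite 3.25 or later); the `scores` table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data with a tie at 90 marks.
conn.execute("CREATE TABLE scores (student TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 90), ("B", 90), ("C", 80), ("D", 70)],
)

rows = conn.execute(
    """
    SELECT student, marks,
           RANK()       OVER (ORDER BY marks DESC)          AS rnk,
           DENSE_RANK() OVER (ORDER BY marks DESC)          AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY marks DESC, student) AS row_num
    FROM scores
    ORDER BY row_num
    """
).fetchall()

for row in rows:
    print(row)
```

With this data, `RANK` gives 1, 1, 3, 4 (it skips a rank after the tie), `DENSE_RANK` gives 1, 1, 2, 3 (no gap), and `ROW_NUMBER` gives 1, 2, 3, 4 (ties are broken by the extra `student` ordering column).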

Q7: Month-over-Month Sales Aggregation (LAG/LEAD Functions)
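
A possible approach is to aggregate sales per month first and then compare each month with the previous one using `LAG`. The sketch below runs through Python's built-in `sqlite3` module; the `orders` table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical transaction-level table.
conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2024-01-05", 60),
        ("2024-01-20", 40),
        ("2024-02-10", 150),
        ("2024-03-03", 120),
    ],
)

rows = conn.execute(
    """
    WITH monthly AS (
        -- Step 1: aggregate to one row per month.
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(amount)                   AS sales
        FROM orders
        GROUP BY month
    )
    -- Step 2: LAG fetches the previous month's total (NULL for the first month).
    SELECT month, sales,
           LAG(sales) OVER (ORDER BY month)         AS prev_sales,
           sales - LAG(sales) OVER (ORDER BY month) AS mom_change
    FROM monthly
    ORDER BY month
    """
).fetchall()

for row in rows:
    print(row)
```

`LEAD` works the same way but looks at the following row instead of the preceding one.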

Q8: Explain any project with Data visibility?

Q9: Explain the records returned by a left outer join, right outer join, inner join, and full outer join:

Table 1 Table 2
COL1 COL1
NULL 1
1 1
1 1
NULL

Pandas Questions

Q9: Percentile in Pandas

Example in Pandas:

```python

import pandas as pd
df = pd.DataFrame({

'marks': [70, 80, 90, 60, 50]

})

df['percentile'] = df['marks'].rank(pct=True) * 100

print(df)

```

Q10: Describe in Pandas

Usage of `.describe()` function:

```python

df.describe()

```

This function returns summary statistics for numeric columns: count, mean, standard deviation, minimum, quartiles (the 50% row is the median), and maximum.

Q11: Difference between List and Tuple?
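
A short sketch of the two differences that usually matter in practice: lists are mutable while tuples are not, and a tuple of hashable items can be used as a dictionary key while a list cannot. The variable names are illustrative only.

```python
point_list = [1, 2, 3]
point_tuple = (1, 2, 3)

# Lists are mutable: items can be added, removed, or replaced in place.
point_list.append(4)

# Tuples are immutable: item assignment raises TypeError.
try:
    point_tuple[0] = 99
    tuple_mutated = True
except TypeError:
    tuple_mutated = False

# A tuple of hashable items is itself hashable, so it can be a dict key.
lookup = {point_tuple: "ok"}

# A list is not hashable, so it cannot be a dict key or a set member.
try:
    hash(point_list)
    list_hashable = True
except TypeError:
    list_hashable = False

print(point_list, tuple_mutated, list_hashable, lookup[(1, 2, 3)])
```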

Q12: What are the essential steps in EDA (Exploratory Data Analysis)? Examples include null value imputation, outlier handling, and data type checks and conversions.

Q13. Find average sales for each department for each month, then present rows into columns:

Month Dept_A Dept_B Dept_C


Jan 100 220 191
Feb 199 109 188
Mar 111 88 98
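
One way to produce a layout like the table above in pandas is `pivot_table` with `aggfunc="mean"`. This is a minimal sketch: the long-format input, its column names (`month`, `dept`, `sales`), and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical long-format input: one row per individual sale.
sales = pd.DataFrame({
    "month": ["Jan", "Jan", "Jan", "Feb", "Feb", "Feb"],
    "dept":  ["Dept_A", "Dept_A", "Dept_B", "Dept_A", "Dept_B", "Dept_B"],
    "sales": [90, 110, 220, 199, 100, 118],
})

# Average sales per (month, dept), with departments spread across columns.
wide = sales.pivot_table(index="month", columns="dept", values="sales", aggfunc="mean")

# pivot_table sorts the index alphabetically, so restore calendar order explicitly.
wide = wide.reindex(["Jan", "Feb"])

print(wide)
```

The same result can be reached with `groupby(["month", "dept"])["sales"].mean().unstack()`, which makes the two steps (aggregate, then turn rows into columns) explicit.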

Data Visualization Questions

Tableau

Q13: Blending vs Joins

Q14: Limitations in Tableau

Q15: Row Level Security

Q16: Custom Charts


Scenario Based Questions

Q1: Phased Migration of 2 Million Cards to Mastercard

1. Data Points to Ask From the Bank:

Customer demographics (age, income, location).

Card usage patterns (transaction frequency, value).

Card type (credit, debit, prepaid).

Account tenure and history.

Customer segments (VIP, regular, high-risk).

2. Segments Based on Data:

High-value, frequent users.

Low-value, infrequent users.

VIP customers.

High-risk customers (frequent chargebacks).

Dormant or inactive users.

3. Order of Conversion:

Start with VIP customers and high-value, frequent users.

Next, target regular customers.

Finally, migrate dormant or low-usage customers.


Q2: Classifying Bills in Python

Python Code:

```python

bills = [10, 25, 18, 40, 5, 30]  # List of bill values

classification = ['high' if bill > 20 else 'low' for bill in bills]

print(classification)

```

Q3: SQL Query to Select Cities with Population Over 1 Million

SQL Query:

```sql

SELECT city_name, population

FROM cities

WHERE population > 1000000;

```

Additional SQL Queries

Top 5 Sales per Department (With and Without Window Functions)

With Window Function:

```sql

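-- A window-function alias cannot be filtered in the same query level, so rank in a subquery.
SELECT department, sale, sale_rank
FROM (
    SELECT department, sale,
           RANK() OVER (PARTITION BY department ORDER BY sale DESC) AS sale_rank
    FROM sales
) AS ranked
WHERE sale_rank <= 5;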

```
Without Window Function:

```sql

SELECT a.department, a.sale

FROM sales a

WHERE 5 > (

SELECT COUNT(DISTINCT sale)

FROM sales b

WHERE a.department = b.department

AND b.sale > a.sale

);

```

Moving Average with Department

```sql

SELECT department, sale,

       AVG(sale) OVER (
           PARTITION BY department
           ORDER BY sale_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS moving_avg

FROM sales;

```

Project Explanation

Briefly explain a project you've worked on, focusing on:

Your role and responsibilities.

The challenges you faced.

The solutions you implemented.

The outcomes and success metrics.


Python Core Concepts

1. Basic Python Concepts:

Syntax and logical questions.

Differences between list, set, and dictionary.

Report automation using Python.

2. Joins in Python:

Using `merge()` function in pandas to join DataFrames.

Handling joins with nulls and duplicates.

3. Pivoting and Unstacking in Pandas:

Understanding how to transform DataFrames using pivot and unstack.

4. Working with DataFrames:

Filtering DataFrames based on conditions.

Grouping and joining based on specific scenarios.

Extracting rows where a specific day (e.g., Sunday) is present.
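
The join and filtering items above can be sketched in pandas as follows. The DataFrames, column names, and values are hypothetical. Two behaviours are worth knowing: duplicate keys multiply rows (2 matching rows on each side produce 4), and pandas `merge` matches null keys to each other, which differs from SQL, where `NULL = NULL` never matches.

```python
import pandas as pd

# Hypothetical frames: key 1 is duplicated on both sides, and each has one null key.
left = pd.DataFrame({"key": [1, 1, None], "l_val": ["a", "b", "c"]})
right = pd.DataFrame({"key": [1, 1, None], "r_val": ["x", "y", "z"]})

# Default pandas behaviour: 2 x 2 rows for key 1, plus the null-to-null match.
merged = left.merge(right, on="key", how="inner")

# SQL-like behaviour: drop null keys before joining so they cannot match.
sql_like = left.dropna(subset=["key"]).merge(
    right.dropna(subset=["key"]), on="key", how="inner"
)
print(len(merged), len(sql_like))

# Filtering rows that fall on a Sunday (dayofweek: Monday=0 ... Sunday=6).
visits = pd.DataFrame({
    "visit_date": pd.to_datetime(
        ["2024-01-05", "2024-01-07", "2024-01-08", "2024-01-14"]
    ),
    "amount": [10, 20, 30, 40],
})
sundays = visits[visits["visit_date"].dt.dayofweek == 6]
print(sundays)
```

Passing `indicator=True` to `merge` adds a `_merge` column showing whether each row came from the left side, the right side, or both, which helps when debugging unexpected row counts.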

SQL Key Concepts and Queries

1. Date Operations:

Subtracting dates and finding customers who closed accounts within 15 days.

Table: `customer_id, product_id, name, starting_date, end_date`.

2. Joins & Window Functions:

Types of joins: left join, right join, full outer join.

Handling nulls and duplicates in joins.

Ranking functions: `RANK()`, `DENSE_RANK()`, `ROW_NUMBER()`.
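
For the date-operations item above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `accounts` and the sample rows are assumptions; the columns follow the table listed above. Date-difference syntax varies by database (for example, `DATEDIFF` in SQL Server and MySQL); SQLite's `julianday` is used here only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name and rows are hypothetical; columns match the question.
conn.execute(
    "CREATE TABLE accounts (customer_id INTEGER, product_id INTEGER, name TEXT, "
    "starting_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "Asha", "2024-01-01", "2024-01-10"),  # closed after 9 days
        (2, 10, "Ben", "2024-01-01", "2024-03-01"),   # closed after 60 days
        (3, 11, "Chen", "2024-02-01", None),          # still open
    ],
)

rows = conn.execute(
    """
    SELECT customer_id, name,
           CAST(julianday(end_date) - julianday(starting_date) AS INTEGER) AS days_open
    FROM accounts
    WHERE end_date IS NOT NULL
      AND julianday(end_date) - julianday(starting_date) <= 15
    """
).fetchall()

print(rows)
```

The explicit `end_date IS NOT NULL` check documents the intent that open accounts are excluded, even though a NULL difference would fail the comparison anyway.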


Puzzles

Plane Seat Puzzle: How many bowls can fit in a plane seat?

Horse Race Puzzle: 25 horses, 5 can race at a time. How many races are needed to find the top 3 fastest horses?
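
The commonly cited answer to the horse race puzzle is 7 races. The sketch below simulates that strategy and checks it against the true ranking; the function name and the random speeds are illustrative, and it assumes all horses have distinct speeds and that a race reveals only finishing order.

```python
import random


def top3_in_seven_races(speeds):
    """Return (top three horses, races used). speeds maps horse -> distinct speed."""
    races = 0

    def race(horses):
        # A race takes at most 5 horses and reveals only their finishing order.
        nonlocal races
        assert len(horses) <= 5
        races += 1
        return sorted(horses, key=lambda h: speeds[h], reverse=True)

    horses = list(speeds)
    # Races 1-5: five heats of five horses each.
    heats = [race(horses[i:i + 5]) for i in range(0, 25, 5)]
    # Race 6: the five heat winners; order the heats by their winner's finish.
    winners_order = race([heat[0] for heat in heats])
    heats.sort(key=lambda heat: winners_order.index(heat[0]))
    a, b, c = heats[0], heats[1], heats[2]
    # a[0] is the fastest overall. Only five horses can still be 2nd or 3rd:
    # a[1], a[2] (lost only to a[0]), b[0], b[1], and c[0].
    final = race([a[1], a[2], b[0], b[1], c[0]])  # Race 7
    return [a[0], final[0], final[1]], races


random.seed(0)
speeds = dict(enumerate(random.sample(range(1000), 25)))
top3, races_used = top3_in_seven_races(speeds)
expected = sorted(speeds, key=speeds.get, reverse=True)[:3]
print(top3 == expected, races_used)
```

The key reasoning step is the elimination after race 6: any horse beaten by three or more known horses cannot be in the top 3, which leaves exactly five candidates for the last race.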

PySpark Core Concepts

1. Basic to Mid-level Questions:

Joining and grouping in PySpark.

Query for students with their highest-scoring subject(s), considering cases with equal max scores.

2. Advanced Problem:

Table 1: `student_id, roll_number`

Table 2: `roll_number, student_name, subject, marks`

Find the highest-scoring subject(s) for each student.
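
The logic for this problem can be sketched as a join followed by `RANK()` partitioned by student, keeping every row with rank 1 so that ties are preserved. Note that this sketch is not PySpark: it expresses the same logic in SQL through Python's built-in `sqlite3` module so it runs without a Spark cluster. The table names `students` and `subject_marks` and the sample rows are assumptions. In PySpark the equivalent would use a window specification partitioned by student and ordered by marks descending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table names and rows are hypothetical; columns follow Table 1 and Table 2.
conn.execute("CREATE TABLE students (student_id INTEGER, roll_number INTEGER)")
conn.execute(
    "CREATE TABLE subject_marks "
    "(roll_number INTEGER, student_name TEXT, subject TEXT, marks INTEGER)"
)
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, 101), (2, 102)])
conn.executemany(
    "INSERT INTO subject_marks VALUES (?, ?, ?, ?)",
    [
        (101, "Asha", "Maths", 95),
        (101, "Asha", "Physics", 95),  # tie for Asha's top score
        (101, "Asha", "English", 80),
        (102, "Ben", "Maths", 70),
        (102, "Ben", "English", 88),
    ],
)

rows = conn.execute(
    """
    SELECT student_id, student_name, subject, marks
    FROM (
        SELECT s.student_id, m.student_name, m.subject, m.marks,
               RANK() OVER (PARTITION BY s.student_id ORDER BY m.marks DESC) AS rnk
        FROM students s
        JOIN subject_marks m ON m.roll_number = s.roll_number
    ) AS ranked
    WHERE rnk = 1          -- RANK keeps all tied top subjects; ROW_NUMBER would drop ties
    ORDER BY student_id, subject
    """
).fetchall()

for row in rows:
    print(row)
```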

Machine Learning

1. Random Forest Regression:

Techniques to improve accuracy.


2. Gini Index:

Explanation and application in decision trees.

3. Data Preparation:

Best practices for preparing data for machine learning models.
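
For the Gini index item above, a minimal sketch of the calculation: Gini impurity of a node is 1 minus the sum of squared class proportions, and a decision tree scores a candidate split by the size-weighted impurity of the child nodes (lower is better). The function names and sample labels are illustrative.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


parent = ["yes", "yes", "no", "no"]
parent_gini = gini_impurity(parent)                      # evenly mixed node
pure_split = weighted_gini(["yes", "yes"], ["no", "no"])  # perfectly separating split
poor_split = weighted_gini(["yes", "no"], ["yes", "no"])  # split that separates nothing

print(parent_gini, pure_split, poor_split)
```

A tree-building algorithm would evaluate many candidate splits this way and choose the one with the lowest weighted impurity.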

General Questions

1. Pressure Handling:

How to handle pressure and avoid delegating tasks.

2. Project Explanation:

Discuss a project where you worked with multiple stakeholders.
