SQL, Python, Azure Interview Questions
Database & Data Modeling
1. What are the components of an ER model?
Answer: Entities, Attributes, Relationships, and Keys (Primary/Foreign).
2. Explain the multidimensional model.
Answer: Used in OLAP with fact tables (metrics) and dimension tables (descriptive attributes).
3. What is Normalization?
Answer: The process of organizing tables to minimize redundancy and avoid update anomalies, applied in stages called normal forms (1NF through 5NF).
4. What is Denormalization? Why use it?
Answer: Adding redundancy to improve query performance (common in data warehouses).
5. Name 2 types of schemas in a data warehouse.
Answer: Star Schema (denormalized dimensions) and Snowflake Schema (normalized dimensions).
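A star schema can be sketched with Python's built-in `sqlite3` module. The table names (`dim_product`, `fact_sales`) and the sample rows are made up for illustration; a real warehouse would have several dimensions around each fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive attributes
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    -- Fact table: numeric measures plus a foreign key to the dimension
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'), (3, 'Desk', 'Furniture');
    INSERT INTO fact_sales VALUES
        (1, 1, 1200.0), (2, 2, 800.0), (3, 3, 300.0), (4, 1, 1000.0);
""")

# Typical analytical query: aggregate the fact table, grouped by a dimension attribute.
sales_by_category = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(sales_by_category)
```

In a snowflake schema, `category` would typically move into its own table referenced from `dim_product`, at the cost of one more join.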
6. What are the 3 types of data marts?
Answer: Dependent (subset of DW), Independent (standalone), Hybrid (mixed sources).
7. Compare OLAP and OLTP.
Answer:
| Aspect | OLTP (Online Transaction Processing) | OLAP (Online Analytical Processing) |
|---|---|---|
| Purpose | Real-time operational processing (e.g., orders) | Historical data analysis (e.g., sales trends) |
| Database Design | Normalized (minimizes redundancy) | Denormalized (optimized for queries) |
| Query Type | Short, frequent (INSERT/UPDATE/DELETE) | Complex (SELECT with aggregations) |
| Data Volume | Smaller, current data | Larger, historical data |
| Performance | Optimized for write operations | Optimized for read operations |
| Example | ATM transactions, e-commerce orders | Business intelligence dashboards |
| Users | Front-line staff (clerks, cashiers) | Analysts, executives |
Visual Workflow:
OLTP Databases → ETL → OLAP Data Warehouse → BI Tools
8. What are the 2 approaches to Enterprise Data Warehouse design?
Answer: Top-down (centralized DW first) and Bottom-up (data marts first).
9. Name 4 set operators in SQL.
Answer: `UNION`, `UNION ALL`, `INTERSECT`, and `EXCEPT` (called `MINUS` in Oracle).
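The difference between these operators is easiest to see on two small tables. This is a minimal sketch using Python's built-in `sqlite3` module; the tables `a` and `b` and the helper `run` are hypothetical, and SQLite spells the difference operator `EXCEPT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (n INTEGER);
    CREATE TABLE b (n INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def run(op):
    """Apply one set operator between tables a and b and return the sorted values."""
    rows = conn.execute(f"SELECT n FROM a {op} SELECT n FROM b").fetchall()
    return sorted(r[0] for r in rows)

union = run("UNION")          # all rows from both, duplicates removed
union_all = run("UNION ALL")  # all rows from both, duplicates kept
intersect = run("INTERSECT")  # rows present in both a and b
except_ = run("EXCEPT")       # rows in a that are not in b

print(union, union_all, intersect, except_)
```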
10. What are the types of subqueries?
Answer: Single-row, Multi-row, Correlated, and Scalar.
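A short sketch of three of these forms, run through Python's built-in `sqlite3` module. The `employees` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'IT', 90), ('Bob', 'IT', 70), ('Cid', 'HR', 60), ('Dee', 'HR', 50);
""")

# Scalar (single-row) subquery: returns exactly one value, the overall average.
above_avg = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
""").fetchall()

# Multi-row subquery: returns several values, consumed with IN.
in_high_pay_depts = conn.execute("""
    SELECT name FROM employees
    WHERE dept IN (SELECT dept FROM employees WHERE salary > 80)
    ORDER BY name
""").fetchall()

# Correlated subquery: the inner query references the outer row (e.dept),
# so it is evaluated per outer row.
top_per_dept = conn.execute("""
    SELECT name FROM employees e
    WHERE salary = (SELECT MAX(salary) FROM employees WHERE dept = e.dept)
    ORDER BY name
""").fetchall()

print(above_avg, in_high_pay_depts, top_per_dept)
```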
11. What is an index? Explain clustered vs. non-clustered.
Answer:
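- Index: A structure that speeds up lookups on one or more columns, at the cost of extra storage and slower writes.
- Clustered: Determines the physical order of the table's rows, so only one is allowed per table.
- Non-Clustered: A separate structure holding key values with pointers to the rows; multiple are allowed per table.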
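The two index kinds can be modeled conceptually in plain Python. This is only an analogy, not how any database engine stores data: a sorted list of rows stands in for a clustered index, and a separate sorted list of (key, position) pairs stands in for a non-clustered one. All names and rows are hypothetical.

```python
import bisect

# "Clustered": the rows themselves are stored in key order (here, by id).
rows = [(1, "Ann", 50), (2, "Bob", 90), (3, "Cid", 70), (4, "Dee", 60)]
ids = [r[0] for r in rows]

# "Non-clustered": a separate sorted structure of (salary, position) pairs
# that points back into `rows`; the rows are not reordered.
salary_index = sorted((salary, pos) for pos, (_, _, salary) in enumerate(rows))

def find_by_id(target_id):
    """Binary-search the row storage directly, since it is ordered by id."""
    i = bisect.bisect_left(ids, target_id)
    return rows[i] if i < len(rows) and ids[i] == target_id else None

def find_by_salary(target_salary):
    """Binary-search the index, then follow the pointer to the row."""
    i = bisect.bisect_left(salary_index, (target_salary, -1))
    if i < len(salary_index) and salary_index[i][0] == target_salary:
        return rows[salary_index[i][1]]
    return None

print(find_by_id(2), find_by_salary(70))
```

Because the rows can only be physically sorted one way, there is room for one clustered ordering but any number of side structures like `salary_index`.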
12. What is a staging area in a data warehouse?
Answer: Intermediate storage where extracted raw data lands before it is transformed and loaded into the warehouse.
13. Explain SCD (Slowly Changing Dimension) types.
Answer:
- Type 1: Overwrite (no history).
- Type 2: Add new row (keeps history).
- Type 3: Add a column holding the previous value (limited history).
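The three types can be illustrated on an in-memory customer dimension. This is a sketch under assumptions: the column names (`city`, `is_current`, `start_date`, `end_date`, `previous_city`) are one common convention chosen for the example, not a fixed standard.

```python
from datetime import date

def apply_type1(dim_rows, key, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in dim_rows:
        if row["customer_id"] == key:
            row["city"] = new_city

def apply_type2(dim_rows, key, new_city, change_date):
    """Type 2: close the current row and append a new current row."""
    for row in dim_rows:
        if row["customer_id"] == key and row["is_current"]:
            row["is_current"] = False
            row["end_date"] = change_date
    dim_rows.append({
        "customer_id": key, "city": new_city,
        "start_date": change_date, "end_date": None, "is_current": True,
    })

def apply_type3(row, new_city):
    """Type 3: keep only the previous value, in an extra column."""
    row["previous_city"] = row["city"]
    row["city"] = new_city

t1 = [{"customer_id": 1, "city": "Pune"}]
apply_type1(t1, 1, "Delhi")

t2 = [{"customer_id": 1, "city": "Pune", "start_date": date(2020, 1, 1),
       "end_date": None, "is_current": True}]
apply_type2(t2, 1, "Delhi", date(2024, 6, 1))

t3 = {"customer_id": 1, "city": "Pune"}
apply_type3(t3, "Delhi")

print(t1, len(t2), t3)
```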
14. What is incremental loading in ETL?
Answer: Loading only new/changed data (e.g., `WHERE LastModified > last_run`).
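The `LastModified > last_run` filter is often called a watermark. A minimal sketch in plain Python follows; the field name `last_modified` and the sample rows are assumptions for illustration.

```python
from datetime import datetime

def incremental_extract(source_rows, last_run):
    """Return only the rows modified after the previous run's watermark."""
    return [r for r in source_rows if r["last_modified"] > last_run]

source = [
    {"id": 1, "last_modified": datetime(2024, 1, 1)},
    {"id": 2, "last_modified": datetime(2024, 1, 5)},
    {"id": 3, "last_modified": datetime(2024, 1, 9)},
]
last_run = datetime(2024, 1, 3)

changed = incremental_extract(source, last_run)

# Advance the watermark to the newest timestamp actually loaded,
# so the next run starts where this one ended.
new_watermark = max((r["last_modified"] for r in changed), default=last_run)

print([r["id"] for r in changed], new_watermark)
```

Taking the watermark from the loaded data rather than from the wall clock avoids skipping rows that were modified while the job was running.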
15. Write a query to sort records (SQL and PySpark).
Answer:
-- SQL
SELECT * FROM Employees ORDER BY Salary DESC;
# PySpark
df.orderBy("Salary", ascending=False).show()
Python
16. What is Python?
Answer: High-level, interpreted programming language for automation, data, and web apps.
17. Why is Python used?
Answer: Easy syntax, rich libraries (Pandas, NumPy), and cross-platform support.
18. List Python data types.
Answer: `int`, `float`, `str`, `list`, `tuple`, `dict`, `set`, `bool`.
19. What are numeric data types in Python?
Answer: `int`, `float`, `complex`.
20. What are sequential data types?
Answer: `list`, `tuple`, `str` (ordered sequences).
21. Explain loops in Python (`for`, `while`, `do-while`).
Answer:
# For loop
for i in range(5):
    print(i)
# While loop
x = 0
while x < 5:
    x += 1
# Python has no native `do-while`; emulate it with a loop whose body runs at least once:
while True:
    print(x)
    x += 1
    if x >= 5:
        break
22. What are set operators in Python?
Answer: `|` (union), `&` (intersection), `-` (difference), `^` (symmetric difference).
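A quick illustration of the four operators on two small sets (the values are arbitrary):

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b                 # elements in either set
intersection = a & b          # elements in both sets
difference = a - b            # elements in a but not in b
symmetric_difference = a ^ b  # elements in exactly one of the sets

print(union, intersection, difference, symmetric_difference)
```

Each operator also has a method form (`a.union(b)`, `a.intersection(b)`, `a.difference(b)`, `a.symmetric_difference(b)`) that accepts any iterable, not only sets.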
23. List types of operators in Python.
Answer: Arithmetic (`+`, `*`), Comparison (`==`, `>`), Logical (`and`, `or`, `not`), Assignment (`=`, `+=`), Bitwise (`&`, `|`), Membership (`in`, `not in`), Identity (`is`, `is not`).
24. What is a class? Give a real-world example.
Answer:
class Car:
    def __init__(self, brand):
        self.brand = brand
my_car = Car("Toyota")
25. Explain inheritance with an example.
Answer:
class Animal:
    def speak(self): pass
class Dog(Animal):  # Inherits Animal
    def speak(self): return "Bark"
26. What are the 5 types of inheritance?
Answer: Single, Multiple, Multilevel, Hierarchical, Hybrid.
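The five shapes can be sketched with a handful of toy classes. The class names are hypothetical and exist only to show the structure.

```python
class Animal:
    def speak(self):
        return "..."

class Dog(Animal):                    # Single: one parent
    def speak(self):
        return "Bark"

class Puppy(Dog):                     # Multilevel: Animal -> Dog -> Puppy
    pass

class Cat(Animal):                    # Hierarchical: Dog and Cat share the parent Animal
    def speak(self):
        return "Meow"

class Swimmer:
    def swim(self):
        return "Swimming"

class Otter(Animal, Swimmer):         # Multiple: two parents
    pass

class SwimmingPuppy(Puppy, Swimmer):  # Hybrid: combines multilevel and multiple
    pass

# Python resolves attribute lookups using the method resolution order (MRO).
print([c.__name__ for c in Otter.__mro__])
print(SwimmingPuppy().speak(), SwimmingPuppy().swim())
```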
27. What are function parameters?
Answer: Inputs to functions (e.g., `def greet(name):`).
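Python supports several kinds of parameters. The sketch below extends the `greet` example with a default value, `*args`, and `**kwargs`; the `suffix` keyword is invented for the illustration.

```python
def greet(name, greeting="Hello", *args, **kwargs):
    """name is required; greeting has a default; *args and **kwargs collect extras."""
    extras = " ".join(str(a) for a in args)
    suffix = kwargs.get("suffix", "!")
    message = f"{greeting}, {name}{suffix}"
    return f"{message} {extras}".strip()

basic = greet("Ana")                                    # positional argument only
custom = greet("Ana", greeting="Hi")                    # keyword argument overrides the default
full = greet("Ana", "Hi", "Welcome", "back", suffix=".")  # extra positional and keyword arguments

print(basic, custom, full, sep="\n")
```

Strictly speaking, parameters are the names in the function definition, and arguments are the values passed at the call site.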
28. List logical operators in Python.
Answer: `and`, `or`, `not`.
SQL & Database
29. What is SQL?
Answer: Structured Query Language for managing relational databases.
30. What is DBMS? Name types.
Answer:
- DBMS: Database Management System, software for storing, retrieving, and managing data.
- Types: Relational (RDBMS), NoSQL, Hierarchical, Network.
31. Explain DDL, DML, DCL.
Answer:
- DDL: `CREATE`, `ALTER`, `DROP`.
- DML: `INSERT`, `UPDATE`, `DELETE` (`SELECT` is often grouped here, or classed separately as DQL).
- DCL: `GRANT`, `REVOKE`.
32. Compare `TRUNCATE` vs `DELETE`.
Answer:
| Aspect | TRUNCATE | DELETE |
|---|---|---|
| Type | DDL (Data Definition Language) | DML (Data Manipulation Language) |
| Speed | Faster (no logging of individual rows) | Slower (logs each row deletion) |
| Storage | Resets storage allocation (deallocates pages) | Keeps storage allocated |
| WHERE Clause | Not allowed | Allowed (can delete specific rows) |
| Triggers | Does not fire DELETE triggers | Fires triggers |
| Transaction | Implicit commit in Oracle and MySQL (cannot be rolled back); can be rolled back inside a transaction in SQL Server and PostgreSQL | Can be rolled back |
| Identity Columns | Resets counter (e.g., IDENTITY(1,1)) | Does not reset counter |
| Use Case | Remove all data quickly | Selective deletion |
Example:
-- TRUNCATE (remove all data)
TRUNCATE TABLE Employees;
-- DELETE (remove specific data)
DELETE FROM Employees WHERE Salary < 50000;
33. What is a query?
Answer: A request for data (e.g., `SELECT * FROM table`).
34. Give an example of an alias in SQL.
Answer:
SELECT e.Name AS EmployeeName FROM Employees e;
35. What are joins? Explain with syntax.
Answer:
SELECT a.*, b.* FROM TableA a INNER JOIN TableB b ON a.key = b.key;
36. List types of joins.
Answer: INNER, LEFT, RIGHT, FULL, CROSS, SELF.
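The behavior of the most common join types can be checked on two tiny tables with Python's built-in `sqlite3` module. Table names and rows are made up. `RIGHT` and `FULL` joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers with at least one matching order.
inner = conn.execute("""
    SELECT c.name, o.id FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.id
""").fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) where nothing matches.
left = conn.execute("""
    SELECT c.name, o.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.id
""").fetchall()

# CROSS JOIN: every customer paired with every order.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

print(inner, left, cross_count)
```

A self join uses the same syntax with one table listed twice under different aliases.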
37. What is a clause in SQL?
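Answer: A component of a SQL statement that performs one function, such as `WHERE` (filter rows), `GROUP BY` (group rows), `HAVING` (filter groups), or `ORDER BY` (sort).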
38. How to find the first/last record in a table?
Answer:
-- First (LIMIT is MySQL/PostgreSQL/SQLite syntax; SQL Server uses SELECT TOP 1 ... instead)
SELECT * FROM Employees ORDER BY HireDate ASC LIMIT 1;
-- Last
SELECT * FROM Employees ORDER BY HireDate DESC LIMIT 1;
39. What are aggregate functions in SQL?
Answer: `COUNT()`, `SUM()`, `AVG()`, `MIN()`, `MAX()`.
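All five functions can be tried in one query through Python's built-in `sqlite3` module. The `employees` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Ann', 90), ('Bob', 70), ('Cid', 50);
""")

# Each aggregate collapses the whole table (or each group) into a single value.
count, total, average, lowest, highest = conn.execute(
    "SELECT COUNT(*), SUM(salary), AVG(salary), MIN(salary), MAX(salary) FROM employees"
).fetchone()

print(count, total, average, lowest, highest)
```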
40. Write a query to join two tables.
Answer:
SELECT a.*, b.* FROM Orders a JOIN Customers b ON a.CustomerID = b.ID;
Cloud & Azure
41. What is Azure?
Answer: Microsoft’s cloud platform offering IaaS, PaaS, SaaS.
42. Compare IaaS, PaaS, SaaS.
Answer:
| Aspect | IaaS (Infrastructure) | PaaS (Platform) | SaaS (Software) |
|---|---|---|---|
| Control | Highest (manage OS, apps, data) | Medium (manage apps/data only) | None (use ready-made software) |
| Maintenance | User manages OS, patches, security | Provider manages OS/runtime | Provider manages everything |
| Scalability | Manual scaling | Auto-scaling built-in | Auto-scaling built-in |
| Cost | Pay for VMs/storage | Pay for platform resources | Pay per user/license |
| Use Case | Migrating legacy apps to cloud | Developing new apps | Ready-to-use apps (e.g., email) |
| Examples | AWS EC2, Azure VMs | Azure App Service, Heroku | Gmail, Office 365, Salesforce |
Analogy:
• IaaS: Renting a plot of land (build anything, but maintain it).
• PaaS: Renting a furnished apartment (just move in your stuff).
• SaaS: Staying in a hotel (everything is managed for you).
43. List Azure storage types.
Answer: Blob, Table, Queue, File, Disk.
44. Explain Azure storage tiers.
Answer: Hot (frequent access), Cool (infrequent), Cold (rare), Archive (long-term).
45. What is Azure Scheduler?
Answer: A service for running jobs automatically at defined times or intervals. It has since been retired, with Azure Logic Apps as its replacement.
46. Why is Azure Diagnostic API needed?
Answer: To monitor and collect logs from Azure resources (e.g., VM metrics).
47. Define SLA.
Answer: Service Level Agreement guaranteeing uptime (e.g., 99.9%).
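An uptime percentage translates directly into a downtime budget. The helper below is a hypothetical illustration of that arithmetic, assuming a 30-day month by default; actual SLA terms define their own measurement period and exclusions.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Maximum downtime in minutes that still meets the SLA over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100

monthly = allowed_downtime_minutes(99.9)      # about 43.2 minutes per 30-day month
yearly = allowed_downtime_minutes(99.9, 365)  # about 525.6 minutes (roughly 8.76 hours)

print(round(monthly, 1), round(yearly, 1))
```

Each additional "nine" cuts the budget by a factor of ten: 99.99% allows only about 4.3 minutes per 30-day month.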
48. What is Azure Blob Storage?
Answer: Object storage for unstructured data (images, videos).
49. What is a role instance in Azure?
Answer: A VM instance running a web/worker role in Cloud Services.
50. What will you do during a drive failure in Azure?
Answer: Use Azure Managed Disks with automatic replication for fault tolerance.
51. What is cloud computing?
Answer: On-demand delivery of IT resources (servers, storage, apps) over the internet.
52. Name cloud deployment models.
Answer: Public (Azure/AWS), Private (on-prem), Hybrid.
Project Explanation
53. How to explain your project?
Answer: Use the STAR method:
- Situation: Problem context.
- Task: Your role.
- Action: Steps taken.
- Result: Outcomes/metrics.