Spark SQL
# Department DataFrame used for join examples
dept_df = spark.createDataFrame([
    (101, "HR", "Mumbai"),
    (102, "Finance", "Delhi"),
    (103, "Engineering", "Bangalore"),
    (104, "Sales", "Pune"),
    (105, "Marketing", "Hyderabad")
], ["dept_id", "dept_name", "location"])
---------------------
--------------------ASSIGNMENT------------------------------
Sales_table
date: Weekly timestamp.
product_id: Unique identifier for each product.
store_id: Store where the transaction occurred.
qty_sold: Number of units sold per week.
Product_table
product_id: Unique identifier for each product.
product_name: Name of the product.
current_stock: Number of units available in inventory.
QUESTIONS:
1. Find the top 5 best-selling products across all stores and weeks.
2. Calculate total units sold per week.
3. Display weekly sales along with product names.
4. For each store, list product-wise total quantity sold.
5. Identify products where total weekly sales are higher than current stock.
6. Display the best-selling product per week.