Kotakmahindra
com/sql/online-compiler/
-- insert
INSERT INTO EMPLOYEE VALUES ('RR', 'KKR', 2);
SQL:
Given a table of match results (team1, team2, result):

team1  team2  result
RR     KKR    2
MI     CSK    2
RCB    KXP    1
DD     RR     0
KKR    RR     1
CSK    RCB    2
KXP    DD     2

result: 1 => team1 won, 2 => team2 won, 0 => draw

Expected output (team, played, won, lost, draw):

RR   3  0  2  1
CSK  2  1  1  0
RCB  2  2  0  0
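One way to compute the points table, sketched here with SQLite so it runs standalone (the table name "matches" and its columns team1/team2/result are assumptions from the sample rows): unpivot each match into one row per participating team, then aggregate wins, losses, and draws against the result code.

```python
import sqlite3

# Hypothetical schema: matches(team1, team2, result) where
# 1 = team1 won, 2 = team2 won, 0 = draw.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (team1 TEXT, team2 TEXT, result INTEGER)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [("RR", "KKR", 2), ("MI", "CSK", 2), ("RCB", "KXP", 1),
     ("DD", "RR", 0), ("KKR", "RR", 1), ("CSK", "RCB", 2), ("KXP", "DD", 2)],
)

# Unpivot: one row per (team, match), tagging which side the team played on,
# then compare the result code against that side.
query = """
SELECT team,
       COUNT(*)                                                 AS played,
       SUM(CASE WHEN result = side           THEN 1 ELSE 0 END) AS won,
       SUM(CASE WHEN result NOT IN (0, side) THEN 1 ELSE 0 END) AS lost,
       SUM(CASE WHEN result = 0              THEN 1 ELSE 0 END) AS draw
FROM (
    SELECT team1 AS team, result, 1 AS side FROM matches
    UNION ALL
    SELECT team2 AS team, result, 2 AS side FROM matches
)
GROUP BY team
ORDER BY team
"""
for row in conn.execute(query):
    print(*row)
```

The same UNION ALL + conditional aggregation works unchanged in most SQL dialects.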
Problem 1
Design a data model for a sales system. The system needs to store information about
customers, products, orders, and payments. Include tables and columns to support
reporting on total sales, customer purchasing behavior, and product performance.
Tables and columns:

contracts:  id, cust_id, duration, price, tax, usage
customers:  id, first_name, last_name, dob, email, phone, image
orders:     id, customer_id, product_id, store_id, date_id, quantity, amount
stores:     id, city, country
product:    id, name, description, price
dates:      id, month, year, quarter
payment_yearly:         id, date, note
payment_yearly_details: id, pay_id, value, method, date, note
cheque_in:  id, pay_id, account_number, cheque_number, cheque_value,
            cheque_due_date, cheque_bank, cheque_branch, holder_name, date

Relationships:
orders.customer_id -> customers.id
orders.product_id  -> product.id
orders.store_id    -> stores.id
orders.date_id     -> dates.id
contracts.cust_id  -> customers.id
payment_yearly_details.pay_id -> payment_yearly.id
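A minimal DDL sketch of the core star schema (orders fact table plus its dimensions), using SQLite so it runs standalone; the column types are assumptions, since the notes give only names.

```python
import sqlite3

# Column names follow the model above; types are illustrative assumptions.
ddl = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
                        dob TEXT, email TEXT, phone TEXT, image TEXT);
CREATE TABLE product   (id INTEGER PRIMARY KEY, name TEXT, description TEXT, price REAL);
CREATE TABLE stores    (id INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE dates     (id INTEGER PRIMARY KEY, month INTEGER, year INTEGER, quarter INTEGER);
CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(id),
                        product_id  INTEGER REFERENCES product(id),
                        store_id    INTEGER REFERENCES stores(id),
                        date_id     INTEGER REFERENCES dates(id),
                        quantity INTEGER, amount REAL);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
print([r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")])
```

Total sales, purchasing behavior, and product performance then become GROUP BY queries over orders joined to the relevant dimension.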
# Average gap (in seconds) between each session's start and the
# previous session's end, per user.
df.createOrReplaceTempView('user')
spark.sql("""
    SELECT userid, AVG(gap_seconds) AS avg_gap_seconds
    FROM (
        SELECT userid,
               unix_timestamp(start_date)
                 - unix_timestamp(LAG(end_dt) OVER (PARTITION BY userid ORDER BY start_date)) AS gap_seconds
        FROM user
    )
    GROUP BY userid
""").show()
Problem 2
Create a data model for a social media analytics platform. The system should store
information about users, posts, comments, and likes. Include tables and columns for
tracking user engagement, popular posts, and comment trends.
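A possible starting point, sketched in SQLite (the table and column names are my assumptions, not a given answer): users, posts, comments, and likes, with a "popular posts" query ranking posts by like count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, handle TEXT, joined_at TEXT);
CREATE TABLE posts    (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id),
                       body TEXT, created_at TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER REFERENCES posts(id),
                       user_id INTEGER REFERENCES users(id), body TEXT, created_at TEXT);
CREATE TABLE likes    (post_id INTEGER REFERENCES posts(id),
                       user_id INTEGER REFERENCES users(id), liked_at TEXT,
                       PRIMARY KEY (post_id, user_id));
""")

# Illustrative data: post 1 gets two likes, post 2 gets one.
conn.executemany("INSERT INTO posts (id, user_id) VALUES (?, ?)", [(1, 1), (2, 1)])
conn.executemany("INSERT INTO likes (post_id, user_id) VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# Popular posts = posts ranked by like count (LEFT JOIN keeps zero-like posts).
popular = """
SELECT p.id, COUNT(l.user_id) AS likes
FROM posts p LEFT JOIN likes l ON l.post_id = p.id
GROUP BY p.id
ORDER BY likes DESC
"""
print(conn.execute(popular).fetchall())
```

Comment trends follow the same shape: GROUP BY on comments over a time bucket derived from created_at.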
Problem 3
Design a data model for a healthcare system. The system needs to store information
about patients, doctors, appointments, and medical procedures. Include tables and
columns to support reporting on patient health history, doctor schedules, and
procedure outcomes.
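One hedged sketch of such a model in SQLite (names and types are my assumptions): appointments link patients to doctors, and procedures hang off appointments so outcomes roll up to both the patient history and the doctor schedule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients     (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, dob TEXT);
CREATE TABLE doctors      (id INTEGER PRIMARY KEY, name TEXT, specialty TEXT);
CREATE TABLE appointments (id INTEGER PRIMARY KEY,
                           patient_id INTEGER REFERENCES patients(id),
                           doctor_id  INTEGER REFERENCES doctors(id),
                           scheduled_at TEXT, status TEXT);
CREATE TABLE procedures   (id INTEGER PRIMARY KEY,
                           appointment_id INTEGER REFERENCES appointments(id),
                           name TEXT, outcome TEXT, performed_at TEXT);
""")
print([r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")])
```

A patient's health history is then procedures joined through appointments filtered by patient_id; a doctor's schedule is appointments filtered by doctor_id and scheduled_at.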
A file contains JSON data in the format below. De-duplicate the contents of the
file based on event_id. If more than one record with the same event_id is present,
keep only the record with the most recent event_date.
Sample data in file
{"event_id": 1, "event_date": "2023-01-01 10:12:15", "customer_id": "customer1", "event": "Login"}
{"event_id": 1, "event_date": "2023-01-01 10:13:12", "customer_id": "customer1", "event": "Login"}
{"event_id": 1, "event_date": "2023-01-01 10:15:15", "customer_id": "customer1", "event": "Login"}
{"event_id": 2, "event_date": "2023-01-01 10:12:15", "customer_id": "customer2", "event": "Signup"}
{"event_id": 3, "event_date": "2023-01-01 10:12:15", "customer_id": "customer3", "event": "Logout"}
{"event_id": 4, "event_date": "2023-01-01 10:12:15", "customer_id": "customer4", "event": "Browse"}
Expected output
{"event_id": 1, "event_date": "2023-01-01 10:15:15", "customer_id": "customer1", "event": "Login"}
{"event_id": 2, "event_date": "2023-01-01 10:12:15", "customer_id": "customer2", "event": "Signup"}
{"event_id": 3, "event_date": "2023-01-01 10:12:15", "customer_id": "customer3", "event": "Logout"}
{"event_id": 4, "event_date": "2023-01-01 10:12:15", "customer_id": "customer4", "event": "Browse"}
from pyspark.sql import functions as F
from pyspark.sql.window import Window

df = spark.read.format("json").load("/file.json")

# Keep only the most recent record per event_id.
w = Window.partitionBy("event_id").orderBy(F.col("event_date").desc())
df1 = (df.withColumn("rn", F.row_number().over(w))
         .filter(F.col("rn") == 1)
         .drop("rn"))
df1.show()
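The same de-duplication logic can be checked in plain Python: keep, per event_id, the record with the greatest event_date (plain string comparison is safe here because the 'YYYY-MM-DD HH:MM:SS' format sorts chronologically).

```python
import json

# The sample records from the problem, as JSON lines.
lines = [
    '{"event_id": 1, "event_date": "2023-01-01 10:12:15", "customer_id": "customer1", "event": "Login"}',
    '{"event_id": 1, "event_date": "2023-01-01 10:13:12", "customer_id": "customer1", "event": "Login"}',
    '{"event_id": 1, "event_date": "2023-01-01 10:15:15", "customer_id": "customer1", "event": "Login"}',
    '{"event_id": 2, "event_date": "2023-01-01 10:12:15", "customer_id": "customer2", "event": "Signup"}',
    '{"event_id": 3, "event_date": "2023-01-01 10:12:15", "customer_id": "customer3", "event": "Logout"}',
    '{"event_id": 4, "event_date": "2023-01-01 10:12:15", "customer_id": "customer4", "event": "Browse"}',
]

latest = {}
for line in lines:
    rec = json.loads(line)
    prev = latest.get(rec["event_id"])
    # Overwrite only when this record is newer than the one already kept.
    if prev is None or rec["event_date"] > prev["event_date"]:
        latest[rec["event_id"]] = rec

for rec in sorted(latest.values(), key=lambda r: r["event_id"]):
    print(rec)
```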