Kotakmahindra

The document outlines various SQL data models and queries for different systems including a sales system, social media analytics platform, and healthcare system. It also includes examples of SQL operations such as inserting data, calculating average delays, deduplicating JSON data, and reporting capital gains/losses for stocks. Additionally, it discusses how to aggregate visit counts for subdomains based on provided domain visit data.

Uploaded by

saurav kr. jha

https://www.programiz.com/sql/online-compiler/

CREATE TABLE EMPLOYEE (
  Team1 TEXT NOT NULL,
  Team2 TEXT NOT NULL,
  MatchResult INTEGER
);

-- insert (string literals use single quotes, as standard SQL expects)
INSERT INTO EMPLOYEE VALUES ('RR', 'KKR', 2);
INSERT INTO EMPLOYEE VALUES ('MI', 'CSK', 2);
INSERT INTO EMPLOYEE VALUES ('RCB', 'KXP', 1);
INSERT INTO EMPLOYEE VALUES ('DD', 'RR', 0);
INSERT INTO EMPLOYEE VALUES ('KKR', 'RR', 1);
INSERT INTO EMPLOYEE VALUES ('CSK', 'RCB', 2);
INSERT INTO EMPLOYEE VALUES ('KXP', 'DD', 2);

SQL:

Team1  Team2  MatchResult
RR     KKR    2
MI     CSK    2
RCB    KXP    1
DD     RR     0
KKR    RR     1
CSK    RCB    2
KXP    DD     2

Match Result descriptions:

1 => Match won by Team 1

2 => Match won by Team 2

0 => Draw

Output should have the following columns:

Team  Played  Won  Lost  Draw
RR    3       0    2     1
CSK   2       1    1     0
RCB   2       2    0     0
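
One possible way to produce this table, sketched with Python's built-in sqlite3 module so that it runs without a database server. It assumes the EMPLOYEE table and rows shown above; the intermediate Outcome column and the STANDINGS_SQL name are illustrative and not part of the original question. Each match is unpivoted into one row per team with UNION ALL, then grouped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Team1 TEXT NOT NULL, Team2 TEXT NOT NULL, MatchResult INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("RR", "KKR", 2), ("MI", "CSK", 2), ("RCB", "KXP", 1), ("DD", "RR", 0),
     ("KKR", "RR", 1), ("CSK", "RCB", 2), ("KXP", "DD", 2)],
)

# One row per (team, match): W = won, L = lost, D = draw.
# MatchResult 1 means Team1 won, 2 means Team2 won, 0 means draw.
STANDINGS_SQL = """
SELECT Team,
       COUNT(*)           AS Played,
       SUM(Outcome = 'W') AS Won,
       SUM(Outcome = 'L') AS Lost,
       SUM(Outcome = 'D') AS Draw
FROM (
    SELECT Team1 AS Team,
           CASE MatchResult WHEN 1 THEN 'W' WHEN 2 THEN 'L' ELSE 'D' END AS Outcome
    FROM EMPLOYEE
    UNION ALL
    SELECT Team2,
           CASE MatchResult WHEN 2 THEN 'W' WHEN 1 THEN 'L' ELSE 'D' END
    FROM EMPLOYEE
)
GROUP BY Team
ORDER BY Team
"""

# Map each team to its (Played, Won, Lost, Draw) tuple.
standings = {row[0]: tuple(row[1:]) for row in conn.execute(STANDINGS_SQL)}
for team, stats in standings.items():
    print(team, *stats)
```

SUM over a comparison relies on SQLite returning 0 or 1 for boolean expressions; on other engines, SUM(CASE WHEN ... THEN 1 ELSE 0 END) is the portable form.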

Problem 1

Design a data model for a sales system. The system needs to store information about
customers, products, orders, and payments. Include tables and columns to support
reporting on total sales, customer purchasing behavior, and product performance

contracts: id, cust_id, duration, price, tax, usage
customers: id, first name, last name, dob, email, phone
orders: id, customer_id, product_id, store_id, date_id, amount
stores: id, city, country
product: id, name, description, price
dates: id, month, year, quarter
payment_yearly: id, date, note
payment_yearly details: id, pay_id, value, method, date
cheque_in: id, account number, cheque_number, cheque_value, cheque_due date, cheque bank, cheque branch, holder name, date, note, payid

image, quantity: these two columns follow contracts.usage and customers.phone in the diagram; the table they belong to is not recoverable.

Arrows in the diagram connect orders with customers, stores and product.

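6. Find the average delay time for every user.

The sample suggests that delay is the gap, in minutes, between a session's start_dt and the same user's previous end_dt (01:00:00 to 01:41:10 gives 41).

session_id  user_id  start_dt             end_dt               delay
12345       3        2020-11-17 00:01:10  2020-11-17 01:00:00
125         3        2020-11-17 01:41:10  2020-11-17 02:00:00  41
1890        4        2020-11-17 03:41:10  2020-11-17 03:50:00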

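df.createOrReplaceTempView('sessions')
# delay = whole minutes between a session's start_dt and the same user's previous end_dt
spark.sql("""
    select user_id, avg(delay) as avg_delay
    from (select user_id, floor((unix_timestamp(start_dt) - unix_timestamp(lag(end_dt)
                 over (partition by user_id order by start_dt))) / 60) as delay
          from sessions) t
    group by user_id""").show()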

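The same calculation can be checked without a Spark session. The sketch below is plain Python and assumes that delay means the whole minutes between a session's start and the same user's previous session end, which is what the sample value 41 suggests. The function name average_delay_minutes is illustrative.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def average_delay_minutes(sessions):
    """sessions: iterable of (session_id, user_id, start_dt, end_dt) with string timestamps."""
    by_user = defaultdict(list)
    for _session_id, user_id, start_dt, end_dt in sessions:
        by_user[user_id].append(
            (datetime.strptime(start_dt, FMT), datetime.strptime(end_dt, FMT))
        )

    result = {}
    for user_id, spans in by_user.items():
        spans.sort()  # chronological order by start time
        # Gap between each session's start and the previous session's end, in whole minutes.
        delays = [
            (cur_start - prev_end).total_seconds() // 60
            for (_, prev_end), (cur_start, _) in zip(spans, spans[1:])
        ]
        # A user with a single session has no delay to average.
        result[user_id] = sum(delays) / len(delays) if delays else None
    return result

sample = [
    (12345, 3, "2020-11-17 00:01:10", "2020-11-17 01:00:00"),
    (125, 3, "2020-11-17 01:41:10", "2020-11-17 02:00:00"),
    (1890, 4, "2020-11-17 03:41:10", "2020-11-17 03:50:00"),
]
averages = average_delay_minutes(sample)
print(averages)
```

User 4 has only one session, so this sketch reports None for that user rather than 0; the question does not say which is expected.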
Problem 2

Create a data model for a social media analytics platform. The system should store
information about users, posts, comments, and likes. Include tables and columns for
tracking user engagement, popular posts, and comment trends
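
The notes give no answer for this problem. A minimal sketch of one possible model follows, using Python's sqlite3 module so it can be run as-is. Every table name, column name and sample row here is an assumption made for illustration, not part of the original question.

```python
import sqlite3

# Hypothetical schema: one table per entity named in the problem statement.
SOCIAL_SCHEMA = """
CREATE TABLE users    (id INTEGER PRIMARY KEY, handle TEXT NOT NULL UNIQUE, joined_at TEXT);
CREATE TABLE posts    (id INTEGER PRIMARY KEY,
                       user_id INTEGER NOT NULL REFERENCES users(id),
                       body TEXT, created_at TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY,
                       post_id INTEGER NOT NULL REFERENCES posts(id),
                       user_id INTEGER NOT NULL REFERENCES users(id),
                       body TEXT, created_at TEXT);
CREATE TABLE likes    (post_id INTEGER NOT NULL REFERENCES posts(id),
                       user_id INTEGER NOT NULL REFERENCES users(id),
                       created_at TEXT,
                       PRIMARY KEY (post_id, user_id));  -- one like per user per post
"""

social_db = sqlite3.connect(":memory:")
social_db.executescript(SOCIAL_SCHEMA)

# Made-up rows, only to exercise the report query.
social_db.executemany("INSERT INTO users (id, handle) VALUES (?, ?)", [(1, "asha"), (2, "ravi")])
social_db.executemany("INSERT INTO posts (id, user_id, body) VALUES (?, ?, ?)",
                      [(10, 1, "hello"), (11, 2, "world")])
social_db.executemany("INSERT INTO likes (post_id, user_id) VALUES (?, ?)",
                      [(10, 1), (10, 2), (11, 1)])
social_db.execute("INSERT INTO comments (id, post_id, user_id, body) VALUES (100, 10, 2, 'nice')")

# Popular posts: rank by likes, then by comments.
POPULAR_POSTS_SQL = """
SELECT p.id,
       (SELECT COUNT(*) FROM likes l    WHERE l.post_id = p.id) AS like_count,
       (SELECT COUNT(*) FROM comments c WHERE c.post_id = p.id) AS comment_count
FROM posts p
ORDER BY like_count DESC, comment_count DESC
"""
popular_posts = social_db.execute(POPULAR_POSTS_SQL).fetchall()
print(popular_posts)
```

Keeping created_at on posts, comments and likes is what makes engagement and comment trends reportable over time; the composite primary key on likes prevents double counting.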

Problem 3

Design a data model for a healthcare system. The system needs to store information
about patients, doctors, appointments, and medical procedures. Include tables and
columns to support reporting on patient health history, doctor schedules, and
procedure outcomes.
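
No answer is recorded for this problem either. The sketch below shows one possible shape for the model, again with Python's sqlite3 module. All table names, column names and sample rows are hypothetical and chosen only to show how a patient history report could be queried.

```python
import sqlite3

# Hypothetical schema: procedures hang off appointments, which link patients to doctors.
HEALTH_SCHEMA = """
CREATE TABLE patients     (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL, dob TEXT);
CREATE TABLE doctors      (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL, specialty TEXT);
CREATE TABLE appointments (id INTEGER PRIMARY KEY,
                           patient_id INTEGER NOT NULL REFERENCES patients(id),
                           doctor_id  INTEGER NOT NULL REFERENCES doctors(id),
                           starts_at TEXT NOT NULL, status TEXT);
CREATE TABLE procedures   (id INTEGER PRIMARY KEY,
                           appointment_id INTEGER NOT NULL REFERENCES appointments(id),
                           name TEXT NOT NULL, outcome TEXT);
"""

health_db = sqlite3.connect(":memory:")
health_db.executescript(HEALTH_SCHEMA)

# Made-up rows, only to exercise the report query.
health_db.execute("INSERT INTO patients (id, full_name) VALUES (1, 'Patient A')")
health_db.execute("INSERT INTO doctors (id, full_name, specialty) VALUES (1, 'Doctor X', 'Cardiology')")
health_db.executemany(
    "INSERT INTO appointments (id, patient_id, doctor_id, starts_at, status) VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, "2024-01-05 09:00:00", "completed"),
     (2, 1, 1, "2024-02-10 09:30:00", "completed")],
)
health_db.executemany(
    "INSERT INTO procedures (id, appointment_id, name, outcome) VALUES (?, ?, ?, ?)",
    [(1, 1, "ECG", "normal"), (2, 2, "Stress test", "follow-up needed")],
)

# Patient health history: every appointment with its doctor and any procedure outcome.
PATIENT_HISTORY_SQL = """
SELECT a.starts_at, d.full_name, pr.name, pr.outcome
FROM appointments a
JOIN doctors d ON d.id = a.doctor_id
LEFT JOIN procedures pr ON pr.appointment_id = a.id
WHERE a.patient_id = ?
ORDER BY a.starts_at
"""
patient_history = health_db.execute(PATIENT_HISTORY_SQL, (1,)).fetchall()
for row in patient_history:
    print(row)
```

A doctor schedule report would filter the same appointments table by doctor_id and starts_at; the LEFT JOIN keeps appointments that had no procedure.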

A file contains json data in the below format. De-duplicate the contents of the
file based on event_id. If more than one record with same event_id is present, then
pick only the record with the most recent event_date.
Sample data in file
{event_id: 1, event_date: '2023-01-01 10:12:15', customer_id: customer1, event: Login}
{event_id: 1, event_date: '2023-01-01 10:13:12', customer_id: customer1, event: Login}
{event_id: 1, event_date: '2023-01-01 10:15:15', customer_id: customer1, event: Login}
{event_id: 2, event_date: '2023-01-01 10:12:15', customer_id: customer2, event: Singup}
{event_id: 3, event_date: '2023-01-01 10:12:15', customer_id: customer3, event: Logout}
{event_id: 4, event_date: '2023-01-01 10:12:15', customer_id: customer4, event: Browse}
Expected output
{event_id: 1, event_date: '2023-01-01 10:15:15', customer_id: customer1, event: Login}
{event_id: 2, event_date: '2023-01-01 10:12:15', customer_id: customer2, event: Singup}
{event_id: 3, event_date: '2023-01-01 10:12:15', customer_id: customer3, event: Logout}
{event_id: 4, event_date: '2023-01-01 10:12:15', customer_id: customer4, event: Browse}

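from pyspark.sql import Window
from pyspark.sql.functions import col, row_number
df = spark.read.format("json").option("path", "/file.json").load()
# rank rows per event_id, newest event_date first, and keep only the top row
w = Window.partitionBy("event_id").orderBy(col("event_date").desc())
df1 = df.withColumn("rn", row_number().over(w)).filter(col("rn") == 1).drop("rn")
df1.show()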
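
The keep-the-latest rule can also be sketched in plain Python, which makes it easy to check against the expected output. The sketch assumes the records have already been parsed into dictionaries and that event_date keeps the 'YYYY-MM-DD HH:MM:SS' layout of the sample, so that string comparison matches chronological order. The function name dedupe_latest is illustrative.

```python
def dedupe_latest(records):
    """Keep one record per event_id: the one with the greatest event_date."""
    latest = {}
    for rec in records:
        kept = latest.get(rec["event_id"])
        # Fixed-width timestamps compare correctly as plain strings.
        if kept is None or rec["event_date"] > kept["event_date"]:
            latest[rec["event_id"]] = rec
    # Dicts preserve insertion order, so event_ids come out in first-seen order.
    return list(latest.values())

events = [
    {"event_id": 1, "event_date": "2023-01-01 10:12:15", "customer_id": "customer1", "event": "Login"},
    {"event_id": 1, "event_date": "2023-01-01 10:13:12", "customer_id": "customer1", "event": "Login"},
    {"event_id": 1, "event_date": "2023-01-01 10:15:15", "customer_id": "customer1", "event": "Login"},
    {"event_id": 2, "event_date": "2023-01-01 10:12:15", "customer_id": "customer2", "event": "Singup"},
    {"event_id": 3, "event_date": "2023-01-01 10:12:15", "customer_id": "customer3", "event": "Logout"},
    {"event_id": 4, "event_date": "2023-01-01 10:12:15", "customer_id": "customer4", "event": "Browse"},
]
deduped = dedupe_latest(events)
for rec in deduped:
    print(rec)
```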

Write a solution to report the Capital gain/loss for each stock.


The Capital gain/loss of a stock is the total gain or loss after buying and
selling the stock one or many times.
Return the result table in any order.
The result format is in the following example.
Input:
Stocks table:
+---------------+-----------+---------------+--------+
| stock_name | operation | operation_day | price |
+---------------+-----------+---------------+--------+
| DataPTB | Buy | 1 | 1000 |
| NothingLTD | Buy | 2 | 10 |
| DataPTB | Sell | 5 | 9000 |
| JaveriLLP | Buy | 17 | 30000 |
| NothingLTD | Sell | 3 | 1010 |
| NothingLTD | Buy | 4 | 1000 |
| NothingLTD | Sell | 5 | 500 |
| NothingLTD | Buy | 6 | 1000 |
| JaveriLLP | Sell | 29 | 7000 |
| NothingLTD | Sell | 10 | 10000 |
+---------------+-----------+---------------+--------+
Output:
+---------------+-------------------+
| stock_name | capital_gain_loss |
+---------------+-------------------+
| NothingLTD | 9500 |
| DataPTB | 8000 |
| JaveriLLP | -23000 |
+---------------+-------------------+

-- operation values are 'Buy' and 'Sell' (capitalised, as in the Stocks table)
select stock_name,
       sum(case when operation = 'Sell' then price else 0 end)
     - sum(case when operation = 'Buy' then price else 0 end) as capital_gain_loss
from Stocks
group by stock_name;

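
To check the arithmetic against the expected output, here is a small plain-Python sketch of the same rule: a Sell adds its price and a Buy subtracts it. The rows are the ones from the Stocks table above; the function name capital_gain_loss is illustrative.

```python
from collections import defaultdict

def capital_gain_loss(rows):
    """rows: iterable of (stock_name, operation, operation_day, price)."""
    totals = defaultdict(int)
    for stock_name, operation, _day, price in rows:
        # Selling brings money in, buying spends it.
        totals[stock_name] += price if operation == "Sell" else -price
    return dict(totals)

stocks = [
    ("DataPTB", "Buy", 1, 1000),
    ("NothingLTD", "Buy", 2, 10),
    ("DataPTB", "Sell", 5, 9000),
    ("JaveriLLP", "Buy", 17, 30000),
    ("NothingLTD", "Sell", 3, 1010),
    ("NothingLTD", "Buy", 4, 1000),
    ("NothingLTD", "Sell", 5, 500),
    ("NothingLTD", "Buy", 6, 1000),
    ("JaveriLLP", "Sell", 29, 7000),
    ("NothingLTD", "Sell", 10, 10000),
]
gains = capital_gain_loss(stocks)
print(gains)
```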
A website domain like "mail.google.com" consists of several subdomains. At the top
level we have "com", at the next level "google.com", and at the lowest level
"mail.google.com". When we visit a domain like "mail.google.com", we also
implicitly visit the parent domains "google.com" and "com". Given a list of
domains and their visit counts, write code to find the visit count of every
subdomain.
Input:
900, google.mail.com
50, yahoo.com
1, intel.mail.com
5, wiki.org
Expected output:
901, mail.com
50, yahoo.com
900, google.mail.com
5, wiki.org
5, org
1, intel.mail.com
951, com
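
The notes do not record a solution for this one. A minimal sketch in Python: split each domain on dots and add the visit count to every suffix. It assumes each input line has the "count, domain" form shown above; the function name subdomain_visits is illustrative.

```python
from collections import Counter

def subdomain_visits(entries):
    """entries: strings like '900, google.mail.com'. Returns {domain: total visits}."""
    counts = Counter()
    for entry in entries:
        count_text, domain = entry.split(",")
        count = int(count_text)
        labels = domain.strip().split(".")
        # 'google.mail.com' contributes to 'google.mail.com', 'mail.com' and 'com'.
        for i in range(len(labels)):
            counts[".".join(labels[i:])] += count
    return dict(counts)

visits = subdomain_visits([
    "900, google.mail.com",
    "50, yahoo.com",
    "1, intel.mail.com",
    "5, wiki.org",
])
for domain, total in visits.items():
    print(f"{total}, {domain}")
```

The order of the printed lines follows first appearance and may differ from the expected output listing, which the problem does not appear to constrain.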
