SQL DM1

The document discusses methods for predicting email communication using feature engineering and SQL techniques. It covers SQL concepts such as Common Table Expressions, subqueries, and the order of execution for SQL queries, along with practical examples. Additionally, it introduces big data technologies like PostgreSQL, Hadoop, data warehouses, and HDFS, highlighting their functionalities and differences.

Uploaded by

Sania Solad

Case study

How would you predict who someone may want to send a Snapchat or
Gmail message to?

For each candidate recipient, assign a score for how likely the user is to
send them a message. The rest is feature engineering:
-Number of past emails
-Number of responses
-Time since they last exchanged an email
-Whether the last email ends with a question mark
-Features about the other users, etc.
-The people the user emailed most in the past, weighted by time decay
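
The time-decay feature in the list above can be sketched as follows. This is only an illustration: the exponential form, the `half_life_days` parameter, and the sample log are assumptions, not something the notes specify.

```python
import math
from datetime import datetime

def recipient_scores(sent_log, now, half_life_days=30.0):
    """Rank recipients by past emails, down-weighting older ones.

    sent_log: list of (recipient, sent_at) pairs for one sender.
    half_life_days: hypothetical parameter; an email this old counts
    half as much as one sent right now.
    """
    decay_rate = math.log(2) / half_life_days
    scores = {}
    for recipient, sent_at in sent_log:
        age_days = (now - sent_at).total_seconds() / 86400.0
        scores[recipient] = scores.get(recipient, 0.0) + math.exp(-decay_rate * age_days)
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

now = datetime(2024, 1, 31)
log = [
    ("ana", datetime(2024, 1, 30)),
    ("ana", datetime(2024, 1, 1)),
    ("raj", datetime(2023, 6, 1)),
    ("raj", datetime(2023, 6, 2)),
    ("raj", datetime(2023, 6, 3)),
]
ranking = recipient_scores(log, now)
```

Here "raj" has more emails in total, but they are old, so "ana" ranks first. In practice a score like this would be one input feature among the others listed, not the final prediction.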

SQL

1.Common Table Expressions (CTEs):

A CTE is a named temporary result set, defined with the WITH clause, that
you can reference within a SELECT, INSERT, UPDATE, or DELETE statement.
CTEs make complex queries more readable and maintainable.
Use CTEs when you need to break down a complex query into smaller, more
understandable parts.
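
A minimal sketch of a CTE, run through Python's built-in sqlite3 module so it is self-contained. The `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 40), ("ana", 70), ("raj", 20), ("mei", 150)],
)

# The CTE names the per-customer totals, so the outer query stays short.
query = """
WITH customer_totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM customer_totals
WHERE total > 100
ORDER BY customer
"""
big_spenders = conn.execute(query).fetchall()
```

The outer query reads from `customer_totals` as if it were a table, which is what makes a long query easier to follow step by step.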

2.Subqueries vs. JOINs:

-Subqueries are nested queries within another query and are used to retrieve
data for further processing.
-JOINs combine rows from two or more tables based on a related column.
-Use subqueries when you need to retrieve a single value or a small set of
values, and use JOINs when you need to combine data from multiple tables.
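
The two cases can be sketched side by side, again with sqlite3 and hypothetical `student` and `dept` tables: a subquery that returns a single value (the average mark), and a JOIN that combines two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student (name TEXT, marks INTEGER, dept_id INTEGER);
INSERT INTO dept VALUES (1, 'Math'), (2, 'Physics');
INSERT INTO student VALUES ('Nitin', 82, 1), ('Kavya', 64, 2), ('Omar', 91, 1);
""")

# Subquery: the inner SELECT yields one value used by the outer filter.
above_average = conn.execute("""
SELECT name
FROM student
WHERE marks > (SELECT AVG(marks) FROM student)
ORDER BY name
""").fetchall()

# JOIN: rows from both tables are matched on the related column.
with_dept = conn.execute("""
SELECT s.name, d.name
FROM student AS s
JOIN dept AS d ON d.id = s.dept_id
ORDER BY s.name
""").fetchall()
```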

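3.In what order are the clauses of a SQL query executed?

Ans.
Logical order of execution for an SQL query (the optimizer may choose a
different physical plan, but the result must match this order):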
1) FROM, including JOINs
2) WHERE
3) GROUP BY
4) HAVING
5) WINDOW Functions
6) SELECT
7) DISTINCT
8) UNION
9) ORDER BY
10) LIMIT AND OFFSET
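
A small sketch of why the order matters, using sqlite3 and an invented `sales` table. WHERE removes individual rows before grouping, HAVING removes whole groups afterwards, and ORDER BY and LIMIT see the final rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 5), ("east", 50), ("east", 60),
     ("west", 200), ("north", 30), ("north", 40)],
)

rows = conn.execute("""
SELECT region, SUM(amount) AS total   -- 6) computed once per surviving group
FROM sales                            -- 1) source rows
WHERE amount >= 10                    -- 2) drops the 5 before any grouping
GROUP BY region                       -- 3) one group per region
HAVING SUM(amount) > 100              -- 4) drops groups, not rows
ORDER BY total DESC                   -- 9) the SELECT alias is visible here
LIMIT 2                               -- 10) applied last
""").fetchall()
```

The total for "east" is 110, not 115, because the row with amount 5 was filtered out by WHERE before the sum was computed.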

4.Write a SQL query to find all students whose name contains Nitin

select name
from student
where lower(name) like '%nitin%'

The trick is to convert the whole name column to lower case before
comparing it with the pattern.

Wrong version:

where name like '%nitin%'

Where LIKE is case-sensitive (as in PostgreSQL), this will not capture
Nitin, niTin, etc.

5.Write a query to get all the students whose name has length 10, starts
with K, and ends with z.

select name
from student
where length(name) = 10 and lower(name) like 'k%z'
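
Both pattern queries can be tried out with sqlite3 on a few invented names. One caveat: SQLite's LIKE is case-insensitive for ASCII by default, so this sketch shows the lower(name) form working but does not reproduce the case-sensitive failure described in question 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?)",
    [("Nitin",), ("niTin Rao",), ("Kavya",),
     ("Kristofferz",), ("Kowalewicz",), ("kabcdefghz",)],
)

# Question 4: any capitalisation of "nitin", anywhere in the name
nitins = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE lower(name) LIKE '%nitin%'"
)]

# Question 5: exactly 10 characters, starting with k/K and ending with z/Z
k_to_z = [row[0] for row in conn.execute(
    "SELECT name FROM student "
    "WHERE length(name) = 10 AND lower(name) LIKE 'k%z'"
)]
```

"Kristofferz" matches the pattern but has 11 characters, so only the length check excludes it.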

7.ACID Properties

ACID stands for Atomicity, Consistency, Isolation, and Durability. These
properties ensure the reliability of database transactions.
-Atomicity ensures that a transaction is treated as a single, indivisible
unit: either all of its changes are applied or none are.
-Consistency guarantees that a transaction brings the database from one
consistent state to another.
-Isolation ensures that concurrent transactions are executed independently
of each other.
-Durability guarantees that once a transaction is committed, its effects are
permanent.
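
Atomicity is the easiest of the four to demonstrate. The sketch below uses sqlite3 with a hypothetical `account` table; the CHECK constraint and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money in one transaction; returns False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "alice", "bob", 500)   # would overdraw alice
after_failed = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
```

In the failed transfer the credit to "bob" had already been executed when the debit violated the constraint, yet the rollback undoes both, so no money appears from nowhere.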

1. Question: Convert '2023-10-15' to '15-Oct-2023'.

Answer: You can use the TO_CHAR function to format the date:

SELECT TO_CHAR(TO_DATE('2023-10-15', 'YYYY-MM-DD'), 'DD-Mon-YYYY') AS formatted_date;

2. Question: Calculate the date that is 90 days from today.

Answer: Use CURRENT_DATE and INTERVAL for date arithmetic:

SELECT CURRENT_DATE + INTERVAL '90 days' AS future_date;

In PostgreSQL, adding an interval to a date returns a timestamp;
CURRENT_DATE + 90 returns a date.

3. Question: Determine the day of the week for '2023-11-20'.

Answer: Use the TO_CHAR function to extract the day of the week:

SELECT TO_CHAR(TO_DATE('2023-11-20', 'YYYY-MM-DD'), 'Day') AS day_of_week;

In PostgreSQL, the 'Day' pattern pads the name to nine characters; 'FMDay'
returns it without padding.
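
To sanity-check the expected answers without a database at hand, the same three calculations can be reproduced with Python's datetime module. This is a cross-check of the results, not a substitute for the SQL; it assumes the default C locale for month and day names.

```python
from datetime import date, datetime, timedelta

# '2023-10-15' -> '15-Oct-2023'
formatted = datetime.strptime("2023-10-15", "%Y-%m-%d").strftime("%d-%b-%Y")

# 90 days from today
today = date.today()
future_date = today + timedelta(days=90)

# Day of the week for '2023-11-20'
day_of_week = datetime.strptime("2023-11-20", "%Y-%m-%d").strftime("%A")
```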

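4. Question: Display 'N/A' for employees with no 'hire_date'.

Answer: Use the COALESCE function to provide a default value for NULL dates.
Because 'N/A' is text, convert hire_date to text first (for example with
TO_CHAR) so that both arguments have the same type.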
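
The notes do not include a query for this question, so the following is a sketch under assumptions: an `employees` table with `name` and `hire_date` columns, run on sqlite3, where dates are stored as text and no conversion is needed. On PostgreSQL with a real date column, the expression would more likely be COALESCE(TO_CHAR(hire_date, 'YYYY-MM-DD'), 'N/A').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Asha", "2021-03-01"), ("Ben", None)],  # None is stored as NULL
)

# COALESCE returns its first non-NULL argument
rows = conn.execute(
    "SELECT name, COALESCE(hire_date, 'N/A') AS hire_date "
    "FROM employees ORDER BY name"
).fetchall()
```
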
5. Question: Convert a timestamp from one time zone to another.

Answer: Use the AT TIME ZONE clause to perform the conversion:

SELECT timestamp_column AT TIME ZONE 'UTC' AT TIME ZONE 'America/New_York' AS converted_timestamp FROM table_name;

This assumes timestamp_column is a timestamp without time zone that holds
UTC values.

Big Data Technologies

1.What is PostgreSQL?

PostgreSQL is an enterprise-level, versatile, resilient, open-source,
object-relational database management system that supports variable
workloads and concurrent users. It has been backed continuously by an
international developer community, and it is popular among developers
because of its fault-tolerant characteristics.
It is a very reliable database management system, with more than two
decades of community work behind its high levels of resiliency, integrity,
and accuracy. Many web, mobile, geospatial, and analytics applications
use PostgreSQL as their primary data store or data warehouse.

2.What is Hadoop used for?

Apache Hadoop is an open-source framework used to efficiently store and
process large datasets, ranging in size from gigabytes to petabytes.
Instead of using one large computer to store and process the data,
Hadoop clusters multiple computers so that massive datasets can be
analyzed in parallel, and therefore more quickly.
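
The split-process-merge idea can be illustrated on one machine. The sketch below is in the spirit of MapReduce, Hadoop's batch processing model, but it is only an analogy: it uses Python threads instead of a cluster, and the function names and sample lines are invented.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    return Counter(word for line in chunk for word in line.lower().split())

def word_count(lines, workers=3):
    # Split the input into one chunk per worker.
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Reduce step: merge the per-chunk counts into one result.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total

lines = [
    "big data big clusters",
    "data nodes store data",
    "big jobs run in parallel",
]
totals = word_count(lines)
```

Each worker only ever sees its own chunk, which is the property that lets the real system spread the chunks across many computers.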

3.What is a Data Warehouse?

A data warehouse is a type of data management system designed to enable
and support business intelligence (BI) activities, especially analytics.
Data warehouses are intended solely for queries and analysis and
often contain large amounts of historical data. The data within a data
warehouse is usually derived from a wide range of sources, such as
application log files and transaction applications.

4.What is the difference between Hive and Presto?

Hive is optimized for query throughput, while Presto is optimized for latency.
Presto limits the maximum amount of memory that each task in a
query can use, so a query that requires more memory than that limit
simply fails.

5.Define HDFS

The Hadoop Distributed File System (HDFS) is the primary data storage
system used by Hadoop applications. HDFS employs a NameNode and DataNode
architecture to implement a distributed file system that provides
high-performance access to data across highly scalable Hadoop clusters.
With HDFS, data is written to the cluster once, then read and reused
many times after that. HDFS has a primary NameNode, which keeps
track of where file data is kept in the cluster.
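
The division of labour between NameNode and DataNodes can be pictured with a toy in-memory model. Everything here is hypothetical (class and function names, the 4-byte block size, round-robin placement) and it leaves out most of what real HDFS does; it only shows metadata kept in one place and block contents kept elsewhere.

```python
class ToyNameNode:
    """Holds metadata only: which DataNode stores each block of a file."""

    def __init__(self):
        self.block_map = {}  # path -> ordered list of (block_id, datanode name)

def write_file(namenode, datanodes, path, data, block_size=4):
    if path in namenode.block_map:
        raise FileExistsError(path)  # write-once: no in-place overwrite
    names = sorted(datanodes)
    locations = []
    for number, start in enumerate(range(0, len(data), block_size)):
        block_id = f"{path}#{number}"
        node = names[number % len(names)]  # round-robin placement
        datanodes[node][block_id] = data[start:start + block_size]
        locations.append((block_id, node))
    namenode.block_map[path] = locations

def read_file(namenode, datanodes, path):
    # Ask the NameNode where the blocks are, then fetch them from the DataNodes.
    return b"".join(
        datanodes[node][block_id] for block_id, node in namenode.block_map[path]
    )

datanodes = {"dn1": {}, "dn2": {}, "dn3": {}}
namenode = ToyNameNode()
write_file(namenode, datanodes, "/logs/a.txt", b"hello hdfs")
content = read_file(namenode, datanodes, "/logs/a.txt")
```

The file is split into three blocks spread over three DataNodes, and the NameNode itself never stores any of the file's bytes.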