
3 Ass

This report analyzes payment data from a database using SQL queries. Key findings include: 1) Descriptive statistics on 273 customer payments show a mean of $32,431 and standard deviation of $20,997, indicating high variation in amounts. 2) A chart shows the distribution is positively skewed with a long tail, suggesting some unusually large payments. 3) The large range between minimum and maximum payments, as well as the high standard deviation, may indicate issues with processing or fraud that require further research.

Uploaded by

janvi1009patel

Descriptive Report + SQL (Structured Query Language)

Table of Contents
Executive summary
Introduction
(Descriptive Report + SQL)
Individual Assignment 1
Conclusion
Appendices
Reference

Executive summary

This report gives a brief introduction to the SQL language and emphasises the advantages of using MySQL as a database management system. MySQL is a popular open-source database management system renowned for its efficiency, adaptability, and dependability, and a sizeable developer community consistently contributes to its growth and upkeep (MySQL, n.d.). Through a sequence of SQL queries, each retrieving the data points needed for a specific question, the report analyses the supplied database. The retrieved data is exported to an Excel file for statistical analysis, and charts are created to aid visualisation and comprehension.

The report also emphasises the significance of adopting SQL for data analysis, which helps firms gain insight from their data and make defensible judgements. By executing the SQL queries, the report answers specific business questions, such as detecting consumer behaviour and preferences, managing workforce and inventory, and tracking order progress.

Overall, the report shows the strength and adaptability of MySQL and SQL in delivering data-driven insights that help organisations make sound decisions, streamline operations, and succeed in their respective sectors.

Introduction

Structured Query Language (SQL) is a language used to interact with relational databases. It was invented in the 1970s by IBM researchers. SQL is user-friendly and readily available on a variety of systems. Companies and other organisations use SQL to create and modify tables as well as to access and edit the information held in their systems.

Three of the core SQL statements are INSERT, which adds new rows to a table; SELECT, which retrieves information from tables; and UPDATE, which alters existing rows (Heller, n.d.).
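The three statements can be tried without a MySQL server. The sketch below is only an illustration: it uses Python's built-in sqlite3 module instead of MySQL, and the table contents are hypothetical values, not rows from the assignment database.

```python
import sqlite3

# In-memory SQLite database; the rows below are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customerNumber INTEGER, amount REAL)")

# INSERT adds new rows to a table.
conn.executemany(
    "INSERT INTO payments (customerNumber, amount) VALUES (?, ?)",
    [(1, 100.0), (2, 250.5)],
)

# UPDATE alters rows that already exist.
conn.execute("UPDATE payments SET amount = 300.0 WHERE customerNumber = 2")

# SELECT retrieves rows from the table.
rows = conn.execute(
    "SELECT customerNumber, amount FROM payments ORDER BY customerNumber"
).fetchall()
print(rows)
```

The same three statements run unchanged in MySQL Workbench; only the connection code is specific to this sketch.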

Some of the advantages of SQL are that it is easily accessible and portable, that typical queries execute in fractions of a second, that it requires little conventional programming, and that it offers several perspectives on the same data. A key feature of SQL is its capacity to process large volumes of data quickly and effectively. It is widely used in many industries, including banking, healthcare, retail, and technology, and has become an indispensable tool for data-driven organisations (Estrera, 2022).



(Descriptive Report + SQL)

Individual Assignment 1

Q1. By using SQL, select and export the "customerNumber" and "amounts" from the payments table of the given database and write a descriptive analytics report. (The report must contain a summary of statistic calculations (MUST be done with Excel as described in class) and discuss the meaning of each value in a business context. The report must have at least two charts that should be used for a better data presentation. You may use any information from the database to plot the charts.)

Enter the following query:

SELECT customerNumber, amount
FROM payments;

After entering the query, click Execute and export the result to Excel. The Excel sheet then contains the customerNumber and amount columns from the payments table. Applying Excel's Descriptive Statistics tool to both columns gives the table below.

Statistic             amount          customerNumber
Mean                  32431.64553     271.1941392
Standard Error        1270.803326     7.266937128
Median                32077.44        250
Mode                  #N/A            141
Standard Deviation    20997.11692     120.0695067
Sample Variance       440878918.8     14416.68644
Kurtosis              2.474868684     -1.195111118
Skewness              1.112727051     0.340641606
Range                 119551.13       393
Minimum               615.45          103
Maximum               120166.58       496
Sum                   8853839.23      74036
Count                 273             273
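The statistics in this table are linked by simple formulas; for example, the standard error is the standard deviation divided by the square root of the count, and 20997.11692 / √273 ≈ 1270.80 agrees with the value reported for amount. The sketch below shows how the same measures could be computed outside Excel. It is an illustration under stated assumptions: the function name describe and the five sample values are hypothetical, and the skewness and kurtosis lines follow the bias-adjusted sample formulas documented for Excel's SKEW and KURT functions (they need at least four values).

```python
import math
import statistics


def describe(values):
    """Return summary statistics using the sample (n - 1) formulas."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)          # sample standard deviation
    z = [(v - mean) / stdev for v in values]  # standardised deviations
    # Bias-adjusted sample skewness (formula documented for Excel's SKEW).
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    # Bias-adjusted excess kurtosis (formula documented for Excel's KURT).
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "standard_error": stdev / math.sqrt(n),
        "median": statistics.median(values),
        "standard_deviation": stdev,
        "sample_variance": statistics.variance(values),
        "kurtosis": kurt,
        "skewness": skew,
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
        "count": n,
    }


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values, not the payments data
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Passing the exported amount column to describe should give values comparable to the Excel table, provided the export contains all 273 rows.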

[Chart: values of the descriptive statistics of amount (Mean, Standard Error, Median, Mode, Standard Deviation, Sample Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, Count); vertical axis from 0 to 450,000,000]

Figure 1: Descriptive statistics of amounts.

Figure 1 displays the descriptive statistics of amount, which was retrieved from the MySQL database and covers 273 observations. The mean payment is 32,431.65 and the median is 32,077.44; the mode is not applicable (#N/A) because no payment amount occurs more than once. The standard deviation is 20,997.12 and the sample variance is 440,878,918.8, so individual payments vary widely around the mean. The kurtosis of 2.47, which Excel reports as excess kurtosis relative to a normal distribution, indicates a more peaked distribution with heavier tails than normal. The skewness is positive at 1.11, indicating that the distribution is skewed to the right, with a longer tail of large payments. The minimum value is 615.45 and the maximum is 120,166.58, giving a range of 119,551.13. Because the high standard deviation and positive skewness may indicate that certain customers are making large or irregular payments, further research may be needed to ascertain whether there are any problems with payment processing or fraud.

[Chart: values of the descriptive statistics of customerNumber (Mean, Standard Error, Median, Mode, Standard Deviation, Sample Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, Count); vertical axis from -10,000 to 80,000]

Figure 2: Descriptive statistics of CustomerNumber.

The line graph in Figure 2 depicts the descriptive statistics of customerNumber. Because customerNumber is an identifier rather than a measured quantity, these values describe how the customer numbers are spread across the 273 payment records and should be interpreted with caution. The mean is 271.19 and the median is 250. The mode is 141, meaning that customer 141 appears in more payment records than any other customer. Comparing the customer numbers in the payments table with the customers table can help to identify customers who have not yet paid and to estimate the size of the paying client base. The standard error is 7.27, which estimates the standard deviation of the sample mean around the population mean. The standard deviation is 120.07, and the sample variance, which is the square of the standard deviation, is 14,416.69. The kurtosis of -1.20 indicates a platykurtic distribution, which has a flatter peak and thinner tails than the normal distribution. With a skewness of 0.34, the distribution is somewhat right-skewed. The minimum value is 103 and the maximum is 496, so the customer numbers span a range of 393. These bounds show the overall range of customer numbers present in the payments table.

Q2. By using SQL, count how many employees work in the company.

SELECT COUNT(*) AS total_employees
FROM employees;

Total number of employees = 23



Q3. By using SQL, select the "customerNumber" associated with the amounts lower than the

average.

SELECT customerNumber
FROM payments
WHERE amount < (SELECT AVG(amount) FROM payments);
(ChatGPT, 2023)
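The subquery computes the average once, and the outer query then compares every payment against it. A minimal sketch of that behaviour, using Python's built-in sqlite3 module and three hypothetical payments (average 300) rather than the assignment data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customerNumber INTEGER, amount REAL)")
# Hypothetical rows: the average amount is (100 + 200 + 600) / 3 = 300.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(101, 100.0), (102, 200.0), (103, 600.0)],
)

# The inner SELECT returns the average; the outer WHERE keeps smaller amounts.
below_average = conn.execute(
    "SELECT customerNumber FROM payments "
    "WHERE amount < (SELECT AVG(amount) FROM payments) "
    "ORDER BY customerNumber"
).fetchall()
print(below_average)
```

A customer with several payments can appear more than once in the result; adding DISTINCT to the outer SELECT would list each customer a single time.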



Q4. By using SQL, select the "customerNumber" associated with the amounts between 5000 and

10000.

SELECT customerNumber, amount

FROM payments

WHERE amount BETWEEN 5000 AND 10000;



Q5. By using SQL, find the average percentage markup of the MSRP on buyPrice. (The markup

of the MSRP (Manufacturer's Suggested Retail Price) on the buy price refers to the difference

between the price at which a manufacturer suggests a product should be sold and the price at

which a retailer actually purchases the product from the manufacturer).

SELECT AVG(((MSRP - buyPrice) / buyPrice) * 100) AS avg_markup_percentage
FROM products;
(ChatGPT, 2023)

avg_markup_percentage
88.70239217
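The query averages the markup percentage of each product, where the markup of one product is (MSRP - buyPrice) / buyPrice × 100. The sketch below works through that arithmetic on two hypothetical products using Python's built-in sqlite3 module; the product names and prices are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (productName TEXT, buyPrice REAL, MSRP REAL)")
# Hypothetical products:
#   Model A: (100 - 50) / 50 * 100 = 100 % markup
#   Model B: (100 - 80) / 80 * 100 =  25 % markup
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Model A", 50.0, 100.0), ("Model B", 80.0, 100.0)],
)

avg_markup = conn.execute(
    "SELECT AVG(((MSRP - buyPrice) / buyPrice) * 100) FROM products"
).fetchone()[0]
print(avg_markup)  # average of 100 and 25
```

This is the mean of the per-product percentages. Dividing the total markup by the total buy price would weight expensive products more heavily and generally gives a different figure.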

Q6. By using SQL, report the name and city of customers who do not have sales representatives.

SELECT customerName, city

FROM customers

WHERE salesRepEmployeeNumber IS NULL;
(ChatGPT, 2023)

customerName                      city
Der Hund Imports                  Berlin
Havel & Zbyszek Co                Warszawa
Cramer Spezialitäten, Ltd         Brandenburg
Porto Imports Co.                 Lisboa
Asian Treasures, Inc.             Cork
Asian Shopping Network, Co        Singapore
SAR Distributors, Co              Hatfield
Kommission Auto                   Münster
Natürlich Autos                   Cunewalde
Lisboa Souveniers, Inc            Lisboa
ANG Resellers                     Madrid
Stuttgart Collectable Exchange    Stuttgart
Messner Shopping Network          Frankfurt
Feuer Online Stores, Inc          Leipzig
Franken Gifts, Co                 München
Warburg Exchange                  Aachen
BG&E Collectables                 Fribourg
Anton Designs, Ltd.               Madrid
Schuyler Imports                  Amsterdam
Mit Vergnügen & Co.               Mannheim
Kremlin Collectables, Co.         Saint Petersburg
Raanan Stores, Inc                Herzlia

Q7. By using SQL, report how many orders were cancelled and what the cancelled

"orderNumber" was.

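SELECT COUNT(*) AS cancelled_orders_count
FROM orders
WHERE status = 'Cancelled';

SELECT orderNumber
FROM orders
WHERE status = 'Cancelled';
(ChatGPT, 2023)

Counting and listing are run as two separate queries because COUNT(*) collapses the result to a single row, so a query that selects both the count and orderNumber returns only one orderNumber. Six orders were cancelled; the orderNumber captured in the output is 10167.

cancelled_orders_count    orderNumber
6                         10167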
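The difference between counting the cancelled orders and listing their order numbers can be sketched with Python's built-in sqlite3 module. The four orders below are hypothetical and only illustrate the shape of the two results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderNumber INTEGER, status TEXT)")
# Hypothetical orders, two of which are cancelled.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "Shipped"), (2, "Cancelled"), (3, "Cancelled"), (4, "On Hold")],
)

# One aggregate row: how many orders were cancelled.
cancelled_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = 'Cancelled'"
).fetchone()[0]

# One row per cancelled order: which order numbers they were.
cancelled_numbers = [
    row[0]
    for row in conn.execute(
        "SELECT orderNumber FROM orders WHERE status = 'Cancelled' "
        "ORDER BY orderNumber"
    )
]
print(cancelled_count, cancelled_numbers)
```

The length of the list always equals the count, which gives a quick consistency check on the two queries.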

Q8. By using SQL, find the name of 1625's supervisor.

SELECT CONCAT(e.firstName, ' ', e.lastName) AS supervisor_name
FROM employees e
WHERE e.employeeNumber = (SELECT reportsTo FROM employees WHERE employeeNumber = 1625);
(ChatGPT, 2023)

supervisor_name
Mami Nishi

Q9. By using SQL, find products containing the name 'Ford'.

SELECT productName

FROM products

WHERE productName LIKE '%Ford%';
(ChatGPT, 2023)

productName
1903 Ford Model A
1968 Ford Mustang
1976 Ford Gran Torino
1969 Ford Falcon
1940s Ford truck
1940 Ford Pickup Truck
1957 Ford Thunderbird
1911 Ford Town Car
1912 Ford Model T Delivery Wagon
1932 Model A Ford J-Coupe
1926 Ford Fire Engine
1940 Ford Delivery Sedan
1913 Ford Model T Speedster
1928 Ford Phaeton Deluxe
1934 Ford V8 Coupe



Q10. By using SQL identify the "customerNumber" with the median payment amount. Provide

his/her "customerName", "contactLastName" and "phone".

SELECT c.customerNumber, c.customerName, c.contactLastName, c.phone,
       p.amount AS median_payment_amount
FROM payments p
JOIN customers c ON c.customerNumber = p.customerNumber
ORDER BY p.amount
LIMIT 1 OFFSET 136;
(ChatGPT, 2023)

With 273 payments, the median is the 137th amount in ascending order, which is why the query skips 136 rows and returns the next one together with the matching customer details.

The output recorded for this question is:

customerNumber  customerName       contactLastName  phone       median_payment_amount
103             Atelier graphique  Schmitt          40.32.2555  32431.65

The value 32431.65 in this output equals the mean payment from Q1, whereas the median reported by Excel is 32077.44, so the customer shown should be re-checked against the median.
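Picking the middle row of the ordered payments and joining it to the customers table can be sketched as follows. This is an illustration using Python's built-in sqlite3 module; the three customers, their phone numbers, and the five payments are hypothetical, and the approach assumes an odd number of payments so that the median is an actual row.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customerNumber INTEGER, customerName TEXT, "
    "contactLastName TEXT, phone TEXT)"
)
conn.execute("CREATE TABLE payments (customerNumber INTEGER, amount REAL)")
# Hypothetical customers and payments (sorted amounts: 10, 20, 30, 50, 70).
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Alpha Ltd", "Adams", "555-0101"),
        (2, "Beta Ltd", "Brown", "555-0102"),
        (3, "Gamma Ltd", "Green", "555-0103"),
    ],
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, 10.0), (2, 50.0), (3, 30.0), (1, 70.0), (2, 20.0)],
)

# For an odd count n, the median is the row at zero-based position (n - 1) // 2.
n = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
offset = (n - 1) // 2

median_row = conn.execute(
    "SELECT c.customerNumber, c.customerName, c.contactLastName, c.phone, p.amount "
    "FROM payments p "
    "JOIN customers c ON c.customerNumber = p.customerNumber "
    "ORDER BY p.amount "
    "LIMIT 1 OFFSET ?",
    (offset,),
).fetchone()

# Cross-check against the median computed directly from the amounts.
amounts = [row[0] for row in conn.execute("SELECT amount FROM payments")]
print(median_row, statistics.median(amounts))
```

With an even number of payments the median is the average of the two middle amounts and may not match any single payment, so the query would need to return both middle rows instead.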

Q11. By using SQL, report the "customerName" of the orders "On Hold".

SELECT c.customerName, o.status

FROM customers c

JOIN orders o ON c.customerNumber = o.customerNumber
WHERE o.status = 'On Hold';
(ChatGPT, 2023)

customerName                 status
Volvo Model Replicas, Co     On Hold
Tekni Collectables Inc.      On Hold
Gifts4AllAges.com            On Hold
The Sharp Gifts Warehouse    On Hold

Q12. By using SQL, report the number of customers in Denmark, Norway, and Sweden.

SELECT COUNT(*) AS num_customers, country

FROM customers

WHERE country IN ('Denmark', 'Norway', 'Sweden')

GROUP BY country;
(ChatGPT, 2023)

num_customers country

3 Norway

2 Sweden

2 Denmark
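In the output above the three counts add up to 3 + 2 + 2 = 7 customers across the three countries. The sketch below shows how the per-country counts and the combined total relate, using Python's built-in sqlite3 module and five hypothetical customers rather than the assignment data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerName TEXT, country TEXT)")
# Hypothetical customers; France is included to show that the filter excludes it.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("A", "Denmark"),
        ("B", "Norway"),
        ("C", "Norway"),
        ("D", "Sweden"),
        ("E", "France"),
    ],
)

# GROUP BY gives one count per country.
per_country = conn.execute(
    "SELECT country, COUNT(*) FROM customers "
    "WHERE country IN ('Denmark', 'Norway', 'Sweden') "
    "GROUP BY country ORDER BY country"
).fetchall()

# Without GROUP BY the same filter gives the combined total.
total = conn.execute(
    "SELECT COUNT(*) FROM customers "
    "WHERE country IN ('Denmark', 'Norway', 'Sweden')"
).fetchone()[0]
print(per_country, total)
```

Dropping the GROUP BY clause is the simplest way to report a single figure when only the overall number of customers in the three countries is needed.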

Conclusion

To conclude, this report combines descriptive statistics, charts, and SQL. Using MySQL Workbench, data is retrieved by entering queries built from the SELECT and FROM commands; each query is then executed and returns the requested data from the organisation's database. SQL is a crucial tool for the management and analysis of data. Its efficiency, standardisation, scalability, flexibility, and integration with other programming languages make it valuable to companies across a variety of sectors and applications. By using MySQL to answer a number of business questions, this study demonstrates the strength and adaptability of SQL. MySQL helps to identify trends in customer payment behaviour, which helps businesses plan financially and retain customers. Likewise, the employee count is useful to management teams, HR departments, and staff when making choices about hiring, allocating resources, and budgeting. Moreover, the outcomes of these queries may assist companies in making sound decisions about their operations, including discovering prospective markets, assessing credit risk, and monitoring customer payments.

Appendices

Reference

ChatGPT. (2023, May 04). https://chat.openai.com/chat

Estrera, M. (2022, February 7). Who Uses SQL? Companies That Use SQL and What SQL Is Used For. Career Karma. https://careerkarma.com/blog/who-uses-sql/

Heller, M. (n.d.). What is SQL? The lingua franca of data analysis. InfoWorld. https://www.infoworld.com/article/3219795/what-is-sql-the-lingua-franca-of-data-analysis.html

MySQL. (n.d.). MySQL. https://www.mysql.com/
