SQL in 7 Days: A Quick Crash Course in Manipulating Data, Databases Operations, Writing Analytical Queries, and Server-Side Programming (English Edition)
By Alex Bolenok
About this ebook
With case studies drawn from the author's own experience, the book teaches you everything you need to know about SQL. This book walks you through the inner workings of a database system and teaches you how to utilize SQL to get the most out of your database. As the book progresses, you'll get a deeper understanding of database engineering principles that will speed up data mining and analysis.
CHAPTER 1
Basic Concepts
Introduction
Welcome to the world of relational databases! At the time of writing this book, this is, by far, the most prevalent technology used to store and access all kinds of data: from online store orders to airline ticket sales, from health records to hotel reservations. Familiarity with this technology is a must for any software developer worth their salt, and a good command of it is a crucial requirement for many positions, from data scientist to backend developer. Let's get you familiar with it so that you can see what all the fuss is about.
Structure
In this chapter, we will discuss the following topics:
Relational data model and its historical roots
Design principles and the "why" behind the relational data model
Building blocks of a relational database: table, tuple, record, field, and type
SQL, the programming language used to interact with the database
Basic SQL code
SQL as a declarative language
Features of a database management system
Objectives
This chapter will help you get familiar with the relational data model, the most widely used model to store and query data these days.
After studying this chapter, you will do the following:
Have the knowledge of the design principles behind the relational model
Understand why it was built this way
Get familiar with SQL, the language used to communicate with the databases
Learn what features database systems have beyond data modeling
Become aware of the benefits and drawbacks of the relational data model and SQL
Getting started
SQL has a complicated history.
It was conceived 50 years ago based on a paper by a scientist named E. F. Codd. In that paper, Dr. Codd described a new way to store and process data.
He was not exactly satisfied with the ways people used to store the data back then. He felt that too much effort was wasted on irrelevant implementation details.
Every application requires a data model. You must know what data you are going to store and how the different bits of data are related to each other. If you are building an HR application, your data model would deal with "organizations," "managers," and facts like "organizations employ managers." For an online retail application, it would be "orders," "products" and "orders contain products". That kind of thing.
When you are thinking your application through, at some point, you must sit and figure out all those data pieces and their relationships. Maybe even draw them on a nice diagram with boxes and arrows. This process is called data modelling. It is a stage in every project's lifetime. Not understanding your project's data model is a sure road to failure, and every developer must avoid that.
Back then, it was not enough to have the data model thought through. Developers also had to worry about implementation. They had to think about how to organize, store, index and access the data, down to the tiniest level of detail: file formats, data structures, compression, and so on. Every developer would come up with their own way of doing this.
The end results were not reusable. A data model built for Alice's Banking could not be reused for Charlie's Finance or Mallory's Mortgages. Frankly speaking, it could hardly be reused for Alice's Banking version 2.
In those days, data models had to be custom-tailored and made to order. And Dr. Codd wanted them standardized and built from a blueprint.
The approach he came up with became known as the relational data model. This model has since become the gold standard in the data processing world, although not exactly the way Dr. Codd had imagined it in the first place.
The database systems you have probably heard of, like MySQL, PostgreSQL, and SQL Server, are relational databases, meaning that they are based on the relational data model.
Before the Relational Model
Meet Denis.
It is 1969. Denis is a computer engineer fresh out of college, and he just got his first job offer from Bob's Warehouse Solutions. Bob loves technology. He just signed a lease for a mainframe computer for his warehouse company.
If you are in the warehouse business, your world has much to do with customers, orders, and products. And if it is the 1960s on the calendar, you cannot just buy software to manage this data. You must hire someone to program it. This is how Bob met Denis.
Denis had been a good student. He knew that the way to organize data is something called a hierarchical database. This is just how data works. When you store your data on paper, you organize papers into files, files into binders, binders into cabinets, and so on. When you store your data in a computer, you do the same thing. You write your data into 40-byte segments and arrange them in a hierarchy.
A file has its place in a binder, and a binder in a cabinet. In the same way, every segment has one parent and can have several children.
Files and binders do not care about what you put into them, and neither does the database. It is your job to organize the data.
After 2 months of hacking, Denis is happy with what he made. All the products are catalogued. All the orders are stored nicely in the system. He has arranged the data hierarchy, from the customer down to the order and the item in the order.
At first, it works flawlessly. Just like on TV. Lucy, the operator, can see the customer's orders on the terminal while on the phone with the customer. She can take a new order and put it into the system with a few keystrokes. The invoice is printed automatically within minutes. Everyone is excited about the new technology. Lucy looks at Denis with interest.
Bob wants the system improved. He asks Denis to have the system track the stock levels. Every night, the system would print the balance for each product. The managers will read the printouts the next morning and restock the warehouse.
Denis is angry. The storage hierarchy he has developed does not work for this kind of task. The system would need to go through every record in the hierarchy and calculate the stock level for each product. There is no chance the system can handle this amount of work overnight.
Denis starts optimizing. He creates a new hierarchy, starting from the product, down to the date and the order items. This hierarchy is much easier to follow for his program, but it takes extra effort to maintain it.
The system has not changed much. It is still storing the same old customers, orders, and products. But there is a new report, and the system needs a new data access path for this report. The system is doing more work, but not adding any new knowledge or information.
Denis must account for all that. He must update the software to work with the new hierarchy. He must make his code update the stock when an order comes in. He must keep the data references up to date. He keeps writing and updating his program on paper during the day and has it punched and loaded into the system at night.
Denis is tired, and he makes coding mistakes; the system must be brought offline for 2 weeks. The accounting team is back to calculating stock reports by hand, and they only have capacity to do this weekly. The chief accountant is happy: he will not be replaced by the computer any time soon. Everyone else's enthusiasm about computers fades. Lucy does not look at Denis with such interest anymore.
It suddenly dawns on Denis that Bob's first feature request was not his last one. And lo and behold, not a month goes by without Bob coming up with a new idea. Denis does not have time to sit and think about optimizing the data storage, and there are just too many moving parts. Data references data that references data. There is no end to it.
Each time something new is added, Denis must manually untangle this spaghetti bowl of hierarchies and references, just enough to find some room for the new ones. He must put new features in and make sure the old features still work. Some parts of the system are not used anymore, but Denis does not have time to clean them up. They keep sitting there, adding to the mess. He spends most of his day doing menial, soul-consuming bug hunting and adding ugly hacks to keep the system afloat.
Every day, Denis wakes up, looks in the mirror and questions his career choice. Maybe his mother was right. Maybe he should have become an orthodontist.
Of course, Dr. Codd did not know Denis, because I made him up. And yet, he felt sorry for him. He knew the type. He understood that this path was leading nowhere.
So, one day, he sat down and wrote a paper, "A Relational Model of Data for Large Shared Data Banks".
Here are the main ideas behind relational databases.
All the data — all of it — is strictly typed.
Data is stored in something called a tuple. A tuple is made of several parts, each one having a name and a type, always in the same order. A Bible reference, like "Genesis 1:1", is an example of a tuple. It consists of three parts: book, chapter, and verse. Every part of the tuple is typed. First goes a string with the name of one of the Bible's books, then two numbers, which are the chapter and the verse.
A collection of tuples is called a table. In a table, all tuples have the same type (same names, same number of fields and same field types). Tuples stored in a table are called records.
In a table, the order of records does not matter. You can take two rows in the table and swap them. As far as the model is concerned, nothing has changed. There is no such thing as the order of rows in the table.
Records in tables can be related to each other, but these relationships are not hard coded anywhere in the database. Tables and rows are all that there is to it. The way different pieces of data are related is not defined when putting the data into the database, but when getting it out of the database. For the exact same data sitting in the database, you can make two different queries relating the pieces of data together in different ways. Data relationships are separated from the data storage.
The last two bits are important. Let them sink in. They are the revolutionary steps. They are the real killer feature.
Here is the thing. Those records are stored in the computer memory or on a hard drive. Before the advent of the relational model, the usual way to deal with the references between data records was to use pointers. If you wanted one record to reference another, you would store its exact location in memory, in a file, or in the database hierarchy.
The moment you engage in this pointer business, you are getting married to your references. You are making them a part of your data model. You must code them into your application. The way the pieces of data are related is being baked into the data model, and it becomes your responsibility. This is what happened to poor Denis.
But what if we know nothing about how and where the record is stored? All there is to know about the record is right in the record itself. To reference this record, you would have to use some part of its own data, and this data is not going anywhere. It is either there, or it is not.
What if there was some magical way to say, "I want all orders for a given date" or "I want all items for a given customer" or even "I want the total number of orders for the last week", and Denis would not have to think about it in terms of following the cabinet/binder/file hierarchy?
The database system would then be free to arrange the data any way it would see fit. It could compress the records, move them to and from operating memory, and spread them over multiple files or hard drives or even computers.
This would not concern Denis, as a developer. It would let him separate the model logic from the storage logic. It would allow him to worry about the data and let the computer deal with the boring details like storage and access paths.
For its time, it was a whole new level of thinking, and it contributed a great deal to the popularity of the relational database model.
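The "magical way" of asking for data can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a convenient way to run SQL, and the order_items table, its columns, and the sample rows are hypothetical rather than taken from Bob's system. The point is that two unrelated questions are answered from the same stored rows, with no second hierarchy to maintain.

```python
import sqlite3

# Hypothetical data: an in-memory database with one flat table of order items.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items ("
    "order_date TEXT, customer TEXT, product TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?)",
    [
        ("1969-07-01", "Acme", "bolts", 100),
        ("1969-07-01", "Zenith", "nuts", 50),
        ("1969-07-02", "Acme", "nuts", 25),
    ],
)

# Question 1: everything ordered on a given date.
by_date = conn.execute(
    "SELECT customer, product, quantity FROM order_items "
    "WHERE order_date = ? ORDER BY customer",
    ("1969-07-01",),
).fetchall()

# Question 2: total quantity per product, asked of the very same rows.
by_product = conn.execute(
    "SELECT product, SUM(quantity) FROM order_items "
    "GROUP BY product ORDER BY product"
).fetchall()

print(by_date)
print(by_product)
```

Neither query says anything about where or how the rows are stored; each one only describes the data it wants.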
The relational model
No data except the data
The notion of references is one of the most challenging concepts in software development. Pointers, indirect memory access, function references, you name it. It just does not sink in with a lot of people. And even when it does, tracking and updating all these references is not a rewarding job. It is prone to bugs and does not add to the big picture.
Denis had learned it the hard way. Bob, his boss, understood the data. Bob appreciated Denis doing something useful about the data. But it was difficult to explain to Bob that most of Denis's efforts were not about adding new information into the system. They were about putting together that red yarn on a crazy person's wall that tied the existing pieces of data together so that they would be easy to follow. To be honest, Denis was having a hard time explaining that to himself.
The sexy thing about the relational model was that it was the first successful model where the relationships between the data were data. If you want one record to reference another, you do not reference its position in a table or its path in a hierarchy. There is no such thing as a position anymore. You put something else in the row (a string, a number, anything), which becomes part of the data itself and stays there forever.
If you think about it, people have been doing this with physical, material things since time immemorial.
I want you to meet Elsie.
Elsie is managing the cleaning team in a large grocery store. She needs an inventory of her cleaning supplies closet. She lives in the 1930s, and computers are not around yet. She must keep her inventory on paper.
In her closet, Elsie has a vacuum sitting on the top shelf. She wants to record it on the inventory. But she knows better than to write it down as "that vacuum that sits on the top shelf." What would happen if someone moved the vacuum across the room? She would have to go to her office, open the cabinet and fix the record. It would then say, "that vacuum that sits on the top shelf on the bottom shelf". It is easy to see that this would become unwieldy fast.
Instead, Elsie takes a can of paint and a brush, and stencils the inventory number on the vacuum itself. The inventory number says #ABC-IDK-30035, and it becomes an integral part of the vacuum. It is stenciled on its side, and it stays there forever.
This, of course, comes with a price. Elsie cannot locate the vacuum as easily as before. If she wants to find it, she will have to go to the supplies closet and look for the inventory number on every vacuum or anything that remotely resembles a vacuum. And it might not even be there. It might have been moved or sold or stolen.
But this is where the good part about being the boss kicks in. Elsie's records are OK. Locating the actual, physical vacuum has just become the problem of the cleaning crew. They may devise their own system to track equipment within the supplies closet. Maybe their memory is particularly good. They might even spend their own time looking for equipment time after time, every single day. If the piece of equipment is accounted for, Elsie could not care less.
Data modelling works the same way. Defining the data is one job, while locating and shuffling it around is another. Turns out, people mostly enjoy the former and would pay a good deal of money for the software that deals with the latter.
In software development lingo, this is called separation of concerns. It is considered a good thing, when it works.
Just like Elsie's vacuum, a data record should be referenced by something that is engraved on it and never changes.
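Elsie's stenciled number can be sketched as data, too. The example below is a rough illustration, not the book's own schema: it runs SQL through Python's built-in sqlite3 module, and the inventory and repairs tables, their columns, and the repair note are all made up for the occasion. The repair note refers to the vacuum by its inventory number, so moving the vacuum to another shelf does not break the reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: the inventory itself, and notes that refer to inventory items.
conn.execute("CREATE TABLE inventory (inventory_no TEXT, item TEXT, shelf TEXT)")
conn.execute("CREATE TABLE repairs (inventory_no TEXT, note TEXT)")
conn.execute("INSERT INTO inventory VALUES ('ABC-IDK-30035', 'vacuum', 'top')")
conn.execute("INSERT INTO repairs VALUES ('ABC-IDK-30035', 'replaced belt')")

# Someone moves the vacuum: only the shelf changes, not the number painted on it.
conn.execute(
    "UPDATE inventory SET shelf = 'bottom' WHERE inventory_no = 'ABC-IDK-30035'"
)

# The note still finds its vacuum, because it refers to the number, not to a location.
row = conn.execute(
    "SELECT i.item, i.shelf, r.note "
    "FROM repairs AS r "
    "JOIN inventory AS i ON i.inventory_no = r.inventory_no"
).fetchone()

print(row)
```

The query matches the two tables on the shared number at the moment it runs; nothing in either table stores where the other record lives.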
What should such a data record look like?
Look at the shelves at the local supermarket, and you will see endless rows of clothes, toys, furniture, digital gadgets, just about anything you can imagine. All shiny, all affordable, all looking the same.
Why is that so? Because of the dystopian nature of mass production.
Every mass-produced item is made of the same parts. This is called unification. The parts are replaceable and look the same. They easily fit together on the assembly line.
The assembly process is a series of extremely simple, easy-to-follow instructions. This keeps wages and training costs low. This also makes factory workers easily replaceable.
The assembly line does not really end at the factory's gate. Billions of standardized boxes get packed into millions of standardized intermodal containers and shipped across the world on thousands of standardized cargo ships. Port, truck, forklift, shelf. No time is wasted on thinking. Everything fits. A perfectly efficient, well-oiled machine, with people reduced to its cogs.
That is quite a sad picture if you are at the business end of this machine, but being in software development means you get to enjoy the beauty and elegance of mass processing, leaving out the sad part. You build your own machines out of code, which does not mind being put to work 24/7.
The mass production principles translate well to software development.
"Easy to follow instructions" translates to simple code that is not resource-demanding, runs fast, and applies to all the data items, be it just one or a million. If it works on one data record, it works on all of them.
"Unification" translates to the fact that all the data records should look the same.
Here is how the relational model implements these principles.
As you learned before, a basic building block in the relational data model is called a tuple. A tuple has a set of fields, each with a predefined type¹.
The database stores the data in unordered collections of tuples, called tables. It is crucial to understand that a table is not a table in the usual sense of the word. It is not a nice, ordered list. It is just a dump of records.
All these definitions might be hard to follow, so let us illustrate them with an example. To be honest, every other book deals with orders and records and other boring stuff. In this chapter, we'll illustrate the concepts of table and record with a deck of cards.
There are exactly 52 cards in the deck. You know which cards these are, but there is no order in them, not when the deck is shuffled.
The value of a card is a good example of a tuple.
It has two fields. The fields have names, "rank" and "suit," and a particular order. It is always "king of hearts" or "6 of diamonds." Try changing the order of the words and putting it as "spades ace". Sounds dull, doesn't it?
The fields have types as well. "Ace" is a rank, and "spades" is a suit. You cannot put one into another. You can tell what goes in the rank and what in the suit. These types are well defined too: there are 13 ranks and 4 suits, and everyone knows the names and symbols for them.
When you reference the card, you are referencing it by something that stays with it forever: it is printed right on the card.
All the cards have the same size and format. The rank and suit are printed at a fixed place in the corner of the card. Everyone can shuffle the deck, go through every card in it and tell its value because cards are specifically designed to be easy to handle and identify. Even not-so-complex machines can do that.
If you want a particular card, you cannot reference it by anything other than its value. You cannot say "give me the card number 2" or "give me the card that was printed first." You would have to put it as "ace of spades." Card number 2 in a shuffled deck could be anything, and you do not know which card was printed first, but an "ace of spades" is the ace of spades. Whoever is handling the deck would have to go through all the cards and look at each one of them until they find the ace of spades and give it to you.
Relational database management systems handle data just like the shuffling machines in a casino handle the decks of cards, with the only difference being that instead of cards in a deck, they have data records in a table. They can shovel the data around and browse through the tables extremely fast and efficiently. They can follow users' instructions like "give me a king of clubs" or "give me all eights". However, for this to work well, all the data records should have the same format, just like all the cards in a deck have the same format.
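The deck analogy can be tried out directly. What follows is a minimal sketch rather than the book's own setup: it uses Python's built-in sqlite3 module to run the SQL, and the way the deck table is created and filled is an assumption made for illustration. It builds the 52 records and then asks for cards by their values, the only way a card can be referenced.

```python
import sqlite3

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["♣", "♦", "♥", "♠"]

conn = sqlite3.connect(":memory:")

# Every record is a tuple with the same two named fields.
conn.execute("CREATE TABLE deck (rank TEXT, suit TEXT)")
conn.executemany(
    "INSERT INTO deck (rank, suit) VALUES (?, ?)",
    [(rank, suit) for rank in RANKS for suit in SUITS],
)

card_count = conn.execute("SELECT COUNT(*) FROM deck").fetchone()[0]

# "Give me a king of clubs": the card is identified by its value alone.
king_of_clubs = conn.execute(
    "SELECT rank, suit FROM deck WHERE rank = 'K' AND suit = '♣'"
).fetchall()

# "Give me all eights": the order in which they come back is not specified.
eights = conn.execute("SELECT rank, suit FROM deck WHERE rank = '8'").fetchall()

print(card_count)
print(king_of_clubs)
print(sorted(eights))
```

Neither request mentions a position in the deck; each one describes the card by what is printed on it.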
Tables, queries, and resultsets
A table, or a collection of typed records, is the most fundamental thing in a relational database. Other programming languages' and platforms' fundamental units can be pointers, variables, classes, files, or segments. The fundamental unit of the relational model is a table.
If you want to retrieve some data from your database, all you need to do is ask.
You do it by sending a special command called a query. The language you use to write queries is called SQL. This book is about this language.
SQL stands for Structured Query Language. Remember the unification and easy instructions principles? This is where the "structured" part comes from.
In most relational database management systems (RDBMS), SQL is the one and only way to interact with the system. It is used to get the data in and out, set up the tables, and even manage the system itself. If you want to interact with the database, you should talk to it in SQL.
SQL somewhat resembles plain English. Here is an example of a query in SQL:
SELECT rank, suit
FROM deck;
Query 1
Here is what its resultset looks like:
rank | suit
------+------
A | ♣
A | ♥
…
K | ♦
K | ♠
(52 records total)
(I have removed some of the tuples from this resultset to save space on the page. If you run this query, you will get back 52 tuples.)
What does this query mean?
I know there is a table called deck that stores information about cards.
I know that the records in this table are tuples that have the fields called rank and suit.
I want the database to give me the rank and the suit fields for every record from this table.
An SQL query gives you back a resultset. It is a list of tuples. This query will give you a resultset of 52 tuples, each one having two fields: rank and suit.
A response from an SQL query is always a resultset. It is always a list of tuples, all having the exact same type. All the tuples line up nicely: they have the same exact fields in the same exact order. The data in the fields might differ, but the fields stay the same. Just like those parts on the assembly line.
This is what the "structured" part is about. This is something that you, an SQL developer, will have to learn to embrace.
SQL is strict about this. If you have some experience in software development, you might be familiar with JSON and XML. In JSON or XML, you can have arrays of mixed integers and strings, next to a nested object, next to a Boolean. They are lax about the data structure.
In SQL, you can have nothing of the sort. SQL always returns a list, always of the same tuples.
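The uniform shape of a resultset is easy to check from client code. The sketch below makes a few assumptions: it runs the chapter's query through Python's built-in sqlite3 module, and it loads only four hypothetical cards into deck to keep things short. It looks at the shape of the rows only, since how strictly field types are enforced varies between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deck (rank TEXT, suit TEXT)")
conn.executemany(
    "INSERT INTO deck VALUES (?, ?)",
    [("A", "♣"), ("A", "♥"), ("K", "♦"), ("K", "♠")],
)

cursor = conn.execute("SELECT rank, suit FROM deck")

# The field names are fixed for the whole resultset before any row is read.
columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(columns)
# Every row carries exactly the same fields, in the same order.
print(all(len(row) == len(columns) for row in rows))
```

The names come from the query, not from the individual rows, which is why no row can show up with an extra or a missing field.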
Let us play a little game of questions and answers to get a better grasp of this.
I need my query to return 10 fields in the first record and 5 fields in the second one. Can SQL do this?
No, SQL cannot do this. It will always return the same number of fields in each tuple.
I need my query to return 10 fields, but the first one should be a number in the first record, and a string in the second record. Can SQL do this?
No, SQL cannot do this. A column cannot mix types for different records.
I really need to have an extra field just for the header, because it has more data than the rest of the table. How can I return an extra field for the header?
You do not. You will have to send two queries: one for the header and another one for the table rows. If you want resultsets of different formats, you must submit different queries.
I need to show a nested table within the last column of my table. This is what my designer wants. His glasses look way cooler than mine, so he knows what he is talking about. How can I make SQL stuff a whole table into a column?
You do not. You will need to run a separate query for every row in a loop to get that nested table. Or, better yet, you can run two separate queries and combine the results on the client. This is something we will discuss later in the book, so let us put a pin in it for now. The short answer to the question is no.
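The "two queries, combined on the client" approach from the last answer might look roughly like this. It is a sketch under assumptions, not the technique the book develops later: the client is Python with its built-in sqlite3 module, and the orders and order_items tables and their contents are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_no INTEGER, product TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "Acme"), (2, "Zenith")])
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [(1, "bolts"), (1, "nuts"), (2, "washers")],
)

# First query: the outer rows. Second query: all inner rows as one flat resultset.
orders = conn.execute(
    "SELECT order_no, customer FROM orders ORDER BY order_no"
).fetchall()
items = conn.execute(
    "SELECT order_no, product FROM order_items ORDER BY order_no, product"
).fetchall()

# The nesting happens on the client, not in SQL.
items_by_order = defaultdict(list)
for order_no, product in items:
    items_by_order[order_no].append(product)

nested = [
    (order_no, customer, items_by_order[order_no])
    for order_no, customer in orders
]
print(nested)
```

Each query still returns a flat list of identical tuples; the nested structure the designer wants exists only in the client's memory.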
You probably get the idea by now, but I just cannot resist another analogy.
Imagine a heavy industrial sewing machine. It can only process and produce rolls of perforated fabric. In this analogy, rolls are tables, stripes are fields, and perforation is what separates two records.
The machine can do some impressive things with the fabric, like cutting the rolls, sewing them together, making them wider or thinner or shorter or longer.
However, it is always a standard roll in, and a standard roll out. If you want something that is not a roll of fabric, like a shirt or a dress, you will have to take the fabric elsewhere. The machine can help you mass process the rolls of fabric, but