AllNotes WebTrack
OpenCourseWare
Syllabus
Introduction to the intellectual enterprises of computer science and the art of programming. This course teaches students how to think
algorithmically and solve problems efficiently. Topics include abstraction, algorithms, data structures, encapsulation, resource management,
security, and software engineering. Languages include C, Python, and SQL plus students’ choice of: HTML, CSS, and JavaScript (for web
development); Java or Swift (for mobile app development); or Lua (for game development). Problem sets inspired by the arts, humanities, social
sciences, and sciences. Course culminates in a final project. Designed for concentrators and non-concentrators alike, with or without prior
programming experience. Two thirds of CS50 students have never taken CS before. Among the overarching goals of this course are to inspire
students to explore unfamiliar waters, without fear of failure, create an intensive, shared experience, accessible to all students, and build
community among students.
Expectations
Website
https://cs50.edx.org/
Certificates
CS50x is free to take, and you are welcome to submit the course’s nine problem sets and final project for automated feedback. To be eligible for
a verified certificate (https://www.edx.org/verified-certificate) from edX, however, you must receive a satisfactory score (at least 70%) on each
problem you submit as part of one of the course’s nine problem sets as well as on the course’s final project.
Problems are evaluated along axes of correctness (as determined by a program called check50 ) and style (as determined by a program called
style50 ), with scores ordinarily computed as 3 × correctness + 1 × style.
Books
No books are required or recommended for this course. However, you might find the below books of interest. Realize that free, if not superior,
resources can be found on the course’s website.
C Programming Absolute Beginner’s Guide, Third Edition + Greg Perry, Dean Miller + Pearson Education, 2014 + ISBN 0-789-75198-4
Hacker’s Delight, Second Edition + Henry S. Warren Jr. + Pearson Education, 2013 + ISBN 0-321-84268-5
How Computers Work, Tenth Edition + Ron White + Que Publishing, 2014 + ISBN 0-7897-4984-X
Programming in C, Fourth Edition + Stephen G. Kochan + Pearson Education, 2015 + ISBN 0-321-77641-0
Lectures
Integrated into problem sets are “walkthroughs,” videos that offer direction on where to begin and how to approach problems.
Problem Sets
Problem sets are programming assignments. CS50x does not have deadlines for problem sets. You are welcome to work on and submit them at
your own pace. To be eligible for a verified certificate from edX, however, you must submit (and receive a score of at least 70% on) all problem
sets by 31 December 2020.
Final Project
The climax of this course is its final project. The final project is your opportunity to take your newfound savvy with programming out for a spin
and develop your very own piece of software. So long as your project draws upon this course’s lessons, the nature of your project is entirely up
to you. You may implement your project in any language(s). You are welcome to utilize infrastructure other than the CS50 IDE. All that we ask is
that you build something of interest to you, that you solve an actual problem, that you impact your community, or that you change the world.
Strive to create something that outlives this course.
Inasmuch as software development is rarely a one-person effort, you are allowed an opportunity to collaborate with one or two classmates for
this final project. Needless to say, it is expected that every student in any such group contribute equally to the design and implementation of
that group’s project. Moreover, it is expected that the scope of a two- or three-person group’s project be, respectively, twice or thrice that of a
typical one-person project. A one-person project, mind you, should entail more time and effort than is required by each of the course’s problem
sets. Although no more than three students may design and implement a given project, you are welcome to solicit advice from others, so long
as you respect the course’s policy on academic honesty.
CS50x does not have a deadline for the final project. You are welcome to work on and submit it at your own pace. To be eligible for a verified
certificate from edX, however, you must submit (and receive a score of at least 70% on) it by 31 December 2020.
Academic Honesty
This course’s philosophy on academic honesty is best stated as “be reasonable.” The course recognizes that interactions with classmates and
others can facilitate mastery of the course’s material. However, there remains a line between enlisting the help of another and submitting the
work of another. This policy characterizes both sides of that line.
The essence of all work that you submit to this course must be your own. Collaboration on problem sets is not permitted except to the extent
that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking,
when asking for help, you may show your code to others, but you may not view theirs, so long as you and they respect this policy’s other
constraints. Collaboration on the course’s final project is permitted to the extent prescribed by its specification.
Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to
whether some act is reasonable, do not commit it. If the course determines that you have committed an act that is not reasonable, you may be
deemed ineligible for a certificate. If you commit some act that is not reasonable but bring it to the attention of the course’s instructor within
72 hours, the course may reconsider that outcome.
Reasonable
Communicating with classmates about problem sets’ problems in English (or some other spoken language).
Discussing the course’s material with others in order to understand it better.
Helping a classmate identify a bug in his or her code in person or online, as by viewing, compiling, or running his or her code, even on your
own computer.
Incorporating a few lines of code that you find online or elsewhere into your own code, provided that those lines are not themselves
solutions to assigned problems and that you cite the lines’ origins.
Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug.
Sharing a few lines of your own code online so that others might help you identify and fix a bug.
Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not
for outright solutions to problem sets’ problems or your own final project.
Whiteboarding solutions to problem sets with others using diagrams or pseudocode but not actual code.
Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you.
Not Reasonable
Accessing a solution to some problem prior to (re-)submitting your own.
Asking a classmate to see his or her solution to a problem set’s problem before (re-)submitting your own.
Decompiling, deobfuscating, or disassembling the staff’s solutions to problem sets.
Failing to cite (as with comments) the origins of code or techniques that you discover outside of the course’s own lessons and integrate
into your own work, even while respecting this policy’s other constraints.
Giving or showing to a classmate a solution to a problem set’s problem when it is he or she, and not you, who is struggling to solve it.
Paying or offering to pay an individual for work that you may submit as (part of) your own.
Searching for or soliciting outright solutions to problem sets online or elsewhere.
Splitting a problem set’s workload with another individual and combining your work.
Submitting (after possibly modifying) the work of another individual beyond the few lines allowed herein.
Submitting the same or similar work to this course that you have submitted or will submit to another.
Viewing another’s solution to a problem set’s problem and basing your own solution on it.
This is CS50x
OpenCourseWare
Lecture 0
Welcome
What is computer science?
Binary
Representing data
Algorithms
Pseudocode
Scratch
Welcome
When David was a first-year, he was too intimidated to take any computer science courses. By the time he was a sophomore, he found the
courage to take the equivalent of CS50, but only pass/fail.
In fact, two-thirds of CS50 students have never taken a CS course before.
And importantly, too:
what ultimately matters in this course is not so much where you end up relative to your classmates but where you end up relative to
yourself when you began
We need a way to represent inputs, such that we can store and work with information in a standard way.
Binary
A computer, at the lowest level, stores data in binary, a numeral system in which there are just two digits, 0 and 1.
When we first learned to count, we might have used one finger to represent one thing. That system is called unary. When we learned to
write numbers with the digits 0 through 9, we learned to use decimal.
For example, we know the following represents one hundred and twenty-three.
1 2 3
The 3 is in the ones column, the 2 is in the tens column, and the 1 is in the hundreds column.
So 123 is 100×1 + 10×2 + 1×3 = 100 + 20 + 3 = 123.
Each place for a digit represents a power of ten, since there are ten possible digits for each place.
In binary, with just two digits, we have powers of two for each place value:
4 2 1
0 0 0
Now if we change the binary value to, say, 0 1 1 , the decimal value would be 3.
4 2 1
0 1 1
8 4 2 1
1 0 0 0
And binary makes sense for computers because we power them with electricity, which can be either on or off, so each bit only needs to be
on or off. In a computer, there are millions or billions of switches called transistors that can store electricity and represent a bit by being
“on” or “off”.
With enough bits, or binary digits, computers can count to any number.
8 bits make up one byte.
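To make the place-value idea concrete, here is a minimal sketch in C, a language we meet in the next lecture. The function name binary_to_decimal is our own invention for illustration, not something from the course:

```c
#include <string.h>

// Hypothetical helper: converts a string of '0' and '1' characters,
// such as "011", to the number it represents.
int binary_to_decimal(const char *bits)
{
    int value = 0;
    for (size_t i = 0; i < strlen(bits); i++)
    {
        // Moving one place to the left doubles the value so far,
        // just as moving one place left in decimal multiplies by ten.
        value = value * 2 + (bits[i] - '0');
    }
    return value;
}
```

With this sketch, binary_to_decimal("011") gives 3 and binary_to_decimal("1000") gives 8, matching the place values shown above.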
Representing data
To represent letters, all we need to do is decide how numbers map to letters. Some humans, many years ago, collectively decided on a
standard mapping called ASCII (https://en.wikipedia.org/wiki/ASCII). The letter “A”, for example, is the number 65, and “B” is 66, and so on.
The mapping also includes punctuation and other symbols. Other characters, like letters with accent marks, and emoji, are part of a
standard called Unicode (https://en.wikipedia.org/wiki/Unicode) that uses more bits than ASCII to accommodate all these characters.
When we receive an emoji, our computer is actually just receiving a decimal number like 128514 ( 11111011000000010 in binary, if
you can read that more easily) that it then maps to the image of the emoji.
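As a rough illustration of the letter-to-number mapping, the following C sketch converts between characters and their numbers. The names code_for and char_for are hypothetical, and the values in the comments assume a system that uses ASCII:

```c
// Returns the number a computer stores for a character.
// On a system that uses ASCII (an assumption, though a very common one),
// 'A' is 65 and 'B' is 66.
int code_for(char c)
{
    return (int) c;
}

// Returns the character stored as a given number, the reverse mapping.
char char_for(int code)
{
    return (char) code;
}
```

The emoji's number, 128514, does not fit in the single byte that a char typically occupies, which is one way to see why Unicode needs more bits than ASCII.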
An image, too, is comprised of many smaller square dots, or pixels, each of which can be represented in binary with a system called RGB,
with values for red, green, and blue light in each pixel. By mixing together different amounts of each color, we can represent millions of
colors:
The red, green, and blue values are combined to get a light yellow color:
And computer programs know, based on the context of their code, whether the binary numbers should be interpreted as numbers, or letters,
or pixels.
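One way to see where "millions of colors" comes from is to pack the three RGB values of a pixel into a single number. This sketch assumes each value takes one byte (0 to 255), which the notes do not state explicitly, and pack_rgb is a hypothetical name:

```c
// Hypothetical helper: packs red, green, and blue values (each assumed to be
// one byte, 0 to 255) into a single 24-bit number.
unsigned long pack_rgb(unsigned char red, unsigned char green, unsigned char blue)
{
    return ((unsigned long) red << 16) | ((unsigned long) green << 8) | (unsigned long) blue;
}
```

Under that assumption the largest packed value is 16777215, so a pixel can take one of 16,777,216 different colors.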
And videos are just many, many images displayed one after another, at some number of frames per second. Music, too, can be represented
by the notes being played, their duration, and their volume.
Algorithms
So now we can represent inputs and outputs. The black box earlier will contain algorithms, step-by-step instructions for solving a problem:
Our first solution, one page at a time, is like the red line: our time to solve increases linearly as the size of the problem increases.
The second solution, two pages at a time, is like the yellow line: our slope is less steep, but still linear.
Our final solution is like the green line: logarithmic, since our time to solve rises more and more slowly as the size of the problem
increases. In other words, if the phone book went from 1000 to 2000 pages, we would need one more step to find Mike. If the size
doubled again from 2000 to 4000 pages, we would still only need one more step.
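The "one more step" claim can be checked with a small C sketch that counts how many times a phone book can be halved. The function halving_steps is hypothetical and only counts steps; it does not search for anything:

```c
// Counts how many times a phone book of `pages` pages can be torn in half
// before a single page remains.
int halving_steps(int pages)
{
    int steps = 0;
    while (pages > 1)
    {
        pages = pages / 2;
        steps++;
    }
    return steps;
}
```

Here halving_steps(1000) is 9, halving_steps(2000) is 10, and halving_steps(4000) is 11: each doubling of the problem adds just one step.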
Pseudocode
We can write pseudocode, an informal syntax that is just a more specific version of English (or other human language) that represents our
algorithm:
Some of these lines start with verbs, or actions. We’ll start calling these functions:
We also have branches that lead to different paths, like forks in the road, which we’ll call conditions:
And the questions that decide where we go are called Boolean expressions, which eventually result in a value of true or false:
Finally, we have words that lead to cycles, where we can repeat parts of our program, called loops:
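Putting these pieces together, the phone book search might look like the following in C, the language of the next lecture. This is only a sketch under our own assumptions: the phone book is a sorted list of names, and find_name is a hypothetical function, not the course's code:

```c
#include <string.h>

// Searches a sorted list of names by repeatedly looking at the middle,
// like opening a phone book to the middle page.
// Returns the index of `target`, or -1 if it is not in the list.
int find_name(const char *names[], int count, const char *target)
{
    int low = 0;
    int high = count - 1;

    // Loop: repeat while there are still pages left to check
    while (low <= high)
    {
        int middle = low + (high - low) / 2;

        // Function: strcmp compares two pieces of text alphabetically
        int comparison = strcmp(target, names[middle]);

        // Conditions, each deciding on a Boolean expression
        if (comparison == 0)
        {
            return middle;
        }
        else if (comparison < 0)
        {
            // The name must be in the left half
            high = middle - 1;
        }
        else
        {
            // The name must be in the right half
            low = middle + 1;
        }
    }
    return -1;
}
```

Each pass through the loop discards half of the remaining names, which is what makes the approach logarithmic.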
Scratch
We can write programs with the building blocks we just discovered:
functions
conditions
Boolean expressions
loops
We’ll use a graphical programming language called Scratch (https://scratch.mit.edu/), where we’ll drag and drop blocks that contain
instructions.
Later in our course, we’ll move on to textual programming languages like C, Python, and JavaScript. All of these languages, including
Scratch, have more powerful features like:
variables
the ability to store values and change them
threads
the ability for our program to do multiple things at once
events
the ability to respond to changes in our program or inputs
…
The programming environment for Scratch looks like this:
On the left, we have puzzle pieces that represent functions or variables, or other concepts, that we can drag and drop into our
instruction area in the center.
On the right, we have a stage that will be shown by our program to a human, where we can add or change backgrounds, characters
(called sprites in Scratch), and more.
The “when green flag clicked” block is the start of our program, and below it we’ve snapped in a “say” block and typed in “hello,
world”.
We can also drag in the “ask and wait” block, with a question like “What’s your name?”, and combine it with a “say” block for the answer:
But we didn’t wait after we said “Hello” with the first block, so we can use the “say () for () seconds” block:
We can use the “join” block to combine two phrases so Scratch can say “hello, David”:
The “ask” block, too, takes in an input (the question we want to ask), and produces the output of the “answer” block:
We can then use the “answer” block along with our own text, “hello, “, as two inputs to the join algorithm …
… which we pass as input again to the “say” block:
We can try to make Scratch (the name of the cat) say meow:
But when we click the green flag, we hear the meow sound over and over immediately. Our first bug, or mistake! We can add a block
to wait, so the meows sound more normal.
We can have Scratch point towards the mouse and move towards it:
We’ll look at a sheep that can count:
Here, counter is a variable, the value of which we can set, use, and change.
We can also have Scratch meow if we touch it with the mouse pointer:
Here, we have two different branches, or conditions, that will repeat forever. If the mouse is touching it, Scratch will “roar”, otherwise
it will just meow.
We can make Scratch move back and forth on the screen with a few more blocks we can discover by looking around:
We look at another program, bark, where we can use the space bar to mute a sea lion:
We have a variable, muted , that’s false by default. And our program will constantly check if the space bar is pressed, and set muted
to false if it’s true , or true if not. This way, we can toggle whether the sound plays or not, since our other set of blocks for the
sea lion check the muted variable:
With multiple sprites, or characters, we can have different sets of blocks for each of them:
For one puppet, we have these blocks that say “Marco!”, and then a “broadcast event” block. This “event” is used for our two sprites to
communicate with each other, like sending a secret message. So our other puppet can just wait for this event to say “Polo!”:
Now that we know some basics, we can think about the design, or quality of our programs. For example, we might want to have Scratch
cough three times by repeating some blocks:
The next step is abstracting away some of our code into a function, or making it reusable in different ways. We can make a block called
“cough” and put some blocks inside it:
Now, all of our sprites can use the same “cough” block, in as many places as we’d like.
We can even put a number of times into our cough function, so we only need a single block to cough any number of times:
We look at some examples and discuss how we might implement components of them with different sprites that follow the mouse cursor,
or cause something else to happen on the stage.
Welcome aboard!
Lecture 1
C
hello, world
Compilers
String
Scratch blocks in C
Types, formats, operators
More examples
Screens
Memory, imprecision, and overflow
C
Today we’ll learn a new language, C: a programming language that has all the features of Scratch and more, but perhaps a little less
friendly since it’s purely in text:
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
Though the words are new, the ideas are exactly the same as the “when green flag clicked” and “say (hello, world)” blocks in Scratch:
Though this code looks cryptic, don’t forget that 2/3 of CS50 students have never taken CS before, so don’t be daunted! And though at first, to borrow a
phrase from MIT, trying to absorb all these new concepts may feel like drinking from a fire hose, be assured that by the end of the
semester we’ll be empowered by and experienced at learning and applying these concepts.
We can compare a lot of the constructs in C to blocks we’ve already seen and used in Scratch. The syntax is far less important than the
principles, which we’ve already been introduced to.
hello, world
The “when green flag clicked” block in Scratch starts the main program; clicking the green flag causes the right set of blocks underneath
to start. In C, the first line for the same is int main(void) , which we’ll learn more about over the coming weeks, followed by an open
curly brace { , and a closed curly brace } , wrapping everything that should be in our program.
int main(void)
{
}
The “say (hello, world)” block is a function, and maps to printf("hello, world"); . In C, the function to print something to the screen is
printf , where f stands for “format”, meaning we can format the printed string in different ways. Then, we use parentheses to pass in
what we want to print. We have to use double quotes to surround our text so it’s understood as text, and finally, we add a semicolon ; to
end this line of code.
To make our program work, we also need another line at the top, a header line #include <stdio.h> that defines the printf function that
we want to use. Somewhere there is a file on our computer, stdio.h , that includes the code that allows us to access the printf function,
and the #include line tells the computer to include that file with our program.
To write our first program in Scratch, we opened Scratch’s website. Similarly, we’ll use the CS50 Sandbox (https://sandbox.cs50.io/) to start
writing and running code the same way. The CS50 Sandbox is a virtual, cloud-based environment with the libraries and tools already
installed for writing programs in various languages. At the top, there is a simple code editor, where we can type text. Below, we have a
terminal window, into which we can type commands:
We’ll type our code from earlier into the top, after using the + sign to create a new file called hello.c :
We end our program’s file with .c by convention, to indicate that it’s intended as a C program. Notice that our code is colorized, so that
certain things are more visible.
Compilers
Once we save the code that we wrote, which is called source code, we need to convert it to machine code, binary instructions that the
computer understands directly.
We use a program called a compiler to compile our source code into machine code.
To do this, we use the Terminal panel, which has a command prompt. The $ at the left is a prompt, after which we can type commands.
We type clang hello.c (where clang stands for “C languages”, a compiler written by a group of people). But before we press enter,
we click the folder icon on the top left of CS50 Sandbox. We see our file, hello.c . So we press enter in the terminal window, and see
that we have another file now, called a.out (short for “assembler output”). Inside that file is the code for our program, in binary. Now,
we can type ./a.out in the terminal prompt to run the program a.out in our current folder. We just wrote, compiled, and ran our
first program!
String
But after we run our program, we see hello, world$ , with the new prompt on the same line as our output. It turns out that we need to
specify precisely that we need a new line after our program, so we can update our code to include a special newline character, \n :
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
Now we need to remember to recompile our program with clang hello.c before we can run this new version.
Line 2 of our program is intentionally blank since we want to start a new section of code, much like starting new paragraphs in essays. It’s
not strictly necessary for our program to run correctly, but it helps humans read longer programs more easily.
We can change the name of our program from a.out to something else, too. We can pass command-line arguments, or additional options,
to programs in the terminal, depending on what the program is written to understand. For example, we can type clang -o hello
hello.c , and -o hello is telling the program clang to save the compiled output as just hello . Then, we can just run ./hello .
In our command prompt, we can run other commands, like ls (list), which shows the files in our current folder:
$ ls
a.out* hello* hello.c
The asterisk, * , indicates that those files are executable, or that they can be run by our computer.
We can use the rm (remove) command to delete a file:
$ rm a.out
rm: remove regular file 'a.out'?
We can type y or yes to confirm, and use ls again to see that it’s indeed gone forever.
Now, let’s try to get input from the user, as we did in Scratch when we wanted to say “hello, David”:
First, we need a string, or piece of text (specifically, zero or more characters in a sequence in double quotes, like "" , "ba" , or
"bananas" ), that we can ask the user for, with the function get_string . We pass the prompt, or what we want to ask the user, to the
function with "What is your name?\n" inside the parentheses. On the left, we want to create a variable, answer , the value of which
will be what the user enters. (The equals sign = is setting the value from right to left.) Finally, the type of variable that we want is
string , so we specify that to the left of answer .
Next, inside the printf function, we want the value of answer in what we print back out. We use a placeholder for our string
variable, %s , inside the phrase we want to print, like "hello, %s\n" , and then we give printf another argument, or option, to tell
it that we want the variable answer to be substituted.
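To see the %s placeholder at work without needing the CS50 library, here is a small sketch using only the standard library. The helper format_greeting is hypothetical; it uses snprintf, which formats text the same way printf does but stores the result in a buffer instead of printing it:

```c
#include <stdio.h>

// Hypothetical helper: builds the greeting that printf would print,
// storing it in `buffer` so that it can be inspected.
// The %s placeholder is replaced by the value of `answer`.
void format_greeting(char *buffer, size_t size, const char *answer)
{
    snprintf(buffer, size, "hello, %s\n", answer);
}
```

Passing "David" as answer fills the buffer with hello, David followed by a newline, just as the Scratch “join” block combined the two phrases.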
If we made a mistake, like writing printf("hello, world"\n); with the \n outside of the double quotes for our string, we’ll see an error
from our compiler:
The first line of the error tells us to look at hello.c , line 5, column 26, where the compiler expected a closing parenthesis, instead
of a backslash.
To simplify things (at least for the beginning), we’ll include a library, or set of code, from CS50. The library provides us with the string
variable type, the get_string function, and more. We just have to write a line at the top to include the file cs50.h :
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    string name = get_string("What's your name?\n");
    printf("hello, name\n");
}
#include <stdio.h>

int main(void)
{
    string name = get_string("What's your name?\n");
    printf("hello, %s\n", name);
}
Now, if we try to compile that code, we get a lot of lines of errors. Sometimes, one mistake means that the compiler then starts
interpreting correct code incorrectly, generating more errors than there actually are. So we start with our first error:
We didn’t mean stdin (“standard in”) instead of string , so that error message wasn’t helpful. In fact, we need to import another file
that defines the type string (actually a training wheel from CS50, as we’ll find out in the coming weeks).
So we can include another file, cs50.h , which also includes the function get_string , among others.
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    string name = get_string("What's your name?\n");
    printf("hello, %s\n", name);
}
Now, when we try to compile our program, we have just one error:
It turns out that we also have to tell our compiler to add our special CS50 library file, with clang -o string string.c -lcs50 , with
-l for “link”.
We can even abstract this away and just type make string . We see that, by default in the CS50 Sandbox, make uses clang to compile
our code from string.c into string , with all the necessary arguments, or flags, passed in.
Scratch blocks in C
The “set [counter] to (0)” block is creating a variable, and in C we would write int counter = 0; , where int specifies that the type of our
variable is an integer:
“change [counter] by (1)” is counter = counter + 1; in C. (In C, the = isn’t like an equals sign in an equation, where we are saying
counter is the same as counter + 1 . Instead, = is an assignment operator that means, “copy the value on the right, into the value on
the left”.) And notice we don’t need to say int anymore, since we presume that we already specified previously that counter is an int ,
with some existing value. We can also say counter += 1; or counter++; both of which are “syntactic sugar”, or shortcuts that have the
same effect with fewer characters to type.
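A quick way to convince ourselves that the three spellings are equivalent is to apply each one in turn. The function name count_three_ways below is our own, for illustration only:

```c
// Applies all three spellings of "add 1" in turn, starting from 0.
int count_three_ways(void)
{
    int counter = 0;        // set [counter] to (0)
    counter = counter + 1;  // change [counter] by (1)
    counter += 1;           // same effect, fewer characters
    counter++;              // same effect, fewer characters still
    return counter;
}
```

Since each line adds exactly 1, the function returns 3.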
Notice that in C, we use { and } (as well as indentation) to indicate how lines of code should be nested.
We can also have if-else conditions:
if (x < y)
{
    printf("x is less than y\n");
}
else
{
    printf("x is not less than y\n");
}
Notice that lines of code that themselves are not some action ( if... , and the braces) don’t end in a semicolon.
And even else if :
if (x < y)
{
    printf("x is less than y\n");
}
else if (x > y)
{
    printf("x is greater than y\n");
}
else if (x == y)
{
    printf("x is equal to y\n");
}
A loop that repeats forever can be written with while :
while (true)
{
    printf("hello, world\n");
}
The while keyword also requires a condition, so we use true as the Boolean expression to ensure that our loop will run forever. Our
program will check whether the expression evaluates to true (which it always will in this case), and then run the lines inside the
curly braces. Then it will repeat that until the expression isn’t true anymore (which won’t change in this case).
We could do something a certain number of times with while :
int i = 0;
while (i < 50)
{
    printf("hello, world\n");
    i++;
}
We create a variable, i , and set it to 0. Then, while i < 50 , we run some lines of code, and we add 1 to i after each run.
The curly braces around the two lines inside the while loop indicate that those lines will repeat, and we can add additional lines to
our program afterward if we wanted to.
To do the same repetition, more commonly we can use the for keyword:
Again, first we create a variable named i and set it to 0. Then, we check that i < 50 every time we reach the top of the loop, before
we run any of the code inside. If that expression is true, then we run the code inside. Finally, after we run the code inside, we use i++
to add one to i , and the loop repeats.
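A sketch of such a for loop follows. It is wrapped in a hypothetical function, say_hello, so that the number of repetitions can vary, and it keeps a count of the lines printed purely for illustration:

```c
#include <stdio.h>

// Prints "hello, world" `times` times and returns how many lines were printed.
int say_hello(int times)
{
    int printed = 0;

    // Create i, check i < times before each pass, and add 1 to i after each pass
    for (int i = 0; i < times; i++)
    {
        printf("hello, world\n");
        printed++;
    }
    return printed;
}
```

Calling say_hello(50) behaves like the while loop above, with the three parts of the loop gathered onto a single line.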
More examples
For each of these examples, you can click on the sandbox links to run and edit your own copies of them.
In int.c , we get and print an integer:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int age = get_int("What's your age?\n");
    int days = age * 365;
    printf("You are at least %i days old.\n", days);
}
Though, once a line is too long or complicated, it may be better to keep two or even three lines for readability.
In float.c , we can get decimal numbers (called floating-point values in computers, because the decimal point can “float” between the
digits, depending on the number):
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    float price = get_float("What's the price?\n");
    printf("Your total is %f.\n", price * 1.0625);
}
Now, if we compile and run our program, we’ll see a price printed out with tax.
We can specify the number of digits printed after the decimal with a placeholder like %.2f for two digits after the decimal point.
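As a rough illustration of the %.2f placeholder, the sketch below formats a total into a buffer with snprintf, which accepts the same placeholders as printf. The helper name format_total is hypothetical:

```c
#include <stdio.h>

// Hypothetical helper: writes the price with 6.25% tax added into `buffer`,
// showing two digits after the decimal point.
void format_total(char *buffer, size_t size, float price)
{
    snprintf(buffer, size, "Your total is %.2f.", price * 1.0625);
}
```

For a price of 100, the buffer holds "Your total is 106.25." whereas a plain %f would show six digits after the decimal point, 106.250000.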
With parity.c , we can check if a number is even or odd:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int n = get_int("n: ");

    if (n % 2 == 0)
    {
        printf("even\n");
    }
    else
    {
        printf("odd\n");
    }
}
With the % (modulo) operator, we can get the remainder of n after it’s divided by 2. If the remainder is 0, we know that n is even.
Otherwise, we know n is odd.
And functions like get_int from the CS50 library do error-checking, where only inputs from the user that match the type we want
are accepted.
In conditions.c , we turn the condition snippets from before into a program:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Prompt user for x
    int x = get_int("x: ");

    // Prompt user for y
    int y = get_int("y: ");

    // Compare x and y
    if (x < y)
    {
        printf("x is less than y\n");
    }
    else if (x > y)
    {
        printf("x is greater than y\n");
    }
    else
    {
        printf("x is equal to y\n");
    }
}
Lines that start with // are comments, or notes for humans that the compiler will ignore.
For David to compile and run this program in his sandbox, he first needed to run cd src1 in the terminal. This changes the directory,
or folder, to the one in which he saved all of the lecture’s source files. Then, he could run make conditions and ./conditions . With
pwd , he can see that he’s in a src1 folder (inside other folders). And cd by itself, with no arguments, will take us back to our
default folder in the sandbox.
// Logical operators

#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Prompt user to agree
    char c = get_char("Do you agree?\n");
We use two vertical bars, || , to indicate a logical “or”, where either expression can be true for the condition to be followed.
And if none of the expressions are true, nothing will happen since our program doesn’t have a loop.
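The listing above stops before the comparison itself, so here is a sketch of how || might be used on the character. The function interpret_answer and the exact letters it accepts (Y, y, N, n) are assumptions made for illustration, not taken from the lecture's source:

```c
// Hypothetical helper: returns 1 if the user agreed, 0 if the user declined,
// and -1 for any other character.
int interpret_answer(char c)
{
    // Either an uppercase or a lowercase letter satisfies the condition
    if (c == 'Y' || c == 'y')
    {
        return 1;
    }
    else if (c == 'N' || c == 'n')
    {
        return 0;
    }

    // None of the expressions were true
    return -1;
}
```

Note the single quotes: in C, a single character like 'Y' is written with single quotes, while strings use double quotes.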
Let’s implement the coughing program from week 0:
#include <stdio.h>

int main(void)
{
    printf("cough\n");
    printf("cough\n");
    printf("cough\n");
}
We can avoid repeating the same line by using a for loop:
#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 3; i++)
    {
        printf("cough\n");
    }
}
By convention, programmers tend to start counting at 0, and so i will have the values of 0 , 1 , and 2 before stopping, for a total
of three iterations. We could also write for (int i = 1; i <= 3; i++) for the same final effect.
We can move the printf line to its own function:
#include <stdio.h>

void cough(void);

int main(void)
{
    for (int i = 0; i < 3; i++)
    {
        cough();
    }
}

void cough(void)
{
    printf("cough\n");
}
We declared a new function with void cough(void); , before our main function calls it. The C compiler reads our code from top to
bottom, so we need to tell it that the cough function exists, before we use it. Then, after our main function, we can implement the
cough function. This way, the compiler knows the function exists, and we can keep our main function close to the top.
And our cough function doesn’t take any inputs, so we have cough(void) .
#include <stdio.h>

// Declare cough before main uses it
void cough(int n);

int main(void)
{
    cough(3);
}

void cough(int n)
{
    for (int i = 0; i < n; i++)
    {
        printf("cough\n");
    }
}
Now, when we want to print “cough” any number of times, we can just call the same function. Notice that, with void cough(int n) ,
we indicate that the cough function takes as input an int , which we refer to as n . And inside cough , we use n in our for loop
to print “cough” the right number of times.
Let’s look at positive.c :
#include <cs50.h>
#include <stdio.h>
int get_positive_int(void);
int main(void)
{
int i = get_positive_int();
printf("%i\n", i);
}
The CS50 library doesn’t have a get_positive_int function, but we can write one ourselves. Our function int
get_positive_int(void) will prompt the user for an int and return that int , which our main function stores as i . In
get_positive_int , we initialize a variable, int n , without assigning a value to it yet. Then, we have a new construct, do ...
while , which does something first, then checks a condition, and repeats until the condition is no longer true.
Once the loop ends because we have an n that is not < 1 , we can return it with the return keyword. And back in our main
function, we can set int i to that value.
Screens
We might want a program that prints part of a screen from a video game like Super Mario Bros. In mario0.c , we have:
#include <stdio.h>
int main(void)
{
printf("????\n");
}
We can ask the user for a number of question marks, and then print them, with mario2.c :
#include <cs50.h>
#include <stdio.h>
int main(void)
{
int n;
do
{
n = get_int("Width: ");
}
while (n < 1);
for (int i = 0; i < n; i++)
{
printf("?");
}
printf("\n");
}
#include <cs50.h>
#include <stdio.h>
int main(void)
{
int n;
do
{
n = get_int("Size: ");
}
while (n < 1);
for (int i = 0; i < n; i++)
{
for (int j = 0; j < n; j++)
{
printf("#");
}
printf("\n");
}
}
Notice we have two nested loops, where the outer loop uses i to do everything inside n times, and the inner loop uses j , a
different variable, to do something n times for each of those times. In other words, the outer loop prints n “rows”, or lines, and the
inner loop prints n “columns”, or # characters, in each line.
Other examples not covered in lecture are available under “Source Code” for Week 1.
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// Prompt user for x
float x = get_float("x: ");
// Prompt user for y
float y = get_float("y: ");
// Perform division
printf("x / y = %.50f\n", x / y);
}
x: 1
y: 10
x / y = 0.10000000149011611938476562500000000000000000000000
It turns out that this is called floating-point imprecision, where we don’t have enough bits to store all possible values, so the
computer has to store the closest value it can to 1 divided by 10.
We can see a similar problem in overflow.c :
#include <stdio.h>
#include <unistd.h>
int main(void)
{
for (int i = 1; ; i *= 2)
{
printf("%i\n", i);
sleep(1);
}
}
In our for loop, we set i to 1 , and double it with *= 2 . (And we’ll keep doing this forever, so there’s no condition we check.)
We also use the sleep function from unistd.h to let our program pause each time.
Now, when we run this program, we see the number getting bigger and bigger, until:
1073741824
overflow.c:6:25: runtime error: signed integer overflow: 1073741824 * 2 cannot be represented in type 'int'
-2147483648
0
0
...
It turns out, our program recognized that a signed integer (an integer with a positive or negative sign) couldn’t store that next value,
and printed an error. Then, since it tried to double it anyways, i became a negative number, and then 0.
This problem is called integer overflow, where an integer can only be so big before it runs out of bits and “rolls over”. We can picture
adding 1 to 999 in decimal. The last digit becomes 0, we carry the 1 so the next digit becomes 0, and we get 1000. But if we only had
three digits, we would end up with 000 since there’s no place to put the final 1!
The Y2K problem arose because many programs stored the calendar year with just two digits, like 98 for 1998, and 99 for 1999. But when
the year 2000 approached, the programs would have stored 00, leading to confusion between the years 1900 and 2000.
A Boeing 787 airplane also had a bug where a counter in the generator overflowed after a certain number of days of continuous operation,
since the number of seconds it had been running could no longer be stored in that counter.
So, we’ve seen a few problems that can happen, but now understand why, and how to prevent them.
With this week’s problem set, we’ll use the CS50 Lab, built on top of the CS50 Sandbox, to write some programs with walkthroughs to
guide us.
This is CS50x
OpenCourseWare
Lecture 2
Compiling
Debugging
help50 and printf
debug50
check50 and style50
Data Types
Memory
Arrays
Strings
Command-line arguments
Readability
Encryption
Compiling
Last time, we learned to write our first program in C. We learned the syntax for the main function in our program, the printf function
for printing to the terminal, how to create strings with double quotes, and how to include stdio.h for the printf function.
Then, we compiled it with clang hello.c to be able to run ./a.out (the default name), and then clang -o hello hello.c (passing in a
command-line argument for the output’s name) to be able to run ./hello .
If we wanted to use CS50’s library, via #include <cs50.h> , for strings and the get_string function, we also have to add a flag: clang -o
hello hello.c -lcs50 . The -l flag links the cs50 file, which is already installed in the CS50 Sandbox, and includes prototypes, or
definitions of strings and get_string (among more) that our program can then refer to and use.
We write our source code in C, but need to compile it to machine code, in binary, before our computers can run it.
clang is the compiler, and make is a utility that helps us run clang without having to indicate all the options manually.
“Compiling” source code into machine code is actually made up of smaller steps:
preprocessing
compiling
assembling
linking
Preprocessing involves looking at lines that start with a # , like #include , before everything else. For example, #include <cs50.h> will
tell clang to look for that header file first, since it contains content that we want to include in our program. Then, clang will essentially
copy the contents of those header files into our program.
For example …
#include <cs50.h>
#include <stdio.h>
int main(void)
{
string name = get_string("Name: ");
printf("hello, %s\n", name);
}
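… will be preprocessed into something like:
string get_string(string prompt);
int printf(const char *format, ...);
int main(void)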
{
string name = get_string("Name: ");
printf("hello, %s\n", name);
}
Compiling takes our source code, in C, and converts it to assembly code, which looks like this:
...
main: # @main
.cfi_startproc
# BB#0:
pushq %rbp
.Ltmp0:
.cfi_def_cfa_offset 16
.Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp2:
.cfi_def_cfa_register %rbp
subq $16, %rsp
xorl %eax, %eax
movl %eax, %edi
movabsq $.L.str, %rsi
movb $0, %al
callq get_string
movabsq $.L.str.1, %rdi
movq %rax, -8(%rbp)
movq -8(%rbp), %rsi
movb $0, %al
callq printf
...
These instructions are lower-level and closer to the binary instructions that a computer’s CPU can directly understand. They
generally operate on bytes themselves, as opposed to abstractions like variable names.
The next step is to take the assembly code and translate it to instructions in binary by assembling it. The instructions in binary are called
machine code, which a computer’s CPU can run directly.
The last step is linking, where the contents of previously compiled libraries that we want to link, like cs50.c , are actually combined with
the binary of our program. So we end up with one binary file, a.out or hello , that is the compiled version of hello.c , cs50.c , and
printf.c .
Debugging
Bugs are mistakes in programs that we didn’t intend to make. And debugging is the process of finding and fixing bugs.
Let’s look at buggy0.c :
int main(void)
{
printf("hello, world\n");
}
We see an error (in red), when we try to make this program, that we are implicitly declaring library function 'printf' . We don’t
really understand this, so we can run help50 make buggy0 , which will tell us, at the end, that we might have forgotten to write
#include <stdio.h> , which contains printf .
We can try this again with buggy1.c :
#include <stdio.h>
int main(void)
{
string name = get_string("What's your name?\n");
printf("hello, %s\n", name);
}
We see a lot of errors, and even the first one doesn’t seem to make much sense. So we can again run help50 make buggy1 , which will
hint to us that we need cs50.h since string isn’t defined.
To clear the terminal window (so that we can see just the output of whatever we want to run next), we can press control + L , or type in
clear as a command to the terminal window.
Let’s look at buggy2.c :
#include <stdio.h>
int main(void)
{
for (int i = 0; i <= 10; i++)
{
printf("#\n");
}
}
Hmm, we intended to only see 10 # s, but there are 11. If we didn’t know what the problem was (since our program is compiling
without any errors, and we now have a logical error), we could add another print line to help us:
#include <stdio.h>
int main(void)
{
for (int i = 0; i <= 10; i++)
{
printf("i is now %i: ", i);
printf("#\n");
}
}
Now, we see that i started at 0 and continued until it was 10, but we should have it stop once it’s at 10, with i < 10 instead of i
<= 10 .
debug50
Today we’ll also take a look at CS50 IDE, which is like the CS50 Sandbox, but with more features. It is an online development environment,
with a code editor and a terminal window, but also tools for debugging and collaborating:
In the CS50 IDE, we’ll have another tool, debug50 , to help us debug programs.
We’ll open buggy2.c and try to make buggy2 . But we saved buggy2.c into a folder called src2 , so we need to run cd src2 to change
our directory to the right one. And CS50 IDE’s terminal will remind us what directory we’re in, with a prompt like ~/src/ $ . (The ~
indicates the default, or home directory.)
Instead of using printf , we can also debug our program interactively. We can add a breakpoint, or an indicator for a line of code where
the debugger should pause our program. For example, we can click to the left of line 5 of our code, and a red circle will appear:
Now, if we run debug50 ./buggy2 , we’ll see the debugger panel open on the right:
We see that the variable we made, i , is under the Local Variables section, and see that there’s a value of 0 .
Our breakpoint has paused our program after line 5, to just before line 7, since it’s the first line of code that can run. To continue, we have a
few controls in the debugger panel. The blue triangle will continue our program until we reach another breakpoint or the end of our
program. The curved arrow to its right will “step over” the line, running it and pausing our program again immediately after.
So, we’ll use the curved arrow to run the next line, and see what changes after. We’re at the printf line, and pressing the curved arrow
again, we see a single # printed to our terminal window. With another click of the arrow, we see the value of i on the right change to
1 . And we can keep clicking the arrow to watch our program run, one line at a time.
To exit the debugger, we can press control + C to stop the program.
We can save lots of time in the future by investing a little bit now to learn how to use debug50 !
Data Types
In C, we have different types of variables we can use for storing data:
bool 1 byte
char 1 byte
int 4 bytes
float 4 bytes
long 8 bytes
double 8 bytes
string ? bytes
Each of these types takes up a certain number of bytes per variable we create, and the sizes above are what the sandbox, IDE, and most
likely your computer uses for each type in C.
Memory
Inside our computers, we have chips called RAM, random-access memory, that stores data for short-term use. We might save a program or
file to our hard drive (or SSD) for long-term storage, but when we open it, it gets copied to RAM first. Though RAM is much smaller, and
temporary (until the power is turned off), it is much faster.
We can think of bytes, stored in RAM, as though they were in a grid:
Arrays
Let’s say we wanted to store three variables:
#include <stdio.h>
int main(void)
{
char c1 = 'H';
char c2 = 'I';
char c3 = '!';
printf("%c %c %c\n", c1, c2, c3);
}
Notice that we use single quotes to indicate a literal character, and double quotes for multiple characters together in a string.
We can compile and run this, to see H I ! .
And we know characters are just numbers, so if we change our string formatting to be printf("%i %i %i\n", c1, c2, c3); , we can see
the numeric values of each char printed: 72 73 33 .
We can explicitly convert, or cast, each character to an int before we use it, with (int) c1 , but our compiler can implicitly do that for
us.
And in memory, we might have three boxes, labeled c1 , c2 , and c3 somehow, each of which represents a byte of binary with the
values of each variable.
Let’s look at scores0.c :
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// Scores
int score1 = 72;
int score2 = 73;
int score3 = 33;
// Print average
printf("Average: %i\n", (score1 + score2 + score3) / 3);
}
We can print the average of three numbers, but now we need to make one variable for every score we want to include, and we can’t
easily use them later.
It turns out, in memory, we can store variables one after another, back-to-back. And in C, a list of variables stored, one after another in a
contiguous chunk of memory, is called an array.
For example, we can use int scores[3]; to declare an array of 3 integers.
And we can assign and use variables in an array with:
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// Scores
int scores[3];
scores[0] = 72;
scores[1] = 73;
scores[2] = 33;
// Print average
printf("Average: %i\n", (scores[0] + scores[1] + scores[2]) / 3);
}
Notice that arrays are zero-indexed, meaning that the first element, or value, has index 0.
And we repeated the value 3, representing the length of our array, in two different places. So we can use a constant, or fixed value, to
indicate it should always be the same in both places:
#include <cs50.h>
#include <stdio.h>
const int N = 3;
int main(void)
{
// Scores
int scores[N];
scores[0] = 72;
scores[1] = 73;
scores[2] = 33;
// Print average
printf("Average: %i\n", (scores[0] + scores[1] + scores[2]) / N);
}
We can use the const keyword to tell the compiler that the value of N should never be changed by our program. And by convention,
we’ll place our declaration of the variable outside of the main function and capitalize its name, which isn’t necessary for the compiler
but shows other humans that this variable is a constant and makes it easy to see from the start.
With an array, we can collect our scores in a loop, and access them later in a loop, too:
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// Get number of scores
int n = get_int("Scores: ");
// Get scores
int scores[n];
for (int i = 0; i < n; i++)
{
scores[i] = get_int("Score %i: ", i + 1);
}
// Print average
printf("Average: %.1f\n", average(n, scores));
}
First, we’ll ask the user for the number of scores they have, create an array with enough int s for the number of scores they have, and
use a loop to collect all the scores.
Then we’ll write a helper function, average , to return a float , or a decimal value. We’ll pass in the length and an array of int s
(which could be any size), and use another loop inside our helper function to add up the values into a sum. We use (float) to cast
both sum and length into floats, so the result we get from dividing the two is also a float.
Finally, when we print the result we get, we use %.1f to show just one place after the decimal.
In memory, our array is now stored like this, where each value takes up not one but four bytes:
Strings
Strings are actually just arrays of characters. If we had a string s , each character can be accessed with s[0] , s[1] , and so on.
And it turns out that a string ends with a special character, ‘\0’, or a byte with all bits set to 0. This character is called the null character, or
null terminating character. So we actually need four bytes to store our string “HI!”:
Now let’s see what four strings in an array might look like:
string names[4];
names[0] = "EMMA";
names[1] = "RODRIGO";
names[2] = "BRIAN";
names[3] = "DAVID";
printf("%s\n", names[0]);
printf("%c%c%c%c\n", names[0][0], names[0][1], names[0][2], names[0][3]);
We can print the first value in names as a string, or we can get the first string, and get each individual character in that string by
using [] again. (We can think of it as (names[0])[0] , though we don’t need the parentheses.)
And though we know that the first name had four characters, printf probably used a loop to look at each character in the string,
printing them one at a time until it reached the null character that marks the end of the string. And in fact, we can print names[0][4]
as an int with %i , and see a 0 being printed.
We can visualize each character with its own label in memory:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Input: ");
printf("Output: ");
for (int i = 0; i < strlen(s); i++)
{
printf("%c", s[i]);
}
printf("\n");
}
We can use the condition s[i] != '\0' , where we can check the current character and only print it if it’s not the null character.
We can also use the length of the string, but first, we need a new library, string.h , for strlen , which tells us the length of a string.
We can improve the design of our program. string0 was a bit inefficient, since we check the length of the string, after each character is
printed, in our condition. But since the length of the string doesn’t change, we can check the length of the string once:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Input: ");
printf("Output:\n");
for (int i = 0, n = strlen(s); i < n; i++)
{
printf("%c\n", s[i]);
}
}
Now, at the start of our loop, we initialize both an i and n variable, and remember the length of our string in n . Then, we can
check the values each time, without having to actually calculate the length of the string.
And we did need to use a little more memory for n , but this saves us some time with not having to check the length of the string
each time.
We can now combine what we’ve seen, to write a program that can capitalize letters:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Before: ");
printf("After: ");
for (int i = 0, n = strlen(s); i < n; i++)
{
if (s[i] >= 'a' && s[i] <= 'z')
{
printf("%c", s[i] - 32);
}
else
{
printf("%c", s[i]);
}
}
printf("\n");
}
First, we get a string s . Then, for each character in the string, if it’s lowercase (its value is between that of a and z ), we convert it
to uppercase. Otherwise, we just print it.
We can convert a lowercase letter to its uppercase equivalent, by subtracting the difference between their ASCII values. (We know that
lowercase letters have a higher ASCII value than uppercase letters, and the difference is conveniently the same between the same
letters, so we can subtract that difference to get an uppercase letter from a lowercase letter.)
We can use the man pages (https://man.cs50.io/), or programmer’s manual, to find library functions that we can use to accomplish the
same thing:
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Before: ");
printf("After: ");
for (int i = 0, n = strlen(s); i < n; i++)
{
printf("%c", toupper(s[i]));
}
printf("\n");
}
From searching the man pages, we see toupper() is a function, among others, from a library called ctype , that we can use.
Command-line arguments
We’ve used programs like make and clang , which take in extra words after their name in the command line. It turns out that programs of
our own can also take in command-line arguments.
In argv.c , we change what our main function looks like:
#include <cs50.h>
#include <stdio.h>
argc and argv are two variables that our main function will now get, when our program is run from the command line. argc is
the argument count, or number of arguments, and argv is an array of strings that are the arguments. And the first argument,
argv[0] , is the name of our program (the first word typed, like ./hello ). In this example, we check if we have two arguments, and
print out the second one if so.
For example, if we run ./argv David , we’ll get hello, David printed, since we typed in David as the second word in our command.
It turns out that we can indicate errors in our program by returning a value from our main function (as implied by the int before our
main function). By default, our main function returns 0 to indicate nothing went wrong, but we can write a program to return a
different value:
#include <cs50.h>
#include <stdio.h>
Readability
Now that we know how to work with strings in our programs, we can analyze paragraphs of text for their level of readability, based on
factors like how long and complicated the words and sentences are.
Encryption
If we wanted to send a message to someone, we might want to encrypt, or somehow scramble that message so that it would be hard for
others to read. The original message, or input to our algorithm, is called plaintext, and the encrypted message, or output, is called
ciphertext.
A message like HI! could be converted to ASCII, 72 73 33 . But anyone would be able to convert that back to letters.
An encryption algorithm generally requires another input, in addition to the plaintext. A key is needed, and sometimes it is simply a
number, that is kept secret. With the key, plaintext can be converted, via some algorithm, to ciphertext, and vice versa.
For example, if we wanted to send a message like I L O V E Y O U , we can first convert it to ASCII: 73 76 79 86 69 89 79 85 . Then,
we can encrypt it with a key of just 1 and a simple algorithm, where we just add the key to each value: 74 77 80 87 70 90 80 86 .
Then, someone converting that ASCII back to text will see J M P W F Z P V . To decrypt this, someone will need to know the key.
We’ll apply these concepts in our problem set!
Lecture 3
Searching
Big O
Linear search
Structs
Sorting
Selection sort
Recursion
Merge sort
Searching
Last time, we talked about memory in a computer, or RAM, and how our data can be stored as individual variables or as arrays of many
items, or elements.
We can think of an array with a number of items as a row of lockers, where a computer can only open one locker at a time to look at the
item inside.
For example, if we want to check whether a number is in an array, with an algorithm that takes in an array as input and produces a boolean
as a result, we might:
look in each locker, or at each element, one at a time, from the beginning to the end.
This is called linear search, where we move in a line, since our array isn’t sorted.
start in the middle and move left or right depending on what we’re looking for, if our array of items is sorted.
This is called binary search, since we can divide our problem in two with each step, like what David did with the phone book in
week 0.
We might write pseudocode for linear search with:
For i from 0 to n–1
If i'th element is 50
Return true
Return false
We can label each of n lockers from 0 to n–1 , and check each of them in order.
For binary search, our algorithm might look like:
If no items
Return false
If middle item is 50
Return true
Else if 50 < middle item
Search left half
Else if 50 > middle item
Search right half
Eventually, we won’t have any parts of the array left (if the item we want wasn’t in it), so we can return false .
Otherwise, we can search each half depending on the value of the middle item.
Big O
In week 0, we saw different types of algorithms and their running times:
The more formal way to describe this is with big O notation, which we can think of as “on the order of”. For example, if our algorithm is
linear search, it will take approximately O(n) steps, “on the order of n”. In fact, even an algorithm that looks at two items at a time and
takes n/2 steps has O(n). This is because, as n gets bigger and bigger, only the largest term, n, matters.
Similarly, a logarithmic running time is O(log n), no matter what the base is, since this is just an approximation of what happens when n is
very large.
There are some common running times:
O(n^2)
O(n log n)
O(n)
(linear search)
O(log n)
(binary search)
O(1)
Computer scientists might also use big Ω, big Omega notation, which is the lower bound of number of steps for our algorithm. (Big O is the
upper bound of number of steps, or the worst case, and typically what we care about more.) With linear search, for example, the worst case
is n steps, but the best case is 1 step since our item might happen to be the first item we check. The best case for binary search, too, is 1
since our item might be in the middle of the array.
And we have a similar set of the most common big Ω running times:
Ω(n^2)
Ω(n log n)
Ω(n)
(counting the number of items)
Ω(log n)
Ω(1)
(linear search, binary search)
Linear search
Let’s take a look at numbers.c :
#include <cs50.h>
#include <stdio.h>
int main(void)
{
// An array of numbers
int numbers[] = {4, 8, 15, 16, 23, 42};
// Search for 50
for (int i = 0; i < 6; i++)
{
if (numbers[i] == 50)
{
printf("Found\n");
return 0;
}
}
printf("Not found\n");
return 1;
}
Here we initialize an array with some values, and we check the items in the array one at a time, in order.
And in each case, depending on whether the value was found or not, we can return an exit code of either 0 (for success) or 1 (for
failure).
We can do the same for names:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
// An array of names
string names[] = {"EMMA", "RODRIGO", "BRIAN", "DAVID"};
We can’t compare strings directly, since they’re not a simple data type but rather an array of many characters, and we need to compare
them differently. Luckily, the string library has a strcmp function which compares strings for us and returns 0 if they’re the same,
so we can use that.
Let’s try to implement a phone book with the same ideas:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string names[] = {"EMMA", "RODRIGO", "BRIAN", "DAVID"};
string numbers[] = {"617-555-0100", "617-555-0101", "617-555-0102", "617-555-0103"};
We’ll use strings for phone numbers, since they might include formatting or be too long for a number.
Now, if the name at a certain index in the names array matches who we’re looking for, we’ll return the phone number in the numbers
array, at the same index. But that means we need to be particularly careful to make sure that each number corresponds to the name at
each index, especially if we add or remove names and numbers.
Structs
It turns out that we can make our own custom data types called structs:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
typedef struct
{
string name;
string number;
}
person;
int main(void)
{
person people[4];
people[0].name = "EMMA";
people[0].number = "617-555-0100";
people[1].name = "RODRIGO";
people[1].number = "617-555-0101";
people[2].name = "BRIAN";
people[2].number = "617-555-0102";
people[3].name = "DAVID";
people[3].number = "617-555-0103";
We can think of structs as containers, inside of which are multiple other data types.
Here, we create our own type with a struct called person , which will have a string called name and a string called number .
Then, we can create an array of these struct types and initialize the values inside each of them, using a new syntax, . , to access the
properties of each person .
In our loop, we can now be more certain that the number corresponds to the name since they are from the same person element.
Sorting
If our input is an unsorted list of numbers, there are many algorithms we could use to produce an output of a sorted list.
With eight volunteers on the stage with the following numbers, we might consider swapping pairs of numbers next to each other as a first
step.
Our volunteers start in the following random order:
6 3 8 5 2 7 4 1
We look at the first two numbers, and swap them so they are in order:
6 3 8 5 2 7 4 1
– –
3 6 8 5 2 7 4 1
The next pair, 6 and 8 , are in order, so we don’t need to swap them.
The next pair, 8 and 5 , need to be swapped:
3 6 8 5 2 7 4 1
– –
3 6 5 8 2 7 4 1
3 6 5 2 8 7 4 1
– –
3 6 5 2 7 8 4 1
– –
3 6 5 2 7 4 8 1
– –
3 6 5 2 7 4 1 8
Our list isn’t sorted yet, but we’re slightly closer to the solution because the biggest value, 8 , has been shifted all the way to the right.
We repeat this with another pass through the list:
3 6 5 2 7 4 1 8
– –
3 6 5 2 7 4 1 8
– –
3 5 6 2 7 4 1 8
– –
3 5 2 6 7 4 1 8
– –
3 5 2 6 7 4 1 8
– –
3 5 2 6 4 7 1 8
– –
3 5 2 6 4 1 7 8
Since we are comparing the i'th and i+1'th element, we only need to go up to n – 2 for i . Then, we swap the two elements if
they’re out of order.
And we can stop after we’ve made n – 1 passes, since we know the largest n–1 elements will have bubbled to the right.
We have n – 1 comparisons in the inner loop (for i from 0 to n – 2), and n – 1 passes, so we get (n – 1)^2 = n^2 – 2n + 1 steps total. But the largest factor, or dominant term, is n^2, as
n gets larger and larger, so we can say that bubble sort is O(n^2).
We’ve seen running times like the following, and so even though binary search is much faster than linear search, it might not be worth the
one-time cost of sorting the list first, unless we do lots of searches over time:
O(n^2)
bubble sort
O(n log n)
O(n)
linear search
O(log n)
binary search
O(1)
And Ω for bubble sort is still n^2, since we still check each pair of elements for n – 1 passes.
Selection sort
We can take another approach with the same set of numbers:
6 3 8 5 2 7 4 1
First, we’ll look at each number, and remember the smallest one we’ve seen. Then, we can swap it with the first number in our list, since
we know it’s the smallest:
6 3 8 5 2 7 4 1
– –
1 3 8 5 2 7 4 6
Now we know at least the first element of our list is in the right place, so we can look for the smallest element among the rest, and swap
it with the next unsorted element (now the second element):
1 3 8 5 2 7 4 6
– –
1 2 8 5 3 7 4 6
We can repeat this over and over, until we have a sorted list.
This algorithm is called selection sort, and we might write pseudocode like this:
With big O notation, we still have running time of O(n^2), since we were looking at roughly all n elements to find the smallest, and making n
passes to sort all the elements.
More formally, we can use some formulas to show that the biggest factor is indeed n^2:
n + (n – 1) + (n – 2) + ... + 1
n(n + 1)/2
(n^2 + n)/2
n^2/2 + n/2
O(n^2)
So it turns out that selection sort is fundamentally about the same as bubble sort in running time:
O(n^2)
bubble sort, selection sort
O(n log n)
O(n)
linear search
O(log n)
binary search
O(1)
The best case, Ω, is also n^2.
We can go back to bubble sort and change its algorithm to be something like this, which will allow us to stop early if all the elements are
sorted:
Now, we only need to look at each element once, so the best case is now Ω(n):
Ω(n^2)
selection sort
Ω(n log n)
Ω(n)
bubble sort
Ω(log n)
Ω(1)
linear search, binary search
We look at a visualization online comparing sorting algorithms (https://www.cs.usfca.edu/~galles/visualization/ComparisonSort.html) with
animations for how the elements move within arrays for both bubble sort and selection sort.
Recursion
Recall that in week 0, we had pseudocode for finding a name in a phone book, where we had lines telling us to “go back” and repeat some
steps:
We could instead just repeat our entire algorithm on the half of the book we have left:
This seems like a cyclical process that will never end, but we’re actually dividing the problem in half each time, and stopping once
there’s no more book left.
Recursion occurs when a function or algorithm refers to itself, as in the new pseudocode above.
In week 1, too, we implemented a “pyramid” of blocks in the following shape:
#
##
###
####
We could draw it with nested loops:
#include <cs50.h>
#include <stdio.h>

void draw(int h);

int main(void)
{
    // Get height of pyramid
    int height = get_int("Height: ");

    // Draw pyramid
    draw(height);
}

void draw(int h)
{
    // Draw pyramid of height h
    for (int i = 1; i <= h; i++)
    {
        for (int j = 1; j <= i; j++)
        {
            printf("#");
        }
        printf("\n");
    }
}
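But we can also write draw recursively:
#include <cs50.h>
#include <stdio.h>

void draw(int h);

int main(void)
{
    // Get height of pyramid
    int height = get_int("Height: ");

    // Draw pyramid
    draw(height);
}

void draw(int h)
{
    // If nothing to draw
    if (h == 0)
    {
        return;
    }

    // Draw pyramid of height h - 1
    draw(h - 1);

    // Draw one more row of width h
    for (int i = 0; i < h; i++)
    {
        printf("#");
    }
    printf("\n");
}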
Now, our draw function first calls itself recursively, drawing a pyramid of height h - 1. But even before that, we need to stop if h
is 0, since there won’t be anything left to draw.
After, we draw the next row, or a row of width h.
Merge sort
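We can take the idea of recursion to sorting, with another algorithm called merge sort. The pseudocode might look like:
If only one item
    Return
Else
    Sort left half of items
    Sort right half of items
    Merge sorted halves
Suppose we start with this unsorted list:
7 4 5 2 6 3 8 1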
First, we’ll sort the left half (the first four elements):
7 4 5 2 | 6 3 8 1
– – – –
Well, to sort that, we need to sort the left half of the left half first:
7 4 | 5 2 | 6 3 8 1
– –
Now, we have just one item, 7 , in the left half, and one item, 4 , in the right half. So we’ll merge that together, by taking the smallest
item from each list first:
– – | 5 2 | 6 3 8 1
4 7
And now we go back to the right half of the left half, and sort it:
– – | – – | 6 3 8 1
4 7 | 2 5
Now, both halves of the left half are sorted, so we can merge the two of them together. We look at the start of each list, and take 2 since
it’s smaller than 4 . Then, we take 4 , since it’s now the smallest item at the front of both lists. Then, we take 5 , and finally, 7 , to get:
– – – – | 6 3 8 1
– – – –
2 4 5 7
We now sort the right half the same way. First, the left half of the right half:
– – – – | – – | 8 1
– – – – | 3 6 |
2 4 5 7
Then, the right half of the right half:
– – – – | – – | – –
– – – – | 3 6 | 1 8
2 4 5 7
And we merge the two halves of the right half:
– – – – | – – – –
– – – – | – – – –
2 4 5 7 | 1 3 6 8
And finally, we can merge both halves of the whole list, following the same steps as before. Notice that we don’t need to check all the
elements of each half to find the smallest, since we know that each half is already sorted. Instead, we just take the smallest element of the
two at the start of each half:
– – – – | – – – –
– – – – | – – – –
2 4 5 7 | – 3 6 8
1
– – – – | – – – –
– – – – | – – – –
– 4 5 7 | – 3 6 8
1 2
– – – – | – – – –
– – – – | – – – –
– 4 5 7 | – – 6 8
1 2 3
– – – – | – – – –
– – – – | – – – –
– – 5 7 | – – 6 8
1 2 3 4
– – – – | – – – –
– – – – | – – – –
– – – 7 | – – 6 8
1 2 3 4 5
– – – – | – – – –
– – – – | – – – –
– – – 7 | – – – 8
1 2 3 4 5 6
– – – – | – – – –
– – – – | – – – –
– – – – | – – – 8
1 2 3 4 5 6 7
– – – – | – – – –
– – – – | – – – –
– – – – | – – – –
1 2 3 4 5 6 7 8
It took a lot of steps, but it actually took fewer steps than the other algorithms we’ve seen so far. We broke our list in half each time, until
we were “sorting” eight lists with one element each:
7 | 4 | 5 | 2 | 6 | 3 | 8 | 1
4 7 | 2 5 | 3 6 | 1 8
2 4 5 7 | 1 3 6 8
1 2 3 4 5 6 7 8
Since our algorithm divided the problem in half each time, there are O(log n) levels of halving. And at each level, after we sorted each half (or
half of a half), we needed to merge together all the elements, with n steps since we had to look at each element once.
So our total running time is O(n log n):
O(n^2)
    bubble sort, selection sort
O(n log n)
    merge sort
O(n)
    linear search
O(log n)
    binary search
O(1)
Since log n is greater than 1 but less than n, n log n is in between n (times 1) and n^2.
The best case, Ω, is still n log n, since we still sort each half first and then merge them together:
Ω(n^2)
    selection sort
Ω(n log n)
    merge sort
Ω(n)
    bubble sort
Ω(log n)
Ω(1)
    linear search, binary search
Finally, there is another notation, Θ, Theta, which we use to describe running times of algorithms if the upper bound and lower bound are
the same. For example, merge sort has Θ(n log n) since the best and worst case both require the same number of steps. And selection sort
has Θ(n^2).
We look at a final visualization (https://www.youtube.com/watch?v=ZZuD6iUe3Pc) of sorting algorithms with a larger number of inputs,
running at the same time.
Lecture 4
Hexadecimal
Pointers
string
Compare and copy
valgrind
Swap
Memory layout
get_int
Files
JPEG
Hexadecimal
In week 0, we learned binary, a counting system with 0s and 1s.
In week 2, we talked about memory and how each byte has an address, or identifier, so we can refer to where our variables are actually
stored.
It turns out that, by convention, the addresses for memory use the counting system hexadecimal, where there are 16 digits, 0-9 and A-F.
Recall that, in binary, each digit stood for a power of 2:
128 64 32 16 8 4 2 1
1 1 1 1 1 1 1 1
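With all 8 bits set to 1, that value is 255. In hexadecimal, each digit stands for a power of 16 instead, so the same value needs just two digits:
16^1 16^0
F F
Here, the F is a value of 15 in decimal, and each place is a power of 16, so the first F is 16^1 * 15 = 240, plus the second F with
the value of 16^0 * 15 = 15, for a total of 255.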
And 0A is the same as 10 in decimal, and 0F the same as 15. 10 in hexadecimal would be 16, and we would say it as “one zero in
hexadecimal” instead of “ten”, if we wanted to avoid confusion.
The RGB color system also conventionally uses hexadecimal to describe the amount of each color. For example, 000000 in hexadecimal
means 0 of each red, green, and blue, for a color of black. And FF0000 would be 255, or the highest possible, amount of red. With
different values for each color, we can represent millions of different colors.
In writing, we can also indicate a value is in hexadecimal by prefixing it with 0x , as in 0x10 , where the value is equal to 16 in decimal,
as opposed to 10.
Pointers
We might create a value n , and print it out:
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%i\n", n);
}
In our computer’s memory, there are now 4 bytes somewhere that have the binary value of 50, labeled n :
It turns out that, with the billions of bytes in memory, those bytes for the variable n start at some unique address that might look like
0x12345678 .
In C, we can actually see the address with the & operator, which means “get the address of this variable”:
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%p\n", &n);
}
And in the CS50 IDE, we might see an address like 0x7ffe00b3adbc , where this is a specific location in the server’s memory.
The address of a variable is called a pointer, which we can think of as a value that “points” to a location in memory. The * operator lets us
“go to” the location that a pointer is pointing to.
For example, we can print *&n , where we “go to” the address of n , and that will print out the value of n , 50 , since that’s the value at
the address of n :
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%i\n", *&n);
}
We also have to use the * operator (in an unfortunately confusing way) to declare a variable that we want to be a pointer:
#include <stdio.h>

int main(void)
{
    int n = 50;
    int *p = &n;
    printf("%p\n", p);
}
Here, we use int *p to declare a variable, p , that has the type int * , a pointer to a value of type int , an integer. Then, we can
print its value (something like 0x12345678 ), or print the value at its location with printf("%i\n", *p); .
In our computer’s memory, the variables might look like this:
Let’s say we have a mailbox labeled “123”, with the number “50” inside it. The mailbox would be int n , since it stores an integer. We
might have another mailbox with the address “456”, inside of which is the value “123”, which is the address of our other mailbox. This
would be int *p , since it’s a pointer to an integer.
With the ability to use pointers, we can create different data structures, or different ways to organize data in memory that we’ll see next
week.
Many modern computer systems are “64-bit”, meaning that they use 64 bits to address memory, so a pointer will be 8 bytes, twice as big as
an integer of 4 bytes.
string
We might have a variable string s for a name like EMMA , and be able to access each character with s[0] and so on:
But it turns out that each character is stored in memory at a byte with some address, and s is actually just a pointer with the address
of the first character:
And since s is just a pointer to the beginning, only the \0 indicates the end of the string.
In fact, the CS50 Library defines a string with typedef char *string , which just says that we want to name a new type, string , as a
char * , or a pointer to a character.
Let’s print out a string:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    string s = "EMMA";
    printf("%s\n", s);
}
This is the same as using char * directly, without the CS50 Library:
#include <stdio.h>

int main(void)
{
    char *s = "EMMA";
    printf("%s\n", s);
}
Compare and copy
Let’s look at compare0 , which compares two integers:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Get two integers
    int i = get_int("i: ");
    int j = get_int("j: ");

    // Compare integers
    if (i == j)
    {
        printf("Same\n");
    }
    else
    {
        printf("Different\n");
    }
}
We can compile and run this, and our program works as we’d expect, with the same values of the two integers giving us “Same” and
different values “Different”.
In compare1 , we see that the same string values are causing our program to print “Different”:
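#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Get two strings
    string s = get_string("s: ");
    string t = get_string("t: ");

    // Compare the two variables directly, which compares their addresses
    printf("%s\n", s == t ? "Same" : "Different");
}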
Given what we now know about strings, this makes sense because each “string” variable is pointing to a different location in memory,
where the first character of each string is stored. So even if the values of the strings are the same, this will always print “Different”.
For example, our first string might be at address 0x123, our second might be at 0x456, and s will be 0x123 and t will be 0x456 ,
so those values will be different.
And get_string , this whole time, has been returning just a char * , or a pointer to the first character of a string from the user.
Now let’s try to copy a string:
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    string s = get_string("s: ");
    string t = s;

    t[0] = toupper(t[0]);

    // Print both strings
    printf("s: %s\n", s);
    printf("t: %s\n", t);
}
We get a string s , and copy the value of s into t . Then, we capitalize the first letter in t .
But when we run our program, we see that both s and t are now capitalized.
Since we set s and t to the same values, they’re actually pointers to the same character, and so we capitalized the same character!
To actually make a copy of a string, we have to do a little more work:
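#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char *s = get_string("s: ");

    char *t = malloc(strlen(s) + 1);

    for (int i = 0, n = strlen(s); i < n + 1; i++)
    {
        t[i] = s[i];
    }

    t[0] = toupper(t[0]);

    printf("s: %s\n", s);
    printf("t: %s\n", t);
}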
We create a new variable, t , of the type char * , with char *t . Now, we want to point it to a new chunk of memory that’s large
enough to store the copy of the string. With malloc , we can allocate some number of bytes in memory (that aren’t already used to
store other values), and we pass in the number of bytes we’d like. We already know the length of s , so we add 1 to that for the
terminating null character. So, our final line of code is char *t = malloc(strlen(s) + 1); .
Then, we copy each character, one at a time, and now we can capitalize just the first letter of t . And we use i < n + 1 , since we
actually want to go up to n , to ensure we copy the terminating character in the string.
We can actually also use the strcpy library function with strcpy(t, s) instead of our loop, to copy the string s into t . To be
clear, the concept of a “string” is from the C language and well-supported; the only training wheels from CS50 are the type string
instead of char * , and the get_string function.
If we didn’t copy the null terminating character, \0 , and tried to print out our string t , printf will continue and print out the unknown,
or garbage, values that we have in memory, until it happens to reach a \0 , or crashes entirely, since our program might end up trying to
read memory that doesn’t belong to it!
valgrind
It turns out that, after we’re done with memory that we’ve allocated with malloc , we should call free (as in free(t) ), which tells our
computer that those bytes are no longer useful to our program, so those bytes in memory can be reused again.
If we kept running our program and allocating memory with malloc , but never freed the memory after we were done using it, we would
have a memory leak, which will slow down our computer and use up more and more memory until our computer runs out.
valgrind is a command-line tool that we can use to run our program and see if it has any memory leaks. We can run valgrind on our
program above with help50 valgrind ./copy and see, from the error message, that on line 10, we allocated memory that we never freed (or
“lost”).
So at the end, we can add a line free(t) , which won’t change how our program runs, but valgrind will no longer report any errors.
Let’s take a look at memory.c :
// http://valgrind.org/docs/manual/quick-start.html#quick-start.prepare

#include <stdlib.h>

void f(void)
{
    int *x = malloc(10 * sizeof(int));
    x[10] = 0;
}

int main(void)
{
    f();
    return 0;
}
This is an example from valgrind’s documentation (valgrind is a real tool, while help50 was written specifically to help us in this
course).
The function f allocates enough memory for 10 integers, and stores the address in a pointer called x . Then we try to set the 11th
value of x with x[10] to 0 , which goes past the array of memory we’ve allocated for our program. This is called buffer overflow,
where we go past the boundaries of our buffer, or array, and into unknown memory.
valgrind will also tell us there’s an “Invalid write of size 4” for line 8, where we are indeed trying to change the value of an integer (of size
4 bytes).
And this whole time, the CS50 Library has been freeing memory it’s allocated in get_string , when our program finishes!
Swap
We have two colored drinks, purple and green, each of which is in a cup. We want to swap the drinks between the two cups, but we can’t
do that without a third cup to pour one of the drinks into first.
Now, let’s say we wanted to swap the values of two integers.
With a third variable to use as temporary storage space, we can do this pretty easily, by putting a into tmp , and then b to a , and
finally the original value of a , now in tmp , into b .
But, if we tried to use that function in a program, we don’t see any changes:
#include <stdio.h>

int main(void)
{
    int x = 1;
    int y = 2;
It turns out that the swap function gets its own variables, a and b when they are passed in, that are copies of x and y , and so
changing those values doesn’t change x and y in the main function.
Memory layout
Within our computer’s memory, the different types of data that need to be stored for our program are organized into different sections:
The machine code section is our compiled program’s binary code. When we run our program, that code is loaded into the “top” of
memory.
Globals are global variables we declare in our program or other shared variables that our entire program can access.
The heap section is an empty area where malloc can get free memory from, for our program to use.
The stack section is used by functions in our program as they are called. For example, our main function is at the very bottom of the
stack, and has the local variables x and y . The swap function, when it’s called, has its own frame, or slice, of memory that’s on top
of main ’s, with the local variables a , b , and tmp :
Once the function swap returns, the memory it was using is freed for the next function call, and we lose anything we did, other
than the return values, and our program goes back to the function that called swap .
So by passing in the addresses of x and y from main to swap , we can actually change the values of x and y :
By passing in the address of x and y , our swap function can actually work:
#include <stdio.h>

int main(void)
{
    int x = 1;
    int y = 2;
The addresses of x and y are passed in from main to swap , and we use the int *a syntax to declare that our swap function
takes in pointers. We save the value of x to tmp by following the pointer a , and then take the value of y by following the pointer
b , and store that to the location a is pointing to ( x ). Finally, we store the value of tmp to the location pointed to by b ( y ), and
we’re done.
If we call malloc too many times, we will have a heap overflow, where we end up going past our heap. Or, if we have too many functions
being called, we will have a stack overflow, where our stack has too many frames of memory allocated as well. And these two types of
overflow are generally known as buffer overflows, after which our program (or entire computer) might crash.
get_int
We can implement get_int ourselves with a C library function, scanf :
#include <stdio.h>

int main(void)
{
    int x;
    printf("x: ");
    scanf("%i", &x);
    printf("x: %i\n", x);
}
scanf takes a format, %i , so the input is “scanned” for that format, and the address in memory where we want that input to go. But
scanf doesn’t have much error checking, so we might not get an integer.
We can try to get a string the same way:
#include <stdio.h>

int main(void)
{
    char *s = NULL;
    printf("s: ");
    scanf("%s", s);
    printf("s: %s\n", s);
}
But we haven’t actually allocated any memory for s ( s is NULL , or not pointing to anything), so we might want to declare char s[5]
to allocate an array of 5 characters for our string. Then, s will be treated as a pointer in scanf and printf .
Now, if the user types in a string of length 4 or less, our program will work safely. But if the user types in a longer string, scanf
might be trying to write past the end of our array into unknown memory, causing our program to crash.
Files
With the ability to use pointers, we can also open files:
#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    // Open file
    FILE *file = fopen("phonebook.csv", "a");

    // Close file
    fclose(file);
}
fopen is a new function we can use to open a file. It will return a pointer to a new type, FILE , that we can read from and write to.
The first argument is the name of the file, and the second argument is the mode we want to open the file in ( r for read, w for write,
and a for append, or adding to).
After we get some strings, we can use fprintf to print to a file.
Finally, we close the file with fclose .
Now we can create our own CSV files, files of comma-separated values (like a mini-spreadsheet), programmatically.
JPEG
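We can also write a program that opens a file and tells us if it’s a JPEG (image) file:
#include <stdio.h>

int main(int argc, char *argv[])
{
    // Check usage
    if (argc != 2)
    {
        return 1;
    }

    // Open file
    FILE *file = fopen(argv[1], "r");
    if (!file)
    {
        return 1;
    }

    // Read first three bytes
    unsigned char bytes[3];
    fread(bytes, 3, 1, file);

    // Close file
    fclose(file);
}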
Now, if we run this program with ./jpeg brian.jpg , our program will try to open the file we specify (checking that we indeed get a
non-NULL file back), and read the first three bytes from the file with fread .
We can compare the first three bytes (in hexadecimal) to the three bytes required to begin a JPEG file. If they’re the same, then our file
is likely to be a JPEG file (though, other types of files may still begin with those bytes). But if they’re not the same, we know it’s
definitely not a JPEG file.
We can use these abilities to read and write files, in particular images, and modify them by changing the bytes in them, in this week’s
problem set!
Lecture 5
Pointers
Resizing arrays
Data structures
Linked Lists
More data structures
Pointers
Last time, we learned about pointers, malloc , and other useful tools for working with memory.
Let’s review this snippet of code:
int main(void)
{
    int *x;
    int *y;

    x = malloc(sizeof(int));

    *x = 42;
    *y = 13;
}
Here, the first two lines of code in our main function are declaring two pointers, x and y . Then, we allocate enough memory for an
int with malloc , and store the address returned by malloc into x .
With *x = 42; , we go to the address pointed to by x , and store the value 42 into that location.
The final line, though, is buggy since we don’t know what the value of y is, since we never set a value for it. Instead, we can write:
y = x;
*y = 13;
And this will set y to point to the same location as x does, and then set that value to 13 .
We take a look at a short clip, Pointer Fun with Binky (https://www.youtube.com/watch?v=3uLKjb973HU), which also explains this snippet
in an animated way!
Resizing arrays
In week 2, we learned about arrays, where we could store the same kind of value in a list, side-by-side. But we need to declare the size of
arrays when we create them, and when we want to increase the size of the array, the memory surrounding it might be taken up by some
other data.
One solution might be to allocate more memory in a larger area that’s free, and move our array there, where it has more space. But we’ll
need to copy our array, which becomes an operation with running time of O(n), since we need to copy each of n elements in an array.
We might write a program like the following, to do this in code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    // Here, we allocate enough memory to fit three integers, and our variable
    // list will point to the first integer.
    int *list = malloc(3 * sizeof(int));
    // We should check that we allocated memory correctly, since malloc might
    // fail to get us enough free memory.
    if (list == NULL)
    {
        return 1;
    }

    // With this syntax, the compiler will do pointer arithmetic for us, and
    // calculate the byte in memory that list[0], list[1], and list[2] maps to,
    // since integers are 4 bytes large.
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;

    // Now, if we want to resize our array to fit 4 integers, we'll try to allocate
    // enough memory for them, and temporarily use tmp to point to the first:
    int *tmp = malloc(4 * sizeof(int));
    if (tmp == NULL)
    {
        // Free the original array before giving up, so we don't leak it
        free(list);
        return 1;
    }

    // Now, we copy integers from the old array into the new array ...
    for (int i = 0; i < 3; i++)
    {
        tmp[i] = list[i];
    }

    // We should free the original memory for list, which is why we need a
    // temporary variable to point to the new array ...
    free(list);

    // ... and now we can set our list variable to point to the new array that
    // tmp points to:
    list = tmp;

    // Now we can work with our newly resized array:
    list[3] = 4;

    // And finally, free the memory for the new array:
    free(list);
}
It turns out that there’s actually a helpful function, realloc , which will reallocate some memory:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return 1;
    }

    list[0] = 1;
    list[1] = 2;
    list[2] = 3;

    // Here, we give realloc our original array that list points to, and it will
    // return a new address for a new array, with the old data copied over:
    int *tmp = realloc(list, 4 * sizeof(int));
    if (tmp == NULL)
    {
        // If realloc fails, the original array is still allocated, so free it
        free(list);
        return 1;
    }

    // Now, all we need to do is remember the location of the new array:
    list = tmp;

    list[3] = 4;

    free(list);
}
Data structures
Data structures are programming constructs that allow us to store information in different layouts in our computer’s memory.
To build a data structure, we’ll need some tools we’ve seen:
struct to create custom data types
. to access properties in a structure
* to go to an address in memory pointed to by a pointer
Linked Lists
With a linked list, we can store a list of values that can easily be grown by storing values in different parts of memory:
This is different than an array since our values are no longer next to one another in memory.
We can link our list together by allocating, for each element, enough memory for both the value we want to store, and the address of the
next element:
By the way, NUL refers to \0 , a character that ends a string, and NULL refers to an address of all zeros, or a null pointer that we can
think of as pointing nowhere.
Unlike with arrays, we can no longer randomly access elements in a linked list. For example, we can no longer access the 5th element
of the list by calculating where it is, in constant time. (Since we know arrays store elements back-to-back, we can add 1, or 4, or the size of
our element, to calculate addresses.) Instead, we have to follow each element’s pointer, one at a time. And we need to allocate at least twice as
much memory as we needed before for each element, since each node stores a pointer as well as a value.
In code, we might create our own struct called node (like a node from a graph in mathematics), and we need to store both an int and a
pointer to the next node called next :
We start this struct with typedef struct node so that we can refer to a node inside our struct.
We can build a linked list in code starting with our struct. First, we’ll want to remember an empty list, so we can use the null pointer: node
*list = NULL; .
To add an element, first we’ll need to allocate some memory for a node, and set its values:
node *n = malloc(sizeof(node));

// We want to make sure malloc succeeded in getting memory for us:
if (n != NULL)
{
    // This is equivalent to (*n).number, where we first go to the node pointed
    // to by n, and then set the number property. In C, we can also use this
    // arrow notation:
    n->number = 2;
    // Then we need to store a pointer to the next node in our list, but the
    // new node won't point to anything (for now):
    n->next = NULL;
}
To add to the list, we’ll create a new node the same way, perhaps with the value 4. But now we need to update the pointer in our first node
to point to it.
Since our list pointer points only to the first node (and we can’t be sure that the list only has one node), we need to “follow the
breadcrumbs” and follow each node’s next pointer:
If we want to insert a node to the front of our linked list, we would need to carefully update our node to point to the one following it,
before updating list. Otherwise, we’ll lose the rest of our list:
// Here, we're inserting a node into the front of the list, so we want its
// next pointer to point to the original list, before pointing the list to
// n:
n->next = list;
list = n;
And to insert a node in the middle of our list, we can go through the list, following each element one at a time, comparing its values, and
changing the next pointers carefully as well.
With some volunteers on the stage, we simulate a list, with each volunteer acting as the list variable or a node. As we insert nodes into
the list, we need a temporary pointer to follow the list, and make sure we don’t lose any parts of our list. Our linked list only points to the
first node in our list, so we can only look at one node at a time, but we can dynamically allocate more memory as we need to grow our list.
Now, even if our linked list is sorted, the running time of searching it will be O(n), since we have to follow each node to check their values,
and we don’t know where the middle of our list will be.
We can combine all of our snippets of code into a complete program:
#include <stdio.h>
#include <stdlib.h>

// Represents a node
typedef struct node
{
    int number;
    struct node *next;
}
node;

int main(void)
{
    // List of size 0, initially not pointing to anything
    node *list = NULL;

    // Add a node to the front of the list, as in the snippets above
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return 1;
    }
    n->number = 2;
    n->next = list;
    list = n;

    // Print list
    // Here we can iterate over all the nodes in our list with a temporary
    // variable. First, we have a temporary pointer, tmp, that points to the
    // list. Then, our condition for continuing is that tmp is not NULL, and
    // finally, we update tmp to the next pointer of itself.
    for (node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        // Within the node, we'll just print the number stored:
        printf("%i\n", tmp->number);
    }

    // Free list
    // Since we're freeing each node as we go along, we'll use a while loop
    // and follow each node's next pointer before freeing it, but we'll see
    // this in more detail in Problem Set 5.
    while (list != NULL)
    {
        node *tmp = list->next;
        free(list);
        list = tmp;
    }
}
More data structures
A tree is another data structure, where each node can point to other nodes below it.
Notice that there are now two dimensions to this data structure, where some nodes are on different “levels” than others. And we can
imagine implementing this with a more complex version of a node in a linked list, where each node has not one but two pointers, one
to the value in the “middle of the left half” and one to the value in the “middle of the right half”. And all elements to the left of a node
are smaller, and all elements to the right are greater.
This is called a binary search tree because each node has at most two children, or nodes it is pointing to, and a search tree because
it’s sorted in a way that allows us to search correctly.
And like a linked list, we’ll want to keep a pointer to just the beginning of the list, but in this case we want to point to the root, or top
center node of the tree (the 4).
Now, we can easily do binary search, and since each node is pointing to another, we can also insert nodes into the tree without moving all
of them around as we would have to in an array. Recursively searching this tree would look something like:
The running time of searching a balanced tree is O(log n), and inserting nodes while keeping the tree balanced is also O(log n). By spending a bit
more memory and time to maintain the tree, we’ve now gained faster searching compared to a plain linked list.
A data structure with almost a constant time search is a hash table, which is a combination of an array and a linked list. We have an array
of linked lists, and each linked list in the array has elements of a certain category. For example, in the real world we might have lots of
nametags, and we might sort them into 26 buckets, one labeled with each letter of the alphabet, so we can find nametags by looking in
just one bucket.
We can implement this in a hash table with an array of 26 pointers, each of which points to a linked list for a letter of the alphabet:
Since we have random access with arrays, we can add elements quickly, and also index quickly into a bucket.
A bucket might have multiple matching values, so we’ll use a linked list to store all of them horizontally. (We call this a collision, when two
values match in some way.)
This is called a hash table because we use a hash function, which takes some input and maps it to a bucket it should go in. In our example,
the hash function is just looking at the first letter of the name, so it might return 0 for “Albus” and 25 for “Zacharias”.
But in the worst case, all the names might start with the same letter, so we might end up with the equivalent of a single linked list again.
We might look at the first two letters, and allocate enough buckets for 26*26 possible hashed values, or even the first three letters, and
now we’ll need 26*26*26 buckets. But we could still have a worst case where all our values start with the same three characters, so the
running time for search is O(n). In practice, though, we can get closer to O(1) if we have about as many buckets as possible values,
especially if we have an ideal hash function, where we can sort our inputs into unique buckets.
We can use another data structure called a trie (pronounced like “try”, and short for “retrieval”):
Imagine we want to store a dictionary of words efficiently, and be able to access each one in constant time. A trie is like a tree, but
each node is an array. Each array will have each letter, A-Z, stored. For each word, the first letter will point to an array, where the next
valid letter will point to another array, and so on, until we reach something indicating the end of a valid word. If our word isn’t in the
trie, then one of the arrays won’t have a pointer or terminating character for our word. Now, even if our data structure has lots of
words, the lookup time will be just the length of the word we’re looking for, and this might be a fixed maximum so we have O(1) for
searching and insertion. The cost for this, though, is 26 times as much memory as we need for each character.
There are even higher-level constructs, abstract data structures, where we use our building blocks of arrays, linked lists, hash tables, and
tries to implement a solution to some problem.
For example, one abstract data structure is a queue, where we want to be able to add values and remove values in a first-in-first-out (FIFO)
way. To add a value we might enqueue it, and to remove a value we would dequeue it. And we can implement this with an array that we
resize as we add items, or a linked list where we append values to the end.
An “opposite” data structure would be a stack, where items most recently added (pushed) are removed (popped) first, in a last-in-first-out
(LIFO) way. Our email inbox is a stack, where our most recent emails are at the top.
Another example is a dictionary, where we can map keys to values, or strings to values, and we can implement one with a hash table
where a word comes with some other information (like its de nition or meaning).
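To make the three abstract data structures above concrete, here is a small sketch using Python's built-in types. The choice of collections.deque for the queue and a plain list for the stack is ours, and the sample values are made up:

```python
from collections import deque

# Queue: first in, first out.
queue = deque()
queue.append("a")            # enqueue
queue.append("b")
queue.append("c")
first_out = queue.popleft()  # dequeue removes the oldest item

# Stack: last in, first out.
stack = []
stack.append("a")            # push
stack.append("b")
stack.append("c")
last_in = stack.pop()        # pop removes the newest item

# Dictionary: keys mapped to values (Python's dict is backed by a hash table).
definitions = {"queue": "first in, first out", "stack": "last in, first out"}

print(first_out, last_in, definitions["stack"])
```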
We take a look at “Jack Learns the Facts About Queues and Stacks” (https://www.youtube.com/watch?v=2wM6_PuBIxY), an animation about
these data structures.
Lecture 6
Python Basics
Examples
More features
Files
New features
Python Basics
Today we’ll learn a new programming language called Python, and remember that one of the overall goals of the course is not learning
any particular languages, but how to program in general.
Source code in Python looks a lot simpler than C, but is capable of solving problems in fields like data science. In fact, to print “hello,
world”, all we need to write is:
print("hello, world")
Notice that, unlike in C, we don’t need to import a standard library, declare a main function, specify a newline in the print function,
or use semicolons.
Python is an interpreted language, which means that we actually run another program (an interpreter) that reads our source code and runs
it top to bottom. For example, we can save the above as hello.py , and run the command python hello.py to run our code, without
having to compile it.
We can get strings from a user:
We create a variable called answer , without specifying the type (the interpreter determines that from context for us), and we can
easily combine two strings with the + operator before we pass it into print .
We can also pass in multiple arguments to print , with print("hello,", answer) , and it will automatically join them with spaces
for us too.
print also accepts format strings like f"hello, {answer}" , which substitutes variables inside curly braces into a string.
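The three ways of printing a greeting described above can be sketched as follows. The value "Emma" is a stand-in for what a user might type; in a real program it could come from the built-in input function or from get_string in the CS50 library:

```python
def greetings(answer):
    """Build the same greeting by concatenation and by a format string."""
    joined = "hello, " + answer     # combine two strings with +
    formatted = f"hello, {answer}"  # substitute the variable inside braces
    return joined, formatted


answer = "Emma"  # stand-in for user input
joined, formatted = greetings(answer)

print(joined)
print("hello,", answer)  # print joins multiple arguments with a space
print(formatted)
```

All three calls print the same line, hello, Emma.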
We can create variables with just counter = 0 . To increment a variable, we can use counter = counter + 1 or counter += 1 .
Conditions look like:
if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
else:
    print("x is equal to y")
Unlike in C and JavaScript (whereby braces { } are used to indicate blocks of code), the exact indentation of each line is what
determines the level of nesting in Python.
And instead of else if , we just say elif .
Boolean expressions are slightly different, too:
while True:
    print("hello, world")
We can write a loop with a variable:
i = 3
while i > 0:
    print("cough")
    i -= 1
We can also use a for loop, where we can do something for each element in a list:
Lists in Python are like arrays in C, but they can grow and shrink easily with the interpreter managing the implementation and
memory for us.
This for loop will set the variable i to the first element, 0 , run, then to the second element, 1 , run, and so on.
And we can use a special function, range , to get some number of values, as in for i in range(3) . This will give us 0 , 1 , and 2 ,
for a total of three values.
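A for loop over a list, and the equivalent loop with range, might look like the sketch below. The list [0, 1, 2] is inferred from the description above, and the variable names are our own:

```python
# Loop over the elements of a list, one at a time.
seen_in_list = []
for i in [0, 1, 2]:
    print("cough")
    seen_in_list.append(i)

# range(3) produces the same three values without writing the list out.
seen_in_range = []
for i in range(3):
    print("cough")
    seen_in_range.append(i)
```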
In Python, there are many data types:
bool , True or False
float , real numbers
int , integers
str , strings
range , sequence of numbers
list , sequence of mutable values, that we can change or add or remove
tuple , sequence of immutable values, that we can’t change
dict , collection of key/value pairs, like a hash table
set , collection of unique values
docs.python.org (https://docs.python.org) is the official source of documentation, but Google and StackOverflow will also have helpful
resources when we need to figure out how to do something in Python. In fact, programmers in the real world rarely know everything in the
documentation, but rather how to find what they need when they need it.
Examples
We can blur an image with:
from PIL import Image, ImageFilter
before = Image.open("bridge.bmp")
after = before.filter(ImageFilter.BLUR)
after.save("out.bmp")
In Python, we include other libraries with import , and here we’ll import the Image and ImageFilter names from the PIL library.
It turns out, if we look for documentation for the PIL library, we can use the next three lines of code to open an image called
bridge.bmp , run a blur filter on it, and save it to a file called out.bmp .
And we can run this with python blur.py after saving to a file called blur.py .
We can implement a dictionary with:
words = set()
def check(word):
    if word.lower() in words:
        return True
    else:
        return False
def load(dictionary):
    file = open(dictionary, "r")
    for line in file:
        words.add(line.rstrip("\n"))
    file.close()
    return True
def size():
    return len(words)
def unload():
    return True
First, we create a new set called words . Then, for check , we can just ask if word.lower() in words . For load , we open the file
and use words.add to add each line to our set. For size , we can use len to count the number of elements in our set, and finally,
for unload , we don’t have to do anything!
It turns out, even though implementing a program in Python is simpler for us, the running time of our program in Python is slower than
our program in C since our interpreter has to do more work for us. So, depending on our goals, we’ll also have to consider the tradeoff of
human time of writing a program that’s more efficient, versus the running time of the program.
In Python, too, we can include the CS50 library, but our syntax will be:
from cs50 import get_int
x = get_int("x: ")
y = get_int("y: ")
if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
else:
    print("x is equal to y")
if s == "Y" or s == "y":
    print("Agreed.")
elif s == "N" or s == "n":
    print("Not agreed.")
print("cough")
print("cough")
print("cough")
We don’t need to declare a main function, so we just write the same line of code three times.
But we can do better:
for i in range(3):
    cough()
def cough():
    print("cough")
Notice that we don’t need to specify the return type of a new function, which we can define with def .
But this causes an error when we try to run it: NameError: name 'cough' is not defined . It turns out that we need to define our
function before we use it, so we can either move our definition of cough to the top, or create a main function:
def main():
    for i in range(3):
        cough()
def cough():
    print("cough")
main()
Now, by the time we actually call our main function, the cough function will have been read by our interpreter.
Our functions can take inputs, too:
def main():
    cough(3)
def cough(n):
    for i in range(n):
        print("cough")
main()
from cs50 import get_int
def main():
    i = get_positive_int()
    print(i)
def get_positive_int():
    while True:
        n = get_int("Positive Integer: ")
        if n > 0:
            break
    return n
main()
Since there is no do-while loop in Python as there is in C, we have a while loop that will go on infinitely, but we use break to end
the loop as soon as n > 0 . Then, our function will just return n .
Notice that variables in Python have function scope by default, meaning that n can be initialized within a loop, but still be accessible
later in the function.
We can print out a row of question marks on the screen:
for i in range(4):
    print("?", end="")
print()
When we print each block, we don’t want the automatic new line, so we can pass a parameter, or named argument, to the print
function. Here, we say end="" to specify that nothing should be printed at the end of our string. Then, after we print our row, we can
call print to get a new line.
We can also “multiply” a string and print that directly with: print("?" * 4) .
We can print a column with a loop:
for i in range(3):
    print("#")
for i in range(3):
    for j in range(3):
        print("#", end="")
    print()
We don’t need to use the get_string function from the CS50 library, since we can use the input function built into Python to get a
string from the user. But if we want another type of data, like an integer, from the user, we’ll need to cast it with int() .
But our program will crash if the string isn’t convertible to an integer, so we can use get_int which will just ask again.
In Python, trying to get an integer overflow actually won’t work:
from time import sleep
i = 1
while True:
    print(i)
    sleep(1)
    i *= 2
We call the sleep function to pause our program for a second between each iteration.
This will continue until the integer can no longer fit in your computer’s memory.
Floating-point imprecision, too, can be prevented by libraries that can represent decimal numbers with as many bits as are needed.
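As one example of such a library, Python's standard decimal module represents decimal numbers exactly, up to a configurable precision (28 significant digits by default). The sketch below contrasts it with ordinary floats, and also shows an integer far larger than a 64-bit value; the variable names are our own:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)

# Decimal stores the digits we typed, so the sum is exact.
exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum == Decimal("0.3"))

# Integers simply grow as needed instead of overflowing.
big = 2 ** 100
print(big)
```

Note that the Decimal values are built from strings; building them from floats would carry the float's imprecision along.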
We can make a list:
scores = []
scores.append(72)
scores.append(73)
scores.append(33)
With append , we can add items to our list, using it like a linked list.
We can also declare a list with some values like scores = [72, 73, 33] .
We can iterate over each character in a string:
s = get_string("Input: ")
print("Output: ", end="")
for c in s:
    print(c, end="")
print()
More features
We can take command-line arguments with:
from sys import argv
for i in range(len(argv)):
    print(argv[i])
Since argv is a list of strings, we can use len() to get its length, and range() for a range of values that we can use as an index for
each element in the list.
But we can also let Python iterate over the list for us:
from sys import argv, exit
if len(argv) != 2:
    print("missing command-line argument")
    exit(1)
print(f"hello, {argv[1]}")
exit(0)
We import the exit function, and call it with the code we want our program to exit with.
We can implement linear search by just checking each element in a list:
import sys
names = ["EMMA", "RODRIGO", "BRIAN", "DAVID"]
if "EMMA" in names:
    print("Found")
    sys.exit(0)
print("Not found")
sys.exit(1)
If we have a dictionary, a set of key:value pairs, we can also check each key:
import sys
people = {
    "EMMA": "617-555-0100",
    "RODRIGO": "617-555-0101",
    "BRIAN": "617-555-0102",
    "DAVID": "617-555-0103"
}
if "EMMA" in people:
    print(f"Found {people['EMMA']}")
    sys.exit(0)
print("Not found")
sys.exit(1)
Notice that we can get the value of a particular key in a dictionary with people['EMMA'] . Here, we use single quotes (both single
and double quotes are allowed, as long as they match for a string) to differentiate the inner string from the outer string.
And we declare dictionaries with curly braces, {} , and lists with brackets [] .
In Python, we can compare strings directly with just == :
s = get_string("s: ")
t = get_string("t: ")
if s == t:
    print("Same")
else:
    print("Different")
Copying strings, too, works without any extra work from us:
s = get_string("s: ")
t = s
t = t.capitalize()
print(f"s: {s}")
print(f"t: {t}")
Swapping two variables can also be done by assigning both values at the same time:
x = 1
y = 2
x, y = y, x
print(f"x is {x}, y is {y}")
Files
Let’s open a CSV file:
import csv
from cs50 import get_string
file = open("phonebook.csv", "a")
name = get_string("Name: ")
number = get_string("Number: ")
writer = csv.writer(file)
writer.writerow((name, number))
file.close()
It turns out that Python also has a csv package (library) that helps us work with CSV files, so after we open the file for appending, we
can call csv.writer to create a writer from the file and then writer.writerow to write a row. With the inner parentheses, we’re
creating a tuple with the values we want to write, so we’re actually passing in a single argument that has all the values for our row.
We can use the with keyword, which will helpfully close the file for us:
...
with open("phonebook.csv", "a") as file:
    writer = csv.writer(file)
    writer.writerow((name, number))
New features
A feature of Python that C does not have is regular expressions, or patterns against which we can match strings. For example, its syntax
includes:
. , for any character
.* , for 0 or more characters
.+ , for 1 or more characters
? , for something optional
^ , for start of input
$ , for end of input
For example, we can match strings with:
import re
from cs50 import get_string
s = get_string("Do you agree? ")
if re.search("^y(es)?$", s, re.IGNORECASE):
    print("Agreed.")
elif re.search("^no?$", s, re.IGNORECASE):
    print("Not agreed.")
import speech_recognition
recognizer = speech_recognition.Recognizer()
with speech_recognition.Microphone() as source:
    print("Say something!")
    audio = recognizer.listen(source)
It turns out that there’s another library we can download, called speech_recognition , that can listen to audio and convert it to a
string.
And now, we can match on the audio to print something else:
...
words = recognizer.recognize_google(audio)
# Respond to speech
if "hello" in words:
    print("Hello to you too!")
elif "how are you" in words:
    print("I am well, thanks!")
elif "goodbye" in words:
    print("Goodbye to you too!")
else:
    print("Huh?")
...
words = recognizer.recognize_google(audio)
Here, we can get all the characters after my name is with .* , and print it out.
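Capturing everything after “my name is” might look like the sketch below. It works on a plain string rather than on recognized audio, and the function name extract_name and the exact pattern are our own assumptions:

```python
import re


def extract_name(words):
    """Return whatever follows 'my name is', or None if the phrase is absent."""
    # The parentheses capture the characters matched by .* as group 1.
    matches = re.search("my name is (.*)", words)
    if matches:
        return matches.group(1)
    return None


name = extract_name("hello, my name is David")
if name:
    print(f"Hey, {name}.")
```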
We run detect.py and faces.py (https://cdn.cs50.net/2019/fall/lectures/6/src6/6/faces/), which finds each face (or even a specific face) in a
photo.
qr.py (https://cdn.cs50.net/2019/fall/lectures/6/src6/6/qr/) will also generate a QR code to a particular URL.
Lecture 7
Spreadsheets
SQL
IMDb
Multiple tables
Problems
Spreadsheets
Most of us are familiar with spreadsheets, rows of data, with each column in a row having a different piece of data that relate to each
other somehow.
A database is an application that can store data, and we can think of Google Sheets as one such application.
For example, we created a Google Form to ask students their favorite TV show and genre of it. We look through the responses, and see
that the spreadsheet has three columns: “Timestamp”, “title”, and “genres”:
We can download a CSV file from the spreadsheet with “File > Download”, upload it to our IDE, and see that it’s a text file with comma-
separated values matching the spreadsheet’s data.
We’ll write favorites.py :
import csv
with open("CS50 2019 - Lecture 7 - Favorite TV Shows (Responses) - Form Responses 1.csv", "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row["title"])
We’re just going to open the file and make sure we can get the title of each row.
Now we can use a dictionary to count the number of times we’ve seen each title, with the keys being the titles and the values for each key
an integer, tracking how many times we’ve seen that title:
import csv
counts = {}
with open("CS50 2019 - Lecture 7 - Favorite TV Shows (Responses) - Form Responses 1.csv", "r") as file:
    reader = csv.DictReader(file)
def f(item):
    return item[1]
We define a function, f , which just returns the value from the item in the dictionary with item[1] . The sorted function, in turn,
can use that as the key to sort the dictionary’s items. And we’ll also pass in reverse=True to sort from largest to smallest, instead of
smallest to largest.
We can actually define our function in the same line, with this syntax:
We pass in a lambda, or anonymous function, as the key, which takes in the item and returns item[1] .
Finally, we can make all the titles lowercase with title = row["title"].lower() , so our counts can be a little more accurate even if the
names weren’t typed in the exact same way.
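Putting these steps together, counting titles and sorting them by count with a lambda might look like the sketch below. To keep it self-contained, it reads a small made-up sample with the same three columns from a string instead of the downloaded file:

```python
import csv
import io

# Made-up sample rows standing in for the downloaded responses file.
SAMPLE = """Timestamp,title,genres
10/28/2019,The Office,Comedy
10/28/2019,the office,Comedy
10/28/2019,Friends,Comedy
10/28/2019,Breaking Bad,Drama
10/28/2019,FRIENDS,Comedy
10/28/2019,The Office,Comedy
"""

counts = {}
reader = csv.DictReader(io.StringIO(SAMPLE))
for row in reader:
    # Lowercase each title so differently typed copies count together.
    title = row["title"].lower()
    if title in counts:
        counts[title] += 1
    else:
        counts[title] = 1

# Each item is a (title, count) pair; item[1] is the count.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for title, count in ranked:
    print(title, count)
```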
SQL
We’ll look at a new program in our terminal window, sqlite3 , a command-line program that lets us use another language, SQL
(pronounced like “sequel”).
We’ll run some commands to create a new database called favorites.db and import our CSV file into a table called “favorites”:
~/ $ sqlite3 favorites.db
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> .mode csv
sqlite> .import "CS50 2019 - Lecture 7 - Favorite TV Shows (Responses) - Form Responses 1.csv" favorites
We see a favorites.db in our IDE after we run this, and now we can use SQL to interact with our data:
We can even set the count of each title to a new variable, n , and order our results by that, in descending order. Then we can see the top
10 results with LIMIT 10 :
sqlite> SELECT title, COUNT(title) AS n FROM favorites GROUP BY title ORDER BY n DESC LIMIT 10;
title | n
The Office | 30
Friends | 20
Game of Thrones | 20
Breaking Bad | 14
Black Mirror | 9
Rick and Morty | 9
Brooklyn Nine-Nine | 5
Game of thrones | 5
No | 5
Prison Break | 5
SQL is a language that lets us work with a relational database, an application that lets us store data and work with them more quickly than
with a CSV.
With .schema , we can see how the format for the table for our data is created:
sqlite> .schema
CREATE TABLE favorites(
"Timestamp" TEXT,
"title" TEXT,
"genres" TEXT
);
It turns out that, when working with data, we only need four operations:
CREATE
READ
UPDATE
DELETE
In SQL, the commands to perform each of these operations are:
INSERT
SELECT
UPDATE
DELETE
First, we’ll need to create a table with the CREATE TABLE table (column type, ...); command.
SQL, too, has its own data types to optimize the amount of space used for storing data:
BLOB , for “binary large object”, raw binary data that might represent files
INTEGER
smallint
integer
bigint
NUMERIC
boolean
date
datetime
numeric(precision,scale) , which solves floating-point imprecision by using as many bits as needed, for each digit before and
after the decimal point
time
timestamp
REAL
real , for floating-point values
double precision , with more bits
TEXT
char(n) , for an exact number of characters
varchar(n) , for a variable number of characters, up to a certain limit
text
SQLite is one database application that supports SQL, and there are many companies with server applications that support SQL, including
Oracle Database, MySQL, PostgreSQL, MariaDB, and Microsoft Access.
After inserting values, we can use functions to perform calculations, too:
AVG
COUNT
DISTINCT , for getting distinct values without duplicates
MAX
MIN
…
There are also other operations we can combine as needed:
WHERE , matching on some strict condition
LIKE , matching on substrings for text
LIMIT
GROUP BY
ORDER BY
JOIN , combining data from multiple tables
We can update data with UPDATE table SET column=value WHERE condition; , which could include 0, 1, or more rows depending on our
condition. For example, we might say UPDATE favorites SET title = "The Office" WHERE title LIKE "%office" , and that will set all the
rows with the title ending in “office” to be “The Office” so we can make them consistent.
And we can remove matching rows with DELETE FROM table WHERE condition; , as in DELETE FROM favorites WHERE title = "Friends"; .
We can even delete an entire table altogether with another command, DROP .
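The four operations above can be tried without the sqlite3 command-line program, using the sqlite3 module in Python's standard library and an in-memory database. This is a sketch with made-up rows, not the lecture's data:

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
db = sqlite3.connect(":memory:")

# Create a table, then INSERT a few rows.
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites (title, genres) VALUES (?, ?)",
    [
        ("The Office", "Comedy"),
        ("the office", "Comedy"),
        ("Friends", "Comedy"),
        ("Breaking Bad", "Drama"),
    ],
)

# UPDATE: make differently typed titles consistent.
# (SQLite's LIKE ignores case for ASCII letters by default.)
db.execute("UPDATE favorites SET title = 'The Office' WHERE title LIKE '%office'")

# DELETE: remove matching rows.
db.execute("DELETE FROM favorites WHERE title = 'Friends'")

# SELECT: count each remaining title, most popular first.
rows = db.execute(
    "SELECT title, COUNT(title) AS n FROM favorites GROUP BY title ORDER BY n DESC"
).fetchall()
print(rows)
```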
IMDb
IMDb, or “Internet Movie Database”, has datasets available to download (https://www.imdb.com/interfaces/) as TSV, or tab-separated values,
files.
For example, we can download title.basics.tsv.gz , which will contain basic data about titles:
tconst , a unique identifier for each title, like tt4786824
titleType , the type of the title, like tvSeries
primaryTitle , the main title used, like The Crown
startYear , the year a title was released, like 2016
genres , a comma-separated list of genres, like Drama,History
We take a look at title.basics.tsv after we’ve unzipped it, and we see that the first rows are indeed the headers we expected and each
row has values separated by tabs. But the file has more than 6 million rows, so even searching for one value takes a moment.
We’ll download the file into our IDE with wget , and then gunzip to unzip it. But our IDE doesn’t have enough space, so we’ll use our
Mac’s terminal instead.
We’ll write import.py to read the file in:
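import csv
# Open the unzipped TSV file for reading
with open("title.basics.tsv", "r") as titles:
    # Since the file is a TSV file, we can use the CSV reader and change
    # the separator to a tab.
    reader = csv.DictReader(titles, delimiter="\t")
    # Open a new CSV file for writing
    with open("shows0.csv", "w") as shows:
        # Create writer
        writer = csv.writer(shows)
        # Iterate over the rows of the TSV file
        for row in reader:
            # If non-adult TV show
            if row["titleType"] == "tvSeries" and row["isAdult"] == "0":
                # Write row
                writer.writerow([row["tconst"], row["primaryTitle"], row["startYear"], row["genres"]])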
Now, we can open shows0.csv and see a smaller set of data. But it turns out, for some of the rows, startYear has a value of \N , and
that’s a special value from IMDb when they want to represent values that are missing. So we can filter out those values and convert the
startYear to an integer to filter for shows after 1970:
...
# If year not missing (We need to escape the backslash too)
if row["startYear"] != "\\N":
    # If since 1970
    if int(row["startYear"]) >= 1970:
        # Write row
        writer.writerow([row["tconst"], row["primaryTitle"], row["startYear"], row["genres"]])
import csv
# Create DictReader
reader = csv.DictReader(input)
We can run this program and see our results, but we can see how SQL can do a better job.
In Python, we can connect to a SQL database and read our file into it once, so we can make lots of queries without writing new programs
and without having to read the entire file each time.
Let’s do this more easily with the CS50 library:
import cs50
import csv
# Create DictReader
reader = csv.DictReader(titles, delimiter="\t")
# If non-adult TV show
if row["titleType"] == "tvSeries" and row["isAdult"] == "0":
    # If since 1970
    startYear = int(row["startYear"])
    if startYear >= 1970:
Now we can run sqlite3 shows3.db and run commands like before, such as SELECT * FROM shows LIMIT 10; .
With SELECT COUNT(*) FROM shows; we can see that there are more than 150,000 shows in our table, and with SELECT COUNT(*) FROM
shows WHERE startYear = 2019; , we see that there were more than 6000 this year.
Multiple tables
But each of the rows will only have one column for genres, and the values are multiple genres put together. So we can go back to our
import program, and add another table:
import cs50
import csv
# Create database
open("shows4.db", "w").close()
db = cs50.SQL("sqlite:///shows4.db")
# Create tables
db.execute("CREATE TABLE shows (id INT, title TEXT, year NUMERIC, PRIMARY KEY(id))")
# The `genres` table will have a column called `show_id` that references
# the `shows` table above
db.execute("CREATE TABLE genres (show_id INT, genre TEXT, FOREIGN KEY(show_id) REFERENCES shows(id))")
# Open the unzipped TSV file for reading
with open("title.basics.tsv", "r") as titles:
    # Create DictReader
    reader = csv.DictReader(titles, delimiter="\t")
    # Iterate over the rows of the TSV file
    for row in reader:
        # If non-adult TV show
        if row["titleType"] == "tvSeries" and row["isAdult"] == "0":
            # If year not missing
            if row["startYear"] != "\\N":
                # If since 1970
                startYear = int(row["startYear"])
                if startYear >= 1970:
                    # Use the digits after the "tt" prefix of tconst as the ID (an assumption)
                    id = int(row["tconst"][2:])
                    # Insert show
                    db.execute("INSERT INTO shows (id, title, year) VALUES(?, ?, ?)", id, row["primaryTitle"], startYear)
                    # Insert genres
                    if row["genres"] != "\\N":
                        for genre in row["genres"].split(","):
                            db.execute("INSERT INTO genres (show_id, genre) VALUES(?, ?)", id, genre)
So now our shows table no longer has a genres column, but instead we have a genres table with each row representing a show
and an associated genre. Now, a particular show can have multiple genres we can search for, and we can get other data about the
show from the shows table given its ID.
In fact, we can combine both tables with SELECT * FROM shows WHERE id IN (SELECT show_id FROM genres WHERE genre = "Comedy") AND
year = 2019; . We’re filtering our shows table by IDs where the ID in the genres table has a value of “Comedy” for the genre column,
and has the value of 2019 for the year column.
Our tables look like this:
Since the ID in the genres table comes from the shows table, we call it show_id . And the arrow indicates that a single show ID might
have many matching rows in the genres table.
We see that some datasets from IMDb, like title.principals.tsv , have only IDs for certain columns that we’ll have to look up in other
tables.
By reading the descriptions for each table, we can see that all of the data can be used to construct these tables:
Notice that, for example, a person’s name could also be copied to the stars or writers tables, but instead only the person_id is
used to link to the data in the people table. This way, we only need to update the name in one place if we need to make a change.
We’ll open a database, shows.db , with these tables to look at some more examples.
We’ll download a program called DB Browser for SQLite (https://sqlitebrowser.org/dl/), which will have a graphical user interface to browse
our tables and data. We can use the “Execute SQL” tab to run SQL directly in the program, too.
We can run SELECT * FROM shows JOIN genres ON shows.id = genres.show_id; to join two tables by matching IDs in columns we specify.
Then we’ll get back a wider table, with columns from each of those two tables.
We can take a person’s ID and find them in shows with SELECT * FROM stars WHERE person_id = 1122; , but we can do a query inside our
query with SELECT show_id FROM stars WHERE person_id = (SELECT id FROM people WHERE name = "Ellen DeGeneres"); .
This gives us back the show_id , so to get the show data we can run: SELECT * FROM shows WHERE id IN (...); with ... being the
query above.
We can get the same results with:
We join the people table with the stars table, and then with the shows table by specifying columns that should match between
the tables, and then selecting just the title with a filter on the name.
But now we can select other fields from our combined tables, too.
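The nested query and the join described above can be compared side by side. The sketch below uses Python's built-in sqlite3 module with a tiny made-up dataset; the table layout follows the notes, but the rows and the join query's exact wording are our own assumptions:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT, year NUMERIC, PRIMARY KEY(id));
    CREATE TABLE people (id INTEGER, name TEXT, PRIMARY KEY(id));
    CREATE TABLE stars (
        show_id INTEGER,
        person_id INTEGER,
        FOREIGN KEY(show_id) REFERENCES shows(id),
        FOREIGN KEY(person_id) REFERENCES people(id)
    );
    INSERT INTO shows VALUES (1, 'Show A', 2003), (2, 'Show B', 2019), (3, 'Show C', 2019);
    INSERT INTO people VALUES (10, 'Person X'), (11, 'Person Y');
    INSERT INTO stars VALUES (1, 10), (2, 10), (3, 11);
""")

# Query inside a query: name -> person ID -> show IDs -> titles.
nested = db.execute(
    "SELECT title FROM shows WHERE id IN "
    "(SELECT show_id FROM stars WHERE person_id = "
    "(SELECT id FROM people WHERE name = ?)) ORDER BY title",
    ("Person X",),
).fetchall()

# The same result with joins on the matching ID columns.
joined = db.execute(
    "SELECT title FROM people "
    "JOIN stars ON people.id = stars.person_id "
    "JOIN shows ON stars.show_id = shows.id "
    "WHERE name = ? ORDER BY title",
    ("Person X",),
).fetchall()

print(nested)
print(joined)
```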
It turns out that we can specify columns of our tables to be special types, such as:
PRIMARY KEY , used as the primary identifier for a row
FOREIGN KEY , which points to a row in another table
UNIQUE , which means it has to be unique in this table
INDEX , which asks our database to create an index to more quickly query based on this column. An index is a data structure like a tree,
which helps us search for values.
We can create an index with CREATE INDEX person_index ON stars (person_id); . Then the person_id column will have an index called
person_index . With the right indexes, our join query is several hundred times faster.
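To see whether a query would actually use an index, we can ask SQLite for its plan with EXPLAIN QUERY PLAN. The sketch below uses Python's sqlite3 module and generated rows; the exact wording of the plan varies between SQLite versions, so we only look for the index's name in it:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stars (show_id INTEGER, person_id INTEGER)")

# Generated rows: 1000 starring roles spread over 100 people.
db.executemany(
    "INSERT INTO stars VALUES (?, ?)",
    [(i, i % 100) for i in range(1000)],
)

# Create the index from the notes.
db.execute("CREATE INDEX person_index ON stars (person_id)")

# Ask SQLite how it would run a lookup by person_id.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM stars WHERE person_id = 42"
).fetchall()
detail = " ".join(str(row[-1]) for row in plan)
print(detail)

matches = db.execute("SELECT COUNT(*) FROM stars WHERE person_id = 42").fetchone()[0]
print(matches)
```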
Problems
One problem with databases is race conditions, where the timing of two actions or events causes unexpected behavior.
For example, consider two roommates and a shared fridge in their dorm. The first roommate comes home, and sees that there is no milk in
the fridge. So the first roommate leaves for the store to buy milk, and while they are at the store, the second roommate comes home, sees
that there is no milk, and leaves for another store to get milk. Later, there will be two jugs of milk in the fridge. By leaving a note, we can
solve this problem. We can even lock the fridge so that our roommate can’t check whether there is milk, until we’ve gotten back.
This can happen in our database if we have something like this:
First, we’re getting the number of likes on a post with a given ID. Then, we set the number of likes to that number plus one.
But now if we have two different web servers both trying to add a like, they might both set it to the same value instead of actually
adding one each time. For example, if there are 2 likes, both servers will check the number of likes, see that there are 2, and set the
value to 3. One of the likes will then be lost.
To solve this, we can use transactions, where a set of actions is guaranteed to happen together.
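The lost like can be reproduced in a few lines. This sketch uses Python's sqlite3 module and simulates the two servers on one connection by interleaving their reads and writes by hand; the posts table and its columns are assumptions for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 2)")

# Racy pattern: both "servers" read the count before either one writes.
seen_by_a = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]
seen_by_b = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]
db.execute("UPDATE posts SET likes = ? WHERE id = 1", (seen_by_a + 1,))
db.execute("UPDATE posts SET likes = ? WHERE id = 1", (seen_by_b + 1,))
racy_total = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]

# Reset, then let the database do the read and the write as one step,
# each inside its own transaction (committed when the with block ends).
db.execute("UPDATE posts SET likes = 2 WHERE id = 1")
with db:
    db.execute("UPDATE posts SET likes = likes + 1 WHERE id = 1")
with db:
    db.execute("UPDATE posts SET likes = likes + 1 WHERE id = 1")
safe_total = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]

print(racy_total, safe_total)
```

Starting from 2 likes, the racy version ends at 3 and the second version at 4.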
Another problem in SQL is called a SQL injection attack, where an adversary can execute their own commands on our database.
For example, someone might try typing in [email protected]'-- as their email. If we have a SQL query that’s a formatted string (without
escaping, or substituting dangerous characters from, the input), such as f"SELECT * FROM users WHERE username = '{username}' AND
password = '{password}'" , then the query will end up being f"SELECT * FROM users WHERE username = '[email protected]'--' AND
password = '{password}'" , which will actually select the row where username = '[email protected]' and turn the rest of the line into a
comment. To prevent this, we should use ? placeholders for our SQL library to automatically escape inputs from the user.
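The difference between a formatted string and placeholders can be demonstrated with Python's sqlite3 module, which also uses ? placeholders. The users table, the address alice@example.com, and the passwords below are all made up for this sketch:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('alice@example.com', 'secret')")

# What an adversary might type: a real username followed by '--
username = "alice@example.com'--"
password = "wrong"

# Unsafe: the input becomes part of the SQL, so -- comments out the password check.
unsafe_query = (
    f"SELECT * FROM users WHERE username = '{username}' AND password = '{password}'"
)
unsafe_rows = db.execute(unsafe_query).fetchall()

# Safe: placeholders pass the input as data, never as SQL.
safe_rows = db.execute(
    "SELECT * FROM users WHERE username = ? AND password = ?",
    (username, password),
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The unsafe query returns the user's row even though the password is wrong, while the placeholder version returns nothing.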
Lecture 8
A Look Back
Privacy
A Look Back
Just a few weeks ago, two thirds of us had never taken a CS course before. We started with making programs in Scratch, struggled through
using C to write loops and eventually implementing more applicable algorithms, and finally took advantage of higher-level languages like
Python and its packages, and SQL, to solve even more interesting problems.
In week 0, we said:
what ultimately matters in this course is not so much where you end up relative to your classmates but where you end up relative to
yourself when you began
And now we can look back to see how far we’ve come.
Indeed, David’s own notes from when he took CS50 in 1996 include concepts like algorithms, functions, and arguments.
To start solving problems with algorithms, we need to represent inputs and outputs. So we can use binary to represent data, whether that’s
numbers, letters, or pixels in images.
We demonstrate binary search in a phone book by dividing the book in half each time.
Precision and correctness are both critical in programming, since computers can’t infer “what we mean”. We demonstrate this with a
volunteer giving the audience instructions on how to draw an image. We see that abstractions (“draw a stick figure”) can be useful, but we
lose some precision when we use them.
Privacy
Computer science, in essence, is about the processing and storage of information. But we need to also consider not just what we can do,
but whether we should do it.
For example, we use passwords to protect many of our accounts and data, but the top 10 passwords are just:
1. 123456
2. 123456789
3. qwerty
4. password
5. 111111
6. 12345678
7. abc123
8. 1234567
9. password1
10. 12345
But unfortunately, even a more complex password can be quickly guessed by modern computers. We can write a program in just a few
minutes, that will generate all possible PINs and check them. We can even open a dictionary file that has all English words, and iterate
over each of them.
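A program that generates every possible PIN might look like the sketch below. It assumes PINs are four digits long, and the names all_pins and crack are our own; it only compares guesses against a string we already know:

```python
from itertools import product
from string import digits


def all_pins(length=4):
    """Yield every PIN of the given length, from 0000 up to 9999."""
    for combo in product(digits, repeat=length):
        yield "".join(combo)


def crack(secret, length=4):
    """Try each PIN in order; return the match and how many guesses it took."""
    for attempts, guess in enumerate(all_pins(length), start=1):
        if guess == secret:
            return guess, attempts
    return None, 10 ** length


guess, attempts = crack("0042")
print(guess, attempts)
```

With only 10,000 possibilities for four digits, the loop finishes almost instantly.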
Cookies are small pieces of data that websites store on our computers when we visit them, useful for identifying us such that we don’t
have to log in on every visit, but can also be used for advertising and tracking purposes.
In Chrome, we can use View > Developer > Developer Tools to see the cookies that a particular site leaves under the “Network” tab:
And on other websites, where Google’s ads might be embedded, Google can track us there, too, with the same cookie.
And the request that our web browser sends to each site also includes a string called “user-agent”, which describes the version of the
browser we have.
On the internet, too, we have unique IP addresses that identify us so that we can receive responses from servers.
We also explored how we might recover “deleted” photos in a problem set, and services like Snapchat that promise to delete photos after
some time, may not actually remove the data.
In fact, a “soft delete” might set a value of “deleted” to be “true” to hide it from us, but the rest of the data is still stored.
Photos of ourselves on social media, too, can help someone else track us, what we do, and who we’re with.
In Chrome’s Developer Tools again, we can run some code in a website that prompts us to share our location and then puts it on the
screen:
We’ll now have the opportunity to explore one of four tracks: web programming, mobile app development for either iOS or Android, and
game development with Lua.
With these new skills, we’ll be working on a final project of our own design, solving a problem in the real world that we’re interested in.
We’ll have an overnight hackathon, focused on collaborating with classmates and staff on our final projects.
Finally, we’ll have the CS50 Fair, where we’ll present our final projects to friends and visitors.
We give a big thanks to our staff, without whom this course would not be possible!
Web
What to Do
1. After watching Introduction, HTTP, HTML, CSS, JavaScript, and Homepage, submit Homepage.
2. After watching Flask, Databases, and Finance, submit Finance.
When to Do It
How to Do It
Source Code
Introduction
In this track, we’ll write programs that can run on the internet. We’ll first learn about the basics of the internet and how it works, and then
dive into the languages of the internet, from HTML and CSS to JavaScript to frameworks in Python and SQL that can turn a webpage into
an application.
HTTP
Computers talk to each other across the network by sending and receiving messages. At the most basic level, there are standard protocols,
or rules to follow, for sending and receiving messages. In the context of the internet, the standard protocol is TCP/IP, Transmission Control
Protocol and Internet Protocol. We can think of this at a high level as sending a letter in the mail, with an address for the recipient and the
address of the sender. On the internet, computers have IP addresses, usually in the format #.#.#.# , so our digital envelope might include
1.2.3.4 for the address of the computer we want to message, and our own address 5.6.7.8 , so that we can get a response.
[2:16] With four numbers of one byte each, an IP address is 32 bits, which only allows us to count up to about 4 billion. It turns out that we
now have more devices than 32 bits will support, and so in addition to IPv4, the protocol with 32-bit addresses, we also have IPv6, a
protocol with 128-bit addresses.
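To make the arithmetic concrete, here is a small sketch using Python's standard `ipaddress` module. It treats the example address 1.2.3.4 as a single 32-bit number and compares the sizes of IPv4 and IPv6 addresses. The IPv6 address shown is our own choice, taken from the range reserved for documentation.

```python
import ipaddress

# 1.2.3.4 is the example address from the envelope analogy above.
v4 = ipaddress.ip_address("1.2.3.4")

# Each of the four numbers is one byte, so the whole address is one 32-bit number:
# 1 * 2**24 + 2 * 2**16 + 3 * 2**8 + 4
as_integer = int(v4)
print(as_integer)

# The number of bits in an IPv4 address, and how many addresses that allows.
print(v4.max_prefixlen)
print(2 ** v4.max_prefixlen)  # a little over 4 billion

# An IPv6 address (from the range reserved for documentation) has 128 bits.
v6 = ipaddress.ip_address("2001:db8::1")
print(v6.max_prefixlen)
```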
[4:10] In addition to the address of the recipient, we also specify a port number, or a number assigned to a particular service or type of
message, like emails, webpages, or files. This way, the recipient computer can process incoming messages with the right program. So our
envelope might say 1.2.3.4:80 .
[5:50] But when we visit a website, we probably type in something like example.com , and it turns out that there’s something called DNS,
Domain Name System, which maps domain names to IP addresses of the servers that can respond for that domain.
[7:40] And we might notice URLs are of the form http://www.example.com , and HTTP is short for another protocol, Hypertext Transfer
Protocol, which essentially describes the format of the contents inside each digital envelope. The content of a request in HTTP might look
like:
GET / HTTP/1.1
Host: www.example.com
...
The first part, GET , is the method, which specifies the action we’re trying to take; here, that’s just getting something. The next one, / , stands
for the root, or the top-most directory. Finally, HTTP/1.1 is the version of the protocol we’re asking to use. We also specify the host, or the
website, since the same server might be able to handle multiple websites, and there’s also additional information in a request that is less
important.
[10:15] The response we get back might look like:
HTTP/1.1 200 OK
Content-Type: text/html
...
Here we get an HTTP status code of 200, which means “OK”, and then a line describing the type of content. HTML, Hypertext Markup
Language, is a format that webpages use to mark up content. Finally, we’ll get the actual data for the page.
[11:40] Other common status codes include 404, for a page not found, and 500 for an internal server error, where the server itself had an
error trying to respond.
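Since requests and responses are just text in an agreed format, we can take them apart ourselves. The sketch below parses the example request and response from above; the function names are our own, and real servers and browsers handle many more cases than this.

```python
def parse_request_line(request):
    """Split the first line of an HTTP request into method, path, and version."""
    first_line = request.split("\r\n")[0]
    method, path, version = first_line.split(" ")
    return method, path, version


def parse_status_line(response):
    """Split the first line of an HTTP response into version, status code, and reason."""
    version, code, reason = response.split("\r\n")[0].split(" ", 2)
    return version, int(code), reason


def parse_headers(message):
    """Collect the 'Name: value' lines that follow the first line, up to the blank line."""
    headers = {}
    for line in message.split("\r\n")[1:]:
        if line == "":
            break  # a blank line separates the headers from the content
        name, _, value = line.partition(": ")
        headers[name] = value
    return headers


# Lines in HTTP messages end with a carriage return and a newline.
request = "GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"

print(parse_request_line(request))
print(parse_headers(request))
print(parse_status_line(response))
print(parse_headers(response))
```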
[13:05] We can open Google Chrome, and open the Developer Tools panel. In the Network tab, we can load a site, and see lots of requests.
At the very top, we can see the original request for google.com , and we’ll see the Request Headers that we sent, and the Response
Headers we got back. In fact, the first response we got back was HTTP/1.1 301 Moved Permanently , to http://www.google.com , since by
convention many websites’ URLs start with www . Next, we get redirected to https://www.google.com , which uses HTTPS, the more secure,
encrypted version of HTTP. In this response, we finally get a 200 OK code and some content to load the page. Later, we’ll be writing our
own server programs that return these codes and content in response to requests from browsers.
HTML
Now that our computers can communicate over the internet, we can take a closer look at the actual data we get back. In Chrome, we can
go to View > Developer > View Source, to see the HTML, Hypertext Markup Language, that makes up the text-based content of a webpage.
[1:30] We’ll look at a simple HTML page, where we first declare to the browser the version and format of the page. Then, we have a tag,
<html> , which starts the HTML content. Generally, HTML is made up of lots of nested tags that map to a tree structure, with opening tags
and closing tags that determine the structure of the page. Next we have the <head> tag, which includes metadata, data about the page,
such as the <title> tag inside that defines what the title of the webpage will be, as displayed in the tab of the browser. After, we have
the <body> tag, which contains the visible content displayed by the browser.
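To see the tree structure for ourselves, we can feed a small page to Python's built-in `html.parser` module and indent each tag by how deeply it is nested. The page below is an assumed example with the tags just described, not necessarily the exact file from lecture, and `TreePrinter` is a name of our own.

```python
from html.parser import HTMLParser

# An assumed minimal page with the tags described above.
PAGE = """<!DOCTYPE html>
<html>
    <head>
        <title>hello</title>
    </head>
    <body>
        hello, world
    </body>
</html>"""


class TreePrinter(HTMLParser):
    """Record each opening tag, indented by its depth in the tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # An opening tag starts a new child of the current node.
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag takes us back up to the parent.
        self.depth -= 1


printer = TreePrinter()
printer.feed(PAGE)
print("\n".join(printer.lines))
```

Here, head and body appear as children of html, and title as a child of head.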
[6:00] In the CS50 IDE, we can start by writing this code in a file called index.html . And the CS50 IDE has a built-in server we can use. In
the terminal, we can run http-server , and there will be a URL for our IDE’s server that we can open. Then, we’ll see the files in our IDE,
and we can open index.html . We can change our file, save, and refresh to see what it looks like.
[10:20] We take a look at an example where we use an <img> tag to display an image. Here, we add attributes, or additional parameters
to the tag, like src="cat.jpg" to indicate that the source of the image is a file called cat.jpg , and alt="" to indicate alternative text
for the image. And the <img> tag doesn’t have a closing tag, since it doesn’t make sense for there to be other tags inside the image.
[13:30] We add links to go between pages with the <a> , or anchor, tag. Notice that we can have any text for any URL for our link, so we
should pay attention to the URL we end up at.
[18:00] We can add additional elements, like paragraphs with the <p> tag, headings with <h1> or <h2> , or tables with <table> .
[22:35] We’ll add aesthetic styling like borders and colors later, but we can think about HTML as describing the structure of the content of
our webpage.
[22:55] We’ll add a <form> element with some <input> elements where we can get some information from the user. Finally, we can
redirect ourselves to Google’s search page for whatever we typed in, by using https://www.google.com/search . We noticed that
https://www.google.com/search?q=cats takes us to a search page for cats, and the ? indicates some HTTP GET parameters, where here
we have a q , or query, parameter, with the value cats . So our form can have an action that submits our text input with name="q" , to
https://www.google.com/search .
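The browser builds that URL for us when the form is submitted, but we can sketch the same step with Python's `urllib.parse` module: the input's name and value become a key-value pair after the `?`. The variable names here are our own.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The form's action, and the input named "q" with whatever the user typed.
action = "https://www.google.com/search"
form_data = {"q": "cats"}

# A GET form submission appends the inputs to the URL after a question mark.
url = action + "?" + urlencode(form_data)
print(url)

# Going the other way, a server can split the URL back into its parts.
parts = urlsplit(url)
params = parse_qs(parts.query)
print(parts.path)
print(params)

# Characters like spaces are encoded so that they are safe inside a URL.
print(urlencode({"q": "black cats"}))
```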
[29:35] There are so many more HTML elements. We can likely find an HTML tag that lets us add a particular feature, just by searching
Google for relevant documentation.
CSS
To style webpages, we’ll use another language, CSS, Cascading Style Sheets.
[0:40] First, in our HTML, we’ll need to add a style attribute to a tag, and set the value to something like style="color: blue;" . The
key-value pairs in the style will change how the browser displays the element. In fact, we can add a style to the <body> , and all the
elements inside the body will inherit the style unless they specifically have a different style.
[5:20] We can also change the alignment, like centering or right-aligning text, or the font size. We can add multiple properties by
separating them with semicolons.
[8:40] We might have multiple elements of the same type, like <h1> , and we can add a common set of styles in the <head> element with
the <style> tag. In that tag, we can specify that all h1 elements share some set of styles.
[14:00] If we want to apply the same styles to multiple types of elements, we can add classes, which we can think of as names, to any number
and type of element. We’ll do this by adding the class="title" attribute, with a class name of our choosing, to elements we want to style
the same way. Then, in our CSS we can select all elements with the class with .title .
[18:25] We can create another class, and even give the same element multiple classes with class="title green" , and the styles for both
will apply.
[20:40] We can include CSS in a separate file, like styles.css , so all of our webpages can share the same styles. We’ll use a new tag,
<link> , to link a file to our HTML page. And we can include many different CSS files, each with some subset of our styles.
[24:00] With CSS, we can also style tables in HTML by selecting the table , tr , and td elements. By looking at CSS documentation online,
we can figure out what styles will give us the border styles we want.
[27:40] We can add padding, or spacing, within each table data cell. And we can select the first row by adding a class like header , or use a
special table header cell element <th> that we can select precisely.
[31:05] It turns out that there are lots of CSS libraries, written by other people, that will include styles for common elements that can
quickly apply a theme or aesthetic to our HTML. Bootstrap is one such popular library, and its documentation will include a <link>
element we can add, such that our page will use Bootstrap’s CSS files. The documentation will also show us various components we can
use, and classes we can use to style them easily. A <div> element in HTML is like a generic container or section, so we’ll see that
commonly used for elements that don’t have a more semantic HTML tag.
JavaScript
To build a more interactive website, we’ll need a programming language that will allow us to run code on the browser that changes how it
behaves with our webpage, beyond just the content and style. The language that we’ll use is JavaScript, a language that browsers can
interpret and run, with syntax similar to that of C.
[0:35] We take a look at syntax for declaring and changing variables, conditions, loops, and functions.
[5:00] A simple webpage has elements that we can represent as a graphical tree, where each nested element is a child of a node in the
tree. This is called the Document Object Model, and JavaScript can manipulate, or change this, without having to refresh the page.
[7:15] We’ll add JavaScript to our page with a <script> tag inside our <head> tag. We can call a built-in function, alert() , to show an
alert on our page. After we save our file, we can run a server in our IDE with http-server , and see our page.
[9:20] We can add a form, and have our form call a function and return false; to stop any default behavior after our function is called.
[12:00] Our form can have a text field, and our JavaScript code can get its value. First, we need to add an ID to our element with an
attribute, like id="name" . And in JavaScript, we can use document.querySelector('#name') to get that element by its id.
[17:25] We can change our alert to display something else with a condition.
[18:45] Instead of just reading the content of the DOM, we can also change the contents of elements by setting their innerHTML property,
after selecting them with document.querySelector .
[22:00] We’ll look at another example that has a counter, or a variable that we can increment by pressing a button.
[24:25] It turns out that we can even change these variables or call these functions in our browser, with View > Developer > Developer
Tools in Chrome. In the Console tab, we can type in JavaScript code, and it will run in our page. If our JavaScript code has errors, those
errors will also show up in the console.
[26:00] We can dynamically change the style of the page. We’ll create three buttons, each with a unique id . And in our script tag, we’ll
select each button, and we’ll set their onclick property to a function that our browser will call when the button is clicked. We can create
an anonymous function, or a function with no name, directly with function() { ... } , instead of defining it separately first. And in our
function, we can select the body tag by type since there’s only one of them on our page, and set the style.backgroundColor property to
a color.
[30:25] It turns out that we can’t set the onclick property at the beginning of our JavaScript code, since our browser interprets the page
from top to bottom, and the buttons don’t exist yet when our code runs, so it can’t find them. There are a few ways to solve this problem,
but for now we can simply move our script tag to the end of our body tag.
[33:55] The onclick function is an event handler, or a function that is called when an event happens. There are many such events that we
can listen for, like a change to the selected option in a dropdown menu. We’ll look at another example, where we add onchange to a
<select> element. Here, inside our event handler function, we can use this.value to get the value of the option that was just
selected. We can think of this as a special variable that contains some kind of context for how a function is called. In this case, this is
the <select> element whose change triggered our event handler.
[39:20] We can update our page periodically with window.setInterval , which calls a function for us at some interval of time. We’ll create
a function, blink() , that will change the body ’s visibility to be either visible or hidden .
[43:10] We can also create a separate file like blink.js , where we only have our JavaScript code, and include it in our HTML file with
<script src="blink.js"></script> .
[44:45] Finally, we can ask the browser to give the user’s location to our JavaScript code, with
navigator.geolocation.getCurrentPosition . The argument we pass in is a callback function, or a function that will be called by the
browser when getCurrentPosition finishes running. Inside our function, we’ll just write the coordinates we get to the page.
[47:05] With JavaScript, we can read and write to the DOM, and take advantage of even more features that browsers provide.
Homepage
Our first assignment will be to create a homepage of our choice using HTML, CSS, and JavaScript.
We’ll create four different pages in HTML, each linked to one another somehow. Recall that we can use the <a> tag, with the link to
another file in our IDE.
We’ll also use at least five different CSS selectors, for five different types of elements, classes, or IDs. And we’ll want to use at least five
different properties overall to style our page, and documentation online will help us find what we’re looking for. We’ll also use the
Bootstrap library to style at least one of our components, so we don’t have to write the CSS ourselves for that.
Finally, after we’ve written the content for our pages and styled them, we’ll use JavaScript to make our page interactive somehow, through
alerts, buttons, dropdowns, forms, intervals, or even more.
Be as creative as you’d like!
Flask
So far, we’ve learned how to write webpages that are saved as files and returned by an HTTP server. But we can also have web servers, or
applications, that generate content dynamically before returning it as a response.
[1:00] We’ll use a framework in Python called Flask, which allows us to write a web server with many features. We’ll create a new folder in
our IDE, called hello/ , and create a new file called application.py . By reading the documentation and experimenting, we can write our
first Flask application which returns something for the / route. And in our terminal, we can cd into our folder and run flask run ,
which will find our application.py file and run it. We’ll open the URL, and see our returned string.
[4:10] We’ll add another route, /goodbye , and a function that returns different content. We can return any content we want in our routes.
[6:00] It turns out that Flask allows us to use template files, or files with HTML that are like format strings, with some parts that are the
same every time, and some parts that will contain variables that we can substitute in. The render_template function in the Flask library
will allow us to use templates and plug in values for their variables.
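The idea of a template can be sketched without Flask at all. The example below uses `string.Template` from Python's standard library as a stand-in: it is not the templating engine Flask uses, and its `$name` placeholder syntax is different, but the principle of fixed HTML with a substituted variable is the same. The `render` function and the page contents are our own.

```python
from string import Template

# The parts that are the same every time, with one placeholder for a variable.
page = Template(
    "<!DOCTYPE html>\n"
    "<html>\n"
    "    <body>\n"
    "        hello, $name\n"
    "    </body>\n"
    "</html>\n"
)


def render(name):
    """Return the page with the placeholder replaced by the given name."""
    return page.substitute(name=name)


print(render("David"))
```

Each call to `render` returns the same surrounding HTML with a different value plugged in, which is what we would expect to see when viewing the source of a page generated from a template.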
[10:35] We can generate a random number, for example, and display it each time our page is loaded. We can use control + c to stop our
server, and then restart it, to make sure any changes we make are reloaded. And once we load our page in the browser, we can view its
source to make sure that Flask substituted our variable as we expected.
[13:25] We can add conditions to our templates, with if ... , so depending on the value of our variables, we can return different content
entirely.
[16:25] We can even write a form that our server can accept, with another route that the form can submit to. Then, in that route, our server
can receive and use the form data. We write a form that has a name input, and write a route function that gets the input with
request.args.get() , and returns a template with the input substituted in.
[21:30] We see an Internal Server Error, and in our terminal we see the error that request is not defined, and it turns out that we need to
import it from Flask. We try again, and see that the GET parameters in the URL change based on what we submit in the form.
[24:00] We can add additional logic in our route to handle the case where name is empty, and return a different template.
[26:00] It turns out that we can have templates for our templates, since many of our pages might have similar HTML code around their
content. We’ll create layout.html , and add a special block inside the <body> tag. Then, our other files like index.html can use the
template with extends "layout.html" , and only have the content block for the body .
[30:35] And we can add additional blocks, like for content we would want to have inside a <style> tag in the page.
[32:20] We’ll start writing a new application by creating a new folder called tasks , and creating an application.py file. Inside, we’ll
create routes for / to list tasks and /add to add a new task. We’ll create a templates folder with a layout.html as before, a tasks.html
showing a list of items, and an add.html that includes a simple form. We’ll have our routes render each of these templates, and set our
form to use a new method, POST , to send the form’s data back to the /add route. Our add() function can then either display the form
for a GET request, or create a new task for a POST request.
[42:30] We can create a global variable, todos , to store a list of task names that we can display later. In our add() function, if we get a
POST request with some data, we’ll add the new task name to our list on the server, and redirect back to the default route, which will
show a list.
[44:15] And in our tasks.html template, we can loop over our todos list variable with for todo in todos , and create a <li> element
with the contents set to each item.
[48:00] We can also make sure that the task name is not empty, by adding some JavaScript code that only enables the submit button if the
input field’s value is not empty. Otherwise, we disable the submit button. We do this by adding an event handler to listen to the onkeyup
event for our task input, which is triggered by the browser every time the user presses a key and releases it.
[52:40] But our task list goes away when we stop and start our web server, since we initialize our todos variable to an empty list each
time. Next, we’ll use a database with SQL to store and modify data.
Databases
So far, we’ve learned how to write a server that can respond with webpages that are the same for every user. But there are websites where
we can log in, and it will show us information specific to us.
Recall that cookies are small files that websites ask our browser to store on our computer, with some kind of identifier that our browser
shows the website the next time we go there, so the website knows who we are. This allows our server to have sessions, or data for users’
interactions with a website, specific to each of them.
[1:20] We’ll look at the task list application we made last time. Since our task list was stored in a global variable in our server application,
everyone who visits our page will see the same list.
[2:40] To solve this, we can use sessions from Flask, by importing and initializing their implementation. By doing so, our tasks() function
can look in the global session variable, and read, set, or update a todos key within it. Flask will take care of making sure that the global
session variable is actually specific to the user who made that request, by storing and checking some cookies.
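To get a feel for what a session does, here is a conceptual sketch in plain Python. It is not how Flask implements sessions; it only illustrates the idea of giving each browser a random identifier in a cookie and keeping separate data per identifier. The function `get_session` and the two example visitors are hypothetical.

```python
import secrets

# Server-side storage: one dictionary of data per session identifier.
sessions = {}


def get_session(cookie):
    """Return (cookie, data) for a request, creating a new session if needed."""
    if cookie is None or cookie not in sessions:
        # A new visitor gets a random, hard-to-guess identifier to store as a cookie.
        cookie = secrets.token_hex(16)
        sessions[cookie] = {"todos": []}
    return cookie, sessions[cookie]


# Two different browsers visit for the first time, without any cookie.
first_cookie, first_data = get_session(None)
first_data["todos"].append("buy milk")

second_cookie, second_data = get_session(None)
second_data["todos"].append("walk the dog")

# When the first browser comes back and shows its cookie, it sees only its own list.
_, returning_data = get_session(first_cookie)
print(returning_data["todos"])
```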
[7:30] If we want to store more complex data, it would make more sense to use a database instead of session objects. So we’ll create a new
application to store registration information, like names and emails.
[9:25] We’ll make a new empty file, lecture.db , and run sqlite3 lecture.db to create a table and set column names and types for the
data we think we’ll need.
[11:00] In sqlite3 , we can run queries to select or insert into the table to check that everything works. In our new Flask application, we’ll
import the SQL library from CS50 so we can work with our database more easily, and establish a connection to our lecture.db file. In our
/ route, we can run a SELECT query to get the rows from our registrants table, and pass them into our template. Our template will in
turn iterate over each row, and generate an <li> item with the values of each column in each row.
[17:35] Once we have our index route, we can add more rows to our table with the sqlite3 prompt, and see our server return the new
data.
[18:05] We can add a new route to our application that will insert new data, too. In our register() function, we can return a
register.html file with a form that has the inputs we need, and ensure that the form submits to our register route with the POST
method. Then, in our register route, we can check for a POST request, insert the data from the request into our table, and redirect to
the main route. In our SQL query, we’ll be careful to substitute our variables safely with the db.execute function, instead of combining
the strings ourselves, to avoid SQL injection attacks.
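The difference between substituting values safely and combining strings ourselves can be sketched with Python's built-in `sqlite3` module, used here in place of CS50's SQL library so that the example is self-contained. The database is in memory rather than in lecture.db, and the column names are assumptions based on the names and emails mentioned above.

```python
import sqlite3

# An in-memory database stands in for lecture.db.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE registrants (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")


def register(name, email):
    """Insert a registrant, letting the library substitute the values safely."""
    # The ? placeholders are filled in by the library, which treats the values
    # purely as data, never as part of the SQL query itself.
    db.execute("INSERT INTO registrants (name, email) VALUES (?, ?)", (name, email))
    db.commit()


register("Alice", "alice@example.com")

# A hostile "name" that tries to end the query early and run another one.
hostile = "x'); DROP TABLE registrants; --"
register(hostile, "mallory@example.com")

# The table still exists, and the hostile input was stored as ordinary text.
rows = db.execute("SELECT name, email FROM registrants ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Had we built the query by joining strings together, the quote character inside the hostile name would have ended our string early and changed the structure of the query.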
[23:05] We’ll try out our application, and everything seems to be working as we expect. To improve the design of our server’s code, we’ll
factor out some common template code into layout.html , and create an apology.html page where we’ll tell the user an error message if
something in their form is blank.
[28:40] Now we can write Flask applications to read and store data in a database, saving our data efficiently for the long term.
Finance
We’ll take the concepts we’ve seen to create CS50 Finance, a virtual stock trading website with an account for users to register for, the
ability to get quotes for shares of stocks and to virtually buy or sell them. We’ll also have a history page for each account to see what we’ve
done in the past.
[2:45] We look at the distribution code for CS50 Finance, or the code that we’ll all start off with. We have an application.py file that our
Flask app will run, with various configuration options, a connection to a database file finance.db , and various routes. This follows the MVC,
Model-View-Controller, pattern, which generally separates the concerns of data and how that’s stored (our database), the views that display
some amount of data (our templates), and controllers that control the logic of what is displayed when (our application.py routes).
[4:45] Since we’re using a third-party API, or Application Programming Interface, some code that someone else wrote designed for us to
use, we’ll also need an API key to get stock information.
[5:30] Notice that our routes also have a @login_required decorator, or extra attribute in Python to indicate that the function should
behave differently. Flask allows us to automatically redirect users to a login page, and we have the login functionality implemented in our
distribution code too. The /login route checks whether a matching user and password exist in our database (for a POST method, as
from the login form), or displays the login form for a GET method. And in our database, instead of storing the user’s raw password, which
is less secure since hackers who obtain it might use it against other websites, we store the hash of their password, which is sufficient for
verification, but from which it is difficult to recover the original password.
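As a rough illustration of storing a hash instead of a password, the sketch below uses `hashlib` and `hmac` from Python's standard library. The distribution code may well use a different library and format for its hashes; the function names, the iteration count, and the example password here are all our own assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; more iterations make guessing slower


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected):
    """Hash the attempt with the same salt and compare it to the stored digest."""
    _, digest = hash_password(password, salt)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(digest, expected)


# At registration, we store only the salt and the digest, never the password itself.
salt, stored = hash_password("hunter2")

# At login, we hash whatever was typed and compare.
print(check_password("hunter2", salt, stored))
print(check_password("wrong guess", salt, stored))
```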
[14:30] After the login route we have logout , which just clears the session, and we have quote , register , and sell routes left to
implement.
[15:10] We’ll implement:
register so we can register for a new account
quote so we can get a price quote for a stock
buy to buy some shares of a stock
index to show the stocks in our account
sell to sell some shares of a stock
Conclusion
In this track, we learned about how computers communicate over the internet, structured web pages with HTML and styled them with CSS,
and added some interactivity with JavaScript. Then we learned how to write a web server application with Flask, that can dynamically
generate web pages and use a database to read and write data.
Homepage
Build a simple homepage using HTML, CSS, and JavaScript.
Background
The internet has enabled incredible things: we can use a search engine to research anything imaginable, communicate with friends and family
members around the globe, play games, take courses, and so much more. But it turns out that nearly all pages we may visit are built on three
core languages, each of which serves a slightly different purpose:
1. HTML, or HyperText Markup Language, which is used to describe the content of websites;
2. CSS, Cascading Style Sheets, which is used to describe the aesthetics of websites; and
3. JavaScript, which is used to make websites interactive and dynamic.
Create a simple homepage that introduces yourself, your favorite hobby or extracurricular, or anything else of interest to you.
Getting Started
Here’s how to download this problem’s “distribution code” (i.e., starter code) into your own CS50 IDE. Log into CS50 IDE (https://ide.cs50.io/)
and then, in a terminal window, execute each of the below.
$ http-server
Specification
Contain at least four different .html pages, at least one of which is index.html (the main page of your website), and it should be
possible to get from any page on your website to any other page by following one or more hyperlinks.
Use at least ten (10) distinct HTML tags besides <html> , <head> , <body> , and <title> . Using some tag (e.g., <p> ) multiple times still
counts as just one (1) of those ten!
Integrate one or more features from Bootstrap into your site. Bootstrap is a popular library (that comes with lots of CSS classes and more)
via which you can beautify your site. See Bootstrap’s documentation (https://getbootstrap.com/docs/4.1/getting-started/introduction/) to
get started. To add Bootstrap to your site, it suffices to include the <link> element given in that documentation in your pages’ <head> .
Have at least one stylesheet file of your own creation, styles.css , which uses at least five (5) different CSS selectors (e.g. tag ( example ),
class ( .example ), or ID ( #example )), and within which you use a total of at least five (5) different CSS properties, such as font-size , or
margin ; and
Integrate one or more features of JavaScript into your site to make your site more interactive. For example, you can use JavaScript to add
alerts, to have an effect at a recurring interval, or to add interactivity to buttons, dropdowns, or forms. Feel free to be creative!
Ensure that your site looks nice on browsers both on mobile devices as well as laptops and desktops.
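One of the requirements above, the count of distinct tags, is easy to check roughly with a short script. This is only an informal sketch using Python's `html.parser` module, not the way submissions are actually assessed; `TagCollector`, `distinct_tags`, and the sample page are our own.

```python
from html.parser import HTMLParser

# Tags that the specification says do not count toward the ten.
EXCLUDED = {"html", "head", "body", "title"}


class TagCollector(HTMLParser):
    """Remember every kind of tag that is opened in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # A set keeps each tag name once, however often it is used.
        self.tags.add(tag)


def distinct_tags(html):
    """Return the set of distinct tags in a page, minus the excluded ones."""
    collector = TagCollector()
    collector.feed(html)
    return collector.tags - EXCLUDED


sample = (
    "<html><head><title>hi</title></head>"
    "<body><h1>hi</h1><p>one</p><p>two</p><img src='cat.jpg' alt='cat'></body></html>"
)
found = distinct_tags(sample)
print(sorted(found), len(found))
```

To check a whole site, we could call `distinct_tags` on the contents of each of our .html files and combine the resulting sets.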
Testing
If you want to view how your site looks while you work on it, there are two options:
1. Within CS50 IDE, navigate to your homepage directory (remember how?) and then execute
$ http-server
2. Within CS50 IDE, right-click (or Ctrl+click, on a Mac) on the homepage directory in the file tree at left. From the options that appear, select
Serve, which should open a new tab in your browser (it may take a second or two) with your site therein.
Recall also that by opening Developer Tools in Google Chrome, you can simulate visiting your page on a mobile device by clicking the phone-
shaped icon to the left of Elements in the developer tools window, or, once the Developer Tools tab has already been opened, by typing
Ctrl + Shift + M on a PC or Cmd + Shift + M on a Mac, rather than needing to visit your site on a mobile device separately!
Assessment
No check50 for this assignment! Instead, your site’s correctness will be assessed based on whether you meet the requirements of the
specification as outlined above, and whether your HTML is well-formed and valid. To ensure that your pages are, you can use the W3C
Markup Validation Service (https://validator.w3.org/#validate_by_input), copying and pasting your HTML directly into the provided text box. Take
care to eliminate any warnings or errors suggested by the validator before submitting!
Consider also:
whether the aesthetics of your site are such that it is intuitive and straightforward for a user to navigate;
whether your CSS has been factored out into a separate CSS file(s); and
whether you have avoided repetition and redundancy by “cascading” style properties from parent tags.
Afraid style50 does not support HTML files, and so it is incumbent upon you to indent and align your HTML tags cleanly. Know also that you
can create an HTML comment with:
<!-- Comment goes here -->
but commenting your HTML code is not as imperative as it is when commenting code in, say, C or Python. You can also comment your CSS, in
CSS files, with:
/* Comment goes here */
Hints
For fairly comprehensive guides on the languages introduced in this problem, check out the documentation for each on W3Schools.
HTML (https://www.w3schools.com/html)
CSS (https://www.w3schools.com/css)
JavaScript (https://www.w3schools.com/js)
How to Submit
Execute the below, logging in with your GitHub username and password when prompted. For security, you’ll see asterisks ( * ) instead of the
actual characters in your password.
submit50 cs50/problems/2020/x/tracks/web/homepage
C$50 Finance
Implement a website via which users can “buy” and “sell” stocks, a la the below.
Background
If you’re not quite sure what it means to buy and sell stocks (i.e., shares of a company), head here
(https://www.investopedia.com/articles/basics/06/invest1000.asp) for a tutorial.
You’re about to implement C$50 Finance, a web app via which you can manage portfolios of stocks. Not only will this tool allow you to check
real stocks’ actual prices and portfolios’ values, it will also let you buy (okay, “buy”) and sell (okay, “sell”) stocks by querying IEX
(https://iextrading.com/developer/) for stocks’ prices.
Indeed, IEX lets you download stock quotes via their API (application programming interface) using URLs like
https://cloud-sse.iexapis.com/stable/stock/nflx/quote?token=API_KEY . Notice how Netflix’s symbol (NFLX) is embedded in this URL; that’s how IEX knows
whose data to return. That link won’t actually return any data because IEX requires you to use an API key (more about that in a bit), but if it did,
you’d see a response in JSON (JavaScript Object Notation) format like this:
{
"symbol": "NFLX",
"companyName": "Netflix, Inc.",
"primaryExchange": "NASDAQ",
"calculationPrice": "close",
"open": 317.49,
"openTime": 1564752600327,
"close": 318.83,
"closeTime": 1564776000616,
"high": 319.41,
"low": 311.8,
"latestPrice": 318.83,
"latestSource": "Close",
"latestTime": "August 2, 2019",
"latestUpdate": 1564776000616,
"latestVolume": 6232279,
"iexRealtimePrice": null,
"iexRealtimeSize": null,
"iexLastUpdated": null,
"delayedPrice": 318.83,
"delayedPriceTime": 1564776000616,
"extendedPrice": 319.37,
"extendedChange": 0.54,
"extendedChangePercent": 0.00169,
"extendedPriceTime": 1564876784244,
"previousClose": 319.5,
"previousVolume": 6563156,
"change": -0.67,
"changePercent": -0.0021,
"volume": 6232279,
"iexMarketPercent": null,
"iexVolume": null,
"avgTotalVolume": 7998833,
"iexBidPrice": null,
"iexBidSize": null,
"iexAskPrice": null,
"iexAskSize": null,
"marketCap": 139594933050,
"peRatio": 120.77,
"week52High": 386.79,
"week52Low": 231.23,
"ytdChange": 0.18907500000000002,
"lastTradeTime": 1564776000616
}
Notice how, between the curly braces, there’s a comma-separated list of key-value pairs, with a colon separating each key from its value.
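To see how those key-value pairs look from Python, here is a minimal sketch that parses an abbreviated copy of the response above with the standard library’s json module. The variable names are only illustrative, and most keys have been left out for brevity.

```python
import json

# Abbreviated copy of the sample response above; only a few keys are kept.
response_text = """
{
    "symbol": "NFLX",
    "companyName": "Netflix, Inc.",
    "latestPrice": 318.83,
    "iexRealtimePrice": null
}
"""

quote = json.loads(response_text)  # a JSON object becomes a Python dict

print(quote["companyName"])        # Netflix, Inc.
print(quote["latestPrice"])        # 318.83 (a float)
print(quote["iexRealtimePrice"])   # None (JSON's null)
```

Each key is a str, and each value becomes the closest Python type: str, float, int, or None.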
Distribution
Downloading
$ wget https://cdn.cs50.net/2019/fall/tracks/web/finance/finance.zip
$ unzip finance.zip
$ rm finance.zip
$ cd finance
$ ls
application.py helpers.py static/
finance.db requirements.txt templates/
Configuring
Before getting started on this assignment, we’ll need to register for an API key in order to be able to query IEX’s data. To do so, follow these
steps:
$ export API_KEY=value
where value is that (pasted) value, without any space immediately before or after the = . You also may wish to paste that value in a text
document somewhere, in case you need it again later.
Running
$ flask run
Visit the URL outputted by flask to see the distribution code in action. You won’t be able to log in or register, though, just yet!
Via CS50’s file browser, double-click finance.db in order to open it with phpLiteAdmin. Notice how finance.db comes with a table called
users . Take a look at its structure (i.e., schema). Notice how, by default, new users will receive $10,000 in cash. But there aren’t (yet!) any
users (i.e., rows) therein to browse.
Here on out, if you’d prefer a command line, you’re welcome to use sqlite3 instead of phpLiteAdmin.
Understanding
application.py
Open up application.py . Atop the file are a bunch of imports, among them CS50’s SQL module and a few helper functions. More on those
soon.
After configuring Flask (http://flask.pocoo.org/), notice how this file disables caching of responses (provided you’re in debugging mode, which
you are by default on CS50 IDE), lest you make a change to some file but your browser not notice. Notice next how it configures Jinja
(http://jinja.pocoo.org/) with a custom “filter,” usd , a function (defined in helpers.py ) that will make it easier to format values as US dollars
(USD). It then further configures Flask to store sessions (http://flask.pocoo.org/docs/1.0/quickstart/#sessions) on the local filesystem (i.e., disk)
as opposed to storing them inside of (digitally signed) cookies, which is Flask’s default. The file then configures CS50’s SQL module to use
finance.db , a SQLite database whose contents we’ll soon see!
Thereafter are a whole bunch of routes, only two of which are fully implemented: login and logout . Read through the implementation of
login first. Notice how it uses db.execute (from CS50’s library) to query finance.db . And notice how it uses check_password_hash to
compare hashes of users’ passwords. Finally, notice how login “remembers” that a user is logged in by storing his or her user_id , an
INTEGER, in session . That way, any of this file’s routes can check which user, if any, is logged in. Meanwhile, notice how logout simply clears
session , effectively logging a user out.
Notice how most routes are “decorated” with @login_required (a function defined in helpers.py too). That decorator ensures that, if a user
tries to visit any of those routes, he or she will first be redirected to login so as to log in.
Notice too how most routes support GET and POST. Even so, most of them (for now!) simply return an “apology,” since they’re not yet
implemented.
helpers.py
Next take a look at helpers.py . Ah, there’s the implementation of apology . Notice how it ultimately renders a template, apology.html . It
also happens to define within itself another function, escape , that it simply uses to replace special characters in apologies. By defining
escape inside of apology , we’ve scoped the former to the latter alone; no other functions will be able (or need) to call it.
Next in the file is login_required . No worries if this one’s a bit cryptic, but if you’ve ever wondered how a function can return another
function, here’s an example!
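If the idea of a function returning another function feels abstract, the sketch below may help. It is not the distribution code’s login_required : a plain dict stands in for Flask’s session , and a string stands in for the redirect, both assumptions made so that the example runs without Flask installed.

```python
from functools import wraps

# Stand-in for Flask's session object (an assumption made so that this
# sketch runs on its own).
session = {}


def login_required(f):
    """Return a wrapped version of f that insists on a logged-in user."""
    @wraps(f)
    def decorated_function(*args, **kwargs):
        if session.get("user_id") is None:
            # A real decorator would redirect to the login route here.
            return "redirect to /login"
        return f(*args, **kwargs)
    return decorated_function


@login_required
def index():
    return "portfolio page"


print(index())           # redirect to /login
session["user_id"] = 1
print(index())           # portfolio page
```

Writing @login_required above index is shorthand for index = login_required(index) , so every call to index passes through decorated_function first.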
Thereafter is lookup , a function that, given a symbol (e.g., NFLX), returns a stock quote for a company in the form of a dict with three keys:
name , whose value is a str , the name of the company; price , whose value is a float ; and symbol , whose value is a str , a canonicalized
(uppercase) version of a stock’s symbol, irrespective of how that symbol was capitalized when passed into lookup .
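To make the shape of that dict concrete, here is a hedged sketch of how a full quote might be reduced to those three keys. The helper name to_quote is hypothetical, and the field names companyName and latestPrice are taken from the sample JSON response; the actual lookup in helpers.py may differ.

```python
def to_quote(payload, requested_symbol):
    """Reduce a full quote payload to the keys name, price, and symbol.

    Assumption: payload looks like the sample JSON response, with
    companyName and latestPrice among its keys.
    """
    return {
        "name": payload["companyName"],
        "price": float(payload["latestPrice"]),
        "symbol": requested_symbol.upper(),  # canonicalize capitalization
    }


sample = {"symbol": "NFLX", "companyName": "Netflix, Inc.", "latestPrice": 318.83}
print(to_quote(sample, "nflx"))
# {'name': 'Netflix, Inc.', 'price': 318.83, 'symbol': 'NFLX'}
```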
Last in the file is usd , a short function that simply formats a float as USD (e.g., 1234.56 is formatted as $1,234.56 ).
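A formatter like that can be written in one line with Python’s format specification. The sketch below is one plausible way to do it, not necessarily the exact code in helpers.py .

```python
def usd(value):
    """Format value as USD, with thousands separators and two decimals."""
    return f"${value:,.2f}"


print(usd(1234.56))  # $1,234.56
```

In the format specification, the comma requests thousands separators and .2f requests exactly two digits after the decimal point.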
requirements.txt
Next take a quick look at requirements.txt . That file simply prescribes the packages on which this app will depend.
static/
Glance too at static/ , inside of which is styles.css . That’s where some initial CSS lives. You’re welcome to alter it as you see fit.
templates/
Now look in templates/ . In login.html is, essentially, just an HTML form, stylized with Bootstrap (http://getbootstrap.com/). In
apology.html , meanwhile, is a template for an apology. Recall that apology in helpers.py took two arguments: message , which was passed
to render_template as the value of bottom , and, optionally, code , which was passed to render_template as the value of top . Notice in
apology.html how those values are ultimately used! And here’s why (https://github.com/jacebrowning/memegen). 0:-)
Last up is layout.html . It’s a bit bigger than usual, but that’s mostly because it comes with a fancy, mobile-friendly “navbar” (navigation bar),
also based on Bootstrap. Notice how it defines a block, main , inside of which templates (including apology.html and login.html ) shall go. It
also includes support for Flask’s message flashing (http://flask.pocoo.org/docs/1.0/patterns/flashing/) so that you can relay messages from one
route to another for the user to see.
Specification
register
Complete the implementation of register in such a way that it allows a user to register for an account via a form.
Require that a user input a username, implemented as a text field whose name is username . Render an apology if the user’s input is blank
or the username already exists.
Require that a user input a password, implemented as a text field whose name is password , and then that same password again,
implemented as a text field whose name is confirmation . Render an apology if either input is blank or the passwords do not match.
Submit the user’s input via POST to /register .
INSERT the new user into users , storing a hash of the user’s password, not the password itself. Hash the user’s password with
generate_password_hash (http://werkzeug.pocoo.org/docs/0.14/utils/#werkzeug.security.generate_password_hash).
Odds are you’ll want to create a new template (e.g., register.html ) that’s quite similar to login.html .
Once you’ve implemented register correctly, you should be able to register for an account and log in (since login and logout already
work)! And you should be able to see your rows via phpLiteAdmin or sqlite3 .
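The checks that register must perform before any INSERT can be sketched as a plain function. The name registration_error and the apology messages are hypothetical, and the hashing step with generate_password_hash is deliberately left out so that the example depends only on Python itself.

```python
def registration_error(username, password, confirmation, existing_usernames):
    """Return an apology message, or None if the input is acceptable.

    existing_usernames is any collection of usernames already taken.
    Form values may be None or "" when a field was left blank.
    """
    if not username:
        return "must provide username"
    if username in existing_usernames:
        return "username already exists"
    if not password or not confirmation:
        return "must provide password and confirmation"
    if password != confirmation:
        return "passwords do not match"
    return None


taken = {"alice"}
print(registration_error("", "pw", "pw", taken))        # must provide username
print(registration_error("alice", "pw", "pw", taken))   # username already exists
print(registration_error("bob", "pw", "pW", taken))     # passwords do not match
print(registration_error("bob", "pw", "pw", taken))     # None
```

In a route, a non-None result would be passed to apology ; otherwise the route would hash the password and insert the new row.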
quote
Complete the implementation of quote in such a way that it allows a user to look up a stock’s current price.
Require that a user input a stock’s symbol, implemented as a text field whose name is symbol .
Submit the user’s input via POST to /quote .
Odds are you’ll want to create two new templates (e.g., quote.html and quoted.html ). When a user visits /quote via GET, render one of
those templates, inside of which should be an HTML form that submits to /quote via POST. In response to a POST, quote can render that
second template, embedding within it one or more values from lookup .
buy
Complete the implementation of buy in such a way that it enables a user to buy stocks.
Require that a user input a stock’s symbol, implemented as a text field whose name is symbol . Render an apology if the input is blank or
the symbol does not exist (as per the return value of lookup ).
Require that a user input a number of shares, implemented as a text field whose name is shares . Render an apology if the input is not a
positive integer.
Submit the user’s input via POST to /buy .
Odds are you’ll want to call lookup to look up a stock’s current price.
Odds are you’ll want to SELECT how much cash the user currently has in users .
Add one or more new tables to finance.db via which to keep track of the purchase. Store enough information so that you know who
bought what at what price and when.
Once you’ve implemented buy correctly, you should be able to see users’ purchases in your new table(s) via phpLiteAdmin or sqlite3 .
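One possible design, sketched below with Python’s built-in sqlite3 module rather than cs50.SQL , records each purchase as a row in a single transactions table. The table name, its columns, and the simplified users table are all assumptions for illustration; your own schema may well differ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # lets rows be indexed by column name

# Simplified stand-in for the users table, plus one hypothetical design
# for recording who bought what, at what price, and when.
db.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL UNIQUE,
        cash NUMERIC NOT NULL DEFAULT 10000.00
    );
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES users(id),
        symbol TEXT NOT NULL,
        shares INTEGER NOT NULL,
        price NUMERIC NOT NULL,
        transacted TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
""")
db.execute("INSERT INTO users (username) VALUES (:username)", {"username": "alice"})


def buy(user_id, symbol, shares, price):
    """Record a purchase if the user can afford it; return True on success."""
    row = db.execute("SELECT cash FROM users WHERE id = :id", {"id": user_id}).fetchone()
    cost = shares * price
    if row is None or row["cash"] < cost:
        return False
    db.execute("UPDATE users SET cash = cash - :cost WHERE id = :id",
               {"cost": cost, "id": user_id})
    db.execute(
        "INSERT INTO transactions (user_id, symbol, shares, price) "
        "VALUES (:user_id, :symbol, :shares, :price)",
        {"user_id": user_id, "symbol": symbol, "shares": shares, "price": price},
    )
    db.commit()
    return True


print(buy(1, "NFLX", 2, 318.83))     # True
print(buy(1, "NFLX", 1000, 318.83))  # False: not enough cash
```

Storing the price paid alongside a timestamp means the same table can later answer both “what does this user own?” and “what happened when?”.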
index
Complete the implementation of index in such a way that it displays an HTML table summarizing, for the user currently logged in, which
stocks the user owns, the numbers of shares owned, the current price of each stock, and the total value of each holding (i.e., shares times price).
Also display the user’s current cash balance along with a grand total (i.e., stocks’ total value plus cash).
Odds are you’ll want to execute multiple SELECT s. Depending on how you implement your table(s), you might find GROUP BY
(https://www.google.com/search?q=SQLite+GROUP+BY), HAVING (https://www.google.com/search?q=SQLite+HAVING), SUM
(https://www.google.com/search?q=SQLite+SUM), and/or WHERE (https://www.google.com/search?q=SQLite+WHERE) of interest.
Odds are you’ll want to call lookup for each stock.
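As a sketch of how those clauses can combine, the example below sums shares per symbol for one user, again using the built-in sqlite3 module. It assumes a hypothetical transactions table in which sales are stored as negative share counts; the prices dict merely stands in for calling lookup on each symbol.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE transactions (user_id INTEGER, symbol TEXT, shares INTEGER)")

# Hypothetical history: negative share counts represent sales.
history = [(1, "NFLX", 2), (1, "AAPL", 5), (1, "NFLX", -1), (1, "AAPL", -5), (2, "NFLX", 7)]
db.executemany(
    "INSERT INTO transactions (user_id, symbol, shares) VALUES (:user_id, :symbol, :shares)",
    [{"user_id": u, "symbol": s, "shares": n} for u, s, n in history],
)

rows = db.execute(
    "SELECT symbol, SUM(shares) AS shares FROM transactions "
    "WHERE user_id = :user_id GROUP BY symbol HAVING SUM(shares) > 0 ORDER BY symbol",
    {"user_id": 1},
).fetchall()

prices = {"NFLX": 318.83}  # stand-in for calling lookup on each symbol
holdings = []
for row in rows:
    holding = dict(row)
    holding["price"] = prices[holding["symbol"]]
    holding["total"] = holding["shares"] * holding["price"]
    holdings.append(holding)

print(holdings)
# [{'symbol': 'NFLX', 'shares': 1, 'price': 318.83, 'total': 318.83}]
```

WHERE restricts the rows to one user, GROUP BY collapses them to one row per symbol, and HAVING drops any stock whose shares have all been sold.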
sell
Complete the implementation of sell in such a way that it enables a user to sell shares of a stock (that he or she owns).
Require that a user input a stock’s symbol, implemented as a select menu whose name is symbol . Render an apology if the user fails to
select a stock or if (somehow, once submitted) the user does not own any shares of that stock.
Require that a user input a number of shares, implemented as a text field whose name is shares . Render an apology if the input is not a
positive integer or if the user does not own that many shares of the stock.
Submit the user’s input via POST to /sell .
You don’t need to worry about race conditions (or use transactions).
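The ownership checks for sell might be sketched as follows. The function name sale_error , the messages, and the holdings dict (symbol mapped to shares owned) are all hypothetical; the sketch also assumes that shares has already been validated as a positive integer.

```python
def sale_error(holdings, symbol, shares):
    """Return an apology message, or None if the sale may proceed.

    holdings maps each symbol the user owns to a share count, and
    shares is assumed to already be a positive int.
    """
    if not symbol:
        return "must select a stock"
    owned = holdings.get(symbol, 0)
    if owned <= 0:
        return "no shares of that stock owned"
    if shares > owned:
        return "too many shares"
    return None


portfolio = {"NFLX": 3}
print(sale_error(portfolio, "", 1))      # must select a stock
print(sale_error(portfolio, "AAPL", 1))  # no shares of that stock owned
print(sale_error(portfolio, "NFLX", 5))  # too many shares
print(sale_error(portfolio, "NFLX", 3))  # None
```

Checking ownership on the server matters even with a select menu, since a user can submit any symbol by crafting the request by hand.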
history
Complete the implementation of history in such a way that it displays an HTML table summarizing all of a user’s transactions ever, listing
row by row each and every buy and every sell.
For each row, make clear whether a stock was bought or sold and include the stock’s symbol, the (purchase or sale) price, the number of
shares bought or sold, and the date and time at which the transaction occurred.
You might need to alter the table you created for buy or supplement it with an additional table. Try to minimize redundancies.
personal touch
Implement at least one personal touch of your choice:
Testing
inputting alphabetical strings into forms when only numbers are expected,
inputting zero or negative numbers into forms when only positive numbers are expected,
inputting floating-point values into forms when only integers are expected,
trying to spend more cash than a user has,
trying to sell more shares than a user has,
inputting an invalid stock symbol, and
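Several of the inputs above can be rejected by one small helper. The sketch below, whose name parse_shares is hypothetical, accepts only strings of decimal digits that represent a positive integer and returns None for everything else.

```python
def parse_shares(text):
    """Return text as a positive int, or None if it is not one."""
    # isdecimal is False for "", "abc", "-3", and "1.5" alike.
    if not isinstance(text, str) or not text.strip().isdecimal():
        return None
    shares = int(text.strip())
    return shares if shares > 0 else None


for candidate in ["abc", "0", "-3", "1.5", "7"]:
    print(repr(candidate), parse_shares(candidate))
# 'abc' None
# '0' None
# '-3' None
# '1.5' None
# '7' 7
```

A route could render an apology whenever the result is None .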
Staff’s Solution
You’re welcome to stylize your own app differently, but here’s what the staff’s solution looks like!
https://finance.cs50.net/
Feel free to register for an account and play around. Do not use a password that you use on other sites.
Hints
Within cs50.SQL is an execute method whose first argument should be a str of SQL. If that str contains named parameters to which
values should be bound, those values can be provided as additional named parameters to execute . See the implementation of login for
one such example. The return value of execute is as follows:
If str is a SELECT , then execute returns a list of zero or more dict objects, inside of which are keys and values representing a
table’s fields and cells, respectively.
If str is an INSERT , and the table into which data was inserted contains an autoincrementing PRIMARY KEY , then execute returns
the value of the newly inserted row’s primary key.
If str is a DELETE or an UPDATE , then execute returns the number of rows deleted or updated by str .
If an INSERT or UPDATE would violate some constraint (e.g., a UNIQUE index), then execute returns None . In cases of error, execute raises
a RuntimeError .
Recall that cs50.SQL will log to your terminal window any queries that you execute via execute (so that you can confirm whether they’re
as intended).
Be sure to use named bind parameters (i.e., a paramstyle (https://www.python.org/dev/peps/pep-0249/#paramstyle) of named ) when
calling CS50’s execute method, a la WHERE name=:name . Do not use f-strings, format
(https://docs.python.org/3.6/library/functions.html#format), or + (i.e., concatenation), lest you risk a SQL injection attack.
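To see why that advice matters, compare an f-string with a named bind parameter in the sketch below. It uses Python’s built-in sqlite3 module, which takes the bound values as a dict; per the hint above, cs50.SQL instead takes them as additional named arguments to execute . The table and data are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (:name)", {"name": "alice"})
db.execute("INSERT INTO users (name) VALUES (:name)", {"name": "bob"})

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL itself, so the WHERE clause
# is rewritten and every row matches.
unsafe = db.execute(f"SELECT name FROM users WHERE name = '{malicious}'").fetchall()

# Safe: the input is bound as a value and compared literally.
safe = db.execute("SELECT name FROM users WHERE name = :name", {"name": malicious}).fetchall()

print(len(unsafe))  # 2
print(len(safe))    # 0
```

With the bind parameter, the quotation marks in the input are just characters in a string rather than SQL syntax.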
If (and only if) already comfortable with SQL, you’re welcome to use SQLAlchemy Core (http://docs.sqlalchemy.org/en/latest/index.html) or
Flask-SQLAlchemy (http://flask-sqlalchemy.pocoo.org/) (i.e., SQLAlchemy ORM (http://docs.sqlalchemy.org/en/latest/index.html)) instead of
cs50.SQL .
You’re welcome to add additional static files to static/ .
Odds are you’ll want to consult Jinja’s documentation (http://jinja.pocoo.org/docs/dev/) when implementing your templates.
It is reasonable to ask others to try out (and try to trigger errors in) your site.
You’re welcome to alter the aesthetics of the sites, as via
https://bootswatch.com/,
https://getbootstrap.com/docs/4.1/content/,
https://getbootstrap.com/docs/4.1/components/, and/or
https://memegen.link/.
FAQs
By default, flask looks for a file called application.py in your current working directory (because we’ve configured the value of FLASK_APP ,
an environment variable, to be application.py ). If seeing this error, odds are you’ve run flask in the wrong directory!
How to Submit
Execute the below from within your finance directory, logging in with your GitHub username and password when prompted. For security,
you’ll see asterisks ( * ) instead of the actual characters in your password.
submit50 cs50/problems/2020/x/tracks/web/finance
Final Project
The climax of this course is its final project. The final project is your opportunity to take your newfound savvy with programming out for a spin
and develop your very own piece of software. So long as your project draws upon this course’s lessons, the nature of your project is entirely up
to you. You may implement your project in any language(s). You are welcome to utilize infrastructure other than the CS50 IDE. All that we ask is
that you build something of interest to you, that you solve an actual problem, that you impact your community, or that you change the world.
Strive to create something that outlives this course.
Inasmuch as software development is rarely a one-person effort, you are allowed an opportunity to collaborate with one or two classmates for
this final project. Needless to say, it is expected that every student in any such group contribute equally to the design and implementation of
that group’s project. Moreover, it is expected that the scope of a two- or three-person group’s project be, respectively, twice or thrice that of a
typical one-person project. A one-person project, mind you, should entail more time and effort than is required by each of the course’s problem
sets.
Ideas
a web-based application using JavaScript, Python, and SQL, based in part on the web track’s distribution code
an iOS app using Swift
a game using Lua with LÖVE
an Android app using Java
a Chrome extension using JavaScript
a command-line program using C
a hardware-based application for which you program some device
…
How to Submit
Step 1 of 2
Create a README.md text file that explains your project and save it in a new folder called project in your ~/ directory. Note that your project
source code itself does not need to be submitted, but this README.md file must.
Execute the below from within your ~/project directory, logging in with your GitHub username and password when prompted. For security,
you’ll see asterisks instead of the actual characters in your password.
submit50 cs50/problems/2020/x/project
Step 2 of 2
Submit a short video (that’s no more than 2 minutes in length) in which you present your project to the world, as with slides, screenshots,
voiceover, and/or live action. Your video should somehow include your project’s title, your name, your city and country, and any other details
that you’d like to convey to viewers. See https://www.howtogeek.com/205742/how-to-record-your-windows-mac-linux-android-or-ios-screen/
for tips on how to make a “screencast,” though you’re welcome to use an actual camera. Upload your video to YouTube (or, if blocked in your
country, a similar site) and take note of its URL; it’s fine to flag it as “unlisted,” but don’t flag it as “private.”
That’s it! Your project should be graded within a few minutes. If you don’t see any results in your gradebook, best to resubmit (running the
above submit50 command) with only your README.md file this time. No need to resubmit your form.