Harvard CS50

Harvard's CS50 is a renowned introductory computer science course taught by Dr. David Malan, focusing on problem-solving and programming skills. The course emphasizes algorithmic thinking and provides resources for self-learning, catering to students with varying levels of prior experience. It aims to teach students how to think methodically and understand the fundamentals of computer science through practical programming exercises.

Uploaded by

heartnight9984
Copyright
© All Rights Reserved

# Harvard CS50 – Full Computer Science University Course - YouTube

https://www.youtube.com/watch?v=8mAITcNt710

## Transcript:

- [00:00](https://www.youtube.com/watch?v=undefined&t=0s) If you want to learn about computer


science and the art of programming, this course is where to start. CS50 is considered by many to be one
of the best computer science courses in the world. This is a Harvard University course taught by Dr. David
Malan and we are proud to bring it to the freeCodeCamp YouTube channel.

- [00:17](https://www.youtube.com/watch?v=undefined&t=17s) Throughout a series of lectures, Dr.


Malan will teach you how to think algorithmically and solve problems efficiently. And make sure to check
the description for a lot of extra resources that go along with the course. [MUSIC PLAYING]

- [01:46](https://www.youtube.com/watch?v=undefined&t=106s) DAVID MALAN: All right, this is CS50,


Harvard University's introduction to the intellectual enterprises of computer science and the art of
programming, back here on campus in beautiful Sanders Theatre for the first time in quite a while. So
welcome to the class. My name is David-- OK. [CHEERING AND APPLAUSE] So my name is David Malan.

- [02:13](https://www.youtube.com/watch?v=undefined&t=133s) And I took this class myself some time


ago, but almost didn't. It was sophomore fall and I was sitting in on the class. And I was a little curious
but, eh, it didn't really feel like the field for me. I was definitely a computer person, but computer science
felt like something else altogether. And I only got up the nerve to take the class, ultimately, because the
professor at the time, Brian Kernighan, allowed me to take the class pass/fail, initially.

- [02:36](https://www.youtube.com/watch?v=undefined&t=156s) And that is what made all the


difference. I quickly found that computer science is not just about programming and working in isolation
on your computer. It's really about problem solving more generally. And there was something about
homework, frankly, that was, like, actually fun for perhaps the first time in, what, 19 years.

- [02:51](https://www.youtube.com/watch?v=undefined&t=171s) And there was something about this


ability that I discovered, along with all of my classmates, to actually create something and bring a
computer to life to solve a problem, and sort of bring to bear something that I'd been using every day
but didn't really know how to harness, that's been gratifying ever since, and definitely challenging and
frustrating.

- [03:08](https://www.youtube.com/watch?v=undefined&t=188s) Like, to this day, all these years later,


you're going to run up against mistakes, otherwise known as bugs, in programming, that just drive you
nuts. And you feel like you've hit a wall. But the trick really is to give it enough time, to take a step back,
take a break when you need to. And there's nothing better, I daresay, than that sense of gratification and
pride, really, when you get something to work, and in a class like this, present, ultimately, at term's end,
something like your very own final project.

- [03:32](https://www.youtube.com/watch?v=undefined&t=212s) Now, this isn't to say that I took to it


100% perfectly. In fact, just this past week, I looked in my old CS50 binder, which I still have from some
25 years ago, and took a photo of what was apparently the very first program that I wrote and
submitted, and quickly received minus 2 points on. But this is a program that we'll soon see in the
coming days that does something quite simply like print "Hello, CS50," in this case, to the screen.

- [03:58](https://www.youtube.com/watch?v=undefined&t=238s) And to be fair, I technically hadn't


really followed the directions, which is why I lost those couple of points. But if you just look at this,
especially if you've never programmed before, you might have heard about programming languages but
you've never typed something like this out, undoubtedly it's going to look cryptic.

- [04:11](https://www.youtube.com/watch?v=undefined&t=251s) But unlike human languages, frankly,


which are a lot more sophisticated, a lot more vocabulary, a lot more grammatical rules, programming,
once you start to wrap your mind around what it is and how it works and what these various languages
are, it's so easy, you'll see, after a few months of a class like this, to start teaching yourself, subsequently,
other languages, as they may come, in the coming years as well.

- [04:33](https://www.youtube.com/watch?v=undefined&t=273s) So what ultimately matters in this


particular course is not so much where you end up relative to your classmates but where you end up
relative to yourself when you began. And indeed, you'll begin today. And the only experience that
matters ultimately in this class is your own. And so, consider where you are today.

- [04:49](https://www.youtube.com/watch?v=undefined&t=289s) Consider, perhaps, just how cryptic


something like that looked a few seconds ago. And take comfort in knowing just some months from now
all of that will be within your own grasp. And if you're thinking that, OK, surely the person in front of me,
to the left, to the right, behind me, knows more than me, that's statistically not the case.

- [05:05](https://www.youtube.com/watch?v=undefined&t=305s) 2/3 of CS50 students have never taken


a CS course before, which is to say, you're in very good company throughout this whole term. So then,
what is computer science? I claim that it's problem solving. And the upside of that is that problem
solving is something we sort of do all the time. But a computer science class, learning to program, I think
kind of cleans up your thoughts.

- [05:28](https://www.youtube.com/watch?v=undefined&t=328s) It helps you learn how to think more


methodically, more carefully, more correctly, more precisely. Because, honestly, the computer is not
going to do what you want unless you are correct and precise and methodical. And so, as such, there's
these fringe benefits of just learning to think like a computer scientist and a programmer.

- [05:43](https://www.youtube.com/watch?v=undefined&t=343s) And it doesn't take all that much to


start doing so. This, for instance, is perhaps the simplest picture of computer science, sure, but really
problem solving in general. Problems are all about taking input, like the problem you want to solve. You
want to get the solution, a.k.a. output. And so, something interesting has got to be happening in here, in
here, when you're trying to get from those inputs to outputs.

- [06:03](https://www.youtube.com/watch?v=undefined&t=363s) Now, in the world of computers


specifically, we need to decide in advance how we represent these inputs and outputs. We all just need
to decide, whether it's Macs or PCs or phones or something else, that we're all going to speak some
common language, irrespective of our human languages as well. And you may very well know that
computers tend to speak only what language, so to speak? Assembly, one, but binary, two, might be your
go-to.

- [06:29](https://www.youtube.com/watch?v=undefined&t=389s) And binary, by implying two, means
that the world of computers has just two digits at its disposal, 0 and 1. And indeed, we humans have
many more than that, certainly not just zeros and ones alone. But a computer indeed only has zeros and
ones. And yet, somehow they can do so much. They can crunch numbers in Excel, send text messages,
create images and artwork and movies and more.

- [06:51](https://www.youtube.com/watch?v=undefined&t=411s) And so, how do you get from


something as simple as a few zeros, a few ones, to all of the stuff that we're doing today in our pockets
and laptops and desktops? Well, it turns out that we can start quite simply. If a computer were to want
to do something as simple as count, well, what could it do? Well, in our human world, we might count
doing this, like 1, 2, 3, 4, 5, using so-called unary notation, literally the digits on your fingers where one
finger represents one person in the room, if I'm, for instance, taking attendance.

- [07:19](https://www.youtube.com/watch?v=undefined&t=439s) Now, we humans would typically


actually count 1, 2, 3, 4, 5, 6. And we'd go past just those five digits and count much higher, using zeros
through nines. But computers, somehow, only have these zeros and ones. So if a computer only
somehow speaks binary, zeros and ones, how does it even count past the number 1? Well, here are 3
zeros, of course.

- [07:39](https://www.youtube.com/watch?v=undefined&t=459s) And if you translate this number in


binary, 000, to a more familiar number in decimal, we would just call this zero. Enough said. If we were
to represent, with a computer, the number 1, it would actually be 001, which, not surprisingly, is exactly
the same as we might do in our human world, but we might not bother writing out the two zeros at the
beginning.

- [08:00](https://www.youtube.com/watch?v=undefined&t=480s) But a computer, now, if it wants to


count as high as two, it doesn't have the digit 2. And so it has to use a different pattern of zeros and
ones. And that happens to be 010. So this is not 10 with a zero in front of it. It's indeed zero one zero in
the context of binary. And if we want to count higher now than two, we're going to have to tweak these
zeros and ones further to get 3.

- [08:20](https://www.youtube.com/watch?v=undefined&t=500s) And then if we want 4 or 5 or 6 or 7,


we're just kind of toggling these zeros and ones, a.k.a. bits, for binary digits that represent, via these
different patterns, different numbers that you and I, as humans, know, of course, as the so-called
decimal system, 0 through 9, dec implying 10, 10 digits, those zeros through nine.
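
As a rough illustration of the counting described above, the following Python sketch lists every three-bit pattern next to the decimal number it stands for. The function name `count_in_binary` is made up for this example and does not come from the lecture.

```python
def count_in_binary(bits: int) -> list[str]:
    """Return every pattern of `bits` binary digits, in counting order."""
    return [format(number, f"0{bits}b") for number in range(2 ** bits)]


# Print each decimal value next to its three-bit pattern, from 0 up to 7.
for decimal, pattern in enumerate(count_in_binary(3)):
    print(decimal, pattern)
```

With three bits the list runs from `000` to `111`, which is why the count stops at 7.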

- [08:40](https://www.youtube.com/watch?v=undefined&t=520s) So why that particular pattern? And


why these particular zeros and ones? Well, it turns out that representing one thing or the other is just
really simple for a computer. Why? At the end of the day, they're powered by electricity. And it's a really
simple thing to just either store some electricity or don't store some electricity.

- [08:58](https://www.youtube.com/watch?v=undefined&t=538s) Like, that's as simple as the world can


get, on or off. 1 or 0, so to speak. So, in fact, inside of a computer, a phone, anything these days that's
electronic, pretty much, is some number of switches, otherwise known as transistors. And they're tiny.
You've got thousands, millions of them in your Mac or PC or phone these days.

- [09:15](https://www.youtube.com/watch?v=undefined&t=555s) And these are just tiny little switches


that can get turned on and off. And by turning those things on and off in patterns, a computer can count
from 0 on up to 7, and even higher than that. And so these switches, really, you can think of as being like
switches like this. Let me just borrow one of our little stage lights here.

- [09:30](https://www.youtube.com/watch?v=undefined&t=570s) Here's a light bulb. It's currently off.


And so, I could just think of this as representing, in my laptop, a transistor, a switch, representing 0. But if
I allow some electricity to flow, now I, in fact, have a 1. Well, how do I count higher than 1? I, of course,
need another light bulb. So let me grab another one here.

- [09:49](https://www.youtube.com/watch?v=undefined&t=589s) And if I put it in that same kind of


pattern, I don't want to just do this. That's sort of the old finger counting way of unary, just 1, 2. I want to
actually take into account the pattern of these things being on and off. So if this was one a moment ago,
what I think I did earlier was I turned it off and let the next one over be on, a.k.a.

- [10:11](https://www.youtube.com/watch?v=undefined&t=611s) 010. And let me get us a third bit, if


you will. And that feels like enough. Here is that same pattern now, starting at the beginning with 3. So
here is 000. Here is 001. Here is 010, a.k.a., in our human world of decimal, 2. And then we could, of
course, keep counting further. This now would be 3 and dot dot dot.

- [10:38](https://www.youtube.com/watch?v=undefined&t=638s) If this other bulb now goes on, and


that switch is turned and all three stay on-- this, again, was what number? AUDIENCE: Seven. DAVID
MALAN: OK, so, seven. So it's just as simple, relatively, as that, if you will. But how is it that these
patterns came to be? Well, these patterns actually follow something very familiar.

- [10:57](https://www.youtube.com/watch?v=undefined&t=657s) You and I don't really think about it at


this level anymore because we've probably been doing math and numbers since grade school or
whatnot. But if we consider something in decimal, like the number 123, I immediately jump to that. This
looks like 123 in decimal. But why? It's really just three symbols, a 1, a 2 with a bit of curve, a 3 with a
couple of curves, that you and I now instinctively just assign meaning to.

- [11:22](https://www.youtube.com/watch?v=undefined&t=682s) But if we do rewind a few years, that


is one hundred twenty-three because you're assigning meaning to each of these columns. The 3 is in the
so-called ones place. The 2 is in the so-called tens place. And the 1 is in the so-called hundreds place.
And then the math ensues quickly in your head. This is technically 100 times 1, plus 10 times 2, plus 1
times 3, a.k.a.

- [11:47](https://www.youtube.com/watch?v=undefined&t=707s) 100 plus 20 plus 3. And there we get


the sort of mathematical notion we know as 123. Well, nicely enough, in binary, it's actually the same
thing. It's just these columns mean a little something different. If you use three digits in decimal, and you
have the ones place, the tens place, and the hundreds place, well, why was that 1, 10, and 100? They're
technically just powers of 10.

- [12:11](https://www.youtube.com/watch?v=undefined&t=731s) So 10 to the 0, 10 to the 1, 10 to the 2.


Why 10? Decimal system, "dec" meaning 10. You have 10 digits, 0 through 9. In the binary system,
if you're going to use three digits, just change the bases if you're using only zeros and ones. So now it's
powers of 2, 2 to the 0, 2 to the 1, 2 to the 2, a.k.a.

- [12:29](https://www.youtube.com/watch?v=undefined&t=749s) 1 and 2 and 4, respectively. And if you


keep going, it's going to be 8s column, 16s column, 32, 64, and so forth. So, why did we get these
patterns that we did? Here's your 000 because it's 4 times 0, 2 times 0, 1 times 0, obviously 0. This is why
we got the decimal number 1 in binary. This is why we got the number 2 in binary, because it's 4 times 0,
plus 2 times 1, plus 1 times 0, and now 3, and now 4, and now 5, and now 6, and now 7.
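
The column arithmetic above works the same way in any base, so it can be sketched as one small Python function. This is only an illustration: the name `place_value` is hypothetical, and the sketch assumes every digit is one of the characters 0 through 9.

```python
def place_value(digits: str, base: int) -> int:
    """Add up digit * base**position, starting from the rightmost column."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


print(place_value("123", 10))  # 100*1 + 10*2 + 1*3 = 123
print(place_value("010", 2))   # 4*0 + 2*1 + 1*0 = 2
print(place_value("111", 2))   # 4*1 + 2*1 + 1*1 = 7
```

Only the `base` argument changes between the decimal and binary calls; the columns are powers of 10 in one case and powers of 2 in the other.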

- [13:02](https://www.youtube.com/watch?v=undefined&t=782s) And, of course, if you wanted to count


as high as 8, to be clear, what do you have to do? What does a computer need to do to count even
higher than 7? AUDIENCE: Add a bit. DAVID MALAN: Add a bit. Add another light bulb, another switch.
And, indeed, computers have standardized just how many zeros and ones, or bits or switches, they throw
at these kinds of problems.

- [13:19](https://www.youtube.com/watch?v=undefined&t=799s) And, in fact, most computers would


typically use at least eight at a time. And even if you're only counting as high as three or seven, you
would still use eight and have a whole bunch of zeros. But that's OK, because the computers these days
certainly have so many more, thousands, millions of transistors and switches that that's quite OK.

- [13:38](https://www.youtube.com/watch?v=undefined&t=818s) All right, so, with that said, if we can


now count as high as seven or, frankly, as high as we want, that only seems to make computers useful for
things like Excel, like number crunching. But computers, of course, let you send text messages, write
documents, and so much more. So how would a computer represent something like a letter, like the
letter A of the English alphabet, if, at the end of the day, all they have is switches? Any thoughts? Yeah.

- [14:04](https://www.youtube.com/watch?v=undefined&t=844s) AUDIENCE: You can represent letters


in numbers. DAVID MALAN: OK, so we could represent letters using numbers. OK, so what's a proposal?
What number should represent what? AUDIENCE: Say if you were starting at the beginning of the
alphabet, you could say 1 is A, 2 is B, 3 is C. DAVID MALAN: Perfect. Yeah, we just all have to agree
somehow that one number is going to represent one letter.

- [14:24](https://www.youtube.com/watch?v=undefined&t=864s) So 1 is A, 2 is B, 3 is C, Z is 26, and so


forth. Maybe we can even take into account uppercase and lowercase. We just have to agree and sort of
write it down in some global standard. And humans, indeed, did just that. They didn't use 1, 2, 3. It turns
out they started a little higher up. Capital A has been standardized as the number 65.

- [14:44](https://www.youtube.com/watch?v=undefined&t=884s) And capital B has been standardized


as the number 66. And you can kind of imagine how it goes up from there. And that's because whatever
you're representing, ultimately, can only be stored, at the end of the day, as zeros and ones. And so,
some humans in a room before, decided that capital A shall be 65, or, really, this pattern of zeros and
ones inside of every computer in the world, 01000001.
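
To make the mapping concrete, here is a minimal Python sketch that looks up the number for a capital letter and writes it as eight bits. `ord` and `format` are Python built-ins; the helper name `to_bits` is an assumption made for this example.

```python
def to_bits(character: str) -> str:
    """Look up a character's number and write it as eight bits."""
    return format(ord(character), "08b")


print(ord("A"), to_bits("A"))  # 65 01000001
print(ord("B"), to_bits("B"))  # 66 01000010
```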

- [15:09](https://www.youtube.com/watch?v=undefined&t=909s) So if that pattern of zeros and ones


ever appears in a computer, it might be interpreted then as indeed a capital letter A, eight of those bits
at a time. But I worry, just to be clear, we might have now created a problem. It might seem, if I play this
naively, that, OK, how do I now actually do math with the number 65, if Excel now displays 65 as an A, let
alone Bs and Cs?

- [15:34](https://www.youtube.com/watch?v=undefined&t=934s) So how might a computer do as


you've proposed, have this mapping from numbers to letters, but still support numbers? It feels like
we've given something up. Yeah? AUDIENCE: By having a prefix for letters? DAVID MALAN: By having a
prefix? AUDIENCE: You could have prefixes and suffixes. DAVID MALAN: OK, so we could perhaps have
some kind of prefix, like some pattern of zeros and ones-- I like this-- that indicates to the computer here
comes another pattern that represents a letter.

- [15:59](https://www.youtube.com/watch?v=undefined&t=959s) Here comes another pattern that


represents a number or a letter. So, not bad. I like that. Other thoughts? How might a computer
distinguish these two? Yeah. AUDIENCE: Have a different file format, so, like, odd text or just check the
graphic or-- DAVID MALAN: Indeed, and that's spot-on. Nothing wrong with what you suggested, but the
world generally does just that.

- [16:21](https://www.youtube.com/watch?v=undefined&t=981s) The reason we have all of these


different file formats in the world, like JPEG and GIF and PNGs and Word documents, .docx, and Excel
files and so forth, is because a bunch of humans got in a room and decided, well, in the context of this
type of file, or really, more specifically, in the context of this type of program, Excel versus Photoshop
versus Google Docs or the like, we shall interpret any patterns of zeros and ones as being maybe
numbers for Excel, maybe letters in, like, a text messaging program or Google Docs,

- [16:51](https://www.youtube.com/watch?v=undefined&t=1011s) or maybe even colors of the rainbow


in something like Photoshop and more. So it's context dependent. And we'll see, when we ourselves
start programming, you the programmer will ultimately provide some hints to the computer that tells
the computer, interpret it as follows. So, similar in spirit to that, but not quite as standardized with these
prefixes.
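
One way to picture this context dependence is to take the same three bytes and read them three different ways. The Python sketch below is only an illustration: the helper names are hypothetical, and treating three bytes as a red, green, blue triple is an assumption about how a graphics program might read them, not something specified at this point in the lecture.

```python
raw = bytes([65, 66, 67])  # the same three bytes throughout


def as_numbers(data: bytes) -> list[int]:
    """Read each byte as a plain number, as a spreadsheet might."""
    return list(data)


def as_text(data: bytes) -> str:
    """Read each byte as an ASCII character, as a messaging program might."""
    return data.decode("ascii")


def as_color(data: bytes) -> tuple[int, int, int]:
    """Read three bytes as one color (assumed red, green, blue order)."""
    red, green, blue = data
    return (red, green, blue)


print(as_numbers(raw))  # [65, 66, 67]
print(as_text(raw))     # ABC
print(as_color(raw))    # (65, 66, 67)
```

The bytes never change; only the function the programmer chooses to call decides what they mean.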

- [17:09](https://www.youtube.com/watch?v=undefined&t=1029s) So this system here actually has a


name: ASCII, the American Standard Code for Information Interchange. And indeed, it began here in the
US, and that's why it's actually a little biased toward A's through Z's and a bit of punctuation as well. And
that quickly became a problem. But if we start simply now, in English, the mapping itself is fairly
straightforward.

- [17:29](https://www.youtube.com/watch?v=undefined&t=1049s) So if A is 65, B is 66, and dot dot dot,


suppose that you received a text message, an email, from a friend, and underneath the hood, so to
speak, if you kind of looked inside the computer, what you technically received in this text or this email
happened to be the numbers 72, 73, 33, or, really, the underlying pattern of zeros and ones.

- [17:51](https://www.youtube.com/watch?v=undefined&t=1071s) What might your friend have sent


you as a message, if it's 72, 73, 33? AUDIENCE: Hey. DAVID MALAN: Hey? Close. AUDIENCE: Hi. DAVID
MALAN: Hi. It's, indeed, hi. Why? Well, apparently, according to this little cheat sheet, H is 72, I is 73. It's
not obvious from this chart what the 33 is, but indeed, this pattern represents "hi."

- [18:12](https://www.youtube.com/watch?v=undefined&t=1092s) And anyone want to guess, or if you


know, what 33 is? AUDIENCE: Exclamation point. DAVID MALAN: Exclamation point. And this is, frankly,
not the kind of thing most people know. But it's easily accessible by a nice user-friendly chart like this. So
this is an ASCII chart. When I said that we just need to write down this mapping earlier, this is what
people did.

- [18:28](https://www.youtube.com/watch?v=undefined&t=1108s) They wrote it down in a book or in a


chart. And, for instance, here is our 72 for H, here is our 73 for I, and here is our 33 for exclamation
point. And computers, Macs, PCs, iPhones, Android devices, just know this mapping by heart, if you will.
They've been designed to understand those letters. So here, I might have received "hi."

- [18:48](https://www.youtube.com/watch?v=undefined&t=1128s) Technically, what I've received is
these patterns of zeros and ones. But it's important to note that when you get these patterns of zeros
and ones in any format, be it email or text or a file, they do tend to come in standard lengths, with a
certain number of zeros and ones altogether. And this happens to be 8 plus 8, plus 8.
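
The decoding walked through above can be sketched in a few lines of Python. The function names `decode_message` and `bit_patterns` are made up for this example; `chr` is the Python built-in that maps a number back to its character.

```python
def decode_message(codes: list[int]) -> str:
    """Turn a list of ASCII numbers back into the text they stand for."""
    return "".join(chr(code) for code in codes)


def bit_patterns(codes: list[int]) -> list[str]:
    """Write each number as the eight bits that were actually received."""
    return [format(code, "08b") for code in codes]


message = [72, 73, 33]
print(decode_message(message))  # HI!
print(bit_patterns(message))    # ['01001000', '01001001', '00100001']
print(sum(len(bits) for bits in bit_patterns(message)))  # 24 bits: 8 + 8 + 8
```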

- [19:07](https://www.youtube.com/watch?v=undefined&t=1147s) So just to get the message "hi,


exclamation point," you would have received at least, it would seem, some 24 bits. But frankly, bits are
so tiny, literally and mathematically, that we don't tend to think or talk, generally, in terms of bits. You're
probably more familiar with bytes. B-Y-T-E-S is a byte, is a byte, is a byte.

- [19:27](https://www.youtube.com/watch?v=undefined&t=1167s) A byte is just 8 bits. And even those,


frankly, aren't that useful if we do out the math. How high can you count if you have eight bits? Anyone
know? Say it again? Higher than that. Unless you want to go negative, that's fine. 256, technically 255.
Long story short, if we actually got into the weeds of all of these zeros and ones, and we figured out
what 11111111 mathematically adds up to in decimal, it would indeed be 255, or less if you want to
represent negative numbers as well.
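
For readers who want to check the arithmetic, here is a small Python sketch. The helper names are hypothetical, and the signed range assumes the common two's complement scheme, which the lecture has not introduced at this point.

```python
def unsigned_max(bits: int) -> int:
    """Largest value when every one of `bits` bits is a 1."""
    return 2 ** bits - 1


def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest values, assuming two's complement (an assumption)."""
    return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)


print(128 + 64 + 32 + 16 + 8 + 4 + 2 + 1)  # 255, adding up all eight columns
print(int("11111111", 2))                   # 255
print(unsigned_max(8))                      # 255
print(signed_range(8))                      # (-128, 127)
```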

- [20:00](https://www.youtube.com/watch?v=undefined&t=1200s) So this is useful because now we can


speak, not just in terms of bytes but, if the files are bigger, kilobytes is thousands of bytes, megabytes is
millions of bytes, gigabytes is billions of bytes, terabytes are trillions of bytes, and so forth. We have a
vocabulary for these increasingly large quantities of data.
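
As a small illustration of that vocabulary, the Python sketch below picks the largest unit that fits a given number of bytes. It follows the thousand-based definitions used in the lecture; the names `UNITS` and `describe_size` are assumptions made for this example.

```python
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes"]


def describe_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps it under 1000."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at terabytes if none does.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1000


print(describe_size(3))              # 3 bytes
print(describe_size(24_000))         # 24 kilobytes
print(describe_size(5_000_000_000))  # 5 gigabytes
```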

- [20:21](https://www.youtube.com/watch?v=undefined&t=1221s) The problem is that, if you're using


ASCII and, therefore, eight bits or one byte per character, and originally, only seven, you can only
represent 255 characters. And that's actually 256 total characters, including zero. And that's fine if you're
using literally English, in this case, plus a bunch of punctuation.

- [20:39](https://www.youtube.com/watch?v=undefined&t=1239s) But there's many human languages


in the world that need many more symbols and, therefore, many more bits. So, thankfully, the world
decided that we'll indeed support not just the US English keyboard, but all of the accented characters
that you might want for some languages. And heck, if we use enough bits, zeros and ones, not only can
we represent all human languages in written form, as well as some emotions along the way, we can
capture the latter with these things called emojis.

- [21:07](https://www.youtube.com/watch?v=undefined&t=1267s) And indeed, these are very much in


vogue these days. You probably send and/or receive many of these things any given day. These are just
characters, like letters of an alphabet, patterns of zeros and ones that you're receiving, that the world
has also standardized. For instance, there are certain emojis that are represented with certain patterns
of bits.

- [21:25](https://www.youtube.com/watch?v=undefined&t=1285s) And when you receive them, your


phone, your laptop, your desktop, displays them as such. And this newer standard is called Unicode. So
it's a superset of what we called ASCII. And Unicode is just a mapping of many more numbers to many
more letters or characters, more generally, that might use eight bits for backwards compatibility with the
old way of doing things with ASCII, but they might also use 16 bits.

- [21:50](https://www.youtube.com/watch?v=undefined&t=1310s) And if you have 16 bits, you can


actually represent more than 65,000 possible letters. And that's getting up there. And heck, Unicode
might even use 32 bits to represent letters and numbers and punctuation symbols and emojis. And that
would give you up to 4 billion possibilities. And, I daresay, one of the reasons we see so many emojis
these days is we have so much room.
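
The figures quoted above follow from doubling once per bit, which a one-line Python function can confirm. The name `pattern_count` is hypothetical.

```python
def pattern_count(bits: int) -> int:
    """Number of distinct patterns of zeros and ones with this many bits."""
    return 2 ** bits


for bits in (8, 16, 32):
    print(bits, pattern_count(bits))
# 8 256
# 16 65536
# 32 4294967296
```

So 16 bits give a little more than 65,000 possibilities and 32 bits give a little more than 4 billion.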

- [22:11](https://www.youtube.com/watch?v=undefined&t=1331s) I mean, we've got room for billions


more, literally. So, in fact, just as a little bit of trivia, has anyone ever received this decimal number, or if
you prefer binary now, has anyone ever received this pattern of zeros and ones on your phone, in a text
or an email, perhaps this past year? Well, if you actually look this up, this esoteric sequence of zeros and
ones happens to represent face with medical mask.
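
The transcript does not reproduce the number shown on the slide, so the following Python sketch is only a guess at how to look it up. It assumes the emoji in question is the Unicode character named FACE WITH MEDICAL MASK and that the bytes are stored using the UTF-8 encoding; the function name `describe_emoji` is made up for this example.

```python
import unicodedata

MASK = "\U0001F637"  # assumed code point for "face with medical mask"


def describe_emoji(character: str) -> dict:
    """Collect the official name, code point, and UTF-8 bytes of a character."""
    encoded = character.encode("utf-8")
    return {
        "name": unicodedata.name(character),
        "code_point": ord(character),
        "utf8_bytes": encoded.hex(),
        "as_decimal": int.from_bytes(encoded, "big"),
        "as_bits": "".join(format(byte, "08b") for byte in encoded),
    }


for key, value in describe_emoji(MASK).items():
    print(key, value)
```

Under these assumptions the character takes four bytes, or 32 bits, which is the larger size the lecture mentions for Unicode.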

- [22:37](https://www.youtube.com/watch?v=undefined&t=1357s) And notice that if you've got an


iPhone or an Android device, you might be seeing different things. In fact, this is the Android version of
this, most recently. This is the iOS version of it, most recently. And there's bunches of other
interpretations by other companies as well. So Unicode, as a consortium, if you will, has standardized the
descriptions of what these things are.

- [22:59](https://www.youtube.com/watch?v=undefined&t=1379s) But the companies themselves,


manufacturers out there, have generally interpreted it as they see fit. And this can lead to some human
miscommunications. In fact, for like, literally, embarrassingly, like a year or two, I started being in the
habit of using the emoji that kind of looks like this because I thought it was like woo, happy face, or
whatever.

- [23:18](https://www.youtube.com/watch?v=undefined&t=1398s) I didn't realize this is the emoji for


hug because whatever device I was using sort of looks like this, not like this. And that's because of their
interpretation of the data. This has happened too when what was a gun became a water pistol in some
manufacturers' eyes. And so it's an interesting dichotomy between what information we all want to
represent and how we choose, ultimately, to represent it.

- [23:43](https://www.youtube.com/watch?v=undefined&t=1423s) Questions, then, on these


representations of formats, be it numbers or letters, or soon more. Yeah? AUDIENCE: Why is decimal
popular for a computer if binary is the basis for everything? DAVID MALAN: Sorry, why is what so
popular? AUDIENCE: Why is the decimal popular if binary is the fundamental-- DAVID MALAN: Yeah, so
we'll come back to this in a few weeks, in fact.

- [24:02](https://www.youtube.com/watch?v=undefined&t=1442s) There are other ways to represent


numbers. Binary is one. Decimal is another. Unary is another. And hexadecimal is yet a fourth that uses
16 total digits, literally 0 through 9 plus A, B, C, D, E, F. And somehow, you can similarly count even higher
with those. We'll see in a few weeks why this is compelling.
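
As a preview, hexadecimal counting can be sketched with the same divide-and-remainder idea that works for any base. This Python example is an illustration only; `HEX_DIGITS` and `to_hex` are names chosen here, and Python's built-in `int(text, 16)` is used to convert back.

```python
HEX_DIGITS = "0123456789ABCDEF"  # the 16 digits: 0 through 9, then A through F


def to_hex(number: int) -> str:
    """Write a non-negative whole number using the 16 hexadecimal digits."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = HEX_DIGITS[number % 16] + digits  # rightmost column first
        number //= 16
    return digits


print(to_hex(9))      # 9
print(to_hex(10))     # A
print(to_hex(16))     # 10
print(to_hex(255))    # FF
print(int("FF", 16))  # 255
```

Two hexadecimal digits are enough to reach 255, the largest value that fits in eight bits.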

- [24:21](https://www.youtube.com/watch?v=undefined&t=1461s) But hexadecimal, long story short,


uses four bits per digit. And so, four bits, if you have two digits in hex, that gives you eight. And it's just a
very convenient unit of measure. And it's also human convention in the world of files and other things.
But we'll come back to that soon. Other questions? AUDIENCE: Do the lights on the stage supposedly say
that-- DAVID MALAN: Do the lights on the stage supposedly say anything? Well, if we had thought in
advance to use maybe 64 light bulbs, that would seem to give us 8 total bytes on stage, 8 times 8,

- [24:52](https://www.youtube.com/watch?v=undefined&t=1492s) giving us just that. Maybe. Good


question. Other questions on 0's and 1's? It's a little bright in here. No? Oh, yes? Where everyone's
pointing somewhere specific. There we go. Sorry. Very bright in this corner. AUDIENCE: I was just going to
ask about the 255 bits, like with the maximum characters.

- [25:16](https://www.youtube.com/watch?v=undefined&t=1516s) [INAUDIBLE] DAVID MALAN: Ah,


sure, and we'll come back to this, in some form, in the coming days too, at a slower pace. We have,
with eight bits, two possible values for the first and then two for the next, two for the next, and so forth.
So that's 2 times 2 times 2. That's 2 to the eighth power total, which means you can have 256 total
possible patterns of zeros and ones.

- [25:37](https://www.youtube.com/watch?v=undefined&t=1537s) But as we'll see soon computer


scientists, programmers, and software often start counting at 0 by convention, and if you use one of those
patterns, 00000000, to represent the decimal number we know as zero, you only have 255 other patterns
left to count as high as therefore 255. That's all. Good question.
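
The counting argument above can be checked directly. This is a minimal sketch that enumerates every pattern of eight bits; the names `BITS` and `patterns` are made up for the example.

```python
from itertools import product

BITS = 8

# Every bit has two possible values, so there are 2 * 2 * ... * 2 patterns.
patterns = ["".join(bits) for bits in product("01", repeat=BITS)]

print(len(patterns))              # how many distinct patterns exist
print(patterns[0], patterns[-1])  # the all-zeros and all-ones patterns

# The all-zeros pattern is spent on the number 0, so the largest
# value that fits is one less than the number of patterns.
largest = int(patterns[-1], 2)
print(largest)
```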

- [25:59](https://www.youtube.com/watch?v=undefined&t=1559s) All right, so what then might we have


besides these emojis and letters and numbers? Well, we of course have things like colors and programs
like Photoshop and pictures and photos. Well let me ask the question again. How might a computer, do
you think, knowing what you know now, represent something like a color? Like what are our options if
all we've got are zeros and ones and switches? Yeah? AUDIENCE: RGB DAVID MALAN: RGB.

- [26:23](https://www.youtube.com/watch?v=undefined&t=1583s) RGB indeed is this acronym that


represents some amount of red and some amount of green and blue and indeed computers can
represent colors by just doing that. Remembering, for instance, this dot. This yellow dot on the screen
that might be part of any of those emojis these days, well that's some amount of red, some amount of
green, some amount of blue.

- [26:40](https://www.youtube.com/watch?v=undefined&t=1600s) And if you sort of mix those colors


together, you can indeed get a very specific one. And we'll see just that in a moment. So indeed
earlier on, humans only used seven bits total. And it was only once they decided, well, let's add an
eighth bit that they got extended ASCII and that was initially in part a solution to the same problem of
not having enough room, if you will, in those patterns of zeros and ones to represent all of the characters
that you might want.

- [27:06](https://www.youtube.com/watch?v=undefined&t=1626s) But even that wasn't enough and


that's why we've now gone up to 16 and 32 and long past 7. So if we come back now to this one
particular color. RGB was proposed as a scheme, but how might this work? Well, consider for instance
this. If we do indeed decide as a group to represent any color of the rainbow with some mixture of some
red, some green, and some blue, we have to decide how to represent the amount of red and green and
blue.

- [27:33](https://www.youtube.com/watch?v=undefined&t=1653s) Well, it turns out if all we have are


zeros and ones, ergo numbers, let's do just that. For instance, suppose a computer were using these
three numbers, 72, 73, 33, no longer in the context of an email or a text message, but now in the context
of something like Photoshop, a program for editing and creating graphical files. Maybe these numbers
could be interpreted as representing some amount of red, green, and blue, respectively.

- [28:01](https://www.youtube.com/watch?v=undefined&t=1681s) And that's exactly what happens. You


can think of the first number as red, the second as green, the third as blue. And so ultimately when you combine
that amount of red, that amount of green, that amount of blue, it turns out it's going to resemble the
shade of yellow. And indeed, you can come up with numbers between 0 and 255 for each of those
colors to mix any other color that you might want.
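
To make the point about context concrete, here is a small sketch that treats the same three numbers first as a color and then as text. The hex color string is simply one common way of writing an RGB triple; the variable names are illustrative.

```python
# The same three bytes, interpreted two ways.
red, green, blue = 72, 73, 33

# As a color: each amount fits in 8 bits, so one pixel needs 24 bits.
pixel = (red << 16) | (green << 8) | blue
hex_code = f"#{red:02x}{green:02x}{blue:02x}"
print(hex_code)            # the color written as a hex triple

# As text: the very same numbers read as ASCII characters.
as_text = bytes([red, green, blue]).decode("ascii")
print(as_text)

# Unpacking the 24-bit value recovers the original amounts.
assert ((pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF) == (72, 73, 33)
```

Nothing in the bits themselves says "color" or "text"; the program reading them decides.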

- [28:21](https://www.youtube.com/watch?v=undefined&t=1701s) And you can actually see this in


practice. Even though our screens, admittedly, are getting really good on our phones and laptops such
that you barely see the dots, they are there. You might have heard the term pixel before. Pixel's just a
dot on the screen and you've got thousands, millions of them these days horizontally and vertically.

- [28:38](https://www.youtube.com/watch?v=undefined&t=1718s) If I take even this emoji, which again


happens to be one company's interpretation of a face with medical mask and zoom in a bit, maybe zoom
in a bit more, you can actually start to see these pixels. Things get pixelated because what you're seeing
is each of the individual dots that compose this particular image.

- [28:57](https://www.youtube.com/watch?v=undefined&t=1737s) And apparently each of these


individual dots are probably using 24 bits, eight bits for red, eight bits for green, eight bits for blue, in
some pattern. This program or some other like Photoshop is interpreting one pattern and it's white or
yellow or black or some brown in between. So if you look sort of awkwardly, but up close to your phone
or your laptop or maybe your TV, you can see exactly this, too.

- [29:24](https://www.youtube.com/watch?v=undefined&t=1764s) All right, well, what about things that


we also watch every day on YouTube or the like? Things like videos. How would a computer, knowing
what we know now, represent something like a video? How might you represent a video using only zeros
and ones? Yeah? AUDIENCE: As we can see here, they represent images, right? [INAUDIBLE] sounds of
the 0 and 1s as well.

- [29:48](https://www.youtube.com/watch?v=undefined&t=1788s) [INAUDIBLE] DAVID MALAN: Yeah,


exactly. To summarize, what video really adds is just some notion of time. It's not just one image, it's not
just one letter or a number, it's presumably some kind of sequence because time is passing. So with a
whole bunch of images, maybe 24 maybe 30 per second, if you fly them by the human's eyes, we can
interpret them using our eyes and brain that there is now movement and therefore video.

- [30:13](https://www.youtube.com/watch?v=undefined&t=1813s) Similarly with audio or music. If we


just came up with some convention for representing those same notes on a musical instrument, could
we have the computer synthesize them, too? And this might be actually pretty familiar. Let me pull up a
quick video here, which happens to be an old school version of the same idea.

- [30:31](https://www.youtube.com/watch?v=undefined&t=1831s) You might remember from


childhood. [MUSIC PLAYING] [CLICKING] So granted that particular video is an actual video of a paper-
based animation, but indeed, that's really all you need, is some sequence of these images, which
themselves of course are just zeros and ones because they're just this grid of these pixels or dots.

- [31:10](https://www.youtube.com/watch?v=undefined&t=1870s) Now something like musical notes


like these, those of you who are musicians might just naturally play these on physical devices, but
computers can certainly represent those sounds, too. For instance, a popular format for audio is called
MIDI and MIDI might just represent each note that you saw a moment ago essentially as a sequence of
numbers.
- [31:29](https://www.youtube.com/watch?v=undefined&t=1889s) But more generally, you might think
about music as having notes, for instance, A through G, maybe some flats and some sharps, you might
have the duration like how long is the note being heard or played on a piano or some other device, and
then just the volume like how hard does a human in the real world press down on that key and therefore
how loud is that sound? It would seem that just remembering little details like that quantitatively we can
then represent really all of these otherwise analog human realities.
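
A minimal sketch of that idea follows: each note is reduced to three numbers for pitch, duration, and volume. This is not the actual MIDI file format. The field names, the millisecond unit, and the 0 to 127 volume range are assumptions made for the example; the pitch numbering borrows the MIDI convention in which 60 is middle C.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int        # which note (assumed MIDI-style numbering, 60 = middle C)
    duration_ms: int  # how long the note is held, in milliseconds
    volume: int       # how hard the key is pressed, assumed range 0 to 127


# A hypothetical three-note melody: C, D, E.
melody = [Note(60, 500, 100), Note(62, 500, 100), Note(64, 1000, 80)]

# Flattened, the melody is nothing more than a sequence of numbers,
# which a computer can in turn store as zeros and ones.
encoded = [value for note in melody
           for value in (note.pitch, note.duration_ms, note.volume)]
print(encoded)
```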

- [31:57](https://www.youtube.com/watch?v=undefined&t=1917s) So that then is really a laundry list of


ways that we can just represent information. Again, computers are digital and have all of these different
formats, but at the end of the day and as fancy as those devices in years are, it's just zeros and ones, tiny
little switches or light bulbs, if you will, represented in some way and it's up to the software that you and
I and others write to use those zeros and ones in ways we want to get the computers to do something
more powerfully.

- [32:23](https://www.youtube.com/watch?v=undefined&t=1943s) Questions, then, on this


representation of information, which I daresay is ultimately what problem solving is all about, taking in
information and producing new information via some process in between. Any questions there? Yeah, in back.
AUDIENCE: Yeah, so we talked about how different file formats kind of allow you to interpret
information.

- [32:46](https://www.youtube.com/watch?v=undefined&t=1966s) How does a file format like .mp4


discriminate between audio and video within itself as a value? DAVID MALAN: So a really good question.
There are many other file formats out there. You allude to MP4 for video, and more generally in use are
these things called codecs and containers. It's not quite as simple when using larger files, for instance, in
more modern formats that a video is just a sequence of images, for instance.

- [33:09](https://www.youtube.com/watch?v=undefined&t=1989s) Why? If you stored that many images


for like a Hollywood movie, like 24 or 30 of them per second, that's a huge number of images. And if
you've ever taken photos on your phone, you might know how many megabytes or larger even individual
photographs might be. So humans have developed over the years fancier software that uses much
more math to represent the same information more minimally, just using somehow shorter patterns of
zeros and ones than our most simplistic representation here.
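
A back-of-the-envelope calculation shows why the simplistic representation does not scale. The resolution, frame rate, and running time below are assumptions picked for illustration; only the figure of 24 bits per pixel comes from the discussion above.

```python
# Assumed, illustrative parameters for an uncompressed film.
WIDTH, HEIGHT = 1920, 1080      # pixels per frame (assumption)
BYTES_PER_PIXEL = 3             # 8 bits each for red, green, blue
FRAMES_PER_SECOND = 24          # assumption
DURATION_SECONDS = 2 * 60 * 60  # a two-hour movie (assumption)

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_bytes = bytes_per_frame * FRAMES_PER_SECOND * DURATION_SECONDS

print(f"{bytes_per_frame:,} bytes per frame")
print(f"{total_bytes / 1e12:.2f} terabytes for the whole film")
```

Under these assumptions the raw images alone exceed a terabyte, before any audio is counted.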

- [33:38](https://www.youtube.com/watch?v=undefined&t=2018s) And they use what might be called


compression. If you've ever used a zip file or something else, somehow your computer is using fewer
zeros and ones to represent the same amount of information, ideally without losing any information. In
the world of multimedia, which we'll touch on a little bit in a few weeks, there are both lossy and lossless
formats out there.

- [33:56](https://www.youtube.com/watch?v=undefined&t=2036s) Lossless means you lose no


information whatsoever. But more common, as you're alluding to, is lossy compression, L-O-S-S-Y,
where you're actually throwing away some amount of quality. You're getting some amount of pixelation
that might not look perfect to the human, but heck it's a lot cheaper and a lot easier to distribute.
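
The lossless case is easy to try with Python's standard `zlib` module, sketched below. The sample data is deliberately repetitive so that it compresses well; real files will vary. Lossy formats are not shown here, since they depend on media-specific codecs.

```python
import zlib

# Highly repetitive sample data (an assumption chosen so it shrinks a lot).
original = b"hello, world " * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")

# Lossless means every byte comes back exactly as it went in.
assert restored == original
```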

- [34:15](https://www.youtube.com/watch?v=undefined&t=2055s) And in the world of multimedia, you


have containers like QuickTime and other MPEG containers that can combine different formats of video,
different formats of audio in one file, but there, too, do designers have discretion. So more in a few
weeks, too. Other questions, then, on information here as well? Yeah? AUDIENCE: So I know computers
used to be very big and taking up like a whole room and stuff.

- [34:38](https://www.youtube.com/watch?v=undefined&t=2078s) Is the reason they've gotten smaller


because we can store this information piecemeal or what? DAVID MALAN: Exactly. I mean, back in the
day you might have heard of the expression a vacuum tube, which is like some physically large device
that might have only stored some 0 or 1. Yes, it is the miniaturization of hardware these days that has
allowed us to store as many and many more zeros and ones much more closely together.

- [35:02](https://www.youtube.com/watch?v=undefined&t=2102s) And as we've built more fancy


machines that can sort of design this hardware at an even smaller scale, we're just packing more and
more into these devices. But there, too, is a trade off. For instance, you might know by using your phone
or your laptop for quite a while, maybe on your lap, that it starts to get warm.

- [35:16](https://www.youtube.com/watch?v=undefined&t=2116s) So there are these literal physical


side effects of this where now some of our devices run hot. This is why like a data center in the real
world might need more air conditioning than a typical place, because there are these physical artifacts as
well. In fact, if you'd like to see one of the earliest computers from decades ago, across the river here in
now Allston in the new engineering building is the Harvard Mark 1 computer that will give you a much
better mental model of just that.

- [35:42](https://www.youtube.com/watch?v=undefined&t=2142s) Well if we come back now to this first


picture being computer science or really problem solving, I daresay we have more than enough ways
now to represent information, input and output, so long as we all just agree on something and thankfully
all of those before us have given us things like ASCII and Unicode.

- [35:58](https://www.youtube.com/watch?v=undefined&t=2158s) Not to mention MP4s, word


documents, and the like. But what's inside of this proverbial black box into which these inputs are going
and from which the outputs are coming? Well that's where we get this term you might have heard, too. An algorithm,
which is just step-by-step instructions for solving some problem incarnated in the world of computers by
software.

- [36:18](https://www.youtube.com/watch?v=undefined&t=2178s) When you write software aka


programs, you are implementing one or more algorithms, one or more sets of instructions for solving
some problem, and maybe you're using this language or that, but at the end of the day, no matter the
language you use the computer is going to represent what you type using just zeros and ones.

- [36:38](https://www.youtube.com/watch?v=undefined&t=2198s) So what might be a representative


algorithm? Nowadays you might use your phone quite a bit to make calls or send texts or emails and
therefore you have a whole bunch of contacts in your address book. Nowadays, of course, this is very
digital, but whether on iOS or Android or the like, you might have a whole bunch of names, first name
and/or last, as well as numbers and emails and the like.

- [36:59](https://www.youtube.com/watch?v=undefined&t=2219s) You might be in the habit of like


scrolling through on your phone all of those names to find the person you want to call. It's probably
sorted alphabetically by first name or last name, A through Z, or some other symbol. This is frankly quite
the same as we used to do back in my day, CS50, when we just used a physical book.
- [37:18](https://www.youtube.com/watch?v=undefined&t=2238s) In this physical book might be a
whole bunch of names alphabetically sorted from left to right corresponding to a whole bunch of
numbers. So suppose that in this old Harvard phone book we want to search for John Harvard. We might
of course start quite simply at the beginning here, looking at one page at a time, and this is an algorithm.

- [37:36](https://www.youtube.com/watch?v=undefined&t=2256s) This is like literally step-by-step


looking for the solution to this problem. In that sense, if John Harvard's in the phone book, is this
algorithm page-by-page correct, would you say? AUDIENCE: Yes. DAVID MALAN: Yes. Like if John
Harvard's in the phone book, obviously I'm eventually going to get to him, so that's what we mean by
correct.

- [37:55](https://www.youtube.com/watch?v=undefined&t=2275s) Is it efficient? Is it well designed,


would you say? No. I mean this is going to take forever even just to get to the Js or the Hs, depending
how this thing's sorted. All right, well let me go a little faster. I'll start like two pages at a time. 2, 4, 6, 8,
10, 12, and so forth. Sounds faster, is faster, is it correct? AUDIENCE: No.

- [38:14](https://www.youtube.com/watch?v=undefined&t=2294s) DAVID MALAN: OK, why is it not


correct? Yeah? AUDIENCE: So if you're starting on page 1, you're only going odd number of pages, so if
it's on an even number page, you'll miss it. DAVID MALAN: Exactly. If I start on an odd number of pages
and I'm going two at a time I might miss pages in between.

- [38:28](https://www.youtube.com/watch?v=undefined&t=2308s) And if I therefore conclude when I


get to the back of the book there was no John Harvard, I might have just erred. This would be again
one of these bugs. But if I try a little harder, I feel like there's a solution. We don't have to completely
throw out this algorithm. I think we can probably go roughly twice as fast still.
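
The first two algorithms, and one possible repair of the second, might be sketched as follows. The list of names stands in for the pages of a sorted phone book and is invented for the example; positions are counted from 0 here, so the skipped entries are the odd-numbered ones.

```python
def search_one_at_a_time(pages, name):
    """Check every page in order. Slow, but never misses anyone."""
    for i in range(len(pages)):
        if pages[i] == name:
            return i
    return None


def search_two_at_a_time_buggy(pages, name):
    """Check every other page. Faster, but skips half of the book."""
    for i in range(0, len(pages), 2):
        if pages[i] == name:
            return i
    return None


def search_two_at_a_time_fixed(pages, name):
    """Check every other page, doubling back one page after overshooting."""
    for i in range(0, len(pages), 2):
        if pages[i] == name:
            return i
        if pages[i] > name:
            # Gone too far: the name can only be on the page just skipped.
            if i > 0 and pages[i - 1] == name:
                return i - 1
            return None
    # Reached the end without overshooting: the last page may have been skipped.
    if pages and pages[-1] == name:
        return len(pages) - 1
    return None


# Hypothetical sorted phone book, one name per page.
pages = ["Ann", "Ben", "Eve", "John", "Mia", "Zoe"]

print(search_one_at_a_time(pages, "John"))
print(search_two_at_a_time_buggy(pages, "John"))
print(search_two_at_a_time_fixed(pages, "John"))
```

The fixed version relies on the book being sorted, which is what makes "I have gone too far" a meaningful test.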

- [38:44](https://www.youtube.com/watch?v=undefined&t=2324s) But what should we do instead to fix


this? Yeah, in back. AUDIENCE: [INAUDIBLE] DAVID MALAN: Nice. So I think what many of us, most of us,
if we even use this technology any more these days, we might go roughly to the middle of the phone
book just to kind of get us started. And now I'm looking down, I'm looking for J, assuming first name, J
Harvard, and it looks like I'm in the M section.

- [39:13](https://www.youtube.com/watch?v=undefined&t=2353s) So just to be clear, what should I do


next? AUDIENCE: [INAUDIBLE] DAVID MALAN: OK, and presumably John Harvard would be to the left
of this. So here's an opportunity to figuratively and literally tear this particular problem in half, throw half
of the problem away. It's actually pretty easy if you just do it that way.

- [39:35](https://www.youtube.com/watch?v=undefined&t=2375s) The hard way is this way. But I've


now just decreased the size of this problem really in half. So if I started with 1,000 pages of phone
numbers and names, now I'm down to 500. And already we haven't found John Harvard, but that's a big
bite out of this problem. I do think it's correct because if J is to the left of M, of course, he's definitely not
going to be over there.

- [39:56](https://www.youtube.com/watch?v=undefined&t=2396s) I think if I repeat this again dividing


and conquering, if you will, here I might have gone a little too far. Now I'm in like the E section. So let me
tear the problem in half again, throw another 250 pages away, and again repeat, dividing and dividing
and conquering until finally, presumably, I end up with just one page of a phone book on which John
Harvard's name either is or is not, but because of the algorithm you proposed, step by step, I know that
he's not in anything I discarded.
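
This divide-and-conquer approach is commonly known as binary search. A minimal sketch, assuming the phone book is a sorted Python list of names (the names and the function name are illustrative):

```python
def binary_search(names, target):
    """Return the index of target in the sorted list names, or None."""
    low, high = 0, len(names) - 1
    while low <= high:
        middle = (low + high) // 2      # open to the middle of what is left
        if names[middle] == target:
            return middle               # found the person
        elif target < names[middle]:
            high = middle - 1           # throw away the right half
        else:
            low = middle + 1            # throw away the left half
    return None                         # nothing left to search: not listed


# Hypothetical sorted phone book.
phone_book = ["Ann", "Ben", "Eve", "John", "Mia", "Zoe"]

print(binary_search(phone_book, "John"))
print(binary_search(phone_book, "Harry"))
```

Each pass through the loop discards half of the remaining entries, which is only safe because the list is sorted.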

- [40:25](https://www.youtube.com/watch?v=undefined&t=2425s) So traumatic as that might have been


made out to be, it's actually just harnessing pretty good human intuition. Indeed, this is what
programming is all about, too. It's not about learning a completely new world, but really just how to
harness intuition and ideas that you might already have and take naturally but learning how to express
them now more succinctly, more precisely, using things called programming languages.

- [40:49](https://www.youtube.com/watch?v=undefined&t=2449s) Why is an algorithm like that if I


found John Harvard better than, ultimately, just doing the first one or even the second and maybe
doubling back to check those even pages? Well let's just look at little charts here. Again, we don't have to
get into the nuances of numbers, but if we've got like a chart here, xy plot, on the x-axis here I claim is
the size of the problem.

- [41:09](https://www.youtube.com/watch?v=undefined&t=2469s) So measured in the numbers of


pages in the phone book. So the farther you go out here, the more pages are in the phone book. And
here we have time to solve on the y-axis. So the higher you go up, the more time it's going to be taking
to solve that particular problem. So let's just arbitrarily say that the first algorithm, involving like n pages,
might be represented graphically like this.

- [41:32](https://www.youtube.com/watch?v=undefined&t=2492s) No matter the slope, it's a straight


line because there's presumably a one to one relationship between numbers of pages and number of
seconds or number of page turns. Why? If the phone company adds another page next year because
some new people move to town, that's going to require one additional page for me.

- [41:47](https://www.youtube.com/watch?v=undefined&t=2507s) One to one. If, though, we use the


second algorithm, flawed though it was, unless we double back a little bit to fix someone being in
between, that too is going to be a straight line, but it's going to be a different slope because now there's
a 2 to 1 or a 1 to 2 relationship because I'm going two pages at a time.

- [42:06](https://www.youtube.com/watch?v=undefined&t=2526s) So if the phone company adds


another page or another two pages, that's still only just one more step. You can see the difference if I kind
of draw this. If this is the phone book in question, this number of pages, it might take this many seconds
on the yellow line to represent or to find someone like John Harvard.

- [42:25](https://www.youtube.com/watch?v=undefined&t=2545s) But of course on the first algorithm,


the red line, it's literally going to take twice as many steps. And what does the n here mean? n is the go-to
variable for a computer scientist or programmer just generically representing a number. So if the number
of pages in the phone book is n, the number of steps the second algorithm would have taken would be in
the worst case n over 2.

- [42:44](https://www.youtube.com/watch?v=undefined&t=2564s) Half as many because you're going


twice as fast. But the third algorithm, actually if you recall your logarithms, looks a little something like
this. There's a fundamentally different relationship between the size of the problem and the amount of
time required to solve it that technically is log base 2 of n, again, but it's really the shape that's different.
- [43:03](https://www.youtube.com/watch?v=undefined&t=2583s) The implication there is that if, for
instance, Cambridge and Allston, two different towns here in Massachusetts, merge next year and
there's just one phone book that's twice as big, no big deal for that third and final algorithm. Why? You
just tear the problem one more time in half, taking one more bite, that's it, not another 1,000 bites just
to get to the solution.
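
The difference between the three curves can be tabulated with a few lines of code. This sketch counts worst-case steps for each algorithm; the function names are invented, and the two-at-a-time count ignores the occasional extra step of doubling back.

```python
def page_by_page(pages):
    """Worst case for the first algorithm: look at every page (n)."""
    return pages


def two_at_a_time(pages):
    """Worst case for the second algorithm: roughly half as many (n / 2)."""
    return pages // 2


def halvings(pages):
    """Worst case for the third: tear in half until one page is left."""
    steps = 0
    while pages > 1:
        pages = (pages + 1) // 2   # keep the larger half, to be pessimistic
        steps += 1
    return steps


# Doubling the phone book doubles the first two counts,
# but adds only a single step to the third.
for pages in (1000, 2000):
    print(pages, page_by_page(pages), two_at_a_time(pages), halvings(pages))
```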

- [43:24](https://www.youtube.com/watch?v=undefined&t=2604s) Put another way, you can walk it way,


way, way out here to a much bigger phone book and ultimately that green line is barely going to have
budged. So this then is just a way of now formalizing and thinking about what the performance or
quality of these algorithms might be. Before we now make one more formalization of the algorithm
itself, any questions then on this notion of efficiency or now performance of ideas? Yeah.

- [43:53](https://www.youtube.com/watch?v=undefined&t=2633s) AUDIENCE: How many phone books


have you got? DAVID MALAN: (LAUGHING) A lot of phone books over the years and if you or your
parents have any more still somewhere we could definitely use them because they're hard to find. Other
questions? But thanks. Other questions here, too? No. Oh, was that a murmur? Yes, over here.

- [44:13](https://www.youtube.com/watch?v=undefined&t=2653s) AUDIENCE: You could get Harry


Potter as a guest speaker. DAVID MALAN: Sorry, say again. AUDIENCE: You could get Harry Potter as a
guest speaker. DAVID MALAN: (LAUGHING) Oh, yeah. Hopefully. Then we'd have a little something more
to use here. So now if we want to formalize further what it is we just did, we can go ahead and introduce
this.

- [44:30](https://www.youtube.com/watch?v=undefined&t=2670s) A form of code aka pseudocode.


Pseudocode is not a specific language, it's not like something we're about to start coding in, it's just a
way of expressing yourself in English or any human language succinctly correctly toward an end of
getting your idea for an algorithm across. So for instance, here might be how we could formalize the
code, the pseudocode for that same algorithm.

- [44:51](https://www.youtube.com/watch?v=undefined&t=2691s) Step one was pick up the phone


book, as I did. Step two might be open to the middle of the phone book, as you proposed that we do
first. Step three was probably to look down at the pages, as I did. And step four gets a little more interesting
because I had to quickly make a decision and ask myself a question.

- [45:08](https://www.youtube.com/watch?v=undefined&t=2708s) If person is on page, then I should


probably just go ahead and call that person. But that probably wasn't the case at least for John Harvard,
and I opened to the M section. So there's this other question I should now ask: else if the person is earlier in
the book, then I should tear the problem in half as I did but go left, so to speak, and then not just open
to the middle of the left half of the book, but really just go back to step three, repeat myself.

- [45:33](https://www.youtube.com/watch?v=undefined&t=2733s) Why? Because I can just repeat what


I just did, but with a smaller problem having taken this big bite. But, if the person was later in the book,
as might have happened with a different person than John Harvard, then I should open to the middle of
the right half of the book, again go back to line three, but again, I'm not going to get stuck doing
something forever like this because I keep shrinking the size of the problem.

- [45:55](https://www.youtube.com/watch?v=undefined&t=2755s) Lastly, the only possible scenario


that's left, if John Harvard is not on the page and he's not to the left and he's not to the right, what
should our conclusion be? AUDIENCE: He's not there. DAVID MALAN: He's not there. He's not listed. So
we need to quit in some other form. Now as an aside, it's kind of deliberate that I buried that last
question at the end because this is what happens all too often in programming, whether you're new at it
or professional, just not considering all possible cases, corner cases if you will,

- [46:22](https://www.youtube.com/watch?v=undefined&t=2782s) that might not happen that often,


but if you don't anticipate them in your own code, pseudocode or otherwise, this is when and why
programs might crash or you might see stupid little spinning beach balls or hourglasses or your computer
might reboot. Why? It's doing something sort of unpredictable if a human, maybe myself, didn't
anticipate this.

- [46:42](https://www.youtube.com/watch?v=undefined&t=2802s) Like what does this program do if


John Harvard is not in the phone book if I had omitted lines 12 and 13? I don't know. Maybe it would
behave differently on a Mac or PC because it's sort of undefined behavior. These are the kinds of
omissions that frankly you're invariably going to make, bugs you're going to introduce, mistakes you're
going to make early on, and me, too, 25 years later.
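
Here is one small way such an omission can show up in practice, sketched with a Python dictionary standing in for the phone book. The function names and the phone number are placeholders. The unchecked version assumes the person is always listed, so a missing name makes it fail with a `KeyError`; the checked version handles that corner case explicitly.

```python
# Hypothetical phone book; the number is a placeholder.
phone_book = {"John Harvard": "555-0100"}


def call_unchecked(phone_book, name):
    # Assumes the name is always listed; raises KeyError when it is not.
    return f"Calling {phone_book[name]}"


def call(phone_book, name):
    if name in phone_book:
        return f"Calling {phone_book[name]}"
    else:
        # The corner case that is easy to forget: the person is not listed.
        return "Not listed"


print(call(phone_book, "John Harvard"))
print(call(phone_book, "Harry Potter"))

try:
    call_unchecked(phone_book, "Harry Potter")
except KeyError:
    print("The unchecked version crashed on a missing name")
```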

- [47:03](https://www.youtube.com/watch?v=undefined&t=2823s) But you'll get better at thinking


about those corner cases and handling anything that can possibly go wrong and as a result, your code
will be all the better for it. Now the problem ultimately with learning how to program, especially if
you've never had experience or even if you do but you learned one language only, is that they all look a
little cryptic at first glance.

- [47:25](https://www.youtube.com/watch?v=undefined&t=2845s) But they do share certain


commonalities. In fact, we'll use this pseudocode to define those first. Highlighted in yellow here are
what henceforth we're going to start calling functions. Lots of different programming languages exist, but
most of them have what we might call functions, which are actions or verbs that solve some smaller
problem.

- [47:44](https://www.youtube.com/watch?v=undefined&t=2864s) That is to say, you might use a whole


bunch of functions to solve a bigger problem because each function tends to do something very specific
or precise. These then in English might be translated in code, actual computer code, to these things
called functions. Highlighted in yellow now are what we might call conditionals.

- [48:03](https://www.youtube.com/watch?v=undefined&t=2883s) Conditionals are things that you do


conditionally based on the answer to some question. You can think of them kind of like forks in the road.
Do you go left or go right or some other direction based on the answer to some question? Well, what are
those questions? Highlighted now in yellow are what we would call Boolean expressions, named after a
mathematician with the last name Boole, that simply have yes or no answers.

- [48:25](https://www.youtube.com/watch?v=undefined&t=2905s) Or, if you prefer, true or false


answers or, heck, if you prefer 1 or 0 answers. We just need to distinguish one scenario from another.
The last thing manifest in this pseudocode is what I might highlight now and call loops. Some kind of
cycle, some kind of directive that tells us to do something again and again so that I don't need a 1,000-
line program to search a 1,000-page phone book, I can get away with a 13-line program but sort of
repeat myself inherently in order to solve some problem until I get to that
- [48:58](https://www.youtube.com/watch?v=undefined&t=2938s) last step. So this then is what we
might call pseudocode and indeed there are other characteristics of programs that we'll touch on before
long, things like arguments and return values, variables, and more, but unfortunately in most languages,
including some we will very deliberately use in this class and that everyone in the real world these days
still uses, programs tend to look like this.

- [49:22](https://www.youtube.com/watch?v=undefined&t=2962s) This for instance, is a distillation of


that very first program I wrote in 1996 in CS50 itself just to print something on the screen. In fact, this
version here just tries to print quote unquote, "Hello, world." Which is, I daresay, the most canonical first
thing that most any programmer ever gets a computer to say just because, but look at this mess.

- [49:43](https://www.youtube.com/watch?v=undefined&t=2983s) I mean, there's a hash symbol, these


angled brackets, parentheses, words like int, curly braces, quotes, parentheses, semicolons, and back
slashes. I mean there's more overhead and more syntax and clutter than there is an actual idea. Now
that's not to say that you won't be able to understand this before long, because honestly there's not that
many patterns, indeed programming languages have typically a much smaller vocabulary than any actual
human language, but at first it might indeed look quite cryptic.

- [50:11](https://www.youtube.com/watch?v=undefined&t=3011s) But you can perhaps infer I have no


idea what these other lines do yet, but "Hello, world." is presumably quote unquote what will be printed
on the screen. But what we'll do today, after a short break, and set the stage for next week is introduce
these exact same ideas in just a bit using Scratch, something that you yourselves might have used when
you were quite a bit younger but without the same vocabulary applied to those ideas.

- [50:34](https://www.youtube.com/watch?v=undefined&t=3034s) The upside of what we'll soon do


using Scratch, this graphical programming language from our friends down the road at MIT, it'll let us
today start to drag and drop things that look like puzzle pieces that interlock together if it makes logical
sense to do so, but without the distraction of hashes, parentheses, curly braces, angle brackets,
semicolons, and things that are quite beside the point.

- [50:56](https://www.youtube.com/watch?v=undefined&t=3056s) But for now, let's go ahead and take


a 10 minute break here and when we resume, we will start programming. So this on the screen is a
language called C, something that we'll dive into next week, and thankfully this now on the screen is
another language called Python that we'll also take a look at in a few weeks before long along with other
languages along the way.
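
The slides themselves are not captured in this transcript. As a rough sketch of what the Python version being described might look like (the variable name `message` is an illustrative assumption, not something taken from the lecture):

```python
# A minimal "hello, world" in Python, for comparison with the C version
# described above. The variable name is illustrative only.
message = "hello, world"
print(message)
```

Compared with the C program, there is no hash symbol, no angle brackets, and no curly braces, which is presumably why it is introduced with a "thankfully".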

- [51:16](https://www.youtube.com/watch?v=undefined&t=3076s) Today though, and for this first week,


week zero, so to speak, we use Scratch because again it will allow us to explore some of those
programming fundamentals that will be in C and in Python and in JavaScript and other languages, too,
but in a way where we don't have to worry about the distractions of syntax.

- [51:33](https://www.youtube.com/watch?v=undefined&t=3093s) So the world of Scratch looks like


this. It's a web-based or downloadable programming environment that has this layout here by default.
On the left here we'll soon see is a palette of puzzle pieces, programming blocks that represent all of
those ideas we just discussed. And by dragging and dropping these puzzle pieces or blocks over this big
area and connecting them together, if it makes logical sense to do so, we'll start programming in this
environment.
- [51:59](https://www.youtube.com/watch?v=undefined&t=3119s) The environment allows you to have
multiple sprites, so to speak. Multiple characters, things like a cat or anything else, and those sprites
exist in this rectangular world up here that you can full screen to make bigger and this here by default is
Scratch, who can move up, down, left, right and do many more things, too.

- [52:16](https://www.youtube.com/watch?v=undefined&t=3136s) Within Scratch's world you can


think of it as perhaps a familiar coordinate system with Xs and Ys which is helpful only when it comes
time to position things on the screen. Right now Scratch is at the default, 0,0, where x equals 0 and y
equals 0. If you were to move the cat way up to the top, x would stay zero, y would be positive 180.

- [52:38](https://www.youtube.com/watch?v=undefined&t=3158s) If you move the cat all the way to the


bottom, x would stay zero, but y would now be negative 180. And if you went left, x would become
negative 240 but y would stay 0, or to the right x would be 240 and y would stay zero. So those numbers
generally don't so much matter because you can just move relatively in this world up, down, left, right,
but when it comes time to precisely position some of these sprites or other imagery, it'll be helpful just
to have that mental model of up, down, left, and right.
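
To make that coordinate system concrete, here is a small Python sketch. The bounds (x from -240 to 240, y from -180 to 180) come from the description above; the helper names `clamp` and `move`, and the choice to stop the sprite exactly at the edge, are simplifying assumptions and not a description of how Scratch itself behaves at the border.

```python
# Stage bounds as described in the lecture: x runs from -240 to 240,
# y from -180 to 180, with (0, 0) at the center.
X_MIN, X_MAX = -240, 240
Y_MIN, Y_MAX = -180, 180


def clamp(value, low, high):
    """Keep value inside the range [low, high]."""
    return max(low, min(value, high))


def move(position, dx, dy):
    """Return a new (x, y) after moving by (dx, dy), kept on the stage."""
    x, y = position
    return (clamp(x + dx, X_MIN, X_MAX), clamp(y + dy, Y_MIN, Y_MAX))


cat = (0, 0)
print(move(cat, 0, 180))   # top edge: (0, 180)
print(move(cat, -240, 0))  # left edge: (-240, 0)
print(move(cat, 500, 0))   # too far right, so clamped: (240, 0)
```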

- [53:07](https://www.youtube.com/watch?v=undefined&t=3187s) Well let's go ahead and make


perhaps the simplest of programs here. I'm going to switch over to the same programming environment
now for a tour of the left hand side. So by default selected here is the category in blue, Motion, which
has a whole bunch of puzzle pieces or blocks that relate to motion.

- [53:24](https://www.youtube.com/watch?v=undefined&t=3204s) And whereas Scratch as a graphical


language categorizes things by the type of things that these pieces do, we'll see that throughout this
whole palette we'll have functions and variables and conditionals and Boolean expressions and more
each in a different color and shape. So for instance, moving 10 steps or turning one way or the other
would be functions categorized here as things like motion.

- [53:46](https://www.youtube.com/watch?v=undefined&t=3226s) Under looks in purple, you might


have speech bubbles that you can create by dragging and dropping these that might say "hello" or
whatever for some number of seconds. Or you could switch costumes, change the cat to look like a dog
or a bird or anything else in between. Sounds, too. You can play sounds like "meow" or anything you
might import or record, yourself.

- [54:06](https://www.youtube.com/watch?v=undefined&t=3246s) Then there's these things Scratch


calls events and the most important of these is the first, when green flag clicked. Because if we look over
to the right of Scratch's world here, this rectangular region has this green flag and red stop sign up
above, one of which is for Play one of which is for Stop and so that's going to allow us to start and stop
our actual programs when that green flag is initially clicked.

- [54:28](https://www.youtube.com/watch?v=undefined&t=3268s) But you can listen for other types of


events when the spacebar is pressed or something else, when this sprite is clicked or something else.
Here you already see like a programmer's incarnation of things you and I take for granted like every day
now on our phones. Any time you tap an icon or drag your finger or hit a button on the side.

- [54:46](https://www.youtube.com/watch?v=undefined&t=3286s) These are what a programmer would


call events, things that happen and are often triggered by us humans and things that a program be it in
Scratch or Python or C or anything else can listen for and respond to. Indeed, that's why when you tap
the phone icon on your phone, the phone application starts up because someone wrote software that's
listening for a finger press on that particular icon.
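
The idea of "listening for" an event can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `handlers` dictionary and the `on` and `fire` helpers are hypothetical names made up for illustration, not part of Scratch or of any real phone platform.

```python
# Maps an event name to the list of functions that should run when it happens.
handlers = {}


def on(event_name, func):
    """Register func to run whenever event_name happens."""
    handlers.setdefault(event_name, []).append(func)


def fire(event_name):
    """Run every function registered for event_name and collect the results."""
    return [func() for func in handlers.get(event_name, [])]


def greet():
    return "hello, world"


def jump():
    return "jump"


on("green flag clicked", greet)
on("space key pressed", jump)

print(fire("green flag clicked"))  # ['hello, world']
print(fire("sprite clicked"))      # [] because nothing is listening for it
```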

- [55:09](https://www.youtube.com/watch?v=undefined&t=3309s) So Scratch has these same things,


too. Under Control in orange, you can see that we can wait for one second or repeat something some
number of times, 10 by default, but we can change anything in these white circles to anything else.
There's another puzzle piece here forever, which implies some kind of loop where we can do something
again and again.

- [55:26](https://www.youtube.com/watch?v=undefined&t=3326s) Even though it seems a little tight,


there's not much room to fit something there, Scratch is going to have these things grow and shrink
however we want to fill similarly shaped pieces. Here are those conditionals. If something is true or false,
then do this next thing. And that's how we can put in this little trapezoid-like shape.

- [55:43](https://www.youtube.com/watch?v=undefined&t=3343s) Some form of Boolean expression, a


question with a yes/no, true/false, or one/zero answer and decide whether to do something or not. You
can combine these things, too. If something is true, do this, else do this other thing. And you can even
tuck one inside of the other if you want to ask three or four or more questions.
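
In a text-based language the same shapes show up as `if` and `else`. The sketch below is an illustration in Python, with a hypothetical function name `compare`; the expressions `x < y` and `x > y` play the role of the Boolean expressions that would sit in the trapezoid-like slot, and the second question is tucked inside the `else` of the first.

```python
def compare(x: int, y: int) -> str:
    """Answer up to two yes/no questions about x and y."""
    if x < y:
        return "x is less than y"
    else:
        # A second question nested inside the else branch of the first.
        if x > y:
            return "x is greater than y"
        else:
            return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```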

- [56:01](https://www.youtube.com/watch?v=undefined&t=3361s) Sensing, too, is going to be a thing.


You can ask questions aka Boolean expressions like is the sprite touching the mouse pointer, the arrow
on the screen? So that you can start to interact with these programs. What is the distance between a
sprite and a mouse pointer? You can do simple calculations just to figure out maybe if the enemy is
getting close to the cat.
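
As a sketch of what such a distance calculation might look like in Python: positions are assumed to be `(x, y)` pairs, and the function names and the threshold of 50 are hypothetical choices for illustration rather than anything Scratch defines.

```python
import math


def distance(sprite, other):
    """Straight-line distance between two (x, y) positions."""
    return math.hypot(sprite[0] - other[0], sprite[1] - other[1])


def is_close(sprite, other, threshold=50):
    """A Boolean expression: is the other thing within threshold steps?"""
    return distance(sprite, other) < threshold


cat = (0, 0)
enemy = (30, 40)
print(distance(cat, enemy))  # 50.0
print(is_close(cat, enemy))  # False, because 50.0 is not less than 50
```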

- [56:20](https://www.youtube.com/watch?v=undefined&t=3380s) Under Operators some lower level


stuff like math, but also the ability to pick random numbers, which for a game is great because then you
can kind of vary the difficulty or what's happening in a game without the same game playing the same
way every time. And you can combine ideas. Something and something must be true in order to make
that kind of decision before.
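
Both ideas, picking a random number and requiring that something and something else be true, can be sketched in Python. The standard library's `random.randint` includes both endpoints; the game rule in `enemy_appears` is a made-up example, not something from the lecture.

```python
import random


def pick_random(low: int, high: int) -> int:
    """Return a whole number between low and high, inclusive."""
    return random.randint(low, high)


def enemy_appears(level: int, roll: int) -> bool:
    """Hypothetical game rule: both conditions must be true."""
    return level >= 2 and roll <= 3


roll = pick_random(1, 10)
print(roll, enemy_appears(level=2, roll=roll))
```

Because `roll` changes from run to run, the game does not play the same way every time.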

- [56:38](https://www.youtube.com/watch?v=undefined&t=3398s) Or we can even join two words


together. Says apple and banana by default, but you can type in or drag and drop whatever you want
there to combine multiple words into full, larger sentences. Then lastly down here, there's in orange
things called variables. In math we've obviously got x and y and whatnot.

- [56:55](https://www.youtube.com/watch?v=undefined&t=3415s) In programming we'll have the same


ability to store in these named symbols, x or y, values that we care about. Numbers or letters or words or
colors or anything, ultimately. But in programming you'll see that it's much more conventional not to just
use simple letters like x and y and z, but to actually give variables full singular or plural words to describe
what they are.

- [57:20](https://www.youtube.com/watch?v=undefined&t=3440s) Then lastly, if this isn't enough color


blocks for you, you can create your own blocks. Indeed, this is going to be a programming principle we'll
apply today and with the first problem set whereby once you start to assemble these puzzle pieces and
you realize, oh, would have been nice if those several pieces could have just been replaced by one had
MIT thought to give me that one puzzle piece, you yourself can make your own blocks by connecting
these all together, giving them a name, and boom, a new puzzle piece will exist.
- [57:49](https://www.youtube.com/watch?v=undefined&t=3469s) So let's do the simplest, most
canonical programs here, starting up with control, and I'm going to click and drag and drop this thing
here when green flag clicked. Then I'm going to grab one more, for instance under Looks, and under
Looks I'm going to go ahead and just say something like initially not just Hello but the more canonical
Hello comma world.

- [58:10](https://www.youtube.com/watch?v=undefined&t=3490s) Now you might guess that in this


programming environment, I can go over here now and click the green flag and voila, Hello comma
world. So that's my first program and obviously much more user friendly than typing out the much more
cryptic text that we saw on the screen that you, too, will type out next week.

- [58:25](https://www.youtube.com/watch?v=undefined&t=3505s) But for now, we'll just focus on these


ideas, in this case, a function. So what is it that just happened? This purple block here is Say, that's the
function, and it seems to take some form of input in the white oval, specifically Hello comma world. Well
this actually fits the paradigm that we looked at earlier of just inputs and outputs.

- [58:42](https://www.youtube.com/watch?v=undefined&t=3522s) So if I may, if you consider what this


puzzle piece is doing, it actually fits this model. The input in this case is going to be Hello comma world in
white. The algorithm is going to be implemented as a function by MIT called Say and the output of that is
going to be some kind of side effect, like the cat and the speech bubble are saying Hello, world.

- [59:01](https://www.youtube.com/watch?v=undefined&t=3541s) So already even that simple drag and


drop mimics exactly this relatively simple mental model. So let's take things further. Let's go ahead now
and make the program a little more interactive so that it says something like Hello, David, or Hello,
Carter, or Hello to you specifically. And for this, I'm going to go under Sensing.

- [59:19](https://www.youtube.com/watch?v=undefined&t=3559s) And you might have to poke around


to find these things the first time around, but I've done this a few times so I kind of know where things
are and what color. There's this function here. Ask what's your name, but that's in white, so we can
change the question to anything we want, and it's going to wait for the human to type in their answer.

- [59:35](https://www.youtube.com/watch?v=undefined&t=3575s) This function called Ask is a little


different from the Say block, which just had this side effect of printing a speech bubble to the screen.
The Ask function is even more powerful in that after it asks the human to type something in, this
function is going to hand you back what they typed in in the form of what's called a return value, which
is stored ultimately and by default in this thing called Answer.
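
The difference between a side effect and a return value can be sketched in Python. The functions `say` and `ask` below are stand-ins written for illustration, not Scratch's own implementation; the `read_line` parameter is an assumption added so the sketch can be tried without a keyboard.

```python
def say(message: str) -> None:
    """Show a message. This has a side effect (printing) and returns nothing."""
    print(message)


def ask(question: str, read_line=input) -> str:
    """Show a question, then hand back whatever the human typed."""
    say(question)
    return read_line()
```

A call such as `answer = ask("What's your name?")` would wait for the human and then store what they typed in a variable named `answer`, much like the blue oval described here.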

- [59:58](https://www.youtube.com/watch?v=undefined&t=3598s) This little blue oval here called


Answer is again one of these variables that in math would be called just x or y but in programming we're
saying what it does. So I'm going to go ahead and do this. Let me go ahead and drag and drop this block
and I want to ask the question before saying anything, but you'll notice that Scratch is smart and it's
going to realize I want to insert something in between and it's just going to move things up and down.

- [1:00:18](https://www.youtube.com/watch?v=undefined&t=3618s) I'm going to let go and ask the


default question, what's your name? And now if I want to go ahead and say hello, David or Carter, let's
just do Hello comma, because I obviously don't know when I'm writing the program who's going to use
it. So let me now grab another looks block up here, say something again, and now let me go back to
Sensing and now grab the return value, represented by this other puzzle piece, and let me just drag and
drop it here.

- [1:00:43](https://www.youtube.com/watch?v=undefined&t=3643s) Notice it's the same shape, even if


it's not quite the same size. Things will grow or shrink as needed. All right, so let's now zoom out. Let me
go and stop the old version because I don't want to say Hello, world anymore. Let me hit the green flag
and what's my name? All right, David. Enter.

- [1:00:58](https://www.youtube.com/watch?v=undefined&t=3658s) Huh. All right, maybe I just wasn't


paying close enough attention. Let me try it again. Green flag, D-A-V-I-D, Enter. This seems like a bug.
What's the bug or mistake might you think? Yeah? AUDIENCE: Do you need to somehow add them
together in the same text box? DAVID MALAN: Yeah, we kind of want to combine them in the same text
box.

- [1:01:21](https://www.youtube.com/watch?v=undefined&t=3681s) And it's technically a bug because


this just looks kind of stupid. It's just saying David after I asked for my name. I'd like it to say maybe Hello
then David, but it's just blowing past the Hello and printing David. But let's put our finger on why this is
happening. You're right for the solution, but what's the actual fundamental problem? In back.

- [1:01:39](https://www.youtube.com/watch?v=undefined&t=3699s) AUDIENCE: So it says hello, but it


gets to that last step so quickly you can't see it. DAVID MALAN: Perfect. I mean, computers are really
darn fast these days. It is saying Hello, all of us are just too slow in this room to even see it because it's
then saying David on the screen so fast as well. So there's a couple of solutions here, and yours is spot
on, but just to poke around, you'll see the first example of how many ways there are going to be in
programming, be it Scratch or C or Python or anything else, to solve problems.
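
One way to picture this bug is with a toy model in Python. The `SpeechBubble` class is hypothetical and only approximates the behavior described: each new Say replaces whatever the bubble showed before, so the first message is gone almost immediately.

```python
class SpeechBubble:
    """Toy model: the bubble shows only the most recent message."""

    def __init__(self):
        self.text = ""     # what is visible right now
        self.history = []  # everything that was ever said

    def say(self, message):
        self.history.append(message)
        self.text = message  # replaces what was shown before


bubble = SpeechBubble()
bubble.say("Hello,")
bubble.say("David")

print(bubble.text)     # David
print(bubble.history)  # ['Hello,', 'David']
```

The history shows that "Hello," really was said; it just was not on screen long enough for anyone to see it.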

- [1:02:05](https://www.youtube.com/watch?v=undefined&t=3725s) We'll teach you over the course of


these weeks, sometimes some ways are better relatively than others, but rarely is there a best way
necessarily, because again reasonable people will disagree. And what we'll try to teach you over the
coming weeks is how to kind of think through those nuances. And it's not going to be obvious at first
glance, but the more programs you write, the more feedback you get, the more bugs that you introduce,
the more you'll get your footing with exactly this kind of problem solving.

- [1:02:31](https://www.youtube.com/watch?v=undefined&t=3751s) So let me try this in a couple of


ways. Up here would be one solution to the problem. MIT anticipated this kind of issue, especially with
first-time programmers, and I could just use a puzzle piece that says say the following for two seconds or
one second or whatever, then do the same with the next word and it might be kind of a bit of a pause,
Hello, one second, two seconds, David, one second, two seconds, but at least it would look a little more
grammatically correct.

- [1:02:57](https://www.youtube.com/watch?v=undefined&t=3777s) But I can do it a little more


elegantly, as you've proposed. Let me go ahead and throw away one of these blocks, and you can just
drag and let go and it'll delete itself. Let me go down to Operators because this Join block here is the
right shape. So even if you're not sure what goes where, just focus on the shapes first.

- [1:03:14](https://www.youtube.com/watch?v=undefined&t=3794s) Let me drag this over here. It grew


to fill that. Let me go ahead and say hello comma space. Now it could just say by default Hello, banana,
but let me go back to Sensing, Drag answer, and that's going to drag and drop there. So now notice we're
sort of stacking or nesting one block on another so that the output of one becomes the input to another,
but that's OK here.
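
The nesting can be written out in Python as one function call inside another. The `join` and `say` functions here are stand-ins for the Scratch blocks of the same name, and `answer` is hard-coded as an assumption so the sketch runs without waiting for input.

```python
def join(first: str, second: str) -> str:
    """Combine two pieces of text into one."""
    return first + second


def say(message: str) -> None:
    """Show a message on the screen."""
    print(message)


answer = "David"  # stand-in for the return value of Ask

# The output of join becomes the input to say.
say(join("hello, ", answer))  # prints: hello, David
```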

- [1:03:38](https://www.youtube.com/watch?v=undefined&t=3818s) Let me go ahead and zoom out, hit


Stop, and hit Play. All right, what's your name? D-A-V-I-D, Enter, and voila. Now it's presumably as we
first intended. [APPLAUSE] (LAUGHING) Oh, thank you. Thank you. No minus 2 this time. So consider that
even with this additional example, it still fits the same mental model, but in a little more interesting way.

- [1:04:06](https://www.youtube.com/watch?v=undefined&t=3846s) Here's that new function Ask


something and wait. And notice that in this case too there's an input, otherwise known henceforth as an
argument or a parameter, programming speak for just an input in the context of a function. If we use our
drawing as before to represent this thing here, we'll see that the input now is going to be quote unquote
"What's your name?" The algorithm is going to be implemented by way of this new puzzle piece, the
function called Ask, and the output of that thing this time

- [1:04:33](https://www.youtube.com/watch?v=undefined&t=3873s) is not going to be the cat saying


anything yet, but rather it's going to be the actual answer. So instead of the visual side effect of the
speech bubble appearing, now nothing visible is happening yet. Thanks to this function it's sort of
handing me back like a scrap of paper with whatever I typed in written on it so I can reuse D-A-V-I-D one
or more times even like I did.

- [1:04:58](https://www.youtube.com/watch?v=undefined&t=3898s) Now what did I then do with that


value? Well consider that with the subsequent function we had this Say block, too, combined with a join.
So we have this variable called Answer, we're joining it with that first argument, Hello. So already we see
that some functions like Join can take not one but two arguments, or inputs, and that's fine.

- [1:05:20](https://www.youtube.com/watch?v=undefined&t=3920s) The output of Join is presumably


going to be Hello, David or Hello, Carter or whatever the human typed in. That output notice is
essentially becoming the input to another function, Say, just because we've kind of stacked things or
nested them on top of one another. But methodically, it's really the same idea.

- [1:05:40](https://www.youtube.com/watch?v=undefined&t=3940s) The input now are two things,


Hello comma and the return value from the previous Ask function. The function now is going to be Join,
the output is going to be Hello, David. But that Hello, David output is now going to become the input to
another function, namely that first block called Say, and that's then going to have the side effect of
printing out Hello, David on the screen.

- [1:06:03](https://www.youtube.com/watch?v=undefined&t=3963s) So again as sort of sophisticated as


ours, as yours, as others' programs are going to get, they really do fit this very simple mental model of
inputs and outputs and you just have to learn to recognize the vocabulary and to know what kinds of
puzzle pieces or concepts ultimately to apply. But you can ultimately really kind of spice these things up.

- [1:06:20](https://www.youtube.com/watch?v=undefined&t=3980s) Let me go back to my program here


that just is using the speech bubble at the moment. Scratch inside has some pretty fancy interactive
features, too. I click the Extensions button in the bottom left corner. And let me go ahead and choose the
Text to Speech extension. This is using a Cloud service, so if you have an internet connection it can
actually talk to the Cloud or a third party service, and this one is going to give me a few new green puzzle
pieces, namely the ability to speak something from my speakers
- [1:06:45](https://www.youtube.com/watch?v=undefined&t=4005s) instead of just saying it textually. So
let me go ahead and drag this. Now notice I don't have to interlock them if I'm just kind of playing
around and I want to move some things around. I just want to use this as like a canvas temporarily. Let
me go ahead and steal the Join from here, put it there, let me throw away the Say block by just moving it
left and letting go, and now let me join this in so I've now changed my program to be a little more
interesting.

- [1:07:09](https://www.youtube.com/watch?v=undefined&t=4029s) So now let me stop the old version.


Let me start the new. What's your name? Type in David. And voila: PROGRAM: Hello, banana. DAVID
MALAN: (LAUGHING) OK, minus 2 for real. All right, so what I accidentally threw away there,
intentionally for instructional purposes, was the actual answer that came back from the ask block.

- [1:07:33](https://www.youtube.com/watch?v=undefined&t=4053s) That's embarrassing. So now if I


play this again, let's click the green icon. What's your name? David. And now: PROGRAM: Hello, David.
DAVID MALAN: There we go. Hello, David. All right, thank you. [APPLAUSE] OK, so we have these
functions then in place, but what more can we do? Well what about those conditionals and loops and
other constructs? How can we bring these programs to life so it's not just clicking a button and voila,
something's happening? Let's go ahead and make this now even more interactive.

- [1:08:04](https://www.youtube.com/watch?v=undefined&t=4084s) Let me go ahead and throw away


most of these pieces and let me just spice things up with some more audio under Sound. I'm going to go
to Play Sound Meow until done. Here we go, green flag. [MEOW] OK, it's a little loud, but it did exactly
do what it said. Let's hear it again. [QUIETER MEOW] OK. It's kind of an underwhelming program
eventually since you'd like to think that the cat would just meow on its own, but.

- [1:08:27](https://www.youtube.com/watch?v=undefined&t=4107s) [MEOW] I have to keep hitting the


button. Well this seems like an opportunity for doing something again and again. So all right, well if I
wanted to meow, meow, meow, let me just grab a few of these, or you can even right click or Control
click and you can Copy Paste even in code here. Let me play this now.

- [1:08:41](https://www.youtube.com/watch?v=undefined&t=4121s) [THREE MEOWS] All right, so now


like it's not really emoting happiness in quite the same way. It might be hungry or upset. So let's slow it
down. Let me go to Control, wait one second in between, which might be a little less worrisome. Here
we go, Play. [THREE SLOWER MEOWS] OK, so if my goal was to make the cat meow three times, I dare
say this code or algorithm is correct.

- [1:09:10](https://www.youtube.com/watch?v=undefined&t=4150s) But let's now critique its design. Is


this well-designed? And if not, why not? What are your thoughts here? Yeah? AUDIENCE: You could use
the forever or a repeat to make it more-- DAVID MALAN: Yeah, so yeah, agreed. I could use forever or
repeat, but let me push a little harder. But why? Like this works, I'm kind of done with the assignments,
what's bad about it? AUDIENCE: There's too much repetition.

- [1:09:38](https://www.youtube.com/watch?v=undefined&t=4178s) DAVID MALAN: Yeah, there's too


much repetition, right? If I wanted to change the sound that the cat is making to a different variant of
meow or have it bark instead like a dog, I could change it from the dropdown here apparently, but then
I'd have to change it here and then I'd have to change it here, and God, if this were even longer that just
gets tedious quickly and you're probably increasing the probability that you're going to screw up and
you're going to miss one of the dropdowns or something stupid and introduce a bug.

- [1:10:00](https://www.youtube.com/watch?v=undefined&t=4200s) Or, if you wanted to change the


number of seconds you're waiting, you've got to change it in two, maybe even more places. Again, you're
just creating risk for yourself and potential bugs in the program. So I do like the repeat or the forever
idea so that I don't repeat myself. And indeed, what I alluded to being possible, copy pasting earlier,
doesn't mean it's a good thing.

- [1:10:19](https://www.youtube.com/watch?v=undefined&t=4219s) And in code, generally speaking,


when you start to copy and paste puzzle pieces or text next week, you're probably not doing something
quite well. So let me go ahead and throw away most of these to get rid of the duplication, keeping just
two of the blocks that I care about. Let me grab the Repeat block for now, let me move this inside of the
Repeat block, it's going to grow to fit it, let me reconnect all this and change the 10 just to a 3, and now,
Play.

- [1:10:44](https://www.youtube.com/watch?v=undefined&t=4244s) [THREE SLOW MEOWS] So, better.


It's the same thing. It's still correct, but now I've set the stage to let the cat meow, for instance, four
times by changing one thing, 40 times by changing one thing, or it could just use the Forever block and
just walk away and it will meow forever instead. If that's your goal, that would be better.
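
In Python, the Repeat block corresponds roughly to a `for` loop. In this sketch, `play_sound` and `wait` are hypothetical stand-ins for the Scratch blocks, the `played` list exists only so the result can be inspected, and the wait is shortened from the lecture's one second so the example finishes quickly.

```python
import time

played = []  # a log of sounds, kept only so the behavior can be checked


def play_sound(name: str) -> None:
    """Stand-in for Scratch's play sound block."""
    played.append(name)
    print(name)


def wait(seconds: float) -> None:
    """Stand-in for Scratch's wait block."""
    time.sleep(seconds)


REPETITIONS = 3  # change this one number to meow 4 or 40 times

for _ in range(REPETITIONS):
    play_sound("meow")
    wait(0.1)  # the lecture waits one second; shortened here
```

The Forever block would correspond to `while True:` in place of the `for` line, which is left out here because it would never finish.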

- [1:11:05](https://www.youtube.com/watch?v=undefined&t=4265s) A better design but still correct. But


you know what? Now that I have a program that's designed to have a cat meow, wow like why? I mean,
MIT invented Scratch, Scratch is a cat, why is there no puzzle piece called Meow? This feels like a missed
opportunity. Now to be fair, they gave us all the building blocks with which we could implement that
idea, but a principle of programming and really computer science is to leverage what we're going to now
start calling Abstraction.

- [1:11:30](https://www.youtube.com/watch?v=undefined&t=4290s) We have step-by-step instructions


here, the Repeat, the Play, and the Wait that collectively implement this idea that we humans would
call meowing. Wouldn't it be nice to abstract away those several puzzle pieces into just one that literally
just says what it does, meow? Well here's where we can make our own blocks.

- [1:11:48](https://www.youtube.com/watch?v=undefined&t=4308s) Let me go over here to Scratch


under the pink block category here and let me click Make a Block. Here I see a slightly different interface
where I can choose a name for it and I'm going to call it Meow. I'm going to keep it simple. That's it. No
inputs to meow yet. I'm just going to click OK.

- [1:12:05](https://www.youtube.com/watch?v=undefined&t=4325s) Now I'm just going to clean this up


a bit here. Let me drag and drop Play Sound and Wait over here. And you know what? I'm just going to
drag this way down here, way down here because now that I'm done implementing Meow, I'm going to
literally abstract it away, sort of out of sight, out of mind, because now notice at top left there is a new
pink puzzle piece called Meow.

- [1:12:27](https://www.youtube.com/watch?v=undefined&t=4347s) So at this point, I'd argue it doesn't


really matter how Meow is implemented. Frankly, I don't know how Ask or Say was implemented by MIT.
They abstracted those things away for us. Now I have a brand new puzzle piece that just says what it is.
And this is now still correct, but arguably better design.
- [1:12:44](https://www.youtube.com/watch?v=undefined&t=4364s) Why? Because it's just more
readable to me, to you, it's more maintainable when you look at your code a year from now for the first
time because you're sort of finally looking back at the very first program you wrote. It says what it does.
The function itself has semantics, which conveys what's going on.
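
Making your own block corresponds to defining your own function. The sketch below assumes stand-in helpers (`play_sound`, a `played` log, and a shortened wait) that are not part of the lecture; the point is only that the loop at the bottom now reads as "meow three times" without showing how meowing works.

```python
import time

WAIT_SECONDS = 0.1  # the lecture waits one second; shortened here
played = []         # a log of sounds, kept only so the behavior can be checked


def play_sound(name: str) -> None:
    """Stand-in for Scratch's play sound block."""
    played.append(name)
    print(name)


def meow() -> None:
    """The custom block: the details live here, out of sight."""
    play_sound("meow")
    time.sleep(WAIT_SECONDS)


# The main program now says what it does.
for _ in range(3):
    meow()
```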

- [1:12:59](https://www.youtube.com/watch?v=undefined&t=4379s) If you really care about how Meow


is implemented, you could scroll down and start to tinker with the underlying implementation details,
but otherwise you don't need to care anymore. Now I feel like there's an even additional opportunity
here for abstraction and to factor out some of this functionality.

- [1:13:18](https://www.youtube.com/watch?v=undefined&t=4398s) It's kind of lame that I have this


Repeat block that lets me call the Meow function, so to speak, use the Meow function three times.
Wouldn't it be nice if I could just call the Meow function, aka use the Meow function, and pass it an
input that tells the puzzle piece how many times I want it to meow? Well let me go ahead and zoom out
and scroll down.

- [1:13:38](https://www.youtube.com/watch?v=undefined&t=4418s) Let me right click or Control click


on the pink piece here and choose Edit, or I could just start from scratch, no pun intended, with a new
one. Now here, rather than just give this thing a name Meow, let me go ahead and add an input here.
I'm going to go ahead and type in, for instance, n, for number of times to meow, and just to make this
even more user friendly and self descriptive, I'm going to add a label, which has no functional impact, it's
just an aesthetic, and I'm just going to say Times, just to make it read more like English

- [1:14:06](https://www.youtube.com/watch?v=undefined&t=4446s) in this case that tells me what the


puzzle piece does. Now I'm going to click OK. And now I need to refine this a little bit. Let me go ahead
and grab under Control a repeat block, let me move the Play, Sound, and Wait, into the repeat block. I
don't want 10 and I also don't want 3 here. What I want now is this n that is my actual variable that
Scratch is creating for me that represents whatever input the human programmer provides.

- [1:14:34](https://www.youtube.com/watch?v=undefined&t=4474s) Notice that snaps right in place. Let


me connect this and now voila, I have an even fancier version of Meow that is parameterized. It takes
input that affects its behavior accordingly. Now I'm going to scroll back up, because out of sight, out of
mind, I just care that Meow exists. Now I can tighten up my code, so to speak, use even fewer lines to do
the same thing by throwing away the Repeat block, reconnecting this new puzzle piece here that takes
an input like 3 and voila, now we're really programming, right?
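
As a rough text-based analogue of the parameterized Meow block described above, here is a minimal Python sketch. The function name `meow` and the choice to return a list of strings (rather than play a sound) are assumptions made for illustration; Scratch itself uses puzzle pieces, not Python.

```python
def meow(n: int) -> list[str]:
    """Return the sounds produced by meowing n times.

    The parameter n plays the role of the input added to the custom block.
    """
    sounds = []
    for _ in range(n):  # stands in for Scratch's "repeat n" block
        sounds.append("meow")
    return sounds


# The caller no longer needs its own repeat loop: it just passes 3.
print(meow(3))
```

The caller stays short because the repetition lives inside the function, which is the same design improvement the transcript describes.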

- [1:15:02](https://www.youtube.com/watch?v=undefined&t=4502s) We've not made any forward


progress functionally. The thing just meows three times. But it's a better design. As you program more
and more, these are the kinds of instincts you'll start to acquire so that one, you can start to take a big
assignment, a big problem set, something for homework even, that feels kind of overwhelming at first,
like, oh my God where do I even begin? But if you start to identify what are the subproblems of a bigger
problem? Then you can start making progress.

- [1:15:27](https://www.youtube.com/watch?v=undefined&t=4527s) I do this to this day where if I have


to tackle some programming-related project it's so easy to drag my feet and ugh, it's going to take
forever to start, until I just start writing down like a to do list and I start to modularize the program and
say, all right, well what do I want this thing to do? Meowing.
- [1:15:43](https://www.youtube.com/watch?v=undefined&t=4543s) What's that mean? I've got to have
it say something on the screen. All right, I need to have it say something on the screen some number of
times. Like literally a mental or written checklist, or pseudocode, if you will, in English on a piece of
paper or text file, and then you can decide, OK, the first thing I need to do for homework to solve this
real world problem, I just need a Meow function.

- [1:16:02](https://www.youtube.com/watch?v=undefined&t=4562s) I need to use a bunch of other


code, too, but I need to create a Meow function and boom, now you have a piece of the problem solved
not unlike we did with the phone book there, but in this case, we'll have presumably other problems to
solve. All right, so what more can we do? Let's add a few more pieces to the puzzle here.

- [1:16:19](https://www.youtube.com/watch?v=undefined&t=4579s) Let's actually interact with the cat


now. Let me go ahead and now when the green flag is clicked, let me go ahead and ask a question using
an event here. Let me go ahead and say, let's see, I want to do something like implement the notion of
petting the cat. So if the cursor is touching the cat like here, something like this, it'd be cute if the cat
meows like you're petting a cat.

- [1:16:42](https://www.youtube.com/watch?v=undefined&t=4602s) So I'm going to ask the question,


when the green flag is clicked, if let's see I think I need Sensing. So if touching mouse pointer, this is way
too big but again the shape is fine, so there goes. Grew to fill. And then if it's touching the mouse
pointer, that is if the cat to whom this script or this program, any time I attach puzzle pieces MIT calls
them a script or like a program, if you will, let me go ahead then and choose a sound and say play sound
meow until done.

- [1:17:10](https://www.youtube.com/watch?v=undefined&t=4630s) All right, so here it is to be clear.


When the green flag is clicked, ask the question, if the cat is touching the mouse pointer then play
sound meow. Here we go. Play. [SILENCE] All right, let's try again. Play. [SILENCE] Huh. I'm worried it's not
Scratch's fault. Feels like mine. What's the bug here? Why doesn't this work? Yeah, in back, who just
turned.

- [1:17:42](https://www.youtube.com/watch?v=undefined&t=4662s) AUDIENCE: [INAUDIBLE] DAVID


MALAN: Yeah, the problem is the moment I click that green flag, Scratch asks the question, is the cat
touching the mouse pointer? And obviously it's not because the cursor was like up there a moment ago
and it's not down there. It's fine if I move the cursor down there, but too late.

- [1:18:01](https://www.youtube.com/watch?v=undefined&t=4681s) The program already asked the


question. The answer was no or false or zero, however you want to think about it, so no sound was
played. So what might the solution here be? I could move my cursor quickly, but that feels like never
going to work out right. Other solutions here? Yeah, in way back? AUDIENCE: Could you use the forever loop? DAVID MALAN: The
Forever loop.

- [1:18:21](https://www.youtube.com/watch?v=undefined&t=4701s) So I could indeed use this Forever


loop because if I want my program to just constantly listen to me, well let's literally do something
forever, or at least forever as long as the program is running until I explicitly hit Stop. So let me grab that.
Let me go to Control, let me grab the Forever block, let me move the If inside of this Forever block,
reconnect this, go back up here, click the green flag, and now nothing's happened yet, but let me try
moving my cursor now.
- [1:18:46](https://www.youtube.com/watch?v=undefined&t=4726s) [MEOW] Oh. So now. [MEOW]
That's kind of cute. So now the cat is actually responding and it's going to keep doing this again and
again. So now we have this idea of taking these different ideas, these different puzzle pieces, assembling
them into something more complicated. I could definitely put a name to this.
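
The bug and its fix can be sketched in Python as a small simulation. Everything here is an assumption for illustration: the cat is modeled as a rectangle, the mouse as a list of positions sampled over time, and a "meow" as a counter increment.

```python
def touching(pointer, cat_box):
    """True when the pointer lies inside the cat's bounding box."""
    x, y = pointer
    left, bottom, right, top = cat_box
    return left <= x <= right and bottom <= y <= top


def check_once(pointer_positions, cat_box):
    """Buggy version: ask the question only at the moment the flag is clicked."""
    return 1 if touching(pointer_positions[0], cat_box) else 0


def check_forever(pointer_positions, cat_box):
    """Fixed version: ask the same question on every pass through the loop."""
    meows = 0
    for pointer in pointer_positions:  # stands in for Scratch's "forever"
        if touching(pointer, cat_box):
            meows += 1
    return meows


CAT = (-50, -50, 50, 50)  # hypothetical bounding box for the cat
path = [(200, 150), (100, 80), (10, 10), (0, 0)]  # cursor moving toward the cat

print(check_once(path, CAT))     # 0: the cursor started far away
print(check_forever(path, CAT))  # 2: the last two positions touch the cat
```

A real forever loop never ends; the finite list of positions is only there so the sketch terminates.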

- [1:19:04](https://www.youtube.com/watch?v=undefined&t=4744s) I could create a custom block, but


for now let's just consider what kind of more interactivity we can do. Let me go ahead and do this. By
again grabbing a, when green flag clicked, let me go ahead and click the video sensing, and I'm going to
rotate the laptop because otherwise we're going to get a little inception thing here where the camera is
picking up the camera is up there.

- [1:19:22](https://www.youtube.com/watch?v=undefined&t=4762s) So I'm going to go reveal to you


what's inside the lectern here while we rotate this. Now that we have a non video backdrop, I'm going to
say this. Instead of the green flag clicked, actually, I'm going to say when the video motion is greater than
some arbitrary measurement of motion, I'm going to go ahead and play sound meow until done.

- [1:19:47](https://www.youtube.com/watch?v=undefined&t=4787s) And then I'm going to get out of


the way. So here's the cat. We'll put them on top of there. [MEOW] OK. All right, and here we go.
[MEOW] So my hand is moving faster than 50 something or other, whatever the unit of measure is.
[MEOW] AUDIENCE: Aw. DAVID MALAN: (LAUGHING) Thank you. So now we have an even more
interactive version.
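
The threshold idea can be sketched in Python as follows. The value 50 comes from the transcript; the list of motion readings and the function name `motion_events` are hypothetical, and the unit of measure is left unspecified just as it is in the lecture.

```python
def motion_events(samples, threshold=50):
    """Return True for each motion reading that would trigger a meow."""
    return [sample > threshold for sample in samples]


# A slow hand stays under the threshold; a fast one exceeds it.
print(motion_events([10, 49, 50, 51, 90]))
```

Note that a reading equal to the threshold does not trigger anything, since the comparison is strictly greater than.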

- [1:20:11](https://www.youtube.com/watch?v=undefined&t=4811s) [MEOW] But I think if I sort of


slowly. [LAUGHING] (LAUGHING) Right? It's completely creepy, but I'm not like exceeding the threshold--
[MEOW] Until finally my hand moves as fast as that. And so here actually is an opportunity to show you
something a former student did. Let me go ahead here and-- [MEOW TWICE] OK, got to stop this.

- [1:20:36](https://www.youtube.com/watch?v=undefined&t=4836s) Let me go ahead and zoom out of


this in just a moment. [MEOW] If someone would be-- [LAUGHING] (LAUGHING) If someone would be
comfortable coming up not only masked but also on camera on the internet I thought we'd play one of
your former classmate's projects here up on stage. Would anyone like to volunteer here and be up on
stage? Who's that? Yeah.

- [1:20:52](https://www.youtube.com/watch?v=undefined&t=4852s) Come on down. What's your


name? AUDIENCE: Sahar. DAVID MALAN: Sahar. All right, come on down. Let me get it set up for you
here. [MEOW] [APPLAUSE] [MEOW] All right, let me go ahead and full screen this here. So this is
whack-a-mole by one of your former predecessors. It's going to use the camera focusing on your head, which
we'll have to position inside of this rectangle.

- [1:21:22](https://www.youtube.com/watch?v=undefined&t=4882s) Have you ever played the whack-a-


mole game at an arcade? AUDIENCE: Yeah. DAVID MALAN: OK. So for those who haven't, these little
moles pop up and with a very fuzzy hammer you sort of hit down. You though, if you don't mind, you're
going to use your head to do this virtually. So let's line up your head with this red rectangle, if you could,
we'll do beginner.

- [1:21:40](https://www.youtube.com/watch?v=undefined&t=4900s) [MUSIC PLAYING] All right, here we


go. Sahar. Give it a moment. OK, come a little closer. [DINGING] And now hit the moles with your head.
[DING] There we go, one point. [DING] One point. [DINGING] Nice. 15 seconds to go. There we go. Oh
yeah. One point. [LAUGHING] [DINGING] Six seconds. AUDIENCE: Oh no. DAVID MALAN: There we go.

- [1:22:12](https://www.youtube.com/watch?v=undefined&t=4932s) Quick! [DINGING] All right, a round


of applause for Sahar. Thank you. [APPLAUSE] So beyond having a little bit of fun here, the goal was to
demonstrate that by using some fairly simple primitives, some basic building blocks, but assembling them
in a fun way with some music, maybe some new costumes or artwork, you can really bring programs to
life.

- [1:22:38](https://www.youtube.com/watch?v=undefined&t=4958s) But at the end of the day, the only


puzzle pieces really involved were ones like the ones I just dragged and dropped and a few more,
because there were clearly lots of moles. So the student probably created a few different sprites, not a
single cat, but at least four different moles. They had like some kind of graphic on the screen that
showed Sahar where to position her head.

- [1:22:55](https://www.youtube.com/watch?v=undefined&t=4975s) There was some kind of timer,


maybe a variable that every second was counting down. So you can imagine taking what looks like a
pretty impressive project at first glance, and perhaps overwhelming to solve yourself, but just think
about what are the basic building blocks? And pluck off one piece of the puzzle, so to speak, at a time.

- [1:23:12](https://www.youtube.com/watch?v=undefined&t=4992s) So indeed if we rewind a little bit.


Let me go ahead here and introduce a program that I myself made back in graduate school when Scratch
was first being developed by MIT. Let me go ahead and open here, give me just one second, something
that I called back in the day Oscar Time that looks a little something like this.

- [1:23:33](https://www.youtube.com/watch?v=undefined&t=5013s) If I fullscreen it and hit Play.


[MUSIC - SESAME STREET, "I LOVE TRASH"] OSCAR THE GROUCH: (SINGING) Oh, I love trash. DAVID
MALAN: So you'll notice a piece of trash is falling. I can click on it and drag and as I get closer and closer to
the trash can, notice-- OSCAR THE GROUCH: (SINGING) Anything ragged or-- DAVID MALAN: It wants to go
in, it seems.

- [1:23:50](https://www.youtube.com/watch?v=undefined&t=5030s) And if I let go-- OSCAR THE


GROUCH: (SINGING) Yes, I-- DAVID MALAN: One point. Here comes another. OSCAR THE GROUCH:
(SINGING) If you really want to see something trashy-- DAVID MALAN: I'll do the same, two points.
OSCAR THE GROUCH: (SINGING) I have here a sneaker that's tattered and worn-- DAVID MALAN: There's
a sneaker falling from the sky, so another sprite of some sort.

- [1:24:03](https://www.youtube.com/watch?v=undefined&t=5043s) OSCAR THE GROUCH: (SINGING)


The laces are torn. A gift from my mother-- DAVID MALAN: I can also get just a little lazy and just let
them fall into the trash themself if I want to. So you can see it doesn't have to do with my mouse cursor,
it has to do apparently with the distance here. Let's listen a little further.

- [1:24:20](https://www.youtube.com/watch?v=undefined&t=5060s) I think some additional trash is


about to make its appearance. Presumably there's some kind of variable that's keeping track of this
score. OSCAR THE GROUCH: (SINGING) I love-- DAVID MALAN: OK, let's see what the last chorus here is.
OSCAR THE GROUCH: (SINGING) Rotten stuff. I have here some newspaper, crusty and DAVID MALAN:
OK, and thus he continues.
- [1:24:38](https://www.youtube.com/watch?v=undefined&t=5078s) And the song actually goes on and
on and on and I do not have fond memories of implementing this and hearing this song for like 10
straight hours, but it's a good example to just consider how was this program composed? How did I go
about implementing it the first time around? And let me go ahead and open up some programs now that
I wrote in advance just so that we could see how these things are assembled.

- [1:24:59](https://www.youtube.com/watch?v=undefined&t=5099s) Honestly, the first thing I probably


did was probably to do something a little like this. Here is just a version of the program where I set out to
solve just one problem first of planting a lamp post in the program. Right? I kind of had a vision of what I
wanted. You know, it evolved over time, certainly, but I knew I wanted trash to fall, I wanted a cute little
Oscar the Grouch to pop out of the trashcan, and some other stuff, but wow that's a lot to just tackle all
at once.

- [1:25:24](https://www.youtube.com/watch?v=undefined&t=5124s) I'm going to start easy, download a


picture of a lamp post, and then drag and drop it into the stage as a costume and boom, that's version
one. It doesn't functionally do anything. I mean, literally that's the code that I wrote to do this. All I did
was use like the Backdrops feature and drag and drop and move things around, but it got me to version
one of my program.

- [1:25:44](https://www.youtube.com/watch?v=undefined&t=5144s) Then what might version two be?


Well I considered what piece of functionality frankly might be the easiest to pluck off next and the trash
can. That seems like a pretty core piece of functionality. It just needs to sit there most of the time. So the
next thing I probably did was to open up, for instance, the trash can version here that looks a little
something now like this.

- [1:26:06](https://www.youtube.com/watch?v=undefined&t=5166s) So this time I'll show you what's


inside here. There is some code, but not much. Notice at bottom right I changed the default cat to a
picture of a trashcan, instead, but it's the same principle that I can control. And then over here I added
this code. When the green flag is clicked, switch the costume to something I arbitrarily called Oscar 1.

- [1:26:25](https://www.youtube.com/watch?v=undefined&t=5185s) So I found a couple of different


pictures of a trash can, one that looks closed, one that looks partly open, and eventually one that has
Oscar coming out, and I just gave them different names. So I said Switch to Oscar 1, which is the closed
one by default, then forever do the following: if touching the mouse pointer, then switch the costume to
Oscar 2, else switch to Oscar 1.

- [1:26:46](https://www.youtube.com/watch?v=undefined&t=5206s) That is to say, I just wanted to


implement this idea of the can opening and closing, even if it's not exactly what I wanted ultimately, I
just wanted to make some forward progress. So here, when I run this program by clicking Play, notice
what happens. Nothing yet, but if I get closer to the trash can, it indeed pops open because it's forever
listening for whether the sprite, the trash can in this case, is touching the mouse pointer.
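
The open-and-close logic of the trash can might look like this in Python. The costume names mirror the ones mentioned in the transcript; representing each pass of the forever loop as one boolean in a list is an assumption made so the sketch can terminate.

```python
def costume_for(touching_mouse: bool) -> str:
    """Pick the costume: open can when touched, closed can otherwise."""
    if touching_mouse:
        return "Oscar 2"
    else:
        return "Oscar 1"


def run(touch_readings):
    """Replay the forever loop over a finite list of touch readings."""
    costume = "Oscar 1"  # set once when the green flag is clicked
    history = []
    for touching_mouse in touch_readings:
        costume = costume_for(touching_mouse)
        history.append(costume)
    return history


print(run([False, True, True, False]))
```

The `else` branch matters: without it the can would pop open once and never close again.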

- [1:27:12](https://www.youtube.com/watch?v=undefined&t=5232s) And that's it. That was version 2, if


you will. If I went in now and added the lamp post and composed the program together, now we're
starting to make progress. Right? Now it would look a little something more like the program I intended
ultimately to create. What piece did I probably bite off after that? Well, I think what I did is I probably
decided let me implement one of the pieces of trash, not the shoe and the newspaper all at once.
- [1:27:34](https://www.youtube.com/watch?v=undefined&t=5254s) Let's just get one piece of trash
working correctly first. So let me go ahead and open this one. And again, all of these examples will be
available on the course's website so you can see all of these examples, too. It's not terribly long, I just
implemented it in advance so we could flip through kind of quickly.

- [1:27:51](https://www.youtube.com/watch?v=undefined&t=5271s) Here's what I did here. On the right


hand side, I turned my sprite into a piece of trash this time instead of a cat, instead of a trash can, and I
also created, with Carter's help, a second sprite, this one a floor. It's literally just a black line because I
just wanted initially to have some notion of a floor so I could detect if the trash is touching the floor.

- [1:28:12](https://www.youtube.com/watch?v=undefined&t=5292s) Now without seeing the code yet,


just hearing that description, why might I have wanted the second sprite and this black line for a floor
with the trash intending to fall from the sky? What might I have been thinking? Like what problem might
I be trying to solve? Yeah? AUDIENCE: You don't want the first sprite to go through it.

- [1:28:29](https://www.youtube.com/watch?v=undefined&t=5309s) DAVID MALAN: Yeah, you don't


want the first sprite to start at the top, go through, and then boom, you completely lose it. That would
not be a very useful thing. Or it would seem to maybe eat up more and more of the computer's memory
if the trash is just endlessly falling and I can't grab it.

- [1:28:43](https://www.youtube.com/watch?v=undefined&t=5323s) It might be a little traumatic if you


tried to get it and you can't pull it back out and you can't fix the program. So I just wanted the thing to
stop. So how might I have implemented this? Let's look at the code at left. Here I have a bit of
randomness, like I proposed earlier exists. There's this blue function called Go To x, y that lets me move a
sprite to any position, up, down, left, right, I picked a random x location, either here or over here,
negative 240 to positive 240, and then a y value of 180, which is the top.

- [1:29:12](https://www.youtube.com/watch?v=undefined&t=5352s) This just makes the game more


interesting. It's kind of lame pretty quickly if the trash always falls from the same spot. Here's a little
bit of randomness, like most any game would have, that spices things up. So now if I click the green flag,
you'll see that it just falls, nothing interesting is going to happen, but it does stop when it touches the
black line because notice what we did here.

- [1:29:33](https://www.youtube.com/watch?v=undefined&t=5373s) I'm forever asking the question if


the distance from the sprite, the trash, to the floor is greater than zero, that's fine. Change the y location
by negative 3. So move it down 3 pixels, down 3 pixels, until the distance to the floor is not greater than
zero, it is zero or even negative, at which point it should just stop moving altogether.
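
A minimal Python sketch of the falling logic, under stated assumptions: the x range of -240 to 240, the starting y of 180, and the step of 3 come from the transcript, while the floor's position (`FLOOR_Y`) is a made-up value and the distance to the floor is simplified to a vertical difference.

```python
import random

TOP_Y = 180      # top of the stage, per the transcript
FLOOR_Y = -180   # assumed position of the black floor line
STEP = 3         # pixels moved per pass through the loop


def drop(rng: random.Random):
    """Start at a random x along the top, then fall until the floor is reached."""
    x = rng.randint(-240, 240)
    y = TOP_Y
    # Scratch loops forever and simply stops changing y; here the loop
    # ends once the distance to the floor is no longer greater than zero.
    while y - FLOOR_Y > 0:
        y -= STEP
    return x, y


print(drop(random.Random()))
```

Passing in the random number generator keeps the sketch easy to test with a fixed seed.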

- [1:29:54](https://www.youtube.com/watch?v=undefined&t=5394s) There's other ways we could have


implemented this, but this felt like a nice, clean way that logically, just made it make sense. OK, now I got
some trash falling, I got a trash can that opens and closes, I have a lamp post, now I'm a good three steps
into the program. We're making progress. If we consider one or two final pieces, something like the
dragging of the trash, let me go ahead and open up this version 2.

- [1:30:18](https://www.youtube.com/watch?v=undefined&t=5418s) Dragging the trash requires a


different type of question. Let me zoom in here. Here's the piece of trash. I only need one sprite, no floor
here because I just want the human to move it up, down, left, right and the human's not going to
physically be able to move it outside of the world. If we zoom in on this code, the way we've solved this
is as follows.

- [1:30:36](https://www.youtube.com/watch?v=undefined&t=5436s) We're using that And conjunction


that we glimpsed earlier because when the green flag is clicked, we're forever asking this question or
really these questions, plural, if the mouse is down and the trash is touching the mouse pointer, that's
equivalent logically to clicking on the trash. Go ahead and move the trash to the mouse pointer.

- [1:30:59](https://www.youtube.com/watch?v=undefined&t=5459s) So again it takes this very familiar


idea that you and I take for granted every day on Macs and PCs of clicking and dragging and dropping.
How is that implemented? Well Mac OS or Windows are probably asking a question. For every icon, is
the mouse down and is the icon touching the mouse? If so, go to the location of the mouse forever while
the mouse button is clicked down.
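
The two-part question behind dragging can be sketched in Python like this. The `radius` used to decide whether the trash is "touching" the pointer is an assumption; Scratch compares the actual sprite image against the pointer.

```python
def drag_step(trash_pos, mouse_pos, mouse_down, radius=20):
    """Return the trash's new position after one pass through the loop."""
    touching = (
        abs(trash_pos[0] - mouse_pos[0]) <= radius
        and abs(trash_pos[1] - mouse_pos[1]) <= radius
    )
    # Both conditions must hold, which is what Scratch's "and" block expresses.
    if mouse_down and touching:
        return mouse_pos
    return trash_pos


print(drag_step((0, 0), (5, 5), mouse_down=True))     # follows the mouse
print(drag_step((0, 0), (5, 5), mouse_down=False))    # button up: stays put
print(drag_step((0, 0), (200, 0), mouse_down=True))   # too far away: stays put
```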

- [1:31:22](https://www.youtube.com/watch?v=undefined&t=5482s) So how does this work in reality


now? Let me go ahead and click on the Play. Nothing happens at first, but if I click on it, I can move it up,
down, left, right. It doesn't move thereafter. So I now need to kind of combine this idea of dragging with
falling, but I bet I could just start to use just one single program.

- [1:31:40](https://www.youtube.com/watch?v=undefined&t=5500s) Right now I'm using separate ones


to show different ideas, but now that's another bite out of the problem. If we do one last one,
something like the scorekeeping is interesting, because recall that every time we dragged a piece of trash
into the can, Oscar popped out and told us the current score.

- [1:31:55](https://www.youtube.com/watch?v=undefined&t=5515s) So let me go ahead and find this


one, Oscar variables, and let me zoom in on this one. This one is longer because we combined all of
these elements. So this is the kind of thing that if you looked at first glance, like, I have no idea how I
would have implemented this from nothing, from scratch literally.

- [1:32:12](https://www.youtube.com/watch?v=undefined&t=5532s) But again, if you take your vision


and componentize it into these smaller, bite-sized problems, you could take these baby steps, so to
speak, and then solve everything collectively. So what's new here is this bottom one. Forever do the
following: if the trash is touching Oscar, the other sprite that we've now added to the program, change
the score by 1.

- [1:32:36](https://www.youtube.com/watch?v=undefined&t=5556s) This is an orange and indeed if we


poke around we'll see that orange is a variable, like an x or y but with a better name, changing it means
to add 1 or if it's negative subtract 1. Then go ahead and have the trash go to pick random. What is this
all about? Well, let me show you what it's doing and then we can infer backwards.

- [1:32:57](https://www.youtube.com/watch?v=undefined&t=5577s) Let me go ahead and hit Play. All


right, it's falling, I'm clicking and dragging it, I'm moving it over, and I'm letting go. All right, let me do it
once more. Letting go, let me stop. Why do I have this function at the end called Go To x and y
randomly? Like what problem is this solving here? Yeah, in way back.

- [1:33:18](https://www.youtube.com/watch?v=undefined&t=5598s) AUDIENCE: Just the same trash


teleported to the top after you put it in the trash can. DAVID MALAN: Yeah, exactly. Even though the
human perceives this as like a lot of trash falling from the sky, it's actually the same piece of trash, just
kind of being magically moved back to the top as though it's a new one.

- [1:33:34](https://www.youtube.com/watch?v=undefined&t=5614s) There, too, you have this idea of


reusable code. If you were constantly copying and pasting your pieces of trash and creating 20 pieces of
trash, 30 pieces of trash, just because you want the game to have that many levels, you're probably doing
something wrong. Reuse the code that you wrote, reuse the sprites that you wrote, and that would give
you not just correctness, but also a better design.
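
The scorekeeping and the reuse of a single piece of trash might be sketched in Python as follows. The `touching` helper and its `radius`, along with the sample coordinates, are hypothetical; only the "change score by 1" and "go back to a random spot at the top" steps come from the transcript.

```python
import random

TOP_Y = 180  # top of the stage, per the transcript


def touching(a, b, radius=20):
    """Crude stand-in for Scratch's 'touching' check between two sprites."""
    return abs(a[0] - b[0]) <= radius and abs(a[1] - b[1]) <= radius


def score_step(trash, oscar, score, rng):
    """One pass through the loop: score a point and recycle the same trash."""
    if touching(trash, oscar):
        score += 1
        # The same sprite is moved back to the top rather than creating a new one.
        trash = (rng.randint(-240, 240), TOP_Y)
    return trash, score


rng = random.Random()
trash, score = score_step((100, -120), (105, -125), 0, rng)
print(trash, score)
```

Only one trash object ever exists; the player just perceives it as a stream of new pieces.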

- [1:33:55](https://www.youtube.com/watch?v=undefined&t=5635s) Well let's take a look at one final


set of building blocks that we can compose ultimately into something particularly interactive as follows.
Let me go ahead and zoom out here and let me propose that we implement something like some kind of
maze-based game. Let me go ahead here. So I want to implement some maze-based game that looks at
first glance like this.

- [1:34:15](https://www.youtube.com/watch?v=undefined&t=5655s) Let me hit Play. It's not a very fun


game yet, but here's a little Harvard shield, a couple of black lines, this time vertical instead of horizontal,
but notice you can't quite see my hand here, but I'm using my arrow keys to go down, to go up, to go
left, to go right, but if I keep going right, right, right, right, right, right, right it's not going anywhere.

- [1:34:33](https://www.youtube.com/watch?v=undefined&t=5673s) And left, left, left, left, left, left, left,


left, left, left, left, left, left it eventually stops. So before we look at the code, how might this be working?
What kinds of scripts, collections of puzzle pieces, might collectively help us implement this? What do
you think? AUDIENCE: [INAUDIBLE] DAVID MALAN: Perfect, yeah.

- [1:34:58](https://www.youtube.com/watch?v=undefined&t=5698s) There's probably some question


being asked, if touching the black line, and it happens to be a couple of sprites, each of which is just
literally a vertical black line we're probably asking a question like, are you touching it? Is the distance to
it zero or close to zero? And if so, we just ignore the left or the right arrow at that point.

- [1:35:16](https://www.youtube.com/watch?v=undefined&t=5716s) So that works. But otherwise, if


we're not touching a wall, what are we probably doing instead forever here? How is the movement
working presumably? Yeah, in back. Oh, are you scratching? OK, sure. Let's go on. AUDIENCE:
[INAUDIBLE] DAVID MALAN: Sorry, say a little louder. AUDIENCE: Presumably it's continually looking for
you to hit the arrow keys and then moving when you do.

- [1:35:40](https://www.youtube.com/watch?v=undefined&t=5740s) DAVID MALAN: Exactly. It's


continually, forever listening for the arrow keys up, down, left, right, and if the up arrow is pressed, we're
probably changing the y by a positive value. If the down arrow is pressed, we're going down by y, and left
and right accordingly. So let's actually take a quick look.

- [1:35:56](https://www.youtube.com/watch?v=undefined&t=5756s) If I zoom out here and take a look


at the code that implements this, there's a lot going on at first glance, but let's see. First of all, let me
drag some stuff out of the way because it's kind of overwhelming at first glance, especially if you, for
instance, were poking around online as for problem set 0 just to get inspiration, most projects out there
are going to look overwhelming at first glance until you start to wrap your mind around what's going on.
- [1:36:16](https://www.youtube.com/watch?v=undefined&t=5776s) But in this case, we've
implemented some abstractions from the get go to explain to ourselves and to anyone else looking at
the code what's going on. This is that program with the two black lines and the Harvard shield going up,
down, left, and right. It initially puts the shield in the middle, 0,0, then forever listens for keyboard, as I
think you were describing, and it feels for the walls, as I think you were describing.

- [1:36:40](https://www.youtube.com/watch?v=undefined&t=5800s) Now how is that implemented?


Don't know yet. These are custom blocks we created as abstractions to kind of hide those
implementation details because honestly that's all I need to know right now. But, as aspiring
programmers, if we're curious now, let's scroll down to the actual implementation of listening for
keyboard.

- [1:36:57](https://www.youtube.com/watch?v=undefined&t=5817s) This is the one on the left and it is a


little long, but it's a lot of similar structure. We're doing the following, if the up arrow is pressed, then
change y by 1. Go up. If the down arrow is pressed, then change y by negative 1. Go down. Right arrow,
left arrow, and that's it. So it just assembles all of those ideas, combines it into one new block just
because it's kind of overwhelming, let's just implement it once and tuck it away.
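
A Python sketch of the "listen for keyboard" custom block, assuming the set of currently pressed keys is handed in as strings. The key names and the tuple representation of the position are illustrative choices, not anything Scratch exposes.

```python
def listen_for_keyboard(pos, pressed):
    """Move one pixel in the direction of each arrow key that is pressed."""
    x, y = pos
    if "up" in pressed:
        y += 1   # change y by 1
    if "down" in pressed:
        y -= 1   # change y by -1
    if "right" in pressed:
        x += 1   # change x by 1
    if "left" in pressed:
        x -= 1   # change x by -1
    return x, y


print(listen_for_keyboard((0, 0), {"up"}))
print(listen_for_keyboard((0, 0), {"down", "left"}))
```

Using four independent `if` statements rather than `elif` lets two keys held together produce diagonal movement.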

- [1:37:23](https://www.youtube.com/watch?v=undefined&t=5843s) And if we scroll now over to the


Feel for Walls function, this now is asking the question as hypothesized, if I'm touching the left wall,
change my x value by 1, sort of move away from it a little bit. If I'm touching the right wall, then move x
by negative 1 to move a little bit away from it. So it kind of bounces off the wall.
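
The "feel for walls" block can be approximated in Python by treating each wall as a fixed x coordinate. The wall positions below are assumptions, and "touching" is simplified to reaching or passing that coordinate.

```python
def feel_for_walls(x, left_wall_x=-100, right_wall_x=100):
    """Nudge the sprite one pixel back inside if it is touching a wall."""
    if x <= left_wall_x:    # touching the left wall
        x += 1
    if x >= right_wall_x:   # touching the right wall
        x -= 1
    return x


print(feel_for_walls(-100))  # pushed right, away from the left wall
print(feel_for_walls(100))   # pushed left, away from the right wall
print(feel_for_walls(0))     # not touching either wall: unchanged
```

Because this runs right after the keyboard step on every pass, a key press into a wall is immediately undone, which is why the shield appears to stop.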

- [1:37:42](https://www.youtube.com/watch?v=undefined&t=5862s) Just in case it slightly went over, we


keep the crest within those two walls. All right, then a couple of more pieces here to introduce. What if
we want to actually add some kind of adversary or opponent to this game? Well, let me go ahead to
maybe this one here where the adversary in this game might, for instance, be designed to be bouncing
to stand in your way.

- [1:38:05](https://www.youtube.com/watch?v=undefined&t=5885s) This is like a maze and you're trying


to get the Harvard shield from the bottom to the top or vice versa. Uh oh, Yale is in the way and it seems
to be automatically bouncing back and forth here. Well, let me ask someone else. Hypothesize. How is
this working? This is an idea you have, this is an idea you see.

- [1:38:21](https://www.youtube.com/watch?v=undefined&t=5901s) Let's reverse engineer in your head


how it works. How might this be working? Yeah, in back. AUDIENCE: If the Yale symbol is touching a right
wall or left wall, then have it bounce. DAVID MALAN: Yeah, so if the Yale symbol is touching the left wall
or the right wall, we somehow have it bounce. And indeed we'll see there's a puzzle piece that can do
exactly that technically off the edge, as we'll see, but there's another way we can do this.

- [1:38:46](https://www.youtube.com/watch?v=undefined&t=5926s) Let's look at the code. The way we


ourselves can implement exactly that idea of bouncing is just with a little bit of logic. So here's what this
version of the program is doing. It's moving Yale by default to 0,0 just to arbitrarily put it somewhere,
pointing it in direction 90 degrees, which means just horizontally, essentially, and then it's forever doing
this: if touching the left wall or touching the right wall, here's our translation of bounce.

- [1:39:10](https://www.youtube.com/watch?v=undefined&t=5950s) We're just turning 180 degrees.


And the nice thing about that is we don't have to worry if we're going from right to left or left to right.
180 degrees is going to work on both of the walls. And that's it. After we do that, we just move one step,
one pixel, at a time but we're doing it forever so something is happening continually and the Yale icon is
bouncing back and forth.
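The bounce logic just described can be sketched in C roughly as follows. This is a one-dimensional simplification: the `sprite` struct, the `bounce_step` function, and the wall coordinates are hypothetical names and values, and flipping the sign of `direction` stands in for Scratch's "turn 180 degrees".

```c
// Hypothetical wall positions; the lecture does not give coordinates.
#define LEFT_WALL  -200
#define RIGHT_WALL  200

typedef struct
{
    int x;          // horizontal position
    int direction;  // +1 means moving right, -1 means moving left
} sprite;

// One pass through the body of the "forever" loop.
void bounce_step(sprite *s)
{
    // If touching the left wall or touching the right wall, turn around.
    if (s->x <= LEFT_WALL || s->x >= RIGHT_WALL)
    {
        s->direction = -s->direction;   // same code works at both walls
    }
    s->x += s->direction;               // move one step
}
```

Because the same sign flip handles both walls, the code never needs to ask which way the sprite was travelling, which mirrors the point made about 180 degrees working on both walls.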

- [1:39:31](https://www.youtube.com/watch?v=undefined&t=5971s) Well one final piece here, what if


now we want another adversary, a more advanced adversary down the road for instance, to go and
follow us wherever we are such that this time we want the other sprite to not just bounce back and
forth, but literally follow us no matter where we go. How might this be implemented on the screen? I bet
it's another forever block, but what's inside? AUDIENCE: So forever get the location of the Harvard
shield and move one step towards it.

- [1:40:07](https://www.youtube.com/watch?v=undefined&t=6007s) DAVID MALAN: Yeah, forever point


at the location of the Harvard shield and go one step toward it. This is just going to go on forever if I just
give up, at least in this version. Notice it's sort of twitching back and forth because it goes one pixel then
one pixel then one pixel. It's sort of in a frantic state here.

- [1:40:22](https://www.youtube.com/watch?v=undefined&t=6022s) We haven't finished the game yet,


but if we see inside, we'll see exactly that. It didn't take much to implement this simple idea. Go to a
random position just to make it kind of fair, initially, then forever point towards Harvard, which is what
we called the Harvard crest sprite, move one step.

- [1:40:37](https://www.youtube.com/watch?v=undefined&t=6037s) Suppose we now wanted to make a


more advanced level. What's a minor change I could logically make to this code just to make MIT even
better at this? AUDIENCE: Change the number of steps to two. DAVID MALAN: All right, change the
number of steps to two. So let's try that. So now they got twice as fast.
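As a rough illustration of why changing the step count makes the chaser faster, here is a simplified sketch in C. It is not how Scratch implements "point towards": instead of computing an angle, it moves the chaser up to `steps` units along each axis per pass. All names (`point`, `approach`, `chase_step`, `passes_to_catch`) are hypothetical, and the target is assumed to stand still.

```c
#include <stdlib.h>

typedef struct
{
    int x;
    int y;
} point;

// Move one coordinate up to `steps` units toward its goal without overshooting.
static int approach(int from, int to, int steps)
{
    int gap = to - from;
    if (abs(gap) <= steps)
    {
        return to;
    }
    return from + (gap > 0 ? steps : -steps);
}

// One pass of "forever: point towards the target, move `steps` steps",
// simplified to independent moves on each axis.
void chase_step(point *chaser, point target, int steps)
{
    chaser->x = approach(chaser->x, target.x, steps);
    chaser->y = approach(chaser->y, target.y, steps);
}

// Count loop passes until the chaser reaches a stationary target.
// Assumes steps > 0; otherwise the loop would never end.
int passes_to_catch(point chaser, point target, int steps)
{
    int passes = 0;
    while (chaser.x != target.x || chaser.y != target.y)
    {
        chase_step(&chaser, target, steps);
        passes++;
    }
    return passes;
}
```

With the chaser 100 units away horizontally, one step per pass takes 100 passes and two steps per pass takes 50, which is the "twice as fast" effect observed in the demo.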

- [1:40:51](https://www.youtube.com/watch?v=undefined&t=6051s) Let me go ahead and just get this


out of the way. Oops, let me make it a fair fight. Green flag. All right, I unfortunately am still moving one
pixel at a time, so this isn't going to end well. It caught up to me. And if we're really aggressive and do
something like 20 steps at a time, click the green flag.

- [1:41:12](https://www.youtube.com/watch?v=undefined&t=6072s) Jesus, OK, so that's how you might


then make your levels progressively harder and harder. So it's not an accident that we chose these
particular examples here involving these particular schools because we have one more demonstration
we thought we'd introduce today if we could get one other volunteer to come up and play what was
called by one of your predecessors Ivy's Hardest Game.

- [1:41:34](https://www.youtube.com/watch?v=undefined&t=6094s) Let's see, you in the middle. Do you


want to come on up? What's your name? AUDIENCE: Celeste. DAVID MALAN: Say again? AUDIENCE:
Celeste. DAVID MALAN: Come a little closer, actually. Sorry, hard to hear here. All right, round of
applause here if we could, too. [APPLAUSE] OK, sorry, what was your name? AUDIENCE: Celeste.

- [1:41:55](https://www.youtube.com/watch?v=undefined&t=6115s) DAVID MALAN: Ceweste


AUDIENCE: Celeste. DAVID MALAN: Celeste. AUDIENCE: Yes. DAVID MALAN: Come on over. Nice to meet
you, too. So here we have on this other screen Ivy's Hardest Game written by a former CS50 student. I
think you'll see that it combines these same principles. The maze is clearly a little more advanced.

- [1:42:10](https://www.youtube.com/watch?v=undefined&t=6130s) The goal at hand is to initially move
the Harvard crest to the sprite all the way on the right so that you catch up to him in this case, but you'll
see that there's different levels and different levels of sophistication. So if you're up for it, you can use
just these arrow keys up, down, left, right.

- [1:42:25](https://www.youtube.com/watch?v=undefined&t=6145s) You'll be controlling the Harvard


sprite and if we could raise the volume just a little bit, we'll make this our final example. Here we go,
clicking the green flag. [MUSIC PLAYING] Feeling ready? AUDIENCE: Yep. DAVID MALAN: Spacebar.
[MUSIC - MC HAMMER, "U CAN'T TOUCH THIS"] MC HAMMER: (SINGING) Can't touch this.

- [1:42:45](https://www.youtube.com/watch?v=undefined&t=6165s) You can't touch this. You can't


touch this. Can't touch this. My, my, my, my music-- DAVID MALAN: Excellent. MC HAMMER: (SINGING)
so hard. Makes me want to say, oh my Lord. Thank you for blessing me-- DAVID MALAN: Two Yales now.
MC HAMMER: (SINGING) Feels good when you know you're down. A super dope homeboy-- AUDIENCE:
Oh! DAVID MALAN: Oh! Keep going.

- [1:43:10](https://www.youtube.com/watch?v=undefined&t=6190s) MC HAMMER: (SINGING) You can't


touch this. I told you, homeboy. Can't touch this. Yeah, that's how living-- DAVID MALAN: All right. MC
HAMMER: (SINGING) Can't touch this. Look at my eyes, man. You can't touch this. You let me bust the
funky lyrics. You can't touch this. Fresh new kicks and pants.

- [1:43:27](https://www.youtube.com/watch?v=undefined&t=6207s) You got it like that and you know


you want to dance. So move out of your seat and get a fly girl and catch this beat. [LAUGHING] Hold on.
Pump a little bit and let them know what's going on like that, like that. Cold on a mission, so fall on back.
Let them know that you're too-- DAVID MALAN: There you go.

- [1:43:42](https://www.youtube.com/watch?v=undefined&t=6222s) There you go. [APPLAUSE] MC


HAMMER: (SINGING) Can't touch this. Why you standing there, man? You can't touch this. Yo, sound the
bell. School's in, sucker. Can't touch this. Give me a song or rhythm, making them sweat that's what give
them. [CHEERING] They know. You talking the Hammer when you're talking about a show.

- [1:44:03](https://www.youtube.com/watch?v=undefined&t=6243s) That's hyped and tight. Singers are


sweating so pass them a wipe or a tape to learn. DAVID MALAN: Second to last level. Oh! MC HAMMER:
(SINGING) That chart's legit. Either work hard or you might as well quit. That word because you know--
DAVID MALAN: Oh! Keep going, keep going! Yes! MC HAMMER: (SINGING) You can't touch this.

- [1:44:20](https://www.youtube.com/watch?v=undefined&t=6260s) DAVID MALAN: You're almost


there. MC HAMMER: (SINGING) Break it down. DAVID MALAN: There you go. Go, go, go! Oh. One more.
Yes! [CHEERING] There you go. MC HAMMER: (SINGING) Stop, Hammer time. "Go with the flow," it is
said. If you can't groove to this, then you're probably dead. So wave your hands in the air, bust a few
moves, run your fingers through your hair.

- [1:44:45](https://www.youtube.com/watch?v=undefined&t=6285s) This is it. For a winner. Dance to


this and you're going to get thinner. Now move, slide your rump. Just for a minute let's all do the bump.
[CHEERING] DAVID MALAN: Yes! [APPLAUSE] Congratulations. All right, that's it for CS50. Welcome to the
class. We'll see you next time. [MUSIC PLAYING] DAVID J. MALAN: All right.

- [1:46:28](https://www.youtube.com/watch?v=undefined&t=6388s) So this is CS50. And this is week 1,
the one in which you learn a new language, which is something we technically said last week, at least if
you had never played with this graphical language known as Scratch before, which itself was a
programming language. But today, as promised, we transition to something a little more traditional, a
little more text-based, not puzzle piece- or block-based, known as C.

- [1:46:49](https://www.youtube.com/watch?v=undefined&t=6409s) This is an older language. It's been


around for decades. But it's a language that underlies so many of today's more modern languages,
among them something called Python that we'll also come to in a few weeks' time. Indeed, at the end of
the semester, the goal is for you to feel that you've not learned Scratch, you've not learned C, or even
Python, for that matter, but fundamentally that you've learned how to program.

- [1:47:07](https://www.youtube.com/watch?v=undefined&t=6427s) Unfortunately, when you learn how


to program with a more traditional language like this, there's just so much distraction. Last week I
described all of the syntax, all of the weird punctuation that you see in this, like the hash symbol, these
angled brackets, parentheses, curly braces, backslash n, and more.

- [1:47:23](https://www.youtube.com/watch?v=undefined&t=6443s) Well, today we're not going to


reveal what all of those little particulars mean. But by next week, this will no longer look like the
proverbial Greek to you, a language that, presumably, you've never actually seen or typed before. But to
do that, we'll explore some of the very same topics as last week.

- [1:47:40](https://www.youtube.com/watch?v=undefined&t=6460s) So recall that, via Scratch-- and


presumably via problem set 1-- we took a look at things called functions that are actions or verbs. And
related to functions were arguments like inputs. And related to some functions were returned values like
outputs. Then we talked a bit about conditionals, forks in the road, so to speak, Boolean expressions,
which are just yes/no questions or true/false questions, loops, which let you do things again and again,
variables, like in math, that let you store values temporarily,

- [1:48:06](https://www.youtube.com/watch?v=undefined&t=6486s) and then even other topics still. So


if you were comfortable on the heels of problem set 0 and last week, realize that all of these topics are
going to remain with us. So really, today is just about acquiring all the more of a mental model for how
you translate those ideas into, presumably, a very cryptic new syntax-- a new syntax, frankly, that's
actually more simple in some ways than your own human language, be it English or something else,
because there's far fewer vocabulary words.
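As a rough preview of how the ideas recapped above might map onto C syntax, here is a small sketch. The function `count_evens` is a made-up example, not something from the lecture; it exists only to put a function, an argument, a return value, a variable, a loop, a Boolean expression, and a conditional in one place.

```c
#include <stdbool.h>

// Function: takes an argument (input) and hands back a return value (output).
int count_evens(int n)
{
    int count = 0;                        // variable: stores a value temporarily

    for (int i = 1; i <= n; i++)          // loop: do something again and again
    {
        bool is_even = (i % 2 == 0);      // Boolean expression: a yes/no question

        if (is_even)                      // conditional: a fork in the road
        {
            count++;
        }
    }

    return count;                         // return value: the function's output
}
```

For example, `count_evens(10)` counts the even numbers from 1 through 10 and returns 5.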

- [1:48:31](https://www.youtube.com/watch?v=undefined&t=6511s) There's actually far less syntax than


you might have in, say, a typical human language. But with these computer languages you need to be all
the more precise so that you're, most importantly, correct, and ultimately, we'll see, so that your code is
successful along a few other lines as well. So if you think about the last time you kind of wandered
around not really knowing what you were doing or encountered something new-- might not have been
that long ago, entering Harvard Yard for the very first time, or Old Campus or the like, be it in Cambridge
or New Haven--

- [1:48:58](https://www.youtube.com/watch?v=undefined&t=6538s) you didn't really need to know how


to do everything as a first year. You didn't need to know who everyone was, where everything was, how
Harvard or Yale, or anything else for that matter, worked. You sort of got by day to day by just focusing
on those things that matter. And anything you didn't really understand, you sort of turned a blind eye to
until it's important.

- [1:49:15](https://www.youtube.com/watch?v=undefined&t=6555s) And that's, indeed, what we're


going to do today. And really, for the next several weeks, we'll focus on details that are initially important
and try to wave our hands, so to speak, at details that, yeah, eventually we'll get to, might be interesting.
But for now, they might be distractions.

- [1:49:27](https://www.youtube.com/watch?v=undefined&t=6567s) And by distractions, I really mean


some of that syntax to which I alluded earlier. So by the end of today-- and really, by the end of problem
set 1, your first foray, presumably, into this language called C-- you'll have written some code. And you'll
be asking yourself-- we'll be asking yourselves-- just how good is that code? Well, first and foremost, per
last week, be it in Scratch or phone book form, code ultimately needs to be correct, to be well done.

- [1:49:52](https://www.youtube.com/watch?v=undefined&t=6592s) You want the problem to be solved


correctly. So that one sort of goes without saying. And along the way this term, we'll provide you with
tools and techniques so you don't have to just sit there sort of endlessly trying an input, checking the
output, trying another input, checking the output. There's a lot of automation tools in the real world--
and in this class and others like it-- that will help facilitate you answering that question for yourself, is my
code correct, according to our specifications or the like.

- [1:50:16](https://www.youtube.com/watch?v=undefined&t=6616s) But then something that's going to


take more time and you're probably not going to feel 100% comfortable with the first week, the first
weeks, is just how well designed your code is. It's one thing to speak English or write English, but it's
another thing-- or any language, for that matter.

- [1:50:30](https://www.youtube.com/watch?v=undefined&t=6630s) But it's another thing to speak it or


write it well. And we spend all these years in middle school, high school, presumably, writing papers and
other documents, getting grades and feedback on them as to how well formulated your arguments were,
how well structured your paper was, and the like. And there's that same idea in programming.

- [1:50:45](https://www.youtube.com/watch?v=undefined&t=6645s) It doesn't matter necessarily that


you've just solved a problem correctly. If your code is a complete visual mess, or if it's crazy long, it's
going to be really hard for someone else to wrap their mind around what your code is doing and, indeed,
to be confident if it is correct. And honestly, you-- the next morning, the next year, the next time you
look at that code-- might have no idea what you yourself were even thinking.

- [1:51:09](https://www.youtube.com/watch?v=undefined&t=6669s) But you will if you focus, too, on


designing good code, getting your algorithms efficient, getting your code nice and clean, and even
making sure your code looks pretty, which we'd describe as a matter of style. So in the written human
world, having punctuation in the right place, capitalization and the like-- the sort of way you write an
essay but not necessarily send a text message-- relates to style, for instance.

- [1:51:31](https://www.youtube.com/watch?v=undefined&t=6691s) And so good style in code is going


to have a few of these characteristics that are pretty easily taught and remembered. But you just have to
start to get in the habit of writing code in a certain way. So these three axes, so to speak, correctness,
design, and style, are really the overarching goals when writing code that ultimately is going to look like
this.

- [1:51:50](https://www.youtube.com/watch?v=undefined&t=6710s) So this program we conjectured
last week does what if you run it on a Mac or PC or somewhere else, presumably? What does it do?
Yeah? AUDIENCE: [INAUDIBLE]. DAVID J. MALAN: It just prints, Hello, world. And honestly, that's kind of
atrocious that you need to hit your keyboard keys this many times with this cryptic syntax just to get a
program to say, Hello, world.

- [1:52:10](https://www.youtube.com/watch?v=undefined&t=6730s) So a spoiler-- in a few weeks' time


when we introduce other, more modern languages, like Python, you can distill this same logic into
literally one line of code. And so we're getting there, ultimately. But it's helpful to understand what it is
that's going on here, because even though this is a pretty cryptic syntax, there's nothing after this week
and, really, next week that you shouldn't be able to understand even about something that right now
looks a little something like this.

- [1:52:32](https://www.youtube.com/watch?v=undefined&t=6752s) So how do you write code? Well,


I've given us sort of the answer to a problem. How do you print, Hello, world, on the screen? So what do
I do with this code? Well, we're in the habit of typically writing things with, like, Microsoft Word or
Google documents. And yeah, I could open up Word or Google Docs or Pages or the like and just literally
transcribe that character for character, save it, and boom, I've got a program.

- [1:52:53](https://www.youtube.com/watch?v=undefined&t=6773s) But the problem, per last week, is


that computers only understand or speak what other language, so to speak? AUDIENCE: [INAUDIBLE].
DAVID J. MALAN: Yeah, so binary, zeros and ones. And so this, obviously, is not zeros and ones. So it
doesn't matter if I put it in a Word doc, Google Doc, Pages file, or the like.

- [1:53:08](https://www.youtube.com/watch?v=undefined&t=6788s) The computer is not going to


understand it until I somehow translate it to zeros and ones. And honestly, none of those tools that I
rattled off are really appropriate for programming. Why? Well, they come with features like boldfacing
and italics and sort of fluffy, aesthetic stuff that has no functional impact on what you're trying to do
with your code.

- [1:53:25](https://www.youtube.com/watch?v=undefined&t=6805s) And they don't have the ability, it


would seem, to convert that code ultimately to zeros and ones. But tools that do have this capability
might be called Integrated Development Environments, or IDEs, or, more simply, text editors. A text
editor is a tool that a programmer uses perhaps every day to write their code.

- [1:53:44](https://www.youtube.com/watch?v=undefined&t=6824s) And it's a simple program-- here,


for instance, a very popular one called Visual Studio Code, or VS Code. And at the top here, you see that
I've actually created in advance before class a very simple empty file called "hello.c." Why?

- [1:54:01](https://www.youtube.com/watch?v=undefined&t=6841s) Well, .c indicates by convention that this


is going to be a file in which there is C code. It's not .docx, which would mean in this file is a Microsoft
Word document, or .pages is a Pages file. This is .c, which means in this file is going to be text in the
language called C. This number 1 here is just an automatic line number that's going to help me keep track
of how long or short this program is.

- [1:54:18](https://www.youtube.com/watch?v=undefined&t=6858s) And the cursor is just blinking


there, waiting for me to start typing some code. Well, let me go ahead and type out exactly the same
code. For me, it comes pretty comfortably from memory. So I'm going to go ahead and include
something called stdio.h-- more on that later. I'm going to magically type int main(void), whatever
that means-- we'll come back to that later-- one of these curly braces and then a sibling there that closes
the same.

- [1:54:44](https://www.youtube.com/watch?v=undefined&t=6884s) Then I'm going to hit Tab to indent


a few spaces. And then I'm going to type not print, but printf, then "Hello, world," backslash n, close quote, close
parenthesis, semicolon. And I dare say this was essentially the very first program I wrote some 25 years
ago. I wrote it to say, "Hi, CS50."

- [1:55:03](https://www.youtube.com/watch?v=undefined&t=6903s) Now it just says the more


canonical, conventional, "Hello, world." But that's it. That's my very first program. And all I need to now
do is maybe hit Command-S or Control-S to save the file. And voila, I am a programmer. The catch
though, is, OK, how do I run this? Like, on your Mac or PC, how do you run a program? Well, usually
double-click an icon.

- [1:55:21](https://www.youtube.com/watch?v=undefined&t=6921s) On your phone, you tap an icon. In


this environment that we're using and that many programmers-- dare say most programmers-- use, you
don't have immediately a nice, pretty icon to double-click on. That's very user friendly, but it's not very
necessary. Especially when you get more comfortable with programming, you're going to want to type
commands because it's just faster than pointing and clicking a mouse.

- [1:55:42](https://www.youtube.com/watch?v=undefined&t=6942s) And you're going to want to


automate things, which is a lot easier if it's all command or text-based, as opposed to mouse and
muscular movements. And so here I have my program. It lives in this file called "hello.c." I need to now
convert it, though, to zeros and ones. Well, how do I go about doing this, and how am I going to get from
this so-called code-- or source code, as it's conventionally called-- to this, these zeros and ones that we'll
now start calling machine code.

- [1:56:11](https://www.youtube.com/watch?v=undefined&t=6971s) The zeros and ones from last week


can be used not only to represent numbers and letters, colors, audio, video, and more. They can also
represent instructions to a computer, like print, or play a sound, or delete a file, or save a file. All the sort
of basics of a computer somehow can be represented by other patterns of zeros and ones.

- [1:56:32](https://www.youtube.com/watch?v=undefined&t=6992s) And just like last week, it depends


on the context in which these numbers are stored. Sometimes they're interpreted as numbers, like in a
spreadsheet. Sometimes they're interpreted as colors. Sometimes they're interpreted as instructions,
commands to your computer to do very low-level operations, like print something on the screen.
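The point that context decides how a pattern of bits is interpreted can be sketched in C. In this illustrative example the same byte is formatted once as a number and once as a letter; the function name `describe_byte` is hypothetical, and the letter interpretation assumes ASCII, in which 72 is `H`.

```c
#include <stdio.h>

// Format one byte two ways into `out`: first as a decimal number (%d),
// then as a character (%c). The bits are identical; only the
// interpretation requested by the format code differs.
void describe_byte(unsigned char bits, char *out, size_t size)
{
    snprintf(out, size, "%d %c", bits, bits);
}
```

Passing 72 yields the text `72 H`: one value, read as a number in one context and as a letter in another.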

- [1:56:51](https://www.youtube.com/watch?v=undefined&t=7011s) So fortunately, last week's


definition of computer science as problem solving is a nice mental model for exactly the goal at hand. I
have some input, AKA source code. I want to output ultimately machine code, those zeros and ones. I
certainly don't want to do this kind of process by hand. So hopefully there's an algorithm implemented
by some special program that does exactly that.

- [1:57:13](https://www.youtube.com/watch?v=undefined&t=7033s) And those of you who do have


some prior experience, this program might be called a? A compiler. So a few of you have, indeed,
programmed before. Not all languages use compilers. C, in fact, is a language that does use a compiler.
And so I just need to find myself-- on my computer somewhere, presumably-- a so-called compiler, a
program whose purpose in life is to convert one language to another.

- [1:57:36](https://www.youtube.com/watch?v=undefined&t=7056s) And source code written textually


in C, like we saw a moment ago, is source code. The machine code is the corresponding zeros and ones.
So let me go back to the same programming environment called Visual Studio Code or VS Code. This is
typically a program you or any programmer on the internet can download onto their own Mac or PC and
be on their way with whatever computer you own writing some code.

- [1:58:00](https://www.youtube.com/watch?v=undefined&t=7080s) A downside, though, of that


approach is that all of us have slightly different versions of Macs or PCs. We have slightly different
versions of operating systems. They may or may not be up to date. It's just a technical support nightmare
to create a uniform environment, especially for an introductory class, where everyone should ideally be
on the same page so we can get you up and running quickly.

- [1:58:19](https://www.youtube.com/watch?v=undefined&t=7099s) And so I'm actually using a cloud-


based version of VS Code, something that you only need a browser to access. And then you can be on
any computer, today or tomorrow. By the end of the semester, we're going to get you out of the cloud,
so to speak, as best we can and get you onto your own Mac or PC, so that after this class, especially if it's
the only CS class you ever take, you feel like you can continue programming in any number of languages,
even with CS50 behind you.

- [1:58:45](https://www.youtube.com/watch?v=undefined&t=7125s) But for now, wonderfully, the


browser version of VS Code should pretty much be identical to what the eventual downloadable version
of the same would be. And you'll see in problem set 1 how to access this and how to get going yourself
with your first programs. But I haven't mentioned this bottom part of the screen.

- [1:59:03](https://www.youtube.com/watch?v=undefined&t=7143s) And this is an area where we have


what's called a terminal window. So this is sort of old-school technology that allows you, with a
keyboard, to interact with a computer, wherever it may be-- on your lap, in your pocket, or even, in this
case, in the cloud. So on the top portion of this screen is my text editor, like tabbed windows, like
in many programs, where I can just create files and write code.

- [1:59:27](https://www.youtube.com/watch?v=undefined&t=7167s) The bottom of the screen here, my


so-called terminal window, gives me the ability to run commands on a server that currently I have
exclusive access to. So because I logged into VS Code with my account online, I have my own sort of
virtual server, if you will, in the cloud-- otherwise known as, in this context, a container.

- [1:59:47](https://www.youtube.com/watch?v=undefined&t=7187s) This has its own operating system


for me, its own hard drive, if you will, where I can save and create files of my own, separate from yours
and vice versa. And it's at this very simple prompt, which is conventionally-- but not always-- abbreviated
by a dollar sign, has nothing to do with currency. It just means, type your commands here.

- [2:00:04](https://www.youtube.com/watch?v=undefined&t=7204s) This is where I'm going to be able


to type commands, like compile my source code into machine code. So it's a Command Line Interface, or
CLI, on top of an operating system that you might not have ever used or seen, but it's very popular, called
Linux. Odds are almost all of us in this room are using Mac OS or Windows right now, but we're all going
to start using an operating system called Linux, which is in a family of operating systems that offer not
only this command line interface, but are used not just for programming, but for serving

- [2:00:33](https://www.youtube.com/watch?v=undefined&t=7233s) websites and developing


applications and the like. And it's, indeed, a familiar and very powerful interface, as we'll see. So how do
I go about making this file, hello.c, into a program? There's no icon to double-click, but there is a
command. I can type, make hello, at this dollar sign prompt, go ahead and hit Enter, and nothing appears
to happen.

- [2:00:56](https://www.youtube.com/watch?v=undefined&t=7256s) But that's a good thing. And as


we'll see in programming, almost always, if you don't see anything go wrong, that means everything
went right. So this is going to be a rarity at first, but this is a good thing that it just seems to do nothing.
But now there is, in the folder in my account in the cloud, a file called "hello."

- [2:01:15](https://www.youtube.com/watch?v=undefined&t=7275s) And it's a bit of a weird


command, but you'll get familiar with it before long. The . just means go into my current folder, and /hello means
run the program called "hello" in this current folder. So ./hello, and then Enter, and voila, now I'm
actually not just programming, but running my actual code.

- [2:01:36](https://www.youtube.com/watch?v=undefined&t=7296s) So what have I just done? Let me


go ahead and do this. I'm going to go ahead and open up the sidebar of this program, and you'll see in
problem set 1 how to do this. And this might look a little different based on your own configuration. Even
the color scheme I'm using might ultimately look different from yours, because it supports a nice colorful
theme.

- [2:01:53](https://www.youtube.com/watch?v=undefined&t=7313s) So you can have different colors


and brightnesses depending on your mood or the time of day. What I've opened here, though, is what is
called, in VS Code, Explorer, and this is just all of the files in my cloud account. And there's not many right
now. There's only two. One is the file called hello.c,

- [2:02:09](https://www.youtube.com/watch?v=undefined&t=7329s) and it's highlighted because I've


got it open right there. And the other is a file called "hello," which is brand new and was created when I
ran that command. And what's now worth noting is that now things are getting a little more like Mac OS
and Windows. Like on the left-hand side, you have a GUI, a Graphical User Interface.

- [2:02:26](https://www.youtube.com/watch?v=undefined&t=7346s) But on the bottom here, again, you


have a CLI, Command Line Interface. These are just different ways to interact with computers, and you'll
get comfortable with both. And honestly, you're certainly familiar and comfortable with GUIs already, so
it's the command line one with which we'll spend some time.

- [2:02:41](https://www.youtube.com/watch?v=undefined&t=7361s) Now suppose that I just wanted to


do something more than compile this program. Suppose I wanted to go ahead and remove it. Like, uh-
uh, no, I made a mistake. I want to say, "Hello, CS50," not "Hello, world." I could just hover up here, like
in any software, and I could right-click, and I could poke around, and there, delete permanently.

- [2:02:58](https://www.youtube.com/watch?v=undefined&t=7378s) So most of us might have that


instinct on a Mac or PC. You right-click or Control-click, and you poke around. But in a command line
interface, let me do this instead. The command for removing or deleting a file in the world of Linux, this
other operating system, is just to type rm for remove, and then "hello," Enter.

- [2:03:16](https://www.youtube.com/watch?v=undefined&t=7396s) It's a somewhat cryptic


confirmation message, but this just means, are you sure? I'm going to go ahead and type Y for Yes. And
now when I hit Enter, watch what happens at top left in the Explorer, the GUI, the graphical interface.
Voila, it disappears. Not terribly exciting, but this just means this is a graphical version of what we're
seeing here.

- [2:03:36](https://www.youtube.com/watch?v=undefined&t=7416s) And in fact, if you want to never


use the GUI again-- I'll go ahead and close it with a keyboard shortcut here-- you can forever just type ls
for list and hit Enter. And you will see in the command line interface all of the files in your current folder.
So anything you can do with a mouse, you can do with this command line interface.

- [2:03:54](https://www.youtube.com/watch?v=undefined&t=7434s) And indeed, we'll see many more


things that you can do as well. But the inventors of this, this operating system and its predecessors, were
very succinct. Like, the command is rm for remove. The command is ls for list. It's very terse. Why?
Because it's just faster to type. So before we forge ahead with making something more interesting than
just "Hello, world," let me pause here to see if there's questions on source code or machine code or
compiler or this command line interface.

- [2:04:24](https://www.youtube.com/watch?v=undefined&t=7464s) Yeah? AUDIENCE: [INAUDIBLE].


DAVID J. MALAN: Really good question, and let me recap. If I were to make changes to the program, run
it, and then maybe make other changes and try to rerun it, would those changes be reflected? I've
reworded slightly. Well, let's do this. I already removed the old version.

- [2:04:40](https://www.youtube.com/watch?v=undefined&t=7480s) So let me go ahead and point out


that if I do ./hello now, I'm going to see some kind of error because I just deleted the file. No such file or
directory, so it's not terribly user friendly, but it's saying what the problem is. Let me go ahead and
remake it by typing make hello. Now if I type ls, I'll see not one but two files again, and one of them is
even green with a little asterisk to indicate that it's executable.

- [2:05:04](https://www.youtube.com/watch?v=undefined&t=7504s) It's sort of the textual version of


something you could double-click in our human world. So now, of course, if I run hello, we're back where
I started, "Hello, world." But now suppose I change it to "Hello, CS50," like I did years ago. Let me go
ahead and save the file with Command-S or Control-S.

- [2:05:21](https://www.youtube.com/watch?v=undefined&t=7521s) Down here now, let me run ./hello


again, and voila. Huh. So let me ask someone else to answer that question. What's the missing step?
Why did it not say, "Hello, CS50." Yeah? AUDIENCE: [INAUDIBLE]. DAVID J. MALAN: Yeah, so I didn't
compile it again. So sort of newbie mistake, you're going to make this mistake and many others before
long.

- [2:05:39](https://www.youtube.com/watch?v=undefined&t=7539s) But now let me go ahead and


remake hello, enter. It's going to seemingly make the same program. But this time when I run it, it's,
"Hello, CS50." Any other questions on some of these building blocks? And we'll come back to all the
crazy syntax I typed before long. But for now, we're focusing on just the output.
- [2:05:59](https://www.youtube.com/watch?v=undefined&t=7559s) Yeah? AUDIENCE: [INAUDIBLE].
DAVID J. MALAN: When I keep running make, it creates a new version of the machine code. So it keeps
changing the hello program and the hello file, and that's it. There's no make file, per se. AUDIENCE:
[INAUDIBLE]. DAVID J. MALAN: Good question, no. If I open up that directory, you'll see that there's just
the one.

- [2:06:18](https://www.youtube.com/watch?v=undefined&t=7578s) And it doesn't matter how many


times I run make hello-- three, four, five-- it just keeps overwriting the original. So it's kind of like just
saving in the world of Google Docs or Microsoft Word or the like. But there's an additional step today.
We have to then convert my words to the computer's, the zeros and ones.

- [2:06:36](https://www.youtube.com/watch?v=undefined&t=7596s) Yeah, in front. AUDIENCE:


[INAUDIBLE]. DAVID J. MALAN: Oh, what happens if I run hello.c? So let me go ahead and do ./hello.c,
which is a mistake you'll invariably make early on. Permission denied. So what does that mean? This is
where the error messages mean something to the people who designed the operating system, but it's a
little cryptic.

- [2:06:54](https://www.youtube.com/watch?v=undefined&t=7614s) It's not that you don't have access


to the file. It means that it's not executable. This is not something you have permission to run, but you
do have permission to read or write it-- that is, change it. AUDIENCE: [INAUDIBLE]. DAVID J. MALAN: Oh,
really good question. So if I have named my file hello.c, or more generally something.c, one of the
things that make does is it automatically picks the file name for me.

- [2:07:17](https://www.youtube.com/watch?v=undefined&t=7637s) And we'll discuss a bit-- we'll


discuss this a bit more next week. Make itself-- is kind of the first of white lies today-- itself is not a
compiler. It's a program that knows how to find and use the compiler on the system and automatically
create the program. If I use, as we'll discuss next week, the actual compiler myself, I have to type a much
longer sequence of commands to specify explicitly what do I want the name of my program to be.

- [2:07:43](https://www.youtube.com/watch?v=undefined&t=7663s) Make is a nice program, especially


in week 1, because it just automates all of that for us. And so here, we have now a program that very
simply prints something on the screen. So let's now put this into the context of where we left off last time,
in the context of Scratch and inputs and outputs. So we discussed last time, of course, functions and
arguments.

- [2:08:02](https://www.youtube.com/watch?v=undefined&t=7682s) Functions, again, are those actions


and verbs like say, or ask, or the like. And the arguments were the inputs to those functions, generally in
those little white ovals that, in Scratch, you could type words or numbers into. We'll see that all of the
languages we're going to see this term have that same capability.

- [2:08:18](https://www.youtube.com/watch?v=undefined&t=7698s) And let's just start to translate one


of these things to another. So for instance, let's put this same program in C, in the context of Scratch.
This is what Hello, World looked like last week in the form of one function. This week, of course, it looks
like print. And then the parentheses, notice, are kind of deliberately designed in the world of Scratch to
resemble that same shape.

- [2:08:39](https://www.youtube.com/watch?v=undefined&t=7719s) Even though this is a white oval,


you kind of get that it's kind of evoking that same idea with the parentheses. Technically the function in
C, it's not called say. It's not even called print. It's called printf. The F stands for formatted, but we'll see
what that means in a moment. But printf is the closest analogous function for say in the world of C.

- [2:09:01](https://www.youtube.com/watch?v=undefined&t=7741s) Notice if, though, you want to print


something like Hello, World or Hello CS50 in C, you don't just write the words as we did last week. You
also have to add what, if you notice already what's missing from this version? Yeah, so the double quotes
on the left and the right. So, that's necessary in C whenever you have a string of words.

- [2:09:21](https://www.youtube.com/watch?v=undefined&t=7761s) And I'm using that word


deliberately. Whenever you have multiple words like this, this is known as a string as we'll see. And you
have to put it in double quotes, not single quotes. You have to put it in double quotes. There's one other
stupid thing that we need to have in my C code in order to get this function to do something ultimately,
which is what? Semicolon.

- [2:09:42](https://www.youtube.com/watch?v=undefined&t=7782s) So just like in our human world,


you eventually got into the habit of using, at least in formal writing, periods. Semicolon is generally what
you use to finish your thought in the world of programming with C. All right, so we have that function in
place. Now, what does this really fit into in terms of the mental model? Well, functions take arguments.

- [2:10:01](https://www.youtube.com/watch?v=undefined&t=7801s) And it turns out functions can have


different types of outputs. And we've actually seen both already last week. One type of output from a
function can be something called a side effect. And it generally refers to something visual, like something
appearing on the screen or a sound playing from your computer.

- [2:10:18](https://www.youtube.com/watch?v=undefined&t=7818s) It's sort of a side effect of the


function doing its thing. And indeed, last week we saw this in the context of passing in something like
Hello, World as input to the say function. And we saw on the screen Hello, World, but it was kind of a
one off. It's one and done. You can't actually do anything with that visual output other than consume it,
visually, with your human eyes.

- [2:10:39](https://www.youtube.com/watch?v=undefined&t=7839s) But sometimes, recall last week,


we had functions like the ask block that actually returned me some value. Remember the ask, what's
your name. It handed me back whatever answer the human typed in. It didn't just arbitrarily display it on
the screen. The cat didn't necessarily say it on the screen.

- [2:10:55](https://www.youtube.com/watch?v=undefined&t=7855s) It was stored, instead, in that


special variable that was called answer. Because some functions have not side effects but return values.
They hand you back an output that you can use and reuse, unlike the side effect, which, again displays
and that's it. You can't sort of catch it and hold on to it.

- [2:11:14](https://www.youtube.com/watch?v=undefined&t=7874s) So, in the context of last week, we


had the ask block. And that had this special answer return value. In C, we're going to see in just a
moment, we could translate this as follows. The closest match I can propose for the ask block is a
function that we're going to start calling get_string. String is, again, a word, a set of words, like a phrase
or a sentence in programming.

- [2:11:37](https://www.youtube.com/watch?v=undefined&t=7897s) It, too, is a function insofar as it


takes input and pretty much-- this isn't always true-- but very often when you have a word in C followed
by an open parenthesis and a closed parenthesis, it's most likely the name of a function. And we're going
to see that there's some exceptions to that.

- [2:11:53](https://www.youtube.com/watch?v=undefined&t=7913s) But for now this indeed looks like a


function because it matches that pattern. If I want to ask the question, what's your name, question
mark-- and I'm even going to deliberately put a space there just to kind of move the cursor a little bit
over so that the human isn't typing literally after the question mark.

- [2:12:08](https://www.youtube.com/watch?v=undefined&t=7928s) So that's just the nitpicky aesthetic.


This is perhaps the closest analog to just asking that question. But because the ask block returns a value,
the analog here for get_string is that it, too, returns a value. It doesn't just print the human's input. It hands

- [2:12:28](https://www.youtube.com/watch?v=undefined&t=7948s) it back to you in the form of a


variable, a.k.a. return value, that I can then use and reuse. Now ideally it would be as simple as this
literally saying answer on the left equals. And this is where things start to diverge from math and sort of
our human world. This equal sign, henceforth, is not the equal sign. It is the assignment operator. To
assign a value means to store a value in some variable.

- [2:12:50](https://www.youtube.com/watch?v=undefined&t=7970s) And you read these things, weirdly,


right to left. So here is a function called get_string. I claim that it's going to return to you whatever the
human types in as their name. It's going to get stored over here on the left because of this so-called
assignment operator, that yes is an equal sign. But it doesn't mean equality in this context.

- [2:13:09](https://www.youtube.com/watch?v=undefined&t=7989s) It makes things equal. But it does


so by copying the value on the right into the thing on the left. Unfortunately, we're not quite done yet
with C. And this is where, again, it gets a little annoying at first where Scratch just let us express our ideas
without so much syntax. In C when you have a variable you don't just give it a name like you did in
Scratch.

- [2:13:29](https://www.youtube.com/watch?v=undefined&t=8009s) You also have to tell the computer


in advance what type of value it is storing. String is one such type of value. Int, for integer, is going to be
another. And there's even more than that we'll see today and beyond. And this is partly an answer to the
question that came up one or more times last week, which was how does a computer distinguish this
pattern of zeros and ones from this.

- [2:13:51](https://www.youtube.com/watch?v=undefined&t=8031s) Like is this a letter, a number, a


color, a piece of video. And I just claimed last week that it totally depends on the program. It depends on
the context. And that's true. But within those programs, it often depends on what the human
programmer said the type of the value is. This specifies that it's a string, which means interpret the
following zeros and ones that are stored in my program as words or letters, more generally.

- [2:14:17](https://www.youtube.com/watch?v=undefined&t=8057s) If it's an int for integer, it would be


implying, by the programmer, treat the following zeros and ones in my program as a number, an integer,
not a string. So here's where this week, unlike with Scratch, which kind of figures out what you mean,
with C and a lot of languages you have to be this pedantic and tell it what you mean.

- [2:14:36](https://www.youtube.com/watch?v=undefined&t=8076s) There's still one stupid thing


missing from my code here. What's still missing here? Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN:
And we still need the stupid semicolon. And I'm sort of impugning it here. Because honestly, these are
the kinds of stupid mistakes you're going to make today, tomorrow, this weekend, next week, a few
weeks from now, until you start to notice this and recognize it as well as you do English or whatever your
spoken language is.

- [2:14:58](https://www.youtube.com/watch?v=undefined&t=8098s) Yeah, question. Good question.


Suppose I mix apples and oranges, so to speak, and I try to put a string in an int or an int in a string, the
compiler is going to complain. So when I run that make command as I did earlier, it's not going to be nice
and blissfully quiet and just give me another prompt. It's going to yell at me with honestly a very cryptic
looking error message until we get the muscle memory for reading it.

- [2:15:20](https://www.youtube.com/watch?v=undefined&t=8120s) Other questions. Ah, what


happened to the backslash n. So, we'll come back to that in just a moment, if we may. Because I have
deliberately omitted it here but we did have it earlier. And we'll see the different behavior in a sec. Other
questions. Yeah, not at all nitpicky. These are the kinds of things that just matter.

- [2:15:40](https://www.youtube.com/watch?v=undefined&t=8140s) And it's going to take time to


recognize and develop this muscle memory. Everything I've typed here, except for the W at the moment,
is lowercase. And the W is capitalized just because it's English. Everything else is lowercase. And this kind
of varies by language and also context. So, in many languages the convention is to use all lowercase
letters for your variable names.

- [2:16:00](https://www.youtube.com/watch?v=undefined&t=8160s) Other languages might use some


capitals, as well. But we'll talk about that before long. But this is the kind of thing that matters and is
hard to see at first, especially when a little S doesn't look that different when it's on your tiny laptop
screen from a capital S. But you'll start to develop these instincts.

- [2:16:16](https://www.youtube.com/watch?v=undefined&t=8176s) All right, so besides this particular


block, let's go ahead and consider how we can go about implementing this now in code. So let me switch
back to VS Code here. This was the program I had earlier. And let me go ahead and undo my CS50
change. And this time just rerun it. Rerun Make on Hello with the original version with the backslash n.

- [2:16:35](https://www.youtube.com/watch?v=undefined&t=8195s) Enter, nothing bad seems to have


happened. So ./hello, Enter. Hello, world. Now, if you're curious-- and this is a good instinct to start to
acquire-- what happens if I get rid of this? Well, I'm probably not going to break things too badly. So let's
try. Let me go ahead now and do make hello. Still compiles.

- [2:16:52](https://www.youtube.com/watch?v=undefined&t=8212s) So it's not a really bad mistake. So


let me go ahead and run dot slash Hello. What's the difference here? Yeah, what do you see that's
different? Yeah, the dollar sign, my so-called prompt, stayed on the same line. Why? Well, we can
presumably infer now that the backslash n is some fancy notation for saying create a new line, move the
cursor, so to speak, to the next line.

- [2:17:15](https://www.youtube.com/watch?v=undefined&t=8235s) Notice that the cursor will move to


the next line in my terminal window. If I keep hitting it, it just automatically, by nature of hitting enter,
does it. But it'd be kind of stupid if when you run a program in this world, simple as it is, if the next
command is now weirdly spaced in the middle of the terminal with the dollar sign, it just looks sloppy.
- [2:17:33](https://www.youtube.com/watch?v=undefined&t=8253s) It's really just an aesthetic
argument. And notice that it's not acceptable or correct to do this, to hit enter there. Let me go ahead
and save that, though, and see what happens. Let me go ahead now and run Make Hello enter. Oh my
god, like four errors. This is like, what, 10 lines of errors for a one line program.

- [2:17:53](https://www.youtube.com/watch?v=undefined&t=8273s) And this is where, again, you'll start


to develop the instincts for just reading this stuff. These kinds of tools, like the compiler tool we're using,
were not designed necessarily with user friendliness in mind. That's changed over the decades, but
certainly early on it's really just meant to be correct and precise with its errors.

- [2:18:10](https://www.youtube.com/watch?v=undefined&t=8290s) So what did I do here? Missing


terminating close quote character. Long story short, when you have a string in C, your double quotes just
have to be on the same line just because. Now, that's a slight white lie. There's ways around this. But
the best way around it is to use this so-called escape sequence.

- [2:18:30](https://www.youtube.com/watch?v=undefined&t=8310s) To escape something means


generally to put a backslash, and then a special symbol like n for new line. And this is just the agreed
upon way that humans, decades ago, decided, OK you don't just hit your enter key. You instead put
backslash n and that tells the computer to move the cursor to the new line.

- [2:18:49](https://www.youtube.com/watch?v=undefined&t=8329s) So again, kind of cryptic. But once


you know it, that's it. It's just another word in our vocabulary. So now let me transition to making my
program a little more interactive. Instead of just saying Hello, world, let me change it like last week to say
Hello, David, or whoever is interacting with the program.

- [2:19:04](https://www.youtube.com/watch?v=undefined&t=8344s) So I'm going to do string answer


gets, get string, quote unquote, what's your name. I'm not going to bother with a new line here. I could.
This is now just a judgment call. I deliberately want the human to type their name on the same line just
because. And how do I now print this? Well last week recall we used say.

- [2:19:23](https://www.youtube.com/watch?v=undefined&t=8363s) And then we use the other block


called join. So the idea here is the same. But the syntax this week is going to be a little different. It's
going to be printf, which prints something on the screen. I'm going to go ahead and say Hello comma.
And let me just go with this initially with the backslash n, semicolon.

- [2:19:43](https://www.youtube.com/watch?v=undefined&t=8383s) Let me go ahead and recompile my


code. Whoops, damn doesn't work still. And look at all these errors. There's more errors than code I
wrote. But what's going on here? Well, this is actually something, a mistake you'll see, somewhat often,
at least initially. And let's start to glean what's going on here.

- [2:20:03](https://www.youtube.com/watch?v=undefined&t=8403s) So here, if I look at the very first


line of output after the dollar sign-- so even though it jumped down the screen pretty fast, I wrote make
hello at the dollar sign prompt. And then here's the first error. In hello.c, line 5-- technically
character 5, but generally line is enough to get you going-- there's an error, use of undeclared identifier
string.

- [2:20:25](https://www.youtube.com/watch?v=undefined&t=8425s) Did you mean standard in? So, I


didn't. And this is not an obvious solution at first. But you'll start to recognize these patterns in error
messages. It turns out that if I want to use string, I actually have to do this. I have to include another
library up here, another line of code, rather, called cs50.h.

- [2:20:46](https://www.youtube.com/watch?v=undefined&t=8446s) We'll come back to what this


means in just a moment. But if I now retroactively say, all right, what does standard I/O do for us up
here. Before I added that new line, what is standard I/O doing? Well, if you think back to Scratch, there
were a few examples with the camera and with the speech to-- the text to voice. Remember I had to
poke around in the extensions button.

- [2:21:10](https://www.youtube.com/watch?v=undefined&t=8470s) And then I had to load it into


Scratch. It didn't come natively with Scratch. C is quite like that. Some functions come with the language.
But for the most part, if you want to use a function, an action or a verb like printf, you have to load that
extension, so to speak, that more traditionally is called a library.

- [2:21:29](https://www.youtube.com/watch?v=undefined&t=8489s) So there is a standard I/O library,


STD I/O, standard I/O, where I/O just means input and output. Which means, just like in MIT's World,
there was an extension for doing text to voice or for using your camera. In C, there's an extension, a.k.a.
a library, for doing standard input and output. And so if you want to use any functions related to
standard input and output, like text from a keyboard, you have to include stdio.h. And then
you can use printf.

- [2:22:02](https://www.youtube.com/watch?v=undefined&t=8522s) Same goes here. Get string, it turns


out, is a function that CS50 wrote some time ago. And as we'll see over the coming weeks, it just makes
it way easier to get input from a user. C is very good with printf at printing output on the screen. C makes
it really annoying and hard, as we'll see in a few weeks, to just get input from the user.

- [2:22:23](https://www.youtube.com/watch?v=undefined&t=8543s) So we wrote a function called


get_string, but the only way you can use that is to load the extension, a.k.a. load the library called CS50.
And we'll come back in time, like, why is it .h, why is it a hash symbol. But for now, standard I/O is a
library that gives you access to printf and input- and output-related stuff.

- [2:22:43](https://www.youtube.com/watch?v=undefined&t=8563s) CS50 is a second library that


provides you with access to functions that don't come with C that include something like get_string. So
with that said, we've now kind of teased apart at a high level what lines 2 and now 1 are doing. Let me
go ahead and rerun make hello. Now it worked. So all those crazy error messages were resolved by just
one fix, so key takeaway is not to get overwhelmed by the sheer number of errors.

- [2:23:09](https://www.youtube.com/watch?v=undefined&t=8589s) Let me now do ./hello and if I type


in my name, what am I going to say? What do you think? Yeah, hello answer, because the computer is
going to take me literally. And it turns out that if you just write "hello, answer" all in the double quotes,
you're really just passing English as the input to the printf function, you're not actually passing in the
variable.

- [2:23:34](https://www.youtube.com/watch?v=undefined&t=8614s) And unfortunately in C, it's not


quite as easy to plug things in to other things that you've typed. Remember in Scratch, there was not just
the Say block but the Join block, which was kind of pretty: you could combine apples and oranges-- or was
it apple and banana? Then we changed it to hello and then the answer that the human typed in.
- [2:23:53](https://www.youtube.com/watch?v=undefined&t=8633s) In C, the syntax is going to be a
little different. You tell the computer inside of your double quotes that you want to have a placeholder
there, a so-called format code. %s means, hey, computer, put a string here eventually. Then outside of
your quotes, you just add a comma and then you type in whatever variable you want the computer to
plug in at that %s location for you.

- [2:24:19](https://www.youtube.com/watch?v=undefined&t=8659s) So %s is a format code which


serves as a placeholder. And now the printf function was designed by humans years ago to figure out
how to do the apple and banana thing of joining two words together. It's not nearly as user-friendly as it
is in Scratch, but it's a very common paradigm. So let me try and rerun this now. make hello.

- [2:24:39](https://www.youtube.com/watch?v=undefined&t=8679s) No errors, that's good. ./hello.


What's my name? David. If I hit Enter now, now it's hello, David. And here's the F in printf. It
formats its input for you by using these placeholders for things like strings, represented again by %s. So a
quick question then, if I focus here on line 7 for just a moment and even zoom in here, how many inputs
is printf taking as a function? A moment ago, I'll admit that it was taking one input, "hello, world," quote
unquote.

- [2:25:15](https://www.youtube.com/watch?v=undefined&t=8715s) How many inputs might you infer


printf is taking now? 2. And it's implied by this comma here, which is separating the first one, quote,
unquote, "hello, %s" from the second one, answer. And then just as a quick safety check here, why is it
not 3? Because there's obviously two commas here.

- [2:25:35](https://www.youtube.com/watch?v=undefined&t=8735s) Why is it not actually 3 arguments


or inputs? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Exactly. The comma to the left is actually part of my
English grammar, that's all, so same syntax. And, again, here's where programming can just be confusing
early on because we're using the same special punctuation to mean different things, it just depends on
the context.

- [2:25:56](https://www.youtube.com/watch?v=undefined&t=8756s) And so now is actually a good time


to point out all of the somewhat pretty colors that have been popping up on the screen here-- even
though I wasn't going to a format menu, I wasn't boldfacing things, I certainly wasn't changing things to
red or blue or whatnot-- that's because a text editor like VS Code syntax highlights for you.

- [2:26:14](https://www.youtube.com/watch?v=undefined&t=8774s) This is a feature of so many


different programming environments nowadays, VS Code does it as well. If your text editor understands
the language that you're programming in-- C, in this case-- it highlights in different colors the different
types of ideas in your code. So, for instance, string and answer here are in black, but get_string a
function is in this sort of nasty brown-yellow here right now, but that's just how it displays on the screen.

- [2:26:38](https://www.youtube.com/watch?v=undefined&t=8798s) The string, though, here in red is


kind of jumping out at me, and that's marginally useful. The %s is in blue. That's kind of nice, because it's
jumping out at me. And so it's just using different colors to make different things on the screen pop so
you can focus on how these ideas interrelate and, honestly, when you might make a mistake.

- [2:26:55](https://www.youtube.com/watch?v=undefined&t=8815s) For instance, let me accidentally


leave off this quote here. And now all of a sudden, notice if I delete the quote, the colors start to get a
little awry. But if I go back there and put it back, now everything's back in place. What's another feature
of this text editor? Notice when my cursor is next to this parenthesis, which demarcates the end of the
inputs to the function, notice that highlighted in green here is the opening parenthesis.

- [2:27:22](https://www.youtube.com/watch?v=undefined&t=8842s) Why? It's just a visually useful


thing, especially when you start writing more and more code, just to make sure your parentheses are
lining up. And that's true for these curly braces over here on the left and the right. We'll come back to
those in a moment. If I put my cursor there, you can see that these things correspond to one another.

- [2:27:38](https://www.youtube.com/watch?v=undefined&t=8858s) So it's nothing in your code


fundamentally, it's just the editor trying to help you, the human, program. And you can even see it,
though it's a little subtle-- see these four dots here and these four dots here? That's my indentation. I
configured VS Code to indent by four spaces, which is a very common convention.

- [2:27:55](https://www.youtube.com/watch?v=undefined&t=8875s) Any time I hit the Tab key, this too


can help you make sure-- once we have more interesting and longer programs-- that everything lines up
nice and neatly. Phew. All right, any questions then on printf or more? Yeah. AUDIENCE: [? Would ?] the
printf [INAUDIBLE]? DAVID J. MALAN: Short answer, yes. printf can handle more than one type of
variable or value.

- [2:28:16](https://www.youtube.com/watch?v=undefined&t=8896s) %s is one. We're going to see %i is


another for plugging in an integer. You can have multiple i's, multiple s's, and even other symbols too.
We'll come back to that in just a little bit. printf can take many more arguments than just these two. This
is just meant to be representative. Yeah, over here.

- [2:28:34](https://www.youtube.com/watch?v=undefined&t=8914s) Can you declare variables within


the printf? No. The only variable I'm using right now is answer, and it's got to be done outside the
context of printf in this case. Good question, we'll see more of that before long. Yeah, in back.
AUDIENCE: [INAUDIBLE]

- [2:28:52](https://www.youtube.com/watch?v=undefined&t=8932s) DAVID J. MALAN: How do we download the


CS50 library? So we will show you in problem set 1 exactly how to do that. It's automatically done for
you in our version of VS Code in the cloud. If, ultimately, you program on your own Mac or PC, either
initially or later on, it's also installable online. But if you want to ask that via online or afterward, we can
point you in the right direction.

- [2:29:09](https://www.youtube.com/watch?v=undefined&t=8949s) But PSet 1 will itself. Yeah.


AUDIENCE: [INAUDIBLE] DAVID J. MALAN: String is the type of the variable or, more properly, the data
type of the variable. int is another keyword I alluded to earlier, I haven't used it yet. int, for integer, is
going to be another type, or data type, of variable. AUDIENCE: OK. [? Thank you. ?]

- [2:29:27](https://www.youtube.com/watch?v=undefined&t=8967s) DAVID J. MALAN: Yeah.


AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Oh, good question. Could I go ahead and just plug in this
function, kind of like we did in Scratch, getting rid of the variable altogether and just do this, which recall,
is reminiscent of what I did in Scratch by plopping block on top of block on block? Am I answering that
right? Can I put string in front of get_string? No.

- [2:29:51](https://www.youtube.com/watch?v=undefined&t=8991s) You only put the word string in


front of a variable that you want to make a string. And even though I'm apparently answering the wrong
question, let me go ahead and zoom out, save this, do make hello again. Seems to compile OK. If I run
./hello, type in David, voila. That, too, works. And so, actually, let's go down this rabbit hole for just a
moment.

- [2:30:10](https://www.youtube.com/watch?v=undefined&t=9010s) Clearly, it's still correct-- at least,


based on my limited testing. Is this better designed or worse designed? Let's open that question like we
did last week. Yeah? Yeah, I kind of agree with that. Reasonable people could disagree, but I do agree
that this seems harder to read because I start reading here, but wait a minute.

- [2:30:32](https://www.youtube.com/watch?v=undefined&t=9032s) get_string is going to get used first,


and then it's going to give me back a value. So, yeah, it just feels like it was nicer to read top to bottom, I
would say. Your thoughts? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah. And so this is useful if I only
want to print out the person's name once. If I want to use it later in a longer program, I'm out of luck,
and so I haven't saved it in a variable.

- [2:30:49](https://www.youtube.com/watch?v=undefined&t=9049s) So I think, long story short, we


could debate this all day long. But in this case, eh, if you can make a reasonable argument one way or
the other, that's a pretty solid ground to stand on. But, invariably, reasonable people are going to
disagree, whether first-time programmers or many years after that.

- [2:31:06](https://www.youtube.com/watch?v=undefined&t=9066s) So let's frame this one last example


in the context of the same process of taking inputs and outputs. The functions we've been talking about
all take inputs, otherwise now known as arguments, or parameters, pretty much synonymous. That's just
the fancy word for an input to a function. And some functions have either side effects, like we saw--
printing something, saying something on the screen, sort of visually or audibly-- or they return a value,
which is a reusable value, like name or answer,

- [2:31:34](https://www.youtube.com/watch?v=undefined&t=9094s) in this case. If we look then at what


we did last time in the world of Scratch last week, the input was what's your name, the function was ask,
and the return value was answer. And now let's take a look at this block, which is honestly a more
user-friendly version of what we just did with the %s. Last week we said say, then join, then hello and
answer.

- [2:31:55](https://www.youtube.com/watch?v=undefined&t=9115s) But the interesting takeaway there


was not how to say hello anything. It was the fact that in Scratch, too, the output of one function, like the
green join, could become the input to another function, the purple say. The syntax in C is admittedly
pretty different, but the idea is essentially the same. Here, though, we have hello, a placeholder, but we
have to, in this world of C, tell printf what we want to plug in for that placeholder.

- [2:32:25](https://www.youtube.com/watch?v=undefined&t=9145s) It's just different. But that's the


way to do it. When we get to Python and other languages later in the term, there's actually easier ways
to do this. But this is a very common paradigm, particularly when you want to format your data in some
way. All right, let's then take a step back to where we began, which was with that whole program, which
had the include and it had int main(void) and all of this other cryptic syntax.

- [2:32:48](https://www.youtube.com/watch?v=undefined&t=9168s) This Scratch piece last week was


kind of like the go-to whenever you want to have a main part of your program. It's not the only way to
start a Scratch program. You could listen for clicks or other things, not just the green flag. But this was
probably the most popular place to start a program in Scratch.

- [2:33:04](https://www.youtube.com/watch?v=undefined&t=9184s) In C, the closest analog is to


literally write this out. So just like last week, if you were in the habit of dragging and dropping when
green flag clicked, as a C programmer, the first thing you would do is after creating an empty file, like I
did with hello.c, you'd probably type int main(void) open curly brace, closed curly brace, and then you
can put all of your code inside of those curly braces.

- [2:33:26](https://www.youtube.com/watch?v=undefined&t=9206s) So just like Scratch had this sort of


magnetic nature to it where the puzzle pieces would snap together, C, as a text-based language, tends to
use these curly braces, one of them opened, the other one closed. And anything inside of those braces,
so to speak, is part of this puzzle piece, a.k.a. main. So what was atop them? We went down this rabbit
hole a moment ago with these things called header files, even though I didn't call them by this name.

- [2:33:53](https://www.youtube.com/watch?v=undefined&t=9233s) But, indeed, when we have a


whole program in Scratch, super easy. Just have the one green flag clicked and then say hello, world.
There's no special syntax. After all, it's meant to be very user-friendly and graphical. In C, though, you
technically can't just put int main(void) printf hello, world.

- [2:34:10](https://www.youtube.com/watch?v=undefined&t=9250s) You also need this. Because, again,


you need to tell the compiler to load the library-- code that someone else wrote-- so that the compiler
knows what printf even is. You have to load the CS50 library whenever you want to use get_string or
other functions, like get_int, as we'll soon see. Otherwise, the compiler won't know what get_string is.

- [2:34:33](https://www.youtube.com/watch?v=undefined&t=9273s) You just have to do it this way. The


specific file name I'm mentioning here, stdio.h, cs50.h, is what C programmers call a header file.
We'll see eventually what's inside of those files. But long story short, it's like a menu of all of the
available functions. So in cs50.h,

- [2:34:55](https://www.youtube.com/watch?v=undefined&t=9295s) there's a menu mentioning


get_string, get_int, and a bunch of other stuff. And in stdio.h, there's a menu of functions, among which
is printf. And that menu is what prepares the compiler to know how to implement those same
functions. All right, let me pause here. Question. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Not quite. A
library provides all of the functionality we're talking about.

- [2:35:21](https://www.youtube.com/watch?v=undefined&t=9321s) A header file is the very specific


mechanism via which you include it. And we'll discuss this more next week. For now, they're essentially
the same, but we'll discuss nuances between the two next week. Yeah, the library would be standard
I/O. The library would be CS50. The corresponding header file is stdio.h, cs50.h.

- [2:35:42](https://www.youtube.com/watch?v=undefined&t=9342s) Indeed. Other questions. Yeah.


AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Indeed. That, too, is on the menu. We'll come back to that.
But the word string-- incredibly common in the world of programming, it's not a CS50 idea-- but in C,
there's technically no such data type as string by default. We have sort of conjured it up to simplify the
first few weeks.
- [2:36:02](https://www.youtube.com/watch?v=undefined&t=9362s) That's a training wheel that we'll
very deliberately, in a few weeks, take away, and we'll see why we've been using get_string and string.
Because C otherwise makes things quite more challenging early on, which then gets beside the point for
us. Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yes.

- [2:36:21](https://www.youtube.com/watch?v=undefined&t=9381s) Early on, you will have to use


whatever is prescribed by the specification. That will include CS50's functions. Long story short, you
referred, I think, a moment ago to another function called scanf, which we won't talk about for a few weeks.
Long story short, in C, it's pretty easy and possible to get input from a user.

- [2:36:37](https://www.youtube.com/watch?v=undefined&t=9397s) The catch is that it's really easy to


do it dangerously. And C is an older, lower-level language, so to speak, that gives you pretty
much ultimate control over your computer's hardware, so it's very easy to make mistakes. And, indeed,
that, too, is why we use the library, so your code won't crash unintentionally.

- [2:36:59](https://www.youtube.com/watch?v=undefined&t=9419s) All right, so with this in mind, we


have this now mapping between the Scratch version and the other. Let me just give you a quick tour of
some of the other placeholders and data types that students will start seeing as we assemble more
interesting programs. In the world of Linux, here is a non-exhaustive list of commands with which you'll
get familiar over the next few weeks by playing with problem sets.

- [2:37:18](https://www.youtube.com/watch?v=undefined&t=9438s) We've only seen two of these so


far, ls for list, rm for remove. But I mention them now just so that it doesn't feel too foreign when you see
them on screen or online in a problem set. cp is going to stand for copy. mkdir is going to stand for make
directory. mv is going to stand for move or rename.

- [2:37:38](https://www.youtube.com/watch?v=undefined&t=9458s) rmdir is going to be remove


directory, and cd is going to be for change directory. And let me show you this last one here first, only because it's
something you'll use so commonly. If I go back to my code here on the screen, I'm going to go ahead and
re-open the little GUI on the left-hand side, the so-called Explorer, revealing that I've got two files, hello
and hello.c,

- [2:38:01](https://www.youtube.com/watch?v=undefined&t=9481s) so nothing has changed since


there. Suppose now that it's a few weeks into class and I want to start organizing the code I'm writing so
that I have a folder for this week or next week, or maybe a folder for problem set 1, problem set 2. I can
do this in a few ways. In the GUI, I can go up here and do what most of you would do instinctively on a
Mac or PC.

- [2:38:20](https://www.youtube.com/watch?v=undefined&t=9500s) You look for like a folder icon, you


click it, and then you name a folder like PSet1, Enter. Voila, you've got a folder called PSet1. I can confirm
as much with my command line interface by typing what command? How can I list what's in my folder?
Yeah, so ls for list. And now I see hello-- and it's green with an asterisk because that's my executable, my
runnable program-- hello.c,

- [2:38:47](https://www.youtube.com/watch?v=undefined&t=9527s) which is my source code, and


now PSet1 with a slash at the end, which just implies that it's indeed a folder. All right, I didn't really
want to do it that way. I'd like to do it more advanced. So let me go ahead and right-click on PSet1,
delete permanently. I get a scary irreversible error message. But there's nothing in it, so that's fine.
- [2:39:01](https://www.youtube.com/watch?v=undefined&t=9541s) Now I've deleted it using the GUI.
But now let me go ahead and start doing the same thing from the command line. And if you're
wondering how things keep disappearing, if you hit Control-L in your terminal window or explicitly type
clear, it will delete everything you previously typed just to clean things up.

- [2:39:18](https://www.youtube.com/watch?v=undefined&t=9558s) In practice, you don't need to be


doing this often. I'm doing it just to keep our focus on my latest commands. If I do-- what was the
command to make a new directory? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, so mkdir, make
directory. Let me create PSet1, Enter. And notice at left, there's my PSet1.

- [2:39:35](https://www.youtube.com/watch?v=undefined&t=9575s) If I want to get a little overzealous,


plan for next week, here's my PSet2 directory. Suppose now I want to open those folders on a Mac or PC
or in this GUI, I could double-click on it like this, and you'd see this little arrow is moving. It's not doing
anything because there's nothing in there, but that's fine.

- [2:39:51](https://www.youtube.com/watch?v=undefined&t=9591s) But suppose again I want to get


more comfortable with my command line. Notice if I type ls now, I see all four same things. Let me
change directories with cd space PSet1 Enter. And now notice two things will have happened. One, my
prompt has changed slightly to remind me where I am, just to keep me sane so that I don't forget what
folder I'm actually in.

- [2:40:18](https://www.youtube.com/watch?v=undefined&t=9618s) So here is just a visual reminder of


what folder I'm currently in. If I type ls now, what should I see after hitting Enter? Nothing, because I've
only created empty folders so far. And, indeed, I see nothing. If I wanted to create a folder called Mario
for a program that might be called Mario this week, I can do that.

- [2:40:38](https://www.youtube.com/watch?v=undefined&t=9638s) Now if I type ls, there is Mario.


Now if I do cd Mario, notice my prompt's going to change to be a little more precise. Now I'm in
PSet1/Mario. And notice what's happening at top left. Nothing now, because these folders are collapsed.
But if I click the little triangle, there I see Mario. Nothing's going on in there because there's no files yet.

- [2:40:57](https://www.youtube.com/watch?v=undefined&t=9657s) But suppose now I want to create a


file called mario.c. I could go up here, I could click the little plus icon, and use the GUI. Or I can just type
code mario.c. Voila. That creates a new tab for me. I'm not going to write any code in here yet, but I am
going to save the file. And now at top left, you'll see that mario.c appears.

- [2:41:17](https://www.youtube.com/watch?v=undefined&t=9677s) So at some point, you can


eventually just close the Explorer. Because, again, it's not providing you with any new information. It's
maybe more user-friendly, but there's nothing you can't do at the command line that you could do with
the GUI. All right, but now I'm kind of stuck. How do I get out of this folder? In my Mac or PC world, I'd
probably click the Back button or something like that or just close it and start all over.

- [2:41:37](https://www.youtube.com/watch?v=undefined&t=9697s) In the terminal window, I can do cd


dot dot. Dot dot is a nickname, if you will, for the parent directory. That is, the previous directory. So if I
hit Enter now, notice I'm going to close the Mario folder, a.k.a. directory, and now I'm back in PSet1. Or, if
I want to be fancy, let me go back into Mario temporarily.
- [2:42:00](https://www.youtube.com/watch?v=undefined&t=9720s) If I type ls, there's mario.c, just to
orient us. If I want to do multiple things at a time, I could do cd ../.. which goes to my parent to my
grandparent all in one breath. And voila, now I'm back in my default folder, if you will. And one last little
trick of the trade, if I'm in PSet1/Mario like I was a moment ago, and you're just tired of all the
navigation, if you just type cd and hit Enter, it'll whisk you away back to your default folder, and you don't
have to worry about getting there manually.

- [2:42:31](https://www.youtube.com/watch?v=undefined&t=9751s) Recall a bit ago, though, that I was


running hello as this, ./hello. If dot dot refers to my parent, perhaps infer here syntactically, what does a
single dot mean instead? It means this directory, your current directory. Why is that necessary? It just
makes super explicit to the computer that I want the program called hello that's installed here, not in
some random other folder on my hard drive, so to speak.

- [2:43:00](https://www.youtube.com/watch?v=undefined&t=9780s) I want the one that's right here


instead. All right, so besides these commands, there's going to be others that we encounter over time.
Those are kind of the basics. That allows you to wean yourself off of a GUI, Graphical User Interface, and
start using more comfortably, with practice and time, a command line interface instead.

- [2:43:16](https://www.youtube.com/watch?v=undefined&t=9796s) Well, what about those other


types, now back in the world of C? Those commands were not C. Those are just commands specific to a
command line interface, like in Linux, which, again, we're using in the cloud. It's an alternative to Mac OS
and Windows. Back in the world of C now, we've seen strings, which are words.

- [2:43:35](https://www.youtube.com/watch?v=undefined&t=9815s) I mentioned int or integer, but


there's others as well. In the world of C, we've seen string, we will see int. If you want a bigger integer,
there's something literally called a long. If you want a single character, there's something called a char. If
you want a Boolean value, true or false, there is a bool.

- [2:43:54](https://www.youtube.com/watch?v=undefined&t=9834s) And if you want a floating-point


value-- a fancy way of saying a real number, something with a decimal point in it-- that is what C and
other languages call a float. And if you want even more numbers after the decimal point that is more
precision, you can use something called a double. That is to say, here is, again, an example in
programming where it's up to you now to provide the computer with hints, essentially, that it will rely on
to know what is this pattern of zeros and ones.

- [2:44:22](https://www.youtube.com/watch?v=undefined&t=9862s) Is it a number, a letter? Is it a


sound, an image, a color, or the like? These are the types of data types that provide exactly those hints.
What are the functions that come in the menu that is the CS50 library? We talked about standard I/O,
and that's just one function so far, printf. In the CS50 library, you can see that it follows a pattern.

- [2:44:43](https://www.youtube.com/watch?v=undefined&t=9883s) The CS50 library exists largely for


the first few weeks of the class to make our lives easier when you just want to get user input. So if you
want to get a string, like a word or words from the human, you use get_string. If you want to get an
integer from the user, you're going to use get_int. When you want to get any of those other data types,
for the most part, you use get_ something else.

- [2:45:04](https://www.youtube.com/watch?v=undefined&t=9904s) And they're indeed all lowercase by


convention. What about printf? If we have the ability now to store different types of data and we have
functions with which to get different types of data, how might you go about printing different types of
data? Well, we've seen %s for string, %i for integer, %c for char, %f for a float or a double, those real
numbers I described earlier, and then %li for a long integer.

- [2:45:34](https://www.youtube.com/watch?v=undefined&t=9934s) So here's the first example of


inconsistencies. In an ideal world, that would just be %l and we'd move on. It's %li instead in this case.
That's printf and some of its format codes. What more might we do? Well, in C, as we'll see-- no pun
intended-- there is a whole bunch of operators. And, indeed, computers, one of the first things they did
was a lot of math and calculations, so there's a lot of operators like these.

- [2:45:58](https://www.youtube.com/watch?v=undefined&t=9958s) Computers, and in turn, C, really


good at addition, subtraction, multiplication, division, and even the percent sign, which is the remainder
operator. There's a special symbol in C and other languages just for getting the remainder, when you
divide one number by another. There are other features in the world of C, like variables, as we've seen.

- [2:46:19](https://www.youtube.com/watch?v=undefined&t=9979s) And there's also what is often playfully


called syntactic sugar that makes it easier over time to write fewer characters but express your thoughts
the same. So just as a single example of this, consider this use of a variable last
week. Here in Scratch is how you might set a variable called counter to 0.

- [2:46:42](https://www.youtube.com/watch?v=undefined&t=10002s) In C, it's going to be similar. If you


want the variable to be called counter, you literally write the word counter, or whatever you want it to be
called. You then use the assignment operator, a.k.a. the equals sign, and you assign it whatever its initial
value should be here on the right. So, again, the 0 is going to get copied from right to left into the
variable because of that single equal sign.

- [2:47:03](https://www.youtube.com/watch?v=undefined&t=10023s) But this isn't sufficient in C. What


else is missing on the right-hand side, instinctively now? Even if you've never programmed in this before.
Yeah, in front. AUDIENCE: Semicolon. DAVID J. MALAN: A semicolon at the end. And one other thing, I
think, is probably missing. Again. AUDIENCE: A data type.

- [2:47:18](https://www.youtube.com/watch?v=undefined&t=10038s) DAVID J. MALAN: A data type. So


if we can keep going back and forth here, what data type seems appropriate intuitively for counter? int
for integer. So, indeed, we need to tell the computer when creating a variable what type of data we
want, and we need to finish our thought with the semicolon. So there might be a counterpart there.

- [2:47:38](https://www.youtube.com/watch?v=undefined&t=10058s) What about in Scratch if we


wanted to increment that counter variable? We had this very user-friendly puzzle piece last time that
was change counter by 1, or add 1 to counter. In C, here's where things get a little more interesting. And
pretty commonly done, you might do this. counter = counter + 1; with a semicolon.

- [2:47:59](https://www.youtube.com/watch?v=undefined&t=10079s) And this is where, again, it's


important to note, the equal sign, it's not equality. Otherwise, this makes no sense. counter cannot equal
counter plus 1, right? That just doesn't work if we're talking about integers here. That's because the
equal sign is assignment. So it can certainly be the case that you calculate counter plus 1, whatever that
is, then you update the value of counter from right to left to be that new value.
- [2:48:23](https://www.youtube.com/watch?v=undefined&t=10103s) This, as we'll see, is a very
common thing to do in programming just to kind of count upward, for whatever reason. You can write
this more succinctly. This code here is what we'll call syntactic sugar, sort of a fancy way of saying the
same thing with fewer words or fewer characters on the screen.

- [2:48:40](https://www.youtube.com/watch?v=undefined&t=10120s) This also adds 1, or whatever


number you type over here, to the variable on the left. And there's one other form of syntactic sugar
we're going to start seeing too, and it's even more terse than this. That too will increment counter by 1
by literally changing its value by 1. Or if you change it to minus minus, subtracting 1 from it.

- [2:49:00](https://www.youtube.com/watch?v=undefined&t=10140s) You can't do that with 2 and 3 and


4, but you can do it by default with just plus plus or minus minus adding or subtracting 1. Yeah.
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Ah, so when you are changing a variable that already has been
created, as we did with the code that looked like this, you no longer need to remind the computer what
the data type is.

- [2:49:24](https://www.youtube.com/watch?v=undefined&t=10164s) Thankfully, the computer is at


least as smart as that. It will remember the type of the data that you intended. Other questions or
comments on this? All right, that's quite a lot. Why don't we go ahead and here take a 10-minute break?
And we'll be back, we'll start writing some code. All right, so we are back.

- [2:49:45](https://www.youtube.com/watch?v=undefined&t=10185s) We've just looked at some of the


basics of compiling, even if it doesn't quite feel that basic. But now, let's actually start focusing really on
writing more and more code, more and more interesting code, kind of like we dove into Scratch last
week. So here I have VS Code open. I've closed the GUI.

- [2:50:02](https://www.youtube.com/watch?v=undefined&t=10202s) I'm going to focus more on my


terminal window and my code editors. Many different ways I can create new files, but I want to create
something called a calculator. So, again, within this environment of VS Code, I can literally write the code
command which is VS Code specific, and it just creates a new file for me automatically.

- [2:50:18](https://www.youtube.com/watch?v=undefined&t=10218s) Or I could do that in the GUI. I'm


going to go ahead and create this file called calculator.c and I'm going to go ahead and include some
familiar things. So I'm just going to go ahead and proactively include cs50.h, stdio.h. I'm going to go
ahead from memory and do the int main(void)-- more on that next week, why it's int, why it's void, and so
forth.

- [2:50:38](https://www.youtube.com/watch?v=undefined&t=10238s) And now let me just implement a


very simple calculator. We saw some mathematical operators, like plus and the like. So let's actually use
this. So let me go ahead and first give myself a variable called x, sort of like grade school math or algebra.
Let me go ahead then and get an int, which is new, but I mentioned this exists.

- [2:50:57](https://www.youtube.com/watch?v=undefined&t=10257s) And then let me just ask the user


for whatever their x value is. The thing in the quotes is just the English, or the string that I'm printing on
the screen, so I could say anything I want. I'm just going to say x colon to prompt the user accordingly.
Now I'm going to go ahead and get another variable called y.
- [2:51:12](https://www.youtube.com/watch?v=undefined&t=10272s) I'm going to get int again. And
now, I'm going to prompt the user for y. And I'm just very nitpickily using a space just to move the cursor
so it doesn't look too messy on the screen. And then lastly, let me go ahead and just print out the sum of
x and y. In an ideal world, I would just say something like printf x + y.

- [2:51:32](https://www.youtube.com/watch?v=undefined&t=10292s) But that is not valid in C. The first


argument, recall, in printf has to be a string in double quotes. So if I want to print out the value of an
integer, I need to put something in quotes here, maybe followed by a newline, if I want to move the
cursor as well. So, again, we only glimpsed it briefly, but what do I replace these question marks with if I
want a placeholder for an integer? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, so %i.

- [2:51:58](https://www.youtube.com/watch?v=undefined&t=10318s) Just like %s was string, %i is


integer. So I change this to %i. And now if I want to add x and y, for instance-- super simple calculator,
doesn't do much of anything other than addition of two integers-- I think this works. And, again, it looks
definitely cryptic at first glance. It would be nice if programming weren't this cryptic.

- [2:52:16](https://www.youtube.com/watch?v=undefined&t=10336s) Other languages will clean this up


for us. But, again, if you focus on the basics, printf takes one input first-- which is a format string with
English or whatever language, some placeholders, maybe-- then it takes potentially more arguments
after the comma, like the value of x plus y. All right, let me go ahead now and make calculator, which,
again, compiles my source code in C, pictured above, and converts it into corresponding machine code,
or zeros and ones.

- [2:52:46](https://www.youtube.com/watch?v=undefined&t=10366s) No error messages, so that's


already good. Now I do ./calculator. Let's do 1 plus 1 and Enter. Voila. Now I have the makings of a
calculator. Now let's start to tinker with this a little bit. What if I instead had done this? int z = x + y and
then plug in z here. If I rerun make calculator, Enter, rerun ./calculator,

- [2:53:15](https://www.youtube.com/watch?v=undefined&t=10395s) type in 1 plus 1, still


equals 2, and let me claim that it will work for other values as well-- which of these versions is better
designed? If both seem to be correct at very cursory glance, is this version better or is the previous one
without the z? OK, so this one is arguably better because I've now got a reusable variable called z that I
can not only print but, heck, if my program is longer, I can use it elsewhere.

- [2:53:39](https://www.youtube.com/watch?v=undefined&t=10419s) Counterthoughts? AUDIENCE:


[INAUDIBLE] DAVID J. MALAN: Yeah. Debatable, like before, because it depends on my intent. And,
honestly, I think a pretty good argument can be made for the first version. Because if I have no intention
of-- as you note-- using that variable again, you know what? Maybe I might as well do this, just because
it's one less thing to think about.

- [2:53:57](https://www.youtube.com/watch?v=undefined&t=10437s) It's one less distraction. It's one


less line of code to have to understand. It's just a little tighter. So here, again, it does depend on your
intention. But this feels pretty reasonable. And I think, as someone noted earlier, when I did the same
thing with get_string, that, yeah, maybe kind of crossed a line because get_string and the what's your
name inside of it, it was just so much longer.

- [2:54:17](https://www.youtube.com/watch?v=undefined&t=10457s) But x + y, eh, it's not that hard to


wrap our mind around what's going on inside of the printf argument. So, again, these are the kinds of
thoughts that hopefully you'll acquire the instinct for on not necessarily reaching the same answer as
someone else, but, again, the thought process is what matters here.

- [2:54:33](https://www.youtube.com/watch?v=undefined&t=10473s) All right, so how might I enhance


this program a little bit? Let's just talk about style for just a moment. So x and y, at least in this case, are
pretty reasonable variable names. Why? Because those are the go-to variable names in math when you're
adding two things together. So x and y seem pretty reasonable.

- [2:54:49](https://www.youtube.com/watch?v=undefined&t=10489s) I could have done something like,


well, maybe my first variable should be called first number and my next variable should be called second
number. And then down here, I would have to change this to first number plus second number. Like, eh,
this isn't really adding anything semantically to help my comprehension.

- [2:55:08](https://www.youtube.com/watch?v=undefined&t=10508s) But that would be one other


direction we could have taken things. So if you have very simple ideas that are conventionally expressed
with common variable names like x and y, totally fine here. What if I want to annotate this program and
remind myself what it is it does? Well, I can add in C what are called comments.

- [2:55:25](https://www.youtube.com/watch?v=undefined&t=10525s) With a slash slash, two forward


slashes, you can write a note to yourself, like prompt user for x. And then down here, I could do
something like prompt user for y, just to remind myself what I'm doing there. And down here, perform
addition. Now, in this case, I'm not sure these comments are really adding all that much.

- [2:55:44](https://www.youtube.com/watch?v=undefined&t=10544s) Because in the time it took me to


write and eventually read these comments, I could have just read the three lines of code. But as our
programs get more sophisticated and you start to learn more syntax-- that, honestly, you might forget
the next day, the next week, the next month-- it might be useful to have these notes to self that remind
you of what your code is doing or maybe even how it is doing that thing.

- [2:56:06](https://www.youtube.com/watch?v=undefined&t=10566s) With these early programs, not


really necessary, doesn't really add all that much to our comprehension, but it is a mechanism you have
in place that can help you actually remind yourself or remind someone else what it is that's going on.
Well, let me go ahead and rerun this again in this current version, make calculator.

- [2:56:24](https://www.youtube.com/watch?v=undefined&t=10584s) And here, too, you might think I'm


typing crazy fast-- not really. I'm hitting Tab a lot. So it turns out that Linux, the operating system we're
using here in the cloud-- but, actually, Windows and Mac OS nowadays support this too-- supports
autocomplete. So if you only have one program that starts with C-A-L, you don't have to finish writing
calculator, you can just hit Tab, and the computer will finish your thought for you.

- [2:56:48](https://www.youtube.com/watch?v=undefined&t=10608s) The other thing you can do is if


you hit Up and keep going up, you'll scroll through your entire history of commands. So there too, I've
been saving some keystrokes by hitting Up quickly rather than retyping the same darn thing again and
again. So, again, just another little convenience to make programming and interacting with the command
line interface even faster.

- [2:57:06](https://www.youtube.com/watch?v=undefined&t=10626s) All right, let me go ahead and just


make sure it's compiled in the current form. The comments have no functional impact. These green
things are just notes to self. Let me run calculator with maybe-- how about this? Instead of 1 plus 1, how
about 1 billion-- whoops, let's do that again. Wa, da, da. 1 million, 1 billion, and another 1 billion, and
that answer is 2 billion.

- [2:57:30](https://www.youtube.com/watch?v=undefined&t=10650s) All right, so that seems correct.


Let's run this program one more time. How about 2 billion plus another 2 billion? Did you know that? So,
apparently, it's not so correct. And, clearly, running 1 plus 1 was not the most robust testing of my code
here. What might have gone wrong? What might have gone wrong? Yeah.

- [2:57:55](https://www.youtube.com/watch?v=undefined&t=10675s) AUDIENCE: [INAUDIBLE] DAVID J.


MALAN: Yeah. The computer probably ran out of space with bits. So it turns out with these data types--
we've been talking about string and int and also float and char and those other things-- they all use a
specific, and, most importantly, finite number of bits to represent them.

- [2:58:13](https://www.youtube.com/watch?v=undefined&t=10693s) It can vary by computer. Newer


computers use more bits, older computers tended to use fewer bits. It's not necessarily standardized for
all of these data types. But in this case, in this environment, it is using 32 bits for an integer. That's a lot.
So with 32 bits, you can count pretty high. This is 64 light bulbs on the stage and could count even higher.

- [2:58:34](https://www.youtube.com/watch?v=undefined&t=10714s) An int is only using half of these,


or we have two integers here on the stage. Now, if you think back to last week, we talked about 8 bits at
one point. And if you have 8 bits, 8 zeros and ones, you can count as high as 256-- just a good number to
generally remember as trivia. 8 bits gives you 256 permutations of zeros and ones.

- [2:58:54](https://www.youtube.com/watch?v=undefined&t=10734s) 32 gives you roughly how many, if


anyone knows? It's 2 to the 32 power. So it's roughly 4 billion, 2 to the 32. If you don't know that, it's
fine. Most programmers, though, eventually remember these kinds of heuristics. So it's roughly 4 billion.
So that feels like enough. 2 billion plus 2 billion is exactly 4 billion.

- [2:59:14](https://www.youtube.com/watch?v=undefined&t=10754s) And that actually should fit in a


32-bit integer. The catch is that my Mac, your PC, and the like also like to support negative numbers. And
if you want to support both positive and negative numbers, that technically means with 32-bit integers,
you can count as high as roughly 2 billion positive or 2 billion negative in the other direction.

- [2:59:35](https://www.youtube.com/watch?v=undefined&t=10775s) That's still 4 billion, give or take,


but it's only half as many in one direction or the other. So how could I go about implementing a correct
calculator here? What might the solution be? Yeah, so not just li, which was for long integer. I have to
make one more change, which is to the data type itself.

- [2:59:55](https://www.youtube.com/watch?v=undefined&t=10795s) So let me go back up here and


change x from an int to a long, a.k.a. long integer. And then let me change y as well. And then let me
change the format code per the little cheat sheet we had up a few minutes ago to li. Let me recompile
the calculator-- seems to work OK. Let's rerun it. Now let's do 1 plus 1.

- [3:00:14](https://www.youtube.com/watch?v=undefined&t=10814s) That should obviously be the


same. Now let's do 2 billion and another 2 billion and cross our fingers this time. Now we're counting as
high as 4 billion. And we can go way higher than 4 billion, but we're only kicking the can down the street
a bit. Even though we're now using-- with a long-- 64 bits, which is as long as this stage now, that's still a
finite value.

- [3:00:38](https://www.youtube.com/watch?v=undefined&t=10838s) It might be a really big value, but


it's still finite. And we'll come back at the end of today to these kinds of fundamental limitations.
Because arguably now, my calculator is correct for like millions, billions of possible inputs but not all. And
that's problematic if you actually want to use my calculator for any possible inputs, not just ones that are
roughly less than, say, 2 billion, as in this case.

- [3:01:03](https://www.youtube.com/watch?v=undefined&t=10863s) All right, any questions then on


that? But it's really just a precursor for all the problems that we're going to have to eventually deal with
later on. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: A good question. Yes. If we were still using z, we
would also have to change it to a long. Otherwise, we'd be ignoring 32 of the bits that had been added
together via the longs.

- [3:01:26](https://www.youtube.com/watch?v=undefined&t=10886s) Good question. All right, so how


about we spice things up with maybe not just addition here, how about something with some
conditions? Let's start to ask some actual questions. So a moment ago, recall that we had just the
declaration of variables. Now let's look back at something in Scratch that looked a little something like
this, a bunch of puzzle pieces asking questions by way of these conditionals and then these Boolean
expressions here in green, maybe saying something like x is less than y.

- [3:01:55](https://www.youtube.com/watch?v=undefined&t=10915s) In C, this actually maps pretty


cleanly. It's much cleaner from left to right than it was with printf and join. Here, we have just code that
looks like this. If, a space, two parentheses and then x less than y, and then we have something like printf
there in the middle. So here, it's actually kind of a nice mapping.

- [3:02:14](https://www.youtube.com/watch?v=undefined&t=10934s) Notice that, just as the yellow


puzzle piece in Scratch is kind of hugging the purple puzzle piece, that's effectively the role that these
curly braces are playing. They're sort of encapsulating all of the code on the inside. The parentheses
represent the Boolean expression that needs to be asked and answered to decide whether or not to do
this thing.

- [3:02:32](https://www.youtube.com/watch?v=undefined&t=10952s) And here's an exception to what I


alluded to earlier. Usually, when you see a word and then a parenthesis, something, and then closed
parenthesis, I claimed that's usually a function. And I'm still feeling pretty good about that claim. But
there are exceptions. And the word if is not a function.

- [3:02:48](https://www.youtube.com/watch?v=undefined&t=10968s) It's just a programming construct.


It's a feature of the C language that similarly uses parentheses, just for different purposes for a Boolean
expression. How about something like this? Last week, if you wanted to have a two-way fork in the road,
go this way or that way, you can have if and else. In C, that would look a little something like this.

- [3:03:08](https://www.youtube.com/watch?v=undefined&t=10988s) And if we add in the printf's, it


now looks quite like the same, but it adds, of course, the word else and then a couple of more curly
braces. As an aside, in C, it's not strictly necessary to have curly braces if you have only one line of code
indented underneath. For best practice, though, do so anyway, because it makes super clear to you and
ultimately anyone else reading your code that you intend for just those one or more lines of code to
execute.

- [3:03:35](https://www.youtube.com/watch?v=undefined&t=11015s) How about this from last week?


Here was a three-way fork in the road. If x is less than y, else if x is greater than y, else if x equals y. Now,
here's where you have some disparities between Scratch and C. Scratch uses an equals sign for equality,
to compare two values. C uses a single equals sign for assignment from right to left, minor difference
between the two worlds.

- [3:03:59](https://www.youtube.com/watch?v=undefined&t=11039s) In C, we could implement the


same code like this, the addition being just this additional else if. And if we add in the printf's, it looks a
little something now like this. This is correct both in the Scratch world and in the C world. But could
someone make a claim that this is not, again, well-designed? Exactly.

- [3:04:17](https://www.youtube.com/watch?v=undefined&t=11057s) We don't need the last if. We


need the else, at least, but we don't need the last if. Because, at least in the world of comparing integers,
it's either going to be less than, greater than, or equal to. There is no other case. So you can save a few
seconds, if you will, of your program running-- a blink of the eye-- by only asking two questions and then
inferring what the answer to the third must be just by nature of your own human logic here.

- [3:04:41](https://www.youtube.com/watch?v=undefined&t=11081s) Now, why is that a good thing? If,


for instance, x and y happen to equal each other-- I type in 1 and 1 for both values, either in Scratch or in
the C world-- in the case of this version, you're sort of stupidly asking three questions, all of which are
going to get asked even though the answer is no, no, yes.

- [3:05:02](https://www.youtube.com/watch?v=undefined&t=11102s) That is false, false, true. That


seems to be unnecessary because if we instead optimize this code, get rid of the unnecessary if and just
do as you proposed logically-- else print that x is equal to y-- now if x indeed equals y because they're
both 1 or some other value, now you're only going to ask two questions, so 2/3 as many questions, and
then you're going to get your same correct result.

- [3:05:29](https://www.youtube.com/watch?v=undefined&t=11129s) So, again, a minor detail, but,


again, the kinds of things you should be thinking about, not only as you write your code to be correct but
also write it to be well-designed as well. All right, so why don't we go ahead and translate this into the
context of an actual program here? I'll create a blank window here.

- [3:05:47](https://www.youtube.com/watch?v=undefined&t=11147s) And let's do something with


points, like points on my own very first CS50 problem set. Let me go ahead and run code of points.c.
That's just going to give me a new text file. And then up here, I'm going to do my usual, include cs50.h.
include stdio.h. int main void. So a lot of boilerplate, so to speak, in these early programs.

- [3:06:08](https://www.youtube.com/watch?v=undefined&t=11168s) And now, let's see. Let's ask the


user, how many points did they lose on their most recent CS50 PSet? So sort of evoke my photograph of
my own very first PSet last week where I lost a couple of points myself. So int points = get_int. Then I'll
ask a question in English like, how many points did you lose, question mark, space? And then once I have
this answer, let's now ask some questions of it.
- [3:06:34](https://www.youtube.com/watch?v=undefined&t=11194s) So if points is less than 2--
borrowing the syntax that we saw on the screen a moment ago-- let's go ahead and print out something
explanatory like you lost fewer points than me, backslash n. else if points greater than 2-- which is, again,
how many I lost-- I'm going to go ahead and print out you lost more points than me, backslash n.

- [3:07:00](https://www.youtube.com/watch?v=undefined&t=11220s) else if-- wait a minute, else seems


to be sufficient logically here. I'm just going to go ahead and print out something like you lost the same
number of points as me, backslash n. So, really, just a straightforward application of that simple idea but
to a concrete scenario here. So let me go ahead and save this.

- [3:07:21](https://www.youtube.com/watch?v=undefined&t=11241s) Let me go ahead and run make


points, Enter. No errors, that's good. Run points. And then, how many points did you lose? How about,
it's 1 point? All right, you lost fewer points than me. How about 0 points? Even better. How about 3
points? And so forth. So, again, we have the ability to express in C now a pretty basic idea from last week
in reality, which is this notion of conditionals and asking questions.

- [3:07:46](https://www.youtube.com/watch?v=undefined&t=11266s) There's something subtle here,


though, that's maybe not super well-designed that someone might call a magic number. This is
programming speak for something I've done here. There's a bit of redundancy unrelated to the if and the
else and the else. But is there something I typed twice just to ask, perhaps, for the obvious? Exactly, I've
hard-coded, so to speak, manually typed out the number 2-- in two locations, in this case-- that did not
come from the user.

- [3:08:16](https://www.youtube.com/watch?v=undefined&t=11296s) So, apparently, once I compile


this, this is it. You're always comparing yourself to me in like, 1996, which for better or for worse, is all
the program can do. But this is an example too of a magic number in the sense like, wait, where did that
2 come from, and why is it in two places? It feels like we are setting the stage for just a higher probability
of screwing up down the road.

- [3:08:36](https://www.youtube.com/watch?v=undefined&t=11316s) Because the longer this code gets,


suppose I'm comparing against 2 points elsewhere-- 2, 3, 4, 5 places-- am I going to keep typing the
number 2? Like, yeah, that's fine. It's correct. It's going to work. But, honestly, eventually, you're going to
screw up, and you're going to miss one of the 2's, you're going to change it to a 3, because maybe I did
worse the next year, or 1, I did better.

- [3:08:55](https://www.youtube.com/watch?v=undefined&t=11335s) And you don't want these


numbers to get out of sync. So what would be a logical improvement to this design, rather than hard-
coding the same number sort of magically in two or more places? Yeah, why don't I make a variable that
I can use in there? So, for instance, I could create a variable like this, another integer called mine.

- [3:09:15](https://www.youtube.com/watch?v=undefined&t=11355s) And I'm just going to initialize it to


2. And then I'm going to change mentions of 2 to this. And mine is a pretty reasonable name for a
variable insofar as it refers to exactly whose points are in question. There's a risk here, though, minor
though it is. I could accidentally change mine at some point.

- [3:09:32](https://www.youtube.com/watch?v=undefined&t=11372s) Maybe I forget what mine


represents, and I do some addition or subtraction. So there's a way to tell the computer "don't trust me,
because I'm going to screw up eventually" by making a variable constant too. So a constant in a
programming language-- this did not exist in Scratch-- is just an additional hint to the computer that
essentially enables you to program more defensively.

- [3:09:52](https://www.youtube.com/watch?v=undefined&t=11392s) If you don't trust yourself


necessarily to not screw up later, or honestly, in practice, if you know that number should never change,
make it constant and never think about it again. This tells the compiler to make sure that even you later
in your code cannot change the number 2. And another convention in C and other languages, when you
have a constant, it's often common to just capitalize the variable.

- [3:10:17](https://www.youtube.com/watch?v=undefined&t=11417s) Kind of like you're yelling, but it


really just visually makes it stand out. So it's kind of like a nice rule of thumb that helps you realize, oh,
that must be a constant. Capitalization alone does not make it constant. The word const does. But the
capitalization is just a visual reminder that this is somewhere, somehow a constant.

- [3:10:35](https://www.youtube.com/watch?v=undefined&t=11435s) So just a minor refinement, but,


again, we're sort of getting better at programming just by instilling these kinds of heuristics. Questions,
then, on conditionals in C or these constants? Yeah. AUDIENCE: Why do you not use semicolons after line
9 and line 13? DAVID J.

- [3:10:59](https://www.youtube.com/watch?v=undefined&t=11459s) MALAN: Yeah, why do you not use


a semicolon in lines 9, 13? Just because. This is the way the language was designed. And it's confusing
early on. Generally speaking, when you're using conditionals-- and eventually, we'll see loops-- there's no
semicolons involved. For now, assume that semicolons usually finish your thought after a function. That's
not 100% reliable of a heuristic, but it'll get you most of the way there.

- [3:11:20](https://www.youtube.com/watch?v=undefined&t=11480s) And just because. Left hand was


not talking to the right hand when some of these languages were designed. All right, so let's do
something else. How about this? If I have the ability to ask something conditionally-- is this thing true or
is this other thing-- could I write a very simple program that does something basic like, tells me if a
number the human types is even or odd? Well, let me just get the framework for that in place.

- [3:11:42](https://www.youtube.com/watch?v=undefined&t=11502s) Let me go ahead and write code


of parity.c-- parity is a fancy way of saying even or odd. And let me go ahead and include cs50.h, include
stdio.h, int main void-- again, more on those down the road. But, for now, I'm going to go ahead and get
a number n from the user by calling get_int and asking them for whatever n is.

- [3:12:03](https://www.youtube.com/watch?v=undefined&t=11523s) And then now I'm going to


introduce some pseudocode. So here's the first example of a program, honestly, that I'm not really sure
how to proceed. So let me just resort to some pseudocode using comments. Eventually, I'll get rid of this
and write actual code. But if n is even, then print-- actually, let me just print that.

- [3:12:24](https://www.youtube.com/watch?v=undefined&t=11544s) Let me just go ahead and say


printf, quote unquote, "even", because I know how to use printf. else-- all right, I know how to printf odd,
so let me just say printf, quote unquote, "odd". So here, I've sort of taken a bite out of the problem, if
you will. And let me go ahead and put in my little placeholders.

- [3:12:43](https://www.youtube.com/watch?v=undefined&t=11563s) I want to do some kind of


conditions. So if, question marks now, let me go ahead and fill in the blanks here. else I'll put this here.
So I think I'm OK now. I'm getting closer to solving this. But I still have this question mark here. How,
using syntax we've seen, might I determine if n is even or odd? What do you think? Nice.

- [3:13:08](https://www.youtube.com/watch?v=undefined&t=11588s) There's this little operator I


mentioned by name earlier, the remainder operator, that will let you do exactly that. If you divide any
number by 2, that mathematical heuristic is going to tell you if it's even or odd based on whether there's
a remainder of 0 or 1. And that's nice because the alternative would seem to be doing something stupid
like if n == 0 or if n equals 2 or n equals 4-- your code would be infinitely long if you had to ask all
possible questions.

- [3:13:37](https://www.youtube.com/watch?v=undefined&t=11617s) But if I do n divided by 2 and look


at the remainder-- it's a little cryptic, but this will indeed do the trick. So the percent sign is the
remainder operator. It does numerator divided by denominator and returns not the result of that but,
rather, the remainder of that. So if you divide anything by 2, it's going to be a 0 or 1 remainder.

- [3:14:03](https://www.youtube.com/watch?v=undefined&t=11643s) And if, indeed, 2 divides into n


evenly, giving you 0, then you're going to print even. Else, it's got to be odd. But there is something odd--
pun intended-- in this highlighted line. What is another new piece of syntax, apparently, besides the
percent sign? What's a little off there? Yeah. Yeah, so that's not a typo.

- [3:14:26](https://www.youtube.com/watch?v=undefined&t=11666s) And I even caught myself verbally


saying it a moment ago, just because it's so ingrained. What must this mean here? Yeah. AUDIENCE:
[INAUDIBLE] DAVID J. MALAN: Yeah, if something's equivalent to the other. So now this is the equality
operator. It's not assignment from right to left. And this one too is an example of, literally, humans not
really planning ahead, perhaps, left hand not talking to right hand in that someone decided, let's use the
equals sign for assignment.

- [3:14:49](https://www.youtube.com/watch?v=undefined&t=11689s) And then some number of


minutes or days later, people are like, damn, how do we now compare for equality? Well, let's just use
two. And if you think this is a little weird, in some languages, like JavaScript, there's a third version where
you use three equal signs. So, again, it's humans that design these languages.

- [3:15:03](https://www.youtube.com/watch?v=undefined&t=11703s) So if you're ever frustrated by


them, confused by them, eh, admittedly, it might just not have been the best design. But we just kind of
have to live with it ever since. So let me go ahead and zoom out here. Let me go ahead and make parity
here. So make parity-- and, again, parity is just the name of my file, parity.c.

- [3:15:21](https://www.youtube.com/watch?v=undefined&t=11721s) ./parity, type in a number like 2.


That's indeed even. 4, that's indeed even. 3, that's indeed odd, and so forth. If we continue testing,
presumably, we'll get the same kinds of answers. How about something else? Let me go ahead now and
let me start copying and pasting some of this code because, admittedly, it's getting a little tedious to
keep typing out all of that boilerplate at the top.

- [3:15:43](https://www.youtube.com/watch?v=undefined&t=11743s) Let me create a program called


agree.c that's reminiscent of any of those forms you have to agree to online with a checkbox or typing in
yes or no or the like. So let me throw away all the guts of this main program and now ask something like
this. Let me go ahead and prompt user to agree to something.
- [3:16:02](https://www.youtube.com/watch?v=undefined&t=11762s) I'm going to go ahead and say,
how about get_string do you agree-- whatever the question might be-- and I want the human to type y
or n for yes or no, respectively. So if it's only a single character, actually, I can actually get by with just
get_char. Not used it before, but it was on our menu of functions from the CS50 library.

- [3:16:22](https://www.youtube.com/watch?v=undefined&t=11782s) And if I want to get the user's


response, the return value should be a char also on the left. So now we've seen strings, ints, and now
chars, if we only care about a single letter. And now let's go ahead, check whether user agreed. So how
about if c == 'y', then let me go ahead and, inside of my curly braces, print out agreed or some such
sentence like that.

- [3:16:52](https://www.youtube.com/watch?v=undefined&t=11812s) else if they did not type c-- or you


know what? Let's be explicit here, just so they can't type z or b or some random letter. else if c == 'n', n
for no, then let me go ahead and print out not agreed, or something like that. And I'm just going to
ignore the user if they don't cooperate and they type z or b or something that's not y or n.

- [3:17:15](https://www.youtube.com/watch?v=undefined&t=11835s) All right, let me go ahead now


and compile this code, make agree, ./agree. All right, do I agree? Yes. Let's go with the default. OK, so
that seems to work. No, I don't agree this time. That seems to work. How about my caps lock key is on or
I'm just really yelling, capital Y? It ignores me. Capital N, it ignores me.

- [3:17:38](https://www.youtube.com/watch?v=undefined&t=11858s) So, obviously, a bug, at least if I


want to tolerate uppercase and lowercase, which is kind of reasonable. So what would be the possible
solutions here, do you think? How do I solve this and tolerate both capital and lowercase? Maybe what's
the simplest, most naive implementation? AUDIENCE: [INAUDIBLE] DAVID J.

- [3:17:59](https://www.youtube.com/watch?v=undefined&t=11879s) MALAN: Yeah, so why don't I just


ask two questions? Or you know what, even more simplistic based only on what we've seen before-- if
you will, let me just copy and paste some of this code. Change this to an else-- whoops, not in caps-- else
if 'Y'. And then I bet I could do the same thing with n. But here too, just like with Scratch, as soon as you
start to find yourself copying and pasting, you're probably doing something wrong.

- [3:18:19](https://www.youtube.com/watch?v=undefined&t=11899s) And what you said verbally, if I


may, was actually better. Because you're implying that I could just say something like OR c == 'Y' or,
down here, c == 'N'. The catch is, you can't use the word OR in C. It's actually two vertical bars. So you
can express one question or another.

- [3:18:43](https://www.youtube.com/watch?v=undefined&t=11923s) You only need one of the answers


to be yes or true, and you use two vertical bars. By contrast, just so you've seen it, if you wanted to check
if something is equal to something AND something else, you could use two ampersands. This logically
would make no sense here, though. Certainly, what the human typed can't both be lowercase and
uppercase.

- [3:19:03](https://www.youtube.com/watch?v=undefined&t=11943s) That just makes no sense. So in


this case, we do want OR. But that allows me to tighten my code up. I don't have to start copying and
pasting whole branches. I can now ask two questions at once. Questions, then, on this variation? Really
good question. Can you convert the input to all lowercase? Absolutely, you could.
- [3:19:21](https://www.youtube.com/watch?v=undefined&t=11961s) We don't have the capability yet.
It turns out that's going to require-- to be easy, another library, though we could do it ourselves knowing
a little bit about ASCII or Unicode from last week. But, yes, that would be an alternative, but more on
that a different time. Other questions? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Good question.

- [3:19:38](https://www.youtube.com/watch?v=undefined&t=11978s) Unfortunately, you have to be


explicit in C. You can't just say this, even though that's kind of how you might think about it. You have to
ask a complete question using the equality sign twice in this case. Let me ask a question now too. It's not
a typo. I deliberately used single quotes around all of my single letters here.

- [3:19:58](https://www.youtube.com/watch?v=undefined&t=11998s) Why might that be? Previously,


we used double quotes for anything that looked like text. Yeah. Correct, string is double quotes for
multiple characters-- or even one, technically, but yes. And single quotes for single characters. Because
my data type is different. I chose the simple route of just using a single char.

- [3:20:19](https://www.youtube.com/watch?v=undefined&t=12019s) In fact, this program won't work


with Y-E-S or N-O. That's not supported at the moment-- more on that another time. I had to use single
quotes because that's how C does it. If you're dealing with single characters, a.k.a. chars, use single
quotes. If it's a string-- even if it's one single character in a string as though you're starting to write out a
longer word or sentence-- that would be double quotes.

- [3:20:41](https://www.youtube.com/watch?v=undefined&t=12041s) And we'll see why this is before


long too. But, again, just things to keep in mind whenever writing code in this particular language. Yeah,
down here. So, short answer, if I'm understanding correctly, this would be incorrect. And this would be
even more incorrect. But if you don't mind, let me kick the can a couple of weeks on this as to why this
doesn't work.

- [3:21:02](https://www.youtube.com/watch?v=undefined&t=12062s) The most pleasant way to do this


would indeed be to do something like this. But even this is a slippery slope, because what if the user
does something weird, like they capitalize just the Y? You can imagine this getting messy quickly. I like
your idea earlier about just forcing everything to lowercase just to standardize things.

- [3:21:19](https://www.youtube.com/watch?v=undefined&t=12079s) Unfortunately, you cannot


compare strings for equality like this for, again, reasons we'll come to before long. So for today, we're
keeping it simple, even though, arguably, it's not nearly as user-friendly to only tolerate individual letters.
And there's a question over here. On the US English keyboard it's shift and then the backslash key above
Return, but depending on your keyboard, it will vary.

- [3:21:42](https://www.youtube.com/watch?v=undefined&t=12102s) All right, so let's actually now look


back at something we did a little bit of last week. Let me go ahead and open a file called meow.c,
because, recall, that's what we had Scratch do initially. Let me include not the CS50 library this time, but
just stdio.h because I only want printf for this demo.

- [3:21:59](https://www.youtube.com/watch?v=undefined&t=12119s) Let me go ahead now and just


print out meow. And then if I want the cat to meow three times, like it did last week, meow, meow,
meow. Save it. make meow, ./meow. Voila. The program is written-- correct, I claim. It ran. It compiled
OK. But, again, this was the beginning of our conversation last week of not being particularly
well-designed.

- [3:22:21](https://www.youtube.com/watch?v=undefined&t=12141s) And if someone wants to maybe
point out the now obvious, why is this not well-designed, necessarily? Yeah, it's just repetition, right?
Again, I literally resorted to copy-paste. That should be the signal that you're probably doing something
wrong or, at best, just lazy of you, in this case. So the solution, as you might glean from last week, is
probably going to be one of those things called loops.

- [3:22:43](https://www.youtube.com/watch?v=undefined&t=12163s) So let's just take a look at some of


the syntax for loops in C. But, again, no new ideas, it's just some new syntax that'll take some getting
used to. In Scratch, if you wanted to meow forever with something like this, there's not a forever
keyword in C, so this one's a little weird to look at.

- [3:22:58](https://www.youtube.com/watch?v=undefined&t=12178s) But this is the best we can do. It


turns out there is a keyword called while in C. And that kind of has the right semantics, because it's like
while I do something again and again, that's the best I can do. But just like an if condition or an else if
condition, those took a Boolean expression in parentheses, a while loop also takes a Boolean expression
in parentheses.

- [3:23:21](https://www.youtube.com/watch?v=undefined&t=12201s) So I have to ask a question. Now,


if I want to do something forever, I could kind of stupidly just say while 2 is greater than 1, while 3 is
greater than 2, or just something completely arbitrary. But that should rub you the wrong way, because
like, why 2 versus 1? Why 3-- if you want true, just say true.

- [3:23:41](https://www.youtube.com/watch?v=undefined&t=12221s) So it turns out in C, there are


special keywords, true and false, that are literally true and false, respectively. I could also put the number
1 for true and the number 0 for false, but most people would just say true to be explicit. So it's a little
hackish, if you will, but very conventional. There's no forever keyword in C.

- [3:24:04](https://www.youtube.com/watch?v=undefined&t=12244s) If I want to then print meow


forever, I'm going to just use something like printf here. So, again, not perfect translation from one to the
other, but absolutely possible in C. What about this? This is a little more common if you want to do
something a finite number of times, like repeat 3. There's a few different ways we can do this in C. Here's
one approach.

- [3:24:22](https://www.youtube.com/watch?v=undefined&t=12262s) And here's where C-- like a lot of


text-based languages, you kind of have to whip out that toolkit of all of the basic building blocks and
think about, all right, how can I build a little machine in software that does something some number of
times? Well, let me give myself a variable called counter, set it equal to 0.

- [3:24:40](https://www.youtube.com/watch?v=undefined&t=12280s) Let me create a loop whose


Boolean expression is counter less than 3, the idea being here, why don't I just kind of count 1, 2, 3? So
how do I implement this physicality in code? I give myself a variable, set it to 0, 0 fingers up. Now, I ask
the question, is counter less than 3? If so, go ahead and print out meow.

- [3:25:03](https://www.youtube.com/watch?v=undefined&t=12303s) And just intuitively, even if you've


never seen C code or any code before Scratch, what more do I need to do? I've left room here for one
more line of logic. Yeah. We have to increase counter. So I need code like I showed earlier, like counter
equals counter plus 1. And so here's where programming sometimes becomes a bit more like plumbing.

- [3:25:23](https://www.youtube.com/watch?v=undefined&t=12323s) You can't just say what you mean,
like you could in Scratch. You have to build a little sort of software machine that initializes a value, does
something, increments it, checks it. And so it's kind of like this software-based machine, but together,
that's just using some familiar building blocks.

- [3:25:37](https://www.youtube.com/watch?v=undefined&t=12337s) But this is pretty common. Just


like in Scratch, you might have used loops a bunch of times, pretty common in C. So can we tighten this
code up? This is correct, but here are some conventions that are popular. If you're going to count, just
say i. A convention in programming-- with, at least, languages like C-- is to just use i as an integer if its
only purpose is to count from, like, 0 on up.

- [3:26:00](https://www.youtube.com/watch?v=undefined&t=12360s) Counter is not wrong. It's not bad.


It's just more verbose than you need to be. Just call it i. You don't need more semantics than that. All
right, what else can I do here? There's another opportunity to tighten up this code. Do you recall? Yeah.
Yeah, that syntactic sugar that does nothing new, but it does it more succinctly.

- [3:26:20](https://www.youtube.com/watch?v=undefined&t=12380s) I can change this to either the


intermediate format or even tighter format of just i++. Now, this is pretty canonical. This is how most
people would implement something three times using a loop in C-- using a while loop, that is. Turns out
that it's so common in C and other languages to do something finitely many times, there's a couple of
ways to do it.

- [3:26:43](https://www.youtube.com/watch?v=undefined&t=12403s) In this model, to be clear, the


logic, though, is that we start by initializing the variable, like I've highlighted here. We then ask the
question, is i less than 3? If so, everything that's indented inside the curly braces gets executed-- namely,
meow then the update. Then the computer is going to have to recheck the condition to make sure that i
hasn't gotten so big that it's greater than 3.

- [3:27:07](https://www.youtube.com/watch?v=undefined&t=12427s) But if not, it then does this again


and it does this again. And then it repeats, constantly checking the condition and executing what's in the
block, checking the condition and executing what's in the block. After three times of that, the condition
is going to be false, or a no answer, and that's it for the code.

- [3:27:22](https://www.youtube.com/watch?v=undefined&t=12442s) It just proceeds to whatever's


down here, just like with Scratch. It jumps to the next blocks down below. All right, what's another way,
though, to do this? Well, I've deliberately been counting from 0-- and that's a programming convention,
right? We started last week with all the light bulbs off, which was 0.

- [3:27:37](https://www.youtube.com/watch?v=undefined&t=12457s) So it's pretty reasonable to start


counting at 0, just like you would here. Like, no fingers are up, this is 0 fingers on your hand. But if you
prefer, you could start counting at i equals 1. But then you don't want to do it while i is less than 3, you
want to do i is less than or equal to 3.

- [3:27:54](https://www.youtube.com/watch?v=undefined&t=12474s) On most keyboards, there's no


symbol for less than or equal to or greater than or equal to, so in C, you use two characters, less than
and then an equals sign with no spaces in between. That just means less than or equal to. We could
change it to set i to 2 and make this condition be less than or equal to 4.

- [3:28:14](https://www.youtube.com/watch?v=undefined&t=12494s) We could make this be a 10 and
less than or equal to 12. But, again, just stick with the basics. Start at 0 and count on up would be the
convention. Or if you prefer to count down, that's fine too. Set i to 3 and then do this so long as i is
greater than 0, but you have to decrement instead of increment.

- [3:28:35](https://www.youtube.com/watch?v=undefined&t=12515s) So, again, we could do this all day


long. There's literally an infinite number of ways to implement this idea. And that's why I keep
emphasizing convention. Call the variable i for something like this, initialize it to 0 for something like this,
and just generally count up, unless you really prefer to count down.

- [3:28:49](https://www.youtube.com/watch?v=undefined&t=12529s) Again, just certain human


conventions. All right, how about another way to do this? This is what's called a for loop in C, also very
common. It's not quite as straightforward in that it doesn't really read top to bottom in exactly the same
way. This kind of has a lot more logic tucked into its first line.

- [3:29:07](https://www.youtube.com/watch?v=undefined&t=12547s) But it does exactly the same


thing. What happens here is-- notice that inside the parentheses, next to the word for, there's two
semicolons-- which is another weird use of syntax. They're not at the end of the line, now they're in the
middle of the parentheses. But that's what the humans chose years ago.

- [3:29:24](https://www.youtube.com/watch?v=undefined&t=12564s) The first thing before the


semicolons initializes your variable, int i = 0. The next thing is the condition that's going to constantly get
checked every cycle through this loop. And the last thing is going to be what you do after each loop,
which in this case is going to be count up. So, again, if I rewind we initialize i to 0.

- [3:29:46](https://www.youtube.com/watch?v=undefined&t=12586s) We then ask the question, is i less


than 3? If so, execute what's inside of the loop. Then the computer does this, it does the update,
incrementing i by 1. And then it's not going to blindly meow again. It's going to check again the
condition, is i less than 3? Then it's going to meow if so.

- [3:30:06](https://www.youtube.com/watch?v=undefined&t=12606s) Then it might go ahead and


increment i and check the condition again. So, again, this does not read quite the same simple fashion
top to bottom. You kind of read it left to right and then jump around. But, again, the initialization, the
constant Boolean expression being checked, and the update after each time does the exact same thing
as what we saw a moment ago in this while loop format.

- [3:30:33](https://www.youtube.com/watch?v=undefined&t=12633s) Which one is better? Eh, they're


the same. I think most people would probably eventually use a for loop once comfortable, but just
because is really the answer there. All right, any questions, then, on loops as we've translated them to C?
Yeah. AUDIENCE: [INAUDIBLE] DAVID J.

- [3:30:50](https://www.youtube.com/watch?v=undefined&t=12650s) MALAN: A for loop and while loop


can both be used to do exactly the same thing. There are subtle differences with issues of scope, which
we'll discuss before long, where when you create a variable in a for loop-- notice that it was, again, inside
of those parentheses, which technically means it's only going to exist in these four lines of code.

- [3:31:09](https://www.youtube.com/watch?v=undefined&t=12669s) By contrast, with the while loop, I


declared my variable outside of the loop. That variable is going to continue to exist elsewhere in my
program. So that's one of the minor differences there. Good question. But you'll see some others over
time. All right, so we claim then that it's better in some form to do this with loops.

- [3:31:27](https://www.youtube.com/watch?v=undefined&t=12687s) So let's actually jump back to the


code. Let me go ahead and now re-implement meowing with a for loop, for instance. So how about for
int i = 0, i less than 3, i++. Then inside my curly braces, let me go ahead and print out with printf, meow,
with a newline and a semicolon. So I did it pretty quickly just because I've long acquired the muscle
memory.

- [3:31:51](https://www.youtube.com/watch?v=undefined&t=12711s) But if I now make meow, no


errors there. Run ./meow. And I see meow, meow, meow. Well, let's do now what we did last week,
which was to begin to make our own custom functions, if you will, by using our own in C. So here's
where the syntax gets a little funky, but we'll explain over time what each of these keywords is doing.

- [3:32:14](https://www.youtube.com/watch?v=undefined&t=12734s) If I want to create a function


called meow-- because the authors of C did not create a function called meow decades ago-- I need to
give it a name, like meow. I need to specify if it takes any inputs. For now, I'm going to say no. And I'm
going to explicitly say no by writing the special word void.

- [3:32:34](https://www.youtube.com/watch?v=undefined&t=12754s) It's also necessary when


implementing a function in C-- which was not necessary in Scratch-- to specify what its return type is. But
for now, I'm just going to say that meow is the name of the function, it takes no inputs-- and that's what
the void in parentheses means-- and it does not return anything like ask did, or like get_string or get_int
does.

- [3:32:56](https://www.youtube.com/watch?v=undefined&t=12776s) meow's purpose in life is just to


have side effects, visual side effects by printing something on the screen. So what is meow going to do?
I'm going to have it quite simply say printf, quote unquote, "meow", backslash n. And now, just like in
Scratch, I can now just call a brand new function called meow.

- [3:33:17](https://www.youtube.com/watch?v=undefined&t=12797s) And here's where too, if you really


don't like the curly braces, technically speaking, you can get rid of them when there's only one line of
code inside your loop. But, again, stylistically, I would encourage you to preserve them to make super
clear to yourself and others what it is that's going on.

- [3:33:32](https://www.youtube.com/watch?v=undefined&t=12812s) Let me go ahead and save this


and do make meow. Whoops. Darn. All right, what did I do? Something stupid. AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Yeah, so 0 does not belong there. I meant to hit parenthesis. So let me rerun make
meow. OK, fixed. My mistake. All right, it's still working OK. But recall what I did in Scratch, kind of out of
sight, out of mind.

- [3:33:55](https://www.youtube.com/watch?v=undefined&t=12835s) And just to make a point, let me


just highlight this and move it way down in the file. Because, again, now that meow exists, it's an
abstraction. I just know a meow function exists. I want to be able to use it. So let me scroll back up. My
main function is the same. Let me go ahead and make meow again. And now, just by moving that
function, I've created all these lines of errors.

- [3:34:18](https://www.youtube.com/watch?v=undefined&t=12858s) And let's look at the first. Again,
the rule of thumb here-- it's a little small, but it says meow.c in bold-- which is the name of the file where
the bug is-- 5 is the line number, and 20 is the character. So line number is enough alone. Let's see. Oh,
this is what happens when I scrolled up too far.

- [3:34:37](https://www.youtube.com/watch?v=undefined&t=12877s) Sorry. This is the error we're now


looking at, line 7. I was looking at the old error message from earlier before I fixed the 0. meow.c line 7.
All right, apparently, C does not know what the meow function is. Implicit declaration of function meow
is invalid in C99. Well, what does that mean? Declaration of function means your creation of a function.

- [3:34:58](https://www.youtube.com/watch?v=undefined&t=12898s) Like, I'm declaring that meow


exists, but I haven't apparently defined it yet. And then C99 is the version of C from the year 1999, which
we generally use here, it's one of the more recent versions. So why is that the case? Can you infer from
the mere fact that I just moved meow to the bottom of the file-- which was fine in Scratch but now is
bad-- why is that? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, C is just kind of old school.

- [3:35:24](https://www.youtube.com/watch?v=undefined&t=12924s) It reads your code top to bottom.


And if it does not know what meow is when you first try to use it, it just freaks out and prints out these
error messages. So the solution is, quite simply, don't do that, just leave it where it was. But you can
imagine this getting a little annoying over time, if only because main is, by name, the main part of your
program.

- [3:35:47](https://www.youtube.com/watch?v=undefined&t=12947s) And, honestly, it would just be


nice if main were always at the top of your code. Because if you want to understand what a file is doing,
it makes sense to just read it top to bottom. Well, there is a solution to this. You can put functions in
different orders with main at the top so long as you-- and this is perhaps the only time copy paste is
appropriate-- so long as you leave a little breadcrumb for the compiler at the very top of your file that
literally repeats the return value, the name, and the arguments

- [3:36:17](https://www.youtube.com/watch?v=undefined&t=12977s) to that function, semicolon. This


is, so to speak, declaring your function-- and the real fancy way is this is a prototype. It's like, what is this
thing going to look like? But the semicolon means I'm not going to deal with this yet. I'm going to
actually define the function or implement it down below here.

- [3:36:35](https://www.youtube.com/watch?v=undefined&t=12995s) This is kind of a stupid detail.


More recent languages get rid of this need, you can put your functions in any order. But, again, if you just
think about the basics of programming languages like this one here-- and as you noted-- it must just be
reading your code top to bottom. So annoying, yes, but explained, yes too.

- [3:36:53](https://www.youtube.com/watch?v=undefined&t=13013s) So let me go ahead and make


meow one more time, ./meow, still working OK. And let me make one final enhancement to this meow
program here. Let me go ahead now and say something like this. Let me go ahead and say, all right,
wouldn't it be nice if my meow function could do something for me some number of times? So suppose I
want to do this.

- [3:37:15](https://www.youtube.com/watch?v=undefined&t=13035s) This meow function at the


moment is going to meow three times. But suppose I want to meow n times, where n is just some
number provided by the user. Well, just like in Scratch, custom functions can take inputs, I just presently
am saying void. But if I change this to int n, thereby telling the compiler, hey, meow still doesn't return
something, but it does take something as input.

- [3:37:41](https://www.youtube.com/watch?v=undefined&t=13061s) It takes an integer, and I want to


call it n. So this is another way of declaring a variable but a way of declaring a variable that gets handed
into, as input, the function. So now if I tighten up main here, now I can actually do something really cool
just like in Scratch, which is this. If I now look at this code-- let me zoom in here-- now my main program
is really well-written in the sense that it just says what it does, meow three times.

- [3:38:07](https://www.youtube.com/watch?v=undefined&t=13087s) This works, though, because I


defined meow as now taking an input, an integer called n, and then using n in my now familiar for loop.
There's one change. You might have caught my one mistake. I also have to remind myself up here to
make that change too. Again, this is one of the only redundancies or copy-paste that's sort of
reasonable.

- [3:38:30](https://www.youtube.com/watch?v=undefined&t=13110s) But there, I have now a better


version. So let me go ahead and rerun this, make meow, ./meow. Voila. So, again, no change in
correctness but now, again, we're sort of modularizing our code. And, heck, what you could do now--
and this is just a tease about a feature down the road-- those header files we talked about early, those
libraries, this is the kind of modularization we're talking about.

- [3:38:51](https://www.youtube.com/watch?v=undefined&t=13131s) We, the staff, wrote a function


called get_string, get_int, and so forth, we put it in a file called cs50.c, and we put little breadcrumbs--
specifically, these things called prototypes-- in cs50.h. So that when you all, as aspiring programmers,
include cs50.h, you are sort of secretly telling the compiler at the very top of your code what the menu
of available functions is.

- [3:39:16](https://www.youtube.com/watch?v=undefined&t=13156s) Why? Because in cs50.h are lines like


these-- obviously, not for meow, but for get_string, get_int, and so forth. And stdio.h is the same lines of
code for things like printf. So that's all that's going on there. It's just a way of telling the computer in
advance what functions to expect. All right, any questions, then, on these here? Correct.

- [3:39:44](https://www.youtube.com/watch?v=undefined&t=13184s) So if you don't mind, I want to


continue to wave my hand at that detail for today. Indeed, int main void is a little weird, because what
would the input domain be? We have no mechanism for providing input yet. And what does it mean for
main to return anything? Like, who is it returning to? For another day, if we may.

- [3:40:00](https://www.youtube.com/watch?v=undefined&t=13200s) They're going to come into play


but that, for now, today is just something you should take at face value, as necessary copy-paste to begin
programs. So meow is a function that takes an input, the number of times to meow, but it didn't actually
have a return value, hence the void. But what if we actually want to create our own function that not
only takes 0 or more inputs as arguments but also returns some value, maybe an int, maybe a float,
maybe something else altogether? Well, it turns out, in C, we can do that as well.

- [3:40:28](https://www.youtube.com/watch?v=undefined&t=13228s) Let me go ahead and create a


new file here called discount. And let's implement a quick program via which we can discount some
regular price by some percentage, as though there's a sale going on in a store. Let me go ahead and
include our usual cs50.h followed by stdio.h at the top. Let me give myself int main void as before.

- [3:40:48](https://www.youtube.com/watch?v=undefined&t=13248s) And inside of main, let's go ahead
and do something simple. Let's give ourselves a float called regular, representing the regular price of
something in a store. Let's go ahead and get a float from the user asking them what that regular price is.
Then, next, let's go ahead and declare a second variable-- also a float-- called sale, ultimately
representing the sale price after some percentage discount off.

- [3:41:10](https://www.youtube.com/watch?v=undefined&t=13270s) And let's go ahead and simply


calculate whatever regular is. And, say, 15% off is a pretty good discount. So let's go ahead and discount
regular, whatever it is, by 15%, which is equivalent, of course, to multiplying it with the asterisk by 0.85.
Of course, if we're taking off 15%, we multiply the regular price by 0.85.

- [3:41:30](https://www.youtube.com/watch?v=undefined&t=13290s) Now, let's go ahead and print out


the results here. Let me go ahead and say printf sale price, colon-- let me go ahead and %f, but, more
specifically, %.2f because, at least in US currency we typically show cents to two decimal places--
followed by a newline. And then let me go ahead and plug in the value of sale.

- [3:41:49](https://www.youtube.com/watch?v=undefined&t=13309s) All right, let's go down here and


do make discount, Enter. So far, so good-- ./discount. And the regular price is maybe $100. So the sale
price should be $85. So our arithmetic seems to be correct here. But let's fast-forward now in time.
Suppose that we find ourselves discounting a lot of prices in an application, maybe a website like
Amazon where they're offering some kind of percentage discount.

- [3:42:11](https://www.youtube.com/watch?v=undefined&t=13331s) And it'd be nice to have a


reusable function that just does this arithmetic for us, simple though it may be. So let's go ahead and
modify discount this time to give ourselves our own function called discount, for instance, that takes an
input-- like the regular price that you want to discount-- and then it also returns a value.

- [3:42:28](https://www.youtube.com/watch?v=undefined&t=13348s) It doesn't just print it out. It


returns a value, namely, a float that represents what the sale price is. So let me go down below main and
go ahead and define a function that's going to return a float, because we're dealing with dollar amount
still. The function is going to be called discount.

- [3:42:44](https://www.youtube.com/watch?v=undefined&t=13364s) And it's going to take one input,


like the price that we want to discount. In here, I'm going to do something very simple. I'm going to say
float sale equals whatever that price is times 0.85. And then I'm going to go ahead and return sale. Now,
for that matter, I can actually tighten this up a bit.

- [3:43:01](https://www.youtube.com/watch?v=undefined&t=13381s) If I'm only declaring a variable to


store a value that I'm then returning with this keyword return, I actually don't even need that variable.
So I can delete the second line. And I can actually just go ahead and get rid of that variable altogether
and immediately return whatever the arithmetic result is of taking the price input, the argument that's
being passed in, times 0.85.

- [3:43:22](https://www.youtube.com/watch?v=undefined&t=13402s) So very simple function that


simply does the discounting for me. As always, let me go ahead and copy-paste-- the only time it's OK to
copy-paste-- the prototype of that function, to the top of the file, so that when compiling this code, main
has already seen the word discount before. And now let me go into the code here.

- [3:43:40](https://www.youtube.com/watch?v=undefined&t=13420s) And instead of doing the math
myself in main, let me presume that we have some function already in our toolkit called discount that
lets me discount the regular price and return that value. And then down here, my code doesn't need to
change. I'm still going to print out sale the variable in which I'm storing that result. But notice what I've
done here.

- [3:44:00](https://www.youtube.com/watch?v=undefined&t=13440s) I've sort of abstracted away the


notion of taking a discount by creating my own function that takes a float called price, or anything else as
input. It does a little bit of math, simple though it is here, and then it returns a value. But notice that
discount is not printing that value. It's literally using this other keyword called return so that I can hand
back that value, just like get_string hands back a value, just like get_int hands back an integer without printing
it for you-- so that I up here on line 9 can go ahead and store

- [3:44:30](https://www.youtube.com/watch?v=undefined&t=13470s) that value in a variable if I want


and then actually print it out. Let me go ahead now and recompile this code with make discount. Let me
go ahead and do ./discount. And let's, again, do $100. Sale price is going to be $85 as well. Now, it turns
out that functions don't have to take just 0 or 1 argument as input.

- [3:44:51](https://www.youtube.com/watch?v=undefined&t=13491s) They can actually take 2 or 3 or


more. So, in fact, suppose we wanted to now enhance this version of my program and take in as input to
the discount function, not just the price that I want to discount but also the percentage off, thereby
allowing us to support not just 15% off but any number of percentage points off.

- [3:45:09](https://www.youtube.com/watch?v=undefined&t=13509s) Well, let me go up here and


declare an int, say, and call it percent_off. And let me ask the user for how many percentage points they
want to take off. So I'm going to say percent_off inside of the prompt here, get that int called
percent_off. And now in addition to passing in regular as an input to the discount function, I'm also going
to pass in percent_off.

- [3:45:31](https://www.youtube.com/watch?v=undefined&t=13531s) But I need to tell the computer


that it is taking now two arguments, and the way I do this is just with a comma down here in the
function's own definition. Here is going to be a percentage argument, a second argument, per the
comma. And I'm now going to use that percentage in a slightly familiar way.

- [3:45:50](https://www.youtube.com/watch?v=undefined&t=13550s) I don't want to just do percentage


like this, because, of course, that's going to increase the size of the total price. I actually need to do a
little bit of real-world math where if this is a percentage off, like the number 15 for 15 percentage points,
I need to do 100 minus that many percentage points, thereby giving me 100 minus 15-- 85.

- [3:46:10](https://www.youtube.com/watch?v=undefined&t=13570s) And then I need to divide that by


100 in order now to give myself 0.85 times the price that was passed in. But if I go ahead now and save
this, run, make discount one last time, I notice that I've actually got an error here. What have I done
wrong? Well, I need to change that prototype too. And, again, this is admittedly an annoying aspect of C
that you have to maintain consistency here.

- [3:46:33](https://www.youtube.com/watch?v=undefined&t=13593s) But that's fine. I'm just going to go


up here, change this to int percentage-- spelling incorrectly. And now let me retry compilation, make
discount, crossing my fingers this time. Worked OK. ./discount, and voila, $100. And percent off, say, 15
points. And, voila, $85. Now, it's worth noting that I've deliberately returned the results of my math from
this function.

- [3:46:58](https://www.youtube.com/watch?v=undefined&t=13618s) I haven't just done the math on


the original variable that's being passed. In fact, if we take a look at this second version where discount is
now taking a price argument and a percentage argument, notice that I'm not doing something like this.
I'm not just saying price equals price times 100 minus percentage divided by 100 and leaving it at that.

- [3:47:18](https://www.youtube.com/watch?v=undefined&t=13638s) The problem there is that this


variable price is going to be scoped to that discount function. And we'll encounter this again before long,
but this notion of scope just refers to where a variable actually lives or exists or is accessible. So
it turns out if I change price in the context of this discount function, that's not going to have a lasting
effect.

- [3:47:39](https://www.youtube.com/watch?v=undefined&t=13659s) If I actually want to get the result


back to the function that used the discount function, namely, main, I actually do need to take this
approach of actually returning the value explicitly so that ultimately I'm handing back the discounted
price. All right. Well, let's go ahead and maybe how about let's just use these primitives in just a few
different ways.

- [3:47:58](https://www.youtube.com/watch?v=undefined&t=13678s) How about a little game of


yesteryear, Super Mario Brothers? And in the original Super Mario Brothers and in bunches of variants,
so you have these side-scrolling worlds that look like this where there's some coins in the sky hidden
behind these question marks. So let's just use this as a visual to consider how in C could I start to make
something semi-graphical.

- [3:48:17](https://www.youtube.com/watch?v=undefined&t=13697s) Like, not actual colors or


fanciness, that feels like too much too soon-- just something like printing out some question marks. Well,
if I go back over here, let me create that actual file that I alluded to earlier. So let me code up mario.c. Let
me go ahead and include stdio.h, int main void, again, which we'll continue to copy-paste for today.

- [3:48:36](https://www.youtube.com/watch?v=undefined&t=13716s) And then let me just go ahead


and do something simple like 1, 2, 3, 4, and a newline. All right, this is what we might call ASCII art, which
just means graphics but really just implemented with your keyboard. And if I make mario and do ./mario,
it's not nearly as engaging visually as this, but it's the beginning of this kind of map for a game.

- [3:48:56](https://www.youtube.com/watch?v=undefined&t=13736s) Well, if I wanted to now print out


four of those things dynamically, let me go back to my code here. And instead of printing out four all at once, I
could do something like for int i gets 0, i less than 4, i plus plus. And then inside here, I could just print
out one of them at a time. Let me save that, make mario.

- [3:49:16](https://www.youtube.com/watch?v=undefined&t=13756s) And, at the risk of disappointing,


so close but I made a mistake, just a stupid aesthetic. The prompt is not on the new line. How could I
move it? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, I need an escape character, the backslash n.
But should I put it here? OK, no, because that's going to put it after everyone, and it's going to make this
thing vertical instead of horizontal.
- [3:49:40](https://www.youtube.com/watch?v=undefined&t=13780s) So, logically, just like in Scratch,
put it at the end of the loop, so something out here. And just print out, for instance, only, quote
unquote, new line. And now if I do make mario again, ./mario, OK. We're back in business. But a little
better designed in that now I'm not repeating myself multiple times, I'm doing this again and again.

- [3:49:57](https://www.youtube.com/watch?v=undefined&t=13797s) But let's do one other thing here


with mario. Let me go ahead and ask the user how many question marks or coins to print. The catch here
is that there's another type of loop that's helpful for this, and it's called a do while loop, generally. A do
while loop is similar to a while loop, but it checks the condition last instead of first.

- [3:50:19](https://www.youtube.com/watch?v=undefined&t=13819s) Recall earlier on the slide, we had


while, open parenthesis, closed parenthesis. And I kept claiming that we check whether i is less than--
whatever it was, 3 in advance again and again. A do while loop just inverts the logic so that you can
actually do something like this. At the top of this program, I'm going to go ahead now and give myself a
variable n like this of type integer.

- [3:50:40](https://www.youtube.com/watch?v=undefined&t=13840s) And then I'm going to do, literally,


the following with the keyword do. n equals get_int-- and I'm going to ask the user for the width, like the
number of dollar signs to print. And I'm going to do this while n is less than, say, 1. So this is a little
cryptic, but the salient differences are the Boolean expression is now at the bottom of my block of code,
not at the top.

- [3:51:06](https://www.youtube.com/watch?v=undefined&t=13866s) Now, why is this? Well, the


difference here if I make mario is-- whoops. I need to add cs50.h, because I'm now using get_int. If I now
compile this version of Mario and do ./mario, a do while loop is helpful when you want to do something
no matter what first and then check some condition or some Boolean expression to see if maybe, in this
case, the user cooperated.

- [3:51:32](https://www.youtube.com/watch?v=undefined&t=13892s) It would make no sense if the user


typed in, say, 0, because there's no work to be done. It'd be really weird if they said negative 100,
because that makes no sense logically. So with this simple construct here, I am doing the following while
n is less than 1. The implication is that as soon as n equals 1 or is bigger than 1, I'm going to break out of
this loop, and I've got myself a variable called n containing, essentially, a positive value, 1 through 2
billion or so.

- [3:52:03](https://www.youtube.com/watch?v=undefined&t=13923s) And I can now use this, for


instance, here, change the 4 to an n so now my program is completely dynamic. Let me go ahead and do
make mario, ./mario again. And I'll do 4, still works. I'll do 40, still works. And the difference here with
the do while is if something like this involves getting user input, well, there's no question to ask.

- [3:52:25](https://www.youtube.com/watch?v=undefined&t=13945s) The user hasn't given you


anything yet. So you have to do something first, then check, and break out of the loop if the human has,
for instance, cooperated, in this case. All right, well why don't we escalate to something more like this in
the same game, where you're underground as Mario, and this is like a two-dimensional wall that's
popping up here? It looks like a 3 by 3, for instance, for the sake of discussion.

- [3:52:48](https://www.youtube.com/watch?v=undefined&t=13968s) And it's like, made of bricks, so I'll


use maybe hash symbols this time. Well, it turns out that we can nest-- that is, combine-- some of these
same ideas as follows. Let me go ahead now and change back to this code. And I'm going to keep the do
while loop from before. And I'm going to ask, though, this question, what's the size of this square? I'm
going to assume it's n by n, so 3 by 3, 4 by 4, whatever.

- [3:53:14](https://www.youtube.com/watch?v=undefined&t=13994s) So I'm just going to ask for the


size of this square of bricks. And now, how do I do this? Well, I'm going to go ahead, for instance, and
print out-- how about for int i = 0, i less than n, i++. Let me just keep it simple and print out something
like this, just a single hash symbol that is a brick, and a newline after it.

- [3:53:35](https://www.youtube.com/watch?v=undefined&t=14015s) All right, let's make mario. Run


mario of 3. OK, that's close to being it. I've got a column. All right, but I need it to be wider. So the
solution last time was to get rid of the newline and then maybe put the newline here, after the loop. All
right, so let's do make mario, ./mario, and type in 3 and huh.

- [3:53:56](https://www.youtube.com/watch?v=undefined&t=14036s) All right, so I kind of need to


combine these two ideas somehow. So how might we solve this problem? I want to print rows and
columns, not row or column. How do I do this? Yeah. AUDIENCE: Add another loop in the for loop. DAVID
J. MALAN: Yeah. Add another loop in the for loop, right? If you use one loop conceptually to kind of
count the rows from top to bottom, and then within each row, you then sort of typewriter style-- old
school typewriter-- do like, character, character, character, character horizontally,

- [3:54:30](https://www.youtube.com/watch?v=undefined&t=14070s) I think we could do exactly what


we want to achieve here. So how about this? Let me get rid of this line and get rid of this line for now.
And let me just give myself another loop on the inside. And since I'm already using i, another reasonable
convention here would be to say something like j. So j also gets 0, j is less than n, j++.

- [3:54:49](https://www.youtube.com/watch?v=undefined&t=14089s) And now, what's going to


happen? Let me go ahead and print out just one of these things at a time. And let me save and let me
run this. Let me see how close we are. Make mario 3. OK, three, that's clearly wrong, but I see nine
things there on the screen. So we're close. What's the one fix I need now to move the old school
typewriter head down to the next row when appropriate? What do you think? Yeah, I need one of these
backslash n's.

- [3:55:19](https://www.youtube.com/watch?v=undefined&t=14119s) And let me add some comments


now to help everyone visualize what I've done. For each row, for each column, how about print a brick--
just to kind of explain the logic? And so I add that because now move to next row, I could do something
like this with a backslash n. So here is where the comments, really, my pseudocode actually kind of
illuminates the situation a bit.

- [3:55:47](https://www.youtube.com/watch?v=undefined&t=14147s) Let me go ahead and recompile


mario, ./mario 3, now we're talking. It's not a perfect square, just because these hash symbols are a little
taller than they are wide, but that's just a font detail here. Now I've done something that's quite more
akin to something like this. All right, so let me pause here and see if there are any questions.

- [3:56:08](https://www.youtube.com/watch?v=undefined&t=14168s) Again, the code's getting a little


more complicated, but we're just building more complicated programs like in Scratch, with familiar
puzzle pieces-- some variables, some loops, some conditionals. It's all the same as before. Yeah. Can you
multiply strings in C? No. But ask that same question again in a few weeks when we get to Python, and
the answer will be yes.

- [3:56:28](https://www.youtube.com/watch?v=undefined&t=14188s) Other questions. Yeah. In C, you


must specify the return type, the name of the function, and the inputs, or arguments, to the function in
that order. And if none of them are applicable, you write the word void. So same question as earlier, let
me kick that can a week or so, and we'll come back to that and we'll see why.

- [3:56:44](https://www.youtube.com/watch?v=undefined&t=14204s) But for now, just take on faith that


you need to do that with main. Because main is a little special, similar to the when green flag is clicked. It
too was a little special as well. Yeah AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yes. If you want to get out
of a loop early, you could do this. So let me answer this question this way.

- [3:57:08](https://www.youtube.com/watch?v=undefined&t=14228s) An alternative to a do while loop


would be to do something like this. How about while true-- so do the following forever-- let me go ahead
and get an int from the user for the size of this thing. If n is greater than 0-- that is, a positive integer--
then go ahead and use a new keyword called break. This is identical to what we just did.

- [3:57:36](https://www.youtube.com/watch?v=undefined&t=14256s) It's just a little longer. It's like a


couple extra lines, a lot of them are blank. And so it's just an alternative. But a do while does the same
thing but a little tighter-- if that's in answer to your question. All right, so let's now introduce, finally, a
sequence of problems that I've kind of been brushing under the rug, though we did see a little bit of
evidence of this earlier when we tried to add 2 billion and 2 billion, and it overflowed the number of bits
in an int, so to speak.

- [3:58:02](https://www.youtube.com/watch?v=undefined&t=14282s) Let me go ahead and code up a


program called calculator again. But I'm going to go ahead now and change this to floats. So I'm going to
change x to a float, and I'm going to use get_float. And a float, again, is just a floating point value, which
is a fancy way of saying a real number with a decimal point in it.

- [3:58:19](https://www.youtube.com/watch?v=undefined&t=14299s) And down here, I'm going to go


ahead and use %f for float. And I'm going to go ahead now and do one more thing. Instead of addition, I
want to do something fancier, like division, so divide x by y. And I'm going to give myself another third
float called z, as we did at the beginning of today. And I'm going to print out z instead of x and y
explicitly.

- [3:58:37](https://www.youtube.com/watch?v=undefined&t=14317s) So I'm going to go ahead now and


do make calculator, ./calculator. And let's do something like, oh, 2/3. 2 divided by 3 is 0.66667. So that's
what you would rather expect. Let me run it again, 1/10. All right, so 0.1, and a bunch of zeros. That too
is what you would rather expect. But now let me get a little curious.

- [3:58:58](https://www.youtube.com/watch?v=undefined&t=14338s) It turns out that in C, you can


modify the behavior of these format codes a little bit. By default, you get 6 or so digits. Suppose that you
want to get exactly 2 digits. You can more succinctly say 0.2 before the f and after the percent. This is the
kind of thing that's hard to remember, but you Google it, and you find that, OK, format code for floats
uses 0.2 to do two decimal points.
- [3:59:20](https://www.youtube.com/watch?v=undefined&t=14360s) So let me do make calculator
again, ./calculator. How about 2/3? 0.67. So it handles the display of significant digits for us here. And
now let me go ahead and do 1/10 and 0.10. So it's adhering to that. Well, maybe I really want a lot of
precision, right? I've got a really powerful computer. Let me see 50 numbers after the decimal point.

- [3:59:40](https://www.youtube.com/watch?v=undefined&t=14380s) That's a lot of significant digits.


Let me remake the calculator-- whoops, typo. Let me remake the calculator, ./mario calculator. And how
about 2/3 again? Well, that's interesting. Pretty sure it's supposed to be a 0.6 with a line over it, right? In
grade school math. All right, well, maybe that's just a bug.

- [4:00:01](https://www.youtube.com/watch?v=undefined&t=14401s) How about 1/10? OK, that's really


getting funky. So what's going on? It seems that my program not only can't do addition very well-- we
eventually hit problems in the billions-- we can't even do very precise numbers here. What's going on?
Exactly. In a nutshell, the computer's approximating the answer using that many numbers after the
decimal point.

- [4:00:25](https://www.youtube.com/watch?v=undefined&t=14425s) But the problem fundamentally is


actually very similar to that integer overflow from before. And I'm using that now as a term of art.
Integers can overflow if you're trying to use more bits than you actually have available to you. You sort of
change them all to ones, and then you're out of bits, so to speak.

- [4:00:40](https://www.youtube.com/watch?v=undefined&t=14440s) Same thing here, but in the


different context of floats-- if you only have 32 bits-- or, heck, if we change to double and only have 64
bits, that's a lot of precision, but it's not infinite. And, yet, pretty sure there's an infinite number of real
numbers in the world, which is to say a computer with finite memory cannot possibly represent all
possible numbers in the world.

- [4:01:01](https://www.youtube.com/watch?v=undefined&t=14461s) Because, again, there's not an


infinite number of permutations of 32 or 64 bits. It might be a lot, in the billions or more, but it's still
finite. And so, indeed, this is the computer's closest approximation to what's actually going on there. And
so this is an example of what we would actually generally call floating-point imprecision.

- [4:01:21](https://www.youtube.com/watch?v=undefined&t=14481s) Floating-point imprecision refers


to the inability for computers fundamentally to represent all possible real numbers 100% precisely, at
least by default in languages like C. Thankfully, in the world of scientific computing and so forth, there
are solutions to this problem that just give you more digits.

- [4:01:39](https://www.youtube.com/watch?v=undefined&t=14499s) But the problem fundamentally is


still going to be there. So there's a reason I changed x and y to floats. Let's see what would happen if we
rewound a bit. And instead of using floats for x and y, again, you say integer, so int x and y. And let's go
far back and use get_int as well, thereby giving us integers x and y.

- [4:01:59](https://www.youtube.com/watch?v=undefined&t=14519s) Let's still leave z as a float,


because at the end of the day, we want to be able to handle fractions or floating-point values. But let's
go ahead now and print out this value of z having changed x and y now to ints. make calculator,
./calculator, and let's do, say, 2 for the numerator, 3 for the denominator.
- [4:02:17](https://www.youtube.com/watch?v=undefined&t=14537s) And it's not 0.666, and it's not
even rounding oddly. It's just all zeros this time. So why is that? Well, it turns out that C, when dividing
an integer by an integer, is always going to give you back an integer, an int. The problem is that floating-
point values don't fit in ints. Only the integral part to the left of the decimal point does.

- [4:02:38](https://www.youtube.com/watch?v=undefined&t=14558s) Everything at and beyond the


decimal point itself gets thrown away, known as a feature in C called truncation. When dividing an integer
by an integer, you get back an integer. But if you're trying to then store what's actually a floating point
result in that integer, C is just going to throw away everything at and beyond the decimal point, leaving
us, in this case, with just the 0 from what should have been 0.666666 and so forth.

- [4:03:04](https://www.youtube.com/watch?v=undefined&t=14584s) So let's see one more example, in


fact. Let me go back to my terminal here. Let me do ./calculator again. And let's do 4/3. This time, it
should be 1.33333 and so forth. But let's see, 4 divided by 3, both as integers, this time gives us 1.0000,
but there too the answer should be 1.333. But the floating-point part is getting truncated or thrown
away, leaving us with just 1.

- [4:03:29](https://www.youtube.com/watch?v=undefined&t=14609s) So how do we solve this? Well,


certainly, we could just use floats from the get-go, as I did. But if, by nature of your program, you only
have access to integers-- or maybe even longs, for which the same problem would occur-- what we can
actually do is called type conversion. And we can explicitly tell the computer that we actually want to
treat this int as though it's a floating-point value.

- [4:03:50](https://www.youtube.com/watch?v=undefined&t=14630s) And we can do that for both x and


y. So let me go back to my code here, and I have a couple of options, in fact. I can convert y to a float by
doing this, I can cast y to a float by literally writing the type float inside of parentheses right before the y.
And if I really want to be explicit, I can also do the same to x.

- [4:04:09](https://www.youtube.com/watch?v=undefined&t=14649s) But, strictly speaking, it suffices to


just change one or the other, not necessarily both. Let me go ahead now and do make calculator
again, ./calculator, and let's try 2 divided by 3. And now, we're back to an answer that's closer to correct.
But, indeed, we're still having some rounding issues there.

- [4:04:28](https://www.youtube.com/watch?v=undefined&t=14668s) Let's run it one more time for 4


divided by 3. There too we're closer to the right answer, at least. But we still have that floating-point
imprecision, but that's going to be another problem altogether to solve. And here in a little more detail is
that issue of integer overflow, which is in the context of ints.

- [4:04:44](https://www.youtube.com/watch?v=undefined&t=14684s) Suppose that we think back to


last week when we had three bits, and we counted from 0 to 7, 0, 1, 2, 3, 4, 5, 6, 7. I think I asked the
question, how would we count to 8? Someone proposed, well, we need a fourth bit. That's fine if you
have a fourth bit, if you have access to another light bulb or transistor.

- [4:05:03](https://www.youtube.com/watch?v=undefined&t=14703s) If you don't, though, the next


number after this is technically 1000. But if you don't have space for or hardware for that fourth bit, you
might as well just be representing the number 0. So in the world of integers, if you're only using three
bits, those three bits eventually overflow when you count past 7.
- [4:05:23](https://www.youtube.com/watch?v=undefined&t=14723s) Because what should be 8 can't
fit, so to speak, so it rolls back over to 0. And as arcane as this problem might seem, we humans have
done this a couple of times. You might recall knowing about or reading about the Y2K problem, where a
lot of people thought the world was going to end. Why? Because on January 1st of 2000, a lot of
computers, presumably, were going to update their clocks from 1999 to the year 2000.

- [4:05:49](https://www.youtube.com/watch?v=undefined&t=14749s) The problem is, though, for


decades, for efficiency, we humans were honestly in the habit of not storing years as four digits. Why?
Because that's just a lot of space to waste, especially since centuries don't happen that often. So a lot of
computer systems, especially early on when hardware was very expensive and memory was very tight,
just stored the last two digits of any year.

- [4:06:08](https://www.youtube.com/watch?v=undefined&t=14768s) The problem, of course, on


January 1st of 2000 is that 99 rolls over to 100. But if you don't have room for another digit it's just 00.
And if your code assumes a prefix of 19, well, we just went from the year 1999 back to the year 1900.
Thankfully, long story short, a lot of people wrote a lot of code in a lot of old languages and mostly
warded off this problem, so the world did not end.

- [4:06:34](https://www.youtube.com/watch?v=undefined&t=14794s) The next time the world might


end though, is on January 19, 2038. Now, that might feel like a long time away, but so did the year 2000,
at one point. Why might clocks again break in today's modern computers in 2038, might you think?
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Indeed. So this refers to some number of seconds.

- [4:06:57](https://www.youtube.com/watch?v=undefined&t=14817s) So it turns out that the way


computers generally keep track of time is they count the total number of seconds since the epoch, which
is defined as January 1, 1970. Why? It was just a good year to start counting at, when computers really
came onto the scene. Unfortunately, most computers used 32 bits to count the number of seconds since
January 1, 1970, the implication of which is we can only count up to roughly 2 billion seconds.

- [4:07:23](https://www.youtube.com/watch?v=undefined&t=14843s) 2 billion seconds is going to


happen in 2038, at which point the 1's are going to roll over as follows. That number 2 billion, which is the
max-- because if you're representing positive and negative numbers, recall that you can only count as
high as positive 2 billion or negative 2 billion-- looks like this.

- [4:07:42](https://www.youtube.com/watch?v=undefined&t=14862s) This is roughly the number 2


billion in binary. It's all ones with one zero way over here. If I count one second past that 2 billion
number, give or take-- that means, all right, I add 1, I carry the 1-- it's just like 9's becoming 0's in
decimal. If I keep this sort of simple animation and I keep carrying the 1, carrying the 1, carrying the 1, 1
second after 2 billion seconds, give or take, I have this number in the computer's memory.

- [4:08:08](https://www.youtube.com/watch?v=undefined&t=14888s) So there's still 1 bit that's a 1 all


the way to the left. Unfortunately, that bit often represents negativity, whereby if that first bit is
a 1, that indicates that the rest of it somehow represents a negative number. It's not negative 0.
There's a fancier representation. But a very big, positive number very suddenly becomes a very big,
negative number.

- [4:08:30](https://www.youtube.com/watch?v=undefined&t=14910s) And that number is roughly


negative 2 billion. That means computers in 2038 on that date are going to accidentally think that it's
been negative 2 billion seconds since January 1, 1970, which is going to make computers potentially
think it's 1901. So what is the solution to the 2038 problem, perhaps? Y2K was because we were using
two digits for years.

- [4:08:54](https://www.youtube.com/watch?v=undefined&t=14934s) What about 2038? More bits.


And, thankfully, we're getting a little better at lessons learned here, and computers now are increasingly
using 64 bits. And all of us will be long gone by the time we run out of that number of seconds, so it's
someone else's problem many, many years from now. But that's really the fundamental solution.

- [4:09:12](https://www.youtube.com/watch?v=undefined&t=14952s) If you're running up against


something finite, well, just kick the can further and just give yourself more bits. And, frankly, because
hardware is so much cheaper these days, computers are so much faster, it's not as big of a deal as it
might have been decades ago. But that's indeed the solution.

- [4:09:25](https://www.youtube.com/watch?v=undefined&t=14965s) But this arises in very common


contexts. In fact, let me go ahead and write a real quick program here called pennies. You might think
that just converting dollars to pennies in US currency might be simple, but let me go ahead and do this.
In pennies.c, I'm going to go ahead and include cs50.h. And I'm going to include stdio.h, int main void as
my starting point.

- [4:09:48](https://www.youtube.com/watch?v=undefined&t=14988s) And now down here, I'm going to


do this. I'm going to get a float called amount, and I'm going to ask the user for some amount of dollars,
so a dollar amount, and I'm going to store that in a variable called amount. Then I'm going to simply
convert that amount to pennies by doing, say, how about amount times 100? And then I'm going to go
ahead and print out that the number of pennies is %i-- because that's just an integer in pennies--
backslash n, quote unquote, comma, pennies.

- [4:10:22](https://www.youtube.com/watch?v=undefined&t=15022s) All right, so if I didn't make any


mistakes here, let me make pennies, ./pennies. And suppose I have, say, $0.99, so 0.99. That's 99
pennies. Suppose I have $1.23. That's pretty good. Suppose I have $4.20. Huh. There's that imprecision
issue. And this isn't even that big of an amount. Now, not a big deal if the cashier gives you one penny
less than you're owed, but you can imagine this adding up.

- [4:10:51](https://www.youtube.com/watch?v=undefined&t=15051s) You can imagine this being


worrisome for financial implications, for financial transactions, for scientific measurements and the like.
My program can't even handle this. Well, there are some solutions here. And it looks like what's really
happening-- if I print it out using the %f with a

- [4:11:09](https://www.youtube.com/watch?v=undefined&t=15069s) 0.50 or whatever to see more


decimal points-- presumably, the computer is struggling to represent $4.20 precisely. It's probably
storing 4 dollars and 19.9999-something cents. So it's close, but it's not quite there. So I could at least
solve this by rounding up, for instance. And it turns out there is a round function out there.

- [4:11:31](https://www.youtube.com/watch?v=undefined&t=15091s) And it turns out that it's in a


library called the math library. And you would know this by looking at online documentation and the like,
as we'll point you to. And if I now make pennies again and do ./pennies, I can now do $4.20. And, voila.
Now it's correct. So at least in this context, it seems like a solvable problem.
- [4:11:50](https://www.youtube.com/watch?v=undefined&t=15110s) But it's certainly something I need
to be mindful of, nonetheless. Unfortunately, even professional, full-time programmers over the years
have not been particularly attentive to these kinds of details. And in a class like this, the goal is not just
to teach you programming but to really teach you what's going on underneath the hood, so to speak, so
that you have a bottom-up understanding of how data is represented, how computers are manipulating
it, so that you are not on the failing end of some program having some bug.

- [4:12:16](https://www.youtube.com/watch?v=undefined&t=15136s) And so that we as a society are


not beholden to those kinds of mistakes too. And this happens, unfortunately, all of the time. This is a
Boeing airplane that a few years ago needed to be rebooted after every 248 days. Why? Because this
Boeing airplane software was using a 32-bit integer counting up tenths of a second to keep track of
something or other related to its electrical power.

- [4:12:39](https://www.youtube.com/watch?v=undefined&t=15159s) And, unfortunately, after 248 days


of the airplane being continuously on-- which in the airline industry is apparently not uncommon to
make every dollar count, keeping the planes up and running all the time-- the 32-bit number would roll
over and the power would shut off on the airplane as a side effect because of sort of undefined behavior
in that case.

- [4:13:00](https://www.youtube.com/watch?v=undefined&t=15180s) The temporary solution by Boeing


at the time was apparently, essentially, sort of operating system style, well, have you rebooted your
plane? And that was indeed the fix until they rolled out an actual software patch. This stuff really
matters. And the more hardware we carry around and the more we as a society use these kinds of
devices, the more of these problems we're going to run into down the road.

- [4:13:20](https://www.youtube.com/watch?v=undefined&t=15200s) That's it for CS50. We'll see you


next time. [MUSIC PLAYING] DAVID MALAN: This is CS50 and this is week 2.

- [4:14:45](https://www.youtube.com/watch?v=undefined&t=15285s) Now that you have some


programming experience under your belts in this more arcane language called C, among our goals today
is to help you understand exactly what you have been doing these past several days, wrestling with your
first programs in C, so that you have more of a bottom-up understanding of what some of these
commands do.

- [4:15:00](https://www.youtube.com/watch?v=undefined&t=15300s) And, ultimately, what more we


can do with this language. So this, recall, was the very first program you wrote, I wrote, in this language
called C, much more textual, certainly, than the Scratch equivalent. But at the end of the day, computers--
your Mac, your PC, VS Code-- don't understand this actual code.

- [4:15:19](https://www.youtube.com/watch?v=undefined&t=15319s) What's the format into which we


need to get any program that we write, just to recap? AUDIENCE: [INAUDIBLE] DAVID MALAN: So binary,
otherwise known as machine code. Right? The 0s and 1s that your computer actually does understand.
So somehow we need to get to this format. And up until now, we've been using this command called
make, which is aptly named, because it lets you make programs.

- [4:15:38](https://www.youtube.com/watch?v=undefined&t=15338s) And the invocation of that has


been pretty simple. Make hello looks in your current directory or folder for a file called hello.c, implicitly,
and then it compiles that into a file called hello, which itself is executable, which just means runnable, so
that you can then do ./hello. But it turns out that make is actually not a compiler itself.

- [4:15:58](https://www.youtube.com/watch?v=undefined&t=15358s) It does help you make programs.


But make is this utility that comes on a lot of systems that makes it easier to actually compile code by
using an actual compiler, the program that converts source code to machine code, on your own Mac, or
PC, or whatever cloud environment you might be using. In fact, what make is doing for us, is actually,
running a command automatically known as clang, for C language.

- [4:16:21](https://www.youtube.com/watch?v=undefined&t=15381s) And, so here, for instance, in VS


Code, is that very first program again, this time in the context of a text editor, and I could compile this
with make hello. Let me go ahead and use the compiler itself manually. And we'll see in a moment why
we've been automating the process with make. I'm going to run clang instead.

- [4:16:39](https://www.youtube.com/watch?v=undefined&t=15399s) And then I'm going to run hello.c.


So it's a little different how the compiler's used. It needs to know, explicitly, what the file is called. I'll go
ahead and run clang, hello.c, Enter. Nothing seems to happen, which, generally speaking, is a good thing.
Because no errors have popped up. And if I do ls for list, you'll see there is not a file called hello.

- [4:17:00](https://www.youtube.com/watch?v=undefined&t=15420s) But there is a curiously-named file


called a.out. This is a historical convention, stands for assembler output. And this is, just, the default file
name for a program that you might compile yourself, manually, using clang itself. Let me go ahead now
and point out that that's kind of a stupid name for a program.

- [4:17:18](https://www.youtube.com/watch?v=undefined&t=15438s) Even though it works, ./a.out


would work. But if you actually want to customize the name of your program, we could just resort to
make, or we could do explicitly what make is doing for us. It turns out, some programs, among them
make, support what are called command line arguments, and more on those later today.

- [4:17:35](https://www.youtube.com/watch?v=undefined&t=15455s) But these are literally words or


numbers that you type at your prompt after the name of a program that just influences its behavior in
some way. It modifies its behavior. And it turns out, if you read the documentation for clang, you can
actually pass a -o, for output, command line argument, that lets you specify, explicitly what do you want
your outputted program to be called? And then you go ahead and type the name of the file that you
actually want to compile, from source code to machine code.

- [4:18:01](https://www.youtube.com/watch?v=undefined&t=15481s) Let me hit Enter now. Again,


nothing seems to happen, and I type ls and voila. Now we still have the old a.out, because I didn't delete
it yet. And I do have hello now. So ./hello, voila, runs hello, world again. And let me go ahead and
remove this file. I could, of course, resort to using the Explorer, on the left hand side.

- [4:18:21](https://www.youtube.com/watch?v=undefined&t=15501s) Which, I am in the habit of


closing, just to give us more room to see. But I could go ahead and right-click or control-click on a.out if I
want to get rid of it. Or again, let me focus on the command line interface. And I can use-- anyone recall?
We didn't really use it much, but what command removes a file? AUDIENCE: rm.

- [4:18:37](https://www.youtube.com/watch?v=undefined&t=15517s) DAVID MALAN: So rm for remove.


rm a.out, Enter. Remove regular file a.out? y for yes, Enter. And now, if I do ls again, voila, it's gone. All
right, so, let's now enhance this program to do the second version we ever did, which was to also include
cs50.h, so that we have access to functions like, get string, and the like.

- [4:18:57](https://www.youtube.com/watch?v=undefined&t=15537s) Let me do string name gets get


string, what's your name, question mark. And now, let me go ahead and say hello to that name with our
%s placeholder, comma, name. So this was version 2 of our program last time, that very easily compiled
with make hello, but notice the difference now. If I want to compile this thing myself with clang, using
that same lesson learned, all right, let's do it.

- [4:19:23](https://www.youtube.com/watch?v=undefined&t=15563s) clang -o hello, just so I get a


better name for the program, hello.c, Enter. And a new error pops up that some of you might have
encountered on your own. So it's a bit arcane here, and there's this mention of a cryptic-looking path
with temp for temporary there. But somehow, my issue's in main, as we can see here.

- [4:19:43](https://www.youtube.com/watch?v=undefined&t=15583s) It somehow relates to hello.c.


Even though we might not have seen this language last time in class, but there's an undefined reference
to get string. As though get string doesn't exist. Now, your first instinct might be, well maybe I forgot
cs50.h, but of course, I didn't. That's the very first line of my program.

- [4:19:59](https://www.youtube.com/watch?v=undefined&t=15599s) But it turns out, make is doing


something else for us, all this time. Just putting cs50.h, or any header file at the top of your code, for that
matter, just teaches the compiler that a function will exist. It, sort of, asks the compiler to-- it asks the
compiler to trust that I will, eventually, get around to implementing functions like get string in cs50.h
and, in stdio.h, printf, therein.

- [4:20:22](https://www.youtube.com/watch?v=undefined&t=15622s) But this error here, some kind of


linker command, relates to the fact that there's a separate process for actually finding the 0s and 1s that
cs50 compiled long ago for you, that the authors of this operating system compiled for you, long ago, in the
form of printf. We need to, somehow, tell the compiler that we need to link in code that someone else
wrote, the actual machine code that someone else wrote and then compiled.

- [4:20:48](https://www.youtube.com/watch?v=undefined&t=15648s) So to do that, you'd have to type


-lcs50, for instance, at the end of the command. So additionally, telling clang that, not only do you want
to output a file called hello, and you want to compile a file called hello.c, you also want to quote-
unquote link in a bunch of 0s and 1s that collectively implement get string and printf.

- [4:21:07](https://www.youtube.com/watch?v=undefined&t=15667s) So now, if I hit enter, this time it


compiled OK. And now if I run ./hello, it works as it did last week, just like that. But honestly, this is just
going to get really tedious, really quickly. Notice, already, just to compile my code, I have to run clang -o
hello hello.c -lcs50, and you're going to have to type more things, too.

- [4:21:28](https://www.youtube.com/watch?v=undefined&t=15688s) If you wanted to use the math


library, like, to use that round function, you would also have to do -lm, typically, to specify give me the
math bits that someone else compiled. And the commands just get longer and longer. So moving
forward, we won't have to resort to running clang itself, but clang is, indeed, the compiler.

- [4:21:46](https://www.youtube.com/watch?v=undefined&t=15706s) That is the program that converts


from source code to machine code. But we'll continue to use make because it just automates that
process. And the commands are only going to get more cryptic the more sophisticated and more
featureful your programs get. And make, again, is just a tool that makes all that happen.

- [4:22:04](https://www.youtube.com/watch?v=undefined&t=15724s) Let me pause there to see if


there's any questions before then we take a look further under the hood. Yeah, in front. AUDIENCE: Can
you explain again what the -lcs50-- just why you put that? DAVID MALAN: Sure, let me come back to that
in a moment. What does the -lcs50 mean? We'll come back to that, visually, in just a moment.

- [4:22:20](https://www.youtube.com/watch?v=undefined&t=15740s) But it means to link in the 0s and


1s that collectively implement get string and printf. But we'll see that, visually, in a sec. Yeah, behind you.
AUDIENCE: [INAUDIBLE]. DAVID MALAN: Really good question. How come I didn't have to link in
standard I/O? Because I used printf in version 1. Standard I/O is just, literally, so standard that it's built in,
it just works for free.

- [4:22:42](https://www.youtube.com/watch?v=undefined&t=15762s) CS50, of course, is not. It did not


come with the language C or the compiler. We ourselves wrote it. And other libraries, even though they
might come with the language C, they might not be enabled by default, generally for efficiency purposes.
So you're not loading more 0s and 1s into the computer's memory than you need to.

- [4:22:59](https://www.youtube.com/watch?v=undefined&t=15779s) So standard I/O is special, if you


will. Other questions? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Oh, what does the -o mean? So -o
is shorthand for the English word output, and so -o is telling clang to please output a file called hello,
because that's the next thing I wrote after it. The command line, recall, was clang -o hello, then the name of the
file, then -lcs50.

- [4:23:24](https://www.youtube.com/watch?v=undefined&t=15804s) And this is where these


commands do get and stay fairly arcane. It's just through muscle memory and practice that you'll start to
remember, oh what are the other commands that you-- what are the command line arguments you can
provide to programs? But we've seen this before. Technically, when you run make hello, the program is
called make, hello is the command line argument.

- [4:23:41](https://www.youtube.com/watch?v=undefined&t=15821s) It's an input to the make function,


albeit, typed at the prompt, that tells make what you want to make. Even when I used rm a moment ago,
and did rm of a.out, the command line argument there was called a.out and it's telling rm what to
delete. It is entirely dependent on the programs to decide what their conventions are, whether you use
dash this or dash that, but we'll see over time, which ones actually matter in practice.

- [4:24:05](https://www.youtube.com/watch?v=undefined&t=15845s) So to come back to the first


question about what actually is happening there, let's consider the code more closely. So here is that
first version of the code again, with stdio.h and only printf, so no cs50 stuff yet, until we add it back in
and have the second version, where we actually get the human's name.

- [4:24:24](https://www.youtube.com/watch?v=undefined&t=15864s) When you run this command,


there's a few things that are happening underneath the hood, and we won't dwell on these kinds of
details, indeed, we'll abstract it away by using make. But it's worth understanding from the get-go, how
much automation is going on, so that when you run these commands, it's not magic.
- [4:24:39](https://www.youtube.com/watch?v=undefined&t=15879s) You have this bottom-up
understanding of what's going on. So when we say you've been compiling your code with make, that's a
bit of an oversimplification. Technically, every time you compile your code, you're having the computer
do four distinct things for you. And this is not four distinct things that you need to memorize and
remember every time you run your program, what's happening, but it helps to break it down into
building blocks, as to how we're getting from source code, like C, into 0s and 1s.

- [4:25:06](https://www.youtube.com/watch?v=undefined&t=15906s) It turns out, that when you


compile, quote-unquote, "your code," technically speaking, you're doing four things automatically, and
all at once. Preprocessing it, compiling it, assembling it, and linking it. Just humans decided, let's just call
the whole process compiling. But for a moment, let's consider what these steps are.

- [4:25:24](https://www.youtube.com/watch?v=undefined&t=15924s) So preprocessing refers to this. If


we look at our source code, version 2 that uses the cs50 library and therefore get string, notice that we
have these include lines at top. And they're kind of special versus all the other code we've written,
because they start with hash symbols, specifically. And that's sort of a special syntax that means that
these are, technically, called preprocessor directives.

- [4:25:45](https://www.youtube.com/watch?v=undefined&t=15945s) Fancy way of saying they're


handled special versus the rest of your code. In fact, if we focus on cs50.h, recall from last week that I
provided a hint as to what's actually in cs50.h, among other things. What was the one salient thing that I
said was in cs50.h and therefore, why we were including it in the first place? AUDIENCE: Get string?
DAVID MALAN: So get string, specifically, the prototype for get string.

- [4:26:13](https://www.youtube.com/watch?v=undefined&t=15973s) We haven't made many of our


own functions yet, but recall that any time we've made our own functions, and we've written them
below main in a file, we've also had to, somewhat stupidly, copy paste the prototype of the function at
the top of the file, just to teach the compiler that this function doesn't exist yet-- it does down there-- but
it will exist.

- [4:26:32](https://www.youtube.com/watch?v=undefined&t=15992s) Just trust me. So again, that's


what these prototypes are doing for us. So therefore, in my code, if I want to use a function like get
string, or printf, for that matter, they're clearly not implemented in the same file, they're implemented
elsewhere. So I need to tell the compiler to trust me that they're implemented somewhere else.

- [4:26:48](https://www.youtube.com/watch?v=undefined&t=16008s) And so technically, inside of


cs50.h, which is installed somewhere in the cloud's hard drive, so to speak, that you all are accessing via
VS Code, there's a line that looks like this. A prototype for the get string function that says the name of
the function is get string, it takes one input, or argument, called prompt, and the type of that prompt is a
string.

- [4:27:10](https://www.youtube.com/watch?v=undefined&t=16030s) Get string, not surprisingly, has a


return value and it returns a string. So literally, that line and a bunch of others, are in cs50.h. So rather
than you all having to copy paste the prototype, you can just trust that cs50 figured out what it is. You
can include cs50.h and the compiler is going to go find that prototype for you.

- [4:27:32](https://www.youtube.com/watch?v=undefined&t=16052s) Same thing in standard I/O.


Someone else-- what must clearly be in stdio.h, among other stuff, that motivates our including stdio.h,
too? Yeah? AUDIENCE: Printf. DAVID MALAN: Printf, the prototype for printf, and I'll just change it here in
yellow, to be the same. And it turns out, the format-- the prototype for printf is, actually, pretty fancy,
because, as you might have noticed, printf can take one argument, just something to print; two, if you want
to plug a value into it; three or more.

- [4:28:00](https://www.youtube.com/watch?v=undefined&t=16080s) So the dot dot dot just represents


exactly that. It's not quite as simple a prototype as get string's, but more on that another time. So what
does it mean to preprocess your code? The very first thing the compiler, clang, in this case, is doing for
you when it reads your code top-to-bottom, left-to-right, is it notices, oh, here is hash include, oh, here's
another hash include.

- [4:28:22](https://www.youtube.com/watch?v=undefined&t=16102s) And it, essentially, finds those files


on the hard drive, cs50.h, stdio.h, and does the equivalent of copying and pasting them automatically
into your code at the very top. Thereby teaching the compiler that get string and printf will eventually
exist somewhere. So that's the preprocessing step, whereby, again, it's just doing a find-and-replace of
anything that starts with hash include.

- [4:28:46](https://www.youtube.com/watch?v=undefined&t=16126s) It's plugging in the files there so


that you, essentially, get all the prototypes you need automatically. OK. What does it mean, then, to
compile the results? Because at this point in the story, your code now looks like this in the computer's
memory. It doesn't change your file, it's doing all of this in the computer's memory, or RAM, for you.

- [4:29:04](https://www.youtube.com/watch?v=undefined&t=16144s) But it, essentially, looks like this.


Well the next step is what's, technically, really compiling. Even though again, we use compile as an
umbrella term. Compiling code in C means to take code that now looks like this in the computer's
memory and turn it into something that looks like this. Which is way more cryptic.

- [4:29:23](https://www.youtube.com/watch?v=undefined&t=16163s) But it was just a few decades ago


that, if you were taking a class like CS50 in its earlier form, we wouldn't be using C-- it didn't exist yet-- we
would actually be using this, something called assembly language. And there's different types of, or
flavors of, assembly language. But this is about as low level as you can get to what a computer really
understands, be it a Mac, or PC, or a phone, before you start getting into actual 0s and 1s.

- [4:29:47](https://www.youtube.com/watch?v=undefined&t=16187s) And most of this is cryptic. I


couldn't tell you what this is doing unless I thought it through carefully and rewound mentally, years ago,
from having studied it, but let's highlight a few key words in yellow. Notice that this assembly language
that the computer is outputting for you automatically, still has mention of main and it has mention of get
string, and it has mention of printf.

- [4:30:08](https://www.youtube.com/watch?v=undefined&t=16208s) So there's some relationship to


the C code we saw a moment ago. And then if I highlight these other things, these are what are called
computer instructions. At the end of the day, your Mac, your PC, your phone actually only understands
very basic instructions, like addition, subtraction, division, multiplication, move into memory, load from
memory, print something to the screen, very basic operations.

- [4:30:30](https://www.youtube.com/watch?v=undefined&t=16230s) And that's what you're seeing


here. These assembly instructions are what the computer actually feeds into the brains of the computer,
the CPU, the central processing unit. And it's that Intel CPU, or whatever you have, that understands this
instruction, and this one, and this one, and this one.

- [4:30:47](https://www.youtube.com/watch?v=undefined&t=16247s) And collectively, long story short,


all they do is print hello, world on the screen, but in a way that the machine understands how to do. So
let me pause here. Are there any questions on what we mean by preprocessing, which finds and
replaces the hash include symbols, among others, and compiling, which technically takes your source
code, once preprocessed, and converts it to that stuff called assembly language.

- [4:31:12](https://www.youtube.com/watch?v=undefined&t=16272s) AUDIENCE: [INAUDIBLE] each CPU


has-- DAVID MALAN: Correct. Each type of CPU has its own instruction set. Indeed. And as a teaser, this is
why, at least back in the day, when we used to install software from CD-ROMs, or some other type of
media, this is why you can't take a program that was sold for a Windows computer and run it on a Mac,
or vice-versa.

- [4:31:34](https://www.youtube.com/watch?v=undefined&t=16294s) Because the commands, the


instructions that those two products understand, are actually different. Now Microsoft, or any company,
could generally write code in one language, like C or another, and they can compile it twice, saving a PC
version and saving a Mac version. It's twice as much work and sometimes you get into some
incompatibilities, but that's why these steps are somewhat distinct.

- [4:31:57](https://www.youtube.com/watch?v=undefined&t=16317s) You can now use the same code


and support even different platforms, or systems, if you'd want. All right. Assembly, assembling.
Thankfully, this part is fairly straightforward, at least, in concept. To assemble code, which is step three of
four, that is just happening for you every time you run make or, in turn, clang, this assembly language,
which the computer generated automatically for you from your source code, is turned into 0s and 1s.

- [4:32:21](https://www.youtube.com/watch?v=undefined&t=16341s) So that's the step that, last week,


I simplified and said, when you compile your code, you convert it to source code-- from source code to
machine code. Technically, that happens when you assemble your code. But no one in normal
conversations says that, they just say compile for all of these terms. All right.

- [4:32:39](https://www.youtube.com/watch?v=undefined&t=16359s) So that's assembling. There's one


final step. Even in this simple program of getting the user's name and then plugging it into printf, I'm
using three different people's code, if you will. My own, which is in hello.c. Some of CS50's, which is in
hello.c, sorry-- which is in

- [4:33:03](https://www.youtube.com/watch?v=undefined&t=16383s) cs50.c, which is not a file I've


mentioned, yet, but it stands to reason, that if there's a cs50.h that has prototypes, turns out, the actual
implementation of get string and other things are in cs50.c. And there's a third file somewhere on the
hard drive that's involved in compiling even this simple program. hello.c,

- [4:33:24](https://www.youtube.com/watch?v=undefined&t=16404s) cs50.c, and by that logic, what might


the other be? Yeah? AUDIENCE: stdio? DAVID MALAN: Stdio.c. And that's a bit of a white lie, because
that's such a big, fancy library that there's actually multiple files that compose it, but the same idea, and
we'll take the simplification. So when I have this code, and I compile my code, I get those 0s and 1s that
end up taking

- [4:33:46](https://www.youtube.com/watch?v=undefined&t=16426s) hello.c and turning it, effectively, into
0s and 1s that are combined with cs50.c, followed by stdio.c as well. So let me rewind here. Here might
be the 0s and 1s for my code, the two lines of code that I wrote. Here might be the 0s and 1s for what
cs50 wrote some years ago in cs50.c. Here might be the 0s and 1s that someone wrote for standard I/O
decades ago.

- [4:34:06](https://www.youtube.com/watch?v=undefined&t=16446s) The last and final step is that


linking command that links all of these 0s and 1s together, essentially stitches them together into one
single file called hello, or called a.out, whatever you name it. That last step is what combines all of these
different programmers' 0s and 1s. And my God, now we're really in the weeds.

- [4:34:28](https://www.youtube.com/watch?v=undefined&t=16468s) Who wants to even think about


running code at this level? You shouldn't need to. But it's not magic. When you're running make, there's
some very concrete steps that are happening that humans have developed over the years, over the
decades, that break down this big problem of source code going to 0s and 1s, or machine code, into
these very specific steps.

- [4:34:47](https://www.youtube.com/watch?v=undefined&t=16487s) But henceforth, you can call all of


this compiling. Questions? Or confusion? Yeah? AUDIENCE: Can you explain again what a.out signifies?
DAVID MALAN: Sure. What does a.out signify? a.out is just the conventional, default file name for any
program that you compile directly with a compiler, like clang. It's a meaningless name, though.

- [4:35:08](https://www.youtube.com/watch?v=undefined&t=16508s) It stands for assembler output,


and assembler might now sound familiar from this assembling process. It's a lame name for a computer
program, and we can override it by outputting something like hello, instead. Yeah? AUDIENCE:
[INAUDIBLE] DAVID MALAN: To recap, there are other prototypes in those files, cs50.h,

- [4:35:36](https://www.youtube.com/watch?v=undefined&t=16536s) stdio.h. Technically, they're all included


on top of your file, even though you, strictly speaking, don't need most of them, but they are there, just
in case you might want them. And finally, any other questions? Yeah? AUDIENCE: [INAUDIBLE] DAVID
MALAN: Does it matter what order we're telling the computer to run? Sometimes with libraries, yes, it
matters what order they are linked in together.

- [4:35:56](https://www.youtube.com/watch?v=undefined&t=16556s) But for our purposes, it's really


not going to matter. It's going to-- make is going to take care of automating that process for us. All right.
So with that said, henceforth, compiling, technically, is these four things. But we'll focus on it as a higher
level concept, an abstraction, known as compiling itself.

- [4:36:14](https://www.youtube.com/watch?v=undefined&t=16574s) So another process that we'll now


begin to focus on all the more this week because, invariably, this past week you ran against-- ran up
against some challenges. You probably created your very first bugs, or mistakes, in a program and so let's
focus for a moment on actual techniques for debugging.

- [4:36:28](https://www.youtube.com/watch?v=undefined&t=16588s) As you spend more time this


semester, in the years to come if you continue to program, you're never, frankly, probably, going to write
bug free code, ultimately. Though your programs are going to get more featureful, more sophisticated,
and we're all going to start to make more sophisticated mistakes.
- [4:36:44](https://www.youtube.com/watch?v=undefined&t=16604s) And to this day, I write buggy
code all the time. And I'm always horrified when I do it up here. But hopefully, that won't happen too
often. But when it does, it's a process, now, of debugging, trying to find the mistakes in your program.
You don't have to stare at your code, or shake your fist at your code.

- [4:37:00](https://www.youtube.com/watch?v=undefined&t=16620s) There are actual tools that real


world programmers use to help debug their code and find these faults. So what are some of the
techniques and tools that folks use? Well, as an aside, the idea that a bug in a program is a mistake has
been around for some time. You may have heard this tale from some 50-plus years ago, in 1947.

- [4:37:22](https://www.youtube.com/watch?v=undefined&t=16642s) This is an entry in a log book


written by a famous computer scientist named Grace Hopper, who happened to be the one
to record the very first discovery of a quote-unquote actual bug in a computer. This was like a moth that
had flown into, at the time, a very sophisticated system known as the Harvard Mark II computer, very
large, refrigerator-sized type systems, in which an actual bug caused an issue.

- [4:37:48](https://www.youtube.com/watch?v=undefined&t=16668s) The etymology of bug though,


predates this particular instance, but here you have, as any computer scientist might know, the example
of a first physical bug in a computer. How, though, do you go about removing such a thing? Well, let's
consider a very simple scenario from last time, for instance, when we were trying to print out various
aspects of Mario, like this column of 3 bricks.

- [4:38:07](https://www.youtube.com/watch?v=undefined&t=16687s) Let's consider how I might go


about implementing a program like this. Let me switch back over to VS Code here, and I'm going to run--
write a program. And I'm not going to trust myself, so I'm going to call it buggy.c from the get-go,
knowing that I'm going to mess something up. But I'm going to go ahead and include stdio.h.

- [4:38:25](https://www.youtube.com/watch?v=undefined&t=16705s) And I'm going to define main, as


usual. So hopefully, no mistakes just yet. And now, I want to print those 3 bricks on the screen using just
hashes for bricks. So how about for int i gets 0, i less than or equal to 3, i plus plus. Now, inside of my curly
braces, I'm going to go ahead and print out a hash followed by a backslash n, semicolon.

- [4:38:48](https://www.youtube.com/watch?v=undefined&t=16728s) All right, saving the file, doing


make, buggy, Enter, it compiles. So there's no syntactical errors, my code is syntactically correct. But
some of you have probably seen the logical error already, because when I run this program I don't get
this picture, which was 3 bricks high, I seem to have 4 bricks instead.

- [4:39:10](https://www.youtube.com/watch?v=undefined&t=16750s) Now, this might be jumping out at


you, why it's happening, but I've kept the program simple just so that we don't have to find an actual
bug, we can use a tool to find one that we already know about, in this case. What might be the first
strategy for finding a bug like this, rather than staring at your code, asking a question, trying to think
through the problem? Well, let's actually try to diagnose the problem more proactively.

- [4:39:32](https://www.youtube.com/watch?v=undefined&t=16772s) And the simplest way to do this


now, and years from now, is, honestly, going to be to use a function like printf. Printf is a wonderfully
useful function, not just for printing formatted strings and all that, but for just looking inside the
values of variables that you might be curious about to see what's going on.
- [4:39:49](https://www.youtube.com/watch?v=undefined&t=16789s) So you know what? Let me do
this. I see that there's 4 coming out, but I intended 3. So clearly, something's wrong with my i variable.
So let me be a little more pedantic. Let me go inside of this loop and, temporarily, say something explicit,
like, i is %i, backslash n, and then just plug in the value of i.

- [4:40:09](https://www.youtube.com/watch?v=undefined&t=16809s) Right? This is not the program I


want to write, it's the program I'm temporarily writing, because now I'm going to say make buggy,
./buggy. And if I look, now, at the output, I have some helpful diagnostic information. i is 0, and I get a
hash, i is 1, and I get a hash, 2 and I get a hash, 3 and I get hash.

- [4:40:28](https://www.youtube.com/watch?v=undefined&t=16828s) OK, wait a minute. I'm clearly


going too many steps because, maybe, I forgot that computers are, essentially, counting from 0, and now,
oh, it's less than or equal to. Now you see it, right? Again, trivial example, but just by using printf, you
can see inside of the computer's memory by just printing stuff out like this.

- [4:40:45](https://www.youtube.com/watch?v=undefined&t=16845s) And now, once you've figured it


out, oh, so this should probably be less than 3, or I should start counting from 1, there's any number of
ways I could fix this. But the most conventional is probably just to say less than 3. Now, I can delete my
temporary print statement, rerun make buggy, ./buggy.

- [4:41:03](https://www.youtube.com/watch?v=undefined&t=16863s) And, voila, problem solved. All


right, and to this day, I do this. Whether it's making a command line application, or a web application, or
mobile application, it's very common to use printf, or some equivalent in any language, just to poke
around and see what's inside the computer's memory.

- [4:41:20](https://www.youtube.com/watch?v=undefined&t=16880s) Thankfully, there's more


sophisticated tools than this. Let me go ahead and reintroduce the bug here. And let me reopen my
sidebar at left here. Let me now recompile the code to make sure it's current. And I'm going to run a
command called debug50. Which is a command that's representative of a type of program known as a
debugger.

- [4:41:41](https://www.youtube.com/watch?v=undefined&t=16901s) And this debugger is actually built


into VS Code. And all debug50 is doing for us is automating the process of starting VS Code's built-in
debugger. So this isn't even a CS50-specific tool, we've just given you a debug50 command to make it
easier to start it up from the get-go. And the way you run this debugger is you say debug50, space, and
then the name of the program that you want to debug.

- [4:42:04](https://www.youtube.com/watch?v=undefined&t=16924s) So, in this case, ./buggy. So you


don't mention your .c file. You mention your already-compiled code. And what this debugger is going to
let me do is, most powerfully, walk through my code step-by-step. Because every program we've written
thus far, runs from start to finish, even if I'm not done thinking through each step at a time.

- [4:42:27](https://www.youtube.com/watch?v=undefined&t=16947s) With a debugger, I can actually


click on a line number and say pause execution here, and the debugger will let me walk through my code
one step at a time, one second at a time, one minute at a time, at my own human pace. Which is super
compelling when the programs get more complicated and they might, otherwise, fly by on the screen.
- [4:42:47](https://www.youtube.com/watch?v=undefined&t=16967s) So I'm going to click to the left of
line 5. And notice that these little red dots appear. And if I click on one it stays, and gets even redder. And
I'm going to run debug50 on ./buggy. And in just a moment, you'll see that a new panel opens on the left
hand side. It's doing some configuration of the screen.

- [4:43:06](https://www.youtube.com/watch?v=undefined&t=16986s) Let me zoom out a little bit here


so we can see more on the screen at once. And sometimes, you'll see in VS Code that the debug console
opens up, which looks very cryptic; just go back to the terminal window if that happens, because the
terminal window is where you can still interact with your code. And let's now take a look at what's going
on.

- [4:43:24](https://www.youtube.com/watch?v=undefined&t=17004s) If I zoom in on my buggy.c code


here, you'll notice that we have the same program as before, but highlighted in yellow is line 5. Not a
coincidence, that's the line I set a so-called breakpoint at. The little red dot means break here, pause
execution here. And the yellow line has not yet been executed.

- [4:43:48](https://www.youtube.com/watch?v=undefined&t=17028s) But now, at the top of my


screen, notice these little arrows. There's one for Play. There's one for this, which, if I hover over it, says
Step Over, there's another that's going to say Step Into, there's a third that says Step Out. I'm just going
to use the first of these, Step Over.

- [4:44:03](https://www.youtube.com/watch?v=undefined&t=17043s) And I'm going to do this, and


you'll see that the yellow highlight moved from line 5 to line 7 because now it's ready, but hasn't yet
printed out that hash. But the most powerful thing here, notice, is that top left here. It's a little cryptic,
because there's a bunch of things going on that will make more sense over time, but at the top there's a
section called variables.

- [4:44:23](https://www.youtube.com/watch?v=undefined&t=17063s) Below that, something called


locals, which means local to my current function, main. And notice, there's my variable called i, and its
current value is 0. So now, once I click Step Over again, watch what happens. We go from line 7 back to
line 5. But look in the terminal window, one of the hashes has printed.

- [4:44:44](https://www.youtube.com/watch?v=undefined&t=17084s) But now, it's printed at my own


pace. I can think through this step-by-step. Notice that i has not changed, yet. It's still 0 because the
yellow highlighted line hasn't yet executed. But the moment I click Step Over, it's going to execute line 5.
Now, notice at top left, i has become 1, and nothing has printed, yet, because now, highlighted is line 7.

- [4:45:08](https://www.youtube.com/watch?v=undefined&t=17108s) So if I click Step Over again, we'll


see the hash. If I repeat this process at my own human, comfortable pace, I can see my variables
changing, I can see output changing on the screen, and I can just think about should that have just
happened. I can pause and give thought to what's actually going on without trying to race the computer
and figure it all out at once.

- [4:45:31](https://www.youtube.com/watch?v=undefined&t=17131s) I'm going to go ahead and stop


here because we already know what this particular problem is, and that brings me back to my default
terminal window. But this debugger, let me disable the breakpoint now so it doesn't keep breaking, this
debugger will be your friend moving forward in order to step through your code step-by-step, at your
own pace to figure out where something has gone wrong.
- [4:45:51](https://www.youtube.com/watch?v=undefined&t=17151s) Printf is great, but it gets
annoying if you have to constantly add print this, print this, print this, print this, recompile, rerun it, oh
wait a minute, print this, print this. The debugger lets you do the equivalent, but automatically.
Questions on this debugger, which you'll see all the more hands-on over time? Questions on debugger?
Yeah? AUDIENCE: You were using a Step Over feature.

- [4:46:15](https://www.youtube.com/watch?v=undefined&t=17175s) What do the other features in the


debugger-- DAVID MALAN: Really good question. We'll see this before long, but those other buttons that
I glossed over, step into and step out of, actually let you step into specific functions if I had any more
than main. So if main called a function called something, and something called a function called
something else, instead of just stepping over the entire execution of that function, I could step into it and
walk through its lines of code one by one.

- [4:46:41](https://www.youtube.com/watch?v=undefined&t=17201s) So any time you have a problem


set you're working on that has multiple functions, you can set a breakpoint in main, if you want, or you
can set it inside of one of your additional functions to focus your attention only on that. And we'll see
examples of that over time. All right, so what else? The elephant in the room, so to
speak, is actually a duck in this case.

- [4:47:04](https://www.youtube.com/watch?v=undefined&t=17224s) Why is there this duck and all of


these ducks here? Well, it turns out, a third, genuinely recommended, debugging technique is talking
through problems, talking through code with someone else. Now, in the absence of having a family
member, or a friend, or a roommate who actually wants to hear you talk about code, of all things,
generally, programmers turn to a rubber duck, or other inanimate objects if something animate is not
available.

- [4:47:28](https://www.youtube.com/watch?v=undefined&t=17248s) The idea behind rubber duck


debugging, so to speak, is that simply by looking at your code and talking it through, OK, on line 3, I'm
starting a for loop and I'm initializing i to 0. OK, then, I'm printing out a hash. Just by talking through your
code, step-by-step, invariably, finds you having the proverbial light bulb go off over your head, because
you realize, wait a minute I just said something stupid, or I just said something wrong.

- [4:47:55](https://www.youtube.com/watch?v=undefined&t=17275s) And this is really just a proxy for


any other human, teaching fellow, teacher or friend, colleague. But in the absence of any of those people
in the room, you're welcome to take, on your way out today, one of these little rubber ducks and
consider using it, for real, any time you want to talk through one of your problems in CS50, or maybe life
more generally.

- [4:48:13](https://www.youtube.com/watch?v=undefined&t=17293s) But having it there on your desk is


just a way to help you hear illogic in what you think might, otherwise, be logical code. So printf, the
debugger, and rubber-duck debugging are just three of the ways, you'll see over time, to get to the source
of code that you will write that has mistakes. Which is going to happen, but it will empower you all the
more to solve those mistakes.

- [4:48:36](https://www.youtube.com/watch?v=undefined&t=17316s) All right, any questions on


debugging, in general, or these three techniques? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: What's
the difference between Step Over and Step Into? At the moment, the only one that's applicable to the
code I just wrote is Step Over, because it means step over each line of code.

- [4:48:54](https://www.youtube.com/watch?v=undefined&t=17334s) If, though, I had other functions


that I had written in this program, maybe lower down in the file, I could step into those function calls and
walk through them one at a time. So we'll come back to this with an actual example, but step into will
allow me to do exactly that. In fact, this is a perfect segue to doing a little something like this.

- [4:49:13](https://www.youtube.com/watch?v=undefined&t=17353s) Let me go ahead and open up


another file here. And, actually, we'll use the same, buggy. And we're going to write one other thing
that's buggy, as well. Let me go up here and include, as before, cs50.h. Let me include stdio.h. Let me do
int main(void). So all of this, I think, is correct, so far.

- [4:49:32](https://www.youtube.com/watch?v=undefined&t=17372s) And let's do this, let's give myself


an int called i, and let's ask the user for a negative integer. This is not a function that exists, technically,
yet. But I'm going to assume, for the sake of discussion, that it does. Then, I'm just going to print out,
with %i and a new line, whatever the human typed in.

- [4:49:50](https://www.youtube.com/watch?v=undefined&t=17390s) So at this point in the story, my


program, I think, is correct. Except for the fact that get negative int is not a function in the CS50 library or
anywhere else. I'm going to need to invent it myself. So suppose, in this case, that I declare a function
called get negative int. Its return type, so to speak, should be int, because, as its name suggests, I want
to hand the user back an integer, and it's going to take no input to keep it simple.

- [4:50:15](https://www.youtube.com/watch?v=undefined&t=17415s) So I'm just going to say void there.


No inputs, no special prompts, nothing like that. Let me, now, give myself some curly braces. And let me
do something familiar, perhaps, from problem set 1. Let me give myself a variable, like n, and let me do
the following within this block of code. Assign n the value of get int, asking the user for a negative
integer using get int's own prompt.

- [4:50:39](https://www.youtube.com/watch?v=undefined&t=17439s) And I want to do this while n is


less than 0, because I want to get a negative from the user. And recall, from having used this block in the
past, I can now return n as the very last step to hand back whatever the user has typed in, so long as
they cooperated and gave me an actual negative integer. Now, I've deliberately made a mistake here, and
it's a subtle, silly, mathematical one, but let me compile this program after copying the prototype up to
the top, so I don't make that mistake again.

- [4:51:10](https://www.youtube.com/watch?v=undefined&t=17470s) Let me do make buggy, Enter. And


now, let me do ./buggy. I'll give it a negative integer, like negative 50. Uh-huh. That did not take. How
about negative 5? No. How about 0? All right. So it's, clearly, working backwards, or incorrectly here,
logically. So how could I go about debugging this? Well, I could do what I've done before. I could use my
printf technique and say something explicit like n is %i, new line, comma n, just to print it out. Let me
recompile buggy, let me rerun buggy, let me type in negative 50.

- [4:51:53](https://www.youtube.com/watch?v=undefined&t=17513s) OK, n is negative 50. So that didn't


really help me at this point, because that's the same as before. So let me do this, debug50, ./buggy. Oh,
but I've made a mistake. So I didn't set my breakpoint, yet. So let me do this, and I'll set a breakpoint this
time. I could set it here, on line 8.
- [4:52:12](https://www.youtube.com/watch?v=undefined&t=17532s) Let's do it in main, as before. Let
me rerun debug50, now. On ./buggy. That fancy user interface is going to pop up. It's going to highlight
the line that I set the breakpoint on. Notice that, on the left hand side of the screen, i is defaulting, at the
moment to 0, because I haven't typed anything in, yet.

- [4:52:29](https://www.youtube.com/watch?v=undefined&t=17549s) But let me, now, Step Over this


line that's highlighted in yellow, and you'll see that I'm being prompted. So let's type in my negative 50,
Enter. Notice now that I'm stuck in that function. All right. So clearly, the issue seems to be in my get
negative int function. So, OK, let me stop this execution.

- [4:52:54](https://www.youtube.com/watch?v=undefined&t=17574s) My problem doesn't seem to be


in main, per se, maybe it's down here. So that's fine. Let me set my same breakpoint at line 8. Let me
rerun debug50 one more time. But this time, instead of just stepping over that line, let's step into it. So
notice line 8 is, again, highlighted in yellow. In the past I've been clicking Step Over.

- [4:53:12](https://www.youtube.com/watch?v=undefined&t=17592s) Let's click Step Into, now. When I


click Step Into, boom, now, the debugger jumps into that specific function. Now, I can step through these
lines of code, again and again. I can see what the value of n is as I'm typing it in. I can think through my
logic, and voila. Hopefully, once I've solved the issue, I can exit the debugger, fix my code, and move on.

- [4:53:33](https://www.youtube.com/watch?v=undefined&t=17613s) So Step Over just goes over the


line, but executes it, Step Into lets you go into other functions you've written. So let's go ahead and do
this. We've got a bunch of possible approaches that we can take to solving some problems. Let's go ahead
and pace ourselves today, though. Let's take a five-minute break, here.

- [4:53:52](https://www.youtube.com/watch?v=undefined&t=17632s) And when we come back, we'll


take a look at that computer's memory we've been talking about. See you in five. All right. So let's dive
back in. Up until now, both, by way of week 1 and problem set 1, for the most part, we've just
translated from Scratch into C all of these basic building blocks, like loops and conditionals, Boolean
expressions, variables.

- [4:54:18](https://www.youtube.com/watch?v=undefined&t=17658s) So sort of, more of the same. But


there are features in C that we've already stumbled across, like data types, the types of variables,
that don't exist in Scratch, but that, in fact, do exist in other languages. In fact, a few that we'll see
before long. So to summarize the types we saw last week, recall this little list here.

- [4:54:35](https://www.youtube.com/watch?v=undefined&t=17675s) We had ints, and floats, and


longs, and doubles, and chars, there's also bools and also string, which we've seen a few times. But
today, let's actually start to formalize what these things are, and actually what your Mac and PC are
doing when you manipulate bits as an int versus a char, versus a string, versus something else.

- [4:54:53](https://www.youtube.com/watch?v=undefined&t=17693s) And see if we can't put more tools


into your toolkit, so to speak, so we can start quickly writing more featureful, more sophisticated
programs in C. So it turns out, that on most systems nowadays, though this can vary by actual computer,
this is how large each of the data types, typically, is in C.

- [4:55:16](https://www.youtube.com/watch?v=undefined&t=17716s) When you store a Boolean value,


a 0 or 1, a false or true, it actually uses 1 byte. That's a little excessive, because, strictly speaking,
you only need 1 bit, which is 1/8 of this size. But for simplicity, computers use a whole byte to represent
a bool, true or false. A char, we saw last week, is only 1 byte, or 8 bits. And this is why ASCII, which uses
1 byte, or technically, only 7 bits early on, was confined to at most 256 possible characters.

- [4:55:42](https://www.youtube.com/watch?v=undefined&t=17742s) Notice that an int is 4 bytes, or 32


bits. A float is also 4 bytes or 32 bits. But the things that we call long, it's, literally, twice as long, 8 bytes
or 64 bits. So is a double. A double is 64 bits of precision for floating point values. And a string, for today,
we're going to leave as a question mark.

- [4:56:01](https://www.youtube.com/watch?v=undefined&t=17761s) We'll come back to that, later


today and next week, as to how much space a string takes up, but, suffice it to say, it's going to take up a
variable amount of space, depending on whether the string is short or long. But we'll see exactly what
that means, before long. So here's a photograph of a typical piece of memory inside of your Mac, or PC,
or phone.

- [4:56:22](https://www.youtube.com/watch?v=undefined&t=17782s) Odds are, it might be a little


smaller in some devices. This is known as RAM, or random access memory. Each of these little black
chips on this circuit board, the green thing, these little black chips are where 0s and 1s are actually
stored. Each of those stores some number of bytes. Maybe megabytes, maybe even gigabytes,
nowadays.

- [4:56:39](https://www.youtube.com/watch?v=undefined&t=17799s) So let's focus on one of those


chips, to give us a zoomed in version, thereof. Let's consider the fact that, even though we don't have to
care, exactly, how this kind of thing is made, if this is, like, 1 gigabyte of memory, for the sake of
discussion, it stands to reason that, if this thing is storing 1 billion bytes, 1 gigabyte, then we can number
them, arbitrarily.

- [4:57:02](https://www.youtube.com/watch?v=undefined&t=17822s) Maybe this will be byte 0, 1, 2, 3,


4, 5, 6, 7, 8. Then, maybe, way down here in the bottom right corner is byte number 1 billion. We can
just number these things, as might be our convention. Let's draw that graphically. Not with a billion
squares, but fewer than those. And let's zoom in further, and consider that.

- [4:57:20](https://www.youtube.com/watch?v=undefined&t=17840s) At this point in the story, let's


abstract away all the hardware, and all the little wires, and just think of memory as taking up-- or, rather,
just think of data as taking up some number of bytes. So, for instance, if you were to store a char in a
computer's memory, which was 1 byte, it might be stored at this top left-hand location of this black chip
of memory.

- [4:57:40](https://www.youtube.com/watch?v=undefined&t=17860s) If you were to store something


like an integer that uses 4 bytes, well, it might use four of those bytes, but they're going to be contiguous
back-to-back-to-back, in this case. If you were to store a long or a double, you might, actually, need 8
bytes. So I'm filling in these squares to represent how much memory a given variable of some data
type would take up.

- [4:58:00](https://www.youtube.com/watch?v=undefined&t=17880s) 1, or 4, or 8, in this case, here.


Well, from here, let's abstract away from all of the hardware and really focus on memory as being a grid.
Or, really, like a canvas that we can paint any types of data onto that we want. At the end of the day, all
of this data is just going to be 0s and 1s. But it's up to you and I to build abstractions on top of that.
- [4:58:21](https://www.youtube.com/watch?v=undefined&t=17901s) Things like actual numbers,
colors, images, movies, and beyond. But we'll start lower-level, here, first. Suppose I had a program that
needs three integers. A simple program whose purpose in life is to average your three scores on an
exam, or some such thing. Suppose that your three scores were these, 72, 73, not too bad, and 33, which
is particularly low.

- [4:58:42](https://www.youtube.com/watch?v=undefined&t=17922s) Let's write a program that does


this kind of averaging for us. Let me go back to VS Code, here. Let me open up a file called scores.c. Let
me implement this as follows. Let me include stdio.h at the top, int main(void) as before. Then, inside of
main, let me declare score 1, which is 72. Give me another score, 73.

- [4:59:08](https://www.youtube.com/watch?v=undefined&t=17948s) Then, a third score, called score 3,


which is going to be 33. Now, I'm going to use printf to print out the average of those things, and I can do
this in a few different ways. But I'm going to print out %f, and I'm going to do score 1, plus score 2, plus
score 3, divided by 3, close parentheses semicolon.

- [4:59:28](https://www.youtube.com/watch?v=undefined&t=17968s) Some relatively simple arithmetic


to compute the average of three scores, if I'm curious what my average grade is in the class with these
three assessments. Let me, now, do make scores. All right, so I've somehow made an error already. But
this one is, actually, germane to a problem we, hopefully, won't encounter too frequently.

- [4:59:51](https://www.youtube.com/watch?v=undefined&t=17991s) What's going on here? So


underlined is score 1, plus score 2, plus score 3, divided by 3. Format specifies type double, but the
argument has type int. Well, what's going on here? Because the arithmetic seems to check out. Yeah?
AUDIENCE: So the computer is doing the math, but they basically [INAUDIBLE] just gives out a value at
the end because, well [INAUDIBLE] DAVID MALAN: Correct.

- [5:00:14](https://www.youtube.com/watch?v=undefined&t=18014s) And we'll come back to this in


more detail, but, indeed, what's happening here is I'm adding three ints together, obviously, because I
define them right up here. And I'm dividing by another int, 3, but the catch is, recall that C when it
performs math, treats all of these things as integers.

- [5:00:28](https://www.youtube.com/watch?v=undefined&t=18028s) But integers are not floating point


values. So if you actually want to get a precise average for your score without throwing away the
remainder, everything after the decimal point, it turns out, we're going to have to-- we're going to--
aww-- we're going to have to-- [LAUGHTER] we're going to have to convert this whole expression,
somehow, to a float.

- [5:00:48](https://www.youtube.com/watch?v=undefined&t=18048s) And there's a few ways to do this


but the easiest way, for now, I'm going to go ahead and do this up here, I'm going to change the divide by
3 to divide by 3.0. Because it turns out, long story short, in C, so long as one of the values participating in
an arithmetic expression like this is something like a float, the rest will be promoted to floating point
values as well.

- [5:01:08](https://www.youtube.com/watch?v=undefined&t=18068s) So let me, now, recompile this


code with make scores, Enter. This time it worked OK, because I'm treating a float as a float. Let me
do ./scores, Enter. All right, my average is 59.33333 and so forth. All right. So the math, presumably,
checks out. Floating point imprecision per last week aside. But let's consider the design of this program.
- [5:01:34](https://www.youtube.com/watch?v=undefined&t=18094s) What is, kind of, bad about it, or if
we maintain this program longer term, are we going to regret the design of this program? What might
not be ideal here? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, so in this case, I have hard coded
my three scores. So, if I'm hearing you correctly, this program is only ever going to tell me this specific
average.

- [5:02:04](https://www.youtube.com/watch?v=undefined&t=18124s) I'm not even using something like,


get int or get float to get three different scores, so that's not good. And suppose that we wait until later in the
semester, I think other problems could arise. Yeah? AUDIENCE: Just thinking also somewhat of an issue
that you can't reuse that number. DAVID MALAN: I can't reuse the number because I haven't stored the
average in some variable, which in this program, not a big deal, but certainly, if I wanted to reuse it
elsewhere, that's a problem.

- [5:02:27](https://www.youtube.com/watch?v=undefined&t=18147s) Let's fast-forward again, a little


later in the semester, I don't just have three test scores or exam scores, maybe I have 4, or 5, or 6. Where
might this take us? AUDIENCE: Yeah, if you ever want to have to take the average of any number of
scores other than 3, [INAUDIBLE] DAVID MALAN: Yeah, I've sort of, capped this program at 3.

- [5:02:42](https://www.youtube.com/watch?v=undefined&t=18162s) And honestly, this is, kind of,


bordering on copy paste. Even though the variables, yes, have different names; score 1, score 2, score 3.
Imagine doing this for a whole grade book for a class. Having score 4, 5, 6, 10, 11, 12, 20, 30, that's a
lot of variables. You can imagine just how ugly the code starts to get if you're just defining variable after
variable, after variable.

- [5:03:03](https://www.youtube.com/watch?v=undefined&t=18183s) So it turns out, there are better


ways, in languages like C, if you want to have multiple values stored in memory that happened to be of
the same data type. Let's take a look back at this memory, here, to see what these things might look like
in memory. Here's that grid of memory. Each of these, recall, represents a byte.

- [5:03:21](https://www.youtube.com/watch?v=undefined&t=18201s) To be clear, if I store score 1 in


memory first, how many bytes will it take up? AUDIENCE: [INAUDIBLE] DAVID MALAN: So 4, a.k.a. 32
bits. So I might draw a score 1 as filling up this part of the memory. It's up to the computer as to whether
it goes here, or down there, or wherever. I'm just keeping the pictures clean for today, from the top-left
on down.

- [5:03:40](https://www.youtube.com/watch?v=undefined&t=18220s) If I, then, declare another


variable, called score 2, it might end up over there, also taking up 4 bytes. And then score 3 might end up
here. So that's just representing what's going on inside of the computer's memory. But technically
speaking, to be clear, per week 0, what's really being stored in the computer's memory, are patterns of
0s and 1s.

- [5:03:59](https://www.youtube.com/watch?v=undefined&t=18239s) 32 total, in this case, because 32


bits is 4 bytes. But again, it gets boring quickly to think in and look at binary all the time. So we'll,
generally, abstract this away as just using decimal numbers, in this case, instead. But there might be a
better way to store, not just three of these things, but maybe four, maybe, five, maybe 10, maybe, more,
by declaring one variable to store all of them, instead of 3, or 4, or 5, or more individual variables.
- [5:04:30](https://www.youtube.com/watch?v=undefined&t=18270s) The way to do this is by way of
something known as an array. An array is another type of data that allows you to store multiple values of
the same type back-to-back-to-back. That is, to say, contiguously. So an array can let you create memory
for one int, or two, or three, or even more than that, but describe them all using the same variable
name, the same one name.

- [5:05:01](https://www.youtube.com/watch?v=undefined&t=18301s) So for instance, if, for one


program, I only need three integers, but I don't want to messily declare them as score 1, score 2, score 3,
I can do this, instead. This is today's first new piece of syntax, the square brackets that we're now seeing.
This line of code, here, is similar to int score 1 semicolon, or int score 1 equals 72 semicolon.

- [5:05:25](https://www.youtube.com/watch?v=undefined&t=18325s) This line of code is declaring for


me, so to speak, an array of size 3. And that array is going to store three integers. Why? Because the type
of that array is an int, here. The square brackets tell the computer how many ints you want. In this case,
3. And the name is, of course, scores. Which, in English, I've deliberately pluralized so that I can describe
this array as storing multiple scores, indeed.

- [5:05:52](https://www.youtube.com/watch?v=undefined&t=18352s) So if I want to now assign values


to this variable, called scores, I can do code like this. I can say, scores bracket 0 equals 72, scores bracket
1 equals 73, and scores bracket 2 equals 33. The only thing weird there is, admittedly, the square
brackets which are still new. But we're also, notice, 0 indexing things.

- [5:06:14](https://www.youtube.com/watch?v=undefined&t=18374s) To zero index means to start


counting at 0. When we've talked about that before, our for loops have, generally, been zero indexed.
Arrays in C are zero indexed. And you do not have a choice over that. You can't start counting at 1 in arrays
because you prefer to, you'd be sacrificing one of the elements.

- [5:06:31](https://www.youtube.com/watch?v=undefined&t=18391s) You have to start in arrays


counting from 0. So out of context, this doesn't solve a problem, but it, definitely, is going to once we
have more than, even, three scores here. In fact, let me change this program a little bit. Let me go back
to VS Code. And delete these three lines, here. And replace it with a scores variable that's ready to store
three total integers.

- [5:06:54](https://www.youtube.com/watch?v=undefined&t=18414s) And then, initialize them as


follows, scores bracket 0 is 72, as before, scores bracket 1 is going to be 73, scores bracket 2 is going to
be 33. Notice, I do not need to say int before any of these lines, because that's been taken care of,
already, for me on line 5, where I already specified that everything in this array is going to be an int.

- [5:07:18](https://www.youtube.com/watch?v=undefined&t=18438s) Now, down here, this code needs


to change because I no longer have three variables, score 1, 2, and 3. I have 1 variable, but that I can
index into. I'm going to, here, then, do scores bracket 0, plus scores bracket 1, plus scores bracket 2,
which is equivalent to what I did earlier, giving me back those three integers.

- [5:07:39](https://www.youtube.com/watch?v=undefined&t=18459s) But notice, I'm using the same


variable name, every time. And again, I'm using this new square bracket notation to, quote-unquote,
index into the array to get at the first int, the second int, and the third, and then, to do it again down
here. Now, this program, still not really solving all the problems we describe, I still can only store three
scores, but we'll come back to something like that before long.
- [5:08:00](https://www.youtube.com/watch?v=undefined&t=18480s) But for now, we're just
introducing a new syntax and a new feature, whereby, I can now store multiple values in the same
variable. Well, let's enhance this a bit more. Instead of hard coding these scores, as was identified as a
problem, let's use get int to ask the user for a score. Let's, then, use get int to ask the user for another
score.

- [5:08:23](https://www.youtube.com/watch?v=undefined&t=18503s) Let's use get int to ask the user


for a third score, storing them in those respective locations. And, now, if I go ahead and save this
program, recompile scores, huh. I've messed up, here. Now these errors should be getting a little
familiar. What mistake did I make? Let me give folks a moment. AUDIENCE: cs50.

- [5:08:43](https://www.youtube.com/watch?v=undefined&t=18523s) h DAVID MALAN: cs50.h. That was


not intentional, so still making mistakes all these years later. I need to include cs50.h. Now, I'm going to
go back to the bottom in the terminal window, make scores. OK. We're back in business, ./scores. Now,
the program is getting a little more interesting. So maybe, this year was better and I got a 100, and a 99,
and a 98, and there, my average is 99.0000.

- [5:09:05](https://www.youtube.com/watch?v=undefined&t=18545s) So now, it's a little more dynamic.


It's a little more interesting. But it's still capping the number of scores at three, admittedly. But now, I've
introduced another, sort of, symptom of bad programming. There's this expression in programming, too,
called code smell, where like-- [SNIFFS AIR] something smells a little off.

- [5:09:20](https://www.youtube.com/watch?v=undefined&t=18560s) And there's something off here in


that I could do better with this code. Does anyone see an opportunity to improve the design of this code,
here, if my goal, still, is to get three scores from the user but [SNIFF SNIFF] without it smelling [SNIFF]
kind of bad? Yeah? AUDIENCE: [INAUDIBLE] use a for loop? That way you don't have to copy and paste all
of those scores.

- [5:09:40](https://www.youtube.com/watch?v=undefined&t=18580s) DAVID MALAN: Yeah, exactly.


Those lines of code are almost identical. And honestly, the only thing that's changing is the number, and
it's just incrementing by 1. We have all of the building blocks to do this better. So let me go ahead and
improve this. Let me delete that code. Let me, now, have a for loop.

- [5:09:56](https://www.youtube.com/watch?v=undefined&t=18596s) So for int i gets 0, i less than 3, i


plus plus. Then, inside of this for loop, I can distill all three of those lines into something more generic, like
scores bracket i equals get int, and now, ask the user, just once, via get int, for a score. So this is where
arrays start to get pretty powerful. You don't have to hard code, that is, literally, type in all of these magic
numbers like 0, 1, and 2.

- [5:10:21](https://www.youtube.com/watch?v=undefined&t=18621s) You can start to do it,


programmatically, as you propose with a loop. So now, I've tightened things up. I'm now, dynamically,
getting three different scores, but putting them in three different locations. And so this program,
ultimately, is going to work, pretty much, the same. Make scores, ./scores, and 100, 99, 98, and we're
back to the same answer.

- [5:10:42](https://www.youtube.com/watch?v=undefined&t=18642s) But it's a little better designed,


too. If I really want to nitpick, there's something that still smells, a little bit, here. The fact that I have
indeed, this magic number three, that really has to be the same as this number here. Otherwise, who
knows what's going to go wrong. So what might be a solution, per last week, to cleaning that code up
further, too? AUDIENCE: [INAUDIBLE] the user's discretion how many input scores [INAUDIBLE].

- [5:11:06](https://www.youtube.com/watch?v=undefined&t=18666s) DAVID MALAN: OK, so we could


leave it up to the user's discretion. And so we could, actually, do something like this. Let me take this a
few steps ahead. Let me say something like, int n gets get int, how many scores question mark, then I
could actually change this to an n, and then this to an n, and, indeed, make the whole program dynamic?
Ask the human how many tests have there been this semester? Then, you can type in each of those
scores because the loop is going to iterate that many times.

- [5:11:34](https://www.youtube.com/watch?v=undefined&t=18694s) And then you'll get the average of


one test, two tests, three-- well, lost another-- or however many scores were actually specified by
the user. Yeah, question? AUDIENCE: How many bits or bytes get used in an array? DAVID MALAN: How
many bytes are used in an array? AUDIENCE: [INAUDIBLE] point of doing this is to save [INAUDIBLE]
DAVID MALAN: So the purpose of an array is not to save space.

- [5:12:00](https://www.youtube.com/watch?v=undefined&t=18720s) It's to eliminate having multiple


variable names because that gets very messy quickly. If you have score 1, score 2, score 3, dot, dot, dot,
score 99, that's, like, 99 different variables, potentially, that you could collapse into one variable that has
99 locations. At different indices, or indexes.

- [5:12:20](https://www.youtube.com/watch?v=undefined&t=18740s) As someone would say, the index


for an array is whatever is in the square brackets. AUDIENCE: [INAUDIBLE] DAVID MALAN: So it's a good
question. So if you-- I'm using ints for everything-- and honestly, we don't really need ints for scores
because I'm not likely to get a 2 billion on a test anytime soon.

- [5:12:46](https://www.youtube.com/watch?v=undefined&t=18766s) And so you could use different


data types. And that list we had on the screen, earlier, is not all of them. There's a data type called short,
which is shorter than an int, you could, technically, use char, in some form or other data types as well.
Generally speaking, in the year 2021, these tend to be over optima-- overly optimized decisions.

- [5:13:05](https://www.youtube.com/watch?v=undefined&t=18785s) Everyone just uses ints, even


though no one is going to get a test score that's 2 billion, or more, because int is just, kind of, the go-to.
Years ago, memory was expensive. And every one of your instincts would have been spot on because
memory was so tight. But, nowadays, we don't worry as much about it.

- [5:13:21](https://www.youtube.com/watch?v=undefined&t=18801s) Yeah? AUDIENCE: I have a


question about the error [INAUDIBLE]. Could it-- when you're doing the cash problem on the problem
set-- DAVID MALAN: So what is the difference between dividing two ints and not getting an error, as you
might have encountered in a program like cash, versus dividing two ints and getting an error like I did a
moment ago? The problem with the scenario I created a moment ago was printf was involved.

- [5:13:47](https://www.youtube.com/watch?v=undefined&t=18827s) And I was telling printf to use a


%f, but I was giving printf the result of dividing integers by another integer. So it was printf that was
yelling at me. I'm guessing in the scenario you're describing, for something like cash, printf was not
involved in that particular line of code. So that's the difference, there.
- [5:14:05](https://www.youtube.com/watch?v=undefined&t=18845s) All right. So we, now, have this
ability to create an array. And an array can store multiple values. What, then, might we do that's more
interesting than just storing numbers in memory? Well, let's take this one step further. As opposed to
just storing 72, 73, 33 or 100, 99, 98, at these given locations, because again, an array gives you one
variable name, but multiple locations, or indices therein, bracket 0, bracket 1, bracket 2 on up, if it were
even bigger than that.

- [5:14:36](https://www.youtube.com/watch?v=undefined&t=18876s) Let's, now, start to consider


something more modest, like simple chars. Chars, being 1 byte each, so they're even smaller, they take
up much less space. And, indeed, if I wanted to say a message like, hi I could use three variables. If I
wanted a program to print, hi, H-I exclamation point, I could, of course, store those in three variables,
like c1, c2, c3.

- [5:14:57](https://www.youtube.com/watch?v=undefined&t=18897s) And let's, for the sake of


discussion, let's whip this up real quickly. Let me create a new program, now, in VS Code. This time, I'm
going to call it hi.c. And I'm not going to bother with the CS50 library. I just need the standard I/O one,
for now. int main(void). And then, inside of main, I'm going to, simply, create three variables.

- [5:15:17](https://www.youtube.com/watch?v=undefined&t=18917s) And this is already, hopefully,


striking you as a bad idea. But we'll go down this road, temporarily, with c1, and c2, and, finally, c3.
Storing each character in the phrase I want to print, and I'm going to print this in a different way than
usual. Now I'm dealing with chars. And we've, generally, dealt with strings, which was easier last week.

- [5:15:39](https://www.youtube.com/watch?v=undefined&t=18939s) But %c, %c, %c, will let me print


out three chars, and like c1, c2, and c3. So, kind of, a stupid way of printing out a string. So we already
have a solution to this problem last week. But let's poke around at what's going on underneath the hood,
here. So let's make hi, ./hi. And, voila no surprise.

- [5:15:59](https://www.youtube.com/watch?v=undefined&t=18959s) But we, again, could have done


this last week with a string and just one variable, or even, 0, at that. But let's start converting these
characters to their apparent numeric equivalents like we talked about in week 0 too. Let me modify
these %c's, just to be fun, to be %i's. And let me add some spaces so there are gaps between each of
them.

- [5:16:20](https://www.youtube.com/watch?v=undefined&t=18980s) Let me, now, recompile hi, and let


me rerun it. Just to guess, what should I see on the screen now? Any guesses? Yeah? AUDIENCE: The
ASCII values? DAVID MALAN: The ASCII values. And it's intentional that I keep using the same word, hi,
because it should be, hopefully, the old friends, 72, 73, and 33. Which is to say that C knows about
ASCII, or equivalently, Unicode, and can do this conversion for us automatically.

- [5:16:49](https://www.youtube.com/watch?v=undefined&t=19009s) And it seems to be doing it


implicitly for us, so to speak. Notice that c1, c2 and c3 are, obviously, chars, but printf is able to tolerate
printing them as integers. If I really want it to be pedantic, I could use this technique, again, known as
typecasting, where I can actually convert one data type to another, if it makes logical sense to do so.

- [5:17:11](https://www.youtube.com/watch?v=undefined&t=19031s) And we saw in week 0, chars, or


characters, are just numbers, like 72, 73, and 33. So I can use this parenthetical expression to convert,
incorrectly, [LAUGHTER] three chars to three integers, instead. So that's what I meant to type the first
time. There we go. Strike two, today. So parenthesis, int, close parenthesis says take whatever variable
comes after this, c1, c2, or c3 and convert it to an int.

- [5:17:39](https://www.youtube.com/watch?v=undefined&t=19059s) The effect is going to be no


different, make hi, and then rerunning whoops-- then running ./hi still works the same, but now I'm
explicitly converting chars to ints. And we can do this all day long, chars to ints, floats to ints, ints to
floats. Sometimes, it's equivalent. Other times, you're going to lose information.

- [5:17:58](https://www.youtube.com/watch?v=undefined&t=19078s) Taking a float to an int, just


intuitively, is going to throw away everything after the decimal point, because an int has no decimal
point. But, for now, I'm going to rewind to the version of this that just did implicit-type conversion, or
implicit casting, just to demonstrate that we can, indeed, see the values underneath the hood.

- [5:18:18](https://www.youtube.com/watch?v=undefined&t=19098s) All right. Let me go ahead and do


this, now, the week 1 way. This was kind of stupid. Let's just do printf, quote-unquote-- Actually, let's do
this, string s equals quote-unquote hi, and then let's do a simple printf with %s, printing out s's there. So
now I've rewound to last week, where we began this story, but you'll notice that, if we keep playing
around with this-- whoops, what did I do here? Oh, and let me introduce the CS50 library here, more on
that before long.

- [5:18:48](https://www.youtube.com/watch?v=undefined&t=19128s) Let me go ahead and recompile,


rerun this, we seem to be coding in circles, here. Like, I've just done the same thing multiple, different
ways. But there's clearly an equivalence, then, between sequences of chars and strings. And if you do it
the real pedantic way, you have three different variables, c1, c2, c3, representing H-I exclamation point,
or you can just treat them all together like this h, i, exclamation point.

- [5:19:12](https://www.youtube.com/watch?v=undefined&t=19152s) But it turns out that strings are


actually implemented by the computer in a pretty now familiar way. What might a string actually be as of
this point in the story? Where are we going with this? Let me try to look further back. Yeah, in way back?
Yeah? AUDIENCE: Can a string like this be an array of chars? DAVID MALAN: Yeah, a string might be, and
indeed is, just an array of characters.

- [5:19:39](https://www.youtube.com/watch?v=undefined&t=19179s) So last week we took for granted


that strings exist. Technically, strings exist, but they're implemented as arrays of characters, which
actually opens up some interesting possibilities for us. Because, let me see, let me see if I can do this. Let
me try to print out, now, three integers again. But if string s is but an array, as you propose, maybe I can
do s bracket 0, s bracket 1, and s bracket 2.

- [5:20:04](https://www.youtube.com/watch?v=undefined&t=19204s) So maybe I can start poking


around inside of strings, even though we didn't do this last week, so I can get at those individual values.
So make hi, ./hi and, voila, there we go again. It's the same 72, 73, 33, but now, I'm sort of, hopefully,
like, wrapping my mind around the fact that, all right, a string is just an array of characters, and arrays,
you can index into them using this new square bracket notation.

- [5:20:29](https://www.youtube.com/watch?v=undefined&t=19229s) So I can get at any one of these


individual characters, and, heck, convert it to an integer like we did in week 0. Let me get a little curious
now. What else might be in the computer's memory? Well, let's-- I'll go back to the depiction of these
same things. Here might be how we originally implemented hi with three variables, c1, c2, c3.
- [5:20:53](https://www.youtube.com/watch?v=undefined&t=19253s) Of course, those map to these
decimal digits or, equivalently, these binary values. But what was this looking like in memory? Literally,
when you create a string in memory, like this, string s equals quote-unquote hi, let's consider what's
going on underneath the hood, so to speak. Well, as an abstraction, a string, it's H-I exclamation point
taking up, it would seem, 3 bytes, right? I've gotten rid of the bars, there, because if you think of a string
as a type, I'm just going to use one big box of size 3.

- [5:21:20](https://www.youtube.com/watch?v=undefined&t=19280s) But technically, a string, we've


just revealed, is an array, and the array is of size 3. So technically, if the string is called s, s bracket 0 will
give you the first character, s bracket 1, the second, and s bracket 2, the third. But let me ask this
question now, if this, at the end of the day, is the only thing in your computer memory and the ability,
like a canvas to draw 0s and 1s, or numbers, or characters, or whatever on it, but that's it, like this is
what your Mac, and PC, and phone ultimately reduced to.

- [5:21:50](https://www.youtube.com/watch?v=undefined&t=19310s) Suppose that I'm running a piece


of software, like a text messenger, and now I write down bye exclamation point. Well, where might that
go in memory? Well, it might go here. B-Y-E. And then the next thing I type might go here, here, here and
so forth. My memory just might get filled up, over time, with things that you or someone else are typing.

- [5:22:09](https://www.youtube.com/watch?v=undefined&t=19329s) But then how does the computer


know if, potentially, B-Y-E exclamation point is right after H-I exclamation point where one string ends
and the next one begins? Right? All we have are bytes, or 0s and 1s. So if you were designing this, how
would you implement some kind of delimiter between the two? Or figure out what the length of a string
is? What do you think? AUDIENCE: A nul character.

- [5:22:36](https://www.youtube.com/watch?v=undefined&t=19356s) DAVID MALAN: OK, so the right


answer is use a nul character, and for those who don't know, what does that mean? AUDIENCE: It's
special. DAVID MALAN: Yeah, so it's a special character. Let me describe it as a sentinel character.
Humans decided some time ago that you know what, if we want to delineate where one string ends and
where the next one begins, we just need some special symbol.

- [5:22:56](https://www.youtube.com/watch?v=undefined&t=19376s) And the symbol they'll use is


generally written as backslash 0. This is just shorthand notation for literally eight 0 bits. 0, 0, 0, 0, 0, 0, 0,
0. And the nickname for eight 0 bits, in this context, is nul, N-U-L, so to speak. And we can actually see
this as follows. If you look at the corresponding decimal digits, like you could do by doing out the math
or doing the conversion, like we've done in code, you would see for storing hi, 72, 73, 33, but then 1
extra byte that's sort of invisibly there, but that is all 0s.

- [5:23:31](https://www.youtube.com/watch?v=undefined&t=19411s) And now I've just written it as the


decimal number 0. The implication of this is that the computer is apparently using, not 3 bytes to store a
word like hi, but 4 bytes. Whatever the length of the string is, plus 1 for this special sentinel value that
demarcates the end of the string. So we might draw it like this instead.

- [5:23:51](https://www.youtube.com/watch?v=undefined&t=19431s) And this character is, again,


pronounced nul, or written N-U-L. So that's all, right? If humans, at the end of the day, just have this
canvas of memory, they just needed to decide, all right, well, how do we distinguish one string from
another? It's a lot easier with chars, individually, it's a lot easier with ints, it's even easier with floats,
why? Because, per that chart earlier, every character is always 1 byte.

- [5:24:14](https://www.youtube.com/watch?v=undefined&t=19454s) Every int is always 4 bytes. Every


long is always 8 bytes. How long is a string? Well, hi is 1, 2, 3 with an exclamation point. Bye is 1, 2, 3, 4
with an exclamation point. David is D-A-V-I-D, five without an exclamation point. And so a string can be
any number of bytes long, so you somehow need to draw a line in the sand to separate in memory one
string from another.

- [5:24:41](https://www.youtube.com/watch?v=undefined&t=19481s) So what's the implication of this?


Well, let me go back to code, here. Let's actually poke around. This is a bit dangerous, but I'm going to
start looking at memory locations past my string here. So let me go ahead and recompile, make hi.
Whoops, what did I do here? I forgot a format code. Let me add one more %i.

- [5:25:03](https://www.youtube.com/watch?v=undefined&t=19503s) Now let me go ahead and rerun


make hi, ./hi, Enter. There it is. So you can actually see in the computer, unbeknownst to you previously,
that there's indeed something else going on there. And if I were to make one other variant of this
program-- let's get rid of just this one word and let's have two.

- [5:25:20](https://www.youtube.com/watch?v=undefined&t=19520s) So let me give myself another


string called t, for instance, just this common convention with bye exclamation point. Let me, then print
out with %s. And let me also print out with %s, whoops, printf, print out t, as well. Let me recompile this
program, and obviously the out-- ugh-- this is what happens when I go too fast.

- [5:25:42](https://www.youtube.com/watch?v=undefined&t=19542s) All right, third mistake today,


close quote. As I was missing. Make hi. Fourth mistake today. Make hi. Dot slash hi. OK, voila. Now we
have a program that's printing both hi and bye, only so that we can consider what's going on in the
computer's memory. If s is storing hi and apparently one bonus byte that demarcates the end of that
string, bye is apparently going to fit into the location directly after.

- [5:26:11](https://www.youtube.com/watch?v=undefined&t=19571s) And it's wrapping around, but


that's just an artist's rendition, here. But bye, B-Y-E exclamation point is taking up 1, 2, 3, 4, plus a fifth
byte, as well. All right, any questions on this underlying representation of strings? And we'll contextualize
this, before long, so that this isn't just like, OK, who really cares? This is going to be the source of actually
implementing things.

- [5:26:35](https://www.youtube.com/watch?v=undefined&t=19595s) In fact for problem set 2, like


cryptography, and encryption, and scrambling actual human messages. But some questions first.
AUDIENCE: So normally if you were to not use string, you would just make a character array that would
declare how many characters there are so you know how many characters are going to be there.

- [5:26:52](https://www.youtube.com/watch?v=undefined&t=19612s) DAVID MALAN: A good question,


too and let me summarize as, if we were instead to use chars all the time, we would indeed have to
know in advance how many chars you want for a given string that you're storing, how, then, does
something like get string work, because when we at CS50 wrote the get string function, we obviously don't
know how long the words are going to be that you all are typing in.
- [5:27:09](https://www.youtube.com/watch?v=undefined&t=19629s) It turns out, two weeks from now
we'll see that get string uses a technique known as dynamic memory allocation. And it's going to grow or
shrink the array automatically for you. But more on that soon. Other questions? AUDIENCE: Why are we
using a nul value? Isn't that wasting a byte? DAVID MALAN: Good question.

- [5:27:28](https://www.youtube.com/watch?v=undefined&t=19648s) Why are we using a nul value,


isn't it wasting a byte? Yes. But I claim there's really no other way to distinguish the end of one string
from the start of another, unless we make some sort of notation in memory. All we have, at the end of
the day, inside of a computer, are bits. Therefore, all we can do is spend those bits in some creative way to
solve this problem.

- [5:27:52](https://www.youtube.com/watch?v=undefined&t=19672s) So we're minimally going to spend


1 byte to solve this problem. Yeah? AUDIENCE: How does our memory device know to enter a line when
you type the \n if we don't have it stored as a char? DAVID MALAN: If you don't-- how does the
computer know to move to a next line when you have a \n? So \n, even though it looks like two
characters, it's actually stored as just 1 byte in the computer's memory.

- [5:28:16](https://www.youtube.com/watch?v=undefined&t=19696s) There's a mapping between it and


an actual number. And you can see that, for instance, on the ASCII chart from the other day. AUDIENCE:
So with that being stored would be the [INAUDIBLE]. DAVID MALAN: It would be. If I had put a \n in my
code here, right after the exclamation point here and here, that would actually shift everything in
memory because we would need to make room for a \n here and another one over here.

- [5:28:41](https://www.youtube.com/watch?v=undefined&t=19721s) So it would take two more bytes,


exactly. Other questions? AUDIENCE: So if hi exclamation point is written in binary and ASCII too as 72,
73, 33, if we are to write those numbers in the string, and convert them into binary how would the
computer know what's 72 and what's 8? DAVID MALAN: And what's the last thing you said? AUDIENCE:
8, for example.

- [5:29:08](https://www.youtube.com/watch?v=undefined&t=19748s) DAVID MALAN: It's context


sensitive. So if, at the end of the day, all we're storing is these numbers, like 72, 73, 33, recall that it's up
to the program to decide, based on context, how to interpret them. And I simplified this story in week 0
saying that Photoshop interprets them as RGB colors, and iMessage or a text messaging program
interprets them as letters, and Excel interprets them as numbers.

- [5:29:32](https://www.youtube.com/watch?v=undefined&t=19772s) How those programs do it is by


way of variables like string, and int, and float. And in fact, later this semester, we'll see a data type via
which you can represent a color as a triple of numbers: a red value, a green value, and a blue value. So
we'll see other data types as well. Yeah? AUDIENCE: It seems easy enough to just add a nul thing at the
end of the word, so why do we have integers and long integers? Why can't we make everything variable
in its data size? DAVID MALAN: Really interesting question.

- [5:30:01](https://www.youtube.com/watch?v=undefined&t=19801s) Why could we not just make all


data types variable in size? And some languages, some libraries do exactly this. C is an older language,
and because memory was expensive and limited, the reality was that you gain benefits from just
standardizing the size of these things. You also get performance increases in the sense that if you know
every int is 4 bytes, you can very quickly, and we'll see this next week, jump from one integer to another, to
another in memory just by adding 4 inside of those square brackets.

- [5:30:31](https://www.youtube.com/watch?v=undefined&t=19831s) You can very quickly poke around.


Whereas, if you had variable length numbers, you would have to, kind of, follow, follow, follow, looking
for the end of it. Follow, follow-- you would have to look at more locations in memory. So that's a topic
we'll come back to. But it was generally for efficiency.

- [5:30:45](https://www.youtube.com/watch?v=undefined&t=19845s) Another question, yeah?


AUDIENCE: Why not store the nul character [INAUDIBLE] DAVID MALAN: Good question. Why not store
the-- why not store the nul character at the beginning? You could-- let's see, why not store it at the
beginning? You could do that. You could absolutely-- well, could you do this? If you were to do that at the
beginning-- short answer, no.

- [5:31:22](https://www.youtube.com/watch?v=undefined&t=19882s) OK, now I retract that. No,


because I finally thought of a problem with this. If you store it at the beginning instead, we'll see in just a
moment how you can actually write code to figure out where the end of a string is, and the problem
there is you wouldn't necessarily know if you eventually hit a 0 at the end of the string, because it's the
number 0 in the context of Excel using some memory, or if it's the context of some other data type,
altogether.

- [5:31:44](https://www.youtube.com/watch?v=undefined&t=19904s) So the fact that we've


standardized-- the fact that we've standardized strings as ending with nul means that we can reliably
distinguish one variable from another in memory. And that's actually a perfect segue, now, to
actually using this primitive to build up our own code that manipulates these things that are lower
level.

- [5:32:03](https://www.youtube.com/watch?v=undefined&t=19923s) So let me do this. Let me create a


new file called length. And let's use this basic idea to figure out what the length of a string is after it's
been stored in a variable. So let's do this. Let me include both the CS50 header and the standard I/O
header, give myself int main(void) again here, and inside of main, do this.

- [5:32:26](https://www.youtube.com/watch?v=undefined&t=19946s) Let me prompt the user for a


string s and I'll ask them for a string like their name, here. And then let me name it, more verbosely, name
this time. Now let me go ahead and do this. Let me iterate over every character in this string in order to
figure out what its length is. So initially, I'm going to go ahead and say this, int length equals 0, because I
don't know what it is yet.

- [5:32:52](https://www.youtube.com/watch?v=undefined&t=19972s) So we're going to start at 0. And


then while the following is true-- while-- let me-- do I want to do this? Let me change this to i, just for
clarity, let me do this, while name bracket i does not equal that special nul character. So I typed it on the
slide as N-U-L, but you don't write N-U-L in code, you actually use its numeric equivalent, which is \0 in
single quotes.

- [5:33:18](https://www.youtube.com/watch?v=undefined&t=19998s) While name bracket i does not


equal the nul character, I'm going to go ahead and increment i to i plus plus. And then down here I'm
going to print out the value of i to see what we actually get, printing out the value of i. All right, so what's
going to happen here? Let me run make length. Fortunately no errors.
- [5:33:39](https://www.youtube.com/watch?v=undefined&t=20019s) ./length and let me type in
something like H-I, exclamation point, Enter. And I get 3. Let me try bye, exclamation point, Enter. And I
get 4. Let me try my own name, David, Enter. 5, and so forth. So what's actually going on here? Well, it
seems that by way of this while loop, we are specifying a local variable called i initialized to 0, because we're
figuring out the length of the string as we go.

- [5:34:05](https://www.youtube.com/watch?v=undefined&t=20045s) I'm then asking the question,


does location 0, that is i in the name string, which we now know is an array, does it not equal \0?
Because if it doesn't, that means it's an actual character like H, or B, or D. So let's increment i. Then, let's
come back around to line 9 and let's ask the question again.

- [5:34:25](https://www.youtube.com/watch?v=undefined&t=20065s) Now i equals 1. So does name


bracket 1 not equal \0? Well, if it doesn't, and it won't if it's an i, or a y, or an a, based on what I typed in,
we're going to increment i once more. Fast-forward to the end of the story, once I get to the end of the
string, technically, one space past the end of the string, name bracket i will equal \0.

- [5:34:50](https://www.youtube.com/watch?v=undefined&t=20090s) So I don't increment i anymore, I


end up just printing the result. So what we seem to have here with some low level C code, just this while
loop, is a program that figures out the length of a given string that's been typed in. Let's practice our
abstraction and decompose this into, maybe, a helper function here.

- [5:35:08](https://www.youtube.com/watch?v=undefined&t=20108s) Let me grab all of this code here,


and assume, for the sake of discussion for a moment, that I can call a function now called string length.
And it's the length of the string name that I want to get, and then I'll go ahead and print out, just as before
with %i, the length of that string. So now I'm abstracting away this notion of figuring out the length of
the string.

- [5:35:30](https://www.youtube.com/watch?v=undefined&t=20130s) That's an opportunity for me to


create my own function. If I want to create a function called string length, I'll claim that I want to take a
string as input, and what should I have this function return as its return type? What should string length
presumably return? Yeah? AUDIENCE: Int. DAVID MALAN: An int, right? An int makes sense.

- [5:35:53](https://www.youtube.com/watch?v=undefined&t=20153s) Float really wouldn't make sense


because we're measuring things that are integers. In this case, the length of something. So indeed, let's
have it return an int. I can use the same code as before, so I'm going to paste what I cut earlier in the file.
The only thing I have to change is the name of the variable.

- [5:36:11](https://www.youtube.com/watch?v=undefined&t=20171s) Because now this function, I


decided arbitrarily that I'm going to call it s, just to be more generic. So I'm going to look at s bracket i at
each location. And I don't want to print it at the end, this would be a side effect. What's the line of code I
should include here if I actually want to hand back the total length? Yeah? AUDIENCE: Return i.

- [5:36:30](https://www.youtube.com/watch?v=undefined&t=20190s) DAVID MALAN: Say again?


AUDIENCE: Return i. DAVID MALAN: Return i, in this case. So I'm going to return i, not print it. Because now,
my main function can use the return value stored in length and print it on the next line itself. I just need
a prototype, so that's my one forgivable copy paste here. I'm going to rerun make length.
- [5:36:48](https://www.youtube.com/watch?v=undefined&t=20208s) Hopefully I didn't screw up. I
didn't. ./length, I'll type in hi-- oops-- I'll type in hi, again. That works. I'll type in bye again, and so forth.
So now we have a function that determines the length of a string. Well, it turns out we didn't actually
need this all along. It turns out that we can get rid of my own custom string length function here.

- [5:37:10](https://www.youtube.com/watch?v=undefined&t=20230s) I can definitely delete the whole


implementation down here. Because it turns out, in a file called string.h, which is a new header file
today, we actually have access to a function called, more succinctly, strlen, S-T-R-L-E-N. Which, literally
does that. This is a function that comes with C, albeit in the string.h

- [5:37:30](https://www.youtube.com/watch?v=undefined&t=20250s) header file, and it does what we


just implemented manually. So here's an example of, admittedly, a wheel we just reinvented, but no
more. We don't have to do that. And how do you know what kinds of functions exist? Well, let me pop out of my
browser here to a website that is CS50's incarnation of what are called manual pages.

- [5:37:49](https://www.youtube.com/watch?v=undefined&t=20269s) It turns out that in a lot of


systems, Macs, and Unix, and Linux systems, including the Visual Studio Code instance that we have in
the cloud, there are publicly accessible manual pages for functions. They tend to be written very
expertly, in a way that's not very beginner-friendly. So what we have here at manual.cs50.io

- [5:38:10](https://www.youtube.com/watch?v=undefined&t=20290s) is CS50's version of manual


pages that have this less-comfortable mode that gives you a, sort of, cheat sheet of very frequently used,
helpful functions in C. And we've translated the expert notation to things that a beginner can
understand. So, for instance, let me go ahead and search for a string up at the top here. You'll see that
there's documentation for our own get string function, but more interestingly down here, there's a
whole bunch of string-related functions that we haven't even seen most of, yet.

- [5:38:37](https://www.youtube.com/watch?v=undefined&t=20317s) But there's indeed one here


called strlen, calculate the length of a string. And so if I go to strlen here, I'll see some less-comfortable
documentation for this function. And the way a manual page typically works, whether in CS50's format
or any other system, is you see, typically, a synopsis of what header files you need to use the function.

- [5:38:58](https://www.youtube.com/watch?v=undefined&t=20338s) So you would copy paste these


couple of lines here. You see what the prototype is of the function so that you know what its inputs are,
if any, and its outputs are, if any. Then down below you might see a description, which in this case, is
pretty straightforward. This function calculates the length of s.

- [5:39:12](https://www.youtube.com/watch?v=undefined&t=20352s) Then you see what the return


value is, if any, and you might even see an example, like this one that we've whipped up here. So these
manual pages which are again, accessible here, and we'll link to these in the problem sets moving
forward, are pretty much the place to start when you want to figure out has a wheel been invented
already? Is there a function that might help me solve some problem set problem so that I don't have to
really get into the weeds of doing all of those lower-level steps as I've had.

- [5:39:38](https://www.youtube.com/watch?v=undefined&t=20378s) Sometimes the answer is going to


be yes, sometimes it's going to be no. But again the point of our having just done this together is to
reveal that even the functions you start taking for granted, they all reduce to some of these basic
building blocks. At the end of the day, all that's inside of your computer is 0s and 1s.
- [5:39:55](https://www.youtube.com/watch?v=undefined&t=20395s) We're just learning, now, how to
harness those and how to manipulate them ourselves. Any questions here on this? Any questions at all?
Yeah. AUDIENCE: We did just see [INAUDIBLE] Is that so common that we would have to specify it, or is it
not? DAVID MALAN: Good question. Is it so common that you would have to specify it or not? You do
need to include its header files because that's where all of those prototypes are.

- [5:40:26](https://www.youtube.com/watch?v=undefined&t=20426s) You don't need to worry about


linking it in with -l anything. And in fact, moving forward, you do not ever need to worry about linking in
libraries when compiling your code. We, the staff, have configured make to do all of that for you
automatically. We want you to understand that it is doing it, but we'll take care of all of the -l's for you.

- [5:40:44](https://www.youtube.com/watch?v=undefined&t=20444s) But the onus is on you for the


prototypes and the header files. Other questions on these representations or techniques? Yeah?
AUDIENCE: [INAUDIBLE] exclamation mark. How does it actually define the spaces [INAUDIBLE]? DAVID
MALAN: A good question. If you were to have a string with actual spaces in it that is multiple words,
what would the computer actually do? Well, for this, let me go to asciichart.com.

- [5:41:14](https://www.youtube.com/watch?v=undefined&t=20474s) Which is just a random website


that's my go-to for the first 127 characters of ASCII. This is, in fact, what we had a screenshot of the other
day. And if you look here, it's a little non-obvious, but S-P is space. If a computer were to store a space, it
would actually store the decimal number 32, or technically, the pattern of 0s and 1s that represent the
number 32.

- [5:41:35](https://www.youtube.com/watch?v=undefined&t=20495s) All of the US English keys that you


might type on a keyboard can be represented with a number, and using Unicode you can express even
things like emojis and other languages. Yeah? AUDIENCE: Are only strings followed by nul number, or
let's say we had a series of numbers, would each one of them be accompanied by nuls? DAVID MALAN:
Good question.

- [5:41:53](https://www.youtube.com/watch?v=undefined&t=20513s) Only strings are accompanied by


nuls at the end because every other data type we've talked about thus far is of well defined finite length.
1 byte for char, 4 bytes for ints and so forth. If we think back to last week, we did end the week with a
couple of problems. Integer overflow, because 4 bytes, heck, even 8 bytes is sometimes not enough.

- [5:42:12](https://www.youtube.com/watch?v=undefined&t=20532s) We also talked about floating


point imprecision. Thankfully in the world of scientific computing and financial computing, there are
libraries you can use that draw inspiration from this idea of a string, and they might use 9 bytes for an
integer value or maybe 20 bytes so that you can count really high. But they will then start to manage that
memory for you and what they're really probably doing is just grabbing a whole bunch of bytes and
somehow remembering how long the sequence of bytes is.

- [5:42:37](https://www.youtube.com/watch?v=undefined&t=20557s) That's how these higher-level


libraries work, too. All right, this has been a lot. Let's take one more break here. We'll do a seven-minute
break here. And when we come back, we'll flesh out a few more details. All right. So we just saw strlen as
an example of a function that comes in the string library.

- [5:42:57](https://www.youtube.com/watch?v=undefined&t=20577s) Let's start to take more of these


library functions out for a spin. So we're not relying only on the built ins that we saw last week. Let me
switch over to VS Code and create a file called, say, string.c to apply this lesson learned, as follows. Let
me include cs50.h, stdio.h,

- [5:43:19](https://www.youtube.com/watch?v=undefined&t=20599s) and this new thing, string.h as


well, at the top. I'm going to do the usual int main(void) here. And then in this program suppose, for the
sake of discussion, that I didn't know about %s for printf or, heck, maybe early on there was no %s
format code. And so there was no easy way to print strings. Well, at least if we know that strings are just
arrays of characters, we could use %c as a workaround, a solution to that, sort of, contrived problem.

- [5:43:46](https://www.youtube.com/watch?v=undefined&t=20626s) So let me ask myself for a string s


by using get string here and I'll ask the user for some input. And then, let me print out, say, output, and
all I want to do is print back out what the user typed. Now, the simplest way to do this, of course, is
going to be like last week, printf %s, and plug in the s, and we're done.

- [5:44:05](https://www.youtube.com/watch?v=undefined&t=20645s) But again, for the sake of


discussion, I forgot about, or someone didn't implement %s, so how else could we do this? Well, in
pseudo code, or in English what's the gist of how we could solve this problem, printing out the string s
on the screen without using %s? How might we go about solving this? Just in English, high-level? What
would your pseudo code look like? Yeah? AUDIENCE: You could just print each letter.

- [5:44:34](https://www.youtube.com/watch?v=undefined&t=20674s) DAVID MALAN: OK, so just print


each letter. And maybe, more precisely, some kind of loop. Like, let's iterate over all of the characters in s
and print one at a time. So how can I do that? Well, for int i gets 0, which is kind of the go-to starting point for
most loops, i is less than-- OK, how long do I want to iterate? Well, it's going to depend on what I type in,
but that's why we have strlen now.

- [5:44:56](https://www.youtube.com/watch?v=undefined&t=20696s) So iterate up to the length of s,


and then increment i with plus plus on each iteration. And then let's just print out %c with no new line,
because I want everything on the same line, whatever the character is at s bracket i. And then at the very
end, I'll give myself that new line, just to move the cursor down to the next line so the dollar sign is not
in a weird place.

- [5:45:19](https://www.youtube.com/watch?v=undefined&t=20719s) All right, so let's see if I didn't


screw up any of the code, make string, Enter, so far so good, ./string, and let me type in something like, hi,
Enter. And I see output of hi, too. Let me do it once more with bye, Enter, and that works, too. Notice I
very deliberately and quickly gave myself two spaces here and one space here just because I, literally,
wanted these things to line up properly, and input is shorter than output.

- [5:45:43](https://www.youtube.com/watch?v=undefined&t=20743s) But that was just a deliberate


formatting detail. So this code is correct. Which is a claim I've made before, but it's not well-designed. It
is well-designed in that I'm using someone else's library function, like, I've not reinvented a wheel,
there's no line 15 or below, I didn't implement string length myself.

- [5:46:03](https://www.youtube.com/watch?v=undefined&t=20763s) So I'm at least practicing what I've


preached. But there's still an imperfection, a suboptimality. This one's really subtle though. And you have
to think about how loops work. What am I doing that's not super efficient? Yeah, in back? AUDIENCE:
[INAUDIBLE] over and over again. DAVID MALAN: Yeah, this is a little subtle.
- [5:46:29](https://www.youtube.com/watch?v=undefined&t=20789s) But if you think back to the basic
definition of a for loop and recall when I highlighted things last week, what happens? Well, the first thing
is that i gets set to 0. Then we check the condition. How do we check the condition? We call strlen on s,
we get back an answer like 3 if it's a H-I exclamation point and 0 is less than 3, so that's fine, and then we
print out the character.

- [5:46:51](https://www.youtube.com/watch?v=undefined&t=20811s) Then we increment i from 0 to 1.


We recheck the condition. How do I recheck the condition? I call strlen of s. Get back the same answer,
3. Compare 3 against 1. We're still good. So we print out another character. i gets incremented again, i is
now 2. We check the condition. What's the condition? Well, what's the string length of s? It's still 3.

- [5:47:13](https://www.youtube.com/watch?v=undefined&t=20833s) 2 is still less than 3. So I keep


asking the same question sort of stupidly because the string is, presumably, never changing in length.
And indeed, every time I check that condition, that function is going to get called. And every time, the
answer for hi is going to be 3. 3. 3. So it's a marginal suboptimality, but I could do better, right? Don't ask
multiple times questions that you can remember the answer to.

- [5:47:40](https://www.youtube.com/watch?v=undefined&t=20860s) So how could I remember the


answer to this question and ask it just once? How could I remember the answer to this question? Let me
see. Yeah, back there? AUDIENCE: Store it in a variable. DAVID MALAN: So store it in a variable, right?
That's been our answer most any time we want to keep something around.

- [5:47:56](https://www.youtube.com/watch?v=undefined&t=20876s) So how could I do this? Well, I


could do something like this, int, maybe, length equals strlen of s. Then I can just change this function
call. Let me fix my spelling here. Let me fix this to be comparing against length, and this is now OK.
Because now strlen is only called once on line 9. And I'm reusing the value of that variable, a.k.a.

- [5:48:17](https://www.youtube.com/watch?v=undefined&t=20897s) length, again, and again, and


again. So that's more efficient. Turns out that for loops let you declare multiple variables at once, so we
can do this a little more elegantly all in one line. And this is just some syntactic improvement. I could
actually do something like this, n equals strlen of s, and then I could just say n here or I could call it
length.

- [5:48:39](https://www.youtube.com/watch?v=undefined&t=20919s) But heck, while I'm being succinct


I'm just going to use n for number. So now it's just a marginal change but I've now declared two variables
inside of my loop, i and n. i is set to 0. n is set to the string length of s. But now, hereafter, all of my
condition checks are just, i less than n, i less than n, and n is never changing.

- [5:49:00](https://www.youtube.com/watch?v=undefined&t=20940s) All right, so a marginal


improvement there. Now that I've used this new function, let's use some other functions that might be
of interest. Let me write a quick program here that capitalizes the beginning of-- changes to uppercase
some string that the user types in. So let me code a file called uppercase.c.

- [5:49:20](https://www.youtube.com/watch?v=undefined&t=20960s) Up here I'll use my new friends,


cs50.h, and standard I/O, and string.h. So just as before, int main(void). And
then inside of main, what I'm going to do this time, is let's ask the user for a string s using get string
asking them for the before value. And then let me print out something like after.
- [5:49:44](https://www.youtube.com/watch?v=undefined&t=20984s) So that it-- just so I can see what
the uppercase version thereof is. And then after this, let me do the following, for int, i equals 0, oh, let's
practice that same lesson, so n equals the string length of s, i is less than n, i plus plus. So really, nothing
new, fundamentally yet. How do I now convert characters from lowercase, if they are, to uppercase? In
other words, if I type in hi, H-I in lowercase, I want my program, now, to uppercase everything to capital
H, capital I.

- [5:50:20](https://www.youtube.com/watch?v=undefined&t=21020s) Well how can I go about doing


this? Well you might recall that there is this-- you might recall that there is this ASCII chart. So let's just
consult this real quick on asciichart.com. We looked at this last week. Notice that a-- capital A is 65,
capital B is 66, capital C is 67, and heck, here's lowercase a, lowercase b, lowercase c, and that's 97, 98,
99.

- [5:50:44](https://www.youtube.com/watch?v=undefined&t=21044s) And if I actually do some math,


there's a distance of 32. Right? So if I want to go from uppercase to lowercase, I can do 65 plus 32 will
give me 97 and that actually works out across the board for everything else. 66 plus 32 gets me to 98 or
lowercase b. Or conversely, if you have a lowercase a, and its value is 97, subtract 32 and boom, you
have capital A. So there's some arithmetic involved.

- [5:51:11](https://www.youtube.com/watch?v=undefined&t=21071s) But now that we know that


strings are just arrays, and we know that characters, which are in those arrays, are just binary
representations of numbers, I think we can manipulate a few of these things as follows. Let me go back
to my program here, and first ask the question, if the current character in the array during this loop is
lowercase, let's force it to uppercase.

- [5:51:33](https://www.youtube.com/watch?v=undefined&t=21093s) So how am I going to do that? If


the character at s bracket i, the current location in the array, is greater than or equal to lowercase a, and
s bracket i is less than or equal to lowercase z, kind of a weird Boolean expression but it's completely
legitimate, because in this array s is a whole bunch of characters that the humans typed in, because
that's what a string is, greater than or equal to a might be a little nonsensical because when have you
ever compared numbers to letters? But we know from week 0 lowercase a is 97, lowercase z is, what is
it, 1?

- [5:52:12](https://www.youtube.com/watch?v=undefined&t=21132s) I don't even remember.


AUDIENCE: 132. DAVID MALAN: What's that? AUDIENCE: 132? DAVID MALAN: 132, We know. And so
that would allow us to answer the question is the current letter lowercase? All right, so let me answer
that question. If it is, what do I want to print out? I don't want to print out the letter itself, I want to print
out the letter minus 32, right? Because if it happens to be a lowercase a, 97, 97 minus 32 gives me 65,
which is uppercase A, and I know that just from having stared at that chart in the past.

- [5:52:43](https://www.youtube.com/watch?v=undefined&t=21163s) Else if the character is not


between little a and little z, I'm just going to print out the character itself by printing s bracket i. And at
the very end of this, I'm going to print out a new line just to move the cursor to the next line. So again,
it's a little wordy. But this loop here, which I borrowed from our code previously, just iterates over the
string, a.k.a.
- [5:53:06](https://www.youtube.com/watch?v=undefined&t=21186s) array, character-by-character,
through its length. This line 11 here is just asking the question if that current character, the i-th character
of s, is greater than or equal to little a and less than or equal to little z, that is between 97 and 132, then
we're going to go ahead and force it to uppercase instead.

- [5:53:29](https://www.youtube.com/watch?v=undefined&t=21209s) All right, and let me zoom out


here for just a second. And sorry, I misspoke: 122, which is what you might have said. There's only 26
letters. So 122 is little z. Let me go ahead now and compile and run this program. So make uppercase,
./uppercase, and let me type in hi in lowercase, Enter. And there's the capitalized version, thereof.

- [5:53:53](https://www.youtube.com/watch?v=undefined&t=21233s) Let me do it again, with my own


name in lowercase, and now it's capitalized as well. Well, what could we do to improve this? Well. You
know what? Let's stop reinventing wheels. Let's go to the manual pages. So let me go here and search
for something like, I don't know, lowercase. And there I go.

- [5:54:10](https://www.youtube.com/watch?v=undefined&t=21250s) I did some auto complete here,


our little search box is saying that, OK there's an is-lower function, check whether a character is
lowercase. Well how do I use this? Well let me check, is lower, now I see the actual man page for this
function. Now we see, include ctype.h. So that's the protot-- that's the header file I need to include.

- [5:54:29](https://www.youtube.com/watch?v=undefined&t=21269s) This is the prototype for is-lower,


it apparently takes a char as input and returns an int. Which is a little weird. I feel like is-lower should
return true or false. So let's scroll down to the description and return value. It returns, oh this is
interesting. And this is a convention in C.

- [5:54:50](https://www.youtube.com/watch?v=undefined&t=21290s) This function returns a non-zero


int if c is a lowercase letter and 0 if c is not a lowercase letter. So it returns non-zero. So like 1, negative
1, something that's not 0 if c is a lowercase letter, and 0 if it is not a lowercase letter. So how can we use
this building block? Let me go back to my code here. Let me add this file, include ctype.h.

- [5:55:14](https://www.youtube.com/watch?v=undefined&t=21314s) And down here, let me get rid of


this cryptic expression, which was kind of painful to come up with, and just ask this, is-lower s bracket i?
That should actually work but why? Well is-lower, again, returns a non-zero value if the letter is
lowercase. Well, what does that mean? That means it could return 1.

- [5:55:38](https://www.youtube.com/watch?v=undefined&t=21338s) It could return negative 1. It could


return 50 or negative 50. It's actually not precisely defined, why? Just, because. This was a common
convention to use 0 to represent false and use any other value to represent true. And so it turns out, that
inside of Boolean expressions, if you put a value like a function call like this, that returns 0, that's going
to be equivalent to false.

- [5:56:01](https://www.youtube.com/watch?v=undefined&t=21361s) It's like the answer being no, it is


not lower. But you can also, in parentheses, put the name of the function and its arguments, and not
compare it against anything. Because we could do something like this, well if it's not equal to 0, then it
must be lowercase. Because that's the definition, if it returns a non-zero value, it's lowercase.

- [5:56:20](https://www.youtube.com/watch?v=undefined&t=21380s) But a more succinct way to do


that is just a bit more like English. If it is lower, then print out the character minus 32. So this would be
the common way of using one of these is- functions to check if the answer is true or false. AUDIENCE:
[INAUDIBLE] DAVID MALAN: OK, well we might be done. OK. AUDIENCE: [INAUDIBLE] DAVID MALAN: No.

- [5:56:42](https://www.youtube.com/watch?v=undefined&t=21402s) So it's not necessarily 1. It would


be incorrect to check for 1, or negative 1, or anything else. You want to check for the opposite of 0. So
not equal 0. Or more succinctly, like I did by just putting it into parentheses. Let me see what happens
here. So this is great, but some of you might have spotted a better solution to this problem.

- [5:57:04](https://www.youtube.com/watch?v=undefined&t=21424s) A moment ago when we were on


the manual pages searching for things related to lowercase, what might be another building block we
can employ here? Based on what's on the screen here? Yeah? AUDIENCE: To-upper. DAVID MALAN: So
to-upper. There's a function that would literally do the uppercasing thing for me so I don't have to get
into the weeds of negative 32, plus 32.

- [5:57:24](https://www.youtube.com/watch?v=undefined&t=21444s) I don't have to consult that chart.


Someone has solved this problem for me in the past. And let's see if I can actually get back to it. There
we go. Let me go ahead, now, and use this. So instead of doing s bracket i minus 32, let's use a function
that someone else wrote, and just say to-upper, s bracket i.

- [5:57:45](https://www.youtube.com/watch?v=undefined&t=21465s) And now it's going to do the


solution for me. So if I rerun make uppercase, and then do, slowly, ./uppercase, type in hi, now it's
working as expected. And honestly, if I read the documentation for to-upper by going back to its man
page, or manual page, what you'll see is that it says if it's lowercase, it will return the uppercase version
thereof.

- [5:58:09](https://www.youtube.com/watch?v=undefined&t=21489s) If it's not lowercase, it's already


uppercase, it's punctuation, it will just return the original character. Which means, thanks to this
function, I can actually tighten this up significantly, get rid of all of my conditional there, and just print
out the to-upper return value, and leave it to whoever wrote that function to figure out if something's
uppercase or lowercase.

- [5:58:34](https://www.youtube.com/watch?v=undefined&t=21514s) All right, questions on these kinds


of tricks? Again, it all reduces to week 0 basics, but we're just building these abstractions on top. Yeah?
AUDIENCE: I'm wondering if there's any way just to import all packages under a certain subdomain
instead of having to do multiple [INAUDIBLE] statements, kind of like a star [INAUDIBLE] DAVID MALAN:
Yes.

- [5:58:54](https://www.youtube.com/watch?v=undefined&t=21534s) Unfortunately, no. There is no


easy way in C to say, give me everything. That was for, historically, performance reasons. They want you
to be explicit as to what you want to include. In other languages like Python, Java, one of which we'll see
later this term, you can say, give me everything. But that, actually, tends not to be best practice because it
can slow down execution or compilation of your code.

- [5:59:14](https://www.youtube.com/watch?v=undefined&t=21554s) Yeah? AUDIENCE: Does to-upper


accommodate for special characters? DAVID MALAN: Ah. Does to-upper accommodate special characters
like punctuation? Yes. If I read the documentation more pedantically, we would see exactly that. It will
properly hand me back an exclamation point, even if I passed it in. So if I do make uppercase here, and
let me do ./upper, sorry--
- [5:59:33](https://www.youtube.com/watch?v=undefined&t=21573s) ./uppercase, hi
with an exclamation point, it's going to handle that, too, and pass it through unchanged. Yeah? AUDIENCE: Do
we have access to a function that would do all of that but just to the screen rather than to [INAUDIBLE] DAVID
MALAN: Really good question, too. No, we do not have access to a function that at least comes with C or
comes with CS50's library that will just force the whole thing to uppercase.

- [5:59:56](https://www.youtube.com/watch?v=undefined&t=21596s) In C, that's actually easier said


than done. In Python, it's trivial. So stay tuned for another language that will let us do exactly that. All
right, so what does this leave us with? There's just a-- let's come full circle now, to where we began
today where we were talking about those command line arguments.

- [6:00:12](https://www.youtube.com/watch?v=undefined&t=21612s) Recall that we talked about rm


taking a command line argument, the file you want to delete. We talked about clang taking command line
arguments that, again, modify the behavior of the program. How is it that maybe you and I can start to
write programs that actually take command line arguments? Well here is where I can finally explain why
we've been typing int main(void) for the past week and just asking that you take on faith that it's just the
way you do things.

- [6:00:39](https://www.youtube.com/watch?v=undefined&t=21639s) Well, by default in C, at least the


most recent versions thereof, there's only two official ways to write main functions. You might see other
formats online, but they're generally not consistent with the current specification. This, again, was sort
of a boilerplate for the simplest function we might write last week, and recall that we've been doing this
the whole time.

- [6:01:00](https://www.youtube.com/watch?v=undefined&t=21660s) (Void) What that (void) means, for


all of the programs I have written thus far and you have written thus far, is that none of our programs
that we've written take command line arguments. That's what the void there means. It turns out that
main is the way you can specify that your program does, in fact, take command line arguments, that is
words after the command in your terminal window.

- [6:01:24](https://www.youtube.com/watch?v=undefined&t=21684s) If you want to actually not use get


int or get string, you want the human to be able to say something, like hello, David and hit Enter. And
just run-- print hello, David on the screen. You can use command line arguments, words after the
program name on your command line. So we're going to change this in a moment to be something more
verbose, but something that's now a bit more familiar syntactically.

- [6:01:48](https://www.youtube.com/watch?v=undefined&t=21708s) If you change that (void) in main


to be this incantation instead, int, argc, comma, string, argv, open bracket, close bracket, you are now
giving yourself access to writing programs that take command line arguments. Argc, which stands for
argument count is going to be an integer that stores how many words the human typed at the prompt.

- [6:02:11](https://www.youtube.com/watch?v=undefined&t=21731s) C automatically gives that to


you. String argv stands for argument vector, that's going to be an array of all of the words that the
human typed at the prompt. So with today's building block of an array, we have the ability now to let the
humans type as many words, or as few words, as they want at the prompt.

- [6:02:28](https://www.youtube.com/watch?v=undefined&t=21748s) C is going to automatically put


them in an array called argv, and it's going to tell us how many words there are in an int called argc. The
int, as the return type here, we'll come back to in just a moment. Let's use this definition to make,
maybe, just a couple of simple programs. But in problem set 2, we will actually use this to control the
behavior of your own code.

- [6:02:51](https://www.youtube.com/watch?v=undefined&t=21771s) Let me code up a file called argv.0


just to keep it aptly named. Let me include cs50.h. Let me go ahead and include-- oops. That is not the
right name of a program, let's start that over. Let's go ahead and code up argv.c. And here we have--
include cs50.h, include stdio.h, int, main, not void, let's actually say int, argc, string, argv, open bracket,
close bracket.

- [6:03:24](https://www.youtube.com/watch?v=undefined&t=21804s) No numbers in between because


you don't know, in advance, how many words the human's going to type at their prompt. Now let's go
ahead and do this. Let's write a very simple program that just says, hello, David, hello, Carter, whoever
the name is that gets typed. But not using get string, let's instead have the human just type their name
at the prompt, just like rm, just like clang, just like make, so it's just one and done when you hit Enter.

- [6:03:46](https://www.youtube.com/watch?v=undefined&t=21826s) No additional prompts. Let me go


ahead then and do this, printf, quote-unquote, hello, comma, and instead of world today, I want to print
out whatever the human typed in. So let's go ahead and do this, argv, bracket 0 for now. But I don't think
this is quite what I want because, of course, that's going to literally print out argv, bracket, 0, bracket.

- [6:04:13](https://www.youtube.com/watch?v=undefined&t=21853s) I need a placeholder, so let me


put %s here and then put that here. So if argv is an array, but it's an array of strings, then argv bracket 0
is itself a single string. And so it can be plugged into that %s placeholder. Let me go ahead and save my
program. And compile argv, so far, so good. Let me now type in my name after the name of the program.

- [6:04:37](https://www.youtube.com/watch?v=undefined&t=21877s) So no get string. I'm literally


typing an extra word, my own name at the prompt, Enter. OK, it's apparently a little buggy in a couple of
ways. I forgot my \n but that's not a huge deal. But apparently, inside of argv is literally everything that
humans typed in including the name of the program.

- [6:04:56](https://www.youtube.com/watch?v=undefined&t=21896s) So logically, how do I print out


hello, David, or hello so-and-so and not the actual name of the program? What needs to change here?
Yeah? AUDIENCE: Change the index to 1. DAVID MALAN: Yeah. So presumably index to 1, if that's the
second thing I, or whichever human, has typed at the prompt. So let's do make argv again, ./argv, Enter.

- [6:05:16](https://www.youtube.com/watch?v=undefined&t=21916s) Huh. Hello, null. So this is another


form of null. But this is user error, now, on my part. I didn't do exactly what I said I would. Yeah?
AUDIENCE: You forgot the parameter. DAVID MALAN: Yeah, I forgot the parameter. So that's actually, hm.
I should probably deal with that, somehow, so that people aren't breaking my program and printing out
random things, like null.

- [6:05:35](https://www.youtube.com/watch?v=undefined&t=21935s) But if I do say ./argv David, now


you see hello, David. I can get a little curious, like what's at location 2? Well we can see, make argv,
bracket, ./argv, David, Enter. All right, so just nothing is there. But it turns out, in a couple of weeks, we'll
start really poking around memory and see if we can't crash programs deliberately because nothing is
stopping me from saying, oh what's at location 2 million, for instance? We could really start to get
curious.
- [6:06:03](https://www.youtube.com/watch?v=undefined&t=21963s) But for now, we'll do the right
thing. But let's now make sure the human has typed in the right number of words. So let's say this, if argc
equals 2, that is the name of the program and one more word after that, go ahead and trust that in argv
1, as you proposed, is the person's name. Else, let's go ahead and default here to something simple and
basic, like, well, if we don't get a name from the user, just say hello, world, like always.

- [6:06:32](https://www.youtube.com/watch?v=undefined&t=21992s) So now we're programming


defensively. This time the human, even if they screw up, they don't give us a name or they give us too
many names, we're just going to say hello, world, because I now have some error handling here.
Because, again, argc is argument count, the number of words, total, typed at the command line.

- [6:06:48](https://www.youtube.com/watch?v=undefined&t=22008s) So make, argv, ./argv. Let me


make the same mistake as before. OK. I don't get this weird null behavior. I get something well-defined. I
could now do David. I could do David Malan, but that's not currently supported. I would need to alter my
logic to support more than just two words after the prompt. So what's the point of this? At the moment,
it's just a simple exercise to actually give myself a way of taking user input when they run the program.

- [6:07:15](https://www.youtube.com/watch?v=undefined&t=22035s) Because, consider, it's just more


convenient in this new, command-line-interface world. If you had to use get string every time you
compile your code, it'd be kind of annoying, right? You type make, then you might get a prompt, what
would you like to make? Then you type in hello, or cash, or something else, then you hit Enter, it just
really slows the process.

- [6:07:34](https://www.youtube.com/watch?v=undefined&t=22054s) But in this command-line-


interface world, if you support command line arguments, then you can use these little tricks. Like,
scrolling up and down in your history with your arrow keys. You can just type commands more quickly
because you can do it all at once. And you don't have to keep prompting the user, more pedantically, for
more and more info.

- [6:07:52](https://www.youtube.com/watch?v=undefined&t=22072s) So any questions then on


command line arguments? Which, finally, reveals why we had (void) initially, but what more we can now
put in main. That's how you take command line arguments. Yeah? AUDIENCE: If you were to put-- if you
were to use argv, and you were to put integers inside of it, would it still give you, like, a string? Would
that still be considered string? Or would you consider [INAUDIBLE]? DAVID MALAN: Yes.

- [6:08:18](https://www.youtube.com/watch?v=undefined&t=22098s) If you were to type at the


command line something like, not a word, but something like the number 42, that would actually be
treated as a string. Why? Because again, context matters. So if your program is currently manipulating
memory as though it's characters or strings, whatever those patterns of 0s and 1s are, they will be
interpreted as ASCII text, or Unicode text.

- [6:08:41](https://www.youtube.com/watch?v=undefined&t=22121s) If we therefore go to the chart


here, that might make you wonder, well, then how do you distinguish numbers from letters in the
context of something like chars and strings? Well, notice 65 is A, 97 is a, but also 49 is 1, and 50 is 2. So
the designers of ASCII, and then later Unicode, realized well wait a minute, if we want to support
programs that let you type things that look like numbers, even though they're not technically ints or
floats, we need a way in ASCII and Unicode to represent even numbers.
- [6:09:15](https://www.youtube.com/watch?v=undefined&t=22155s) So here are your numbers. And
it's a little silly that we have numbers representing other numbers. But again, if you're in the world of
letters and characters, you've got to come up with a mapping for everything. And notice here, here's the
dot. Even if you were to represent 1.23

- [6:09:31](https://www.youtube.com/watch?v=undefined&t=22171s) as a string, or as characters,


even the dot now is going to be represented as an ASCII character. So again, context here matters. All
right, one final example to tease apart what this int is and what it's been doing here for so long. So I'm
going to add one bit of logic to a new file that I'm going to call exit.c. So an exit.c.

- [6:09:53](https://www.youtube.com/watch?v=undefined&t=22193s) We're going to introduce


something generally known as an exit status. It turns out this is not a feature we've used yet, but
it's just useful to know about. Especially when automating tests of your own code. When it comes to
figuring out if a program succeeded or failed. It turns out that main has one more feature we haven't
leveraged.

- [6:10:13](https://www.youtube.com/watch?v=undefined&t=22213s) An ability to signal to the user


whether something was successful or not. And that's by way of main's return value. So I'm going to modify
this program as follows, like this. Suppose I want to write a similar program that requires that the user
type a word at the prompt. So that argc has to be 2 for whatever design purpose.

- [6:10:37](https://www.youtube.com/watch?v=undefined&t=22237s) If argc does not equal 2, I want to


quit out of my program prematurely. I want to insist that the user operate the program correctly. So I
might give them an error message like, missing command line argument \n. But now I want to quit out of
the program. Now how can I do that? The right way, quote-unquote, to do that is to return a value from
main.

- [6:11:02](https://www.youtube.com/watch?v=undefined&t=22262s) Now it's a little weird because no


one called main yet, right, main just gets called automatically, but the convention is anytime something
goes wrong in a program you should return a non-zero value from main. 1 is fine as a go-to. We don't
need to get into the weeds of having many different exit statuses, so to speak.

- [6:11:20](https://www.youtube.com/watch?v=undefined&t=22280s) But if you return 1, that is a clue


to the system, the Mac, the PC, the cloud device that something went wrong. Why? Because 1 is not 0.
If everything works fine, like, let's go ahead and print out hello comma %s like before, quote-unquote
argv bracket 1. So this is just a version of the program without an else.

- [6:11:43](https://www.youtube.com/watch?v=undefined&t=22303s) So this is the same as doing,


essentially, an else here like I did earlier. I want to signal to the computer that all is well. And so I return
0. But strictly speaking, if I'm already returning here, I don't technically need, if I really want to be nit
picky, I don't technically need the else because the only way I'm going to get to line 11 is if I didn't
already return.

- [6:12:06](https://www.youtube.com/watch?v=undefined&t=22326s) So what's going on here? The only


new thing here logically, is that for the first time ever, I'm returning a value from main. That's something I
could always have done because main has always been defined by us as returning an int.
By default, main automatically, sort of secretly, returns 0 for you.
- [6:12:24](https://www.youtube.com/watch?v=undefined&t=22344s) If you've never once used the
return keyword, which you probably haven't in main, it just automatically returns 0 and the system
assumes that all went well. But now that we're starting to get a little more sophisticated with our code,
and you, the programmer, know something went wrong, you can abort programs early.

- [6:12:40](https://www.youtube.com/watch?v=undefined&t=22360s) You can exit out of them by


returning some other value, besides 0, from main. And this is fortuitous that it's an int, right? 0 means
everything worked. Unfortunately, in programming, there are seemingly, an infinite number of things
that can go wrong. And int gives you 4 billion possible codes that you can use, a.k.a. exit statuses, to
signify errors.

- [6:13:01](https://www.youtube.com/watch?v=undefined&t=22381s) So if you've ever on your Mac or


PC gotten some weird pop up that an error happened, sometimes, there's a cryptic number in it. Maybe
it's positive, maybe it's negative. It might say error code 123, or negative 49, or something like that.
What you're generally seeing, are these exit statuses, these return values from main in a program that
someone at Microsoft, or Apple, or somewhere else wrote, something went wrong, they are
unnecessarily showing you, the user what the error code is.

- [6:13:30](https://www.youtube.com/watch?v=undefined&t=22410s) If only, so that when you call


customer support or submit a ticket, you can tell them what exit status you encountered, what error
code you encountered. All right, any questions on exit statuses, which is the last of our new building blocks,
for now? Any questions at all? Yeah? AUDIENCE: [INAUDIBLE] You know how if you have get string or get
int, if you want to make [INAUDIBLE] DAVID MALAN: No.

- [6:14:00](https://www.youtube.com/watch?v=undefined&t=22440s) The question is can you do things


again and again at the command line like you could with get string and get int. Which, by default, recall
are automatically designed to keep prompting the user in their own loop until they give you an int, or a
float, or the like. With command line arguments, no. You're going to get an error message but then you're
going to be returned to your prompt.

- [6:14:18](https://www.youtube.com/watch?v=undefined&t=22458s) And it's up to you to type it


correctly the next time. Good question. Yeah? AUDIENCE: [INAUDIBLE] automatically for you. DAVID
MALAN: If you do not return a value explicitly, main will automatically return 0 for you. That is the way C
simply works, so it's not strictly necessary. But now that we're starting to return values explicitly, if
something goes wrong, it would be good practice to also start returning a value from main when
something goes right and there are no errors.

- [6:14:48](https://www.youtube.com/watch?v=undefined&t=22488s) So let's now get out of the weeds


and contextualize this for some actual problems that we'll be solving in the coming days by way of
problem set 2 and beyond. So here, for instance, is a problem that you might think
back to when you were a kid: the readability of some text or some book, the grade level at which some
book is written.

- [6:15:10](https://www.youtube.com/watch?v=undefined&t=22510s) If you're a young student, you


might read at first-grade level or third-grade level in the US. Or, if you're in college presumably, you're
reading at a university-level of text. But what does it mean for text, like in a book, or in an essay, or
something like that to correspond to some kind of grade level? Well, here's a quote-- a title of a
childhood book.

- [6:15:29](https://www.youtube.com/watch?v=undefined&t=22529s) One Fish, Two Fish, Red Fish, Blue


Fish. What might the grade level be for a book that has words like this? Maybe, when you were a kid or if
you have a sibling still reading these things, what might the grade level of this thing be? Any guesses?
Yeah? AUDIENCE: Before grade 1. DAVID MALAN: Sorry, again? AUDIENCE: Before grade 1.

- [6:15:48](https://www.youtube.com/watch?v=undefined&t=22548s) DAVID MALAN: Before grade 1 is,


in fact, correct. So that's for really young kids? Why is that? Well, let's consider. These are pretty simple
phrases, right? One fish, two fish, red-- I mean there's not even verbs in these sentences, they're just
nouns and adjectives, and very short sentences.

- [6:16:04](https://www.youtube.com/watch?v=undefined&t=22564s) And so that might be a heuristic


we could use. When analyzing text, well if the words are kind of short, the sentences are kind of short,
everything's very simple, that's probably a very young, or early, grade level. And so by one formulation, it
might indeed be even before grade 1, for someone quite young.

- [6:16:19](https://www.youtube.com/watch?v=undefined&t=22579s) How about this? Mr. and Mrs.


Dursley, of number 4, Privet Drive, were proud to say that they were perfectly normal, thank you very
much. They were the last people you would expect to be involved in anything strange or mysterious
because they just didn't hold with such nonsense. And, onward. All right, what grade level is this book?
AUDIENCE: Third.

- [6:16:36](https://www.youtube.com/watch?v=undefined&t=22596s) DAVID MALAN: OK, I heard third.


AUDIENCE: What? DAVID MALAN: Seventh, fifth. OK, all over the place. But grade 7, according to one
particular measure. And whether or not we can debate exactly what age you were when you read this,
and maybe you're feeling ahead of your time, or behind now. But here, we have a snippet of text.

- [6:16:56](https://www.youtube.com/watch?v=undefined&t=22616s) What makes this text assume an


older audience, a more mature audience, a higher grade level, would you think? Yeah? AUDIENCE:
[INAUDIBLE] DAVID MALAN: Yeah, it's longer, different types of words, there's commas now in phrases,
and so forth. So there's just some kind of sophistication to this. So it turns out for the upcoming problem
set, among the things you'll do is take, as input, texts like this and analyze them.

- [6:17:21](https://www.youtube.com/watch?v=undefined&t=22641s) Considering, well, how many


words are in the text? How many sentences are in the text? How many letters are in the text? And use
those according to a well-defined formula to prescribe what, exactly, the grade level of some actual
text-- there's the third-- might actually be. Well what else are we going to do in the coming days? Well
I've alluded to this notion of cryptography in the past.

- [6:17:40](https://www.youtube.com/watch?v=undefined&t=22660s) This notion of scrambling


information in such a way that you can hide the contents of a message from someone who might
otherwise intercept it, right? The earliest form of this might also be when you're younger, and you're in
class, and you're passing a note from one person to another, from yourself to someone else.

- [6:17:55](https://www.youtube.com/watch?v=undefined&t=22675s) You don't want to necessarily


write a note in English, or some other written language. You might want to scramble it somehow, or
encrypt it. Maybe you change the As to a B, and the Bs to a C. So that if the teacher snaps it up and
intercepts it, they can't actually understand what it is you've written because it's encrypted.

- [6:18:11](https://www.youtube.com/watch?v=undefined&t=22691s) So long as your friend, the


recipient of this note, knows how you manipulated it, how you added to or subtracted from each
letter, they can decrypt it, which is to reverse that process. So formally, in the world of cryptography and
computer science, this is another problem to solve. Your input, though, when you have a message you
want to send securely, is what's generally known as plain text.

- [6:18:33](https://www.youtube.com/watch?v=undefined&t=22713s) There's some algorithm that's


going to then encipher, or encrypt that information, into what's called ciphertext, which is the scrambled
version that theoretically can get safely intercepted and your message has not been spoiled, unless that
interceptor actually knows what algorithm you used inside of this process.

- [6:18:51](https://www.youtube.com/watch?v=undefined&t=22731s) So that would be generally known


as a cipher. The ciphers typically take, though, not one input, but two. If, for instance, your cipher is as
simple as A becomes B, B becomes C, C becomes D, dot dot dot, Z becomes A, you're essentially adding
one to every letter and encrypting it. Now that would be, what we call, the key.

- [6:19:12](https://www.youtube.com/watch?v=undefined&t=22752s) You and the recipient both have


to agree, presumably, before class, in advance, what number you're going to use that day to rotate, or
change all of these letters by. Because when you add 1, they upon receiving your ciphertext have to
subtract 1 to get back the answer. For instance, if the input, plaintext, is hi, as before, and the key is 1,
the ciphertext using this simple rotational algorithm, otherwise known as the Caesar cipher, might be ij
exclamation point.

- [6:19:42](https://www.youtube.com/watch?v=undefined&t=22782s) So it's similar, but it's at least


scrambled at first glance. And unless the teacher really cares to figure out what algorithm are they using
today, or what key are they using today, it's probably sufficiently secure for your purposes. How do you
reverse the process? Well, your friend gets this and reverses it by negative 1.

- [6:19:58](https://www.youtube.com/watch?v=undefined&t=22798s) So I becomes H, J becomes I, and


things like punctuation remain untouched at least in this scheme. So let's consider one final example
here. If the input to the algorithm is Uijtxbtdt50, and the key this time is negative 1. Such that now B
should become A, and C should become B, and A should become Z.

- [6:20:24](https://www.youtube.com/watch?v=undefined&t=22824s) So we're going in the other


direction. How might we analyze this? Well if we spread all the letters out, and we start from left to right,
and we start subtracting one letter, U becomes T, I becomes H, J becomes I, T becomes S, X becomes W,
A, was, D, T-- this was CS50. We'll see you next time. [APPLAUSE] [MUSIC PLAYING] DAVID J. MALAN: This
is CS50, and this is already week three.

- [6:22:07](https://www.youtube.com/watch?v=undefined&t=22927s) And even as we've gotten much


more into the minutiae of programming and some of the C stuff that we've been doing is all the more
cryptic looking, recall that at the end of the day, like, everything we've been doing ultimately fits into
this model. So keep that in mind, particularly as things seem like they're getting more complicated and
more sophisticated.
- [6:22:24](https://www.youtube.com/watch?v=undefined&t=22944s) It's just a process of learning a
new language that ultimately lets us express this process. And of course, last week we really went into
the weeds of like how inputs and outputs are represented. And this thing here, a photograph thereof, is
called what? This is what? AUDIENCE: RAM. DAVID J. MALAN: RAM, I heard--

- [6:22:41](https://www.youtube.com/watch?v=undefined&t=22961s) Random


Access Memory or just generally known as memory. And recall that we looked at one of these little black
chips that contains all of the bytes-- all of the bits, ultimately. It's just kind of a grid, sort of an artist grid,
that allows us to think about every one of these memory locations as just having a number or an
address, so to speak.

- [6:22:58](https://www.youtube.com/watch?v=undefined&t=22978s) Like, this might be byte number 0


and then 1 and then 2 and then, maybe way down here again, something like 2 billion if you have 2
gigabytes of memory. And so as we did that, we started to explore how we could use this canvas to
create kind of our own information, our own inputs and outputs, not just the basics like ints and floats
and so forth.

- [6:23:17](https://www.youtube.com/watch?v=undefined&t=22997s) But we also talked about strings.


And what is a string as you now know it? How would you describe in layperson's terms a string? Yeah,
over there. AUDIENCE: I was gonna say-- [AUDIO OUT] DAVID J. MALAN: An array of characters. And an
array, meanwhile-- let's go there. How might someone else define an array in more familiar now terms?
What would be an array? Yeah.

- [6:23:38](https://www.youtube.com/watch?v=undefined&t=23018s) AUDIENCE: Kind of like an indexed


set of things. DAVID J. MALAN: An indexed set of things-- not bad. And I think a key characteristic to keep
in mind with an array is that it does actually pertain to memory. And it's contiguous memory. Byte after
byte after byte is what constitutes an array. And we'll see in a couple of weeks time that there's actually
more interesting ways to use this same primitive canvas to stitch together things that are sort of two
directional even that have some kind of shape

- [6:24:04](https://www.youtube.com/watch?v=undefined&t=23044s) to them. But for now, all we've


talked about is arrays and just using these things from left to right, top to bottom, contiguously to
represent information. So today, we'll consider still an array. But we won't focus so much on
representation of strings or other data types. We'll actually now focus on the other part of that process,
of inputs becoming outputs, namely the thing in the middle-- algorithms.

- [6:24:25](https://www.youtube.com/watch?v=undefined&t=23065s) But we have to keep in mind,


even though every time we've looked at an array thus far, certainly on the board like this, you as a
human certainly have the luxury of just kind of eyeballing the whole thing with a bird's eye view and
seeing where all of those numbers are. If I asked you where a particular number is, like zero, odds are
your eyes would go right to where it is, and boom, problem solved in sort of one step.

- [6:24:47](https://www.youtube.com/watch?v=undefined&t=23087s) But the catch is, with a computer


that has this memory, even though you, the human, can [INAUDIBLE] see everything at once, a computer
cannot. It's better to think of your computer's memory, your phone's memory, or more specifically an
array of memory like this as really being a set of closed doors, not unlike lockers in a school.
- [6:25:06](https://www.youtube.com/watch?v=undefined&t=23106s) And only by opening each of
those doors can the computer actually see what's in there, which is to say that the computer, unlike you,
doesn't have this bird's eye view of all of the data in all these locations. It has to much more
methodically look here, maybe look here, maybe look here, and so forth in order to find something.

- [6:25:24](https://www.youtube.com/watch?v=undefined&t=23124s) Now fortunately, we already have


some building blocks-- loops, conditions, Boolean expressions, and the like-- where you could imagine
writing some code that very methodically goes from left to right or right to left or something more
sophisticated that actually finds something you're looking for. And just remember that the conventions
we've had since last week now is that these arrays are zero indexed, so to speak.

- [6:25:48](https://www.youtube.com/watch?v=undefined&t=23148s) To be zero indexed just means


that the data type starts counting from zero. So this is location 0, 1, 2, 3, 4, 5, 6. And notice even though
there are seven total doors here, the right-most one, of course, is called 6 just because we've started
counting at 0. So in the general case, if you had n doors or n bytes of memory, 0 would always be at the
left, and n minus 1 would always be at the right.

- [6:26:14](https://www.youtube.com/watch?v=undefined&t=23174s) That's sort of a generalization of


just thinking about this kind of convention. All right, so let's revisit the problem that we started the
whole term off with in week zero, which was this notion of searching. And what does it mean to search
for something? Well, to find information-- and this, of course, is omnipresent.

- [6:26:30](https://www.youtube.com/watch?v=undefined&t=23190s) Anytime you take out your phone,


you're searching for a friend's contact. Any time you pull up a browser, you're googling for this or that. So
search is kind of one of the most omnipresent topics and features of any device these days. So let's
consider how the Googles, the Apples, the Microsofts of the world are implementing something as
seemingly familiar as this.

- [6:26:48](https://www.youtube.com/watch?v=undefined&t=23208s) So here might be the problem


statement. We want some input to become some output. What's that input going to be? Maybe it's a
bunch of closed doors like this out of which we want to get back an answer, true or false. Is something
we're looking for there or not? You can imagine taking this one step further and trying to find where is
the thing you're looking for.

- [6:27:08](https://www.youtube.com/watch?v=undefined&t=23228s) But for now, let's just take one


bite out of the problem. Can we tell ourselves, true or false, is some number behind one of these doors
or lockers in memory? But before we go there and start talking about ways to do that-- that is,
algorithms. Let's consider how we might lay the foundation of, like, comparing whether one algorithm is
better than another.

- [6:27:30](https://www.youtube.com/watch?v=undefined&t=23250s) We talked about correctness, and


it sort of goes without saying that any code you write, any algorithm you implement, had better be
correct. Otherwise, what's the point if it doesn't give you the right answers? But we also talked about
design. And in your own words, what do we mean when we say a program is better designed at this
stage than another? How do you think about this notion of design now? Yeah, in the middle? AUDIENCE:
Easier to understand or easier to institute.
- [6:27:56](https://www.youtube.com/watch?v=undefined&t=23276s) DAVID J. MALAN: OK, so easier to
understand. I like that. Other thoughts? Yeah. AUDIENCE: Efficiency. DAVID J. MALAN: Efficiency, and
what do you mean by efficiency precisely? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Nice. It doesn't use
up too much memory, and it isn't redundant. So you can think about design along a few of these axes--
sort of the quality of the code but also the quality of the performance.

- [6:28:16](https://www.youtube.com/watch?v=undefined&t=23296s) And as our programs get bigger


and more sophisticated and just longer, those kinds of things are really going to matter. And in the real
world, if you start writing code not just by yourself but with someone else, getting the design right is just
going to make it easier to collaborate and ultimately produce right code with just higher probability.

- [6:28:33](https://www.youtube.com/watch?v=undefined&t=23313s) So let's consider how we might


focus on exactly the second characteristic, the efficiency, of an algorithm. And the way we might talk
about the efficiency of algorithms, just how fast or how slow they are, is in terms of their running time.
That is to say, when they're running, how much time do they take? And we might measure this in
seconds or milliseconds or minutes or just some number of steps in the general case because
presumably fewer steps, to your point, is better than more steps.

- [6:28:59](https://www.youtube.com/watch?v=undefined&t=23339s) So how might we think about


running times? Well, there's one general notation we should define today. So computer scientists tend to
describe the running time of an algorithm or a piece of code, for that matter, in terms of what's called
big O notation. This is literally a capitalized O, a big O.

- [6:29:15](https://www.youtube.com/watch?v=undefined&t=23355s) And this generally means that the


running time of some algorithm is on the order of such and such, where such and such, we'll see, is just
going to be a very simple mathematical formula. It's kind of a way of waving your hands mathematically
to convey the idea of just how fast or how slow some algorithm or code is without getting into the weeds
of like, it took this many milliseconds or this many specific number of steps.

- [6:29:37](https://www.youtube.com/watch?v=undefined&t=23377s) So you might recall then from


week zero, I even introduced this picture but without much context. At the time, we just used this to
compare those phone book algorithms. Recall that this red straight line was the first algorithm, one page
at a time. The yellow line that's still straight differed how if you recall? That line represented what
alternative algorithm? Looking out and back.

- [6:30:00](https://www.youtube.com/watch?v=undefined&t=23400s) What is that second algorithm?


Yeah, over there. AUDIENCE: Like, two pages at a time. DAVID J. MALAN: Two pages at a time, which was
almost correct so long as we potentially double back a page if maybe we go a little too far in the phone
book. So it had a potential bug but arguably solvable. This last algorithm, though, was the so-called
divide and conquer strategy where I sort of unnecessarily tore the phone book in half and then in half
and then in half, which, as dramatic as that was unnecessarily, it actually took significantly bigger

- [6:30:25](https://www.youtube.com/watch?v=undefined&t=23425s) bites out of the problem-- like 500


pages the first time, another 250, another 125 versus just 1 or 2 bites at a time. And so we described its
running time as this picture there, though I didn't use that expression at the time, running times. But
indeed, time to solve might be measured just abstractly in some unit of measure-- seconds, milliseconds,
minutes, pages-- via this y-axis here.
- [6:30:49](https://www.youtube.com/watch?v=undefined&t=23449s) So let's now slap some numbers
on this. If we had n pages in that phone book, n just representing a generic number, the first algorithm
here we might describe as taking n steps. Second algorithm we might describe as taking n divided by 2
steps, maybe give or take one if we have to double back but generally n divided by 2.

- [6:31:07](https://www.youtube.com/watch?v=undefined&t=23467s) And then this thing, if you


remember your logarithms, was sort of a fundamentally different formula-- log base 2 of n or just log of
n for short. So this is of a fundamentally different formula. But what's noteworthy is that these first two
algorithms, even though, yes, the second algorithm was hands down faster-- I mean, literally twice as
fast-- when you start to zoom out and if I increase my y-axis and x-axis, these first two start to look
awfully similar to one another.

- [6:31:37](https://www.youtube.com/watch?v=undefined&t=23497s) And if we keep zooming out and


zooming out and zooming out as n gets really large-- that is, the x-axis gets really long-- these first two
algorithms start to become essentially the same. And so this is where computer scientists use big O
notation. Instead of saying specifically, this algorithm takes n steps.

- [6:31:55](https://www.youtube.com/watch?v=undefined&t=23515s) And this one n divided by 2, a


computer scientist would say, eh, each of those algorithms takes on the order of n steps or on the order
of n over 2. But you know what? On the order of n over 2 is pretty much the same when n gets really
large as being equivalent to big O of n itself. So yes, in practice, it's obviously fewer steps to move twice
as fast.

- [6:32:19](https://www.youtube.com/watch?v=undefined&t=23539s) But in the big picture, when n


becomes a million, a billion, the numbers are already so darn big at that point that these are, as the
shapes of these curves imply, pretty much functionally equivalent. But this one still looks better and
better as n gets large because it's rising so much less quickly. And so here, a computer scientist would say
that that third algorithm was on the order of-- that is, big O of-- log n.

- [6:32:44](https://www.youtube.com/watch?v=undefined&t=23564s) And they don't have to bother


with the base because it's a smaller mathematical detail that is also just in some sense a constant,
multiplicative factor. So in short, what are the takeaways here? This is just a new vocabulary that we'll
start to use when we just want to describe the running time of an algorithm.
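One way to see why the base is only a constant factor is the standard change-of-base identity, shown here with base 10 chosen arbitrarily for comparison:

$$\log_2 n = \frac{\log_{10} n}{\log_{10} 2} \approx 3.32 \, \log_{10} n$$

Switching bases just multiplies the step count by a fixed number, and big O notation ignores constant factors of that kind, the same way it treats n over 2 as on the order of n.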

- [6:33:00](https://www.youtube.com/watch?v=undefined&t=23580s) To make this more real, if any of


you have implemented a for loop at this point in any of your code and that for loop iterated n times
where maybe n was the height of your pyramid or maybe n was something else that you wanted to do n
times, you wrote code or you implemented an algorithm that operated in big O of n time, if you will.

- [6:33:22](https://www.youtube.com/watch?v=undefined&t=23602s) So this is just a way now to


retroactively start describing with somewhat mathematical notation what we've been doing in practice
for a while now. So here's a list of commonly seen running times in the real world. This is not a thorough
list because you could come up with an infinite number of mathematical formulas, certainly.

- [6:33:42](https://www.youtube.com/watch?v=undefined&t=23622s) But the common ones we'll


discuss and you will see in your own code probably reduce to this list here. And if you were to study
more computer science theory, this list would get longer and longer. But for now, these are sort of the
most familiar ones that we'll soon see. All right, two other pieces of vocabulary, if you will, before we
start to use this stuff-- so this, a big omega, capital omega symbol, is used now to describe a lower bound
on the running time of an algorithm.

- [6:34:10](https://www.youtube.com/watch?v=undefined&t=23650s) So to be clear, big O is on the


order of-- that is, an upper bound-- on how many steps an algorithm might take, on the order of so many
steps. If you want to talk, though, from the other perspective, well, how few steps might my algorithm take?
Maybe in the so-called best case, it'd be nice if we had a notation to just describe what a lower bound is
because some algorithms might be super fast in these so-called best cases.

- [6:34:34](https://www.youtube.com/watch?v=undefined&t=23674s) So the symbology is almost the


same, but we replace the big O with the big omega. So to be clear, big O describes an upper bound and
omega describes a lower bound. And we'll see examples of this before long. And then lastly, last one
here, big theta, is used by a computer scientist when you have a case where both the upper bound on an
algorithm's running time is the same as the lower bound.

- [6:35:00](https://www.youtube.com/watch?v=undefined&t=23700s) You can then describe it in one


breath as being in theta of such and such instead of saying it's in big O and in omega of something else.
All right, so out of context, sort of just seemingly cryptic symbols, but all they refer to is upper bounds,
lower bounds, or when they happen to be one and the same.
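For reference, the usual textbook definitions behind these three symbols can be written as follows. The lecture uses them informally, so the constants $c$ and $n_0$ here come from the standard definitions rather than from anything said in class:

$$f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 : \ f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0$$

$$f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : \ f(n) \ge c \cdot g(n) \ \text{for all } n \ge n_0$$

$$f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \ \text{and} \ f(n) \in \Omega(g(n))$$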

- [6:35:17](https://www.youtube.com/watch?v=undefined&t=23717s) And we'll now introduce over


time examples of how we might actually apply these to concrete problems. But first, let me pause to see
if there's any questions. Any questions here? Any questions? I see pointing somewhere. Where are you
pointing to? Over here-- there we go. OK, sorry-- very bright. AUDIENCE: So, um, smaller-- DAVID J.
MALAN: Smaller n functions move faster.

- [6:35:46](https://www.youtube.com/watch?v=undefined&t=23746s) So yes, if you have something like


n, that takes only n steps. If you have a formula like n squared, just by nature of the math, that's going to take more
steps and therefore be slower. So the larger the mathematical expression, the slower your algorithm is
because the more time or more steps that it takes. AUDIENCE: So you want your n function to be small?
DAVID J. MALAN: You want your n function, so to speak, to be small, yes.

- [6:36:11](https://www.youtube.com/watch?v=undefined&t=23771s) And in fact, the Holy Grail, so to


speak, would be this last one here either in big O notation or even theta, when an algorithm is on the
order of a single step. That means it literally takes constant time, one step, or maybe 10 steps, 100 steps,
but a fixed, constant number of steps. That's the best because even as the phone book gets bigger, even
as the data set you're searching gets larger and larger, if something only takes a finite number of steps
constantly, then it doesn't matter how big the data set actually gets.

- [6:36:42](https://www.youtube.com/watch?v=undefined&t=23802s) Questions as well on these


notations-- yep, thank you for the pointing. This is actually very helpful. I'm seeing pointing this way?
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: What is the input to each of these functions? It is an
expression of how many steps an algorithm takes. So in fact, let me go ahead and make this more
concrete with an actual example here if we could.

- [6:37:04](https://www.youtube.com/watch?v=undefined&t=23824s) So on stage here, we have seven


lockers which represent, if you will, an array of memory. And this array of memory is maybe storing
seven integers, seven integers that we might actually want to search for. And if we want to search for
these values, how might we go about doing this? Well, for this, why don't we make things interesting?
Would a volunteer like to come on up? Have to be masked and on the internet if you are comfortable.

- [6:37:25](https://www.youtube.com/watch?v=undefined&t=23845s) Both of-- oh, there's someone


putting their friend's hand up and back? Yes, OK. Come on down. And in just a moment, our brave
volunteer is going to help me find a specific number in the data set that we have here on the screen. So
come on down, and I'll get things ready for you in advance here.

- [6:37:47](https://www.youtube.com/watch?v=undefined&t=23867s) Come on down nice to meet. And


what is your name? AUDIENCE: [? Nomira. ?] DAVID J. MALAN: Minera? AUDIENCE: [? Nomira. ?] DAVID
J. MALAN: [? Nomira. ?] Nice to meet. Come on over. So here we have for Nomira seven lockers or an
array of memory. And behind each of these doors is a number. And the goal, quite simply, is, given this
array of memory as input, to return, true or false, is the number I care about actually there? So suppose I
care about the number 0.

- [6:38:15](https://www.youtube.com/watch?v=undefined&t=23895s) What would be the simplest,


most correct algorithm you could apply in order to find us the number 0? OK, try opening the first one.
All right, and maybe just step aside so the audience can see. I think you have not found 0 yet. OK, so
keep the door open. Let's move on to your next choice. Second door, sure.

- [6:38:35](https://www.youtube.com/watch?v=undefined&t=23915s) AUDIENCE: [INAUDIBLE] DAVID J.


MALAN: Oh, go ahead, second door. Let's keep it simple. Let's just move from left to right, sort of
searching our way. And what do you see there? Oh, 6, not 0. How about the next door? All right, also not
working out so well yet, but that's OK. If you want to go on to the next, we're still looking for 0.

- [6:38:56](https://www.youtube.com/watch?v=undefined&t=23936s) All right, I see a 2. All right, it's not


so good yet. Let's keep going. Next door. 2, 7-- no. OK, next door. No, that's a-- all right, very well done.
Oh. All right, so I kind of set you up for a fairly slow algorithm, but let me just ask you to describe what is
it you did by following the steps I gave you.

- [6:39:22](https://www.youtube.com/watch?v=undefined&t=23962s) AUDIENCE: I just went one by one


to each character. DAVID J. MALAN: You went one by one to each character if you want to talk into here.
So you went one by one by each character. And would you say that algorithm left to right is correct?
AUDIENCE: No. DAVID J. MALAN: No? AUDIENCE: Or, yes, in the scenario. DAVID J. MALAN: OK, yes in
this scenario.

- [6:39:39](https://www.youtube.com/watch?v=undefined&t=23979s) Why are you hesitating? What's


going through your mind? AUDIENCE: Because it's not the most efficient way to do it. DAVID J. MALAN:
OK, good. So we see a contrast here between correctness and design. I mean, I do think it was correct
because even though it was slow, you eventually found zero. But it took some number of steps.

- [6:39:53](https://www.youtube.com/watch?v=undefined&t=23993s) So in fact, this would be an


algorithm. It has a name, called linear search. And, [? Nomira, ?] as you did, you kind of walked along a
line going from left to right. Now let me ask. If you had gone from right to left, would the algorithm have
been fundamentally better? AUDIENCE: Yes. DAVID J.

- [6:40:09](https://www.youtube.com/watch?v=undefined&t=24009s) MALAN: OK, and why? AUDIENCE:


Because the zero is here in the first scenario. But if it was like, the zero is in the middle, it wouldn't have
been. DAVID J. MALAN: Yeah, and so here is where the right way to do things becomes a little less
obvious. You would absolutely have given yourself a better result if you had just happened to start
from the right or if I had pointed you to start over there.

- [6:40:27](https://www.youtube.com/watch?v=undefined&t=24027s) But the catch is if I asked her to


find another number, like the number 8, well, that would have backfired. And this time, it would have
taken longer to find that number because it's way over here instead. And so in the general case, going
left to right or, heck, right to left is probably as correct as you can get because if you know nothing about
the order of these numbers-- and indeed, they seem to be fairly random.

- [6:40:48](https://www.youtube.com/watch?v=undefined&t=24048s) Some of them are smaller, some


of them are bigger. There doesn't seem to be rhyme or reason. Linear search is about as good as you can
do when you don't know anything a priori about the numbers. So I have a little thank you gift here, a
little CS stress ball. Round of applause for our first volunteer.

- [6:41:03](https://www.youtube.com/watch?v=undefined&t=24063s) Thank you so much. Let's try to


formalize what I just described as linear search because indeed, no matter which end [? Nomira ?] had
started on, I could have kind of changed up the problem to make sure that it appears to be running slow.
But it is correct. If zero were among those doors, she absolutely would have found it and indeed did.

- [6:41:25](https://www.youtube.com/watch?v=undefined&t=24085s) So let's now try to translate what


we did into what we might call again pseudo code as from week zero. So with pseudo code, we just need
a terse English-like, or any language, syntax to describe what we did. So here might be one formulation
of what [? Nomira ?] did. For each door, from left to right, if the number is behind the door, return true.

- [6:41:47](https://www.youtube.com/watch?v=undefined&t=24107s) Else, at the very end of the


program, you would return false by default. And now you got lucky. And by the seventh door, [?
Nomira ?] had indeed returned true by saying, well, there is the zero. But let's consider if this pseudo
code is now correct, an accurate translation. First of all, normally, when we've seen ifs, we might see an
if else.

- [6:42:07](https://www.youtube.com/watch?v=undefined&t=24127s) And yet down here, return false is


aligned with the for. Why did I not indent the return false, or put another way, why did I not do if
number is behind door, return true, else return false? Why would that version of this code have been
problematic? Way in back. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: OK, I'm not sure it's because of
redundancy.

- [6:42:37](https://www.youtube.com/watch?v=undefined&t=24157s) Let me go ahead and just make


this explicit. If I had instead done else return false, I don't think it's so much redundancy that I'd be
worried about. Let me bounce somewhere else. Yeah, in front? AUDIENCE: Um, maybe [INAUDIBLE] for
the entire list after just checking one number. DAVID J.

- [6:42:56](https://www.youtube.com/watch?v=undefined&t=24176s) MALAN: Yeah, it would be


returning false for-- even though I'd only looked at-- [? Nomira ?] had only looked at one element. And it
would have been as though if all of these doors were still closed, she opens this up and says, nope, this is
not zero, return false. That would give me an incorrect result because obviously, at that stage in the
algorithm, she wouldn't have even looked through any of the other doors.
- [6:43:13](https://www.youtube.com/watch?v=undefined&t=24193s) So just the original indentation of
this, if you will, without the [? else, ?] is correct because only if I get to the bottom of this algorithm or
the pseudo code does it make sense to conclude at that point, once she's gone through all of the doors,
that nope, there's in fact-- the number I'm looking for is, in fact, not actually there.

- [6:43:33](https://www.youtube.com/watch?v=undefined&t=24213s) So how might we consider now


the running time of this algorithm? We have a few different types of vocabulary now. And if we consider
now how we might think about this, let's start to translate it from sort of higher level pseudo code to
something a little lower level. We've been writing code using n and loops and the like.

- [6:43:52](https://www.youtube.com/watch?v=undefined&t=24232s) So let's take this higher level


pseudo code and now just kind of get a middle ground between English and C. Let me propose that we
think about this version of the same algorithm as being a little more pedantic. For i from 0 to n minus 1,
if number behind doors bracket i return true. Otherwise, at the end of the program, return false.

- [6:44:16](https://www.youtube.com/watch?v=undefined&t=24256s) Now I'm kind of mixing English


and C here, but that's reasonable if the reader is familiar with C or some similar language. And notice
this pattern here. This is a way of just saying in pseudo code, give myself a variable called i. Start at 0 and
then just count up to n minus 1. And recall n minus 1 is not one shy of the end of the array.

- [6:44:38](https://www.youtube.com/watch?v=undefined&t=24278s) N minus 1 is the end of the array


because again, we started counting at 0. So this is a very common way of expressing this kind of loop
from the left all the way to the right of an array. Doors I'm kind of implicitly treating as the name of this
array, like it's a variable from last week that I defined as being an array of integers in this case.

- [6:44:56](https://www.youtube.com/watch?v=undefined&t=24296s) So doors bracket i means that


when i is 0, it's this location. When i is 1, it's this. When i is 7 or, more generally n minus-- sorry, 6 or,
more generally, n minus 1, that's this location here. So same idea but a translation of it. So now let's
consider what the running time of this algorithm is.

- [6:45:17](https://www.youtube.com/watch?v=undefined&t=24317s) If we have this menu of possible


answers to this question, how efficient or inefficient is this algorithm, let's take a look in the context of
this pseudo code. We don't even have to bother going all the way to C. How do we go about analyzing
each of these steps? Well, let's consider this. This outermost loop here for i from 0 to n minus 1, that line
of code is going to execute how many times? How many times will that loop execute? Let me give folks
this moment to think on it.

- [6:45:49](https://www.youtube.com/watch?v=undefined&t=24349s) How many times is that going to


loop here? Yeah, over there. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: n times, right? Because it's from
0 to n minus 1. And if it's a little weird to think in from 0 to n minus 1, this is essentially the same
mathematically as from 1 to n. And that's perhaps a little more obviously more intuitively n total steps.

- [6:46:08](https://www.youtube.com/watch?v=undefined&t=24368s) So I might just make a note to


myself this loop is going to operate n times. What about these inner steps? Well, how many steps or
seconds does it take to ask a question? If the number behind-- if the number you're looking for is behind
doors bracket i, well, as [? Nomira ?] did, that's kind of like one step.
- [6:46:26](https://www.youtube.com/watch?v=undefined&t=24386s) So you open the door and boom.
All right, maybe it's two steps, but it's a constant number of steps. So this is some constant number of
steps. Let's just call it one for simplicity. How many steps or seconds does it take to return true? I don't
know exactly in the computer's memory but that feels like a single step.

- [6:46:42](https://www.youtube.com/watch?v=undefined&t=24402s) Just return true. So if this takes


one step, this takes one step but only if the condition is true, it looks like you're doing a constant number
of things n times. Or maybe you're doing one additional step. So in short, the only thing that really
matters here in terms of the efficiency or inefficiency of the algorithm is what are you doing again and
again and again because that's obviously the thing that's going to add up.

- [6:47:07](https://www.youtube.com/watch?v=undefined&t=24427s) Doing one thing or two things a


constant number of times? Not a big deal. But looping, that's going to add up over time because the
more doors there are, the bigger n is going to be and the more steps that's going to take, which is all to
say if you were to describe roughly how many steps does this algorithm take in big O notation, what
might your instincts say? How many steps is this algorithm on the order of given n doors or n integers?
Yeah? AUDIENCE: [INAUDIBLE] DAVID J.

- [6:47:38](https://www.youtube.com/watch?v=undefined&t=24458s) MALAN: Say again? AUDIENCE: O


n. DAVID J. MALAN: Big O of n. And indeed, that's going to be the case here. Why? Because you're
essentially, at the end of the day, doing n things as an upper bound on running time. And that's, in fact,
exactly what happened with [? Nomira. ?] She had to look at all n lockers before finally getting to
the right answer.

- [6:47:56](https://www.youtube.com/watch?v=undefined&t=24476s) But what if she got lucky and the


number we were looking for was not at the end of the array but was at the beginning of the array? How
might we think about that? Well, we have a nomenclature for this too, of course-- omega notation.
Remember, omega notation is a lower bound. So given this menu of possible running times for lower
bounds on an algorithm, what might the omega notation be for [? Nomira's ?] linear search? AUDIENCE:
Omega 1.

- [6:48:25](https://www.youtube.com/watch?v=undefined&t=24505s) DAVID J. MALAN: Omega of 1, and


why that? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Right, because if just by chance she gets lucky and
the number she's looking for is right there where she begins the algorithm, that's it. It's one step. Maybe
it's two steps if you have to unlock the door and open it, but it's a constant number of steps.

- [6:48:41](https://www.youtube.com/watch?v=undefined&t=24521s) And the way we describe constant


number of steps is just with a single number like 1. So the omega notation for linear search might be
omega of 1 because in the best case, she might just get the number right from the get go. But in the
worst case, we need to talk about the upper bound, which might indeed be big O of n.
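
The best-case and worst-case behavior described above can be made concrete with a small sketch. The function name `linear_search` and the `steps` counter are assumptions added for illustration; the lecture only describes the idea with lockers. The array order used in the usage note is the one given later in the lecture (4, 6, 8, 2, 7, 5, 0).

```c
#include <stdbool.h>
#include <stddef.h>

// Linear search: open each "door" from left to right until target is found.
// If steps is non-NULL, it receives the number of doors that were opened.
bool linear_search(const int doors[], size_t n, int target, size_t *steps)
{
    size_t opened = 0;
    bool found = false;

    for (size_t i = 0; i < n; i++)
    {
        opened++;
        if (doors[i] == target)
        {
            found = true;
            break;
        }
    }

    if (steps != NULL)
    {
        *steps = opened;
    }
    return found;
}
```

With those seven doors, searching for 4 opens one door (the omega of 1 case), while searching for 0, or for a number that is not present at all, opens all seven (the big O of n case).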

- [6:48:58](https://www.youtube.com/watch?v=undefined&t=24538s) So again there's this way now of


talking symbolically about best cases and worst cases or lower bounds and upper bounds. Theta
notation, just as a little trivia now, is it applicable based on the definition I gave earlier? AUDIENCE:
[INAUDIBLE] DAVID J.

- [6:49:16](https://www.youtube.com/watch?v=undefined&t=24556s) MALAN: OK, no, because you only


take out the theta notation when those two bounds, upper and lower, happen to be the same for
shorthand notation, if you will. So it suffices here to talk about just big O and omega notation. Well, what
if we are a little smarter about this? Let me go ahead and sort of semi-secretly here rearrange these
numbers. But first, how about one other volunteer? One other volunteer-- you have to be comfortable
with your mask and your being on the internet.

- [6:49:40](https://www.youtube.com/watch?v=undefined&t=24580s) How about over here? Yes, you


want to come on down? All right, come on down. And don't look at what I'm doing because I'm going
to-- take your time and don't look up this way because I need a moment to rearrange all of the numbers.
And actually, if you could stay right there before coming up, just an awkward few seconds while I finish
hiding the numbers behind these doors for you.

- [6:50:07](https://www.youtube.com/watch?v=undefined&t=24607s) AUDIENCE: [INAUDIBLE] DAVID J.


MALAN: I will be right with you. Actually, if-- do you want to warm up the crowd for a moment and I'll be
right back? So you want to introduce yourself? AUDIENCE: Yeah, hi, guys. I'm Rave. Yeah! DAVID J.
MALAN: All right, I think I am ready. Thank you for stalling there.

- [6:50:32](https://www.youtube.com/watch?v=undefined&t=24632s) AUDIENCE: Of course. DAVID J.


MALAN: And I didn't catch your name. What was your name? AUDIENCE: I'm Rave. DAVID J. MALAN: I'm
sorry? AUDIENCE: Rave, like a party. DAVID J. MALAN: Rave, OK. Nice to meet you. Come on over. So Rave has
kindly volunteered now. And I'm going to give you an additional advantage this time.

- [6:50:43](https://www.youtube.com/watch?v=undefined&t=24643s) AUDIENCE: OK. DAVID J. MALAN:


Unbeknownst to you, I now took numbers behind the doors, but I sorted them for you. So they're not in
the same random order like they were for [? Nomira. ?] You now have the advantage to know that the
numbers are sorted from small to big. AUDIENCE: OK. DAVID J.

- [6:51:00](https://www.youtube.com/watch?v=undefined&t=24660s) MALAN: Given that, and given


perhaps what we talked about in week zero with the phone book, where might you propose we begin
the story this time? With which locker? AUDIENCE: To find zero? DAVID J. MALAN: Let's find number six
this time. Let's make things interesting. AUDIENCE: OK. I'll start in the middle. DAVID J. MALAN: OK, so
the middle. There's seven total.

- [6:51:13](https://www.youtube.com/watch?v=undefined&t=24673s) So-- AUDIENCE: OK. DAVID J.


MALAN: --that would be right here. Go ahead. Open that up. And you find, sadly, the number five. So
what do you know now? AUDIENCE: I know to go up. DAVID J. MALAN: OK. AUDIENCE: OK. DAVID J.
MALAN: All right, and just to keep it uniform, just like I did, I opened to the right half of the phone book.

- [6:51:27](https://www.youtube.com/watch?v=undefined&t=24687s) AUDIENCE: Yes. DAVID J. MALAN:


Let's keep it similar. Yeah. AUDIENCE: All right. DAVID J. MALAN: All right, and, uh, a little too far even
though I know you wanted to go one over. AUDIENCE: All good, all good. DAVID J. MALAN: And now
we're going to go which direction? AUDIENCE: Over here in the middle.

- [6:51:39](https://www.youtube.com/watch?v=undefined&t=24699s) DAVID J. MALAN: Right, and voila,


the number six. All right, so very nicely done. A little stressful for you as well. Thank you again. So here
we see by nature of the locker door still being open sort of an artifact of the greater efficiency, it would
seem, of this algorithm because now that Rave was given the assumption that these numbers are sorted
from small on the left to large on the right, she was able to apply that same divide and conquer
algorithm from week zero which we're now going to give a name--
- [6:52:08](https://www.youtube.com/watch?v=undefined&t=24728s) binary search. And simply by
starting in the middle and realizing, OK, too small, then by going to the right half and realizing, oh, went
a little too far, then by going to the left half, Rave was able to find in just three steps instead of seven
the number six in this case that we were actually searching for.

- [6:52:28](https://www.youtube.com/watch?v=undefined&t=24748s) So you can see that this would


seem to be more efficient. Let's consider for just a moment is it correct. If I had used different numbers
but still sorted them from left to right, would it still have worked this algorithm? You're nodding your
head. Can I call on you? Like, why would it still have worked, do you think? AUDIENCE: [INAUDIBLE]
DAVID J.

- [6:52:53](https://www.youtube.com/watch?v=undefined&t=24773s) MALAN: Yeah, so so long as the


numbers are always in the same order from left to right or, heck, they could even be in reverse order, so
long as it's consistent, the decisions that Rave was making-- if greater than, else, if less than-- would
guide us to the solution no matter what. And it would seem to take fewer steps. So if we consider now
the pseudo code for this algorithm, let's take a look how we might describe binary search.

- [6:53:13](https://www.youtube.com/watch?v=undefined&t=24793s) So binary search we might


describe with something like this. If the number is behind the middle door, which is where Rave began,
then we can just return true. Else if the number is less than the middle door, so if six is less than
whatever is behind the middle door, then Rave would have searched the left half.

- [6:53:30](https://www.youtube.com/watch?v=undefined&t=24810s) Else if the number is greater than


the middle door, Rave would have searched the right half. Else, if there are no doors-- and we'll see in a
moment why I put this up top just to keep things clean. If there's no doors, what should Rave have
presumably returned immediately if I gave her no lockers to work with? Just returned false.

- [6:53:48](https://www.youtube.com/watch?v=undefined&t=24828s) But this is an important case to


consider because in the process of searching locker by locker, we might have whittled down the
problem from seven doors to three doors to one door to zero doors-- and at that point, we might have
had no doors left to search. So we have to naturally have a scenario for just considering if there were no
doors.

- [6:54:07](https://www.youtube.com/watch?v=undefined&t=24847s) So it's not to say that maybe I


don't give Rave any doors to begin with. But as she divides and divides and divides, if she runs out of
lockers to ask those questions of-- or a few weeks ago, if I ran out of phone book pages to tear in half, I
too might have had to return false as in this case. So how can we now describe this a little more like C
just to give ourselves a variable to start thinking and talking about? Well, I might talk about doors as
being an array.

- [6:54:33](https://www.youtube.com/watch?v=undefined&t=24873s) And so if I want to express the


middle door, I could just, in pseudo code, say doors bracket middle. I'm assuming that someone has
done the math to figure out what the middle door is, but that's easy enough to do. And then, if
the number we're looking for is less than doors bracket middle, then search doors bracket 0 through doors
bracket middle minus 1.

- [6:54:54](https://www.youtube.com/watch?v=undefined&t=24894s) So again, this is a more pedantic


way of taking what's a pretty intuitive idea-- search the left half, search the right half-- but start to now
describe it in terms of actual indices or indexes like we did with our array notation. The last scenario, of
course, is if the number is greater than doors bracket middle, then Rave would have wanted to
search the middle door plus 1-- so 1 over-- through doors n minus 1-- through n minus 1.
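
The pseudocode above maps fairly directly onto a recursive C function. This is a sketch rather than the lecture's own code: the name `binary_search` and the `lo`/`hi` parameters are assumptions introduced to express "search doors bracket 0 through doors bracket middle minus 1" and "doors bracket middle plus 1 through doors bracket n minus 1". It assumes the array is sorted from small to large.

```c
#include <stdbool.h>

// Search doors[lo..hi] (inclusive) for target.
// Precondition: doors is sorted in ascending order.
bool binary_search(const int doors[], int lo, int hi, int target)
{
    // If there are no doors left, the number cannot be here
    if (lo > hi)
    {
        return false;
    }

    // Index of the middle door, written to avoid overflow in lo + hi
    int middle = lo + (hi - lo) / 2;

    if (doors[middle] == target)
    {
        return true;
    }
    else if (target < doors[middle])
    {
        // Search the left half: doors[lo] through doors[middle - 1]
        return binary_search(doors, lo, middle - 1, target);
    }
    else
    {
        // Search the right half: doors[middle + 1] through doors[hi]
        return binary_search(doors, middle + 1, hi, target);
    }
}
```

For n doors, the initial call would be `binary_search(doors, 0, n - 1, target)`. With the sorted doors 0, 2, 4, 5, 6, 7, 8 and a target of 6, the function looks at 5, then 7, then 6, the same three doors Rave opened.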

- [6:55:23](https://www.youtube.com/watch?v=undefined&t=24923s) So again, just a way of sort of


describing a little more syntactically what it is that's going on. So how might we translate this now into
big O notation? Well, in the worst case, how many steps total might Rave's binary search algorithm have
taken? Given seven doors or given more generically n doors, how many times could she go left or go
right before finding herself with one or no doors left? What's the way to think about that? Yeah, in the
middle? AUDIENCE: Log n.

- [6:55:56](https://www.youtube.com/watch?v=undefined&t=24956s) DAVID J. MALAN: Log n. So there's


log n again. And even if you're not feeling wholly comfortable with your logarithm still, pretty much in
programming and in computer science more generally, any time we talk about some algorithm that's
dividing and conquering in half, in half, in half, or any other multiple, it's probably involving logarithms in
some sense.

- [6:56:13](https://www.youtube.com/watch?v=undefined&t=24973s) And log base 2 of n essentially refers


to the number of times you can divide n by 2 until you bottom out at just a single door or equivalently
zero doors left. So log n. So we might say that indeed, binary search is in big O of log n because the door
that Rave opened last, this one, happened to be three doors away.
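
The "divide by 2 until nothing is left" idea can be counted directly. The helper below is a hypothetical illustration (the name `max_probes` is not from the lecture). It assumes each door opened leaves at most half of the remaining doors, rounded down, which is the case when the middle door is always chosen.

```c
// Count how many times n doors can be halved before no doors are left.
// For n >= 1 this equals floor(log2(n)) + 1, the most doors a halving
// search would ever need to open.
unsigned int max_probes(unsigned int n)
{
    unsigned int steps = 0;
    while (n > 0)
    {
        n /= 2; // each probe discards at least half of what remains
        steps++;
    }
    return steps;
}
```

For seven doors the count goes 7, 3, 1, 0, which is three steps. Even a million doors need only 20, which is why the logarithm grows so much more slowly than n.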

- [6:56:35](https://www.youtube.com/watch?v=undefined&t=24995s) And actually, if you do the math


here, that roughly works out to be exactly that case. If we add one, that's sort of out of seven doors or
roughly eight, we were able to search it in just three total steps. What about omega notation, though?
Like, in the best case, Rave might have gotten lucky. She opened the door, and there it is.

- [6:56:54](https://www.youtube.com/watch?v=undefined&t=25014s) So how might we describe a lower


bound on the running time of binary search. Yeah. AUDIENCE: 1. DAVID J. MALAN: Say again? AUDIENCE:
1. DAVID J. MALAN: Omega of 1. So here too, we see that in some cases binary search and linear search,
eh, like, they're pretty equivalent. And so this is why it's sometimes compelling to consider both the best
case and the worst case because honestly, in general, who really cares if you just get lucky once in a while
and your algorithm is super fast? What you probably care about is what's the worst case.

- [6:57:25](https://www.youtube.com/watch?v=undefined&t=25045s) How long are my users-- how long


am I going to be sitting there watching some spinning hourglass or beach ball trying to give myself an
answer to a pretty big problem? Well, odds are, you're going to generally care about big O notation. So
indeed, moving forward, we'll generally talk about the running time of algorithms often in terms of big O,
a little less so in terms of omega.

- [6:57:45](https://www.youtube.com/watch?v=undefined&t=25065s) But understanding the range can


be important depending on the nature of the data that you're going to actually be given here. All right let
me pause and see if there are any questions. Any questions here? Yes, thank you. AUDIENCE: So this
method is clearly more efficient, but it requires that the information is all compiled in a certain order.

- [6:58:11](https://www.youtube.com/watch?v=undefined&t=25091s) How do you ensure that you can


compile information in a particular order at scale? DAVID J. MALAN: Yeah, it's a really good question. And
if I can generalize it, how do you guarantee that you can do this at scale, which algorithm is better? I've
sort of led us down this road of implying that Rave's second algorithm, binary search, is better because
it's so much faster.

- [6:58:30](https://www.youtube.com/watch?v=undefined&t=25110s) It's log of n in the worst case


instead of big O of n. But Rave was given an advantage when she came up here in that the doors were
already sorted. And so that sort of invites the question, well, given a whole bunch of random data, either
a small data set or, heck, something Google sized with millions, billions of pieces of data, should you sort
it first from smallest to largest and then search? Or should you just dive right in and search it linearly?
Like, how might you think about that? If you are Google, for instance, and you've

- [6:59:00](https://www.youtube.com/watch?v=undefined&t=25140s) got millions, billions of web


pages, should they just go with linear search because it's always going to work even though it might be
slow? Or should they invest the time in sorting all of that data-- we'll see how in a bit-- and then search it
more efficiently? Like, how do you decide between those options? AUDIENCE: If you're sorting the data,
then wouldn't you have to go through all of the data? DAVID J.

- [6:59:23](https://www.youtube.com/watch?v=undefined&t=25163s) MALAN: Yeah, if you had to sort


the data first-- and we don't yet formally know how to do this. But obviously, as humans, we could
probably figure it out. You do have to look at all of the data anyway. And so you're sort of wasting your
time if you're sorting it only then to go and search it. But maybe it depends a bit more. Like, that's
absolutely right, and if you're just searching for one thing in life, then that's probably a waste of time to
sort it and then search it because you're just adding to the process.

- [6:59:46](https://www.youtube.com/watch?v=undefined&t=25186s) But what's another scenario in


which you might not worry about that whereby it might make sense to sort it and then search? Yeah.
AUDIENCE: [INAUDIBLE] you can go and use the other values as a way to find out what's happening.
DAVID J. MALAN: Yeah, exactly. So if your problem is a Google-like problem where you have more than
just one user who's searching for more than just one website page, probably you should incur the cost
up front and sort the whole thing because every subsequent request thereafter

- [7:00:15](https://www.youtube.com/watch?v=undefined&t=25215s) is going to be faster, faster, faster


because it's going to [INAUDIBLE] algorithm of binary search, binary search, binary search that's going to
add up to be way fewer steps than doing linear search multiple times. So again, kind of depends on the
use case and kind of depends on how important it is.
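
The "sort once up front, then search many times" trade-off can be sketched with the C standard library. This is an assumption-laden illustration, not what the lecture does at this point: `qsort` and `bsearch` from `stdlib.h` stand in for the sorting algorithms covered later and for a hand-written binary search, and the names `compare_ints`, `sort_numbers`, and `contains` are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

// Comparison callback shared by qsort and bsearch: ascending order of ints.
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y); // negative, zero, or positive without overflow
}

// Pay the sorting cost once, up front.
void sort_numbers(int numbers[], size_t n)
{
    qsort(numbers, n, sizeof(int), compare_ints);
}

// Every later lookup searches the already-sorted array.
bool contains(const int sorted[], size_t n, int target)
{
    return bsearch(&target, sorted, n, sizeof(int), compare_ints) != NULL;
}
```

If only one lookup is ever needed, calling `sort_numbers` first is extra work compared with a single linear scan. If there are many lookups, the one-time sort is spread across all of them.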

- [7:00:30](https://www.youtube.com/watch?v=undefined&t=25230s) And this happens even in real


world contexts. I think back always to graduate school, when I was writing some code to analyze some
large data set. And honestly, it was actually easier at the time for me to write pretty inefficient but
hopefully correct code because you know what? I could just go to sleep for eight hours and let it analyze
this really big data set.

- [7:00:48](https://www.youtube.com/watch?v=undefined&t=25248s) I didn't have to bother writing


more complex code to sort it just to run it more efficiently. Why? Because I was the only user, and I only
needed to run these queries once. And so this was kind of a reasonable approach, reasonable until I
woke up eight hours later and my code was incorrect. And now I had to spend another eight hours
rerunning it after fixing it.
- [7:01:05](https://www.youtube.com/watch?v=undefined&t=25265s) But even there, you see an
example where, what is your most precious resource? Is it time to run the code? Is it time to write the
code? Is it the amount of memory the computer is using? These are all resources we'll start to talk about
because it really depends on what your goals are. Any questions, then, on upper bounds, lower bounds,
or each of these two searches, linear or binary? Yeah.

- [7:01:27](https://www.youtube.com/watch?v=undefined&t=25287s) AUDIENCE: So just, when you're


calculating running time, does the sorting step count for that time? DAVID J. MALAN: When analyzing
running time, does the sorting step count? If you want it to if you actually do it. At the moment, it did
not apply. I just gave Rave the luxury of knowing that the data was sorted.

- [7:01:45](https://www.youtube.com/watch?v=undefined&t=25305s) But if I really wanted to charge


her for the amount of time it took to find that number six, I should have added the time to sort plus the
time to search. And in fact, that's a road we'll go down. Why don't we go ahead and pace ourselves as
before? Let's take a 10 minute break here. And when we come back, we'll write some actual code.

- [7:02:01](https://www.youtube.com/watch?v=undefined&t=25321s) So we've seen a couple of


searches-- linear search and binary search, which, to be fair, we saw back in week zero. But let's actually
translate at least one of those now to some code using this building block from last week where we can
actually define an array if we want, like an array of integers called numbers.

- [7:02:17](https://www.youtube.com/watch?v=undefined&t=25337s) So let me switch over to VS Code


here. Let me go ahead and start a program called numbers.c. And in numbers.c, let me go ahead here.
And how about let's include our familiar header files? So cs50.h. I'll include stdio.h so that we can get
input and print output if we want. And now I'm going to go ahead and give myself int main void.

- [7:02:39](https://www.youtube.com/watch?v=undefined&t=25359s) No command line arguments


today. So I'll leave that as void. And I'm going to go ahead and give myself an array of how about seven
numbers? So I'll call it int numbers bracket 7. And then I can fill this array with numbers. Like, numbers bracket 0
can be the number 4, and numbers bracket 1 could be the number 6, and numbers bracket 2 can be the
number 8.

- [7:02:59](https://www.youtube.com/watch?v=undefined&t=25379s) And this is the same list that we


saw with [? Nomira ?] a bit ago where it was 4, then 6, then 8. But you know what? There's actually
another syntax I can show you here. If you know in advance in a C program that you want an array of
certain values and you know therefore how many of those values you want, you can actually do this little
trick using curly braces.

- [7:03:18](https://www.youtube.com/watch?v=undefined&t=25398s) You can say, don't worry about


how big this is. It's going to be implicit by way of these curly braces. Here, I can do 4, 6, 8, 2, 7, 5, 0, close
curly brace. So it's a somewhat new use of curly braces. But this has the effect of giving me an array
called numbers inside of which are a whole bunch of integers.

- [7:03:37](https://www.youtube.com/watch?v=undefined&t=25417s) How many? The compiler can


infer it from whatever's inside these curly braces. And it seems to be of size 1, 2, 3, 4, 5, 6, 7. And all
seven elements will be initialized with 4, 6, 8, 2, 7, 5, 0 respectively. So just a minor optimization code
wise to tighten up what would have otherwise been like eight separate lines of code.
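
The curly-brace initializer described above looks like this in C. The wrapper function `count_numbers` is a hypothetical addition, used here only to show that the compiler really does infer seven elements.

```c
#include <stddef.h>

// Returns the number of elements the compiler inferred from the initializer.
size_t count_numbers(void)
{
    // No size between the brackets: the compiler counts the values in braces
    int numbers[] = {4, 6, 8, 2, 7, 5, 0};

    // Total bytes in the array divided by bytes per element
    return sizeof(numbers) / sizeof(numbers[0]);
}
```

The `sizeof` division only works where the array itself is declared. Once an array is passed to another function, its length has to be passed along separately.
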
- [7:03:56](https://www.youtube.com/watch?v=undefined&t=25436s) Now let's go ahead and
implement linear search, as we called it. And you can do this in a bunch of ways, but I'm going to do it
like this. For int i gets 0, i is less than 7, i plus plus. Then inside of my loop, I'm going to ask the question,
well, if the numbers at location i equals equals, as we asked of [? Nomira, ?] the number 0, then I'm
going to go ahead and do something like printf found backslash n.

- [7:04:26](https://www.youtube.com/watch?v=undefined&t=25466s) And then I'm going to return 0.


Just because of last week's discussion of returning a value for main when all is well, I'm going to return 0
by convention just to signal that indeed, I found what I'm looking for. Otherwise, on what line do I want
to go and add a printf, like, not found and return something other than 0? Right, I don't think I want an
else here per our pseudo code earlier.

- [7:04:51](https://www.youtube.com/watch?v=undefined&t=25491s) So on what line would you prefer I


sort of insert a default scenario of not found and I'll return an error? Yeah, over here? [INTERPOSING
VOICES] DAVID J. MALAN: Nice. So at the end of the for loop because you want to give the program or
our volunteer earlier a chance to go through all of the doors, all of the numbers.

- [7:05:11](https://www.youtube.com/watch?v=undefined&t=25511s) But if you go through the whole


thing, through the whole loop, at the very end, you probably just want to conclude not found backslash
n and then return something like positive 1 just to signify that an error happened. And again, this was a
minor detail last week. Any time main is successful, the programming convention is to return 0.

- [7:05:29](https://www.youtube.com/watch?v=undefined&t=25529s) That means all is well. And if


something goes wrong, like you didn't find what you're looking for, you might return something other
than 0, like positive 1, maybe positive 2, or even negative numbers if you want. All right, well, let me go
ahead and save this. Let me do make numbers. Hopefully no syntax errors.

- [7:05:46](https://www.youtube.com/watch?v=undefined&t=25546s) All good so far. dot slash numbers,


enter. All right, and it's found, as I would hope it would be. And just as a little check, let's search for
something that's definitely not there, like the number negative 1. Let me go ahead and recompile the
code with make numbers. Let me rerun the code with dot slash numbers and hopefully-- whew, OK, not
found.
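
A sketch of the logic dictated above follows. In the lecture the loop lives directly inside `main` of numbers.c with the array and the target hard-coded. Here it is pulled into a hypothetical helper, `search_numbers`, so the two return values are easy to see. The key detail is that the "Not found" branch sits after the loop, not in an `else` inside it.

```c
#include <stdio.h>

// Returns 0 (success) if target is among the n numbers, 1 otherwise,
// mirroring the exit-status convention used for main.
int search_numbers(const int numbers[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (numbers[i] == target)
        {
            printf("Found\n");
            return 0;
        }
    }

    // Only reached after every element has been checked
    printf("Not found\n");
    return 1;
}
```

Putting the "Not found" message in an `else` inside the loop would report failure as soon as the first element did not match, before the remaining elements had been examined.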

- [7:06:06](https://www.youtube.com/watch?v=undefined&t=25566s) So proof by example seems to be


working correctly. But let's make things a little more interesting now. Right now, I'm using just an array of
integers. Let me go ahead and introduce maybe an array of strings instead. And maybe this time, I'll store
a bunch of names and not just integers but actual strings of names.

- [7:06:23](https://www.youtube.com/watch?v=undefined&t=25583s) So how might I do this? Well, let


me go back to my code here. I'm going to switch us over to maybe a file called names.c. And in here, I'll
go ahead and include cs50.h. I'll include stdio.h. And I'm going to go ahead and for now include a
new friend from last week, string.h, which gives me some string-related functionality.

- [7:06:45](https://www.youtube.com/watch?v=undefined&t=25605s) Int main void because I'm not


going to bother with any command line arguments for now. And now if I want an array of strings, I could
do something like this-- string names bracket 7. And then I could start doing like before. Names bracket 0
could be someone like Bill, and names bracket 1 could be someone like Charlie and so forth.
- [7:07:05](https://www.youtube.com/watch?v=undefined&t=25625s) But there's this new improvement
I can make. Let me just let the compiler figure out how many names there are. And using curly braces, I'll
do Bill and then Charlie and then Fred and then George and then Ginny and then Percy and then Ron if
there's the pattern there. All right, so now I have these seven names as strings.

- [7:07:27](https://www.youtube.com/watch?v=undefined&t=25647s) Let's do something similar. So for


int i gets 0, i is less than 7 as before, i plus plus as before. And inside of the loop, let's this time check for
the string in question, and suppose we're searching for Ron arbitrarily. He is there, so we should
eventually find him. Let me go ahead and say if names bracket i equals quote unquote Ron, then inside
of my if condition, I'm going to say printf found just like before.

- [7:07:56](https://www.youtube.com/watch?v=undefined&t=25676s) And I'm going to return 0 just


because all is well. And I'm going to take your advice from the get go this time and, at the end of the
loop, print out not found because if I get this far, I have not printed found, and I have not returned
already. So I'm just going to go ahead and return 1 after printing not found.

- [7:08:12](https://www.youtube.com/watch?v=undefined&t=25692s) All right, let me go ahead and


cross my fingers as always. Make names this time. And it doesn't seem to like my code here. This is
perhaps a new error that you might not have seen yet in names.c line 11. So that's this line here, my if
condition. Result of comparison against a string literal is unspecified.

- [7:08:32](https://www.youtube.com/watch?v=undefined&t=25712s) Use an explicit string comparison


function instead. I mean, that's kind of a mouthful, and the first time you see it, you're probably not
going to know how to make sense of that. But it does kind of draw our attention to something being
awry with the equality checking here, with equals equals and Ron.

- [7:08:48](https://www.youtube.com/watch?v=undefined&t=25728s) And here's where again we've


been telling sort of a white lie for the past couple of weeks. Strings are a thing in C. Strings are a thing in
programming. But recall from last week, I did disclaim there's no such thing as a string data type
technically because it's not a primitive in the way an int and a float and a bool are that are sort of built
into the language.

- [7:09:09](https://www.youtube.com/watch?v=undefined&t=25749s) You can't just use equals equals


to compare two strings. You actually have to use a special function that's in this header file we talked
briefly about last week. In that header file was string length or strlen. But there are other functions in there
as well. Let me, in fact, go ahead and open up the manual pages.

- [7:09:27](https://www.youtube.com/watch?v=undefined&t=25767s) And if we go to string.h-- let me


scroll down a bit. In string.h you can perhaps infer what function will probably take the place of equals
equals for today. What do we want to use? Yeah. AUDIENCE: Strcmp? DAVID J. MALAN: So strcmp, S-T-R-
C-M-P, which apparently compares two strings. And if I click on that, we'll see more information.

- [7:09:50](https://www.youtube.com/watch?v=undefined&t=25790s) And indeed, if I click on strcmp,


we'll see under the synopsis that, OK, I need to use the CS50 header file and string.h, as I already have.
Here is its prototype, which is telling me that strcmp takes two strings, S1 and S2, that are presumably
going to be compared. And it returns an integer, which is interesting.
- [7:10:10](https://www.youtube.com/watch?v=undefined&t=25810s) So let's read on. The description
of this function is that it compares two strings case sensitively. So uppercase or lowercase matters, just
FYI. And then let's look at the return value here. This function returns an int less than
0 if S1 comes before S2, 0 if S1 is the same as S2, or an int greater than 0 if S1 comes after S2.

- [7:10:35](https://www.youtube.com/watch?v=undefined&t=25835s) So the reason that this function


returns an integer and not just a bool, true or false, is that it actually will allow us to sort these things
eventually because if you can tell me if two strings come in this order or in this order or they're the
same, you need three possible return values. And a bool, of course, only gives you two, but an int gives
you like 4 billion even though we just need the 3.

- [7:10:57](https://www.youtube.com/watch?v=undefined&t=25857s) So 0 or a positive number or a


negative number is what this function returns. And the documentation goes on to explain what we mean
by ASCIIbetical order. Recall that capital A is 65, capital B is 66, and it's those underlying ASCII or Unicode
numbers that a computer uses to figure out whether something comes before it or after it like in the
dictionary.
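
The three kinds of return value can be seen in a small sketch. The helper name `compare_order` is hypothetical. It collapses whatever `strcmp` returns to -1, 0, or 1, since the C standard only promises the sign of the result, not a specific magnitude.

```c
#include <string.h>

// Returns -1 if s1 comes before s2, 0 if they are the same,
// and 1 if s1 comes after s2, based on the sign of strcmp's result.
int compare_order(const char *s1, const char *s2)
{
    int result = strcmp(s1, s2);
    return (result > 0) - (result < 0);
}
```

Because the comparison uses the underlying character codes, "Bill" comes before "Charlie", and "Ron" comes before "ron": capital R is 82 while lowercase r is 114.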

- [7:11:18](https://www.youtube.com/watch?v=undefined&t=25878s) But for our purposes now, we


only care about equality. So I'm going to go ahead and do this. If I want to compare names bracket i
against Ron, I use str compare or strcmp, names bracket i comma, quote unquote, Ron. So it's a little
more involved than actually using equals equals, which does work for integers, longs, and certain other
values.

- [7:11:41](https://www.youtube.com/watch?v=undefined&t=25901s) But for strings, it turns out we


need to use a more powerful function. Why? Well, last week, recall what a string really is. It's an array of
characters. And so whereas you can use equals equals for single characters, strcmp, as we'll eventually
see, is going to compare multiple characters for us. There's more logic there.

- [7:12:00](https://www.youtube.com/watch?v=undefined&t=25920s) There's a loop needed, and that's


why it comes with the string library. But it doesn't just work out of the box with equals equals alone.
That would literally be comparing two things, not two arrays of things. And we'll come back to this next
week as to what's really going on under the hood.
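
To illustrate the loop mentioned above, here is a rough sketch of how a string comparison could be written by hand. It is not the actual library implementation, and the name `compare_strings` is hypothetical. It assumes both arguments are ordinary C strings ending in the null character.

```c
#include <stddef.h>

// Walk both strings character by character until they differ or s1 ends.
// Returns a negative number, zero, or a positive number, like strcmp.
int compare_strings(const char *s1, const char *s2)
{
    size_t i = 0;

    // Keep going while the characters match and s1 has not ended
    while (s1[i] != '\0' && s1[i] == s2[i])
    {
        i++;
    }

    // Difference of the first mismatching characters (or 0 if both ended)
    return (unsigned char) s1[i] - (unsigned char) s2[i];
}
```

The loop is what equals equals alone does not provide: a single comparison can check one value against another, but not every character in two arrays.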

- [7:12:14](https://www.youtube.com/watch?v=undefined&t=25934s) So let me go ahead and fix one


bug that I just realized I made. I want to check if the return value of str compare is equal to 0 because
per the documentation, that meant they're the same. All right, let me go ahead and make names this
time. Now it compiles. Dot slash names, Enter, found. And just as a sanity check, let's check someone
outside the family.

- [7:12:40](https://www.youtube.com/watch?v=undefined&t=25960s) Searching now for Hermione after


recompiling the code, after rerunning the code. And she's not, in fact, found. So here's just a similar
implementation of linear search not for integers this time but instead for strings, the subtlety really
being we need a helper function, str compare, to actually do the legwork for us of comparing two arrays
of characters.
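
A sketch of the string version of linear search follows. As with the integer example, the lecture writes this directly in `main` of names.c and uses CS50's `string` type. To stay independent of cs50.h, this sketch assumes `const char *` instead, and the helper name `search_names` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

// Returns 0 if target matches one of the n names exactly, 1 otherwise.
int search_names(const char *names[], int n, const char *target)
{
    for (int i = 0; i < n; i++)
    {
        // strcmp returns 0 when the two strings are the same
        if (strcmp(names[i], target) == 0)
        {
            printf("Found\n");
            return 0;
        }
    }

    printf("Not found\n");
    return 1;
}
```

With the seven names from the lecture (Bill, Charlie, Fred, George, Ginny, Percy, Ron), searching for "Ron" returns 0 and searching for "Hermione" returns 1.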

- [7:13:02](https://www.youtube.com/watch?v=undefined&t=25982s) All right, questions on either of


these implementations-- yeah, in the middle? AUDIENCE: So, if I do [INAUDIBLE] DAVID J. MALAN: Ah,
good question. If I had not fixed what I claimed was a mistake earlier and I did this-- and we saw an
example of this last week, actually. If a function returns an integer, be it negative or positive or 0, when
you get back 0, the expression, the Boolean expression, will be considered false.

- [7:13:26](https://www.youtube.com/watch?v=undefined&t=26006s) So 0 equals false always. If a


function returns any positive number, or any negative number, that's going to be interpreted as true
even if it's positive or negative, whether it's 1, negative 1, 2, negative 2. And so if I did this, this would be
saying the opposite. So if I were to say this, if str compare of names bracket i and Hermione, that's
implicitly like saying this does not equal 0, or it means sort of is true, but you don't want to check for
true because, again, we're comparing integers here.

- [7:14:02](https://www.youtube.com/watch?v=undefined&t=26042s) So the reason I did 0 here in this


case is that it explicitly checks for the return value that means they're the same. And yeah. Follow up?
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yes, you might not have seen this yet, but you can express the
equivalent because if you want to check if this is false, you can actually use an exclamation point, known
as a bang in programming, that inverts the meaning.

- [7:14:29](https://www.youtube.com/watch?v=undefined&t=26069s) So false becomes true, true


becomes false. So this would be another way of expressing it. This is arguably a worse design, though,
because the documentation explicitly says you should be checking for 0 or a positive value or a negative
value, and this little trick, while correct, and I think you can make a reasonable case for it, sort of hides
that detail.

- [7:14:50](https://www.youtube.com/watch?v=undefined&t=26090s) And I would argue instead for the


first way, checking for equals equals 0 instead. And if that's a little subtle, not to worry. We'll come back
to little syntactic tricks like that before long. Other questions on linear search in these two forms. Is there
another hand or hands? Two hands? No? OK, just holler if I missed.

- [7:15:10](https://www.youtube.com/watch?v=undefined&t=26110s) So let's now actually take this one


step further. Suppose that we want to write a program that maybe implements something a little more
like a phone book that has both names and numbers and not just integers but actual phone numbers.
Well, we could escalate things like this. We could now have two arrays-- one called names, one called
numbers.

- [7:15:27](https://www.youtube.com/watch?v=undefined&t=26127s) And I'm going to use strings for


the numbers now, the phone numbers, because in most communities, phone numbers might have
dashes, pluses, parentheses, so something that really looks more like a string even though we call it a
phone number. Probably don't want to use an int lest we throw away those kinds of details.

- [7:15:43](https://www.youtube.com/watch?v=undefined&t=26143s) So let me switch back to VS Code


here, and let's do one more program, this one in a file called phonebook.c. And now let me go ahead and
do the same. Let me include cs50.h. Let me include stdio.h, and let me include string.h. I'm going
to again do int main void. And then inside of my program, I'm going to give myself two arrays-- the
efficient way this time.

- [7:16:07](https://www.youtube.com/watch?v=undefined&t=26167s) String names will be just two of us


this time. How about Carter and me? And then I'll give myself-- oops, typo already. If I want this to be an
array, I don't have to specify the number. The compiler can count for me. But I do need the square
brackets. Then for numbers, I'm again going to use a string array specifying with the curly braces that
how about Carter can be at 1-617-495-1000.

- [7:16:34](https://www.youtube.com/watch?v=undefined&t=26194s) And how about my own number


here-- 1-949-468-- oh pattern appearing-- 2750 will be mine. Why mine? Well, I'm just kind of lined
things up. So Carter's number is apparently first in this array, and I'm claiming that he'll be first in this
array, respectively. I, David, will be the first-- the second in the names array and second in the numbers
array.

- [7:16:57](https://www.youtube.com/watch?v=undefined&t=26217s) If you want to have a little fun


with programming, feel free to text or call me some time at that number. So now let's actually use this
data in some way. Let's go ahead and actually search for my own name and number here. So let me do
for int i gets 0. There's two of us this time-- so i less than 2 and then i plus plus as before.

- [7:17:16](https://www.youtube.com/watch?v=undefined&t=26236s) And now I'm going to practice


what I preached earlier, and I'm going to use str compare to find my name in this case. And I'm going to
say if strcmp of names bracket i comma quote unquote David, and that equals 0, meaning they're the
same, then just as before, I'm going to go ahead and print something out.

- [7:17:36](https://www.youtube.com/watch?v=undefined&t=26256s) But this time, I'm going to make


the program more useful and not just say found or not found. Now I'm implementing a phone book, like
the contacts app on iOS or Android. So I'm going to say something like, quote unquote, found percent s
backslash n and then actually plug in numbers bracket i to correspond to the current name bracket i.

- [7:17:57](https://www.youtube.com/watch?v=undefined&t=26277s) And then I'll return 0 as before.


And then down here if we get all the way through the loop and David's not there for some reason, I'm
going to print as before not found and then return 1. So let me go ahead and compile this with make
phonebook, dot slash phonebook, and it seems to have found the number.

- [7:18:14](https://www.youtube.com/watch?v=undefined&t=26294s) So this code I'm going to claim is


correct. It's kind of stupid because I've just made a phone book or a contacts app that only supports two
people. They're only going to be me and Carter. This would be like downloading the contacts app on a
phone and you can only call two people in the world.

- [7:18:28](https://www.youtube.com/watch?v=undefined&t=26308s) There's no ability to add names or


edit things. That, of course, could come later using get string or something else. But for now for the sake
of discussion, I've just hardcoded two names and two numbers. But for what it does, I claim this is
correct. It's going to find me and print out my number.

- [7:18:44](https://www.youtube.com/watch?v=undefined&t=26324s) But is it well-designed? Let's start


to now consider if we're not just using arrays, but are we using them well? We started to use them last
week, but are we using them well this week? And what might I even mean by using an array well or
designing this program well? Any critiques or concerns with why this might not be the best road for us to
be going down when I want to implement something like a phone book with pieces of information? It
seems all too vulnerable to just mistakes.

- [7:19:16](https://www.youtube.com/watch?v=undefined&t=26356s) For instance, if I screw up the


actual number of names in the names array such that it's now more or less than is in the numbers array
or vice versa, it feels like there's not a tight relationship between those pieces of data, and it's just sort of
trusting on the honor system that any time I use names bracket i that it lines up with numbers bracket
i.

- [7:19:38](https://www.youtube.com/watch?v=undefined&t=26378s) And that's fine. If you're the one


writing the code, you're probably not going to really screw this up. But if you start collaborating with
someone else or the program is getting much, much longer, the odds that you or your colleagues
remember that you're sort of just trusting that names and numbers line up like this is going to fail
eventually.

- [7:19:55](https://www.youtube.com/watch?v=undefined&t=26395s) Someone's not going to realize


that, and just, the code is going to break. And you're going to start outputting the wrong numbers for
names, which is to say it'd be much nicer if we could somehow couple these two pieces of data, names
and numbers, a little more tightly together so that you're not just trusting that these two independent
variables, names and numbers, have this kind of relationship with themselves.

- [7:20:17](https://www.youtube.com/watch?v=undefined&t=26417s) So let's consider how we might


solve this. A new feature today that we'll introduce is generally known as a data structure. In C, we have
the ability to invent our own data types, if you will-- data types that the authors of C decades ago just
didn't envision or just didn't think were necessary because we can implement them ourselves-- similar to
Scratch just as you could create custom puzzle pieces, or in C, you can create custom functions.

- [7:20:41](https://www.youtube.com/watch?v=undefined&t=26441s) So in C, can you create your own


types of data that go beyond the built in ints and floats and even strings? You can make, for instance, a
person data type or a candidate data type in the context of elections or a person data type more
generically that might have a name and a number. So how might we do this? Well, let me go here and
propose that if we want to define a person, wouldn't it be nice if we could have a person data type, and
then we could have an array called people? And maybe that array is our only array with two things

- [7:21:17](https://www.youtube.com/watch?v=undefined&t=26477s) in it, two persons in it. But


somehow, those data types, these persons, would have both a name and a number associated with
them. So we don't need two separate arrays. We need one array of persons, a brand new data type. So
how might we do this? Well, if we want every person in the world or in this program to have a name and
a number, we literally write out first those two data types.

- [7:21:41](https://www.youtube.com/watch?v=undefined&t=26501s) Give me a string called name.


Give me a string called number, semicolon after each. And then we wrap that, those two lines of code,
with this syntax, which at first glance is a little cryptic. It's a lot of words all of a sudden. But typedef is a
new keyword today that defines a new data type. This is the C key word that lets you create your own
data type for the very first time.

- [7:22:02](https://www.youtube.com/watch?v=undefined&t=26522s) Struct is another related key word


that tells the compiler that this isn't just a simple data type, like an int or a float renamed or something
like that. It actually is a structure. It's got some dimensions to it, like two things in it or three things in it
or even 50 things inside of it. The last word down here is the name that you want to give your data type,
and it weirdly goes after the curly braces.
- [7:22:26](https://www.youtube.com/watch?v=undefined&t=26546s) But this is how you invent a data
type called person. And what this code is implying is that henceforth, the compiler clang will know that a
person is composed of a name that's a string and a number that's a string. And you don't have to worry
about having multiple arrays now. You can just have an array of people moving forward.

- [7:22:48](https://www.youtube.com/watch?v=undefined&t=26568s) So how can we go about using


this? Well, let me go back to my code from before where I was implementing a phone book. And why
don't we enhance the phone book code a little bit by borrowing some of that new syntax? Let me go to
the top of my program above main and define a type that's a structure or a data structure that has a
name inside of it and that has a number inside of it.

- [7:23:09](https://www.youtube.com/watch?v=undefined&t=26589s) And the name of this new


structure again is going to be called person. Inside of my code now, let me go ahead and delete this old
stuff temporarily. Let me give myself an array called people of size 2. And I'm going to use the non-terse
way to do this. I'm not going to use the curly braces. I'm going to more pedantically spell out what I want in
this array of size 2 at location 0, which is the first person in an array because you always start counting at
0.

- [7:23:37](https://www.youtube.com/watch?v=undefined&t=26617s) I'm going to give that person a


name of quote unquote Carter. And the dot is admittedly one new piece of syntax today too. The dot
means go inside of that structure and access the variable called name and give it this value Carter.
Similarly, if I'm going to give Carter a number, I can go into people bracket 0 dot number and give that
the same thing as before plus 1-617-495-1000.

- [7:24:02](https://www.youtube.com/watch?v=undefined&t=26642s) And then I can do the same for


myself here-- people bracket-- where should I go? OK, one because again, two elements. But we started
counting at zero. Dot name equals quote unquote David. And then lastly, people bracket 1 dot
number equals quote unquote plus 1-949-468-2750. So now if I scroll down here to my logic, I don't
think this part needs to change too much.

- [7:24:30](https://www.youtube.com/watch?v=undefined&t=26670s) I'm still, for the sake of discussion,


going to iterate 2 times from i is 0 on up to but not through 2. But I think this line of code needs to
change. How should I now refer to the i-th person's name as I iterate? What should I compare quote
unquote David to this time? Let me see. On the end here? AUDIENCE: People bracket i dot name.

- [7:24:57](https://www.youtube.com/watch?v=undefined&t=26697s) DAVID J. MALAN: Yeah, people


bracket i dot name. Why? Because people is the name of the array. Bracket i is the i-th person that we're
iterating over in the current loop-- first zero, then one, maybe higher if it had more people. Then dot is
our new syntax for going inside of a data structure and accessing a variable therein which in this case is
name.

- [7:25:14](https://www.youtube.com/watch?v=undefined&t=26714s) And so I can compare David just


as before. So it's a little more verbose, but now arguably this is a better program because now these
people are full fledged data types unto themselves. There's no more honor system inside of my loop that
this is going to line up because in just a moment, I'm going to fix this one last remnant of the previous
version.
- [7:25:34](https://www.youtube.com/watch?v=undefined&t=26734s) And if I can call back on you again,
what should I change numbers bracket i to this time? AUDIENCE: [INAUDIBLE] dot number. DAVID J.
MALAN: Dot number, exactly. So gone is the honor system that just assumes that bracket i in this array
lines up with bracket i in this other array. Now why? There's only one array.

- [7:25:55](https://www.youtube.com/watch?v=undefined&t=26755s) It's an array called people. The


things it stores are persons. A person has a name and a number. And so even though it's kind of marginal
admittedly given that this is a short program and given that this kind of made things look more
complicated at first glance, we're now laying the foundation for just a better design because you really
can't screw up now the association of names with numbers because every person's name and number is,
so to speak, encapsulated inside of the same data type.

- [7:26:21](https://www.youtube.com/watch?v=undefined&t=26781s) And that's a term of art in CS.


Encapsulation means to encapsulate-- that is, contain-- related pieces of information. And thus, we have
a person that encapsulates two other data types, name and number. And this just sets the foundation for
all of the cool stuff we've talked about and you use every day.

- [7:26:40](https://www.youtube.com/watch?v=undefined&t=26800s) What is an image? Well, recall


that an image is a bunch of pixels or dots on the screen. Every one of those dots has RGB values
associated with it-- red, green, and blue. You could imagine now creating a structure in C probably where
maybe you have three values, three variables-- one called red, one called green, one called blue.

- [7:26:58](https://www.youtube.com/watch?v=undefined&t=26818s) And then you could name the


thing not person but pixel. And now you could store in C three different colors-- some amount of red,
some green, some blue-- and collectively treat it as the color of a pixel. And you could imagine doing
something similar perhaps for video or music. Music, you might have three variables-- one for the
musical note, the duration, the loudness of it.

- [7:27:17](https://www.youtube.com/watch?v=undefined&t=26837s) And you can imagine coming up


with your own data type for music as well. So this is a little low level. We're just using like a familiar
contacts application. But we now have the way in code to express most any type of data that we might
want to implement or discuss ultimately. So any questions now on struct or defining our own types, the
purposes for which are to use arrays but use them more responsibly now in a better design but also to
lay the foundation for implementing cooler and cooler stuff per our week zero discussion?

- [7:27:50](https://www.youtube.com/watch?v=undefined&t=26870s) Yeah. AUDIENCE: What's the


[INAUDIBLE] DAVID J. MALAN: What's the difference between this and an object in an object oriented
language? So slight side note, C is not object-oriented. Languages like Java and C++ and others which you
might have heard of, programmed yourself, had friends program in, are object-oriented languages. In
those languages, they have things called classes or objects, which are interrelated.

- [7:28:11](https://www.youtube.com/watch?v=undefined&t=26891s) And objects can store not just


data, like variables. Objects can also store functions, and you can kind of sort of do this in C. But it's not
sort of conventional. In C, you have data structures that store data. In languages like Java and C++, you
have objects that store data and functions together. Python is an object-oriented language as well.

- [7:28:32](https://www.youtube.com/watch?v=undefined&t=26912s) So we'll see this issue in a few


weeks, but let me wave my hands at it for now. Yeah. AUDIENCE: Could you use this [INAUDIBLE]??
DAVID J. MALAN: Yes. Could you use this struct to redefine how an int is defined? Short answer, yes. We
talked a couple of times now about integer overflow. And most recently, you might have seen me
mention the bug in iOS and Mac OS that was literally related to an int overflow.

- [7:28:53](https://www.youtube.com/watch?v=undefined&t=26933s) That's the result of ints only


storing 4 bytes or 32 bits or even as long as 64 bits or 8 bytes. But it's finite. But if you want to
implement some financial software or some scientific or mathematical software that allows you to count
way bigger than a typical int or a long, you could imagine coming up with your own structure.

- [7:29:13](https://www.youtube.com/watch?v=undefined&t=26953s) And in fact, in some languages


there is a structure called big int, which allows you to express even bigger numbers. How? Well, maybe
you store inside of a big int an array of values. And you somehow allow yourself to store more and more
bits based on how high you want to be able to count. So in short, yes.

- [7:29:31](https://www.youtube.com/watch?v=undefined&t=26971s) We now have the ability to


do most anything we want in the language even if it's not built in for us. Other questions. AUDIENCE:
[INAUDIBLE] DAVID J. MALAN: Could you define a name and a number in the same line? Sort of. It starts
to get syntactically a little messy, so I did it a little more pedantic line by line.

- [7:29:52](https://www.youtube.com/watch?v=undefined&t=26992s) Good question. Over here.


AUDIENCE: [INAUDIBLE] function you use for the function at the bottom of the [INAUDIBLE]. Could you
do something like that [INAUDIBLE]? DAVID J. MALAN: Prototypes-- you have to do in C. You have to
define anything you're going to use or declare anything you're going to use before you actually use it.

- [7:30:11](https://www.youtube.com/watch?v=undefined&t=27011s) So it is deliberate that I put it at


the top of my code in this file. Otherwise, the compiler would not know what I mean by person when I
first use it here on what's line 14. So it has to come first, or it has to be put into something like a header
file so that you include it at the very top of your code.

- [7:30:29](https://www.youtube.com/watch?v=undefined&t=27029s) Other questions over here. Yeah.


AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, good question, and we'll come back to this later in the
term when we talk about SQL, a database language, and storing things in actual databases. Generally
speaking, even though we humans call things phone numbers, or in the US, we have social security
numbers, those types of numbers often have other punctuation in it, like dashes, parentheses, pluses,
and so forth.

- [7:30:58](https://www.youtube.com/watch?v=undefined&t=27058s) You could not store any of that


syntax or that punctuation inside of an int. You could only store numbers. So one motivation for using a
string is just I can store whatever the human wanted me to store, including parentheses and so forth.
Another reason for storing things as strings, even if they look like numbers, is in the context of zip codes
in the United States.

- [7:31:17](https://www.youtube.com/watch?v=undefined&t=27077s) Again, we'll come back to this. But


long story short-- years ago, actually-- I was using Microsoft Outlook for my email client. And eventually I
switched to Gmail. And this is like 10 plus years ago now. And Outlook at the time lets you export all of
your contacts as a CSV file-- Comma Separated Values.
- [7:31:33](https://www.youtube.com/watch?v=undefined&t=27093s) More on that in the weeks to
come too. And that just means I could download a text file with all of my friends and family and their
numbers inside of it. Unfortunately, I opened that same CSV file with Excel, I think, at the time just to kind
of spot check it and see if what's in there was what was expected.

- [7:31:48](https://www.youtube.com/watch?v=undefined&t=27108s) And I must have instinctively hit,


like, Command or Control-S to save it. And Excel at least has this habit of sort of reformatting your data.
If things look like numbers, it treats them as numbers. And Apple Numbers does this too. Google
Spreadsheets does this too nowadays. But long story short, I then imported my newly saved CSV file into
Gmail.

- [7:32:07](https://www.youtube.com/watch?v=undefined&t=27127s) And now 10 plus years later, I'm


still occasionally finding friends and family members whose zip codes are in Cambridge, Massachusetts
2138, which is missing the 0 because we here in Cambridge are 02138. And that's because I treated or I
let Excel treat what looks like a number as an actual number or int, and now leading zeros become a
problem because mathematically, they mean nothing, but in the mail system, they do-- sending
envelopes and such.

- [7:32:34](https://www.youtube.com/watch?v=undefined&t=27154s) All right, other final questions


here. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, so could I have used a 2D or two dimensional array
to solve the problem earlier of having just one array? Yes, but one, I would argue it's less readable,
especially as I get lots of names and numbers. And two, that too is also kind of relying on the honor
system.

- [7:32:56](https://www.youtube.com/watch?v=undefined&t=27176s) It would be all too easy to omit


some of the square brackets in the two dimensional array. So I would argue it too is not as good as
introducing a struct. More on that down the road. Two dimensional arrays just means arrays of arrays, as
you might infer. All right, so now that we have this ability to store different types of data like contacts in a
phone book, having names and addresses, let's actually take a step back and consider how we might
now solve one of the original problems by actually sorting the information we're

- [7:33:26](https://www.youtube.com/watch?v=undefined&t=27206s) given in advance and considering,


per our discussion earlier, just how costly, how time consuming is that because that might tip the scales
in favor of sorting, then searching, or maybe just not sorting and only searching. It'll give us a sense of
just how expensive, so to speak, sorting something actually is.

- [7:33:45](https://www.youtube.com/watch?v=undefined&t=27225s) Well, what's the formulation of


this problem? It's the same thing as week zero. We've got input to sort. We want it to be output as
sorted. So for instance, if we're taking unsorted input as input, we want the sorted output as the result.
More concretely, if we've got numbers like these-- 63852741, which are just randomly arranged
numbers-- we want to get back out 12345678.

- [7:34:09](https://www.youtube.com/watch?v=undefined&t=27249s) So we just want those things to


be sorted. So again, inside of the black box here is going to be one or more algorithms that actually gets
this job done. So how might we go about doing this? Well, just to vary things a bit more, I think we have
a chance here for a bit more audience participation. But this time, we need eight people if we may.
- [7:34:28](https://www.youtube.com/watch?v=undefined&t=27268s) All of you have to be comfortable
appearing on the internet. OK, so this is actually quite convenient that you're all quite close. How about
1, 2, 3, 4, 5, 6, 7-- oh, OK, and someone volunteering their friend-- number eight. Come on down. Come
on down. And if you could, I'm going to set things up. If you all could join Valerie, my colleague over
there, to give you a prop to use here, we'll go ahead in just a moment and try to find some numbers at
hand.

- [7:34:56](https://www.youtube.com/watch?v=undefined&t=27296s) In just a moment, each of our


volunteers is going to be representing an integer. And that integer is initially going to be in unsorted
order. And I claim that using an algorithm, step by step instructions, we can probably sort these folks in
at least a couple of different ways. So they're in wardrobe right now just getting their very own Harvard
T-shirts with a jersey number on it, which will then represent an element of our array.

- [7:35:27](https://www.youtube.com/watch?v=undefined&t=27327s) Give us just a moment to finish


getting the attire ready. They're being handed a shirt and a number. And let me ask the audience for just
a moment. As we have these numbers up here on the screen, these numbers too are unsorted. They're
just in random order. And let me ask the audience. How would you go about sorting these eight numbers
on the screen? How would you go about sorting these? Yeah, what are your thoughts? AUDIENCE:
[INAUDIBLE] the number at the end, the following number.

- [7:36:05](https://www.youtube.com/watch?v=undefined&t=27365s) DAVID J. MALAN: OK. AUDIENCE:


The following number is bigger, then I keep it as it is. DAVID J. MALAN: OK. AUDIENCE: If not, then
[INAUDIBLE]. DAVID J. MALAN: OK, so just to recap, you would start with one of the numbers on the end.
You would look to the number to the right or to the left of it, depending on which end you start at.

- [7:36:19](https://www.youtube.com/watch?v=undefined&t=27379s) And if it's out of order, you would


just start to swap things. And that seems reasonable. There's a whole bunch of mistakes to fix here
because things are pretty out of order. But probably, if you start to solve small problems at a time, you
can achieve the end result of getting the whole thing sorted.

- [7:36:32](https://www.youtube.com/watch?v=undefined&t=27392s) Other instincts, if you were just


handed these numbers, how you might go about sorting them? How might you? Yeah, in the back.
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: OK, I like that. So to recap there, find the smallest one first and
put it at the beginning, if I heard you correctly. And then presumably, you could do that again and again
and again.

- [7:36:55](https://www.youtube.com/watch?v=undefined&t=27415s) And that would seem to give you


a couple of different algorithms. And if you all are attired here-- do you want to come on up if you're
ready? We had some [? felt ?] volunteers too. Come on over. So if you all would like to line yourselves up
facing the audience in exactly this order-- so whoever is number zero should be way over here, and
whoever is number five should be way over there.

- [7:37:18](https://www.youtube.com/watch?v=undefined&t=27438s) Feel free to distance as much as


you'd like and scooch a little bit this way if you could. OK, all right. And make a little more room. So
seven-- let's see. 5, 2, 7, 4-- AUDIENCE: [INAUDIBLE] DAVID J. MALAN: 4, hopefully 1. Yeah, keep them to
the side. OK, 1, 6, and there we go-- 3. Come on over, three.
- [7:37:37](https://www.youtube.com/watch?v=undefined&t=27457s) I was looking for you. All right, so
here, we have an array of eight numbers-- eight integers if you will. And do you want to each say a quick
hello to the group? AUDIENCE: Hello, I'm Quinn. Go [INAUDIBLE]. AUDIENCE: Hi, everyone. I'm
[INAUDIBLE]. AUDIENCE: Hey, I'm Mitchell. AUDIENCE: Hi, I'm Brett.

- [7:37:55](https://www.youtube.com/watch?v=undefined&t=27475s) And also, go [INAUDIBLE].


AUDIENCE: I'm Hannah. Go [INAUDIBLE]. AUDIENCE: Hi, I'm Matthew. Go [INAUDIBLE] AUDIENCE: Hi, I'm
Miriam. Go Winthrop. AUDIENCE: Hi, I'm Celeste, and go Strauss. DAVID J. MALAN: Wonderful. Well,
welcome all to the stage, and let's just visualize, perhaps organically, how you eight would solve this
problem.

- [7:38:14](https://www.youtube.com/watch?v=undefined&t=27494s) So we currently have the numbers


0 through 7 quite out of order. Could you go ahead and just sort yourselves from 0 through 7? AUDIENCE:
Thank you. DAVID J. MALAN: OK, so what did they just do? OK, yes. First of all, yes, very well done. How
would you describe what they just did? Well, let's do this. Could you go back into that order on the
screen-- 52741630? And could you do exactly what you just did again? Sort yourselves.

- [7:38:49](https://www.youtube.com/watch?v=undefined&t=27529s) All right, what did-- OK, yes. Well


done again. All right, so admittedly, there's kind of a lot going on because each of you, except number
four, are doing something in parallel all at the same time. And that's not really how a computer typically
works. Just like a computer can only look at one memory location, at one locker, at a time, so can a
computer only move one number at a time-- sort of opening a locker, checking what's there, moving it as
needed.

- [7:39:18](https://www.youtube.com/watch?v=undefined&t=27558s) So let's try this more methodically


based on the two audience suggestions. If you all could randomize yourselves again to 52741630, let's take
the second of those approaches first. I'm going to look at these numbers. And even though I as the
human can obviously see all the numbers and I just kind of have the intuition for how to fix this, we got
to be more methodical because eventually, we've got to translate this to pseudo code and then code.

- [7:39:40](https://www.youtube.com/watch?v=undefined&t=27580s) So let me see. I'm going to search


for, as you proposed, the smallest number. And I'm going to start from left to right. I could do it right to
left, but left to right just tends to be convention. All right, 5 at this moment is the smallest number I've
seen. So I'm going to remember that in a variable, if you will.

- [7:39:54](https://www.youtube.com/watch?v=undefined&t=27594s) Now I'm going to take one more


step-- 2. OK, 2 I'm going to compare to the variable in mind, obviously smaller. I'm going to forget about
5 and only now remember 2 as the now smallest element. 7, nope-- I'm going to ignore that because it's
not smaller than the 2 I have in mind. 4, 1-- OK, I'm going to update the variable in mind because that's
indeed smaller.
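
The one-variable scan being acted out here can be sketched in Python. This is a minimal illustration, not code from the lecture; the function name `find_smallest` is hypothetical.

```python
def find_smallest(numbers):
    """Return the smallest value, remembering only one candidate at a time."""
    smallest = numbers[0]        # the first value is the smallest seen so far
    for value in numbers[1:]:    # walk left to right through the rest
        if value < smallest:     # something smaller: forget the old candidate
            smallest = value
    return smallest


# The volunteers' starting order on stage.
print(find_smallest([5, 2, 7, 4, 1, 6, 3, 0]))  # prints 0
```

Because only one value is remembered, every element has to be examined before the answer is known.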

- [7:40:13](https://www.youtube.com/watch?v=undefined&t=27613s) Now obviously, we the humans


know that's getting pretty small. Maybe it's the end. I have to check all values to see if there's something
even smaller because 6 is not, 3 is not, but 0 is. And what's your name again? AUDIENCE: Celeste. DAVID
J. MALAN: Celeste. Where should Celeste or number 0 go according to this proposed algorithm? All right,
I'm seeing a lot of this.
- [7:40:34](https://www.youtube.com/watch?v=undefined&t=27634s) So at the beginning of the array,
so before doing this for real, let's have you pop out in front. And could you all shift and make room for
Celeste? Is this a good idea to have all of them move or equivalently move everything in the array to
make room for Celeste and number 0 over there? No, probably not.

- [7:40:52](https://www.youtube.com/watch?v=undefined&t=27652s) That felt like a lot of work. And


even though it happened pretty quickly, that's like seven steps to happen just to move her in place. So
what would be marginally smarter perhaps-- a little more efficient, perhaps? What's that? AUDIENCE:
Swapping. DAVID J. MALAN: Swapping. What do you mean by swap? AUDIENCE: Replacing swaps.

- [7:41:07](https://www.youtube.com/watch?v=undefined&t=27667s) DAVID J. MALAN: OK, replace two


values. So if you want to go back to where you were, one step over. Number 5, he's not in the right
place. He's got to move eventually. So you know what? If that's where Celeste belongs, why don't we just
swap 5 and 0? So if you want to go ahead and exchange places with each other.

- [7:41:21](https://www.youtube.com/watch?v=undefined&t=27681s) Notice what's just happened. The


problem I'm trying to solve has gotten smaller. Instead of being size 8, now it's size 7. Now granted, I
moved 5 to another wrong location. But if these numbers started off randomly, it doesn't really matter
where 5 goes until we get him into the right place.
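
To make the cost difference concrete, here is a small sketch comparing the two ways of getting number 0 to the front. The helper names `move_by_shifting` and `move_by_swapping` are made up for illustration, and counting one "move" per relocated neighbor is an assumption about how to tally the work.

```python
def move_by_shifting(numbers, src, dst):
    """Move numbers[src] to index dst (dst < src) by sliding neighbors right."""
    value = numbers[src]
    moves = 0
    for k in range(src, dst, -1):    # each neighbor steps over by one slot
        numbers[k] = numbers[k - 1]
        moves += 1
    numbers[dst] = value
    return moves


def move_by_swapping(numbers, src, dst):
    """Exchange the two positions; everyone else stays put."""
    numbers[src], numbers[dst] = numbers[dst], numbers[src]
    return 1


shifted = [5, 2, 7, 4, 1, 6, 3, 0]
swapped = [5, 2, 7, 4, 1, 6, 3, 0]
print(move_by_shifting(shifted, 7, 0), shifted)  # 7 neighbors had to move
print(move_by_swapping(swapped, 7, 0), swapped)  # a single exchange
```

Both approaches put 0 at index 0; the swap simply leaves 5 in another wrong spot, which is harmless when the input order was arbitrary to begin with.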

- [7:41:37](https://www.youtube.com/watch?v=undefined&t=27697s) So I think we've improved. And


now if I go back, my loop is sort of coming back around. I can ignore Celeste and make this a seven step
problem and not eight because I know she's in the right place. 2 seems to be the smallest. I'll remember
that. Not 7, not 4-- 1 seems to be the smallest. Now I know as a human this should be my next smallest.

- [7:41:58](https://www.youtube.com/watch?v=undefined&t=27718s) But why, intuitively, should I keep


going, do you think? I can't sort of optimize as a human and just say, number 1, let's get you into the
right place. I still want to check the whole array. Why? Yeah. AUDIENCE: Perhaps there's another 1.
DAVID J. MALAN: Maybe there's another 1, and that could be another problem altogether.

- [7:42:16](https://www.youtube.com/watch?v=undefined&t=27736s) Other thoughts? Yeah. AUDIENCE:


Could be another 0. DAVID J. MALAN: There could be another 0 indeed, but I did go through the list once,
right? And I kind of know there isn't. Your thoughts? AUDIENCE: You don't know that every value is
represented. So maybe there's a [INAUDIBLE] You just don't know what kind of data you're working with.

- [7:42:33](https://www.youtube.com/watch?v=undefined&t=27753s) DAVID J. MALAN: Yeah, I don't


necessarily know what is there. And honestly, I only stipulated earlier that I'm using one variable in my
mind. I could use two and remember the two smallest elements I've seen. I could use three variables,
four. But then I'm going to start to use a lot of space in addition to time.

- [7:42:48](https://www.youtube.com/watch?v=undefined&t=27768s) So if I've stipulated that I only


have one variable to solve this problem, I don't know anything more about these elements because the
only thing I'm remembering at this moment is number 1 is the smallest element I've seen. So I'm going
to keep going. 6? Nope. 3? Nope. 5? Nope. OK, I know that number 1, and your name was-- AUDIENCE:
Hannah.

- [7:43:04](https://www.youtube.com/watch?v=undefined&t=27784s) DAVID J. MALAN: --Hannah is the


next smallest element. I could have everyone move over to make room, but nope. 2? You know, even
though you're so close to where I want you, I'm just going to keep it simple and swap you two. So
granted, I've made the problem a little worse. But on average, I could get lucky too and just pop number
2 into the right place.

- [7:43:21](https://www.youtube.com/watch?v=undefined&t=27801s) Now let me just accelerate this. I


can now ignore Hannah and Celeste, making the problem size 6 instead of 8. So it's getting smaller. 7 is
the smallest. Nope, now 4 is-- 2 is the smallest. Still 2, still 2, still 2. So let's go ahead and swap 2 and 7.
And now I'll just kind of orchestrate it verbally.

- [7:43:41](https://www.youtube.com/watch?v=undefined&t=27821s) 4, you're about to have to do


something. So we now have 4, 7, 6, 3, 5. OK, 3-- could you swap with 4? All right, now we have 7, 6, 4, 5.
OK, 4, could you swap with 7? Now we have 6, 7, 5. 5, could you swap with 6? And now we have 7, 6. 6,
would you swap with 7? And now perhaps a round of applause. They've sorted themselves.

- [7:44:05](https://www.youtube.com/watch?v=undefined&t=27845s) OK, hang on there one minute. So


we'll do this one other approach. And my God, that felt so much slower than the first approach, but
that's, one, because I was kind of providing a long voiceover. But two, we were doing one thing at a time
whereas the first time, you guys had the luxury of moving like eight different CPUs-- brains, if you will--
were all operating at the same time.

- [7:44:28](https://www.youtube.com/watch?v=undefined&t=27868s) And computers like that exist. If


you have a computer with multiple cores, so to speak, that's like having a computer that technically can
do multiple things at once. But software typically, at least as we've written it thus far, can only do one
thing at a time. So in a bit, we'll add up all of these steps.

- [7:44:42](https://www.youtube.com/watch?v=undefined&t=27882s) But for now, let's take one other


approach. If you all could reorder yourselves like that-- 52741630-- let's take the other approach that
was recommended by just fixing small problems and see where this gets us. So we're back in the original
order. 5 and 2 are clearly out of order. So you know what? Let's just bite this problem off now.

- [7:45:02](https://www.youtube.com/watch?v=undefined&t=27902s) 5 and 2, could you swap? Now let


me take a next step. 5 and 7, I think you're OK. There's a gap, yes, but that might not be a big deal. 7 and
4-- problem. Let's have you swap. OK, 7 and 1, let's have you swap. 7 and 6, let's have you swap. 7 and 3,
you swap. 7 and 0, you swap. Now let me pause for just a moment.

- [7:45:24](https://www.youtube.com/watch?v=undefined&t=27924s) Still not sorted. So I'm clearly not


done. But have I improved the problem? Right, I can't see-- like before, I can't optimize like before
because 0 is obviously not here. Celeste is still way back there, so it's not like I've gone from 8
steps to 7 to 6 just yet. But have I made any improvements? AUDIENCE: Yes.

- [7:45:43](https://www.youtube.com/watch?v=undefined&t=27943s) DAVID J. MALAN: Yes. In what


sense is this improved? What's a concrete thing you could point to that is better? Yeah. AUDIENCE: Sorted the
highest number. DAVID J. MALAN: I've sorted the highest number, which is indeed 7. And conversely, if
you prefer, Celeste is one step closer to the beginning. Now worst case, Celeste is going to have to move
one step on each iteration.

- [7:46:04](https://www.youtube.com/watch?v=undefined&t=27964s) So I might need to do this thing


like n total times to move her all the way over. But that might work out OK. Let me see. 2 and 5, you're
good. 5 and 4, swap you. 5 and 1, let's swap you. 5 and 6, you're good. 6 and 3, let's swap you. 6 and 0,
let's swap you. 6 and 7, you're good. And I think now-- notice that the high values, as you noted, are sort
of bubbling up, if you will, to the end of the list.
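
A single left-to-right pass of the kind just acted out might look like this in Python. It is a sketch under the assumption that "out of order" means the left value is larger than the right; `bubble_pass` is a hypothetical name.

```python
def bubble_pass(numbers):
    """Compare each adjacent pair once, swapping any pair that is out of order."""
    swaps = 0
    for i in range(len(numbers) - 1):
        if numbers[i] > numbers[i + 1]:
            numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
            swaps += 1
    return swaps


line = [5, 2, 7, 4, 1, 6, 3, 0]
bubble_pass(line)
print(line)  # [2, 5, 4, 1, 6, 3, 0, 7] -- 7 has bubbled to the end
bubble_pass(line)
print(line)  # [2, 4, 1, 5, 3, 0, 6, 7] -- now 6 is in place too
```

Each pass is guaranteed to carry the largest remaining value to the right end, while 0 drifts left by only one position.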

- [7:46:29](https://www.youtube.com/watch?v=undefined&t=27989s) 2 and 4, you're good. 4 and 1,


let's swap. 4 and 5, good. 5 and 3, swap. 5 and 0, swap. 5, 6, 7, of course, are good. So now you can sort
of see the problem resolving itself. And let's just do this part now faster. 2 and 1, 2 and 4. OK, 4 and 3, 4
and 0. All right, now 1 and 2, 2 and 3, 3 and 0, and good.

- [7:46:53](https://www.youtube.com/watch?v=undefined&t=28013s) So we do have some optimization


there. We don't need to keep going because those all are sorted. 1 and 2, you're good. 2 and 0, all right,
done. 1 and 0-- and big round of applause in closing. OK, so thank you all. We need the puppets back,
but you can keep the shirts. Thank you for volunteering here.

- [7:47:13](https://www.youtube.com/watch?v=undefined&t=28033s) Feel free to make your way to the exits,


left or right. And let's see if, thanks to our volunteers here, we can't now formalize a little bit what we did
on both passes here. I claim that the first algorithm our volunteers kindly acted out is what's called
selection sort. And as the name implied, we selected the smallest element again and again and again,
working our way from left to right, putting Celeste into the right place, and then continuing with
everyone else.

- [7:47:42](https://www.youtube.com/watch?v=undefined&t=28062s) So selection sort, as it's formally


called, can be described, for instance, with this pseudo code here-- for i from 0 to n minus 1. And again,
why this? This is just how we talk about arrays. The left end is 0, the right end is n minus 1 where in this
case, n happened to be eight people. So that's 0 through 7.

- [7:48:03](https://www.youtube.com/watch?v=undefined&t=28083s) So for i from 0 to n minus 1, what


did I do? I found the smallest number between numbers bracket i and numbers bracket n minus 1. It's a
little cryptic at first glance, but this is just a very pseudo code-like way of saying find the smallest
element among all eight volunteers because if i starts at 0 and n minus 1 never changes because there are
always 8 people, so 8 minus 1 is 7, this first says find the smallest number between numbers bracket 0
and numbers bracket 7, if you will.

- [7:48:38](https://www.youtube.com/watch?v=undefined&t=28118s) Then what do I do? Swap the


smallest number with numbers bracket i. So that's how we got Celeste from over here all the way over
there. We just swapped those two values. What then happens next in this pseudo code? i, of course,
goes from 0 to 1. And that's the technical way of saying now find the smallest element among the 7
remaining volunteers, ignoring Celeste this time because she was already in the correct location.
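
The pseudo code described here translates fairly directly into Python. This is one possible rendering, not the lecture's own code; the variable names are illustrative.

```python
def selection_sort(numbers):
    """Sort in place: for i from 0 to n-1, swap the smallest remaining value into slot i."""
    n = len(numbers)
    for i in range(n):                  # i runs from 0 to n - 1
        smallest = i                    # index of the smallest value seen so far
        for j in range(i + 1, n):       # scan numbers[i] through numbers[n - 1]
            if numbers[j] < numbers[smallest]:
                smallest = j
        # Swap the smallest number with numbers[i].
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return numbers


print(selection_sort([5, 2, 7, 4, 1, 6, 3, 0]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

After iteration i, positions 0 through i are final, which is why the remaining problem shrinks by one each time.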

- [7:49:02](https://www.youtube.com/watch?v=undefined&t=28142s) So the problem went from size 8


to size 7. And if we repeat, size 6, 5, 4, 3, 2, 1, until boom, it's all done at the very end. So this is just one
way of expressing in pseudo code what we did a little more organically and a formalization of what
someone volunteered out in the audience. So if we consider, then, the efficiency of this algorithm,
maybe abstracting it away now as a bunch of doors where the left most again is always 0, the right most
is always n minus 1, or equivalently, the second to last is n minus 2, the third to last

- [7:49:35](https://www.youtube.com/watch?v=undefined&t=28175s) is n minus 3 where n might be 8


or anything else, how do we think about or quantify the running time of selection sort? Big O of what? I
mean, that was a lot of steps to be adding up. It's probably more than n, right, because I went through
the list again and again. It was like n plus n minus 1 plus n minus 2.

- [7:49:59](https://www.youtube.com/watch?v=undefined&t=28199s) Any instincts here? We got like


the whole team in the orchestra now. Let me propose we think about it this way with just a bit of
formula, say. So the first time, I had to look at n different volunteers. n was 8 in this case, but generically,
I looked at all eight numbers in order to decide who was the smallest.

- [7:50:20](https://www.youtube.com/watch?v=undefined&t=28220s) And sure enough, Celeste was at


the very end. She happened to be all the way to the right. But I only knew that once I looked at all 8 or
all n volunteers. So that took me n steps first. But once Celeste was swapped into the right place, then my
problem was size n minus 1, and I had n minus 1 other people to look through.

- [7:50:39](https://www.youtube.com/watch?v=undefined&t=28239s) So that's n minus 1 steps. Then


after that, it's n minus 2 plus n minus 3 plus n minus 4 plus dot dot dot until I had one final step. And it's
obvious that I only have one human left to consider. So we might wave our hands at this with a little
ellipsis and just say dot dot dot plus 1 for the final step.

- [7:50:55](https://www.youtube.com/watch?v=undefined&t=28255s) Now what does this actually


equal? Well, this is where you might think back on, like, your high school math or physics textbook that
has a little cheat sheet at the end that shows these kinds of recurrences. That happens to work out
mathematically to be n times n plus 1 all divided by 2. That's just what that recurrence, that series,
actually adds up to.
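
For anyone who would rather check the formula than take it on faith, a short sketch can add the terms one at a time and compare against n(n + 1)/2. The function names are hypothetical.

```python
def steps_by_adding(n):
    """Add n + (n - 1) + ... + 1 one term at a time."""
    total = 0
    for remaining in range(n, 0, -1):
        total += remaining
    return total


def steps_by_formula(n):
    """The closed form quoted in the lecture: n(n + 1) / 2."""
    return n * (n + 1) // 2


for n in (8, 100, 1000):
    print(n, steps_by_adding(n), steps_by_formula(n))
```

For the eight volunteers, both give 36.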

- [7:51:13](https://www.youtube.com/watch?v=undefined&t=28273s) So if you take on faith that that


math is correct, let's just now multiply this out mathematically. That's n squared plus n divided by 2 or n
squared divided by 2 plus n over 2. And here's where we're starting to get annoyingly into the weeds.
Like, honestly, as n gets really large, like a million doors or integers or a billion web pages in Google's
search engine, honestly, which of these terms is going to matter the most mathematically if n is a really
big number? Is n squared divided by 2 the dominant factor,

- [7:51:46](https://www.youtube.com/watch?v=undefined&t=28306s) or is n divided by 2 the dominant


factor? AUDIENCE: n squared. DAVID J. MALAN: Yeah, n squared. I mean, no matter what n is-- and the
bigger it is, the bigger raising it to the power 2 is going to be. So you know what? Let's just wave our
hands at this because at the end of the day, as n gets really large, the dominant factor is indeed that first
one.

- [7:52:04](https://www.youtube.com/watch?v=undefined&t=28324s) And you know what? Even the


divided by 2, as I claimed earlier with our two phone book examples, where the two straight lines if you
keep zooming out essentially looked the same when n is large enough, let's just call this on the order of n
squared. So that is to say a computer scientist would describe selection sort as taking on the order of n
squared steps.

- [7:52:24](https://www.youtube.com/watch?v=undefined&t=28344s) That's an oversimplification. If we


really added it up, it's actually this many steps-- n squared divided by 2 plus n over 2. But again, if we
want to just be able to generally compare two algorithms' performance, I think it's going to suffice if we
look at that highest order term to get a sense of what the algorithm feels like, if you will, or what it even
looks like graphically.
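
Written out, the tally described above is as follows (a restatement of the spoken arithmetic; the value n = 10^6 is chosen only for illustration):

$$
n + (n-1) + (n-2) + \dots + 1 \;=\; \frac{n(n+1)}{2} \;=\; \frac{n^2}{2} + \frac{n}{2}
$$

For n = 10^6, the first term is 5 × 10^11 while the second is only 5 × 10^5, so the n squared term accounts for essentially all of the work.
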
- [7:52:47](https://www.youtube.com/watch?v=undefined&t=28367s) All right, so with that said, we
might describe bubble sort as being in big O-- sorry, selection sort as being in big O of n squared. But
what if we consider now the best case scenario-- an opportunity to talk about a lower bound? In the
best case, how many steps does selection sort take? Well, here, we need some context.

- [7:53:09](https://www.youtube.com/watch?v=undefined&t=28389s) Like, what does it mean to be the


best case or the worst case when it comes to sorting? Like, what could you imagine meaning the best
possible scenario when you're trying to sort a bunch of numbers? I got the whole crew here again. Yeah.
AUDIENCE: They would already be sorted. DAVID J.

- [7:53:25](https://www.youtube.com/watch?v=undefined&t=28405s) MALAN: All right, they're already


sorted, right? I can't really imagine a better scenario than I have to sort some numbers, but they're
already sorted for me. But does this algorithm leverage that fact in practice? Even if all of our humans
had lined up from 0 to 7, I'm pretty sure I would have pretty naively started here. And yes, Celeste
happens to be here.

- [7:53:43](https://www.youtube.com/watch?v=undefined&t=28423s) But I only know she needs to be


here once I've looked at all eight people. And then I would have realized, well, that was a waste of time. I
can leave Celeste be. But then what would I have done? I would have ignored her position because
we've solved one problem. I would have done the same thing now for seven people, then six people.

- [7:54:01](https://www.youtube.com/watch?v=undefined&t=28441s) So every time I walk through, I'm


not doing much useful work. But I am doing those comparisons because I don't know until I do the work
that the people were in the right order. So this would seem to imply that the omega notation, the best
case scenario, even, a lower bound on the running time would be what, then? AUDIENCE: [INAUDIBLE]
DAVID J.

- [7:54:22](https://www.youtube.com/watch?v=undefined&t=28462s) MALAN: A little louder?


AUDIENCE: N squared. DAVID J. MALAN: It's still going to be n squared, in fact, because the code I'm
giving myself doesn't leverage or benefit from any of that scenario because it just mindlessly continues
to do this again and again. So in this case, yes, I would claim that the omega notation for selection sort is
also n squared-- omega of n squared.
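
One way to see that selection sort gains nothing from sorted input is to count its comparisons. The sketch below is an assumption-laden illustration: it counts pairwise comparisons, which comes to n(n - 1)/2 and differs from the lecture's tally only in a lower-order term.

```python
def selection_sort_comparisons(numbers):
    """Selection-sort a copy of the input and count the comparisons made."""
    numbers = list(numbers)
    n = len(numbers)
    comparisons = 0
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1            # one comparison per remaining element
            if numbers[j] < numbers[smallest]:
                smallest = j
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return numbers, comparisons


print(selection_sort_comparisons([5, 2, 7, 4, 1, 6, 3, 0])[1])  # 28
print(selection_sort_comparisons([0, 1, 2, 3, 4, 5, 6, 7])[1])  # 28
```

The count is identical for shuffled and already-sorted input, which is what it means for the best case to be no better than the worst.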

- [7:54:43](https://www.youtube.com/watch?v=undefined&t=28483s) So those are the kinds of numbers


to beat. It seems like the upper bound and lower bound of selection sort are indeed n squared. And so
we can also describe selection sort, therefore, as being in theta of n squared. That's the first algorithm
we've had the chance to describe that in, which is to say that it's kind of slow.

- [7:54:59](https://www.youtube.com/watch?v=undefined&t=28499s) I mean, maybe other algorithms


are slower, but this isn't the best starting point. Can we do better? Well, there's a reason that I guided us
to doing the second algorithm second. Even though you verbally proposed them in a different order, this
second algorithm we did is generally known as bubble sort.

- [7:55:13](https://www.youtube.com/watch?v=undefined&t=28513s) And I deliberately used that word


a bit ago, saying the big values are bubbling their way up to the right to kind of capture the fact that,
indeed, this algorithm works differently. But let's consider if it's better or worse. So here, we have
pseudo code for bubble sort. You could write this too in different ways.
- [7:55:30](https://www.youtube.com/watch?v=undefined&t=28530s) But let's consider what we did on
the stage. We repeated the following n minus 1 times. We initialized at least, even though I didn't
verbalize it this way, a variable like i from 0 to n minus 2, n minus 2. And then I asked this question. If
numbers bracket i and numbers bracket i plus 1 are out of order, then swap them.
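
As a rough Python rendering of this pseudo code (a sketch, not the lecture's implementation), the outer loop repeats n - 1 times and the inner index i runs from 0 to n - 2:

```python
def bubble_sort(numbers):
    """Sort in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(numbers)
    for _ in range(n - 1):              # repeat the pass n - 1 times
        for i in range(n - 1):          # i goes from 0 to n - 2
            # numbers[i + 1] is at most numbers[n - 1], so we stay in bounds.
            if numbers[i] > numbers[i + 1]:
                numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
    return numbers


print(bubble_sort([5, 2, 7, 4, 1, 6, 3, 0]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Python's `range(n - 1)` stops at n - 2, which matches the upper bound in the pseudo code.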

- [7:55:55](https://www.youtube.com/watch?v=undefined&t=28555s) So again, I just did it more


intuitively by pointing, but this would be a way, with a bit of pseudo code, to describe what's going on.
But notice that I'm doing something a little differently here. I'm iterating from i equals 0 to n minus 2.
Why? Well, if I'm comparing two things, left hand and right hand, I'd still want to start at 0.

- [7:56:13](https://www.youtube.com/watch?v=undefined&t=28573s) But I don't want to go all the way


to n minus 1 because then, I'd be going past the boundary of my array, which would be bad. I want to
make sure that my left hand-- i, if you will-- stops at n minus 2 so that when I add 1 to i in my pseudo code,
I'm looking at the last two elements, not the last element and then past the boundary.

- [7:56:31](https://www.youtube.com/watch?v=undefined&t=28591s) That's actually a common


programming mistake that we'll undoubtedly soon make by going beyond the boundaries of your array.
So this pseudo code, then, allows me to say compare every one again and again and swap them if they're
out of order. Why do I repeat the whole thing n minus 1 times? Like, why does it not suffice just to do
this loop here? Think what happened with Celeste.

- [7:56:59](https://www.youtube.com/watch?v=undefined&t=28619s) Why do I repeat this whole thing


n minus 1 times? Yeah, in the back? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Indeed, and I think if I can
recap accurately, think back to Celeste again. And I'm sorry to keep calling on you as our number 0. Each
time through bubble sort, she only moved one step. And so in total, if there's n locations, at the end of
the day, she needs to move n minus 1 steps to get 0 all the way to where it needs to be.
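
The one-step-per-pass behavior can be observed directly by recording where 0 sits after each pass. This is an illustrative sketch; `bubble_pass` is a hypothetical helper, and the starting order is the one used on stage.

```python
def bubble_pass(numbers):
    """One left-to-right pass, swapping adjacent pairs that are out of order."""
    for i in range(len(numbers) - 1):
        if numbers[i] > numbers[i + 1]:
            numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]


line = [5, 2, 7, 4, 1, 6, 3, 0]
positions = []
for _ in range(len(line) - 1):          # n - 1 passes
    bubble_pass(line)
    positions.append(line.index(0))     # where is number 0 now?

print(positions)  # [6, 5, 4, 3, 2, 1, 0]
```

Starting at index 7, the value 0 needs all seven passes to reach index 0, which is the worst case the n - 1 repetitions are there to cover.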

- [7:57:30](https://www.youtube.com/watch?v=undefined&t=28650s) And so this inner loop, if you will,


where we're iterating using i, that just fixes some of the problems. But it doesn't fix all of the problems
until we do that same logic again and again and again. And so how might we quantify the running time of
this algorithm? Well, one way to see it is to just literally look at the pseudo code.

- [7:57:49](https://www.youtube.com/watch?v=undefined&t=28669s) The outer loop repeats n minus 1


times by definition. It literally says that. The inner loop, the for loop, also iterates n minus 1 times. Why?
Because it's going from 0 to n minus 2. And if that's hard to think about, that's the same thing as 1 to n
minus 1 if you just add 1 to both ends of the formula.

- [7:58:10](https://www.youtube.com/watch?v=undefined&t=28690s) So that means you're doing n


minus 1 things n minus 1 times. So I literally multiply how many times the outer loop is running by how
many times the inner loop is running, which gives me sort of FOIL method n minus 1 squared. And I
could multiply that whole thing out. Well, let's consider this just a little more methodically here.

- [7:58:29](https://www.youtube.com/watch?v=undefined&t=28709s) If I have n minus 1 on the outer, n


minus 1 on the inner-- let's go ahead and FOIL this. So n squared minus n minus n plus 1, combine like
terms-- n squared minus 2n plus 1. And now which of these terms is clearly going to be dominant, so to
speak? The-- AUDIENCE: N squared. DAVID J. MALAN: --the n squared.
- [7:58:48](https://www.youtube.com/watch?v=undefined&t=28728s) So yes, even though minus 2n is a
good thing because it's subtracting off some of the time required, plus 1 is not that big a thing, they're
such drops in the bucket when n gets really large, like in the millions or billions, certainly, that bubble
sort too is on the order of n squared. It's not the same exactly as selection sort.
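
The multiplication spoken above, written out (with n = 8 as a check against the eight volunteers):

$$
(n-1)(n-1) \;=\; n^2 - n - n + 1 \;=\; n^2 - 2n + 1
$$

For n = 8 this gives 7 × 7 = 49 = 64 - 16 + 1 comparisons, and as n grows the n squared term dwarfs the other two.
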

- [7:59:05](https://www.youtube.com/watch?v=undefined&t=28745s) But as n gets big, honestly, we're


barely going to be able to notice the difference most likely. And so it too might be said to be on the order
of n squared. And if we consider now the lower bound on bubble sort's running time, here's where
things get potentially interesting. What might you claim is the running time of bubble sort in the best
case? And the best case, I claim, is when the numbers are already sorted.

- [7:59:33](https://www.youtube.com/watch?v=undefined&t=28773s) Is our pseudo code going to take


that into account? AUDIENCE: N. DAVID J. MALAN: OK, n. Why do you propose n? AUDIENCE:
[INAUDIBLE] DAVID J. MALAN: Yes, and that's the key word. To summarize, in bubble sort, I do have to
minimally make one pass because if I don't look at all n elements, then I'm theoretically just guessing if
it's sorted or not.

- [7:59:54](https://www.youtube.com/watch?v=undefined&t=28794s) Like, I obviously intuitively have to


look at every element to decide yay or nay, it's in the right order. And my original pseudo code, though, is
pretty naive. It's just going to blindly go back and forth n minus 1 times again and again, and that's going
to add up. But what if I add a bit of an optimization that you might have glimpsed on the slide a moment
ago where if I compare two people and I don't swap them, compare two people, don't swap them, and I
go all the way through the list comparing

- [8:00:19](https://www.youtube.com/watch?v=undefined&t=28819s) every pair of adjacent people, and


I make no swaps, it would be kind of not just naive but stupid to do that same process again because if
the humans have not moved, I'm not going to make any different decisions. I'm going to do nothing
again, nothing again. So at that point, it would be stupid, very inefficient, to go back and forth and back
and forth.

- [8:00:39](https://www.youtube.com/watch?v=undefined&t=28839s) So if I modify our pseudo code


with just an additional if condition, I bet we can speed this up. Inside of that same pseudo code, what if I
say, hey, if no swaps, quit? Like, quit prematurely before the loops are finished running. One of the loops
has gone through per the indentation here. But if I do a loop from left to right and I have made no swaps,
which you can think of as just being one other variable that's plus plusing as I go, keeping track of how
many swaps-- if I've made no swaps from left to right,

- [8:01:09](https://www.youtube.com/watch?v=undefined&t=28869s) I'm not going to make any swaps


the next time around either. So let's just quit at that point. And that is to say in the best case, if you will,
when the list is already sorted, the omega notation for bubble sort might indeed be omega of n if you
add that optimization so as to short circuit all of that inefficient looping to do it only as many times as is
necessary.
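
A sketch of the optimization in Python, assuming a per-pass swap counter as described; the function name and the returned pass and comparison counts are additions for illustration only.

```python
def bubble_sort_early_exit(numbers):
    """Bubble sort in place, quitting after any pass that makes no swaps.

    Returns (passes, comparisons) so the amount of work can be inspected.
    """
    n = len(numbers)
    passes = 0
    comparisons = 0
    for _ in range(n - 1):
        passes += 1
        swaps = 0                       # the extra variable counting swaps
        for i in range(n - 1):
            comparisons += 1
            if numbers[i] > numbers[i + 1]:
                numbers[i], numbers[i + 1] = numbers[i + 1], numbers[i]
                swaps += 1
        if swaps == 0:                  # if no swaps, quit
            break
    return passes, comparisons


print(bubble_sort_early_exit([0, 1, 2, 3, 4, 5, 6, 7]))  # (1, 7)
print(bubble_sort_early_exit([5, 2, 7, 4, 1, 6, 3, 0]))  # (7, 49)
```

On already-sorted input the work is a single pass of n - 1 comparisons, which is where the omega of n best case comes from; on the stage ordering every pass still makes at least one swap, so nothing is saved.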

- [8:01:34](https://www.youtube.com/watch?v=undefined&t=28894s) Let me pause to see if there's any


questions here. Yeah. AUDIENCE: [INAUDIBLE] to optimize the running time for all cases possible? DAVID
J. MALAN: Good question. If the running time of selection sort and bubble sort are both in big O of n
squared but selection sort is in omega of n squared while bubble sort is in omega of n, which sounds
better-- I think if I may, should we just always use bubble sort? Yes if we think that we might benefit over
time from a lot of good case scenarios or best case scenarios.

- [8:02:14](https://www.youtube.com/watch?v=undefined&t=28934s) However, the goal at hand in just


a bit is going to be to do even better than both of these. So hold that question further for a moment.
Yeah. AUDIENCE: [INAUDIBLE] n minus 1? DAVID J. MALAN: No. So yes, good question. So I say omega of
n, but is it technically omega of n minus 1? Maybe, but again, we're throwing away lower order terms.

- [8:02:34](https://www.youtube.com/watch?v=undefined&t=28954s) And that's an advantage because


we're not comparing things ever so precisely. Just like I plotted with the green and yellow and red chart, I
just want to get a sense of the shape of these algorithms so that when n gets really large, which of these
choices is going to matter the most? At the end of the day, it's actually perfectly reasonable to use
selection sort or bubble sort if you don't have that much data because they're going to be pretty fast.

- [8:02:57](https://www.youtube.com/watch?v=undefined&t=28977s) My God, our computers


nowadays are 1 gigahertz, 2 gigahertz, 1 billion things per second, 2 billion things per second. But if we
have large data sets, as we will later in the term and as you might in the real world, at the Googles of
the world, then you're going to want to be more thoughtful. And that's where we're going today.

- [8:03:12](https://www.youtube.com/watch?v=undefined&t=28992s) All right, so let's actually see this


visualized a little bit. In a moment, I'm going to change screens here to open up what is a little
visualization tool that will give us a sense of how these things actually work and look at a faster rate than
our humans are able to do here on stage. So here is another visualization of a bunch of numbers, an
array of numbers.

- [8:03:32](https://www.youtube.com/watch?v=undefined&t=29012s) Short bars mean small numbers,


tall bars mean big numbers. So instead of having the numbers on their torsos here, we just have bars
that are small or tall based on the magnitude of the number. Let me go ahead, and I preconfigured this
in advance to operate somewhat quickly. Let's go ahead and do selection sort by clicking this button.

- [8:03:50](https://www.youtube.com/watch?v=undefined&t=29030s) And you'll see some pink bars


flying by. And that's like me walking left and right, left and right, to select the next smallest number. And
so what you'll see happening on the left of this array of numbers is Celeste, if you will, and all of the
other smaller numbers are appearing on the left while we continue to solve the remaining problems to
the right.

- [8:04:10](https://www.youtube.com/watch?v=undefined&t=29050s) So again, we no longer have to


touch the smaller numbers here. So that's why the problem is getting smaller and smaller and smaller
over time. But you can notice now visually, look at how many times we're retracing our steps. This is why
things that are n squared tend to be frowned upon if avoidable because I'm touching the same elements
again and again.
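
The retracing described here can be sketched in C. This is a minimal sketch of selection sort, not code shown in the lecture; the name `selection_sort` and its parameters are assumptions made for illustration.

```c
// Selection sort: on each pass, find the smallest value in the unsorted
// part of the array and swap it into the next position on the left.
void selection_sort(int values[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        // Walk the rest of the array to select the smallest remaining value
        int smallest = i;
        for (int j = i + 1; j < n; j++)
        {
            if (values[j] < values[smallest])
            {
                smallest = j;
            }
        }

        // Swap it into place; everything left of i + 1 is now final
        int tmp = values[i];
        values[i] = values[smallest];
        values[smallest] = tmp;
    }
}
```

The inner loop revisits the same unsorted elements on every pass, which is where the roughly n squared comparisons come from.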

- [8:04:30](https://www.youtube.com/watch?v=undefined&t=29070s) When I was walking through, I


kept pointing at the same humans again and again. And that adds up. So let's see if bubble sort looks or
feels a little different. Let me re-randomize the thing, and let me now click Bubble Sort at the top. And as
you might infer, there's other sorting algorithms out there, not all of which we'll look at.
- [8:04:45](https://www.youtube.com/watch?v=undefined&t=29085s) But here's bubble sort. Same pink
coloration, but it's doing something different. It's two pink bars going through again and again comparing
the adjacent numbers. And you'll see that the largest numbers are indeed bubbling their way up to the
right, but the smaller numbers, like our number 0 was, are only slowly making their way over.

- [8:05:05](https://www.youtube.com/watch?v=undefined&t=29105s) Here's a comparable. Here's the


number one. And it's going to take a while to get all the way to the left. And here too, notice how many
times the same bars are becoming pink, how many times the algorithm is retracing and retracing its
steps. Why? Because it's only solving one problem at a time on each pass.
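
For comparison, here is a minimal sketch of bubble sort in C. It is not the lecture's code; the name `bubble_sort` and the `swapped` flag (which stops early once a pass makes no swaps) are assumptions for illustration.

```c
#include <stdbool.h>

// Bubble sort: compare adjacent pairs and swap any that are out of order.
// Each pass bubbles the largest remaining value to the right.
void bubble_sort(int values[], int n)
{
    for (int pass = 0; pass < n - 1; pass++)
    {
        bool swapped = false;

        // The last `pass` elements are already in their final places
        for (int j = 0; j < n - 1 - pass; j++)
        {
            if (values[j] > values[j + 1])
            {
                int tmp = values[j];
                values[j] = values[j + 1];
                values[j + 1] = tmp;
                swapped = true;
            }
        }

        // A pass with no swaps means the array is already sorted
        if (!swapped)
        {
            return;
        }
    }
}
```

A small value sitting at the far right moves only one position left per pass, which matches the slow drift visible in the visualization.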

- [8:05:22](https://www.youtube.com/watch?v=undefined&t=29122s) And each time we do that, we're


stepping through practically the whole array. And now granted, I could speed this up even further if I
really wanted to, but my God, this is only, what, like 50 or 60 elements, something like that? This is slow.
Like, this is what n squared looks like and feels like. And now I'm just trying to come up with words to say
until we get to the finish line here.

- [8:05:41](https://www.youtube.com/watch?v=undefined&t=29141s) Like, this would be annoying if


this is the speed of sorting, and this is why I sort of secretly sorted the numbers for Rave in advance
because it would have taken us an annoying number of steps to get that in place for her. So those two
algorithms are n squared. Can we do, in fact, better? Well, to save the best algorithm for last, let's take a
shorter five minute break here.

- [8:06:00](https://www.youtube.com/watch?v=undefined&t=29160s) And when we come back, we'll do


even better than n squared. All right. So the challenge at hand is to do better than selection sort and
better than bubble sort and ideally not just marginally better but fundamentally better. Just like in week
zero, that third and final divide and conquer algorithm was sort of fundamentally faster than the other
two.

- [8:06:23](https://www.youtube.com/watch?v=undefined&t=29183s) So can we do better than


something on the order of n squared? Well, I bet we can if we start to approach the problem a little
differently. The sorts we've done thus far are generally known as comparison sorts-- and that kind of
captures the reality that we were doing a huge number of comparisons again and again.

- [8:06:38](https://www.youtube.com/watch?v=undefined&t=29198s) And you kind of saw that in the


vertical bars that were going pink as everything was being compared again and again. But there's this
programming technique, and it's actually a mathematical technique known as recursion that we've
actually seen before. And this is a building block or a mental model we can bring to bear on the problem
to solve the sorting problem sort of fundamentally differently.

- [8:06:58](https://www.youtube.com/watch?v=undefined&t=29218s) But first, let's look at it in a more


familiar context. A little bit ago, I proposed this pseudo code for the binary search algorithm. And notice
that what was interesting about this code, even though I didn't call it out at the time, it's kind of
cyclically defined. Like, I claim this is an algorithm for search, and yet it seems a little unfair that I'm using
the verb search inside of the algorithm for search.

- [8:07:23](https://www.youtube.com/watch?v=undefined&t=29243s) It's like, in English, sort of defining


a word by using the word. Normally, you shouldn't really get away with that. But there's something
interesting about this technique here because even though this whole thing is a search algorithm and I'm
using my own algorithm to search the left half or the right half, the key feature here that doesn't
normally happen in English when you define a word in terms of a word is that when I search the left half
or search the right half, yes, I'm doing the same thing.

- [8:07:51](https://www.youtube.com/watch?v=undefined&t=29271s) I'm using the same algorithm. But


the problem is, by definition, half as large. So this isn't going to be a cyclical argument in the same way.
This approach, by using search within search is going to whittle the problem down and down and down
until hopefully, one door or no doors remains. And so recursion is a programming technique whereby a
function calls itself.

- [8:08:14](https://www.youtube.com/watch?v=undefined&t=29294s) And we haven't seen this yet in C,


and we haven't seen this really in Scratch. But in C, you can have a function call itself. And the form that
takes is like literally using the function's name inside of the function's implementation itself. We've
actually seen an opportunity for this once before too.

- [8:08:33](https://www.youtube.com/watch?v=undefined&t=29313s) Think back to week zero. Here's


that same pseudo code for searching for someone in an actual, physical phone book. And notice these
yellow lines here. We described those in week zero as inducing a loop, a cycle. And this is a very
procedural approach, if you will, because lines 8 and 11 are very mechanically, if you will, telling me to go
back to line three to do this kind of looping thing.

- [8:08:55](https://www.youtube.com/watch?v=undefined&t=29335s) But really, what that's doing in the


binary search algorithm for the phone book is it's just telling me to search the left half or search the right
half. I'm doing it more mechanically again by sort of telling myself what line number to go back to. But
that's equivalent to just telling myself go search the left half, search the right half, the key thing being the
left half and the right half are smaller than the original problem.

- [8:09:18](https://www.youtube.com/watch?v=undefined&t=29358s) It would be a bug if I just said


search the phone book, search the phone book, because obviously, you never get anywhere. But if you
search the half, the half, the half, the problem gets smaller and smaller. So let's reformulate week zero's
phone book code to be not procedural as here but recursive whereby in this search algorithm, AKA
binary search, formerly called divide and conquer, I'm going to literally use also the keyword search here.
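
A recursive search of this kind might look as follows in C. This is a sketch under assumptions: the lecture shows pseudocode for a phone book, while this version searches a sorted array of ints, and the name `search` and its `low`/`high` parameters are chosen here for illustration.

```c
#include <stdbool.h>

// Recursive binary search over the sorted range values[low..high], inclusive.
bool search(const int values[], int low, int high, int target)
{
    // Base case: nothing left to search
    if (low > high)
    {
        return false;
    }

    int middle = low + (high - low) / 2;
    if (values[middle] == target)
    {
        return true;
    }

    // Recursive cases: the same function, on a problem half as large
    if (target < values[middle])
    {
        return search(values, low, middle - 1, target);
    }
    return search(values, middle + 1, high, target);
}
```

Each call hands itself a strictly smaller range, so the recursion has to reach either the target or the empty range.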

- [8:09:46](https://www.youtube.com/watch?v=undefined&t=29386s) Notice among the benefits of


doing this is it kind of tightens the code up, makes it a little more succinct, even though that's kind of a
fringe benefit here. But it's an elegant way too of describing a problem by just having a function use
itself to solve a smaller puzzle at hand. So let's now consider a familiar problem, a smaller version than
the one you've dabbled with-- this sort of pyramid, this half pyramid from Mario.

- [8:10:13](https://www.youtube.com/watch?v=undefined&t=29413s) And let's throw away the parts


that aren't that interesting and just consider how we might, up until now, implement this in C code, this
left aligned pyramid, if you will. Let me go over here, and let me create a file called-- how about
iteration.c? And in this file, I'm going to go ahead and include cs50.h.

- [8:10:32](https://www.youtube.com/watch?v=undefined&t=29432s) And I'm going to include stdio.h.


And the goal at hand is to implement in C a little program that just prints out this and exactly this
pyramid. So no get_string or any of that-- we're just going to keep it simple and print exactly this pyramid
of height 4 here. So how might I do this? Well, let me go ahead, and in main, let me first ask the user
for-- well, we'll go ahead and generalize it.

- [8:10:57](https://www.youtube.com/watch?v=undefined&t=29457s) Let's go ahead and ask the user


for height. We're using get_int as before. And I'll store that in a variable called height. And then let me go
ahead and simply call the function draw passing in that height. So for the moment, let me assume that
someone somewhere has implemented a draw function.

- [8:11:11](https://www.youtube.com/watch?v=undefined&t=29471s) And this, then, is the entirety of


my program. All right, unfortunately, C does not come with a draw function. So let me go ahead and
invent one. It doesn't need to return a value. It just needs to print something-- a so-called side effect. So
I'm going to define a function called draw that takes as input an int.

- [8:11:28](https://www.youtube.com/watch?v=undefined&t=29488s) I'll call it n for number, but I could


call it anything I want. And inside of this, I'm going to go ahead and print out a left aligned pyramid like
this from top to bottom. The salient features here are that this is a pyramid, at least in this example, of
height four. And now in height four, the first row has one brick.

- [8:11:47](https://www.youtube.com/watch?v=undefined&t=29507s) The second row has two. The


third has three. The fourth has four. That's a nice pattern that I can probably represent in code. So how
might I do this? Well, how about for int i gets-- let me do it the old school way-- 1. And then i is less than
or equal to n. And then i plus plus-- so I'm going from 1 to 4 just to keep myself sane here.

- [8:12:08](https://www.youtube.com/watch?v=undefined&t=29528s) And then inside of this loop, what


do I want to do? Well, let me keep it conventional, in fact. Let me just change this to be the more
conventional 0 to n even though it might not be as intuitive because now on row 0, I want one brick. On
row 1, I want two bricks, dot dot dot. On row 3, I want four. So it's kind of offset now.

- [8:12:27](https://www.youtube.com/watch?v=undefined&t=29547s) But I'm being more conventional.


So on each row, how many bricks do I want to print? Well, I think I want to do this. For int j, for instance,
common to use j after i if you have a nested loop, let's start j at 0 and do this so long as j is less than i plus 1
and then do j plus plus. So why i plus 1? Well, again, when i equals 0, that's the first row, and I want one
brick.

- [8:12:55](https://www.youtube.com/watch?v=undefined&t=29575s) When i equals 1, that's the


second row. I want two bricks. And dot dot dot, when i is 3, I want four bricks. So again, I have to add 1
to i to get the total number of bricks that I want to print to the screen. So inside of this nested for loop,
I'm going to do printf of a hash with no new line. I'm going to save the new line for about here instead.
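
Put together, the nested loops described above might look like this. It is a sketch of the `draw` function only; the `main` that calls `get_int` from CS50's library is left out so the block compiles with the standard library alone.

```c
#include <stdio.h>

// Draw a left-aligned pyramid of height n, iteratively.
void draw(int n)
{
    // One iteration per row, counting from 0 up to but not through n
    for (int i = 0; i < n; i++)
    {
        // Row i gets i + 1 bricks
        for (int j = 0; j < i + 1; j++)
        {
            printf("#");
        }

        // The new line is saved for the end of each row
        printf("\n");
    }
}
```

Calling `draw(4)` prints four rows of one, two, three, and four hashes.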

- [8:13:17](https://www.youtube.com/watch?v=undefined&t=29597s) All right, the last thing I'm going


to do is copy and paste the prototype at the top of the file so that I can call this. And again, this is by
now week one, week two material. It wouldn't necessarily come to your mind as quickly as it might to mine after all
this practice, but this is something reminiscent of what you yourself did already for Mario-- printing out a
pyramid that hopefully in a moment is going to look like this.

- [8:13:39](https://www.youtube.com/watch?v=undefined&t=29619s) So let me go back to my code. Let


me run make iteration, and let me do dot slash iteration. I'll type in 4, and voila. Seems to be correct, and
let's assume it's going to work for other inputs as well. Oh, thank you. So this is indeed an example of
iteration-- doing something again and again. And it's very procedural.

- [8:14:03](https://www.youtube.com/watch?v=undefined&t=29643s) Like, I literally have a function


called draw that does this thing. But I can think about implementing draw in a somewhat different way
that's kind of clever. And it's not strictly necessary for this problem because this problem honestly is not
that complicated to solve once you have practice under your belt.

- [8:14:18](https://www.youtube.com/watch?v=undefined&t=29658s) Certainly the first time around,


probably significantly challenging. But now that you kind of associate, OK, row one with one brick, row
two with two bricks, it kind of comes together with these for loops. But how else could we think about
this problem? Well, this physical structure, these bricks, in some sense is a recursive structure, a
structure that's defined in terms of itself.

- [8:14:39](https://www.youtube.com/watch?v=undefined&t=29679s) Now what do I mean by that?


Well, if I were to ask you the question, what does a pyramid of height 4 look like, you would point, of
course, to this picture. But you could also kind of cleverly say to me, well, it's actually a pyramid of height
3 plus 1 additional row. And here's that cyclical argument, right? Kind of obnoxious to do typically in
English or in a spoken language because you're defining one thing in terms of itself.

- [8:15:08](https://www.youtube.com/watch?v=undefined&t=29708s) What's a pyramid of height 4?


Well, it's a pyramid of height 3 plus 1 more row. But we can kind of leverage this logic in code. Well,
what's a pyramid of height 3? Well, it's a pyramid of height 2 plus 1 more row. Fine, what's a pyramid of
height 2? Well, it's a pyramid of height 1 plus 1 more row.

- [8:15:24](https://www.youtube.com/watch?v=undefined&t=29724s) And then hopefully, this process


ends, and it does because notice, the pyramid is getting smaller and smaller. So you're not going to have
this sort of silly back and forth with me infinitely many times because when we finally get to the base
case, the end of the pyramid, fine. What is a pyramid of height 1? Well, it's a pyramid of no height plus
one more row.

- [8:15:43](https://www.youtube.com/watch?v=undefined&t=29743s) And at that point, things just get


negative-- no pun intended. Things just would otherwise go negative. And so you can just kind of stop.
The base case is when there is no more pyramid. So there's a way to draw a line in the sand and say,
stop, no more arguments. But this idea of defining a physical structure in terms of itself or code in terms
of itself actually lets us do some interesting new algorithms.

- [8:16:07](https://www.youtube.com/watch?v=undefined&t=29767s) Let me go back to my code here.


Let me go ahead and create one final file here called recursion.c that leverages this idea of this built-in
self-referential nature. Let me include cs50.h. Let me go ahead and include stdio.h, int main void.
And then inside of main, I'm going to do the exact same thing-- int height equals get_int, asking the user
for height.

- [8:16:35](https://www.youtube.com/watch?v=undefined&t=29795s) And then I'm going to go ahead


and call draw passing in height. So that's going to stay the same. I even am going to make my prototype
the same-- void draw int n semicolon. And now I'm going to implement draw down here with that same
prototype, of course. But the code now is going to be a little different.
- [8:16:53](https://www.youtube.com/watch?v=undefined&t=29813s) What am I going to do here? Well,
first of all, if you ask me to draw a pyramid of height n, I'm going to be kind of a wise ass here and say,
well, just draw a pyramid of n minus 1-- done. All right, but there's still a little more work to be done.
What happens after I print or draw a pyramid of height n minus 1 according to our structural definition a
moment ago? What remains after drawing a pyramid of height n minus 1 or 3, specifically? AUDIENCE:
[INAUDIBLE] We need one more row of hashes.

- [8:17:27](https://www.youtube.com/watch?v=undefined&t=29847s) OK, so I can do that, right? I'm OK


with the single loops. There's no nesting necessary here. I'm just going to do this-- for int i gets 0, i is less
than n, which is the height that's passed in, i plus plus. And then inside of this loop, I'm very simply going
to print out a single hash.

- [8:17:41](https://www.youtube.com/watch?v=undefined&t=29861s) And then down here, I'm going to


print out a new line at the very end. So that's good, right? I might not be as comfortable with nested
loops. This is nice and simple. What does this loop do here on line 17 through 20? It literally prints n
hashes by counting from i equals 0 on up to but not through n.

- [8:17:59](https://www.youtube.com/watch?v=undefined&t=29879s) So that's sort of week one style


syntax. But this is kind of trippy now because I've somehow boiled down the implementation of draw
into printing a row after just drawing the thing above it. But this is problematic as is because in this case,
my draw function, notice, is always going to call the draw function forever in some sense.

- [8:18:23](https://www.youtube.com/watch?v=undefined&t=29903s) But ideally, when do I want this


cyclical process to stop? When do I want to not call draw anymore? Yeah, when n is 1, right? When I get
to the top of the pyramid, when n is 1, or heck, when the pyramid's all gone and n equals 0. I can pick any
line in the sand, so long as it's sort of at the end of the process.

- [8:18:44](https://www.youtube.com/watch?v=undefined&t=29924s) Then I don't want to call draw


anymore. So maybe what I should do is this. If n equals equals 0, there's really nothing to draw. So I'm
just going to go ahead and return like this. Otherwise, I'm going to go ahead and draw n minus 1 rows
and then one more row. And I could express this differently.

- [8:19:06](https://www.youtube.com/watch?v=undefined&t=29946s) I could do something like this,


which would be equivalent. I could say something like if n is greater than or equal to 0, then go ahead
and draw the row. But I like it this way first. For now, I'm going to go with the original way just to ask a
simple question and then just bail out of the function if n equals 0.

- [8:19:23](https://www.youtube.com/watch?v=undefined&t=29963s) And heck, just to be super safe,


just in case the user types in a negative number, let me also check: if n is a negative number, just
return immediately. Don't do anything. I'm not returning a value because again, the function is void. It
doesn't need or have a return value. So just saying return suffices.
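
The recursive version described so far might be sketched like this. Folding the zero check and the negative check into a single `n <= 0` condition is a choice made here for brevity, and the `main` that calls `get_int` is again left out.

```c
#include <stdio.h>

// Draw a left-aligned pyramid of height n, recursively.
void draw(int n)
{
    // Base case: nothing to draw for a height of 0 or a negative height
    if (n <= 0)
    {
        return;
    }

    // First draw the pyramid of height n - 1 that sits above this row
    draw(n - 1);

    // Then draw one more row of n bricks
    for (int i = 0; i < n; i++)
    {
        printf("#");
    }
    printf("\n");
}
```

Because each call passes along n - 1, the base case is always reached, even for an input like -100, which returns immediately without printing anything.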

- [8:19:40](https://www.youtube.com/watch?v=undefined&t=29980s) But if n equals 1 or 2 or 3 or


anything higher, it is reasonable to draw a pyramid of slightly shorter height like, instead of 4, 3, and then
go ahead and print one more row. So this is an example now of code that calls itself within itself. Draw is
calling draw. But this so-called base case ensures, this conditional ensures, that we're not going to do
this forever.
- [8:20:09](https://www.youtube.com/watch?v=undefined&t=30009s) Otherwise, we literally would do
this infinitely many times, and something bad is probably going to happen. All right, let me go ahead and
compile this code-- make recursion. OK, no syntax errors-- dot slash recursion, Enter, height of 4, and
voila. If only because some of you have run into this issue accidentally already, let me get rid of the base
case here, and let me recompile the code.

- [8:20:33](https://www.youtube.com/watch?v=undefined&t=30033s) Make recursion. Oh, and actually,


now it's actually catching it. So the compiler is smart enough here to realize that all paths through this
function will call itself. AKA, it's going to loop forever. So let me do the first thing. Suppose I only check
for n equaling 0. Let me go ahead and recompile this code with make recursion.

- [8:20:54](https://www.youtube.com/watch?v=undefined&t=30054s) And now let me just be kind of


uncooperative. When I run this program, still works for 4, still works for 0. What if I do like negative 100?
Have any of you experienced a segmentation fault or core dump? OK, so no shame in this. Like, this
means I have somehow touched memory that I shouldn't have. And in short, I actually called this
function thousands of times accidentally, it would seem now, until the program just bailed on me
because I eventually touched memory in the computer that I shouldn't have.

- [8:21:24](https://www.youtube.com/watch?v=undefined&t=30084s) That'll make even more sense


next week. But for now, it's simply a bug. And I can avoid that bug in this context, probably not your own
pset context, by just making sure we don't even allow for negative numbers at all. So with this building
block in place, what can we now do in terms of those same numbers to sort? Well, it turns out there's a
sorting algorithm called merge sort.

- [8:21:45](https://www.youtube.com/watch?v=undefined&t=30105s) And there's bunches of others


too. But merge sort is a nice one to discuss because it fundamentally, we hope, is going to do better than
selection sort and bubble sort, that is, better than n squared. But the catch is it's a little harder to think
about. In fact, I'll act it out myself with just these numbers on the shelf here rather than humans because
recursion in general takes a little bit of effort to wrap your mind around, typically a bit of practice.

- [8:22:08](https://www.youtube.com/watch?v=undefined&t=30128s) But I'll see if we can't walk


through it methodically enough such that this comes to light. So here's the pseudo code I propose for
this algorithm called merge sort. In the spirit of recursion, this sorting algorithm literally calls itself by
using the verb sort in its pseudo code. So how does merge sort work? It sort of obnoxiously says, well, if
you want to sort all of these things, go sort the left half, then go sort the right half, and then merge the
two together.

- [8:22:35](https://www.youtube.com/watch?v=undefined&t=30155s) Now obnoxious in what sense?


Well, if I just asked you to sort something and you just tell me, well, go sort that thing and then go sort
that thing, what was the point of asking you in the first place? But the key is that each of these lines is
sorting a smaller piece of the problem. So eventually, we'll be able to pare this down into something that
doesn't go on forever because in fact, in merge sort, there's a base case too.

- [8:22:56](https://www.youtube.com/watch?v=undefined&t=30176s) There's a scenario where we just


check, wait a minute, if there's only one number to sort, that's it. Quit then because you're all done. So
there has to be this base case in any use of recursion to make sure that you don't mindlessly call yourself
forever. You've got to stop at some point.
- [8:23:14](https://www.youtube.com/watch?v=undefined&t=30194s) So let's focus on the third of these
steps. What does it mean to merge two lists, two halves of a list, just because this is apparently going to
be a key ingredient-- so here, for instance, are two halves of a list of size 8. We have the numbers 2-- and
I'll call it out if you're at a bad angle-- 2, 4, 5, 7 and 0, 1, 3, 6.

- [8:23:37](https://www.youtube.com/watch?v=undefined&t=30217s) Notice that the left half at the


moment, 2, 4, 5, 7, is already sorted, and the right half, 0, 1, 3, 6, is also sorted as well. So that's a good thing
because it means that theoretically, I've sorted the left half already. I've sorted the right half already
before we began. I just need to merge these two halves.

- [8:23:54](https://www.youtube.com/watch?v=undefined&t=30234s) What does it mean to merge two


halves? Well, for the sake of discussion, I'm just going to turn over most of the numbers except for the
first numbers in each of these halves. There's two halves here, left and right. At the moment, I'm only
going to consider the leftmost element of each half-- that is, the one on the left here and the one on the
left here.

- [8:24:15](https://www.youtube.com/watch?v=undefined&t=30255s) How do I merge these two lists


together? Well, if I look at 2 and I look at 0, which one should presumably come first? The smaller one.
So I'm going to grab the 0, and I'm going to put it into its own place on this new shelf here. And now I'm
going to consider, as part of my iteration, the beginning of this list and the new beginning of this list.

- [8:24:38](https://www.youtube.com/watch?v=undefined&t=30278s) So I'm now comparing 2 and 1.


Which one's smaller? I'm going to go ahead and grab the 1. Now I'm going to compare the beginning of
the left list and the new beginning of the right list, 2 and 3. Of course, it's 2. Now I'm going to compare
the beginning of the left list and the beginning of the right list, 4 and 3.

- [8:24:54](https://www.youtube.com/watch?v=undefined&t=30294s) It's of course 3. Now I'm going to


compare the 4 against the beginning and end, it turns out, of the second list-- 4, of course. Now I'm
going to compare the beginning of the left list and the beginning of the right list-- 5, of course. I'm
realizing this is not going to end well because I left too much distance between the numbers.

- [8:25:11](https://www.youtube.com/watch?v=undefined&t=30311s) But that has nothing to do with


the algorithm. 7 is the beginning of the left list. 6 is the beginning of the right list. It's, of course, 6. And
at the risk of knocking all of these over, if I now make room for this element, we have hopefully sorted
the whole thing by having merged together the two halves of the list.

- [8:25:35](https://www.youtube.com/watch?v=undefined&t=30335s) So in short-- thank you. I'm a little


worried that's just getting sarcastic now, but we now have merged two half lists. We haven't done the
guts of the algorithm yet-- sort the left half and sort the right half. But I claim that that is how
mechanically you merge two sorted halves. You keep looking at the beginning of each list, and you just
kind of weave them together based on which one belongs first based on its size.
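
The weaving step just acted out can be sketched in C. The function name `merge`, its parameters, and the separate output array `out` (standing in for the new shelf) are assumptions made for illustration.

```c
// Merge two already-sorted arrays into out, which must have room for
// n_left + n_right values.
void merge(const int left[], int n_left, const int right[], int n_right, int out[])
{
    int i = 0; // beginning of what remains of the left half
    int j = 0; // beginning of what remains of the right half
    int k = 0; // next free spot in the output

    // Keep comparing the front of each half and take the smaller value
    while (i < n_left && j < n_right)
    {
        if (left[i] <= right[j])
        {
            out[k++] = left[i++];
        }
        else
        {
            out[k++] = right[j++];
        }
    }

    // One half has run out; copy over whatever remains of the other
    while (i < n_left)
    {
        out[k++] = left[i++];
    }
    while (j < n_right)
    {
        out[k++] = right[j++];
    }
}
```

With the halves 2, 4, 5, 7 and 0, 1, 3, 6 from the demonstration, `out` ends up holding 0 through 7 in order. Each value is handled exactly once, so merging n values takes on the order of n steps.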

- [8:26:05](https://www.youtube.com/watch?v=undefined&t=30365s) So if you agree that that was a


reasonable way to merge two lists together, let's go ahead and focus lastly on what it means to actually
sort the left half and sort the right half of a whole bunch of numbers. And for this, I'm going to go ahead
and order them in this seemingly random order. And I just have a little cheat sheet above so that I don't
mess up.
- [8:26:25](https://www.youtube.com/watch?v=undefined&t=30385s) And I'm going to start at the very
top this time. And hopefully, these will not fall down at any point. But I'm just deliberately putting them
in this random order, 5, 2, 7, 4. And then we have 1, 6, 3, 0. Hopefully this won't fall over. Here is now an
array of size 8 with eight integers. And I want to sort this.

- [8:26:50](https://www.youtube.com/watch?v=undefined&t=30410s) I could use selection sort and just


go back and forth and back and forth. I could use bubble sort and just compare pairs, pairs, pairs. But
those are going to be on the order of big O of n squared. My hope is to do fundamentally better here. So
let's see if we can do better. All right, so let me look now at my code.

- [8:27:04](https://www.youtube.com/watch?v=undefined&t=30424s) I'll keep it on the screen. How do I


implement merge sort? Well, if there's only one number, I quit. There's obviously not. There's eight
numbers, so that's not applicable. I'm going to go ahead and sort the left half of numbers. All right,
here's the left half-- 5, 2, 7, 4. How do I sort an array of size 4? Well, here's where the recursion kicks in.

- [8:27:25](https://www.youtube.com/watch?v=undefined&t=30445s) How do you sort a list of size 4?


Well, there's the pseudo code on the board. I sort the left half of the list of size 4. So here we go. I have a
list of size 4. How do I sort it? I sort the left half. All right, now I have a list of size 2. How do I sort this?
Well, sort the left half. So here we go. Here's a list of size 1.

- [8:27:45](https://www.youtube.com/watch?v=undefined&t=30465s) How do I sort this? I think it's


done, right? That's quit, right? If only one number, I'm done. The 5 is sorted. All right, what was the next
step? You have to now rewind in time. I just sorted the left half of the left half of the left half. What do I
now sort? The right half, which is 2. This is one element.

- [8:28:06](https://www.youtube.com/watch?v=undefined&t=30486s) So I'm done. So now at this point


in the story, I have sorted, sort of idiotically-- the 5 is sorted, and the 2 is sorted. But what's the third and
final step of this phase of the algorithm? Merge the two together. So here's the left, here's the right list.
How do I merge these together? I compare the lists, and I put the 2 there.

- [8:28:26](https://www.youtube.com/watch?v=undefined&t=30506s) I only have the 5 left, and I


do that. So now we see some visible progress. But again, let's rewind. How did we get here? We started
to sort the left half of the left half of the left half, then the right half. And now where are we? We've just
sorted the left half of the left half. So what comes after sorting the left half of anything? Right half.

- [8:28:46](https://www.youtube.com/watch?v=undefined&t=30526s) All right, here's the sort of same


nonsensical thing. Here's a list of size 2. Let's sort the left half. Done. Let's sort the right half. Done.
What's the third step? Merge them together. So that's the 4, and that's the 7. What have I now done? In
total, I've now sorted the left half of the original thing.

- [8:29:05](https://www.youtube.com/watch?v=undefined&t=30545s) So what happens next? Wait a


minute, wait a minute. I have not done that. What have I done? I have sorted the left half of the left half,
and I've sorted the right half of the left half. What do I now need to do lastly? Merge those two lists
together. So again, I put my finger on the beginning of this list, the beginning of this list.

- [8:29:25](https://www.youtube.com/watch?v=undefined&t=30565s) And if you want, I'll do the same


thing when I merged last time to be clear what I'm comparing. 2 and 4-- the 2 obviously comes first.
What comes next? Well, the 4 comes next. What comes next? The 5 comes next and then lastly, of
course, the 7. Notice that the 2, 4, 5, 7 are now sorted. So the original left half is sorted.

- [8:29:47](https://www.youtube.com/watch?v=undefined&t=30587s) And I'll do the rest a little faster


because, my God, this feels like it takes forever. But I bet we're on to something here. What step remains
next? I've just sorted the left half of the original. Sort the right half of the original. How do I sort this? I
sort the left half of the right half. How do I sort this? I sort the left half of the left half.

- [8:30:06](https://www.youtube.com/watch?v=undefined&t=30606s) Done. I sort the right half of the


left half. Done. Now I merge the two together. The 1 comes first, the 6 comes next. Now I sort the right
half of the right half. What do I do? Sort the left half. Done. Sort the right half. Done. What do I do?
Merge them together. So that's the third step of that phase. Now where are we in the stor-- oh my God,
where are we in the story? We have sorted the left half of the right half and the right half of the right
half.

- [8:30:39](https://www.youtube.com/watch?v=undefined&t=30639s) What comes next? Merge. So I'm


going to compare, and I'm going to move those down just to make clear what I'm comparing, the
beginning of both sublists. What comes first? Of course, the 0. What comes next? What comes next? The
1. What comes next? The 3. And then lastly comes the 6. All right, where are we in the story? We've now
sorted the left half of the original and the right half of the original.

- [8:31:05](https://www.youtube.com/watch?v=undefined&t=30665s) What step remains? Merge. All


right, so I'm going to make the same point. And this is actually literally what we did earlier because I
deliberately demoed those original numbers in this order, 2 and a 0. This comes out first. What comes
next? 2 and 1. The 1 comes out next. What comes next? The 2 comes next.

- [8:31:25](https://www.youtube.com/watch?v=undefined&t=30685s) What comes next? The 3 comes


next. What comes next? The 4. What comes after that? The 5. What comes after that? The 6. And lastly--
this is when we run out of memory-- the 7 over there is actually in place. OK. OK, so admittedly, a little
harder to explain, and honestly, it gets a little trippy because it's so easy to forget about where you are in
the story because we're constantly diving into the algorithm and then backing back out of it.

- [8:32:00](https://www.youtube.com/watch?v=undefined&t=30720s) But in code, we could express this


pretty correctly and, it turns out, pretty efficiently because what I was doing, even though it's longer
when I do it verbally, I was touching these elements a minimal amount of times, right? I wasn't going
back and forth, back and forth in front of the shelf again and again.

- [8:32:17](https://www.youtube.com/watch?v=undefined&t=30737s) I was deliberately only ever


merging the smallest elements in each list. So every time we merge, even though I was doing it quickly,
my fingers were only touching each of the elements once. And how many times did we divide, divide,
divide in half the list? Well, we started with all of the elements here, and there were eight of them.

- [8:32:38](https://www.youtube.com/watch?v=undefined&t=30758s) And then we moved them 1, 2, 3


positions. So the height of this visualization, if you will, is actually log n, right? If I started with 8, turns
out if you do the arithmetic, this is log n height because 2 to the 3 is 8. But for now, just trust that this is
a log n height. And how wide is the shelf? Well, it's of width n because there's n elements any time they
were on the shelf.

- [8:33:04](https://www.youtube.com/watch?v=undefined&t=30784s) So technically, I was kind of
cheating this algorithm because this is the first time I've needed shelves. With the human examples, we
just had the humans, and that's it, and only eight of them. Here, I was sort of using more and more
memory. In fact, I was using like four times as much memory even though that was just for visualization's
sake.

- [8:33:21](https://www.youtube.com/watch?v=undefined&t=30801s) Merge sort actually requires that


you have some spare space, an empty array to move the elements into when you're merging them
together. But if I really wanted and if I didn't have this shelf or this shelf, honestly, I could have just gone
back and forth between the two shelves. That would have been sufficient.

- [8:33:36](https://www.youtube.com/watch?v=undefined&t=30816s) So merge sort uses more memory


for this merging process, but the advantage of using more memory is that the total running time, if you
can perhaps infer from that math, is what? The big O notation for merge sort, it turns out, is actually
going to be n times log n. And even if you're a little rusty still on your logarithms, we saw in week zero
and again today that log n is smaller than n.

- [8:34:03](https://www.youtube.com/watch?v=undefined&t=30843s) That's a good thing. Binary search


was log n. That's faster than linear search, which was n. So n times log n is, of course, smaller than n
times n or n squared. So it's sort of lower on this little cheat sheet that I've been drawing, which is to
suggest that its running time is indeed better or faster.

- [8:34:20](https://www.youtube.com/watch?v=undefined&t=30860s) And in fact, if we consider the


best case running time, turns out it's not quite as good as bubble sort with omega of n, where you can
just sort of abort if you realize, wait a minute, I've done no work. Merge sort, you actually have to do
that work to get to the finish line anyway. So it's actually in omega and ultimately theta of n log n as well.

- [8:34:41](https://www.youtube.com/watch?v=undefined&t=30881s) So again, a trade off there


because if you happen to have a data set that is very often sorted, honestly, you might want to stick with
bubble sort. But in the general case, where the data is unsorted, n log n is sounding better than n
squared. Well, what does it actually look or feel like? Give me a moment to just change over to our
visualization here.

- [8:34:59](https://www.youtube.com/watch?v=undefined&t=30899s) And we'll see with this example


what merge sort looks like depicted with now these vertical bars. So same algorithm, but instead of my
numbers on shelves, here is a random array of numbers being sorted. And you can see it being done half
at a time. And you see sort of remnants of the previous bars. Actually, that was unfair.

- [8:35:20](https://www.youtube.com/watch?v=undefined&t=30920s) Let me zoom out here. Let me


zoom out so you can actually see the height here. Let me go ahead and randomize this again and run
merge sort. There we go. Now you can see the second array and where the values are going temporarily.
And even though this one looks way more cryptic visualization-wise, it does seem to be moving faster.

- [8:35:41](https://www.youtube.com/watch?v=undefined&t=30941s) And it seems to be merging halves


together, and boom, it's done. So let's actually see, in conclusion, what these algorithms compare to and
consider that moving forward as we write more and more code, the goal is, again, not just to be correct
but to be well-designed. And one measure of design is going to indeed be efficiency.

- [8:35:58](https://www.youtube.com/watch?v=undefined&t=30958s) So here we have, in final, a
visualization of three algorithms-- selection sort, bubble sort, and merge sort-- from top to bottom. And
let's see what these algorithms might look or sound like here. Oh, if we can dim the lights for dramatic
effect-- selection's on top, bubble on bottom, merge in the middle.

- [8:36:20](https://www.youtube.com/watch?v=undefined&t=30980s) [MUSIC PLAYING] [MUSIC


PLAYING] [MUSIC PLAYING]

- [8:39:13](https://www.youtube.com/watch?v=undefined&t=31153s) DAVID J. MALAN: Well, this is


CS50, and already this is week four, and recall that last week, week three, we began to explore the inside
of a computer's memory a bit more. We talked about arrays, which were just chunks of memory back to
back to back that really lay things out left to right, top to bottom, and this is actually a pretty common
paradigm, even if you're new to programming, and certainly new to C.

- [8:39:34](https://www.youtube.com/watch?v=undefined&t=31174s) You've seen this approach of just


using memory in some way to lay things out, like images, for instance. So for instance, here is a photo
taken of last week's front row, for instance, and this is an opportunity to explore exactly what happens if
we start to zoom in and zoom in and zoom in, because it seems like most any TV show like CSI, or
whatever, or any movie that explores forensic information might have the investigators zoom in on an
image like this to see what the glint in someone's eye

- [8:40:05](https://www.youtube.com/watch?v=undefined&t=31205s) is because that reveals the license


plate number of someone that just drove past. Something that's a little over the top there, but there's an
opportunity here to speak to why that is so unrealistic. For instance, let's zoom on this puppet here's eye
and let's zoom in a little more to see what might be reflected.

- [8:40:19](https://www.youtube.com/watch?v=undefined&t=31219s) Let's zoom in a little more, and


that's it. There's only a finite amount of information if you have an image represented in this way. We're
using pixels-- these dots on the screen as rows and columns-- because if you're only using a finite
amount of memory then at the end of the day, you can only store a finite amount of information.

- [8:40:35](https://www.youtube.com/watch?v=undefined&t=31235s) At least I don't really see in this


grid here any glint of a license plate or something like that that you might otherwise see in Hollywood.
So today we'll explore these kinds of representations of how you might use memory in new and
interesting ways to represent now, very familiar things, but also start to explore what some of the
limitations are of this representation.

- [8:40:54](https://www.youtube.com/watch?v=undefined&t=31254s) But consider after all that this


doesn't need to be even as high resolution, as many pixels as something like this other image, you can
imagine just doing something silly with Post-It notes, like this. And if you think of an image as just having
rows and columns, these rows otherwise known as scan lines-- something we'll explore in the coming
week-- you could make this fun smiley face by just using two different values, maybe a zero and a one.

- [8:41:17](https://www.youtube.com/watch?v=undefined&t=31277s) Or yellow and purple, or vice


versa, just to make something come to life. Now in practice, recall we talked about storing not just a zero
or one, but maybe an R, a G, and a B value-- like 24 bits, or three bytes in total-- but we'll come back to
that. That would just be a more involved image. But for fun, if today you want to tackle something
passively in the background, if you go to this URL here, we've put together an opportunity to do a bit of
pixel art.

- [8:41:48](https://www.youtube.com/watch?v=undefined&t=31308s) If you go to this URL here, that'll


redirect you to a Google Spreadsheet. If you have a laptop with you today that'll look a little something
like this, which we've organized in rows and columns. So if you'd like to go ahead and use Google
Spreadsheet's colorization feature to color in those individual squares if you'd like, see if you can't make
something a little creative and then email it to Carter and we'll exhibit some of the best or favorites on
the website thereafter.

- [8:42:12](https://www.youtube.com/watch?v=undefined&t=31332s) So let's transition then to


something a little more familiar-- images. And not all of you have used, presumably, Photoshop, but
you're probably generally familiar with Photoshop as a program for editing and creating images or
photos or the like. And here is a screenshot of Photoshop's color picker, via which you can change what color
you're going to draw with the paint brush, or what color you're going to fill in with the paint bucket.

- [8:42:32](https://www.youtube.com/watch?v=undefined&t=31352s) It's representative of any kind of


graphical tool. And there's a lot of information in here, but there's perhaps some familiar terms now-- R,
G, and B. In fact, right now this is Photoshop's way of saying you're about to fill in your background or
foreground with the color black, and that appears to be represented with an R, a G, and a B value of
zero, zero, zero.

- [8:42:51](https://www.youtube.com/watch?v=undefined&t=31371s) Or alternatively, using a hash


symbol and then 000000. And if some of you have already made web pages before and you know a little
bit of HTML and CSS, you probably are familiar with this kind of syntax-- a hash symbol and then six, or
sometimes three digits thereafter. And if we look at a few different colors here, for instance, here might
be the representation of white.

- [8:43:12](https://www.youtube.com/watch?v=undefined&t=31392s) Now the R, the G, and the B


values went way up from 0 to 255, 255, 255. Or alternatively, it looks like Photoshop, and in turn web
browsers, could represent that same color white with FFFFFF. And let's just do a few others. Here is red,
and it turns out that red is a whole lot of red, 255, but no green, no blue.

- [8:43:35](https://www.youtube.com/watch?v=undefined&t=31415s) Or, a.k.a. FF0000. So there's


perhaps a pattern here emerging. Here is green, zero, 255, zero, a.k.a. 00FF00, or lastly, here blue, which
is no red, no green but apparently a lot of blue, 255 again, a.k.a. 0000FF. Now some of you, again, might
have seen this notation before, these zeros and these F's and all of the numbers and letters in between,
but this is another form of notation.

- [8:44:02](https://www.youtube.com/watch?v=undefined&t=31442s) And in fact, we'll explore this


today-- really is just a precondition for talking about some other concepts. But the ideas, ultimately, are
really no different. What we're about to see is a different base system-- not just binary, not just decimal,
but something we're about to call hexadecimal.

- [8:44:17](https://www.youtube.com/watch?v=undefined&t=31457s) But first, recall that with RGB we


previously did the following. Any RGB value-- red, green, blue-- just combine some amount of red or
green or blue. So here we have 72, 73, 33, which in the context of an email or text, of course, said what--
a couple of weeks back? Just hi with an exclamation point, but in the context of a Photoshop-like
program, this might instead be representing, collectively, this shade of yellow, for instance, when you
combine that much red that much green that much blue.

- [8:44:46](https://www.youtube.com/watch?v=undefined&t=31486s) So here is the same idea. If you've


got a lot of red, no green, no blue, together that's going to give us red. If you've got no red, a lot of
green, no blue, that's going to give us, of course, green. If you've got no red, no green, a lot of blue, that
of course, is going to give us blue.

- [8:45:00](https://www.youtube.com/watch?v=undefined&t=31500s) So there's a pattern emerging


here where apparently 00 is none, as always, and FF is apparently a lot. And it's maybe somehow
equated with 255, at least per that Photoshop screenshot. Meanwhile, if we combine one last one, a lot
of red, a lot of green, a lot of blue-- that's actually going to give us a single white pixel like this.

- [8:45:21](https://www.youtube.com/watch?v=undefined&t=31521s) All right, so think back. Here was


binary-- in the world of binary you had just two digits, zero and one. Could have been anything else-- A
or B, X or Y, but the world standardized on these numerals zero and one. In our world's decimal system,
of course, you have zero through nine. As of today though, we're going to start using hexadecimal
sometimes in the context of images and also files just because it's a convention and there's some
conveniences to it.

- [8:45:45](https://www.youtube.com/watch?v=undefined&t=31545s) Where now, you're going to be


able to count up to F in a notation called hexadecimal. From zero through nine, then you keep going to A
to B to C to D to E to F, the idea being each of these, even though it's weirdly a letter of the English
alphabet, it's still just a single symbol. It's not one zero for 10, or 1 1 for eleven-- all 16 of these values,
these digits, so to speak, are indeed still just single symbols, and that's a characteristic of just using this
other notational system.

- [8:46:15](https://www.youtube.com/watch?v=undefined&t=31575s) So how do we get from 00 and FF


to something like 0 and 255, respectively? Well, this hexadecimal system, a.k.a. Base 16, just does the
math from week zero and really, grade school, a little bit differently. For instance, if you have a number
that's got two digits, or hexadecimal digits as of today, the columns are just a little different.

- [8:46:34](https://www.youtube.com/watch?v=undefined&t=31594s) Instead of powers of two or


powers of 10, which we saw for binary and decimal respectively, it's powers of 16. So if we just do the
math out, that's the ones column, this is the 16s column, and so forth. Things get actually pretty big
pretty quickly in this system. But now let's just consider how we would represent familiar numbers.

- [8:46:52](https://www.youtube.com/watch?v=undefined&t=31612s) If you've got two hexadecimal


digits for which these hashes are just placeholders, zero, zero is going to mathematically equal the
decimal number you and I know, of course, as zero. Why? Same thing as week zero-- 16 times zero plus
one times zero is the number you and I know as zero. And we can count up from here.

- [8:47:08](https://www.youtube.com/watch?v=undefined&t=31628s) This, in hexadecimal, would be


how a computer represents the number we know as one. It would be zero one in this case. This would
be two, three, four, five, six, seven, eight, nine-- in decimal, we're about to go to 10. But in hexadecimal,
to be clear, what comes next? So, apparently A, so 0A, 0B, which is now 10, or 11, or 12, 13, 14, 15.

- [8:47:33](https://www.youtube.com/watch?v=undefined&t=31653s) So using hexadecimal is just an
interesting way of using single symbols now, zero through F, to count from zero through 15. And we'll see
why it's 15 in a moment, but as soon as we get to F, anyone want to conjecture how in hexadecimal,
a.k.a. hex, do we now count up one position higher? What comes after 0F in hexadecimal? So, one zero--
it's the same kind of thing-- once you're at the highest digit possible, F-- or in our decimal world that
would have been nine-- you add one more, nine wraps around to zero, or in this case,

- [8:48:06](https://www.youtube.com/watch?v=undefined&t=31686s) F wraps around to zero. You carry


the one and voila-- now we're representing the number you and I know as 16. And we could keep going
forever, literally. This could be 17, 18, 19, 20, in decimal-- but let's just wave our hands at it and count
as high as we can-- dot, dot, dot-- the highest we could count in hexadecimal with two digits, just
logically, would be what, in hexadecimal? Something, something.

- [8:48:30](https://www.youtube.com/watch?v=undefined&t=31710s) FF, I heard. So yes, that's the


biggest digit possible, so FF is what we have. So how high can you count in hexadecimal if you've got just
two of these digits? Well, it's the same math as always. 16 times F, a.k.a. 15, so that's 16 times 15 plus
one times F, or one times 15-- that gives us 240 plus 15 in decimal, the result of which, of course, now is
255.

- [8:48:55](https://www.youtube.com/watch?v=undefined&t=31735s) So this hexadecimal system-- you


may have seen in the world of web pages, and if you haven't we'll get to that in this class in a few weeks,
or we just saw in the context of Photoshop-- just has this shorthand notation of counting as high as 255
but just calling it FF. Now it's marginal, but that's like 50% savings of how many digits you need in order
to count as high as 255 because in decimal, of course, 255 is three digits.

- [8:49:19](https://www.youtube.com/watch?v=undefined&t=31759s) In hexadecimal you can count as


high using just two, and that difference is going to get magnified the bigger our numbers get. Let me
stipulate for now, you're going to get more and more savings in terms of just how many symbols you
need on the screen to represent bigger and bigger numbers than that.

- [8:49:35](https://www.youtube.com/watch?v=undefined&t=31775s) All right, let me pause here just to


see if there's any questions thus far on what we've called hexadecimal, which again, just gives us zero
through nine as well as A through F. Any questions or confusion? And if it feels like we're lingering a bit
much on arithmetic, we're not really going to see other notations besides this moving forward.

- [8:49:55](https://www.youtube.com/watch?v=undefined&t=31795s) These are the go-to three in a


programmer's world, typically. But there are some others. Yeah. AUDIENCE: Does the hexadecimal
symbol take more storage than the decimal system? DAVID J. MALAN: Good question. Does hexadecimal
require more storage or less storage than the decimal system? Theoretically no, because this is just a way
of representing information and we'll see in a concrete example in a moment.

- [8:50:19](https://www.youtube.com/watch?v=undefined&t=31819s) But inside of the computer, at the


end of the day, you're still storing bits. And using hexadecimal is not using more or fewer bits, think of
this as how you might write it down on a piece of paper, just how many digits you're going to write or on
a computer screen, how many digits you're going to see at once, but it doesn't change how the
computer is representing information because all they're representing at the end of the day is zeros and
ones.

- [8:50:40](https://www.youtube.com/watch?v=undefined&t=31840s) So in fact, let's go there. If this-- a
moment ago FF I claimed was 255-- let's just rewind to week zero and if we wanted to count to 255 in
binary, that's as high as you can count, recall, with eight bits. And there's only a few of these numbers
that are useful to memorize, like 255 is as high as you can count with eight bits if you start at zero,
because two to the eighth is 256, but if you start at zero it's zero through 255.

- [8:51:05](https://www.youtube.com/watch?v=undefined&t=31865s) So in binary, recall if you have


eight bits, all of which were ones, and I won't do out the math pedantically here, but if I do do this plus
this plus this, dot, dot, dot-- that's also going to give me 255. So this is what's interesting here about
hexadecimal. It turns out that an upside of storing values in hexadecimal is that we're going to see the
first F represents the left half of all these bits, and the second F in this case represents the rightmost four
of these bits.

- [8:51:34](https://www.youtube.com/watch?v=undefined&t=31894s) So it turns out hexadecimal is very


useful when you want to treat data in units of four. It's not quite eight, but units of four, and that's not
bad. Which is why-- if you use two digits like I have thus far, 00 or FF or anything in between-- that's
actually a convenient way of representing eight bits in total.

- [8:51:53](https://www.youtube.com/watch?v=undefined&t=31913s) One hex digit for the first four


bits, one hex digit for the second. And again, there's nothing new intellectually here per se, it's just a
different way of representing the same story as before-- zeros and ones. So in what context do we see
this? Well, we talked about memory last week, and we're going to talk more about it this week.

- [8:52:10](https://www.youtube.com/watch?v=undefined&t=31930s) If this is my computer's RAM--


random access memory-- you can again think of each byte as having a number associated with it-- its
address or location. This might be zero, this might be 2 billion, and so in the past I've described these as
just this, using decimal numbers. Here's byte zero, one, two, three, four, five, six, seven, 15, 16 would be
here, and so forth.

- [8:52:31](https://www.youtube.com/watch?v=undefined&t=31951s) But it turns out in the world of


memory, and thus today, programming, people tend to count memory bytes using hexadecimal. Partly
just by convention, but also partly because it's a little more succinct and again, each digit represents four
bits, typically. So what comes after F here? Well, if I think about the computer's memory, I normally
might do after F, which is 15, 16.

- [8:52:57](https://www.youtube.com/watch?v=undefined&t=31977s) But instead, one zero, one one,


one two, one three-- this is not 10, 11, 12, 13, because I claim I'm in the context of hexadecimal now. As
per the previous slide, we already started going into A's through F's, so you immediately see here a
possible problem. Why is this now worrisome, if all of a sudden you're seeing seemingly familiar
numbers like 10, 11, 12, 13? We didn't really stumble across this problem when it was all zeros and ones
before.

- [8:53:26](https://www.youtube.com/watch?v=undefined&t=32006s) Yeah. AUDIENCE: Try to do math


[INAUDIBLE]. DAVID J. MALAN: Yeah, so if you're writing some code in C that's doing some math, you
might accidentally-- or the computer might accidentally confuse hexadecimal with decimal if they look in
some context the same. Any number on the board that doesn't have a letter is ambiguously hexadecimal
or decimal at this point, and so how might we resolve this? Well, it turns out that what computers
typically do is this.

- [8:53:51](https://www.youtube.com/watch?v=undefined&t=32031s) By convention, any time you see


0x and then a number, that's a human convention of saying-- signaling to the reader that this is in fact a
hexadecimal number. So if it's 0x10, that is not the number 10, that is the hexadecimal number one zero,
which recall we said earlier, is how you count up to 16.

- [8:54:14](https://www.youtube.com/watch?v=undefined&t=32054s) And again, these are not the kinds


of things to memorize, it's really just the system for how you think about these things. So henceforth
today, we're going to start seeing hexadecimal in a bunch of contexts. When you write code, you might
even write code using some hexadecimal but again, it's just a different way of representing numbers and
humans have different conventions for different contexts.

- [8:54:33](https://www.youtube.com/watch?v=undefined&t=32073s) All right, so with that said, any


questions now on this building block? But here on out, we'll start using it in some actual code. Any
questions? Nothing so far? All right. So, let's go ahead and consider maybe a familiar example.
Something involving code, where I initialize a variable like n to a value like 50, in this case.

- [8:54:55](https://www.youtube.com/watch?v=undefined&t=32095s) And then let's start to tinker


around with what's going on inside of the computer's memory. In a moment I'm going to load up VS
Code on my computer and I'm going to go ahead and whip up a program that very simply assigns a value
like the number 50 to a variable called n, but today, keep in mind that that variable n and that value 50 is
going to be stored somewhere in my computer's memory, and it turns out today we'll introduce a bit
more syntax so you can actually see where things are being stored.

- [8:55:22](https://www.youtube.com/watch?v=undefined&t=32122s) So let me click over to VS Code


here. I'm going to create a program called address.c just to explore computer's addresses today, and I'm
going to do an include stdio.h, int main(void), as usual. No command line arguments for now. I'm going
to declare that variable n equals 50, and then I'm just going to go ahead and print it out.

- [8:55:41](https://www.youtube.com/watch?v=undefined&t=32141s) So nothing very interesting but I'll


use %i backslash n and then comma n to print out that value. Nothing here should be very interesting to
compile or run, but I'll do it just to make sure I didn't make any mistakes. Looks like as expected, it simply
prints out the number 50, like this. But let's consider then, what this code is doing underneath the hood
when it's actually run on your machine.

- [8:56:05](https://www.youtube.com/watch?v=undefined&t=32165s) So here we have that grid of


memory. That variable n is an int, and if you think back, how many bytes typically do we use for an int?
Yeah. Four, so four bytes, or 32 bits. So if each of these squares represents one byte, then my computer,
somewhere in my memory, or RAM, is using four of these squares. Maybe it ends up over here just
because there's other stuff being used elsewhere, for instance.

- [8:56:29](https://www.youtube.com/watch?v=undefined&t=32189s) Though I don't really know, and


frankly, I don't really care where it ends up, just that it ends up somewhere. So the variable-- the value
50 is stored here in a variable called n. Even though I've written it as decimal, just like in my code-- let me
again remind that this is 32 zeros and ones representing that 50-- it's just going to be very tedious if we
start writing everything in binary, so I'll use the more comfortable human decimal system.

- [8:56:52](https://www.youtube.com/watch?v=undefined&t=32212s) So that's what's going on inside of
the computer's memory. So what if I actually wanted to start tinkering with its location, or maybe just
knowing its location? Well, this variable n indeed has a name, n-- that's a label of sorts for it-- but at the
end of the day that 50 is technically at a specific address, and I'm going to make one up-- 0x123, and it's
123 because I really don't care what it is, I just want an address for the sake of discussion.

- [8:57:18](https://www.youtube.com/watch?v=undefined&t=32238s) So way over here off screen might


be byte zero, way down here is byte 0x123. It's in hexadecimal notation just by convention. So how can I
actually see where my variables are ending up in memory if I'm curious to do so? Well, let me go back to
my code here and let me actually change this just a little bit.

- [8:57:39](https://www.youtube.com/watch?v=undefined&t=32259s) Let me go ahead and introduce,


for instance, another symbol here and another topic altogether, namely pointers. So a pointer is a
variable that stores the address of some value-- the location of some value or more specifically, the
specific byte in which that value is stored. So again, if you think of your memory as being a whole bunch
of bytes-- zero at top left, 2 billion or whatever at bottom right, depending on how much RAM you have--
each of those things has a location, or an address.

- [8:58:11](https://www.youtube.com/watch?v=undefined&t=32291s) A pointer is just a variable storing


one such address. So it turns out that in the world of C, there's a couple of new symbols we can use if we
want to see what it is we're talking about here, and those two operators, as of today, are these. You can
use the ampersand operator in C in a couple of ways.

- [8:58:31](https://www.youtube.com/watch?v=undefined&t=32311s) We already saw it very briefly to


do ampersand ampersand-- it kind of ANDs two Boolean expressions together in the context of a
conditional. This is different. A single ampersand is the address-of operator. So literally, in your code, if
you've got a variable like n or anything else and you write &n, C is going to figure out for you what is the
address of that variable n in the computer's memory.

- [8:58:56](https://www.youtube.com/watch?v=undefined&t=32336s) And it's going to give you a


number, otherwise known as the address of that. If you want to store that address in a variable even
though yes, it's a number like 0x123, you have to tell C in advance that you want to store not an int per
se, but the address of an int. And the syntax for doing that-- somewhat nonobviously-- is to use an
asterisk here, a star operator, and you say this when creating the variable.

- [8:59:26](https://www.youtube.com/watch?v=undefined&t=32366s) If you want p to be a pointer, that


is the address of some other variable, you do int star p. And the star just tells the computer, this is not an
integer per se, this is the address of something that yes, is an int, but we're just being more precise. So
on the right hand side you have the address of operator.

- [8:59:45](https://www.youtube.com/watch?v=undefined&t=32385s) As always with the equal sign, you


copy from right to left. Because &n is by definition the address of something you have to store it in a
pointer, and the way to declare a pointer is to specify the type of value whose address you're storing,
and then use the star to indicate that this is indeed a pointer and not just a regular old int.

- [9:00:05](https://www.youtube.com/watch?v=undefined&t=32405s) So let's see this in practice. Let me


go back to my own source code here and let me make just a couple of tweaks. I'm going to leave n alone
here but I'm going to go ahead and initially just do this. Let me say int star p equals ampersand n, and
then down here, I'm going to print out not n this time, but p-- the variable p.

- [9:00:29](https://www.youtube.com/watch?v=undefined&t=32429s) And then even though yes, it's


just a number and therefore I could use %i for integers, there's actually a special format code in printf for
printing pointers or addresses, and that's %p. So now let's go ahead and recompile this, make address--
so far so good--

- [9:00:49](https://www.youtube.com/watch?v=undefined&t=32449s) ./address, Enter, and a little


weirdly, but perhaps understandably now, the address in my computer's memory at which the variable n
happened to be stored was not quite as simple as 0x123. This computer has a lot more memory so
technically, it was stored at 0x7FFCB4578E5C. Now that has no special significance to me. It could have
ended up somewhere else altogether, but this is just where, in my computer-- or technically the cloud
server to which I'm connected using VS Code here-- that just happens to be where n ended up.

- [9:01:21](https://www.youtube.com/watch?v=undefined&t=32481s) And strictly speaking, I don't even


need to introduce this variable. I could get rid of p and I could just say print not just n, but the address of
n and achieve the same thing. You don't need to temporarily store it in a variable. Let me just do make
address again, ./address, and now I see this address here.

- [9:01:38](https://www.youtube.com/watch?v=undefined&t=32498s) And notice if I keep running the


program, it's actually moving around. There's other stuff presumably going on inside of the computer.
Maybe it's actually randomizing it so it's not always at the same location. That can actually be a security
feature underneath the hood, but this happens to be at that moment in time where that value is in
memory, quite like our picture a moment ago.

- [9:01:59](https://www.youtube.com/watch?v=undefined&t=32519s) All right, so let me pause here to


see if there's now any questions on what we just did. Yeah? AUDIENCE: Is there any way to control where
you are storing something in memory? Does it even matter if it works, or does it just matter that you
could go in and locate where something is? DAVID J. MALAN: Really good question.

- [9:02:18](https://www.youtube.com/watch?v=undefined&t=32538s) Is there any way to control where


something is in memory? Short answer is yes, and this is both the power and the danger of C, and we're
going to do this today and make a few deliberate mistakes, because with this power of going to or
getting the address of any variable, I could just arbitrarily right now write code that stores a value at byte
2 billion, or zero, or anything in between.

- [9:02:38](https://www.youtube.com/watch?v=undefined&t=32558s) But that also means potentially, I


could start creepily looking around at all of the computer's memory, even at things that I didn't put
there. Maybe other programs, maybe other parts of programs and indeed, this is a potential security
threat, if suddenly you're able to just look anywhere you want in the computer's memory.

- [9:02:55](https://www.youtube.com/watch?v=undefined&t=32575s) Now, I'm overselling it a little bit


because nowadays, in this decade, there are some defenses in place in compilers and in our operating
systems that do hedge against this a little bit. But this is still a very frequent source of problems, and
later today we'll talk briefly about things called stack overflow, which is not just a website, it is a problem
that you can encounter.
- [9:03:15](https://www.youtube.com/watch?v=undefined&t=32595s) Heap overflow, and more
generally buffer overflows-- there's just so many things that can go wrong using this language called C,
and if any of you have encountered a segmentation fault yet? I think we saw a few hands for that
already. You touched memory that you shouldn't have and odds are you did it most recently by going too
far in an array.

- [9:03:34](https://www.youtube.com/watch?v=undefined&t=32614s) Going to the left, or negative in an


array, or somehow looking at memory you shouldn't have. And we'll explain today why it is you were
able to do that. Other questions on these primitives so far? Yeah, from Carter? AUDIENCE: [INAUDIBLE]
pointer star p, but then we used p later in the code. Is it called star p or p? DAVID J. MALAN: Good
question.

- [9:03:53](https://www.youtube.com/watch?v=undefined&t=32633s) Earlier, we used star p. Let me


rewind in time to the previous version of this code, where I actually had a variable called p. Just like with
variable declarations in the past, once you've declared a variable to be an int, a char, a bool, or an int
star, a.k.a. a pointer, you don't thereafter keep using the word int or now, the star.

- [9:04:14](https://www.youtube.com/watch?v=undefined&t=32654s) Once you've declared it, that's it.


You only refer to it by name. And so it's very deliberate what I did here, saying that the type here is int
star-- that is a pointer to an int-- but here I just said the name of the variable, as always. I didn't repeat
int, and I also didn't repeat star.

- [9:04:32](https://www.youtube.com/watch?v=undefined&t=32672s) But at the risk of bending one's


mind a little bit, there is unfortunately one other use for the star operator, and that's as follows. If you
want to print out not the address of something, but what is at a specific address, you can actually do
this. If I want to print out the integer via %i, that is at that address, I can actually use the star here, which
technically contradicts what I just said but it has a different function here-- a different purpose.

- [9:05:03](https://www.youtube.com/watch?v=undefined&t=32703s) So let me go ahead and do this in


two different ways. I'm going to leave this line of code as is, but I'm going to add another line of code
now that prints out what apparently will be an integer, in a moment. So %i backslash n, and I could see--
and let me just do n for now. So there's really nothing special happening now, I'm just adding a sort of
mindless printing of n.

- [9:05:21](https://www.youtube.com/watch?v=undefined&t=32721s) So make address, ./address--


there's the current address of n and there's the value of n. But what's kind of cool about C here, too, is if
you know that a value is at a specific address like p, there's one other use for this star operator, the
asterisk. You can use it as the so-called dereference operator, which means go to that address.

- [9:05:44](https://www.youtube.com/watch?v=undefined&t=32744s) And so here what we actually


have is an example of a pointer p, which is an address like 0x123 or 0x7FF and so forth. But if you say star
p now, you're not redeclaring the variable because I didn't mention int-- you're going to that address in
p. So let me recompile this now. Make address,

- [9:06:11](https://www.youtube.com/watch?v=undefined&t=32771s) ./address, and just to be clear--


what should I see? I'm first going to see the pointer itself, 0x something. What's the second line of
output I should presumably see now? Shout a little louder. So I'm hearing 50, and that's true because if
you figure out the address of n and
- [9:06:34](https://www.youtube.com/watch?v=undefined&t=32794s) print it in line seven, but then
go to the address of n, a.k.a. p, that's indeed going to just show you the number n-- the value of n again.
All right, any questions now on this syntax-- and I will concede, I think this is confusing-- the fact that we
use the star for multiplication, the fact that we use the star to declare a pointer, but then we use a star in
a third way to dereference the pointer and go to the pointer.

- [9:06:53](https://www.youtube.com/watch?v=undefined&t=32813s) It's just too confusing, honestly,


but with practice comes comfort. Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Good question. Do
you-- when you are using the ampersand operator to get the address of something, the onus is on you at
the moment to know what you are getting the address of. Is it a string? Is it a char? Is it a bool? Is it an
int? I wrote this code so I know in line six that I'm trying to get the address of what is an integer.

- [9:07:28](https://www.youtube.com/watch?v=undefined&t=32848s) AUDIENCE: What about line


eight? DAVID J. MALAN: In line eight you don't have to worry about that-- good question. Notice in line
eight, I didn't tell the computer, other than the %i, what kind of address I'm going to, but I did already in
line six. I told the compiler that p, now and forever, is going to be the address of an int.

- [9:07:50](https://www.youtube.com/watch?v=undefined&t=32870s) That's enough information in


advance so that printf, or really the language C, still knows on line eight that p is a pointer to an int, and
that way it will print out all four bytes at that address, not just part of it, and not more than those four
bytes. Good question. Yeah, next to you. AUDIENCE: Do pointers have pointers? DAVID J.

- [9:08:12](https://www.youtube.com/watch?v=undefined&t=32892s) MALAN: Do pointers have


pointers? Yes. We won't do this today by having pointers to pointers, but yes, you can use star star, and
then things get-- I'm sorry. We won't do that today and we won't do that often. In fact Python, another
language, is just a couple of weeks away, so hang in there. Almost there. A question back here? Was
there? That was-- more verbal feedback like that is helpful as we forge into the more complicated stuff.

- [9:08:36](https://www.youtube.com/watch?v=undefined&t=32916s) Other questions? Yeah.


AUDIENCE: What's the point of [INAUDIBLE]? DAVID J. MALAN: What's the point of printing the
address? AUDIENCE: Like, using the address to [INAUDIBLE]. DAVID J. MALAN: Sure. What's the point of
doing this? If you don't mind, let me-- let's get there in a moment. This is not the common use case, just
printing out the address-- who really cares? At the moment we care only for the sake of discussion.

- [9:09:01](https://www.youtube.com/watch?v=undefined&t=32941s) We're soon going to start using


these addresses. So hang in there just a little bit for that one, too, but it will solve some problems for us
before long. So let's actually just now depict what was going on inside of the computer's memory just a
moment ago. So if I toggle back here, let me redraw my computer's memory, now let me plop into the
memory n, which is storing in this program the number 50.

- [9:09:24](https://www.youtube.com/watch?v=undefined&t=32964s) Where is p in my computer's


memory? Specifically, I don't know and apparently it moves around each time I run the program so for
the sake of discussion, let's just propose that if 50 ended up at address 0x123, I don't know-- p ends up
over here, at address-- whoops-- at whatever address this is here.

- [9:09:42](https://www.youtube.com/watch?v=undefined&t=32982s) But notice a couple of curiosities


now. If p is a pointer, it's the address of something. So the value in p should be an address, and I've
indeed written it as such-- 0x123, and technically there's not an x there, there's not a zero there, there's
not even a 123 there per se-- there's a pattern of bits that represents the address 0x123.

- [9:10:03](https://www.youtube.com/watch?v=undefined&t=33003s) But again, that's week zero-- don't


care about binary day-to-day. So if this is p, and this I claimed was n, why is p so much bigger? Can
someone conjecture here? Because it turns out whether n is an int or a char or a bool, which are
different types-- heck, even a long-- it turns out that p is always going to take up eight squares on the
board, but why might that be? What might explain that? Yeah, thoughts? AUDIENCE: Perhaps it allocates
eight bytes, but it doesn't know the type of the data [INAUDIBLE].

- [9:10:44](https://www.youtube.com/watch?v=undefined&t=33044s) DAVID J. MALAN: OK, fair. Maybe


it's allocating eight bytes because it doesn't know the type. Turns out that's OK because an address is an
address. It's really up to the programmer to use it as a string or a char or a bool. Other thoughts?
AUDIENCE: Maybe the first four for the actual number and the last four is some null that [INAUDIBLE]
where the pointer ends.

- [9:11:06](https://www.youtube.com/watch?v=undefined&t=33066s) DAVID J. MALAN: OK, possibly. It


could be that pointers have some complexity like a backslash zero or something curious like that, like we
talked about for strings. Turns out that's not the case. It turns out that pointers nowadays typically are,
but not always are eight bytes, a.k.a. 64 bits, because you and I-- our Macs, our PCs, heck-- even our
phones have a lot more memory than they did years ago.

- [9:11:28](https://www.youtube.com/watch?v=undefined&t=33088s) Back in the day, a pointer might


have only been 32 bits, or even only eight bits way back in the day. Let's consider 32 bits, because that
was the norm for some time. How high can you count, roughly, if you've got 32 bits? What's the number
we keep rattling off? 32 bits is roughly 2 to the 32, so it's 4 billion, and I keep saying it's 2 billion if you do
negative, but in the world of memory there's a reason I keep saying 2 billion bytes, two gigabytes,
because for a very long time that was the maximum amount of memory

- [9:11:59](https://www.youtube.com/watch?v=undefined&t=33119s) a computer could have. Why?


Because the pointers that the computers were using were only, for instance, 32 bits. And with 32 bits,
depending on whether you allow for negatives or not, you can count as high as 2 billion, roughly, or
maybe 4 billion but you know what-- your Mac, your PC, your phone could not have had five gigabytes of
memory, or 5 billion bytes of memory.

- [9:12:18](https://www.youtube.com/watch?v=undefined&t=33138s) You certainly couldn't have had


what computers nowadays come with, which might be 8 gigabytes of memory-- 16 gigabytes of memory.
Why? Because with 4 bytes, or 32 bits, you literally, physically, can't count that high, which means if I
drew a picture of all of the memory we would run out of numbers to describe them, which means most
of my memory would just be unusable.

- [9:12:38](https://www.youtube.com/watch?v=undefined&t=33158s) So pointers nowadays are 64 bits,


or eight bytes. That's really big. I can't even pronounce how big that number is, but it's plenty for the
next many years, and so we've drawn it that way on the board here. Now let's just abstract this away.
Let's get rid of all the other bytes that are storing something or nothing else, and let's now start to
abstract away this complexity because the reality is, to your question earlier-- what is this useful for, or
what do we-- do we actually
- [9:13:02](https://www.youtube.com/watch?v=undefined&t=33182s) care about these addresses?
Generally, no. We're doing this so that you see there's no magic. We're just moving things around and
poking around in memory. But what a person would typically do when talking about pointers would
literally be to just point at something. I really don't care what address n is at, so it suffices in general,
when drawing pictures on a whiteboard, having a discussion with another programmer, you just draw an
arrow from the pointer to the value in question, because neither you nor I probably care about the
specifics of 0x whatever.

- [9:13:32](https://www.youtube.com/watch?v=undefined&t=33212s) There's your pointer-- it's literally


an arrow, and we can see this. So it turns out that these pointers, these addresses, are not that dissimilar
to what we've done for hundreds of years in the form of a postal system. For instance, here is a post
office-- here, no-- here is a mailbox, and suppose that this is a mailbox labeled p.

- [9:13:51](https://www.youtube.com/watch?v=undefined&t=33231s) It's a pointer, and suppose there's


another mailbox way over there, which is just another byte of my computer's memory. What are we
really talking about? Well, you store in a computer's memory values like the number 50, or the word "hi"
inside of your computer's memory at some location.

- [9:14:07](https://www.youtube.com/watch?v=undefined&t=33247s) But today we can also use those


same memory locations to store the address of things. For instance, if I open this up here and I see OK,
the value inside of this mailbox is not a number like 50, it's actually an address-- 0x123-- that's like a
pointer, a breadcrumb leading from one location in memory to another.

- [9:14:28](https://www.youtube.com/watch?v=undefined&t=33268s) And in fact, would someone


who's seated roughly over there-- do you mind getting the mail over there? Any volunteers over in this
section? Just need you to get to the mailbox before I do. Who's being volunteered? Oh yes, please.
Whoever is gesturing most wildly, come on down. Sure. What's your name? AUDIENCE: Anfoo.

- [9:14:55](https://www.youtube.com/watch?v=undefined&t=33295s) DAVID J. MALAN: Say again?


AUDIENCE: Anfoo. DAVID J. MALAN: Anfoo? OK, come on up to the edge of the stage there and just to be
clear-- if this is p, that is apparently n, but to make clear what we're talking about when we're storing 0x
whatever values-- like 0x123, that's essentially equivalent to my maybe pulling out something like this
and just abstractly pointing to your mailbox there, or if you prefer, pointing to the mailbox-- OK, all right.

- [9:15:24](https://www.youtube.com/watch?v=undefined&t=33324s) Thank you. All right. This is akin to


me pointing at your mailbox, and if you want to go ahead and open your mailbox and reveal to the
crowd what's inside your mailbox labeled n. All right. Thank you. We have a little CS50 stress ball for your
trouble. Thank you for coming up. So that's just to put a visual on what it is we're talking about, because
it can get very abstract, very cryptic quickly when we're talking about addresses and memory and
drawing it like these little squares.

- [9:15:57](https://www.youtube.com/watch?v=undefined&t=33357s) But if you think about just walking


into a post office or an apartment complex that's got a lot of mailboxes, those mailboxes essentially are a
big chunk of memory and each of those mailboxes has an address-- this is apartment one, two, three--
apartment 2 billion. And inside of those mailboxes can go anything that can be represented as
information.
- [9:16:16](https://www.youtube.com/watch?v=undefined&t=33376s) It could be a number like n, or 50,
or if you prefer it could be a number that represents the address of another mailbox. And this is akin,
really, if you've ever had an apartment or you and your parents have moved, to having a forwarding
address. It's like having the Post Office in the US put some kind of piece of paper in your old mailbox
saying, actually forward it to that other mailbox.

- [9:16:37](https://www.youtube.com/watch?v=undefined&t=33397s) That really is all a pointer is doing.


At the end of the day, it's just a number but it's a number being used in a different way and it's the
syntax that we've introduced, not just int but int star, that tells the computer how to treat that number
in this slightly different way. Are there any questions then, on this? Yeah, in back.

- [9:16:59](https://www.youtube.com/watch?v=undefined&t=33419s) AUDIENCE: If you had a variable,


like int c, [INAUDIBLE]. DAVID J. MALAN: If I did int c and-- say the code again? Once more? Equal to n,
so let me actually type it out. If I give myself another line of code, tell me one last time what to type. int c
is equal to n, like this? So this is OK, and I can't draw it quite quickly enough on the board here, but this
would be like creating another four bytes somewhere in memory, maybe down here, that stores an
identical copy of 50 because the assignment operator from right to left copies one value

- [9:17:39](https://www.youtube.com/watch?v=undefined&t=33459s) to another. So that would just add


one more rectangle of size four to this particular picture. If I'm answering your question as intended. OK,
so that is week one style use of assignment operators before pointers. I could, though, start copying
pointers but again, we'll come back to some of that complexity.

- [9:17:57](https://www.youtube.com/watch?v=undefined&t=33477s) Any other questions here?


AUDIENCE: That was a great question. Does the pointer point-- does the same pointer point to the new
replica as well? DAVID J. MALAN: Ah, good question. Short answer, no. And to repeat for the camera, if I
create a second variable like this, int c equals n, and I claim without actually drawing it on the board that
this gives me another rectangle, the value of which is also 50, p does not get touched.

- [9:18:22](https://www.youtube.com/watch?v=undefined&t=33502s) And this is what's important and


really characteristic of C. Nothing happens automatically for you. p is not going to be updated unless you
update p in some way, so creating a third variable called c-- even if you're copying its value from right to
left, that has no effect on anything else in the program.

- [9:18:40](https://www.youtube.com/watch?v=undefined&t=33520s) A good question. So what have


we seen that's perhaps now a little more explainable? Well, recall that we talked quite a bit last week
about strings, and just to recap in layperson's terms, what is this string as you now understand it? So
say-- well, let me take a specific hand here. What's a string? How about over here.

- [9:19:02](https://www.youtube.com/watch?v=undefined&t=33542s) AUDIENCE: An array of characters.


DAVID J. MALAN: OK, sure. Both of you are right. An array of characters. An array of characters, and we--
I claimed-- or revealed last week that string is not technically a feature built into C. It's not an official data
type but every programmer in most any language refers to sequences of characters-- words, letters,
paragraphs-- as strings.

- [9:19:23](https://www.youtube.com/watch?v=undefined&t=33563s) So the vernacular exists but the


data type doesn't typically exist per se in C. So what we're about to do, if you will, for dramatic effect, is
take off some training wheels today. The CS50 library implemented in the form of the header file cs50.h--
we claim has had a bunch of things in it. Prototypes for get_string, prototypes for get_int, and all of those
other functions, but it turns out it also is what defines the word "string" in such a way that you all can
use it these past several weeks.

- [9:19:51](https://www.youtube.com/watch?v=undefined&t=33591s) So let's take a look at an example


of a string in use. Here, for instance, is a tiny bit of code that uses the word "string," creating a variable
called s and then storing quote unquote, hi, exclamation point. Let's consider what this looks like now in
the computer's memory. I don't care about all the other bytes, let's just focus on these, and this per last
week is how "hi" might be stored.

- [9:20:12](https://www.youtube.com/watch?v=undefined&t=33612s) h-i exclamation point and then


one more, as someone already observed, that sentinel value-- that null character which just means eight
zero bits to demarcate the end of that string just in case there's something to the right of it, the
computer can now distinguish one string from another. So last week we introduced this new syntax.

- [9:20:30](https://www.youtube.com/watch?v=undefined&t=33630s) Well, if strings are just arrays of


characters you can then very cleverly use that square bracket notation and go to location zero or one or
two, which are like addresses, but they're relative to the string. This could be at 0x123 or 0x456, but with
this bracket notation zero is always the beginning of the string, one is the next, two is the next, and so
forth.

- [9:20:51](https://www.youtube.com/watch?v=undefined&t=33651s) So that was our array syntax for


indexing into an array. But technically speaking, we can go a little deeper today-- technically speaking, if
hi is starting at the address 0x123 then it stands to reason that i is at 0x124, exclamation point's at
0x125, and the null is at 0x126. Now, I don't care about 123 per se, but even though this is
hexadecimal, this is correct math.

- [9:21:20](https://www.youtube.com/watch?v=undefined&t=33680s) Even in hex, if you just add one


when you start at 0x123, the next number is four, five, six at the end. I don't have to worry about A's, B's,
and C's because I'm not counting that high in this example. So if that's the case, and my computer is
actually laying out the word hi in memory like that, well, what exactly is s? What exactly is s if, at the end
of the day, H-I exclamation point null is storing-- or is stored at these addresses? Where is s? Now that
I've taken off those training wheels

- [9:21:54](https://www.youtube.com/watch?v=undefined&t=33714s) and showed you where H-I


exclamation point null actually are, what happened to s? Well s, as always, is actually a variable. Even in
the code I proposed a moment ago, s is apparently a data type that yes, doesn't come with C, but CS50's
library makes it exist. s is a variable of type string, so where is s in this picture? Well, it turns out that s
might be up here.

- [9:22:21](https://www.youtube.com/watch?v=undefined&t=33741s) Again, I'm just drawing it


anywhere for the sake of discussion, but s is a variable per that line of code. What s is storing,
apparently, I claim, is 0x123. I actually don't really care about these addresses, so let's abstract that
away. s is apparently, as of now, today, one week later, just a pointer to a character.

- [9:22:42](https://www.youtube.com/watch?v=undefined&t=33762s) Specifically, the first character in


s. And this is the last piece of the puzzle. Last week we had this clever way of demarcating the end of a
string. Well, it turns out that strings are represented in the computer's memory as a variable that is a
pointer, inside of which is the address of the first character in the string.

- [9:23:02](https://www.youtube.com/watch?v=undefined&t=33782s) So if s points at the first character


and you can trust that backslash zero is at the end of the string, that's literally all you need to figure out
where a string begins and ends. So what do I mean by this? Well, let's be a little more concrete. In terms
of this picture, if I've started with this line of code here, it turns out all this time since week 1, that the
word string has just semi-secretly been an alias for char star.

- [9:23:32](https://www.youtube.com/watch?v=undefined&t=33812s) I know, so char star. So why does


this make sense? It's a little weird still, but if in our previous example we were able to store the address
of an integer by declaring a variable called p, as int star p-- well, if as of now strings are just the address
of the first character in a string, then probably a string is just a char star because that means s is the
address of a character, the very first character in the string.

- [9:23:59](https://www.youtube.com/watch?v=undefined&t=33839s) Now, the string might have three


letters like it did, or four, or even a hundred if it's a long paragraph, but that's fine because you can trust
that there's going to be that null character at the very end. So this is a general purpose way of
representing strings using this new mechanism in C.

- [9:24:15](https://www.youtube.com/watch?v=undefined&t=33855s) So in fact, let me go ahead here


and introduce maybe a couple of manipulations of this. Let me go back to my code here, and let's get rid
of this integer stuff, and let's instead now do, for instance, this. Let me add in the CS50 library, so we'll
include cs50.h for now. I'm going to go ahead and inside of main, give myself a string s equals hi
exclamation point.

- [9:24:37](https://www.youtube.com/watch?v=undefined&t=33877s) I don't type the backslash zero. C


does that for me automatically by using my double quotes like this. Now let me just go ahead and print
it. So this again is week 1 style stuff where I'm just printing a string. No pointers yet. So let me do make
address, Enter, ./address, and hopefully I see hi, so nothing new there.

- [9:24:57](https://www.youtube.com/watch?v=undefined&t=33897s) But let's start to peel back some


of these layers here. Let me first of all, get rid of the CS50 library for a moment and let me change string
to char star. And it's a little bit weird but yes, the convention is to say char, a space, then the star, and
then immediately thereafter the name of the variable.

- [9:25:16](https://www.youtube.com/watch?v=undefined&t=33916s) Strictly speaking though, you


might see textbooks or websites that do it like this or like this, but the canonical way is typically to do it
like that. So now no more CS50 library, no more training wheels, if you will. I'm just treating strings for
what they really are. Let me go ahead and do make address, Enter-- so far so good--

- [9:25:35](https://www.youtube.com/watch?v=undefined&t=33935s) ./address-- and that, too, still


works. So %s is a thing that comes with printf because the word string is programmer terminology but
strictly speaking C doesn't have a string data type. It's always been char star, so what this means now is I
can start to have some fun with these basic ideas, even though this is not purposeful other than for the
sake of discussion.
- [9:25:55](https://www.youtube.com/watch?v=undefined&t=33955s) But if s is this-- let me go back and
give myself the CS50 library. Let's put those training wheels back on for just a moment so that I can do
one manipulation at a time. Here's my string s, as before. Well, let me go ahead and declare a char called
c, and let me store the first character in the string there, which is s bracket zero, and that should give me
h.

- [9:26:18](https://www.youtube.com/watch?v=undefined&t=33978s) And then just for kicks, let me go


ahead and do char star-- whoops-- let me go ahead and do char star p equals ampersand c, and see what
this actually prints for me. Let me go ahead and print out what p is here. So we're just playing around. So
make address-- so far so good-- ./address. All right, so what have I just done? I've just created a char c
and stored in it the letter H, which is the same thing as s bracket zero, then I'm saying, what's the address of
c, and that's apparently 0x7FF whatever.

- [9:26:54](https://www.youtube.com/watch?v=undefined&t=34014s) So that's the address. But I


technically didn't have to do that. Let me go ahead and do two things now. Instead of just printing p, let
me go ahead and print out maybe s itself. Let me go ahead and do make address, Enter-- so far so
good-- ./address and-- damn it, what did I do wrong. Oh shoot, I didn't want to do that.

- [9:27:18](https://www.youtube.com/watch?v=undefined&t=34038s) Oh, I really made a mess of this.


What did I want to do here? That was supposed to be impressive but it was the opposite. So let me turn
it around. So if I intended to do this, why are lines nine and 10 printing different values? Didn't really
intend to go here, but let me try to save this. Why are we seeing different addresses, namely this address
402004 for s, and then 0x7FF for p? Any thoughts? Yeah, over here.

- [9:27:55](https://www.youtube.com/watch?v=undefined&t=34075s) AUDIENCE: [INAUDIBLE] is the


character c is its own sort of location of the [INAUDIBLE], and it's taking off just the values [INAUDIBLE].
DAVID J. MALAN: Correct. So if I really wanted to weasel my way out of this, this is a great answer to the
previous question which was about, what if I introduce another variable, c, that's a copy of the value,
and not in this case an int, but an actual char.

- [9:28:18](https://www.youtube.com/watch?v=undefined&t=34098s) Here, I've made c be a copy of the


character that's at the beginning of s, but that's indeed a copy. So if I were to draw it on the screen that
would give me a different rectangle in which this copy of h would actually be stored. So I didn't intend to
do this, but what you're seeing is yes, the address of s-- and apparently that's at a pretty low address by
default here-- then you're seeing the address of c.

- [9:28:40](https://www.youtube.com/watch?v=undefined&t=34120s) But even though each of them is


h, I claim one is at a different address in memory. And this has always been happening. Any time you
created one variable or another it was ending up here, or here, or here, or somewhere else in memory.
Now for the first time all we're doing is actually just poking around the computer's memory to see what
is actually there.

- [9:28:58](https://www.youtube.com/watch?v=undefined&t=34138s) So let me actually back this up a


little bit and do what I intended to do here, which was something like this. So if string s equals quote
unquote, hi, let's go ahead and give myself a pointer, called p, to the first character in s. All right, so now
let me go ahead and print out the value of this pointer, %p, printing out p.
- [9:29:24](https://www.youtube.com/watch?v=undefined&t=34164s) So we're just going to do one
thing at a time. So make address, Enter, ./address. There, at the moment, is the address of the first
character in s. What I meant to do now, was this. If I want to print out two things this time, let me print
out not only what p is, but also what s itself originally is. Because if I claim that everyone from last week
should be comfortable with s bracket zero just representing the first character in s by definition of strings
being arrays of characters.

- [9:29:55](https://www.youtube.com/watch?v=undefined&t=34195s) Then s, as of today, is itself the


address of a character, the first one in s. So if I now do make address, and do ./address, this time I see
the same exact things. Thank you. This is really the lamest sort of thing to be applauding over, but what
we're demonstrating here is that s is by definition the address of the first character in the string.

- [9:30:24](https://www.youtube.com/watch?v=undefined&t=34224s) So if we borrow some of our


mental model from last week-- well, if s bracket zero is the first character in s, doing the ampersand on
that expression should be the same as s. Now this isn't to say that we would jump through these hoops
all the time with this much syntax, but this is just to do proof by example that s is in fact, as I claimed a
moment ago, just the address of a character.

- [9:30:47](https://www.youtube.com/watch?v=undefined&t=34247s) Not even multiple characters, it's


the address of a single character, but the key thing is it's the address of the first character in the string,
and per last week we trust that C is going to look for that null character at the very end just to make sure
it knows where the string actually ends. All right, a question came up over here.

- [9:31:08](https://www.youtube.com/watch?v=undefined&t=34268s) AUDIENCE: [INAUDIBLE] DAVID J.


MALAN: Correct. To summarize, on line eight, when I am using %p-- that just means print a pointer
value, so 0x something-- I'm passing it s. Previously, when we used %s, printf knew to print not just the
first character of s, but h, i, exclamation point, and then stop when it hits the backslash zero.

- [9:31:42](https://www.youtube.com/watch?v=undefined&t=34302s) p is different. %p tells the


computer to go to that address-- sorry, tells the computer to print that address on the screen. So this is
where %s all this time has been powerful. The reason printf worked in week 1 and 2 and 3 was because
printf was designed by some human years ago to go to the address that's being passed in-- for instance,
s-- and print out character after character after character until it sees the null character backslash zero,
and then stop printing it.

- [9:32:13](https://www.youtube.com/watch?v=undefined&t=34333s) So that's-- you're getting a lot of


functionality for free from %s. Today we're using something much simpler, %p, which just literally prints
what s is. And the reason we don't do this in week 1 is just because this is like way too much to be
interesting when all you want to print out is hi or hello, world, or the like.

- [9:32:30](https://www.youtube.com/watch?v=undefined&t=34350s) But now what we're really doing


is revealing what's been going on this whole time. And let me make one other example here. Let me go
ahead and get rid of this variable here and let me just print out a few things to make the same point. I'm
going to print out not just s like I did here, but let's go ahead and print out every-- the address of every
character in s.

- [9:32:48](https://www.youtube.com/watch?v=undefined&t=34368s) So let's get the first letter in s and


get its address, and I'm going to do copy paste for time's sake, but not something I would do frequently.
So let me print out the address of the first character, the second character, the third, and actually even
the fourth, which is the backslash zero, by doing this.

- [9:33:07](https://www.youtube.com/watch?v=undefined&t=34387s) So when I compiled this


program-- make address, ./address-- I should see two identical values and then additional values that are
one byte away. In my diagram a moment ago, my addresses were arbitrarily 0x123, 124, 125, 126. Now it
starts at, by chance, 0x402004, which is s. 0x402004 is the same thing as s because I'm just saying go to
the first character and then get its address.

- [9:33:35](https://www.youtube.com/watch?v=undefined&t=34415s) Those are one and the same now.


And then after that is 0x402005, 006, 007, because that is just like the diagram. Go to the i, to the
exclamation point, and to the null character. So all I'm doing now is using my newfound understanding of
what ampersand does and what the star does, is I'm just playing around.

- [9:33:55](https://www.youtube.com/watch?v=undefined&t=34435s) I'm poking around in the


computer's memory. Just to demonstrate there's no magic. It's all there very deliberately because I or
printf or someone else put it there. Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Really good
observation. So it's indeed the case that hi, unlike 50, is ending up at a very low address, not the 0x7FF
wherever it was.

- [9:34:22](https://www.youtube.com/watch?v=undefined&t=34462s) That's actually because, long story


short, strings are often stored in a different part of the computer's memory-- more on that later today--
for efficiency. There's actually only going to be one copy of the word "hi" and exclamation point, and the
computer is going to tuck it at the beginning of my memory, but other values like ints and floats and the
like-- they end up elsewhere in memory, at higher addresses like that 0x7FF one, by convention.

- [9:34:42](https://www.youtube.com/watch?v=undefined&t=34482s) But a good observation, because


that is consistent here. All right, so a couple final details then, on what's been going on here. Let me go
ahead and claim that we implemented char star-- or rather, string as a char star as follows. As of last
week we were writing this code. As of this week, we can now start writing this code because string,
specifically, is something we invented in the CS50 library.

- [9:35:07](https://www.youtube.com/watch?v=undefined&t=34507s) But it turns out you've seen a way


of inventing your own data types. Recall this thing here. We played around last time with data structures,
or the struct keyword in C, and briefly the typedef keyword, which defines a type for you. And if I
highlight what's interesting here, the way we invented a person data type last time was to define a
person as having two variables inside of it-- a structure that encapsulates a name and encapsulates a
number.

- [9:35:34](https://www.youtube.com/watch?v=undefined&t=34534s) Now even though the syntax is a


little different today because of the star thing, notice that this could be a similar application of that idea.
If I want to create a type called string, highlighted in yellow here, then I use typedef to make it defined to
be char star. So this is literally all that has ever been in

- [9:35:55](https://www.youtube.com/watch?v=undefined&t=34555s) cs50.h, in addition to those prototypes


of functions we've talked about. typedef char star string is a one-line code that brings the word string as
a data type into existence, and that's all that's ever been there. But the star, the char star, is just too
much in week 1. We wait until this point to peel back that layer.
- [9:36:14](https://www.youtube.com/watch?v=undefined&t=34574s) Are there any questions, then, on what
a string is? What star or the ampersand are doing? Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Oh
my God. Massive spoiler, but yes. If that is-- is that why when you compare two strings as I briefly did, or
almost did, problems arise. And in fact yes, last week we used str compare-- strcmp-- for a very
deliberate reason because yes, the spoiler is I accidentally would have compared two addresses in
memory, not the strings at those addresses.

- [9:36:47](https://www.youtube.com/watch?v=undefined&t=34607s) Other questions here. All right,


well, before we give ourselves maybe a 10 minute break here, we have lots of pieces of paper. If anyone
wants to come on up and play with this big stack of Post-Its, if you want to make your own eight by eight
grid of something to share with the class if you're artistically inclined, come on up.

- [9:37:03](https://www.youtube.com/watch?v=undefined&t=34623s) Otherwise, let's take 10 minutes


and we'll return after 10. All right, so let's come back to this question of how we can start to use these
pointers and these addresses, ultimately in an interesting way. The goal ultimately next week is going to
be to use these addresses to really stitch together more complicated data structures than just persons,
like last week, or candidates in the context of an electoral algorithm, if you will, and actually really use
our memory in the most versatile way to represent not just images but maybe videos

- [9:37:32](https://www.youtube.com/watch?v=undefined&t=34652s) and other two-dimensional


structures as well. But for now, let's come back to this address example, whittle it down to just a hi
initially, and see what's going on again, here underneath the hood. So let me re-add the CS50 library just
so we use our synonym for a moment, that is the word string, and I'll redefine s as a string.

- [9:37:52](https://www.youtube.com/watch?v=undefined&t=34672s) And what I didn't mention before


is that these double quotes that you've been using for some time are actually a little special. The double
quotes are a clue to the compiler that what is between them is in fact a string as we now know it, which
means the compiler will do all the work of figuring out where to put the h, the i, the exclamation point,
and even adding for you automatically a backslash zero.

- [9:38:14](https://www.youtube.com/watch?v=undefined&t=34694s) And what the compiler will do for


you, too, is figure out what address all four of those chars ended up at and store it for you in the variable
s. So that's why it just happens with strings without using ampersands or even stars explicitly, but the
star at least has been there because again, string is just synonymous now with char star.

- [9:38:34](https://www.youtube.com/watch?v=undefined&t=34714s) It's not really as readable, but it is


now the same idea. So I'll leave string in place just to do something week 1 style here for a moment, and
let's go ahead and print out a few characters. So I'm going to use %c this time, and I'm going to print out
s bracket zero and then I'm going to print out s bracket one and s bracket two, literally doing week three
style from last week-- a printing of every character in s as though it were an array.

- [9:39:03](https://www.youtube.com/watch?v=undefined&t=34743s) So ./address should give me h-i


exclamation point. And if I really want to get curious, technically speaking, I could print out one more
location, and let me go ahead and recompile, make address ./address and there is, it would seem, the
backslash zero. I'm not seeing zero because I didn't type literally the zero char in ASCII, it's literally eight
zero bits which are technically unprintable, if you will, in printf speak.
- [9:39:30](https://www.youtube.com/watch?v=undefined&t=34770s) And so what I'm seeing here is
like a blank symbol. That just means there is something else there-- it's apparently all eight zero bits, but
they are there even though we're not seeing them literally right now. Well, let's go ahead and peel back
one of these layers and let me go ahead and get rid of the CS50 library and get rid of, therefore, the word
string because again, henceforth it's just char star.

- [9:39:52](https://www.youtube.com/watch?v=undefined&t=34792s) Nothing else is different. I'm going


to now do make address, ./address, and it's the same exact thing. And now, let's just focus on the hi
rather than even worry about that. So I'm going to recompile one last time and now I have h-i
exclamation point. Well, it turns out that the array notation we used last week was technically some of
this syntactic sugar.

- [9:40:13](https://www.youtube.com/watch?v=undefined&t=34813s) Sort of a neat way to use syntax in


a useful way, but we can see more explicitly today what the square brackets for a string are actually doing.
Let me go ahead and do this. Let me adventurously say I want to print out not s bracket zero, but I want
to print out whatever the first character of s is. So to be clear, what is s now? It's the address of a string.

- [9:40:40](https://www.youtube.com/watch?v=undefined&t=34840s) OK, but what is s, really? s is the


address of the first char in a string and again, that's sufficient for defining a string because eventually the
computer will see that there's a backslash zero at the end of it. So s is specifically the address of the first
character in a string. So that means, using my new syntax, if I want to print out that first character I can
print out star s, because recall that star is the dereference operator when you don't repeat the word
char, you don't repeat the word int--

- [9:41:09](https://www.youtube.com/watch?v=undefined&t=34869s) you just use the star here. That


means go to that address. Similarly, if I, in my newfound knowledge of how strings work, know that the h
comes first, then the i right after it, then the exclamation point, then the backslash zero, contiguously
one byte apart, I could start to do some arithmetic. I could go to s plus 1 byte and print out the second
character, and I could print out whatever is at s plus 2-- in fact, doing what's generally known as pointer
arithmetic.

- [9:41:42](https://www.youtube.com/watch?v=undefined&t=34902s) Literally treating pointers as the


numbers they are-- hexadecimal or decimal, doesn't really matter-- it's still just numbers. And go ahead
and add one byte or two bytes to them to start at the beginning of a string and just poke around from
left to right. So this now is equivalent to what we did last week using square bracket notation, but now
I'm reimplementing that same idea with this lower level plumbing, understanding ampersands and stars
now a little bit more, so if I remake this program and do ./address,

- [9:42:12](https://www.youtube.com/watch?v=undefined&t=34932s) I should still see h-i exclamation


point. But what I'm really doing is just kind of demonstrating, hopefully, my understanding of what really
is going on in the computer's memory. Now, programmers who are maybe trying to show off might
actually write this syntax. I think the more common syntax would be what we did last week-- s bracket
zero, s bracket one.

- [9:42:30](https://www.youtube.com/watch?v=undefined&t=34950s) Why? It's just a little more


readable and we don't need to brag about or care about this underlying representation. The square
brackets last week were an abstraction, if you will, on top of what is lower level math. But that's all
that's going on underneath the hood. We're poking around from byte to byte to byte.

- [9:42:48](https://www.youtube.com/watch?v=undefined&t=34968s) All right, let me pause here, see if


there's any questions on that one. Any questions on this? Let's do one more then, just to demonstrate
that this is not even specific to strings. Let me go ahead and get rid of all of this and let me give myself
an array of numbers like I did last week. So if I'm going to declare all the numbers at once using this
funky curly brace notation, I can do like 4, 6, 8, 2, 7, 5, 0.

- [9:43:15](https://www.youtube.com/watch?v=undefined&t=34995s) So seven different numbers inside


of an array that's automatically initialized like this. I don't, strictly speaking, need to say seven. The
compiler is smart enough to figure out how many numbers I put with commas between them, and that
just gives me an array containing 4, 6, 8, 2, 7, 5, 0. So it turns out I can print each of these numbers in the
familiar way.

- [9:43:35](https://www.youtube.com/watch?v=undefined&t=35015s) I can do a printf of %i backslash n,


and I can print numbers bracket zero, and let me just do some quick copy/paste just to print the first
three of these. Theoretically, that should print out 4, 6, 8, and so forth. But I can do the same sort of
manipulation understanding what pointers now are, using pointer arithmetic.

- [9:43:55](https://www.youtube.com/watch?v=undefined&t=35035s) So let me actually unwind this


and just go back to one printf, and instead of printing numbers bracket zero like I might have last week,
let me just go and print out whatever is at that address-- so asterisk numbers. Let me then print out the
second digit, which is going to be whatever is at numbers plus 1, and then let me do this further and do
whatever is at numbers plus 2, and if I really want to repeat this, let me do it four more times and do
what's at location three, four, five, and six.

- [9:44:27](https://www.youtube.com/watch?v=undefined&t=35067s) And that's seven total numbers


because I started counting at zero. So let me just quickly run this. Make address, ./address. There are
those seven digits being printed. But there's something subtle but also useful here. Each of these digits--
4, 6, 8, 2, 7, 5, 0-- is an int. Why? Because I made an array of integers.

- [9:44:48](https://www.youtube.com/watch?v=undefined&t=35088s) But think back-- how big is a


typical integer, have we claimed? Four bytes, or 32 bits, so it's worth noting that I don't really need to
worry about that detail. Notice that I did not do plus 4, plus 8, plus 12, plus 16, plus 20. I, the
programmer, strictly speaking, don't need to worry about how big the data type is.

- [9:45:10](https://www.youtube.com/watch?v=undefined&t=35110s) This is the power of pointer


arithmetic. The compiler is smart enough to know that if you add 1 to this pointer, that is the same as
saying go one more piece of data-- not just one byte-- so if it's an int, move four. If it's a second int, move
eight. If it's a third int, move 12. Pointer arithmetic handles that annoying arithmetic for you so you can
just think of this as a number after a number after a number that are back to back to back but not one
byte apart, but four bytes apart.

- [9:45:39](https://www.youtube.com/watch?v=undefined&t=35139s) Which is only to say plus 1, plus 2,


plus 3 works no matter the data type. Why? Because the compiler knows what type of data you're
talking about. Now, there's one other detail I should reveal here that I've taken for granted. In the past I
was using double quotes to represent strings, and I claim that the compiler's smart enough to realize
that oh, if I have double quote hi, that means it's an array of h-i exclamation point, and then the
backslash zero.

- [9:46:06](https://www.youtube.com/watch?v=undefined&t=35166s) Notice this usefulness. It turns out


that you can actually treat arrays as though the name of the array is itself a pointer, and this is actually
going to be something useful in upcoming problems when we want to pass arrays around in the
computer's memory. Notice that strictly speaking on line five, there's no pointers going on.

- [9:46:26](https://www.youtube.com/watch?v=undefined&t=35186s) There's no star, there's no


ampersand-- there's nothing new there, and yet instantly on line seven I'm pretending that it is the
address, and this is actually OK. It turns out that an array really can be treated as the address of the first
element in that array. The difference is that there's no secret backslash zero anywhere.

- [9:46:47](https://www.youtube.com/watch?v=undefined&t=35207s) This is just part of the phone


number here, the ending in zero-- that's not like a special backslash zero. So this is something we're
going to take advantage of too, before long. There's this interrelationship between addresses and arrays
that just generally allows you to treat one as though it is the other, but the math is taken care of for you.

- [9:47:06](https://www.youtube.com/watch?v=undefined&t=35226s) Are there any questions then on this


before we start to solve some bigger problems? Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN:
Potentially. If you go beyond the end of an array, you might get a segmentation fault. The problem is that
that symptom is sometimes nondeterministic, which means that sometimes it will happen, sometimes it
won't.

- [9:47:31](https://www.youtube.com/watch?v=undefined&t=35251s) It often depends on how far off


the end of the array you actually go. You'll often not induce the segmentation fault if you just poke a
little too far, but if you go way too far it quite likely will. But we'll give you a tool today actually for
detecting and solving exactly that kind of situation.

- [9:47:47](https://www.youtube.com/watch?v=undefined&t=35267s) So let's go ahead now and do


something a little different in code, but that actually comes back to that spoiler from earlier. Let me go
ahead and create a program called compare.c, and in this program I'm going to go ahead and allow
myself the CS50 library, not so much for string but so that I can actually use GetInt still, which is way
easier than the way we'll see that C normally lets you get input.

- [9:48:08](https://www.youtube.com/watch?v=undefined&t=35288s) Let me give myself stdio.h, do an


int main(void), not worrying about command line arguments today, and let me go ahead and get an int i
using get int, and ask the human for the value of i, then let me give myself an int j, ask the user for
another int, calling it j, and then let me go ahead and kind of naively, but to your point earlier, if i equals
equals j, then let's go ahead and print out something like "same," backslash n, else let's go ahead and
print out "different" if they are not, in fact, the same.

- [9:48:40](https://www.youtube.com/watch?v=undefined&t=35320s) So that would seem to be a


program that compares the value of two integers. All right, so let's go ahead and run make compare-- so
far so good-- ./compare. OK, i will be 50, j will be 50-- they're the same. Let's do it once more. i will be
50, j will be 42. They are different. So so far, so good in this first version of comparison.
- [9:49:03](https://www.youtube.com/watch?v=undefined&t=35343s) But as you might see where I'm
going with this, let's move away from integers and let's actually change these things to char-- to strings.
So I could do string s over here-- GetString s over here. Then I could do string t over here, and GetString
over here, asking the user for t this time, here.

- [9:49:25](https://www.youtube.com/watch?v=undefined&t=35365s) And then I can compare the two.


If s equals equals t-- and this is a common convention. If you've used s for string already you can use t for
the next one, at least for simple demonstrations like this. I'm going to compare the two, just like I did for
ints, which worked great. Make compare-- so far so good--

- [9:49:42](https://www.youtube.com/watch?v=undefined&t=35382s) ./address-- oh, sorry. Wrong


program-- ./compare. Let me go ahead and type in something like hi, exclamation point and bye,
exclamation point, which of course should definitely be different. Let me run it again with hi, exclamation
point and hi, exclamation point. Different-- maybe I messed up. Let's maybe do it lowercase, maybe
that'll fix.

- [9:50:06](https://www.youtube.com/watch?v=undefined&t=35406s) But no, those two are different.


So to come back to what I described as a spoiler earlier, what's the fundamental issue here, to be clear?
Why is it saying different even though I'm pretty sure I typed the same thing twice. Yeah. Yeah, this is
where it's now useful to know that string has been an abstraction-- a training wheel, if you will-- and if
we take that away-- still use GetString because that's convenient still-- but if I change string to be char
star, it's a little more explicit as to what s and what t are. s is a pointer to a char,

- [9:50:40](https://www.youtube.com/watch?v=undefined&t=35440s) that is the address of a char. t is a


pointer to a char, that is the address of a char. Specifically, the first character in s and the first character
in t, respectively. So if I'm comparing these two it should stand to reason that they're going to be
different. Why? Because s might end up here in memory and t might end up here in memory.

- [9:50:57](https://www.youtube.com/watch?v=undefined&t=35457s) Each time I call GetString, it is not


smart enough or advanced enough to know that, wait a minute-- you typed the same thing. I'm just
going to hand you back the same address. That doesn't happen because we did not design GetString that
way. Each time I call GetString, it returns, apparently, a different copy of the string that was typed in.

- [9:51:13](https://www.youtube.com/watch?v=undefined&t=35473s) A hi over here and a hi over here.


They might look the same to the human but to the computer they are different chunks of memory, and
therefore at different addresses. And here, too, we can reveal what is GetString returning? Well, up until
today it was returning a string, so to speak. That's not really a thing.

- [9:51:31](https://www.youtube.com/watch?v=undefined&t=35491s) Technically, what GetString has


always been doing is returning the address of the first char in a string and trusting that we put a
backslash zero at the end of whatever the human typed in, and that's enough now for printf, for strlen,
for you to know where a string begins and ends. So GetString has actually always returned a pointer.

- [9:51:53](https://www.youtube.com/watch?v=undefined&t=35513s) It has not returned a quote


unquote string per se, but there are functions that can solve this comparison for us. Recall that I could do
something like this. I could actually go in here and I could-- let's see, where was it? So if I include str
compare here and use it to pass in two values, s and t, let's see now what happens when I make
compare.
- [9:52:18](https://www.youtube.com/watch?v=undefined&t=35538s) Implicitly declaring library
function str compare with type int-- and well, there's a star. So you might have seen this error before and
you might have ignored most of it, but there's some evidence of stars or pointers going on here. It looks
like I didn't include the string.h header file, so that's an easy fix.

- [9:52:34](https://www.youtube.com/watch?v=undefined&t=35554s) Include string.h which, despite its


name, does not create a data type called string, it just has string-related functions in it like str compare.
Let's make compare again. Now it compiles, ./compare. Now let's type in hi, exclamation point and even
the same thing again. These are now-- oh, I used it wrong.

- [9:52:54](https://www.youtube.com/watch?v=undefined&t=35574s) OK, user error. That was supposed


to be impressive, but it's the opposite. What did I do wrong? What did I do wrong here? Yeah. Yeah.
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, it returns three different values. Zero if they're the
same, positive if one comes before the other, negative if the opposite is true.
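
The return-value convention just described can be sketched as follows. The wrapper name `strings_equal` is an assumption made for illustration; `strcmp` itself is the standard function from `string.h`.

```c
#include <stdbool.h>
#include <string.h>

// strcmp returns 0 when the two strings match, a negative value when s
// sorts before t, and a positive value when s sorts after t.
// Testing the result against 0 is therefore the check for equality.
bool strings_equal(const char *s, const char *t)
{
    return strcmp(s, t) == 0;
}
```

Writing `if (strcmp(s, t))` without the `== 0` inverts the logic, since any nonzero result counts as true in C; that appears to be the kind of slip made in the demo above.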

- [9:53:15](https://www.youtube.com/watch?v=undefined&t=35595s) I just forgot that, so like I did last


week correctly, if I want to compare them for equality per the manual page, I should be checking for zero
as the return value. Now make compare, ./compare, Enter. Let's try it one last time-- hi and hi. OK now,
they're in fact the same. And Justin, thank you. And indeed, it's not that it's returning same all the time.

- [9:53:40](https://www.youtube.com/watch?v=undefined&t=35620s) If I type in hi and then bye, it's


indeed noticing that difference as well. Well, let me go ahead and do one other thing here. Let's do one
other thing. Let me go ahead now and just reveal more pictorially what's going on. Let's get rid of the
string comparison and let's just print these things out.

- [9:53:58](https://www.youtube.com/watch?v=undefined&t=35638s) The simple way to print this out


would be with %s and again, %s is special-- printf knows to take an address, start there, and print every
character up until the backslash zero, so let's just hand it s and do that. And then let's do one more, %s, t.
This is, again, sort of a mix of week 1 and this week because I got rid of the word string.

- [9:54:19](https://www.youtube.com/watch?v=undefined&t=35659s) I'm using char star, but I'm still


using printf and %s in the same way. Let me go ahead and run compare now, and if I type hi and hi, I
should see the same thing twice. So they look the same, but here now we have the syntax today to print
out the actual addresses of these things. So let me just change the s to a p, because p means don't go to
the address and print it, it means just print the address as a pointer.

- [9:54:44](https://www.youtube.com/watch?v=undefined&t=35684s) So make compare, ./compare, and


now let's type in hi, and once more, and I should see, indeed, two slightly different addresses given in
hexadecimal. One's got a B at the end, one's got an F at the end, and they are indeed a few bytes apart.
So this is just confirming what our suspicions have actually been.
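
A small sketch of the difference between `%s` and `%p`, using a hypothetical helper named `show_string_and_address`. The exact addresses printed will vary from run to run and from machine to machine.

```c
#include <stdio.h>

// Prints the characters that s points at (%s) followed by the address
// stored in s itself (%p). Returns printf's result: the number of
// characters printed, or a negative value on error.
int show_string_and_address(const char *s)
{
    // %p expects a void *, hence the cast.
    return printf("%s is stored at %p\n", s, (void *) s);
}
```

Calling it on two separate buffers that both contain `hi!` should print the same text twice but two different hexadecimal addresses.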

- [9:55:02](https://www.youtube.com/watch?v=undefined&t=35702s) So what does this mean, perhaps


in the computer's memory? Well, let's take a look. I've zoomed out so I have a few more squares to look
at at once. Here might be s in memory when I do string s equals, or char star s equals. I get a variable
that's of size 1, 2, 3, 4, 5, 6, 7, 8, because I claimed earlier that on modern systems, pointers are generally
eight bytes nowadays so they can count even higher.
- [9:55:26](https://www.youtube.com/watch?v=undefined&t=35726s) And inside of the computer's
memory, also, might be hi. And I don't know where it ends up so for the sake of discussion it ended up
down here. That's what was free when I ran the program. h-i exclamation point, backslash zero. Maybe it
ended up, for the sake of discussion, at 0x123, 4, 5, and 6.

- [9:55:42](https://www.youtube.com/watch?v=undefined&t=35742s) So to be clear, what is s storing


once the assignment operator copies from right to left? What is s storing if I advance one more slide?
Yeah. 0x123, the presumption being that if a string is defined by the address of its first char and that
address of its first char is 0x123, then that's indeed what should be in the variable s.

- [9:56:09](https://www.youtube.com/watch?v=undefined&t=35769s) And so technically, that's what's


been happening with that assignment operator from right to left. GetString indeed returns a string, so to
speak, but more properly it returns the address of a char. What's been then copied from right to left
using that assignment operator all these weeks is indeed that address.

- [9:56:27](https://www.youtube.com/watch?v=undefined&t=35787s) Now technically, we don't really


need to care about where these addresses are. It suffices to just think about them referentially, but let's
first consider where t might be. t is just another variable that I created on my second line of code. Maybe
it ends up there, maybe somewhere else. For the sake of discussion I'll draw it left and right.

- [9:56:44](https://www.youtube.com/watch?v=undefined&t=35804s) Where did the second word end


up that I typed in? Well, suppose the second copy of hi ended up at 0x456, 457, 458, 459. What ended up in
t? I'll pluck this one off myself. 0x456, presumably. And so this is now a pictorial representation of why,
and let's abstract away everything else. When I compared s against t using equal equals, based on the
picture they're obviously not the same.

- [9:57:10](https://www.youtube.com/watch?v=undefined&t=35830s) One is over here, one is over here.


And per a moment ago, one is 0x123, the other is 0x456. Yes, technically they're pointing at something
that's the same, but that just reveals how str compare works. str compare is apparently a function that
takes in the address of a string as its argument and the address of another string as its argument, it goes
to the first character in each of those strings, respectively, and probably has a for loop or a while loop
and just goes from left to right, comparing, looking

- [9:57:42](https://www.youtube.com/watch?v=undefined&t=35862s) for the same chars left and right,


and if it doesn't notice any differences, boom-- it returns zero. If it does notice a difference it returns a
positive or a negative value. And that's very similar, recall, to how we implemented string length
ourselves last week. I used a for loop, I was looking for a backslash zero.
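
The lecture only guesses at how str compare works, so the following is a sketch of one plausible implementation, not the library's actual source. The name `compare_strings` is hypothetical.

```c
#include <stddef.h>

// Walks both strings from left to right, one char at a time.
// Stops at the first position where the chars differ or where s ends.
// Returns 0 if the strings match, a negative value if s sorts first,
// and a positive value if t sorts first.
int compare_strings(const char *s, const char *t)
{
    size_t i = 0;
    while (s[i] != '\0' && s[i] == t[i])
    {
        i++;
    }
    // Compare as unsigned char so the sign of the result is well defined.
    return (unsigned char) s[i] - (unsigned char) t[i];
}
```

If `t` is shorter than `s`, the loop stops when it meets the backslash zero in `t`, because that character no longer equals the corresponding character in `s`.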

- [9:57:59](https://www.youtube.com/watch?v=undefined&t=35879s) str compare is probably a little


similar in spirit, looping from left to right but comparing, this time not just counting. Are any questions
then, on string comparison and why it is that we use str compare and not equals equals? Yeah.
AUDIENCE: Do pointers have addresses? DAVID J.

- [9:58:19](https://www.youtube.com/watch?v=undefined&t=35899s) MALAN: Do pointers have


addresses? Yes. So we won't do that today, but I could actually use the ampersand operator on s or on t.
That would give me the equivalent of a char star star that itself could be stored elsewhere in memory.
That's where it ends. We don't do that recursively forever. There's star and there's star star, but yes, that
is a thing and it's very often useful in the context of two dimensional arrays, which we haven't really
talked about, but that is a feature of the language, too.
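
For the curious, a tiny sketch of what a char star star looks like in practice. The function name `point_elsewhere` is hypothetical and exists only for this illustration.

```c
// p holds the address of a char * variable, so *p is that variable itself.
// Assigning through *p changes where the caller's pointer points.
void point_elsewhere(char **p, char *target)
{
    *p = target;
}
```

With `char *s` pointing at one buffer, `point_elsewhere(&s, other)` leaves `s` pointing at `other`, because `&s` is the address of the pointer variable rather than the address of any character.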

- [9:58:45](https://www.youtube.com/watch?v=undefined&t=35925s) But not today. Good question. All


right, so what might we now do to take things up a notch? Well let's go ahead and implement a different
program here that maybe tries copying some values, just to demonstrate this. Let me open up a file
called, how about copy.c, and I'm going to start off with a few includes.

- [9:59:03](https://www.youtube.com/watch?v=undefined&t=35943s) So let's include the CS50 library


just so we have a way of getting user input. Let's include-- how about stdio as always, let's preemptively
include string.h and maybe one other in a moment. Let's do int main(void) as before. And then in here,
let's get a string from the user and just call it s for simplicity.

- [9:59:23](https://www.youtube.com/watch?v=undefined&t=35963s) And heck, we can actually just call


this char star if we want, or string, since we're using the CS50 library. But we'll come back to that. Let's
now make a copy of s and do s equals t, using a single assignment operator and then let's check
something like this. Let's go into the first character of t, which is t bracket zero, and then let's uppercase
it using that function that we've used in the past of toupper t bracket zero, semicolon.

- [9:59:51](https://www.youtube.com/watch?v=undefined&t=35991s) And actually, I should go back up


here. If I'm using toupper or if you use tolower or isupper or islower-- I might not remember this
offhand, but it was in another header file called ctype.h. There was a bunch of helpful functions in
that library as well. Now at the very last line of the program let's just print out what both s and t are by
simply printing out %s for each of them, and t is %s also, not %t, of course, and let's see what happens
here.

- [0:00:20](https://www.youtube.com/watch?v=undefined&t=36020s) So let me make copy-- oh my God,


so many mistakes. What did I do wrong? Oh. OK, that was unintended. String t equals s, sorry, so I'm
creating two variables, s and t respectively, and I'm copying s into t. Make copy, Enter. There we go.
./copy, and let's now type in, for instance, how about hi exclamation point in all lowercase this time, and
now what gets printed? I don't think that's what I intended, so to speak, here.

- [0:00:52](https://www.youtube.com/watch?v=undefined&t=36052s) Because notice that I got s from


the user, so that checks out. I then copied s into t, which looks correct. That's what we always use
assignment for. Then I uppercase the first letter in t, but not s-- at least in my code-- then I printed s and t
and then noticed, apparently, both s and t got capitalized.

- [0:01:13](https://www.youtube.com/watch?v=undefined&t=36073s) So if you're starting to get a little


comfortable with what's going on underneath the hood, what's the fundamental problem here? Why did
both get capitalized? Why did both get capitalized? Yeah, over here. AUDIENCE: Could it be they're
referencing the same address? DAVID J. MALAN: Yeah, they're referencing the same address.

- [0:01:29](https://www.youtube.com/watch?v=undefined&t=36089s) So C is really literal. If you create


another variable called t and you assign it the value of s, you are literally assigning it the value in s, which
is 0x123 or something like that. And so at that point in the story both s and t presumably have a value of
0x123, which means they technically point to the same h-i exclamation point in memory.
- [0:01:51](https://www.youtube.com/watch?v=undefined&t=36111s) Nowhere did I tell the computer
to give me a copy of a h-i exclamation point per se, I literally said just copy s. So here's where an
understanding of what s literally is explains the situation. I'm only copying the pointers. So what actually
went on in memory? Let's take a look here at this grid.

- [0:02:10](https://www.youtube.com/watch?v=undefined&t=36130s) If I created s initially, maybe it


ends up here. And I created hi in lowercase, and it ended up down here. Then the address was, again,
like 0x123, 4, 5, 6, so 0x123 is what's in s. If then I create a second variable called t, and I call it a string, a.k.a.
char star, maybe it again ends up here. But when I copy s into t by doing t equals s semicolon, that
literally just copies s into t, which puts the value 0x123 there.

- [0:02:40](https://www.youtube.com/watch?v=undefined&t=36160s) So if we now abstract away all


these numbers and just think about a picture with arrows, what we've drawn in the computer's memory
is this. Two different pointers but storing the same address, which means the breadcrumbs lead to the
same place. And so if you follow the t breadcrumb and capitalize the first letter, it is functionally the
same as copying the-- changing the first letter in the version s as well.
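
The aliasing effect described here can be reproduced in a few lines. The helper `capitalize_first` is a hypothetical name; the emptiness check is an added precaution so the sketch is safe on an empty string.

```c
#include <ctype.h>

// Uppercases the first character of whatever memory t points at.
// Any other pointer that stores the same address sees the change too,
// because there is only one chunk of memory involved.
void capitalize_first(char *t)
{
    if (t[0] != '\0')
    {
        t[0] = (char) toupper((unsigned char) t[0]);
    }
}
```

For example, after `char buffer[] = "hi!"; char *s = buffer; char *t = s; capitalize_first(t);`, printing `s` shows `Hi!` as well, since `s` and `t` store the same address.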

- [0:03:08](https://www.youtube.com/watch?v=undefined&t=36188s) So what's the solution, then, to


this kind of problem? Even if you have no idea how to do it in code, what's the gist of what I really
intended, which is, I want a genuine copy of s, called t. I want a new h-i exclamation point backslash zero.
What do I need to do to make that happen? Thoughts? AUDIENCE: I think there's a function called str
copy.

- [0:03:31](https://www.youtube.com/watch?v=undefined&t=36211s) DAVID J. MALAN: So there is a


function called str copy, strcpy, which is a possible answer to this question. The catch with str copy is
that you have to tell it in advance not only what the source string is-- the one you want to copy-- you also
need to pass in the address of a chunk of memory into which you can copy the string, and here's one
thing we haven't seen yet, and we need one more building block today, if you will.

- [0:03:53](https://www.youtube.com/watch?v=undefined&t=36233s) We haven't yet seen a way to


create new chunks of memory and then let some other function copy into them. And for this, we're
going to introduce something called dynamic memory allocation. And this is the last and most powerful
feature perhaps, today, whereby we're going to introduce two functions, malloc and free, where malloc
means memory allocate, which literally does just that.

- [0:04:15](https://www.youtube.com/watch?v=undefined&t=36255s) It's a function that takes a


number as input-- how many bytes of memory do you want the operating system to find for you
somewhere in that big grid? It's going to find it and it's going to return to you the address of the first
byte of contiguous memory back to back to back, and then you can do anything you want with that
chunk of memory.

- [0:04:30](https://www.youtube.com/watch?v=undefined&t=36270s) free is going to do the opposite.


When you're done using a chunk of memory that malloc has given you, you can say free it, and that
means you hand it back to the operating system and then the operating system can use it for something
else later. So this is actually evidence of a common problem in programming.

- [0:04:44](https://www.youtube.com/watch?v=undefined&t=36284s) If your Mac or your PC has ever been


in the habit of starting to get really, really slow, or it's slowing to a crawl-- heck, maybe it even freezes--
one of the possible explanations could be that the program you're running by Apple or Microsoft or
whoever, maybe they're using malloc or some equivalent, asking the operating system-- Mac OS or
Windows-- for, give me more memory.

- [0:05:06](https://www.youtube.com/watch?v=undefined&t=36306s) I need more memory. The user is


creating more images. The user is typing a longer essay. Give me more memory, more memory. If the
program has a bug and never actually frees any of that memory, your computer might end up using all of
the available memory and honestly, humans are not very good at handling corner cases like that.

- [0:05:22](https://www.youtube.com/watch?v=undefined&t=36322s) Very often programs, computers


just freeze at that point or get really, really slow because they start trying to be creative when there's not
enough memory left. So one of the reasons for a computer really slowing down might be calling for
malloc a lot, or some equivalent, but never freeing it. Which is to say, you should always use these two
functions in concert and free memory once you are done with it.
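
As a rough sketch of malloc and free used in concert, the hypothetical function `repeat_char` below asks for a chunk of memory, fills it, and hands it to the caller, who is then responsible for freeing it. The check for a failed allocation is an added precaution.

```c
#include <stdlib.h>

// Asks malloc for count + 1 bytes, fills the first count bytes with c,
// and ends the chunk with '\0' so it can be treated as a string.
// Returns NULL if malloc cannot find the memory.
// The caller must pass the result to free when done with it.
char *repeat_char(char c, size_t count)
{
    char *buffer = malloc(count + 1);
    if (buffer == NULL)
    {
        return NULL;
    }
    for (size_t i = 0; i < count; i++)
    {
        buffer[i] = c;
    }
    buffer[count] = '\0';
    return buffer;
}
```

A typical use would be `char *r = repeat_char('a', 3);`, followed eventually by `free(r);`. Omitting that second call is the kind of leak described above.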

- [0:05:44](https://www.youtube.com/watch?v=undefined&t=36344s) So let me go ahead and do this in


code and solve this problem properly. Let me go ahead and do this. Before I copy s into t using
something like str copy, I first need to get a bunch of memory from the computer. So to do that, let's
make this super clear that we're dealing with pointers, so I'm going to change my strings to char stars for
both s and t, and what I technically am going to store in t is the address of an available chunk of memory.

- [0:06:10](https://www.youtube.com/watch?v=undefined&t=36370s) To do that, I can ask the computer


to allocate memory for me, and how many bytes. If I want to create a copy of h-i exclamation point, I
need how many bytes? Good! Four! Because I need the h, the i, the exclamation point, and additional
space for the backslash zero. It's up to me to understand that and ask for it.

- [0:06:31](https://www.youtube.com/watch?v=undefined&t=36391s) It's not going to happen magically.


Nothing does in C. So I could just naively type four there, and that would be correct if I type in h-i
exclamation point or any other three letter word or phrase, but to do this dynamically I should probably
do something like strlen of s plus 1 for the additional null character.

- [0:06:50](https://www.youtube.com/watch?v=undefined&t=36410s) Recall that string length does it in


the English sense-- it returns the length of the string you see, plus 1 also takes into account the fact that
I'm going to need that backslash zero. Now let me do this old school style first. Let me go ahead and
manually copy the string s into t first. So for int i equals 0, i is less than the string length of s, i plus plus.

- [0:07:14](https://www.youtube.com/watch?v=undefined&t=36434s) Then inside my for loop, I'm going


to do t bracket i equals s bracket i, but actually I want the null character too, so I want to do the length of
the string plus 1 more, and heck, I think I learned an optimization last time. If I'm doing this again and
again, I could really do n equals strlen of s plus 1 and then do i is less than n, just as a nice design
optimization.
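
The manual copy just described might look like the sketch below. The function name `copy_with_loop` is hypothetical, the loop counter uses `size_t` rather than `int` to match what `strlen` returns, and the NULL check is an added precaution.

```c
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated copy of s, or NULL if malloc fails.
// The caller must free the result.
char *copy_with_loop(const char *s)
{
    // strlen counts only the visible characters, so add 1 for the '\0'.
    size_t n = strlen(s) + 1;
    char *t = malloc(n);
    if (t == NULL)
    {
        return NULL;
    }
    // Looping up to n (not strlen(s)) copies the trailing '\0' as well.
    for (size_t i = 0; i < n; i++)
    {
        t[i] = s[i];
    }
    return t;
}
```

Computing `n` once before the loop avoids calling `strlen` on every iteration, which is the design optimization mentioned above.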

- [0:07:39](https://www.youtube.com/watch?v=undefined&t=36459s) I think this for loop will actually


handle the process, then, of copying every character from s into every available byte of memory in t. Or I
could get rid of all of that and take your suggestion, which is to use str copy, which takes as its first
argument the destination and its second argument the source.
- [0:07:59](https://www.youtube.com/watch?v=undefined&t=36479s) So copy from right to left in this
case, too, that's going to do all of that automatically for me as well. Now I think I'm good. I can now
capitalize safely. The first character in t, which is now a different chunk of memory than s, and then I can
print them both out to see that one has not changed but the other has.
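
The same copy using str copy could be sketched like this. The wrapper `copy_with_strcpy` is a hypothetical name; `strcpy(destination, source)` is the standard function from `string.h`.

```c
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated copy of s, or NULL if malloc fails.
// The caller must free the result.
char *copy_with_strcpy(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t == NULL)
    {
        return NULL;
    }
    // Destination first, source second: the copy goes right to left.
    // strcpy copies every character, including the trailing '\0'.
    strcpy(t, s);
    return t;
}
```

Because `t` now points at its own chunk of memory, changing `t[0]` afterwards leaves the original string untouched.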

- [0:08:20](https://www.youtube.com/watch?v=undefined&t=36500s) So make copy-- all right, what did


I do wrong? Implicitly declaring library function malloc dot, dot, dot. So we've seen this kind of error
before. What is-- even if you don't know quite how to solve it, what's the essence of the solution? What
do I need to do to fix this kind of problem involving implicitly declaring a library function? What did I
forget? Yeah.

- [0:08:42](https://www.youtube.com/watch?v=undefined&t=36522s) I need to include the library. And I


could look this up in the manual, or I know it off the top of my head, I just forgot it. There's another
library we'll occasionally need now called standard lib-- standard library-- that contains malloc and free
prototypes and some other stuff, too. All right, let me just clear this away and do make copy one more
time.

- [0:09:00](https://www.youtube.com/watch?v=undefined&t=36540s) Now I'm good. ./copy, Enter, All


right. s, I'm going to type in hi, lowercase. t and s now come back as intended. s is untouched, it would
seem, but t is now capitalized. Are any questions, then, on what we just did in code? Yeah. AUDIENCE:
You said that malloc and free go together. [INAUDIBLE] DAVID J. MALAN: Indeed.

- [0:09:28](https://www.youtube.com/watch?v=undefined&t=36568s) There's a few improvements I


want to make, so let me actually do those right now. Technically, I should practice what I preached and I
should indeed, when I'm done with t, free t. Fortunately, I don't have to worry about how big t was-- the
computer remembers how many bytes it gave me and it will go free all of them, not just the first.

- [0:09:45](https://www.youtube.com/watch?v=undefined&t=36585s) I should do free t. I don't need to


do free s, and I shouldn't, because that is handled automatically by the CS50 library. s, recall, came from
GetString, and we actually have some fancy code in place that makes sure that at the end of your
program's execution we free any memory that we allocated so we don't actually waste memory like I
described earlier.

- [0:10:04](https://www.youtube.com/watch?v=undefined&t=36604s) But there's actually a couple of


other things if I really want to be pedantic I should put in here. It turns out that sometimes malloc can
fail, and sometimes malloc doesn't have enough memory available because maybe your computer's
doing so much stuff there's just no more RAM available. So technically, I should do something like this-- if
t equals equals null, with two L's today, then I should just return 1 or something to say that there was a
problem.

- [0:10:28](https://www.youtube.com/watch?v=undefined&t=36628s) I should probably print an error


message too, but for now I'm going to keep it simple. I should also probably check this. This is a little
risky of me. If I'm doing t bracket zero, this is assuming that there is a letter there. But what if the human
just hit Enter at the prompt and didn't even type h, let alone h-i exclamation point? What if there is no t
bracket zero? So technically, what I should probably do here is, if the length of t is at least greater than
zero, then go ahead and safely capitalize
- [0:11:00](https://www.youtube.com/watch?v=undefined&t=36660s) the first letter of it. And then at
the very end if all goes well, I can return zero, thereby signifying that indeed, this thing was successful.
So yes, these two functions, malloc and free, should be used in concert. And so if you call malloc you should
call free eventually. But you did not call malloc for s, so you should not call free for s.
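
Putting those improvements together, one possible shape for the safer version is sketched below as a standalone function rather than inside main. The name `capitalized_copy` is hypothetical, and returning NULL stands in for the lecture's `return 1` on failure.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated copy of s with its first character
// uppercased, or NULL if malloc fails.
// The caller called malloc indirectly, so the caller must call free.
char *capitalized_copy(const char *s)
{
    char *t = malloc(strlen(s) + 1);
    if (t == NULL)
    {
        return NULL; // malloc could not find enough memory
    }
    strcpy(t, s);
    // Only touch t[0] if the string actually has a first character.
    if (strlen(t) > 0)
    {
        t[0] = (char) toupper((unsigned char) t[0]);
    }
    return t;
}
```

A caller would check the result against NULL, use it, and then call `free` on it exactly once. The original string passed in is never freed here, since this code did not allocate it.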

- [0:11:23](https://www.youtube.com/watch?v=undefined&t=36683s) Yeah, other question. AUDIENCE:


Here's a question. Why do we do malloc plus 1? DAVID J. MALAN: Why did I do malloc plus 1? So
malloc-- sorry, malloc of string length of s plus 1-- the string length is the literal length of the string as a
human would perceive it in English. So h-i exclamation point-- strlen gives me 3, but I know now as of
last week and this week what a string technically is and a string always has an extra byte.

- [0:11:45](https://www.youtube.com/watch?v=undefined&t=36705s) The onus is on me to understand


and apply that lesson learned so that I actually give str copy enough room for that trailing null character.
And here's just an annoying thing: we called the backslash zero N-U-L last week, and it turns out that
N-U-L-L is the same idea. It's also zero, but it's zero in the context of pointers.
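
To make the distinction concrete, here is a small sketch with two hypothetical helpers. Both compare against zero, but one works on a char and the other on a pointer.

```c
#include <stdbool.h>
#include <stddef.h>

// '\0' (NUL, one L) is a char whose value is 0.
// It marks the end of a string.
bool is_terminator(char c)
{
    return c == '\0';
}

// NULL (two L's) is a pointer value meaning "no valid address".
bool is_null_pointer(const void *p)
{
    return p == NULL;
}
```

For `char s[] = "hi!";`, `is_terminator(s[3])` is true while `is_null_pointer(s)` is false: the string ends with NUL, but the pointer to it is perfectly valid.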

- [0:12:07](https://www.youtube.com/watch?v=undefined&t=36727s) So long story short, you never


really write N-U-L, I've just said it and we saw it on the screen. You will start writing N-U-L-L when you
want to check whether or not a pointer is valid or not. And what I mean by that is this. If malloc fails and
there's just not enough memory left inside of the computer for you, it's got to return a special value, and
that special value is N-U-L-L in all capital letters.

- [0:12:31](https://www.youtube.com/watch?v=undefined&t=36751s) That signifies something went


wrong. Do not trust that I'm giving you a useful return value. Other questions on these copies thus far?
Yeah, over there. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Good question. Will str copy not work
without malloc? You kind of need both in this case because str copy, by definition-- if I pull up its manual
page-- needs a destination to put the copied characters.

- [0:12:59](https://www.youtube.com/watch?v=undefined&t=36779s) It's not sufficient just to say char


star t semicolon. That only gives you a pointer. But I need another chunk of memory that's just as big as
h-i exclamation point backslash zero, so malloc gives me a whole bunch of memory and then str copy fills
it with h-i exclamation point backslash zero. So again, that's why we're going down to this lower level,
because once you understand what needs to be done you now have the functions to do it.

- [0:13:23](https://www.youtube.com/watch?v=undefined&t=36803s) So let's actually consider what we


just solved. So in this next version of the program where I actually introduced malloc, t was initialized to
the return value of malloc, and maybe the memory that I got back was here-- 0x456, 457, 458, 459. I've left
it blank initially because nothing is put there automatically by malloc.

- [0:13:42](https://www.youtube.com/watch?v=undefined&t=36822s) I just get a chunk of memory that


is now mine to use as I see fit. I then assign t to that return value, which points t at the first address.
Notice there's no backslash zero. This is not yet a string; it's just a chunk of memory-- four bytes-- an
array of four bytes. What str copy eventually did for me was it copied the h over, the i over, the
exclamation point over, and the backslash zero.

- [0:14:06](https://www.youtube.com/watch?v=undefined&t=36846s) And if I didn't want to use str copy


or I forgot that it existed, my for loop would have done exactly the same thing. Are any questions, then,
on these examples here? Any questions? Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Good question.
After malloc, if I had then still done just t equals s, it actually would have recreated the same original
problem by just copying 0x123 from s into t.

- [0:14:41](https://www.youtube.com/watch?v=undefined&t=36881s) So then I would have been left


with a picture that looked like this a few steps ago, I would have-- and I can't quite do it live-- this arrow,
if I did what you just described, would now be pointing over here and so I wouldn't have fundamentally
solved the problem, I would have just additionally wasted four bytes temporarily that I'm not actually
using.

- [0:15:00](https://www.youtube.com/watch?v=undefined&t=36900s) Yeah. AUDIENCE: [INAUDIBLE]


DAVID J. MALAN: You can-- do you always use malloc and str copy together? Not necessarily. These are
both solving two different problems. malloc's giving me enough memory to make a copy, str copy is
doing the copy. However, you could actually use an array, if you wanted, of characters, and you could use
str copy on that, and there's other use cases for str copy.
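
As one illustration of using str copy with an array instead of malloc, consider the sketch below. The helper `copy_if_fits` and its size check are assumptions added here; the lecture does not show this code.

```c
#include <stdbool.h>
#include <string.h>

// Copies s into dest only if dest has room for every character of s
// plus the trailing '\0'. Returns false, leaving dest untouched,
// when the destination is too small.
bool copy_if_fits(char *dest, size_t dest_size, const char *s)
{
    if (strlen(s) + 1 > dest_size)
    {
        return false;
    }
    strcpy(dest, s);
    return true;
}
```

With `char t[4];`, the call `copy_if_fits(t, sizeof(t), "hi!")` succeeds because four bytes hold h, i, the exclamation point, and the backslash zero. A three-byte array would be rejected rather than overrun.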

- [0:15:22](https://www.youtube.com/watch?v=undefined&t=36922s) But thus far, it's a reasonable


mental model to have that if you want to copy strings, you use malloc and then str copy, or your own
homegrown loop. Yeah. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Say that once more. AUDIENCE:
[INAUDIBLE] DAVID J. MALAN: No. It will-- good question. If I had a-- str copy, per its documentation, will
copy the whole string plus the null character at the end.

- [0:16:01](https://www.youtube.com/watch?v=undefined&t=36961s) It just assumes there will be one


there. It's therefore up to you to pass str copy a long enough chunk of memory to have room for that. If I
only ask malloc for three bytes, that could have potentially created a memory problem whereby str copy
would just still blindly copy one, two, three, four bytes, but technically it should have only touched three
of those.

- [0:16:20](https://www.youtube.com/watch?v=undefined&t=36980s) You do not yet have access to the


fourth one, or the rights to it, because you never asked malloc for it. Yeah. AUDIENCE: So the number
inside malloc would be the number of bytes. DAVID J. MALAN: Correct. The number inside malloc-- it's
one argument. It's the number of bytes you want back. AUDIENCE: Does that mean you have to
remember [INAUDIBLE]? DAVID J.

- [0:16:43](https://www.youtube.com/watch?v=undefined&t=37003s) MALAN: Yes, the onus is on you,


the programmer, to remember or frankly, use a function to figure out how many bytes you actually need.
That's why I did not ultimately type in four manually, I used str length plus 1. So the plus 1 is necessary if
you understand how strings are represented, but using strlen means that I can actually play around with
any types of inputs and it will dynamically figure out the length.

- [0:17:03](https://www.youtube.com/watch?v=undefined&t=37023s) So suffice it to say, there's so


many ways already where you can start to break programs. Let's give you at least one tool for finding
mistakes that you might make. And indeed, in upcoming problem sets you will use this to find bugs in
your own code. Not just using printf, not just using the built-in debugger, but another tool here as well.

- [0:17:20](https://www.youtube.com/watch?v=undefined&t=37040s) So let me go ahead and


deliberately write a program called memory.c that has some memory-related errors. Let me include
stdio.h at the top and let me include stdlib.h at the top so I have access to malloc now. Let me do int
main(void) and then inside of main, let me do this-- I want to allocate maybe how about three-- space
for three integers.

- [0:17:41](https://www.youtube.com/watch?v=undefined&t=37061s) Why? Just for the sake of


discussion. So I'm going to go ahead and do malloc of three, but I don't want three bytes. I want three
integers and an integer is four bytes, so technically I could do this-- 3 times 4, or I could do 12 but again,
that's making certain assumptions and if I run this program on a slightly different computer, int might be
a different size.

- [0:18:01](https://www.youtube.com/watch?v=undefined&t=37081s) So the better way to do this would


be 3 times whatever the size is of an int. And this is just an operator you can use any time if you just
want to find out on this computer, how big is an int? How big is a float, or something else? So that's
going to give me that many-- that much memory for three ints.

- [0:18:18](https://www.youtube.com/watch?v=undefined&t=37098s) What do I want to assign this to?


Well, malloc returns an address. Pointers are addresses, so I'm going to create a pointer to an int called x
and assign it the value. So what am I doing here? This is a little less obvious, but again go back to basics.
The right hand side here gives me a chunk of memory for three integers.

- [0:18:38](https://www.youtube.com/watch?v=undefined&t=37118s) malloc returns the address of the


first byte of that chunk. How do I store the address of anything? I need a pointer. The syntax for today is
type of data, star, where the type of data in question is three ints, so I do int star x. Again, it's kind of
purposeless, only for sort of instructional purposes here, but this is equivalent now to having a chunk of
memory of size 12 in total, presumably, so I can technically now do this.

- [0:19:07](https://www.youtube.com/watch?v=undefined&t=37147s) I can go into maybe the first


location and assign it the number 72 like the other day. Second location, the number 73, and the third
location, maybe the number 33. Now I've deliberately made two mistakes here because I'm trying to trip
over my newfound understanding, or my greenness with understanding pointers.

- [0:19:29](https://www.youtube.com/watch?v=undefined&t=37169s) One, I didn't remember that I


should be treating chunks of memory as zero indexed. malloc essentially returns an array, if you want to
think of it as that. An array of three ints, or more technically, the address of a chunk of memory that
could fit three ints. So I can use my square bracket notation, or I could be really cool and use pointer
arithmetic, but this is a little more user friendly.
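
As a rough sketch of the zero-indexed version, the function below allocates room for three ints with sizeof and fills them using both notations mentioned here. The function name `make_three_ints` is an assumption for illustration only.

```c
#include <stdlib.h>

// Hypothetical helper (not from the lecture): allocates space for three
// ints and stores 72, 73, and 33 in them. Returns NULL if malloc fails.
// The caller is responsible for calling free on the result.
int *make_three_ints(void)
{
    // Ask for 3 * sizeof(int) bytes rather than hard-coding 12.
    int *x = malloc(3 * sizeof(int));
    if (x == NULL)
    {
        return NULL;
    }

    x[0] = 72;      // first location: index 0, not 1
    *(x + 1) = 73;  // pointer arithmetic, equivalent to x[1]
    x[2] = 33;      // last valid location; x[3] would be out of bounds

    return x;
}
```

The square-bracket form and the pointer-arithmetic form reach the same memory; the brackets are simply easier to read.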

- [0:19:49](https://www.youtube.com/watch?v=undefined&t=37189s) But I have made two mistakes. I


did not start indexing at zero, so line seven should have been x bracket zero. Line eight should have been
x bracket 1, and then line nine should have been x bracket 2. So first mistake. The second mistake that
I've made as a side effect, is I'm also touching memory that I shouldn't.

- [0:20:08](https://www.youtube.com/watch?v=undefined&t=37208s) x bracket 3 would mean go to the


fourth int in the chunk of memory that came back. I only asked for enough memory for three ints, not
four, so this is what's called a buffer overflow. I am accidentally, but deliberately at the moment, going
beyond the boundaries of this array, this chunk of memory. So bad things happen, but not necessarily by
just running your program.
- [0:20:30](https://www.youtube.com/watch?v=undefined&t=37230s) Let me go ahead and just try this.
Make memory, and you'll see here that it compiles OK. ./memory, and it actually does not segmentation
fault, which comes back to that point of nondeterminism. Sometimes it does, sometimes it doesn't-- it
depends on how bad of a mistake you made. But there's a program that can spot these kinds of
mistakes, and I'm going to go ahead and expand my terminal window for a moment and I'm going to run
not just ./memory, but a program called Valgrind: valgrind ./memory.

- [0:20:57](https://www.youtube.com/watch?v=undefined&t=37257s) This is a command that comes


with a lot of computer systems that's designed to find memory-related bugs in code. So it's a new tool in
your toolkit today, and you'll use it with the coming problem sets. I'm going to run this now. Its output,
honestly, is hideous. But there's a few things that will start to jump out, and we'll help you with tools and
the problem sets to see these kinds of things.

- [0:21:17](https://www.youtube.com/watch?v=undefined&t=37277s) Here's the first mistake. Invalid


write of size four. That's on memory.c line nine, per my highlights. So let me go look at line nine. In what
sense is this an invalid write of size four? Well, I'm touching memory that I shouldn't, and I'm touching it
as though it's an int. And an int is four bytes-- size four.

- [0:21:38](https://www.youtube.com/watch?v=undefined&t=37298s) So again, this takes some practice


to get used to, the nomenclature here, but this is now a clue for me, the programmer, that not only did I
screw up, but I screwed up related to memory and so this is just a hint, if you will. It's not going to
necessarily tell you exactly how to fix it, you have to wrestle with the semantics, but invalid write of size
four-- oh, OK.

- [0:21:58](https://www.youtube.com/watch?v=undefined&t=37318s) So I should not have indexed past


the boundary here. All right, so I shouldn't have done that. So let me go ahead then and change this to
zero, one, and two, perhaps, here. All right, so let me go ahead and recompile my code. Make memory,

- [0:22:20](https://www.youtube.com/watch?v=undefined&t=37340s) ./memory, still doesn't seem to be


broken but it is technically buggy. Let me go ahead and run Valgrind again, so Valgrind of ./memory,
Enter. And now there's fewer scary-- less scary output now, but there's still something in there. Notice
this-- 12 bytes in one blocks-- no regard for grammar there-- are definitely lost in lost record one of one.
Super cryptic, but this is hinting at a so-called memory leak.

- [0:22:43](https://www.youtube.com/watch?v=undefined&t=37363s) The blocks of memory are lost in


the sense that I malloc'd them-- I asked for them but I never-- take a guess-- freed them. I have a
memory leak. And this is the arcane way of saying, you've screwed up. You have a memory leak. So this is
an easy fix, fortunately. Once I'm done with this memory I just need to free it at the end.

- [0:23:02](https://www.youtube.com/watch?v=undefined&t=37382s) So now let me go ahead and


rerun make memory. It still runs fine, so all the while I might have thought, incorrectly, my code is
correct. But let me run Valgrind one more time. Valgrind of ./memory, Enter. Now, this is pretty good. All
heap blocks were freed, whatever that means. No leaks are possible. And even though it's still a little
cryptic, there's no other error here and in fact, it's pretty explicit-- error summary, zero errors from zero
contexts, dot, dot, dot.
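
Putting both fixes together, a sketch of the corrected sequence might look like the following. It is wrapped in a hypothetical function, `memory_demo`, rather than main, and the NULL check and returned sum are additions for illustration, not part of what was typed in the lecture.

```c
#include <stdlib.h>

// Hypothetical wrapper (not from the lecture) around the corrected steps.
// Returns the sum of the three stored values, or -1 if malloc fails.
int memory_demo(void)
{
    // Space for three ints, sized with sizeof instead of a hard-coded 12.
    int *x = malloc(3 * sizeof(int));
    if (x == NULL)
    {
        return -1;
    }

    // Fix 1: index from 0 to 2, so no write lands past the end of the chunk.
    x[0] = 72;
    x[1] = 73;
    x[2] = 33;

    int sum = x[0] + x[1] + x[2];

    // Fix 2: hand the memory back, so the 12 bytes are no longer lost.
    free(x);

    return sum;
}
```

With the indices corrected and the call to free in place, neither the invalid write nor the leak described above has anything left to report.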

- [0:23:27](https://www.youtube.com/watch?v=undefined&t=37407s) So even though this is one of the


most arcane tools we'll use, it's also one of the most powerful because it can see things that you, the
human, might not, and maybe even that the debugger might not. It does a much closer reading of your
code while it's running to figure out exactly what is going on.

- [0:23:44](https://www.youtube.com/watch?v=undefined&t=37424s) Any questions, then, on this tool?


And we'll guide you after today with actually using this, too. Just helps you find memory-related mistakes
that you might now be capable of making. All right, let's do one other memory-related thing. Let me
shrink my terminal window here. Let me create one other file here called garbage.c.

- [0:24:03](https://www.youtube.com/watch?v=undefined&t=37443s) It turns out there's a term of art


called garbage values in programming that we can reveal as follows. Let me include stdio.h, and let me
include-- how about stdlib.h, and then let me give myself int main(void), and then in this relatively short
program let me give myself three ints using last week's notation, just int scores bracket 3 for 3 quiz
scores, or whatever.

- [0:24:25](https://www.youtube.com/watch?v=undefined&t=37465s) Then let me go ahead and do for


int i equals zero, i less than 3, i plus plus, then let me go ahead and print out, %i backslash n, scores
bracket i semicolon. That's it. This code, pretty sure is going to compile and it's going to run, but what is
my logical bug? I've forgotten a step even though the code that's written is not so wrong.

- [0:24:51](https://www.youtube.com/watch?v=undefined&t=37491s) Yeah? Yeah, I didn't provide the


scores, so I didn't actually initialize the array called scores to have any scores whatsoever. What's curious
about this, though, is that the computer technically doesn't mind. Let me go ahead and playfully make
garbage, Enter, and it's an apt description because what I'm about to see are so-called garbage values.

- [0:25:14](https://www.youtube.com/watch?v=undefined&t=37514s) When you, the programmer, do


not initialize your code's variables to have values, sometimes, who knows what's going to be there. The
computer's been doing some other things, there's a bit of work that happens even before your code runs
in the computer, so there might be remnants of past ints, chars, strings, floats-- anything else in there
and what you're seeing is those garbage values, which is to say you should never forget, as I just did, to
initialize the value of some variable.
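
One way to avoid printing garbage values is to give every element a known value before reading it. The sketch below assumes two hypothetical helpers, `clear_scores` and `print_scores`, which are not from the lecture.

```c
#include <stdio.h>

// Hypothetical helper (not from the lecture): gives every score a known
// starting value so that no garbage values remain in the array.
void clear_scores(int scores[], int count)
{
    for (int i = 0; i < count; i++)
    {
        scores[i] = 0;
    }
}

// Hypothetical helper: prints each score on its own line.
void print_scores(const int scores[], int count)
{
    for (int i = 0; i < count; i++)
    {
        printf("%i\n", scores[i]);
    }
}
```

For a local array, writing `int scores[3] = {0};` at the point of declaration has the same effect of setting every element to zero.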

- [0:25:41](https://www.youtube.com/watch?v=undefined&t=37541s) And this is actually pretty


dangerous, and there have been many examples of software being compromised because of one of
these issues where a variable wasn't initialized and all of a sudden users, maybe people on the internet
in the context of web applications, could suddenly see the contents of someone else's memory, or
remnants.

- [0:25:59](https://www.youtube.com/watch?v=undefined&t=37559s) Maybe someone's password that


had been previously typed in or some other value like a credit card number that had been previously
typed in. There are different defense mechanisms in place to generally make this not so likely, but it's
certainly very possible, at least in this kind of context, to see values that you probably shouldn't because
they might be remnants from something else that used them.

- [0:26:21](https://www.youtube.com/watch?v=undefined&t=37581s) So this is to say again, you have


this great power now to manipulate memory, but also now you have this great hacking ability to poke
around the contents of memory, and this is exactly what hackers sometimes do when trying to find ways
to exploit systems. Any questions here? No? All right, let's go ahead and take a quick five minute
break and when we come back, we'll build on these final topics.
- [0:26:45](https://www.youtube.com/watch?v=undefined&t=37605s) See you in five. We are back. First,
just a little programmer humor from XKCD, which hopefully now will make a little bit of sense to you.
And what we'll also do next is take a look at a short two minute video that animates with claymation, if
you will, from our friends at Stanford, exactly what happens now if you have an understanding of what
garbage values are and how they get there, and what happens then if you misuse them.

- [0:27:07](https://www.youtube.com/watch?v=undefined&t=37627s) It's one thing just to print them


out as I just did, it's another if you actually mistake a garbage value for a valid pointer, because garbage
values are just zeros and ones somewhere-- numbers, that is. But if you use that new dereference
operator, the star, and try to go to a garbage value thinking incorrectly that it's a valid pointer, bad things
can happen.

- [0:27:27](https://www.youtube.com/watch?v=undefined&t=37647s) Computers can crash or more


familiarly, segmentation faults can happen. So allow me to introduce, if we could dim the lights for two
minutes, our friend Binky from Stanford. SPEAKER 1: Hey Binky, wake up. It's time for pointer fun. BINKY:
What's that? Learn about pointers? Oh, goody! SPEAKER 1: Well, to get started, I guess we're going to
need a couple of pointers.

- [0:27:52](https://www.youtube.com/watch?v=undefined&t=37672s) BINKY: OK, this code allocates two


pointers which can point to integers. SPEAKER 1: OK. Well, I see the two pointers, but they don't seem to
be pointing to anything. BINKY: That's right. Initially, pointers don't point to anything. The things they
point to are called pointees, and setting them up is a separate step.

- [0:28:08](https://www.youtube.com/watch?v=undefined&t=37688s) SPEAKER 1: Oh, right, right. I knew


that. The pointees are separate. So how do you allocate a pointee? BINKY: OK, well this code allocates a
new integer pointee, and this part sets x to point to it. SPEAKER 1: Hey, that looks better. So make it do
something. BINKY: OK, I'll dereference the pointer x to store the number 42 into its pointee.

- [0:28:29](https://www.youtube.com/watch?v=undefined&t=37709s) For this trick, I'll need my magic


wand of dereferencing. SPEAKER 1: Your magic wand of dereferencing? That's great. BINKY: This is what
the code looks like. I'll just set up the number and-- SPEAKER 1: Hey, look. There it goes. So doing a
dereference on x follows the arrow to access its pointee, in this case to store 42 in there.

- [0:28:51](https://www.youtube.com/watch?v=undefined&t=37731s) Hey, try using it to store the


number 13 through the other pointer, y. BINKY: OK. I'll just go over here to y and get the number 13 set
up, and then take the wand of dereferencing and just-- whoa! SPEAKER 1: Oh hey, that didn't work. Say,
Binky, I don't think dereferencing y is a good idea because setting up the pointee is a separate step and I
don't think we ever did it.

- [0:29:19](https://www.youtube.com/watch?v=undefined&t=37759s) BINKY: Good point. SPEAKER 1:


Yeah, we allocated the pointer y, but we never set it to point to a pointee. BINKY: Very observant.
SPEAKER 1: Hey, you're looking good there, Binky. Can you fix it so that y points to the same pointee as
x? BINKY: Sure, I'll use my magic wand of pointer assignment. SPEAKER 1: Is that going to be a problem,
like before? BINKY: No, this doesn't touch the pointees, it just changes one pointer to point to the same
thing as another.

- [0:29:43](https://www.youtube.com/watch?v=undefined&t=37783s) SPEAKER 1: Oh, I see. Now y


points to the same place as x. So wait, now y is fixed. It has a pointee so you can try the wand of
dereferencing again to send the 13 over. BINKY: OK, here it goes. SPEAKER 1: Hey, look at that. Now
dereferencing works on y. And because the pointers are sharing that one pointee, they both see the 13.

- [0:30:04](https://www.youtube.com/watch?v=undefined&t=37804s) BINKY: Yeah, sharing. Whatever.


So are we going to switch places now? SPEAKER 1: Oh look, we're out of time. BINKY: But-- DAVID J. MALAN: That's from
our friend Nick Parlante at Stanford. So let's consider what Nick did here as Binky. So here is all the code
together. These first couple of lines were not bad, and notice that in Stanford's code they move the stars
to the left.

- [0:30:22](https://www.youtube.com/watch?v=undefined&t=37822s) That's fine. Again, more


conventional might be this syntax here. These two lines are fine. It's OK to create variables, even
pointers, and not assign them a value initially so long as you eventually do. So we eventually do here,
with this line. We assign to x the return value of malloc, which is presumably the address of something.

- [0:30:41](https://www.youtube.com/watch?v=undefined&t=37841s) To be fair, we should really be


checking for null as well, but that's not the biggest problem here. The biggest problem is not even this
next line, which means go to the memory location in x and store the number 42 there. That's fine,
because again, malloc returns the address of some chunk of memory.

- [0:30:59](https://www.youtube.com/watch?v=undefined&t=37859s) This chunk of memory is big


enough for an int. x is therefore going to store the address of that chunk that's big enough for an int. Star
x, recall, is the dereference operator; it means go to that address and put 42 in it. It's like going to the
mailbox and putting the number 42 in it instead of taking the number 50 out, like we did before.

- [0:31:17](https://www.youtube.com/watch?v=undefined&t=37877s) But why is this line bad? This is


where Binky lost his head, so to speak. Why is this bad? Yeah. AUDIENCE: We haven't yet allocated space
for it. DAVID J. MALAN: Exactly. We haven't yet allocated space for y. There's no mention of malloc,
there's no assignment of y, even to that same memory. So this would be, go to the address in y, but if
there is no known address in y, it is a so-called garbage value, which means go to some random address
that you have no control over, and boom-- that might cause what we've seen in the past, perhaps as a
segmentation fault.
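
The steps from the video can be sketched roughly as follows. The wrapper function `pointer_fun`, the NULL check, and the returned value are assumptions added for illustration, and the problematic line is left commented out so that the sketch is safe to run.

```c
#include <stdlib.h>

// Hypothetical wrapper (not from the video) around the steps discussed.
// Returns the value both pointers see at the end, or -1 if malloc fails.
int pointer_fun(void)
{
    // Two pointers; neither points at anything yet.
    int *x;
    int *y;

    // Allocate a pointee big enough for an int and point x at it.
    x = malloc(sizeof(int));
    if (x == NULL)
    {
        return -1;
    }

    // Go to the address in x and store 42 there.
    *x = 42;

    // *y = 13;  // the bad line: y holds a garbage value at this point

    // Pointer assignment: y now points at the same pointee as x.
    y = x;

    // Now dereferencing y is fine; it overwrites the shared pointee.
    *y = 13;

    int result = *x;  // x sees the 13 too, since the pointee is shared
    free(x);
    return result;
}
```

Assigning `y = x` copies only the address, which is why a single call to free is enough for the one chunk of memory both pointers share.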

- [0:31:48](https://www.youtube.com/watch?v=undefined&t=37908s) Now this, fortunately, is the kind


of thing that if you don't quite have the eye for it yet, Valgrind, that new tool, could help you find as well.
But it's just another example of again, the sort of upside and downside of having control now over
memory at this level. All right. Well, let's go ahead and do one other thing.

- [0:32:05](https://www.youtube.com/watch?v=undefined&t=37925s) Consider from last week that


this notion of swapping was actually a really common operation. We had all of our volunteers come up,
we had to swap a lot of things during bubble sorts and even selection sort, and we just took for granted
that the two humans would swap themselves just fine. But there needs to be code to do that if you
actually implement bubble sort, selection sort, or anything that involves swapping.

- [0:32:25](https://www.youtube.com/watch?v=undefined&t=37945s) So let's consider some code like


this. We'll keep it simple like last week, and where we wanted to swap some values like int A and int B,
for instance, here. Void because I'm not going to return a value, but I have a function called swap. So
here, for instance, might be some code for this. But why is it so complicated? Here, let's actually take a
step back.
- [0:32:47](https://www.youtube.com/watch?v=undefined&t=37967s) Why don't we do this here. I think
we have time for one more volunteer. Could we get someone to come on up? You have to be comfy on
camera and you're being asked to help with your-- oh, I'll go with the friend, pointing. So whoever has
their friend doing this here-- no? Now they're pointing it over here.

- [0:33:04](https://www.youtube.com/watch?v=undefined&t=37984s) Now, literally an arm is being


twisted. OK. Come on down. That backfired. Come on over. And what is your name? AUDIENCE: Marina.
DAVID J. MALAN: Marina. Nice to meet you. Who were you trying to volunteer? AUDIENCE: My friend
Jesse. DAVID J. MALAN: OK. So here we have for Marina two glasses of liquid, orange and purple, just so
that they're super obvious.

- [0:33:35](https://www.youtube.com/watch?v=undefined&t=38015s) And suppose that the problem at


hand, like last week, is just to swap two values, as though these two glasses represented two people
and we want to swap them. But let's consider these glasses to be like variables, or locations in an array,
and you know what? I'd really like you to swap the values.

- [0:33:50](https://www.youtube.com/watch?v=undefined&t=38030s) So orange has to go in there, and


purple has to go in there. How would you do it? And we'll see if we can then translate that to code.
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: OK, what-- say it a little louder. All right, yeah. So presumably,
you're struggling mentally with how you would do this without having an extra cup, so good foresight
here.

- [0:34:09](https://www.youtube.com/watch?v=undefined&t=38049s) Let me go ahead and we do have


a temporary variable, if you will. So if I hand you this, how would you now solve this problem?
AUDIENCE: I would go like that, but it's-- DAVID J. MALAN: No, that's-- Oh. Well, OK. Go do it-- go with
your instincts. OK. Sure, go ahead. Go to whatever your instincts are.

- [0:34:35](https://www.youtube.com/watch?v=undefined&t=38075s) Yeah, so a little-- so strictly


speaking, probably shouldn't have moved the glasses just because that would be like moving the array
locations, so let's actually do it one more time but the glasses now have to go back where they originally
are. So how would you swap these now, using this temporary variable? OK, good.

- [0:34:52](https://www.youtube.com/watch?v=undefined&t=38092s) Otherwise we'd be completely


uprooting the array, for instance, by just physically moving it around. So you moved the orange into this
temporary variable, then you copied the purple into where the orange was, and now, presumably,
excellent. The orange is going to end up where the purple once was and this temporary variable, it
stored up some extra memory.

- [0:35:09](https://www.youtube.com/watch?v=undefined&t=38109s) It was necessary at the time, but


not necessary, ultimately. But a round of applause if we could, and thank you for doing that so well. So
the fact that it instantly occurred to Marina that you need some temporary variable is a perfect
translation to code, and in fact this code here, that we might glimpse now, is reminiscent of exactly that
algorithm, where A and B, at the end of the day, are the same chunks of memory.

- [0:35:33](https://www.youtube.com/watch?v=undefined&t=38133s) Just like the second time, the two


glasses have to kind of stay put, even though we're physically lifting them, but they're going back to
where they were, is kind of like having two values, A and B, and you just have a temporary variable into
which you copy A, then you change A with B, then you go and change B with whatever the original value
of A was, because you temporarily stored it in this temporary variable, tmp.

- [0:35:55](https://www.youtube.com/watch?v=undefined&t=38155s) Unfortunately, this code doesn't


necessarily work as intended. So let me go over to my VS Code here and open up a program called
swap.c, and in swap.c, let me whip up something really quickly here with, how about include stdio.h, int
main(void). Inside of main let me do something like x gets 1 and y gets 2.

- [0:36:18](https://www.youtube.com/watch?v=undefined&t=38178s) Let me just print out as a visual


confirmation that x is %i, y is %i backslash n, plugging in x and y, respectively. Then let me call a swap
function that we'll invent in just a moment. Swap x and y. And then let me print out again x is %i, y is %i
backslash n, just to print out again what they are, because presumably I should see 1, 2 first, then 2, 1
the second time.

- [0:36:45](https://www.youtube.com/watch?v=undefined&t=38205s) Now how is swap going to be


implemented? Let me implement it exactly as on the screen a moment ago. So void swap int x-- or let's
call it int A for consistency, int B. But I could always call those anything I want. Int tmp gets A, A gets B, B
gets tmp. So exactly as I proposed a moment ago, and exactly as Marina really implemented it using
these glasses of water.

- [0:37:08](https://www.youtube.com/watch?v=undefined&t=38228s) I need to now include my


prototype, as always, so nothing new there. And I'll just copy/paste that up here, and now let's go ahead
and run this. So make swap-- so far, so good-- ./swap-- x is now 1, y is 2, x is 1, y is 2. So there seems to be
a bit of a bug here, but why might this be? This code does not in fact work, even though it obviously
works in reality.

- [0:37:33](https://www.youtube.com/watch?v=undefined&t=38253s) Yeah? AUDIENCE: Because A and


B have different addresses than x and y [INAUDIBLE]. DAVID J. MALAN: Good, and let me summarize. A
and B do indeed have different addresses than x and y, and in fact what happens when you call a function
like this on line 11, calling swap, passing in x and y, you are calling a function by value, so to speak.

- [0:37:56](https://www.youtube.com/watch?v=undefined&t=38276s) And this is a term of art that just


means you are passing in copies of x and y, respectively, and calling them A and B in the context of this
function, but they're indeed copies. Now technically, these names are local only. I could have called this
x, I could have called this y, I could have changed this to x, this to y, this to x, and this to y.

- [0:38:18](https://www.youtube.com/watch?v=undefined&t=38298s) The problem would still remain.


Just because you use the same names in one function as you do elsewhere, that doesn't mean they're
the same. They just look the same to you. But indeed, swap is going to get copies of this x and y, and in
this context, this scope, so to speak-- x and y will be copies of the original.
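
The pass-by-value behavior described above can be sketched like this. The `swap` function follows the three lines from the lecture; the driver `caller_unchanged` is a hypothetical addition that reports whether the caller's variables survived the call untouched.

```c
#include <stdio.h>

// Receives copies of the caller's values, so the swap below only
// rearranges those copies.
void swap(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;

    // Inside the function the swap really has happened.
    printf("a is %i, b is %i\n", a, b);
}

// Hypothetical driver (not from the lecture): returns 1 if x and y are
// unchanged after calling swap, 0 otherwise.
int caller_unchanged(void)
{
    int x = 1;
    int y = 2;

    swap(x, y);

    return x == 1 && y == 2;
}
```

Renaming the parameters to x and y would not change the result, since they would still be separate variables that merely share names with the caller's.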

- [0:38:36](https://www.youtube.com/watch?v=undefined&t=38316s) So for clarity, let me revert this


back to A and B just to make super clear that they're indeed different, albeit copies, but there's indeed a
problem there. This function actually works fine. In fact, notice this. Let me go ahead and print out inside
of this. printf A is %i, B is %i backslash n, and then I'll print A and B.

- [0:38:56](https://www.youtube.com/watch?v=undefined&t=38336s) And let me do that same thing at


the beginning of this function before it does any work. Let me go ahead and rerun. Make swap, ./swap,
and this is promising. Initially, x is 1, y is 2, A is 1, B is 2, A is 2, B is 1, but then nope-- x is 1, y is 2. So if
anything, I've confirmed that the logic is right-- Marina's logic is right, but there's something about C.

- [0:39:20](https://www.youtube.com/watch?v=undefined&t=38360s) There's something about using


one function versus another that's actually creating a problem here. The fact that I'm passing in copies of
these values is creating this problem. So what in fact is going on? Well again, inside of your computer's
memory there are these little chips, and we've been talking about them abstractly, it's just this grid of
memory locations.

- [0:39:39](https://www.youtube.com/watch?v=undefined&t=38379s) It turns out that your computer


uses this memory in a pretty conventional way. It's not just random, where it just puts stuff wherever is
available, it actually uses different parts of the memory for different purposes. And you have control over
a lot of it, but the computer uses some of it for itself.

- [0:39:55](https://www.youtube.com/watch?v=undefined&t=38395s) And let's go ahead and zoom out


from this and consider that within your computer's memory, what a computer will typically do is actually
store initially, all of the zeros and ones that you compiled in the top of your computer's memory, so to
speak. So when you compile a program and then you run it with

- [0:40:12](https://www.youtube.com/watch?v=undefined&t=38412s) ./whatever, or on a Mac or PC you


double click on it, the computer first-- the operating system first-- loads all of your program's zeros and
ones, a.k.a. machine code, into just one big chunk of memory at the top, so to speak. Below that it
stores global variables-- any variables you have created in your program that are outside of main and
outside of any functions.

- [0:40:33](https://www.youtube.com/watch?v=undefined&t=38433s) Generally, the top of your file.


Globals tend to go at the top there. Then there's this chunk of memory that's generally known as the
heap-- and we saw that word briefly in Valgrind's output, and then there's this other chunk of memory
called the stack. And it turns out that up until this week you were using the stack heavily.

- [0:40:51](https://www.youtube.com/watch?v=undefined&t=38451s) Any time you use local variables


in a function they end up on the stack. Any time you use malloc, that memory ends up on the heap. Now
as the arrow suggests, this actually looks like a problem waiting to happen because if you use more and
more and more heap, and more and more and more stack, it's like two things barreling down the tracks
at one another-- this does not end well.
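
As a small illustration of the two regions, the sketch below uses one local variable and one malloc'd int. The function name `stack_and_heap` is hypothetical, and the comments describe where the values conventionally live rather than anything the C language itself guarantees.

```c
#include <stdlib.h>

// Hypothetical example (not from the lecture). Returns the value read
// back from the heap, or -1 if malloc fails.
int stack_and_heap(void)
{
    // A local variable: typically stored in this function's stack memory.
    int local = 50;

    // The pointer p is itself a local variable, but the int it points to
    // comes from malloc, so that int lives on the heap.
    int *p = malloc(sizeof(int));
    if (p == NULL)
    {
        return -1;
    }

    *p = local;
    int result = *p;

    // Heap memory stays in use until freed; the function's stack memory
    // is reclaimed automatically when the function returns.
    free(p);

    return result;
}
```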

- [0:41:10](https://www.youtube.com/watch?v=undefined&t=38470s) And that's actually a problem. If


you've ever heard the phrase stack overflow, or used the website, this is the origin of its name. When you
start to use more and more and more memory by calling lots and lots of functions or using lots and lots
of local variables, you use a lot of this stack memory.

- [0:41:26](https://www.youtube.com/watch?v=undefined&t=38486s) Or if you use malloc a lot and


keep calling malloc, malloc, malloc, and never really, or rarely calling free, you just use more and more
memory and eventually these two things might overflow each other, at which point you're just out of
luck. The program will crash or something bad will happen. So the onus is on you just to not do that.

- [0:41:43](https://www.youtube.com/watch?v=undefined&t=38503s) But this is the design, generally, of


what's going on inside of your computer's memory. Now within that memory, though, there are certain
conventions focusing on here, the stack. And in fact, let me go over here with a marker and say that this
represents the bottom of my memory, ultimately. And so here we have a whole bunch of wooden blocks
and each of these squares represents a byte of memory and this, for instance, might represent four
bytes altogether-- good enough for an int, or something like that.

- [0:42:09](https://www.youtube.com/watch?v=undefined&t=38529s) So in my original code that I wrote


earlier, that is in fact, buggy, what is in fact going on inside the swap function? We can visualize it like
this-- when you run ./swap or any program for that matter, main is the first function to get called with a C
program, and so I'm just going to label this bottom row of memory as main.

- [0:42:27](https://www.youtube.com/watch?v=undefined&t=38547s) And what were the two variables I


had in main called in this code? Yeah. x and y. And each of those was an int, so that's four bytes, so it's
deliberate that I reserved four-- a chunk of wood here that's four bytes. So let me just call this x, and I'm
just going to write the number 1 in this box here.

- [0:42:46](https://www.youtube.com/watch?v=undefined&t=38566s) And then I had my other variable


y, and I'm going to put the number 2 there. What happens when main calls swap like it does in this code
here? Well, it has two variables of its own, A and B, and A initially is 1 and B is initially 2, but it has a third
variable, tmp, which is a local variable in addition to the arguments A and B that are passed in, so I'm
going to call this tmp, tmp over here.

- [0:43:12](https://www.youtube.com/watch?v=undefined&t=38592s) And what is the value of tmp?


Well, we have to look back at the code. tmp initially gets the value of A. All right, the value of a was 1, so
tmp initially gets 1. That's step one in my three line program. OK, A equals B. So that is assigned from the
right to the left, the B into the A. So B is 2, A is this, so let me go ahead and erase this and just
overwrite that.

- [0:43:34](https://www.youtube.com/watch?v=undefined&t=38614s) So at this moment in the story


you have two copies of two, so that's OK though, because the third line of code says tmp gets copied into
B. So what's tmp-- 1, gets copied into B, so let me overwrite this 2 with a 1, and now what happens?
Now unfortunately, the code ends. swap doesn't actually do anything with the result, and the problem in
C is that I could have had a return value.

- [0:43:59](https://www.youtube.com/watch?v=undefined&t=38639s) I could go in there and change


void to int, but which one am I going to return? The A or the B? The whole goal is to swap two values,
and it seems kind of lame if you can't write a function to do something as common, per last week's sorting
algorithms, as swapping two values. But what really happens? Well, even though when this program
starts running, main is using this chunk of memory at the bottom in the so-called stack, and the stack is
just like a cafeteria stack of trays-- it grows up, like this.

- [0:44:26](https://www.youtube.com/watch?v=undefined&t=38666s) Here's main's memory on the


stack. Here's the swap function's memory on the stack. It's using three ints instead of two-- instead of
only two. What happens when the function returns, whether it's void or not? The sort of recollection
that this is swap's memory goes away and garbage values are left.

- [0:44:43](https://www.youtube.com/watch?v=undefined&t=38683s) So, adorably, we get rid of these


values here, and there's still data there-- technically, the numbers 1, 1, and 2 are still there in the
computer's memory but they no longer belong to us because the function has now returned. So they're
still in there and this is kind of an example visually of why there's other stuff in memory even though you
didn't put it there, necessarily.

- [0:45:04](https://www.youtube.com/watch?v=undefined&t=38704s) Sometimes you did put it there,


but now once swap returns you only should be touching memory inside of main. But we've never
actually copied one value into main. We haven't returned anything and we haven't solved this
fundamentally. So how could we do this? Well, what if we instead passed into swap not copies of x and y,
calling them A and B.

- [0:45:28](https://www.youtube.com/watch?v=undefined&t=38728s) What if they passed in


breadcrumbs to x and y, sort of a treasure map that will lead swap to the actual x and to the actual y?
Today we have that capability using pointers. So suppose that we use this code instead. There's a lot of
stars going on here, which is a bit annoying, but let's consider what it is we're trying to achieve. What if
we pass in not x and y, but the address of x and the address of y, respectively-- breadcrumbs, if you will--
that will lead swap to the original values.

- [0:45:56](https://www.youtube.com/watch?v=undefined&t=38756s) Then what we do is we still give


ourselves a tmp variable, like an empty glass. It's still a glass, so we still call it an int, but what do we
want to put into that temporary variable? We don't want to put A into it, because that's an address now.
We want to go to that address per the star and put whatever's at that address.

- [0:46:13](https://www.youtube.com/watch?v=undefined&t=38773s) What do we then want to do?


Well, we want to then copy into whatever's at location A, we want to copy over to location A's contents
whatever is at location B's contents and then lastly, we want to copy tmp into whatever's at location B.
So again, we're very deliberately introducing all of these stars because we don't want to change any of
these addresses, we want to go to these addresses per the dereference operator and put values there, or
get values from.

- [0:46:42](https://www.youtube.com/watch?v=undefined&t=38802s) So what does this actually mean?


Well, if I kind of rewind in this story and I go back here, I still have tmp, although I'm going to delete its
value to begin with, I still have B and I still have A, but what's going to be different this time is how I use
A and B. So let me finish erasing those. That's A on the left, this is B on the right.

- [0:47:03](https://www.youtube.com/watch?v=undefined&t=38823s) At this point in the story, we're


rerunning swap with this new and improved version, and let's see what happens. Well, x is presumably
at some address. Maybe it's like 0x123, as always. What then does A get when I'm using this code? The
value of A is 0x123. What is the value of B? Maybe y is at 0x456.

- [0:47:27](https://www.youtube.com/watch?v=undefined&t=38847s) What goes in B? Well, I'm going to


put 0x456, and then what am I going to do? Based on these three lines of code, I'm going to store in tmp
whatever is at the address in A. What is the address in A? That's this thing here, so I'm going to put 1 in
tmp. Line two-- I'm going to go to B-- all right, B is 456, so I'm going to B and I'm going to store 2 at
whatever is at location A, and at location A is 123, so that's this, so what am I going to do? I'm going to
change this 1 to a 2.

- [0:47:59](https://www.youtube.com/watch?v=undefined&t=38879s) Last line of code-- get the value of


tmp, which is 1, and then put it at whatever the location B is, so B, 456, go there and change it to be the
value of tmp, tmp, which puts 1 here. That's it for the code. There's still no return value. swap returns,
which means these three temporary variables are garbage values now.

- [0:48:19](https://www.youtube.com/watch?v=undefined&t=38899s) They can be reused by


subsequent function calls but now, I've actually swapped the values of x and y. Which is to say what
came as naturally in the real world here for Mariana is not quite as simply done in C because again,
functions are isolated from each other. You can pass in values but you get copies of those values.

- [0:48:40](https://www.youtube.com/watch?v=undefined&t=38920s) If you want one function to affect


the value of a variable somewhere else, you have to 1, understand what's going on but 2, pass things in
by a pointer here. So if I go back to my code here, I need to make a few changes now. Let me get rid of
these extra printf's. Let me go in and add all these stars.

- [0:48:59](https://www.youtube.com/watch?v=undefined&t=38939s) So I'm dereferencing these actual


addresses here and here, and I've got to make one more change. How do I now call swap if swap is
expecting an int star and an int star? That is, the address of an int and the address of another int. What
do I change on line 11 here? Yeah. Sorry, a little louder. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Sorry,
the address of operator.

- [0:49:28](https://www.youtube.com/watch?v=undefined&t=38968s) So up here on line 11, we do


ampersand x and ampersand y. So that yes, we're technically passing in a copy of a value, but this time
the copy we're passing in is technically an address, and as soon as we have an address, just like when I
held up the fuzzy finger-- the foamy finger-- I can point at that address, I can go to that address and
actually get a value from the mailbox or put a value into the mailbox if I even want.

- [0:49:52](https://www.youtube.com/watch?v=undefined&t=38992s) So let's cross our fingers now and


do make swap, Enter. Oh my God, so many mistakes. Oh, I didn't remember to change my prototype, so
let me go way up here and add two more stars because I made that change already. Make swap, ./swap,
and voilà-- now I have actually swapped. Thank you. Thank you. The two values.

- [0:50:17](https://www.youtube.com/watch?v=undefined&t=39017s) All right, so what more can we do


here? Well, let me consider that all this time we've been deliberately using GetString and GetInt and
GetFloat and so forth, but for a reason. These aren't just training wheels for the sake of making things
easier, they're actually in place to make your code safer.

- [0:50:36](https://www.youtube.com/watch?v=undefined&t=39036s) And to illustrate this, let me go


ahead and open up one other file here. How about a file called scanf.c. It turns out that the old school
way-- the way in C, really, of getting user input, is via functions like scanf, and let me go ahead and
include stdio.h, int main(void), and without using the CS50 library at all for strings or for any of those get
functions.

- [0:51:01](https://www.youtube.com/watch?v=undefined&t=39061s) Let me give myself an int called x.


Let me just print out what the value of x is, even though it's going to be a-- or rather, ask the user for the
value by asking them for x. And I'm going to use a function called scanf that's going to scan in an integer
using %i, and I'm going to store whatever the human types in at this location.

- [0:51:23](https://www.youtube.com/watch?v=undefined&t=39083s) And then I'm going to go ahead


and, just so we can see what happened, I'm going to print out with %i whatever the human typed in as
follows. All right, so line eight is week 1 style code. Line five and six is week 1 style code. So the curiosity
today is this new line. scanf is another function in stdio.h,

- [0:51:42](https://www.youtube.com/watch?v=undefined&t=39102s) and notice what I'm doing. I'm


using the same syntax that I use for printf, which is kind of a little clue-- a format code to tell scanf what
it is I want to scan in, that is, read from the human's keyboard-- and I'm telling it where to put whatever
the human typed in. I can't just say x, because we run into the same darn problem as with swap.

- [0:52:00](https://www.youtube.com/watch?v=undefined&t=39120s) I have to give a little breadcrumb


to the variable where I want scanf to put the human's integer. And so this just tells the computer to get
an int. This is what you would have had to type, essentially, in week 1 just to get an int from the user, and
there's a whole bunch of things that can go wrong still, but that's the cryptic syntax we would have had
to show you in week 1.

- [0:52:20](https://www.youtube.com/watch?v=undefined&t=39140s) Let me go ahead and make scanf


here-- oops-- user error. Put the semicolon in the wrong place. Make scanf, Enter. Oh my God. Non void
doesn't return a value. Oh, thank you. Strike two. OK. Make scanf. There we go. OK, so scanf-- I'm going
to type in a number like 50 and it just prints it back out. So that is the traditional way of implementing
something like GetInt.

- [0:52:50](https://www.youtube.com/watch?v=undefined&t=39170s) The problem, though, is when you


start to get into strings, things get dangerous quickly. Let me delete all of this and give myself a string s,
although wait a minute-- we don't call it strings anymore-- char star to store a string. Then let me go
ahead and just prompt the user for a string, using just printf.

- [0:53:06](https://www.youtube.com/watch?v=undefined&t=39186s) Then let me go ahead and use


scanf, ask them for a string this time with %s, and store it at that address. Then let me go ahead and
print out whatever the human typed in just by using the same notation. So here, line five is the same
thing as string s, but we've taken back that layer today so it's char star s.

- [0:53:27](https://www.youtube.com/watch?v=undefined&t=39207s) This is just week one, this is just


week one, line seven is new. scanf will also read from the human's keyboard a string and store it at s. But
that's OK, because s is an address. It's correct not to do the ampersand. It's not necessary. A string is and
has always been a char star, a.k.a string.

- [0:53:47](https://www.youtube.com/watch?v=undefined&t=39227s) The problem, though, arises as


follows-- if I do make scanf-- oh my God, what did I do wrong-- I can't-- OK, we have certain defenses in
place with make. Let me do clang of scanf.c, and output a program called scanf. All right, so I'm overriding
some of our pedagogical defenses that we have in place with make.

- [0:54:07](https://www.youtube.com/watch?v=undefined&t=39247s) Let me now run scanf of this


version, Enter, and let me type in something like, how about hi again. So it didn't even store something
and it weirdly printed out null. This time it's in lowercase, but that is somewhat related. What did I
fundamentally do wrong though, here? Why is this getting more and more dangerous? And let me
illustrate the point even more.

- [0:54:31](https://www.youtube.com/watch?v=undefined&t=39271s) What if I type in not just


something like hello, which also doesn't work. What if I do like, hellooooo and make a really long string,
Enter-- that still works. Can I do this again? Let's try again. Right, a really long, unexpectedly long string.
This is the nondeterminism kicking in. Enter. All right, damn it.

- [0:54:52](https://www.youtube.com/watch?v=undefined&t=39292s) I was trying to trigger a


segmentation fault but it wouldn't, but the point still remains. It's still not working, but what's the
essence of why this isn't working, and it's not storing my actual input? Yeah. AUDIENCE: Do you have to
make a space? DAVID J. MALAN: We have to make space for it.

- [0:55:08](https://www.youtube.com/watch?v=undefined&t=39308s) So what we're missing here is


malloc, or something like that. So I could do that, I could do something like this. Well, let the human type
in at least a three letter word so I could do malloc of 3 plus 1 for the null character. So let me give them
four characters, and let me go ahead and do make scanf-- whoops.

- [0:55:26](https://www.youtube.com/watch?v=undefined&t=39326s) Nope, sorry. clang, I have to--


nope. Dammit. Oh, include stdlib.h-- there we go. That gives me malloc, now I'm going to recompile this
with clang, now I'm going to rerun it, and now I'm going to type in my first thing, hi. That now works.
And let me get a little aggressive now and type in hello, which is too long.

- [0:55:47](https://www.youtube.com/watch?v=undefined&t=39347s) Still works, but I'm getting lucky.


Let me try a hellooooooo. Damn it, that still works, too. Sort of. But it actually-- not quite. There's some
weirdness going on there already. It turns out I can also do this. I could actually just say char s[4]
and give myself an array of four characters. Let me try this one more time.

- [0:56:08](https://www.youtube.com/watch?v=undefined&t=39368s) So let me rerun clang ./scanf.


Hellooooooo, clearly exceeding the four characters-- there we go. Thank you, all right. So the point here,
though, is if we hadn't given you GetInt, you would have had to use the scanf thing-- not a huge deal
because it seemed to work. But if we hadn't given you GetString you would have had to do stuff like this,
knowing about malloc already or knowing about strings being arrays, and even now there's a danger.

- [0:56:37](https://www.youtube.com/watch?v=undefined&t=39397s) If the human types in five letters,


six letters, 100 letters-- this code, like with the Hello input, will probably just crash, which is bad. So
GetString also has this functionality built in where we have a fancy loop inside such that we allocate
using malloc as many bytes as you physically type in, and we use malloc essentially every keystroke.

- [0:56:56](https://www.youtube.com/watch?v=undefined&t=39416s) The moment you type in h-e-l-l-o,


we're laying the tracks as we go and we keep allocating more and more memory so that we theoretically
will never crash with GetString even though it's this easy to crack-- this easy to crash your code using
scanf if you again did it without the help of a library.

- [0:57:13](https://www.youtube.com/watch?v=undefined&t=39433s) So where are we all going with


this? Well, let me show you a few final examples that'll pave the way for what will be problem set four.
Let me go ahead and open up from today's code-- which is available on the course's website-- for
instance, a program like this, called phonebook.c, and I'm just going to give you a quick tour of it, that
you'll see more details on in the context of p-set four itself.

- [0:57:38](https://www.youtube.com/watch?v=undefined&t=39458s) We're going to introduce a few


new functions you're going to see. You're going to see a function called fopen, which stands for file open,
and it takes two arguments-- the name of a file to open like a CSV that you might manipulate in Excel or
Google Spreadsheets or the like-- comma separated values, and then something like A for append, R for
read, W for write, depending on whether you want to add to the file, just open it up, or change it.

- [0:58:01](https://www.youtube.com/watch?v=undefined&t=39481s) We're going to introduce you to a


file pointer. You'll see that capital FILE-- which is a little bit unconventional-- capital FILE is a pointer to an
actual file on the computer's hard drive so that you can actually access something like a CSV file, or heck,
even images. And we're going to see down below that you're also going to have the ability to write files
as well, or print to files.

- [0:58:20](https://www.youtube.com/watch?v=undefined&t=39500s) You'll see functions like fprintf--


fprintf for file printf. Or fwrite-- file write-- which now that you will begin to understand pointers, you'll
have the ability to actually not only read files-- text files, images, other things-- but also write them out.
In fact for instance, just as a teaser here, JPEGs will be one of the things we focus on this week where we
give you a forensic image and your goal is to recover as many photographs from this forensic image of a
digital camera as you possibly can.

- [0:58:51](https://www.youtube.com/watch?v=undefined&t=39531s) And the way you're going to do


that is by knowing in advance that every JPEG in the world starts with these three bytes, written in
hexadecimal, but these three numbers. And so in fact, just as a teaser, let me open up an example you'll
see on the course's website for today. If I scroll through here, you'll see a program that does a little
something like this.

- [0:59:11](https://www.youtube.com/watch?v=undefined&t=39551s) And again, more on this-- if we


could hit the button-- there we go. So here we have the notion of a byte we're going to create for
ourselves. We'll see a data type called byte, which is a common convention. This gives me three bytes.
And you're going to learn about a function called fread, which reads from a file some number of bytes--
for instance, three bytes.

- [0:59:32](https://www.youtube.com/watch?v=undefined&t=39572s) We might then use code like this.


If bytes[0] equals 0xFF and bytes[1] equals 0xD8 and bytes[2] equals 0xFF,
all three of those bytes I just claimed represent a JPEG, you'll see an output like this. Let me go ahead
and run this program as follows. Let me copy jpeg.c into my directory from today's distribution.

- [0:59:55](https://www.youtube.com/watch?v=undefined&t=39595s) Let me do make jpeg, and let me


run jpeg on a file which is available online called lecture.jpeg, and I claim yes, it's possibly a JPEG. Well,
what is that file? Let me open it up for us, called lecture.jpeg, and here, for instance, is that same photo
with which we began class, namely implemented as a JPEG.

- [1:00:16](https://www.youtube.com/watch?v=undefined&t=39616s) But what we're also going to do


this week is start to implement our own sort of filters a la Instagram, whereby we might take images and
actually run them through a program that creates different versions thereof. For instance, using a
different file format called BMP, which essentially lays out all of its pixels from left to right, top to
bottom, in a grid.

- [1:00:35](https://www.youtube.com/watch?v=undefined&t=39635s) You're going to see a struct-- a


data struct in C that's way more complicated than the candidate structure from the past, or the person
structure from the past, that looks like this, which is just a whole bunch more values in it, but we'll walk
you through these in the p-set. And we might take a photograph like this and ask you to run a few
different filters on it a la Instagram, like a black and white filter, or grayscale, a sepia filter to give it some
old school feel, or a reflection like this to invert it,

- [1:01:00](https://www.youtube.com/watch?v=undefined&t=39660s) or blur it, even in this way. And


just to end on a note here, I have a version of this code ready to go that doesn't implement all of those
filters, it just implements one filter initially. Let me go ahead and just ready this on my computer here.
I'm going to go into my own version of filter and you'll see a few files that will give you a tour of this
coming week. In bitmap.h,

- [1:01:22](https://www.youtube.com/watch?v=undefined&t=39682s) for instance, is a version of this


structure that I claimed existed a moment ago. And let me show you this file here, helpers.c, in which
there is a function called filter that I've already implemented in advance today. But the ones we give you
for the p-set won't already be implemented. This function called filter takes the height of an image,
the width of an image, and a two dimensional array.

- [1:01:47](https://www.youtube.com/watch?v=undefined&t=39707s) So rows and columns of pixels,


and then I have a loop like this that iterates over all of the pixels in an image from top to bottom, left to
right. And then notice what I'm going to do here. I'm going to change the blue value to be zero in this
case, and the green value to be zero in this case. But why? Well, the image I have here in mind is this
one, whereby we have this hidden image that simply has old school style-- a secret message embedded
in it.

- [1:02:14](https://www.youtube.com/watch?v=undefined&t=39734s) And if you don't happen to have


in your dorm one of these secret decoder glasses that essentially make everything red-- getting rid of the
green in the world and the blue in the world-- you can actually-- I'm actually probably the only one who
can read this right now-- see what message is hidden behind all of this red noise.

- [1:02:29](https://www.youtube.com/watch?v=undefined&t=39749s) But if using my code written here


in helpers.c I get rid of all the blue in the picture and I get rid of all the green in the picture, essentially
implementing the idea of this filter-- this red filter where you only see red-- well, let's go ahead and
compile this program. Make filter, run ./filter on this hidden message.bmp.

- [1:02:51](https://www.youtube.com/watch?v=undefined&t=39771s) I'm going to save it in a new file


called message.bmp, and with one final flourish we're going to open up message.bmp, which is the
result of having put on these glasses, and hopefully now you too will see what I see. All right, that's it for
CS50! We'll see you next time. [MUSIC PLAYING] SPEAKER 1: All right.

- [1:04:37](https://www.youtube.com/watch?v=undefined&t=39877s) This is CS50. And this is already


week 5, which means this is actually our last week in C together. In fact, in just a few days' time, what has
looked like this and much more cryptic than this perhaps, is going to be distilled into something much
simpler next week, when we transition to a language called Python.

- [1:04:56](https://www.youtube.com/watch?v=undefined&t=39896s) And with Python, we'll still have


our conditionals, and loops, and functions, and so forth. But a lot of the low-level plumbing that you
might have been wrestling with, struggling with, frustrated by, over the past couple of weeks, especially,
now that we've introduced pointers. And it feels like you probably have to do everything yourself.

- [1:05:12](https://www.youtube.com/watch?v=undefined&t=39912s) In Python, and in a lot of higher


level languages so to speak-- more modern, more recent languages, you'll be able to do so much more
with just single lines of code. And indeed, we're going to start leveraging libraries, all the more code that
other people wrote. Frameworks, which are collections of libraries that other people wrote.

- [1:05:28](https://www.youtube.com/watch?v=undefined&t=39928s) And on top of all that, you will be


able to make even better, grander, more impressive projects, that actually solve problems of particular
interest to you. Particularly, by way of your own final project. So last week though, in week 4, recall that
we focused on memory. And we've been treating this memory inside of your computer as like a canvas,
right.

- [1:05:45](https://www.youtube.com/watch?v=undefined&t=39945s) At the end of the day, it's just


zeros and ones, or bytes, really. And it's really up to you what you do with those bytes. And how you
interconnect them, how you represent information on them. And arrays were like one of the simplest
ways we started playing around with that memory. Just contiguous chunks of memory.

- [1:06:01](https://www.youtube.com/watch?v=undefined&t=39961s) Back-to-back, to back. But let's


consider, for a moment, some of the problems that pretty quickly arise with arrays. And then, today
focus on what more generally are called data structures. Using your computer's memory as a much more
versatile canvas, to create even two-dimensional structures. To represent information, and, ultimately, to
solve more interesting problems.

- [1:06:21](https://www.youtube.com/watch?v=undefined&t=39981s) So here's an array of size 3.


Maybe, the size of 3 integers. And suppose that this is inside of a program. And at this point in the story,
you've got 3 numbers in it already. 1, 2 and 3. And suppose, whatever the context, you need to now add
a fourth number to this array. Like, the number 4. Well, instinctively, where should the number 4 go? If
this is your computer's memory and we currently have this array 1, 2, 3, from what.

- [1:06:43](https://www.youtube.com/watch?v=undefined&t=40003s) Left to right. Where should the


number 4 just, perhaps, naively go. Yeah, what do you think? AUDIENCE: Replace number 1. SPEAKER 1:
Sorry? AUDIENCE: Replace number 1. SPEAKER 1: Oh, OK. So you could replace number 1. I don't really
like that, though, because I'd like to keep number 1 around. But that's an option.

- [1:06:58](https://www.youtube.com/watch?v=undefined&t=40018s) But I'm losing, of course,


information. So what else could I do if I want to add the number 4. Over there? AUDIENCE: On the right
side of 3. SPEAKER 1: Yeah. So, I mean, it feels like if there's some ordering to these, which seems kind of
a reasonable inference, that it probably belongs somewhere over here.

- [1:07:11](https://www.youtube.com/watch?v=undefined&t=40031s) But recall last week, as we started


poking around a computer's memory, there's other stuff potentially going on. And if I fill that in, ideally,
we'd want to just plop the number 4 here. If we're maintaining this kind of order. But recall in the
context of your computer's memory, there might be other stuff there.

- [1:07:26](https://www.youtube.com/watch?v=undefined&t=40046s) Some of these garbage values


that might be usable, but we don't really know or care what they are. As represented by Oscar here. But
there might actually be useful data in use. Like, if your program has not just a few integers in this array,
but also a string that says like, "Hello, world." It could be that your computer has plopped the H-E-L-L-O
W-O-R-L-D right after this array.

- [1:07:48](https://www.youtube.com/watch?v=undefined&t=40068s) Why? Well, maybe, you created


the array in one line of code and filled it with 1, 2, 3. Maybe the next line of code used GetString. Or
maybe just hard coded a string in your code for "Hello, world." And so you painted yourself into a corner,
so to speak. Now I think you might claim, well, let's just overwrite the H.

- [1:08:03](https://www.youtube.com/watch?v=undefined&t=40083s) But that's problematic for the


same reasons. We don't want to do that. So where else could the 4 go? Or how do we solve this problem
if we want to add a number, and there's clearly memory available. Because those garbage values are
junk that we don't care about anymore. So we could certainly reuse those.

- [1:08:20](https://www.youtube.com/watch?v=undefined&t=40100s) Where could the 4, and perhaps


this whole array, go? OK. So I'm hearing we could move it somewhere. Maybe, replace some of those
garbage values. And honestly, we have a lot of options. We could use any of these garbage values up
here. We could use any of these down here, or even further down. The point is there is plenty of
memory available as indicated by these Oscars, where we could put 4, maybe even, 5, 6 or more
integers.

- [1:08:43](https://www.youtube.com/watch?v=undefined&t=40123s) The catch is that we chose poorly


early on. Or we just got unlucky. And 1, 2, 3 ended up back-to-back with some other data that we care
about. All right, so that's fine. Let's go ahead and assume that we'll abstract away everything else. And
we'll plop the new array in this location here. So I'm going to go ahead and copy the 1 over.

- [1:09:00](https://www.youtube.com/watch?v=undefined&t=40140s) The 2 over. The 3 over. And then,


ultimately, once I'm ready to fill the 4, I can throw away, essentially, the old array at this point. Because I
have it now entirely in duplicate. And I can populate it with the number 4. All right. So problem solved.
That is a correct potential solution to this problem.

- [1:09:16](https://www.youtube.com/watch?v=undefined&t=40156s) But, what's the trade off? And this


is something we're going to start thinking about all the more. What's the downside of having solved this
problem in this way? Yeah. I'm adding a lot of running time. It took me a lot of effort to copy those
additional numbers. Now, granted, it's a small array.

- [1:09:30](https://www.youtube.com/watch?v=undefined&t=40170s) 3 numbers, who cares. It's going


to be over in the blink of an eye. But if we start talking about interesting data sets, web application data
sets, mobile app data sets. Where you have not just a few, but maybe a few hundred, few thousand, a
few million pieces of data. This is probably a suboptimal solution to just, oh, move all your data from one
place to another.

- [1:09:48](https://www.youtube.com/watch?v=undefined&t=40188s) Because who's to say that we're


not going to paint ourselves into a new corner. And it would feel like you're wasting all of this time
moving stuff around. And, ultimately, just costing yourself a huge amount of time. In fact, if we put this
now into the context of our Big O notation from a few weeks back, what might the running time now of
Search be for an array? Let's start simple.

- [1:10:09](https://www.youtube.com/watch?v=undefined&t=40209s) A throwback a couple of weeks


ago. If you're using an array, to recap, what was the running time of a Search algorithm in Big O
notation? So, maybe, in the worst case. If you've got n numbers, 3 in this case or 4, but n more generally.
Big O of what for Search? Yeah. What do you think? AUDIENCE: Big O of n.
- [1:10:28](https://www.youtube.com/watch?v=undefined&t=40228s) SPEAKER 1: Big O of n. And what's
your intuition for that? AUDIENCE: [INAUDIBLE]. SPEAKER 1: OK. Yeah. So if we go through each element,
for instance, from left to right, then Search is going to take Big O of n running time. If, though, we're
talking about these numbers, specifically. And now I'll explicitly stipulate that, yeah, they're sorted.

- [1:10:49](https://www.youtube.com/watch?v=undefined&t=40249s) Does that buy us anything? What


would the Big O notation be for Searching an array in this case, be it of size 3, or 4, or n, more generally.
AUDIENCE: Big O of n. SPEAKER 1: Big O of, not n, but rather? AUDIENCE: Log n. SPEAKER 1: Log n, right.
Because we could use, per week zero, binary search on an array like this. We'd have to deal with some
rounding.

- [1:11:07](https://www.youtube.com/watch?v=undefined&t=40267s) Because there's not a perfect


number of elements at the moment. But you could use binary search. Go to the middle roughly. And
then go left or right, left or right, until you find the element you care about. So Search remains in Big O of
log n when using arrays. But what about insertion, now? If we start to think about other operations.

- [1:11:23](https://www.youtube.com/watch?v=undefined&t=40283s) Like, adding a number to this


array, or adding a friend to your contacts app, or Google finding another page on the internet. So
insertion happens all the time. What's the running time of Insert? When it comes to inserting into an
existing array of size n. How many steps might that take? Big O of n. It would be, indeed, n.

- [1:11:43](https://www.youtube.com/watch?v=undefined&t=40303s) Why? Because in the worst case,


where you're out of space, you have to allocate, it would seem, a new array. Maybe, taking over some of
the previous garbage values. But the catch is, even though you're only inserting one new number, like
the number 4, you have to copy over all the darn existing numbers into the new one.

- [1:11:59](https://www.youtube.com/watch?v=undefined&t=40319s) So if your original array of size n,


the copying of that is going to take Big O of n plus 1. But we can throw away the plus 1 because of the
math we did in the past. So Insert now becomes Big O of n. And that might not be ideal. Because if
you're in the habit of inserting things frequently, that could start to add up, and add up, and add up.

- [1:12:16](https://www.youtube.com/watch?v=undefined&t=40336s) And this is why computer


programs, and websites, and mobile apps could be slow. If you're not being mindful of these trade offs.
So what about, just for good measure, Omega notation. And maybe, the best case. Well just to recap
here, we could get lucky and Search could just take one step. Because you might just get lucky, and boom
the number you're looking for is right there in the middle, if using binary search.

- [1:12:38](https://www.youtube.com/watch?v=undefined&t=40358s) Or even linear search, for that


matter. And Insert, too. If there's enough room, and we didn't have to move all of those numbers-- 1, 2, and
3, to a new location. You could get lucky. And we could have, as someone suggested, just put the number
4 right there at the end. And if we don't get lucky, it might take n steps.

- [1:12:54](https://www.youtube.com/watch?v=undefined&t=40374s) If we do get lucky, it might just


take the one, or constant number, of steps. In fact, let me go ahead and do this. How about we do
something like this? Let me switch over to some code here. Let me start to make a program called list.c.
And in list.c, let's start with the old way. So we follow the breadcrumbs we've laid for ourselves as
follows.
- [1:13:12](https://www.youtube.com/watch?v=undefined&t=40392s) So in this list.c, I'm going to
include stdio.h. int main(void) as usual. Then inside of my code here, I'm going to go ahead and
give myself the first version of memory. So int list[3] is now implemented at the moment, in an array. So
we're rewinding for now to week 2 style code. And then, let me just initialize this thing.

- [1:13:31](https://www.youtube.com/watch?v=undefined&t=40411s) At the first location will be 1. At


the next location will be 2. And at the last location will be 3. So the array is zero indexed always. I, for just
the sake of discussion though, am putting in the numbers 1, 2, 3, like a normal person might. All right. So
now let's just print these out. for int i gets 0.

- [1:13:48](https://www.youtube.com/watch?v=undefined&t=40428s) i less than 3, i++. Let's go ahead


now and print out using printf, %i\n, list[i]. So very simple program, inspired by what we did in week 2.
Just to create and then print out the contents of an array. So let's make list. So far, so good. ./list. And
voila, we see 1, 2, 3. Now let's start to practice some of what we're preaching with this new syntax.

- [1:14:15](https://www.youtube.com/watch?v=undefined&t=40455s) So let me go in now and get rid of


the array version. And let me zoom out a little bit to give ourselves some more space. And now let's
begin to create a list of size 3. So if I'm going to do this now, dynamically, so that I'm allocating these
things again and again, let me go ahead and do this.

- [1:14:35](https://www.youtube.com/watch?v=undefined&t=40475s) Let me give myself a list that's of


type int *, equal to the return value of malloc of 3 times the size of an int. So what this is going to do for me
is give me enough memory for that very first picture we drew on the board. Which was the array
containing 1, 2, and 3. But laying the foundation to be able to resize it, which was ultimately the goal.

- [1:14:59](https://www.youtube.com/watch?v=undefined&t=40499s) So my syntax is a little different


here. I'm going to use malloc and get memory from the so-called "heap", as we called it last week.
Instead of using the stack by just doing the previous version where I said, int list[3]. That is to say this line
of code from the first version is in some sense identical to this line of code in the second version.

- [1:15:20](https://www.youtube.com/watch?v=undefined&t=40520s) But the first line of code puts the


memory on the stack, automatically, for me. The second line of code, that I've left here now, is creating
an array of size 3, but it's putting it on the heap. And that's important because it was only on the heap
and via this new function last week, malloc, that you can actually ask for more memory, and even give it
back.

- [1:15:38](https://www.youtube.com/watch?v=undefined&t=40538s) When you just use the first


notation int list[3], you have permanently given yourself an array of size 3. You cannot add to that in code.
So let me go ahead and do this. If list == NULL, something went wrong. The computer's out of memory. So
let's just return 1 and quit out of this program. There's nothing to see here.

- [1:15:58](https://www.youtube.com/watch?v=undefined&t=40558s) So just a good error check there.


Now let me go ahead and initialize this list. So list[0] will be 1 again. list[1] will be 2. And list[2] will be
3. So that's the same kind of syntax as before. And notice this equivalence. Recall that there's this
relationship between chunks of memory and arrays.

- [1:16:18](https://www.youtube.com/watch?v=undefined&t=40578s) And arrays are really just doing


pointer arithmetic for you, where the square bracket notation is. So if I've asked myself here, in line 5, for
enough memory for 3 integers, it is perfectly OK to treat it now like an array using square bracket
notation. Because the computer will do the arithmetic for me and find the first location, the second, and
the third.

- [1:16:38](https://www.youtube.com/watch?v=undefined&t=40598s) If you really want to be cool and


hacker-like, well, you could say *list = 1, *(list + 1) = 2, *(list + 2) = 3. That's the same thing using very explicit, pointer
arithmetic, which we looked at briefly last week. But this is atrocious to look at for most people. It's just
not very user friendly. It's longer to type, so most people, even when allocating memory dynamically as I
did a second ago, would just use the more familiar notation of an array.

- [1:17:10](https://www.youtube.com/watch?v=undefined&t=40630s) All right. So let's go on. Now


suppose time passes and I realize, oh shoot, I really wanted this array to be of size 4 instead of size 3.
Now, obviously, I could just rewind and like fix the program. But suppose that this is a much larger
program. And I've realized, at this point, that I need to be able to dynamically add more things to this
array for whatever reason.

- [1:17:32](https://www.youtube.com/watch?v=undefined&t=40652s) Well let me go ahead and do this.


Let me just say, all right, list should actually be the result of asking for 4 chunks of memory from malloc.
And then, I could do something like this, list[3] = 4. Now this is buggy, potentially, in a couple of ways. But
let me ask first, what's really wrong, first, with this code? The goal at hand is to start with the array of
size 3 with the 1, 2, 3.

- [1:18:03](https://www.youtube.com/watch?v=undefined&t=40683s) And I want to add a number 4 to


it. So at the moment, in line 17, I've asked the computer for a chunk of 4 integers. Just like the picture.
And then I'm adding the number 4 to it. But I have skipped a few steps and broken this somehow. Yeah.
AUDIENCE: You don't know exactly [INAUDIBLE]. SPEAKER 1: Yeah.

- [1:18:22](https://www.youtube.com/watch?v=undefined&t=40702s) I don't necessarily know where


this is going to end up in memory. It's probably not going to be immediately adjacent to the previous
chunk. And so, yes, even though I'm putting the number 4 there, I haven't copied the 1, the 2, or the 3
over to this chunk of memory. So well let me fix-- well, that's actually, indeed, really the essence of the
problem.

- [1:18:40](https://www.youtube.com/watch?v=undefined&t=40720s) I am orphaning the original chunk


of memory. If you think of the picture that I drew earlier, the line of code up here on line 5 that allocates
space for the initial 3 integers. This code is fine. This code is fine. But as soon as I do this, I'm clobbering
the value of list. And saying no, don't point at this chunk of memory.

- [1:19:01](https://www.youtube.com/watch?v=undefined&t=40741s) Point at this chunk of memory, at


which point I've forgotten if you will, where the original chunk of memory is. So the right way to do
something like this, would be a little more involved. Let me go ahead and give myself a temporary
variable. And I'll literally call it tmp. T-M-P, like I did last week.

- [1:19:18](https://www.youtube.com/watch?v=undefined&t=40758s) So that I can now ask the


computer for a completely different chunk of memory of size 4. I'm going to again say if tmp equals NULL,
I'm going to say bad things happened here. So let me just return 1. And you know what, just to be tidy,
let me free the original list before I quit. Because remember from last week, any time you use malloc you
eventually have to use free.
- [1:19:38](https://www.youtube.com/watch?v=undefined&t=40778s) But this chunk of code here is just
a safety check. If there's no more memory, there's nothing to see here. I'm just going to clean up my
state and quit. But now, if I have asked for this chunk of memory, now I can do this: for int i gets 0, i is less
than 3, i++. What if I do something like this? tmp[i] equals list[i].

- [1:20:04](https://www.youtube.com/watch?v=undefined&t=40804s) That would seem to have the


effect of copying all of the memory from one to the other. And then, I think I need to do one last thing
tmp[3] gets the number 4, for instance. Again, I'm hard coding the numbers for the sake of discussion.
After I've done this, what could I now do? I could now set list equal to tmp.

- [1:20:28](https://www.youtube.com/watch?v=undefined&t=40828s) And now, I have updated my


list properly. So let me go ahead and do this: for int i gets 0, i is less than 4, i++. Let me go ahead
and print each of these elements out with %i using list[i]. And then, I'm going to return 0 just to signify
that all is successful. Now so to recap, we initialize the original array of size 3 and plug in the values 1, 2,
3.

- [1:20:53](https://www.youtube.com/watch?v=undefined&t=40853s) Time passes. And then, I realize,


wait a minute, I need more space. And so I asked the computer for a second chunk of memory. This one
of size 4. Just as a safety check, I make sure that tmp doesn't equal NULL. Because if it does I'm out of
memory. So I should just quit altogether. But once I'm sure that it's not null, I'm going to copy all the
values from the old list into the new list.

- [1:21:13](https://www.youtube.com/watch?v=undefined&t=40873s) And then, I'm going to add my


new number at the end of that list. And then, now that I'm done playing around with this temporary
variable, I'm going to remember in my list variable what the address is of this new chunk of memory.
And then, I'm going to print all of those values out. So at least, aesthetically, when I make this new
version of my list, except for my missing semicolon.

- [1:21:34](https://www.youtube.com/watch?v=undefined&t=40894s) Let me try this again. When I


make list, oh, OK. What did I do this time? Implicitly declaring a library function malloc. What's my
mistake any time you see that kind of error? AUDIENCE: Library. SPEAKER 1: Yeah. A library. So up here, I
forgot to do include stdlib.h, which is where malloc lives. Let me go ahead and, again, do make list.

- [1:21:54](https://www.youtube.com/watch?v=undefined&t=40914s) There we go. So I fixed that.


./list. And I should see 1, 2, 3, 4. But there's still a bug here. Does anyone see the-- bug or
question? AUDIENCE: You forgot to free them. SPEAKER 1: I'm sorry, say again. AUDIENCE: You forgot to
free them. SPEAKER 1: I forgot to free the original list. And we could see this, even if not just with our
own eyes or intuition.

- [1:22:16](https://www.youtube.com/watch?v=undefined&t=40936s) If I do something like Valgrind of


./list, remember our tool from this past week. Let me increase the size of my terminal window,
temporarily. The output is crazy cryptic at first. But, notice that I have definitely lost some number of
bytes here. And indeed, it's even pointing at the line number in which some of those bytes were lost.

- [1:22:34](https://www.youtube.com/watch?v=undefined&t=40954s) So let me go ahead and back to


my code. And indeed, I think what I need to do is, before I clobber the value of list pointing it at this new
chunk of memory instead of the old, I think I now need to first, proactively, say free the old list of
memory. And then, change its value. So if I now do make list and do ./list, the output is still the
same.

- [1:22:57](https://www.youtube.com/watch?v=undefined&t=40977s) And, if I cross my fingers and run


Valgrind again after increasing my window size, hopefully here. Oh, still a bug. So better. It seems like less
memory is lost. What have I now forgotten to do? AUDIENCE: You forgot to free the end. SPEAKER 1: I
forgot to free it at the very end, too. Because I still have a chunk of memory that I got from malloc.

- [1:23:19](https://www.youtube.com/watch?v=undefined&t=40999s) So let me go to the very bottom


of the program now. And after I'm done senselessly just printing this thing out, let me free the new list.
And now let me do make list, ./list. It still works, visually. Now let's do Valgrind of ./list, Enter. And
now, hopefully, all heap blocks were freed.

- [1:23:43](https://www.youtube.com/watch?v=undefined&t=41023s) No leaks are possible. So this is


perhaps the best output you can see from a tool like Valgrind. I used the heap, but I freed all the
memory as well. So there were 2 fixes needed there. All right. Any questions then on this array-based
approach, the first of which is statically allocating an array, so to speak.

- [1:23:59](https://www.youtube.com/watch?v=undefined&t=41039s) By just hard coding the number 3.


The second version now is dynamically allocating the array, using not the stack but the heap. But, it too,
suffers from the slowness we described earlier, of having to copy all those values from one to the other.
OK. A hand was over here. AUDIENCE: Why do you not have to free the tmp? SPEAKER 1: Good
question.

- [1:24:18](https://www.youtube.com/watch?v=undefined&t=41058s) Why did I not have to free the


tmp? I essentially did, eventually. Because tmp was pointing at the chunk of 4 integers. But on line 33
here, I assigned list to be identical to what tmp was pointing at. And so, when I finally freed the list, that
was the same thing as freeing tmp. In fact, if I wanted to, I could say free tmp here and it would be the
same.

- [1:24:44](https://www.youtube.com/watch?v=undefined&t=41084s) But conceptually, it's wrong.


Because at this point in the story, I should be freeing the actual list, not that temporary variable. But they
were the same at that point in the story. Yeah. AUDIENCE: Is [? the line ?] part of it? SPEAKER 1: Good
question. And long story short, everything we're doing thus far is still in the world of arrays.

- [1:25:00](https://www.youtube.com/watch?v=undefined&t=41100s) The only distinction we're making


is that in version 1, when I said int list [3], that was an array of fixed size. So-called statically allocated on
the stack, as per last week. This version now is still dealing with arrays, but I'm flexing my muscles and
using dynamic memory allocation. So that I can still use an array per the first pictures we started talking
about.

- [1:25:22](https://www.youtube.com/watch?v=undefined&t=41122s) But I can at least grow the array if


I want. So we haven't even now solved this, even better in a sense, with linked lists. That's going to come
next. Yeah. AUDIENCE: How are you able to free list and then still make list? SPEAKER 1: How am I able to
free list? I freed the original address of list.

- [1:25:42](https://www.youtube.com/watch?v=undefined&t=41142s) I, then, changed what list is


storing. I'm moving its arrow to a new chunk of memory. And that is perfectly reasonable for me to now
manipulate because now list is pointing at the same chunk as tmp. And tmp is what was given the return
value of malloc, the second time. So that chunk of memory is valid.

- [1:26:02](https://www.youtube.com/watch?v=undefined&t=41162s) So these are just squares on the


board, right. There's just pointers inside of them. So what I'm technically saying is,
I'm not freeing list per se, I am freeing the chunk of memory that begins at the address currently in list.
Therefore, if a few lines later, I change what the address is in list.

- [1:26:22](https://www.youtube.com/watch?v=undefined&t=41182s) Totally reasonable to then touch


that memory, and eventually free it later. Because you're not freeing the variable per se, you're freeing
the memory at the address in the variable. Good distinction. All right. So let me back up here and now make one final
edit. So let's finish this with one final improvement here.

- [1:26:42](https://www.youtube.com/watch?v=undefined&t=41202s) Because it turns out, there's a


somewhat better way to actually resize an array as we've been doing here. And there's another function
in stdlib that's called realloc, for re-allocate. And I'm just going to go in and make a little bit of a change
here so that I can do the following. Let me go ahead and first comment this now, just so we can keep
track of what's been going on this whole time.

- [1:27:03](https://www.youtube.com/watch?v=undefined&t=41223s) So dynamically allocate an array


of size 3. Assign 3 numbers to that array. Time passes. Allocate new array of size 4. Copy numbers from
old array into new array. And add fourth number to new array. Free old array. Remember, if you will, new
array using my same list variable. And now, print new array. Free new array.

- [1:27:49](https://www.youtube.com/watch?v=undefined&t=41269s) Hopefully, that helps. And we'll


post this code online after, too, which tells a more explicit story. So it turns out that we can reduce some of
the labor involved with this. Not so much with the printing here, but with this copying. Turns out C does
have a function called realloc, that can actually handle the resizing of an array for you, as follows.

- [1:28:07](https://www.youtube.com/watch?v=undefined&t=41287s) I'm going to scroll up to where I


previously allocated a new array of size 4. And I'm instead going to say this, resize old array to be of size
4. Now, previously this wasn't necessarily possible. Because recall that we had painted ourselves into a
corner with the example on the screen where "Hello, world" happened to be right after the original
array.

- [1:28:28](https://www.youtube.com/watch?v=undefined&t=41308s) But let me do this. Let me use


realloc, for re-allocate. And pass in not just the size of memory we want this time, but also the address
that we want to resize. Which, again, is this array called list. All right. The code thereafter is pretty much
the same. But what I don't need to do is this. So realloc is a pretty handy function that will do the
following.

- [1:28:54](https://www.youtube.com/watch?v=undefined&t=41334s) If at the very beginning of class,


when we had 1, 2, 3 on the board. And someone's instinct was to just plop the 4 right at the end of the
list. If there's available memory, realloc will just do that. And boom, it will just grow the array for you in
the computer's memory. If, though, it realizes, sorry, there's already a string like "Hello, world" or
something else there, realloc will handle the trouble of moving that whole array from one chunk of
memory, originally, to a new chunk of memory.
- [1:29:20](https://www.youtube.com/watch?v=undefined&t=41360s) And then realloc will return to
you, the address of that new chunk of memory. And it will handle the process of freeing the old chunk
for you. So you do not need to do this yourself. So in fact, let me go ahead and get rid of this as well. So
realloc just condenses, a lot of what we just did, into a single function.

- [1:29:42](https://www.youtube.com/watch?v=undefined&t=41382s) Whereby, realloc handles it for


you. All right. So that's the final improvement on this array-based approach. So what now, knowing what
your memory is, what can we now do with it that solves that kind of problem? Because the world is
going to get really slow. And our apps, and our phones, and our computers are going to get really slow, if
we're just constantly wasting time moving things around in memory.

- [1:30:04](https://www.youtube.com/watch?v=undefined&t=41404s) What could we perhaps do


instead? Well there's one new piece of syntax today that builds on these 3 pieces of syntax from the
past. Recall, that we've looked at struct, which is a keyword in C, that just lets you invent your own
structure. Your own variable, if you will, in conjunction with typedef.

- [1:30:20](https://www.youtube.com/watch?v=undefined&t=41420s) Which lets you say a person has a


name and a number, or something like that. Or a candidate has a name and some number of votes. You
can encapsulate multiple pieces of data inside of just one using struct. What did we use the Dot Notation
for now, a couple of times? What does the Dot operator do in C? AUDIENCE: Access the structure.

- [1:30:39](https://www.youtube.com/watch?v=undefined&t=41439s) SPEAKER 1: Perfect. To access the


field inside of a structure. So if you've got a person with a name and a number, you could say something
like person.name or person.number, if person is the name of one such variable. Star, of course, we've
seen now in a few ways. Like way back in week 1, we saw it as like, multiplication.

- [1:30:55](https://www.youtube.com/watch?v=undefined&t=41455s) Last week, we began to see it in


the context of pointers, whereby, you use it to declare a pointer. Like, int* p, or something like that. But
we also saw it in one other context, which was like the opposite, which was the dereference operator.
Which says if this is an address, that is if this is a variable like a pointer, and you put a star in front of it
then with no int or no char, no data type in front of it.

- [1:31:17](https://www.youtube.com/watch?v=undefined&t=41477s) That means go to that address.


And it dereferences the pointer and goes to that location. So it turns out that using these 3 building
blocks, you can actually start to now use your computer's memory almost any way you want. And even
next week, when we transition to Python, and you start to get a lot of features for free.

- [1:31:34](https://www.youtube.com/watch?v=undefined&t=41494s) Like a single line of code will just


do so much more in Python than it does in C. It boils down to those basic primitives. And just so you've
seen it already. It turns out that it's so common in C to use this operator to go inside of a structure and
this operator to go to an address, that there's shorthand notation for it, a.k.a.

- [1:31:54](https://www.youtube.com/watch?v=undefined&t=41514s) syntactic sugar. That literally looks


like an arrow. So recall last week, I was in the habit of pointing, even with the big foam finger. This arrow
notation, a hyphen and an angled bracket, denotes going to an address and looking at a field inside of it.
But we'll see this in practice in just a bit. So what might be the solution, now, to this problem we saw a
moment ago whereby, we had painted ourselves into a corner.
- [1:32:20](https://www.youtube.com/watch?v=undefined&t=41540s) And our memory, a few moments
ago, looked like this. We could just copy the whole existing array to a new location, add the 4, and go
about our business. What would another, perhaps better solution longer term be, that doesn't require
constantly moving stuff around? Maybe hang in there for your instincts if you know the buzz phrase
we're looking for from past experience, hang in there.

- [1:32:45](https://www.youtube.com/watch?v=undefined&t=41565s) But if we want to avoid moving


the 1, 2, and the 3, but we still want to be able to add endless amounts of data. What could we do?
Yeah. So maybe create some kind of list using pointers that just point at a new location, right. In an ideal
world, even though this piece of memory is being used by this h in the string "Hello, world", maybe we
could somehow use a pointer from last week.

- [1:33:05](https://www.youtube.com/watch?v=undefined&t=41585s) Like an arrow, that says after the


3, oh I don't know, go down over here to this location in memory. And you just stitch together these
integers in memory so that each one leads to the next. It's not necessarily the case that it's literally back-
to-back. That would have the downside, it would seem, of costing us a little bit of space.

- [1:33:25](https://www.youtube.com/watch?v=undefined&t=41605s) Like a pointer, which recall, takes


up some amount of space. Typically 8 bytes or 64 bits. But I don't have to copy potentially a huge
amount of data just to add one more number. And so these things do have a name. And indeed, these
things are what generally would be called a linked list. A linked list captures exactly that intuition of
linking together things in memory.

- [1:33:47](https://www.youtube.com/watch?v=undefined&t=41627s) So let's take a look at an example.


Here's a computer's memory in the abstract. Suppose that I'm trying to create an array. Let's generalize it
as a list, now, of numbers. An array has a very specific meaning. It's memory that's contiguous, back, to
back, to back. At the end of the day, I as the programmer, just care about the data-- 1, 2, 3, 4, and so
forth.

- [1:34:06](https://www.youtube.com/watch?v=undefined&t=41646s) I don't really care how it's stored.


I don't care how it's stored when I'm writing the code, I just want it to work at the end of the day. So
suppose that I first insert my number 1. And, who knows, it ends up, up there at location, 0x123, for the
sake of discussion. All right. Maybe there's something already here.

- [1:34:24](https://www.youtube.com/watch?v=undefined&t=41664s) And heck, maybe there's


something already here, but there's plenty of other options for where this thing can go. And suppose
that, for the sake of discussion, the first available spot for the next number happens to be over here at
location 0x456, for the sake of discussion. So that's where I'm going to plop the number 2.

- [1:34:40](https://www.youtube.com/watch?v=undefined&t=41680s) And where might the number 3


end up? Oh I don't know, maybe down over there at 0x789. The point being, I don't know what is, or
really care about, everything else that's in the computer's memory. I just care that there are at least 3
locations available where I can put my 1, my 2, and my 3.

- [1:34:58](https://www.youtube.com/watch?v=undefined&t=41698s) But the catch is, now that we're


not using an array, we can't just naively assume that you just add 1 to an index and boom, you're at the
next number. Add 2 to an index, and boom you're at the next, next number. Now you have to leave these
little breadcrumbs, or use the arrow notation, to lead from one to the other.
- [1:35:17](https://www.youtube.com/watch?v=undefined&t=41717s) And sometimes, it might be close,
a few bytes away. Maybe, it's a whole gigabyte away in an even bigger computer's memory. So how
might I do this? Like where do these pointers go, as you proposed? All right. All I have access to here are
bytes. I've already stored the 1, the 2, and the 3. So what more should I do? OK, yeah.

- [1:35:38](https://www.youtube.com/watch?v=undefined&t=41738s) So let me, you put the pointers


right next to these numbers. So let me at least plan ahead, so that when I ask the computer like malloc,
recall from last week, for some memory, I don't just ask it now for space for just the number. Let me start
getting into the habit of asking malloc for enough space for the number and a pointer to another such
number.

- [1:35:57](https://www.youtube.com/watch?v=undefined&t=41757s) So it's a little more aggressive of


me to ask for more memory. But I'm planning ahead. And here is an example of a trade off. Almost any
time in CS, when you start using more space, you can save time. Or if you try to conserve space, you
might have to lose time. There's always that trade-off.

- [1:36:12](https://www.youtube.com/watch?v=undefined&t=41772s) So how might I solve this? Well let


me abstract this away. And either next to or below, I'm just drawing it vertically, just for the sake of
discussion. So the arrows are a bit prettier. I've asked malloc for now twice as much space, it would
seem, than I previously needed. But I'm going to use this second chunk of memory to refer to the next
number.

- [1:36:31](https://www.youtube.com/watch?v=undefined&t=41791s) And I'm going to use this chunk of


memory to refer to the next, essentially, stitching this thing together. So what should go in this first box?
Well, I claim the number, 0X456. And it's written in hex because it represents a memory address. But this
is the equivalent of drawing an arrow from one to the other.

- [1:36:48](https://www.youtube.com/watch?v=undefined&t=41808s) As a little check here, what should


go in this second box if the goal is to stitch these together in order 1, 2, 3? Feel free to just shout this
out. AUDIENCE: 0X789. SPEAKER 1: OK, that worked well. So 0X789, indeed. And you can't do that with
the hands because I can't count that fast. So 0X789 should go here because that's like a little breadcrumb
to the next.

- [1:37:09](https://www.youtube.com/watch?v=undefined&t=41829s) And then, we don't really have


terribly many possibilities here. This has to have a value, right. Because at the end of the day, it's got to
use its 64 bits in some way. So what value should go here, if this is the end of this list? AUDIENCE: 0.
SPEAKER 1: So it could be 0X123. The implication being that it would be a cyclical list.

- [1:37:30](https://www.youtube.com/watch?v=undefined&t=41850s) Which is OK, but potentially


problematic. If any of you have accidentally lost control over your codespace because you had an infinite
loop, this would seem a very easy way to give yourself the accidental possibility of an infinite loop. What
might be simpler than that and ward that off? AUDIENCE: Null. SPEAKER 1: Say again? AUDIENCE: Null.

- [1:37:49](https://www.youtube.com/watch?v=undefined&t=41869s) SPEAKER 1: So just the null


character. Not N-U-L, confusingly, which is at the end of strings. But N-U-L-L, as we introduced it last
week. Which is the same as 0x0. So this is just a special value that programmers decades ago decided
that if you store the address 0, that's not a valid address. There's never going to be anything useful at
0x0.
- [1:38:08](https://www.youtube.com/watch?v=undefined&t=41888s) Therefore, it's a sentinel value,
just a special value, that indicates that's it. There's nowhere further to go. It's OK to come back to your
suggestion of making a cyclical list. But we'd better be smart enough to, maybe, remember where did
the list start so that you can detect cycles.

- [1:38:24](https://www.youtube.com/watch?v=undefined&t=41904s) If you start looping around in this


structure, otherwise. All right. But these addresses, who really cares at the end of the day if we abstract
this away. It really just now looks like this. And indeed, this is how most anyone would draw this on a
whiteboard if having a discussion at work. Talking about what data structure we should use to solve
some problem in the real world.

- [1:38:40](https://www.youtube.com/watch?v=undefined&t=41920s) We don't care generally about the


addresses. We care that in code we can access them. But in terms of the concept alone this would be,
perhaps, the right way to think about this. All right, let me pause here and see if there's any questions on
this idea of creating a linked list in memory by just storing, not just the numbers like 1, 2, 3, but twice as
much data.

- [1:39:00](https://www.youtube.com/watch?v=undefined&t=41940s) So that you have little


breadcrumbs in the form of pointers that can lead you from one to the next. Any questions on these
linked lists? Any questions? No? All right. Oh, yeah. Over here. AUDIENCE: So does this take more
memory than an array? SPEAKER 1: This does take more memory than an array because I now need
space for these pointers.

- [1:39:24](https://www.youtube.com/watch?v=undefined&t=41964s) And to be clear, I technically


didn't really draw this to scale. Thus far, in the class, we've generally thought about integers like, 1, 2 and
3, as being 4 bytes, or 32 bits. I made the claim last week that on modern computers, pointers tend to be
8 bytes or 64 bits. So, technically, this box should actually be a little bigger.

- [1:39:43](https://www.youtube.com/watch?v=undefined&t=41983s) It was just going to look a little


stupid in the picture. So I abstracted it away. But, indeed, you're using more space as a result. AUDIENCE:
[INAUDIBLE]. SPEAKER 1: Oh, how does-- sorry. How does the computer identify useful data from used
data? So, for instance, garbage values or non-garbage values.

- [1:39:58](https://www.youtube.com/watch?v=undefined&t=41998s) For now, think of that as the job


of malloc. So when you ask malloc for memory, as we started to last week, malloc keeps track of the
addresses of the memory it has handed to you as valid values. The other type of memory you use, not just
from the heap. Because recall we briefly discussed that malloc uses space from the heap, which was
drawn at the top of the picture, pointing down.

- [1:40:19](https://www.youtube.com/watch?v=undefined&t=42019s) There's also stack memory, which


is where all of your local variables go. And where all of the memory used by individual functions goes. And
that was drawn in the picture as working its way up. That's just an artist's rendition of direction. The
compiler, essentially, will also help keep track of which values are valid or not inside of the stack.

- [1:40:37](https://www.youtube.com/watch?v=undefined&t=42037s) Or really the underlying code that


you've written will keep track of that for you. So it's managed for you at that point. All right. Good
question. Sorry it took me a bit to catch on. So let's now translate this to actual code. How could we
implement this idea of, let's call these things nodes.
- [1:40:52](https://www.youtube.com/watch?v=undefined&t=42052s) And that's a term of art in CS.
Whenever you have some data structure that encapsulates information, node, N-O-D-E, is the generic
term for that. So each of these might be said to be a node. Well, how can we do this? Well a couple of
weeks ago, we saw how we could represent something like a student or a candidate.

- [1:41:08](https://www.youtube.com/watch?v=undefined&t=42068s) And a student, or rather a person,


we said has a name and a number. And we used a few pieces of syntax here. One, we use the struct
keyword, which gives us a data structure. We use typedef, which defines the name person to be our new
data type representing that whole structure. So we probably have the right ingredients here to build up
this thing called a node.

- [1:41:29](https://www.youtube.com/watch?v=undefined&t=42089s) And just to be clear, what should


go inside of one of these nodes, do we think? It's not going to be a name or a number, obviously. But
what should a node have in terms of those fields, perhaps? Yeah? AUDIENCE: [? Data. ?] SPEAKER 1: So a
number like a number and a pointer in some form. So let's translate this to actual code.

- [1:41:46](https://www.youtube.com/watch?v=undefined&t=42106s) So let's rename person to node to


capture this notion here. And the number is easy. If it's just going to be an int, that's fine. We can just say
int number, or int n, or whatever you want to call that particular field. The next one is a little non-
obvious. And this is where things get a little weird at first, but, in retrospect, it should all fit together.

- [1:42:05](https://www.youtube.com/watch?v=undefined&t=42125s) Let me propose that, ideally, we


would say something like node* next. And I could call the word next anything I want. Next just means
what comes after me, which is the notion I'm using it for. So a lot of CS people would just use next to represent
the name of this pointer. But there's a catch here. C and C compilers are pretty naive, recall.

- [1:42:26](https://www.youtube.com/watch?v=undefined&t=42146s) They only look at code top to


bottom, left to right. And any time they encounter a word they have never seen before, bad things
happen. Like, you can't compile your code. You get some cryptic error message or the like. And that
seems to be about to happen here. Because if the compiler is reading this code from top to bottom, it's
going to say, oh, inside of this struct should be a variable called next.

- [1:42:47](https://www.youtube.com/watch?v=undefined&t=42167s) Which is of type node*. What the


heck is a node? Because it literally does not find out until 2 lines later, after that semicolon. So the way to
avoid this, which we haven't quite seen before, is that you can temporarily name this whole thing up
here, struct node. And then, down here inside of the data structure, you say struct node*.

- [1:43:08](https://www.youtube.com/watch?v=undefined&t=42188s) And then, you leave the rest


alone. This workaround is possible because now you're teaching the compiler, from the first line,
that here comes a data structure called struct node. Down here, you're shortening the name of this
whole thing to just node. Why? It's just a little more convenient than having to write struct everywhere.

- [1:43:26](https://www.youtube.com/watch?v=undefined&t=42206s) But you do have to write struct


node* inside of the data structure. But that's OK because it's already come into existence now, as of that
first line of code. So that's the only fundamental difference between what we did last week with a
person or a candidate. We just now have to use this struct workaround, syntactically.
- [1:43:45](https://www.youtube.com/watch?v=undefined&t=42225s) All right. Yeah, question.
AUDIENCE: So [INAUDIBLE] have like right next to the [INAUDIBLE] point to another [INAUDIBLE].
SPEAKER 1: Why is the next variable a struct node* pointer and not an int* pointer, for instance? So
think about the picture we are trying to draw. Technically, yes, each of these arrows I deliberately drew is
pointing at the number.

- [1:44:07](https://www.youtube.com/watch?v=undefined&t=42247s) But that's not alone. They need to


point at the whole data structure in memory. Because the computer, ultimately, and the compiler, in
turn, needs to know that this chunk of memory is not just an int. It is a whole node. Inside of a node is a
number and also another pointer. So when you draw these arrows, it would be incorrect to point at just
the number.

- [1:44:27](https://www.youtube.com/watch?v=undefined&t=42267s) Because that throws away


information that would leave the compiler wondering, OK, I'm at a number. Where the heck is the
pointer? You have to tell it that it's pointing at a whole node so it knows a few bytes away is that
corresponding pointer. Good question. Yeah. AUDIENCE: How do you [INAUDIBLE]. SPEAKER 1: Really
good question.

- [1:44:43](https://www.youtube.com/watch?v=undefined&t=42283s) It would seem that just as copying


the array earlier required twice as much memory, because we copied from old to new. So, technically,
twice as much plus 1 for the new number. Here, too, it looks like we're using twice as much memory,
also. And to my comment earlier, it's even more than twice as much memory because these pointers are
8 bytes, and not just 4 bytes like a typical integer is.

- [1:45:03](https://www.youtube.com/watch?v=undefined&t=42303s) The differences are these. In the


context of the array, you were using that memory temporarily. So, yes, you needed twice as much
memory. But then you were quickly freeing the original array. So you weren't consuming long-term,
more memory than you might need. The difference here, too, is that, as we'll see in a moment, it turns
out it's going to be relatively quick for me, potentially, to insert new numbers in here.

- [1:45:25](https://www.youtube.com/watch?v=undefined&t=42325s) Because I'm not going to have to


do a huge amount of copying. And even though I might still have to follow all of these arrows, which is
going to take some amount of time, I'm not going to have to be asking for more memory, freeing more
memory. And certain operations in the computer, anything involving asking for or giving back memory,
tends to be slower.

- [1:45:42](https://www.youtube.com/watch?v=undefined&t=42342s) So we get to avoid that situation


as well. There's going to be some downsides, though. This is not all upside. But we'll see in a bit just
what some of those trade offs actually are. All right. So from here, if we go back to the structure in code
as we left it, let's start to now build up a linked list with some actual code.

- [1:45:59](https://www.youtube.com/watch?v=undefined&t=42359s) How do you go about, in C,


representing a linked list in code? Well, at the moment, it would actually be as simple as this. You declare
a variable, called list, for instance. That itself stores the address of a node. That's what node* means. The
address of a node. So if you want to store a linked list in memory, you just create a variable called list, or
whatever else.
- [1:46:20](https://www.youtube.com/watch?v=undefined&t=42380s) And you just say that this variable
is going to be pointing at the first node in a list, wherever it happens to end up. Because malloc is
ultimately going to be the tool that we use just to go get at any one particular node in memory. All right.
So let's actually do this in pictorial form. When you write a line of code, like I just did here-- and I do not
initialize it to anything with the assignment operator, an equal sign.

- [1:46:44](https://www.youtube.com/watch?v=undefined&t=42404s) It does exist in memory as a box,


as I'll draw it here, called list. But I've deliberately drawn Oscar inside of it. Why? To connote what
exactly? AUDIENCE: Garbage value. SPEAKER 1: It's a garbage value. I have been allocated the variable in
memory, called list. Which is going to give me 64 bits or 8 bytes somewhere drawn here with this box.

- [1:47:05](https://www.youtube.com/watch?v=undefined&t=42425s) But if I myself have not used the


assignment operator, it's not going to get magically initialized to any particular address for me. It's not
going to even give me a node. This is literally just going to be an address of a future node that exists. So
what would be a solution here? Suppose that I'm beginning to create my linked list, but I don't have any
nodes yet.

- [1:47:25](https://www.youtube.com/watch?v=undefined&t=42445s) What would be a sensible thing to


initialize the list to, perhaps? AUDIENCE: Null. SPEAKER 1: Yeah, again. AUDIENCE: To null. SPEAKER 1: So
just null, right. When in doubt with pointers, generally it's a good thing to initialize things to null, so at
least it's not a garbage value. It's a known value.

- [1:47:39](https://www.youtube.com/watch?v=undefined&t=42459s) Invalid, yes. But it's a special value


you can then check for with a conditional, or the like. So this might be a better way to create a linked list,
even before you've inserted any numbers into the thing itself. All right. So after that, how can we go
about adding something to this linked list? So now the story looks like this.

- [1:47:57](https://www.youtube.com/watch?v=undefined&t=42477s) Oscar is gone because inside of


this box is all zero bits. Just because it's nice and clean, and this represents an empty linked list. Well, if I
want to add the number 1 to this linked list, what could I do? Well, perhaps I could start with code like
this. Borrowing inspiration from last week. Let's ask malloc for enough space for the size of a node.

- [1:48:16](https://www.youtube.com/watch?v=undefined&t=42496s) And this gets to your question


earlier, like, what is it I'm manipulating here? I don't just need space for an int and I don't just need
space for a pointer. I need space for both. And I gave that thing a name, node. So sizeof(node) figures out
and does the arithmetic for me. And gives me back the right number of bytes.

- [1:48:33](https://www.youtube.com/watch?v=undefined&t=42513s) This, then, stores the address of


that chunk of memory in what I'll temporarily call n. Just to represent a generic new node. And it's of
type node*. Because just like last week when I asked malloc for enough space for an int and I stored it in
an int* pointer. This week, if I'm asking for memory for a node, I'm storing it in a node* pointer.

- [1:48:53](https://www.youtube.com/watch?v=undefined&t=42533s) So technically, nothing new there


except for this new term of art in data structure called node. All right. So what does that do for me? It
essentially draws a picture like this in memory. I still have my list variable from my previous line of code
initialized to null. And that's why I've drawn it blank.
- [1:49:09](https://www.youtube.com/watch?v=undefined&t=42549s) I also now have a temporary
variable called n, which I initialize to the return value of malloc. Which gave me one of these nodes in
memory. But I've drawn it having garbage values, too, because I don't know what int is there. I don't
know what pointer is there. It's garbage values because malloc does not magically initialize memory for
me.

- [1:49:27](https://www.youtube.com/watch?v=undefined&t=42567s) There is another function for that.


But malloc alone just says, sure, use this chunk of memory. Deal with whatever is there. So how can I go
about initializing this to known values? Well, suppose I want to insert the number 1 and then, leave it at
that. A list of size 1, I could do something like this. And this is where you have to think back to some of
these basics.

- [1:49:47](https://www.youtube.com/watch?v=undefined&t=42587s) My conditional here is asking the


question if n does not equal null. So that is, if malloc gave me valid memory, and I don't have to quit
altogether because my computer's out of memory. If n does not equal null, but is equal to a valid address,
I'm going to go ahead and do this. And this is cryptic looking syntax now.

- [1:50:06](https://www.youtube.com/watch?v=undefined&t=42606s) But does someone want to take a


stab at translating this inside line of code to English, in some sense? How might you explain what that
inner line of code is doing? (*n).number equals 1. Let me go further back. Nope? OK, over here. Yeah.
AUDIENCE: [INAUDIBLE]. SPEAKER 1: Perfect. The place that n is pointing to, set it equal to 1.

- [1:50:30](https://www.youtube.com/watch?v=undefined&t=42630s) Or using the vernacular of going


there, go to the address in n and set its number field to 1. However you want to think about it, that's
fine. But the * again is the dereference operator here. And we're doing the parentheses, which we
haven't needed to do before because we haven't dealt with pointers and data structures together until
today.

- [1:50:47](https://www.youtube.com/watch?v=undefined&t=42647s) This just means go there first. And


then once you're there, go access number. You don't want to do one thing before the other. So this is just
enforcing order of operations. The parentheses just like in grade school math. All right. So this line of
code is cryptic. It's ugly. It's not something most people easily remember.

- [1:51:03](https://www.youtube.com/watch?v=undefined&t=42663s) Thankfully, there's that syntactic


sugar that simplifies this line of code to just this. And this, even though it's new to you today, should
eventually feel a little more familiar. Because this now is shorthand notation for saying, start at n. Go
there as by following the arrow. And when you get there, change the number field.

- [1:51:20](https://www.youtube.com/watch?v=undefined&t=42680s) In this case, to 1. So most people


would not write code like this. It's just ugly. It's a couple extra keystrokes. This just looks more like the
artist's renditions we've been talking about. And how most CS people would think about pointers as
really just being arrows in some form. All right.

- [1:51:37](https://www.youtube.com/watch?v=undefined&t=42697s) So what have we just done? The


picture now, after setting number to 1, looks a little something like this. So there's still one step missing.
And that's, of course, to initialize, it would seem, the pointer in this new node to something known like
null. So I bet we could do this like this. With a different line of code, I'm just going to say if n does not
equal null, then set n's next field to null.
- [1:52:00](https://www.youtube.com/watch?v=undefined&t=42720s) Or more pedantically, go to n,
follow the arrow, and then update the next field that you find there to equal null. And again, this is just
doing some nice bookkeeping. Technically speaking, we might not need to set this to null if we're going
to keep adding more and more numbers to it. But I'm doing it step-by-step so that I have a very clean
picture.

- [1:52:20](https://www.youtube.com/watch?v=undefined&t=42740s) And there's no bugs in my code at


this point. But I'm still not done. There's one last thing I'm going to have to do here. If the goal,
ultimately, was to insert the number 1 into my linked list, what's the last step I should, perhaps, do here?
Just in English is fine. Yeah. AUDIENCE: Set the pointer value to null.

- [1:52:41](https://www.youtube.com/watch?v=undefined&t=42761s) SPEAKER 1: Yes. I now need to


update the actual variable, that represents my linked list, to point at this brand new node. That is now
perfectly initialized as having an integer and a null pointer. Yeah, technically, this is already pointing
there. But I described this deliberately earlier as being temporary.

- [1:52:58](https://www.youtube.com/watch?v=undefined&t=42778s) I just needed this to get it back


from malloc and clean things up, initially. This is the long term variable I care about. So I'm going to want
to do something simple like this. List equals n. And this seems a little weird that list equals n. But again,
think about what's inside this box. At the moment this is null because there is no linked list at the
beginning of our story.

- [1:53:17](https://www.youtube.com/watch?v=undefined&t=42797s) N is the address of the beginning,


and it turns out, end of our linked list. So it stands to reason that if you set list equal to n, that has the
effect of copying this address up here. Or really just copying the arrow into that same location so that
now the picture looks like this. And heck, if this was a temporary variable, it will eventually go away.

- [1:53:36](https://www.youtube.com/watch?v=undefined&t=42816s) And now, this is the picture. So an


annoying number of steps, certainly, to walk through verbally like this. But it's just malloc to give yourself
a node, initialize the 2 fields inside of it, update the linked list, and boom, you're on your way. I didn't
have to copy anything. I just had to insert something in this case.

- [1:53:56](https://www.youtube.com/watch?v=undefined&t=42836s) Let me pause here to see if


there's any questions on those steps. And we'll see before long it all in context with some larger code.
AUDIENCE: So if the statements [INAUDIBLE]. SPEAKER 1: Yes. I drew them separately just for the sake of
the voiceover of doing each thing very methodically. In real code, as we'll transition to now, I could have
and should have just done it all inside of one conditional after checking if n is not equal to null.

- [1:54:20](https://www.youtube.com/watch?v=undefined&t=42860s) I could set number to a value like


1. And I could set the pointer itself to something like null. All right. Well let's translate, then, this into
some similar code that allows us to build up a linked list now using code similar in spirit to before. But
now, using this new primitive. So I'm going to go back into VS Code here.

- [1:54:40](https://www.youtube.com/watch?v=undefined&t=42880s) I'm going to go ahead now and


delete the entirety of this old version that was entirely array-based. And now, inside of my main function,
I'm going to go ahead and first do this. I'm going to first give myself a list of size 0. And I'm going to call
that node* list. And I'm going to initialize that to null, as we proposed earlier.
- [1:54:59](https://www.youtube.com/watch?v=undefined&t=42899s) But I'm also now going to have to
take the additional step of defining what this node is. So recall that I might do something like typedef
struct node. Inside of this struct node, I'm going to have a number, which I'll call number, of type int. And
I'm going to have a struct node with a * that says the next pointer is called next.

- [1:55:17](https://www.youtube.com/watch?v=undefined&t=42917s) And I'm going to call this whole


thing, more succinctly, node, instead of struct node. Now as an aside, for those of you wondering what
the difference really is between struct node and node. Technically, I could do something like this. Not use
typedef and not use the word node alone. This syntax here would actually create for me a new data type
called, verbosely, struct node.

- [1:55:40](https://www.youtube.com/watch?v=undefined&t=42940s) And I could use this throughout


my code saying struct node. Struct node. That just gets a little tedious. And it would be nicer just to refer
to this thing more simplistically as a node. So what typedef has been doing for us is it, again, lets us
invent our own word that's even more succinct. And this just has the effect now of calling this whole
thing node without the need, subsequently, to keep saying struct all over the place.

- [1:56:02](https://www.youtube.com/watch?v=undefined&t=42962s) Just FYI. All right. So now that this


thing exists in main, let's go ahead and do this. Let's add a number to list. And to do this, I'm going to
give myself a temporary variable. I'll call it n for consistency. I'm going to use malloc to give myself the
size of a node, just like in our slides.

- [1:56:20](https://www.youtube.com/watch?v=undefined&t=42980s) And then, I'm going to do a little


safety check. If n equals equals null, I'm going to do the opposite of the slides. I'm just going to quit out
of this program because there's nothing useful to be done at this point. But most likely my computer is
not going to run out of memory. So I'm going to assume we can keep going with some of the logic here.

- [1:56:34](https://www.youtube.com/watch?v=undefined&t=42994s) If n does not equal null, and that


is, it's a valid memory address, I'm going to say n arrow-- I'm going to build this up backwards. Well, let's do--
that's OK, let's go ahead and do this: n->number equals 1. And then n->next equals null. And
now, update list to point to new node: list equals n.

- [1:57:00](https://www.youtube.com/watch?v=undefined&t=43020s) So at this point in the story, we've


essentially constructed what was that first picture, which looks like this. This is the corresponding code
via which we built up this node in memory. Suppose now, we want to add the number 2 to the list. So
let's do this again. Add a number to list. How might I do this? Well, I don't need to redeclare n because I
can use the same temporary variable as before.

- [1:57:26](https://www.youtube.com/watch?v=undefined&t=43046s) So this time, I'm just going to say


n equals malloc of the size of a node. I'm, again, going to have my safety check. So if n equals equals
null, then let's just quit out of this altogether. But, I have to be a little more careful now. Technically
speaking, what do I still need to do before I quit out of my program to be really proper? Free the
memory that did succeed a little higher up.

- [1:57:51](https://www.youtube.com/watch?v=undefined&t=43071s) So I think it suffices to free what is


now called list, way at the top. All right. Now, if all was well, though, let's go ahead and say n->number
equals 2. And now, n->next equals null. And now, let's go ahead and add it to the list. If I go ahead
and do list->next equals n, I think what we've just done is build up the equivalent, now, of this in
the computer's memory.

- [1:58:27](https://www.youtube.com/watch?v=undefined&t=43107s) By going to the list's next


field, which is synonymous with the 1 node's bottom-most box, and storing the address of what was n,
which a moment ago looked like this. And I'm just throwing away, in the picture, the temporary variable.
All right. One last thing to do. Let me go down here and say, add a number to list, n equals malloc.

- [1:58:48](https://www.youtube.com/watch?v=undefined&t=43128s) Let's do it one more time. Size of


node. And clearly, in a real program, we might want to start using a loop. And do this dynamically or a
function because it's a lot of repetition now. But just to go through the syntax here, this is fine. If n
equals equals null, out of memory for some reason. Let's return 1, but we should free the list itself and
even the second node, list->next.

- [1:59:13](https://www.youtube.com/watch?v=undefined&t=43153s) But I've deliberately done this


poorly. All right. This is a little more subtle now. And let me get rid of the highlighting just so it's a little
more visible. If n happens to equal equal null, and something really just went wrong, like being out of
memory, why am I freeing 2 addresses now? And again, it's not that I'm freeing those variables per se.

- [1:59:35](https://www.youtube.com/watch?v=undefined&t=43175s) I'm freeing the addresses in


those variables. But there's also a bug with my code here. And it's subtle. Let me ask more pointedly.
This line here, 43, what is that freeing specifically? Can I go to you? AUDIENCE: You're freeing list 2 times.
SPEAKER 1: I'm freeing, not so. That's OK.

- [1:59:55](https://www.youtube.com/watch?v=undefined&t=43195s) I'm not freeing list 2 times.


Technically, I'm freeing list once and list next once. But let me just ask the more explicit question. What
am I freeing with line 43 at the moment? Which node? I think node number 1. Why? Because if 1 is at
the beginning of the list, list contains the address of that number 1 node.

- [2:00:14](https://www.youtube.com/watch?v=undefined&t=43214s) And so this frees that node. This


line of code, you might think now intuitively, OK, it's probably freeing the node number 2. But this is bad.
And this is subtle. Valgrind might help you catch this. But by eyeing it, it's not necessarily obvious. You
should never touch memory that you have already freed.

- [2:00:31](https://www.youtube.com/watch?v=undefined&t=43231s) And so, the fact that I did it in this


order, very bad. Because I'm telling the operating system, I don't know. I don't need the list address
anymore. Do with it what you want. And then, literally one line later, you're saying, wait a minute. Let
me actually go to that address for a moment and look at the next field of that first node.

- [2:00:48](https://www.youtube.com/watch?v=undefined&t=43248s) It's too late. You've already given


up control over the node. So it's an easy fix in this case, logically. But we should be freeing the second
node first and then the first one so that we're doing it in, essentially, reverse order. And again, Valgrind
would help you catch that. But that's the kind of thing one needs to be careful about when touching
memory at all.
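
As a minimal sketch of that corrected order, assuming the `node` type has the `number` and `next` fields described above: the helper name `free_first_two` and its return value are hypothetical, added only so the behavior can be checked. It is an illustration, not the lecture's exact file.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: frees a list that holds at most two nodes.
// Returns how many nodes were freed.
int free_first_two(node *list)
{
    if (list == NULL)
    {
        return 0;
    }

    // Count while list is still valid to dereference
    int freed = (list->next == NULL) ? 1 : 2;

    // Free the second node first; free(NULL) does nothing
    free(list->next);

    // Only now give up the first node; list must not be touched afterward
    free(list);

    return freed;
}
```

Swapping the two `free` calls would reintroduce the bug: `list->next` would be read after `list` had already been handed back.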

- [2:01:08](https://www.youtube.com/watch?v=undefined&t=43268s) You cannot touch memory after


you freed it. But here is my last step. Let me go ahead and update the number field of n to be 3. The next
node of n to be null. And then, just like in the slide earlier, I think I can do list next, next equals n. And
that has the effect now of building up in the computer's memory, essentially, this data structure.
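
Put together, the manual construction might look like the sketch below. The function name `build_three` is an assumption made so the steps fit in one place (the lecture does this inline in `main` and returns 1 on failure); the cleanup on each failed `malloc` frees later nodes before earlier ones.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical wrapper: builds 1 -> 2 -> 3 by hand.
// Returns the address of the first node, or NULL if any malloc fails.
node *build_three(void)
{
    node *list = NULL;

    // First node
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = 1;
    n->next = NULL;
    list = n;

    // Second node
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        free(list);
        return NULL;
    }
    n->number = 2;
    n->next = NULL;
    list->next = n;

    // Third node
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        free(list->next); // second node first
        free(list);       // then the first node
        return NULL;
    }
    n->number = 3;
    n->next = NULL;
    list->next->next = n;

    return list;
}
```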

- [2:01:34](https://www.youtube.com/watch?v=undefined&t=43294s) Very manually. Very pedantically.


Like, in a better world, we'd have a loop and some functions that are automating this process. But, for
now, we're doing it just to play around with the syntax. So at this point, unfortunately, suppose I want to
print the numbers. It's no longer as easy as int i equals 0, i less than 3, i++.

- [2:01:54](https://www.youtube.com/watch?v=undefined&t=43314s) Because you cannot just do


something like this. Because pointer arithmetic no longer comes into play when it's you, who are
stitching together the data structure in memory. In all of our past examples with arrays, you've been
trusting that all of the bytes in the array are back, to back, to back.

- [2:02:16](https://www.youtube.com/watch?v=undefined&t=43336s) So it's perfectly reasonable for the


compiler and the computer to just figure out, oh, well if you want [0], that's at the beginning. [1], it's one
location over. [2], it's one location over. This is way less obvious now. Because even though you might
want to go to the first element in the linked list, or the second, or the third, you can't just jump to those
arithmetically by doing a bit of math.

- [2:02:38](https://www.youtube.com/watch?v=undefined&t=43358s) Instead, you have to follow all of


those arrows. So with linked lists, you can't use this square bracket notation anymore because one node
might be here, over here, over here, over here. You can't just use some simple offset. So I think our code
is going to have to be a little fancier. And this might look scary at first, but it's just an application of some
of the basic definitions here.
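
To make the contrast with square brackets concrete, here is a hypothetical helper, `node_at`, which is not part of the lecture's code. It is only a sketch of what "indexing" a linked list would cost: reaching position `index` means following that many arrows, one at a time.

```c
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: returns the node at a 0-based position,
// or NULL if the list is shorter than that.
node *node_at(node *list, size_t index)
{
    node *tmp = list;
    for (size_t i = 0; i < index && tmp != NULL; i++)
    {
        // No arithmetic shortcut: each step follows one next pointer
        tmp = tmp->next;
    }
    return tmp;
}
```

With an array, `[2]` is a single offset from the start. Here, `node_at(list, 2)` has to visit the first two nodes before it can return the third.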

- [2:03:00](https://www.youtube.com/watch?v=undefined&t=43380s) Let me do a for-loop that actually


uses a node* variable initialized to the list itself. I'm going to keep doing this, so long as TMP does not
equal null. And on each iteration of this loop, I'm going to update TMP to be whatever TMP arrow next
is. And I'll remind you in a moment and explain in more detail.

- [2:03:23](https://www.youtube.com/watch?v=undefined&t=43403s) But when I print something here


with printf, I can still use %i. Because it's still a number at the end of the day. But what I want to print out
is the number in this temporary variable. So maybe the ugliest for-loop we've ever seen. Because it's
mixing, not just the idea of a for-loop, which itself was a bit cryptic weeks ago.

- [2:03:41](https://www.youtube.com/watch?v=undefined&t=43421s) But now, I'm using pointers


instead of integers. But I'm not violating the definition of a for-loop. Recall that a for-loop has 3 main
things in parentheses. What do you want to initialize first? What condition do you want to keep checking
again and again? And what update do you want to make on every iteration of the loop? So with that
basic definition in mind, this is giving me a temporary variable called TMP that is initialized to the
beginning of the list.
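
The loop just described can be sketched as follows. Wrapping it in a function called `print_list` and returning a count are assumptions made for illustration; the three parts of the for-loop are the ones named above.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical wrapper around the loop: prints each number on its own line
// and returns how many nodes were visited.
int print_list(node *list)
{
    int count = 0;

    // Initialize tmp to the first node; keep going while tmp is not NULL;
    // on each iteration, follow the arrow to the next node
    for (node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        printf("%i\n", tmp->number);
        count++;
    }

    return count;
}
```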

- [2:04:04](https://www.youtube.com/watch?v=undefined&t=43444s) So it's like pointing my finger at


the number 1 node. Then, I'm asking the question, does TMP not equal null? Well, hopefully, not
because I'm pointing at a valid node that is the number 1 node. So, of course, it doesn't equal null yet.
Null won't be until we get to the end of the list.
- [2:04:20](https://www.youtube.com/watch?v=undefined&t=43460s) So what do I do? I start at this
TMP variable. I follow the arrow and go to the number field therein. What do I then do? The for-loop
says, change TMP to be whatever is at TMP, by following the arrow and grabbing the next field. That,
then, has the result of being checked against this conditional. No, of course, it doesn't equal null because
the second node is the number 2 node.

- [2:04:44](https://www.youtube.com/watch?v=undefined&t=43484s) Null is still at the very end. So I


print out the number 2. Next step, I update TMP one more time to be whatever is next. That, then, does
not yet equal null. So I go ahead and print out the number 3 node. Then one last time, I update TMP to
be whatever TMP is in the next field. But after 1, 2, 3, that last next field is null.

- [2:05:05](https://www.youtube.com/watch?v=undefined&t=43505s) And so, I break out of this for-loop


altogether. So if I do this in pictorial form, all we're doing, if I now use my finger to represent the TMP
variable. I initialize TMP to be whatever list is, so it points here. That's obviously not null so I print out
whatever is at TMP: follow the arrow to number, and I print that out.

- [2:05:27](https://www.youtube.com/watch?v=undefined&t=43527s) Then I update TMP to point here.


Then I update TMP to point here. Then I update TMP to point here. Wait, that's null. The for-loop ends.
So, again, admittedly much more cryptic than our familiar int i equals 0, and so forth. But it's just a
different utilization of the for-loop syntax. Yes. AUDIENCE: How does it happen that you're always
printing out the numbers.

- [2:05:51](https://www.youtube.com/watch?v=undefined&t=43551s) Because it seems to me that


addresses- SPEAKER 1: Good question. How is it that I'm actually printing numbers and not printing out
addresses instead. The compiler is helping me here. Because I taught it, in the very beginning of my
program, what a node is. Which looks like this here. The compiler knows that a node has a number
field and a next field down here, in the for-loop.

- [2:06:11](https://www.youtube.com/watch?v=undefined&t=43571s) Because I'm iterating using a


node* pointer, and not an int* pointer, the compiler knows that any time I'm pointing at something, I'm
pointing at the whole node. Doesn't matter where specifically in the rectangle I'm pointing per se. It's,
ultimately, pointing at the whole node itself.

- [2:06:27](https://www.youtube.com/watch?v=undefined&t=43587s) And the fact that I, then, use TMP


arrow number means, OK, adjust your finger slightly. So you're literally pointing at the number field and
not the next field. So that's sufficient information for the computer to distinguish the 2. Good question.
Other questions then on this approach here. Yeah, in the back.
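
A small sketch of that answer, with hypothetical function names: because the pointer's type is `node *`, the compiler knows where the `number` field sits inside whatever node is being pointed at, and the arrow is shorthand for "go to the node, then pick the field."

```c
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Arrow notation: go to the node tmp points at and read its number field
int number_of(node *tmp)
{
    return tmp->number;
}

// The same access spelled out with * and . instead of the arrow
int number_of_longhand(node *tmp)
{
    return (*tmp).number;
}
```

Both functions return the integer stored in the node, not an address. Printing `tmp` itself with `%p` would be what shows an address.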

- [2:06:46](https://www.youtube.com/watch?v=undefined&t=43606s) AUDIENCE: How would you--


SPEAKER 1: How would I use a for-loop to add elements to a linked list? You will do something like this, if
I may, in problem set 5. We will give you some of the scaffolding for doing this. But in this coming week's
materials, we will guide you to that. But let me not spoil it just yet.

- [2:07:05](https://www.youtube.com/watch?v=undefined&t=43625s) Fair question, though. Yeah.


AUDIENCE: So I had a question about line 49. SPEAKER 1: OK. AUDIENCE: Is line 49 possible in line 43?
SPEAKER 1: Good question. Is line 49 acceptable, even if we freed it earlier. We didn't free it in line 43, in
this case, right. You can only reach line 49, if n does not equal null.
- [2:07:22](https://www.youtube.com/watch?v=undefined&t=43642s) And you do not return on line 45.
So that's safe. I was only doing that freeing if I knew, on line 45, that I'm out of here anyway at that
point. Good question. And, yeah. AUDIENCE: I had a quick question. Is TMP [INAUDIBLE]. SPEAKER 1:
Correct. You're asking about TMP: because it's in a for-loop, does that mean you don't have to free it? You
never have to free pointers, per se.

- [2:07:44](https://www.youtube.com/watch?v=undefined&t=43664s) You should only free addresses


that were returned to you by malloc. So I haven't finished the program, to be fair. But you're not freeing
variables. You're not freeing like, fields. You are freeing specific addresses, whatever they may be. So the
last thing, and I was stalling on showing this because it too is a little cryptic.

- [2:08:03](https://www.youtube.com/watch?v=undefined&t=43683s) Here is how you can free, now, a


whole linked list. In the world of arrays, recall, it was so easy. You just say free list. You return 0 and
you're done. Not with a linked list. Because, again, the computer doesn't know what you have stitched
together using all of these pointers all over the computer's memory.

- [2:08:19](https://www.youtube.com/watch?v=undefined&t=43699s) You need to follow those arrows.


So one way to do this would be as follows. While the list itself is not null, so while there's a list to be
freed. What do I want to do? I'm going to give myself a temporary variable called TMP again. And it's a
different TMP because it's in a different scope.

- [2:08:35](https://www.youtube.com/watch?v=undefined&t=43715s) It's inside of the while loop


instead of the for-loop, a few lines earlier. I am going to initialize TMP to be the address of the next node.
Just so I can get one step ahead of things. Why am I doing this? Because now, I can boldly free the list
itself, which does not mean the whole list. Again, I'm freeing the address in list, which is the address of
the number 1 node.

- [2:08:59](https://www.youtube.com/watch?v=undefined&t=43739s) That's what list is. It's just the


address of the number 1 node. So if I first use TMP to point at the number 2, slightly in the middle of
the picture, then it is safe for me on line 61, at the moment, to free list. That is the address of the first
node. Now I'm going to say, all right, once I freed the first node in the list, I can update the list itself to be
literally TMP.
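
Those three steps, in order, can be sketched like this. The function name `free_list` and the returned count are assumptions for illustration; the lecture writes the loop inline at the end of `main`.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical wrapper: frees every node in the list.
// Returns how many nodes were freed.
int free_list(node *list)
{
    int freed = 0;

    while (list != NULL)
    {
        // Get one step ahead before giving up the current node
        node *tmp = list->next;

        // Free the address in list, which is just the current first node
        free(list);

        // Move on to the node remembered a moment ago
        list = tmp;

        freed++;
    }

    return freed;
}
```

Because `list` is passed by value here, the caller's own pointer still holds the old address afterward and should not be used again.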

- [2:09:25](https://www.youtube.com/watch?v=undefined&t=43765s) And now, the loop repeats. So


what's happening here? If you think about this picture, TMP is initially pointing at not the list, but list
arrow next. So TMP, represented by my right hand here, is pointing at the number 2. Totally safe and
reasonable to free now the list itself a.k.a. the address of the number 1 node.

- [2:09:45](https://www.youtube.com/watch?v=undefined&t=43785s) That has the effect of just


throwing away the number 1 node, telling the computer it can reuse that memory. The last
line of code I wrote updated list to point at the number 2, at which point my loop proceeded to do the
exact same thing again. And only once my finger is literally pointing at nowhere, the null symbol, will the
loop, by nature of a while loop as I'll toggle back to, break out.

- [2:10:06](https://www.youtube.com/watch?v=undefined&t=43806s) And there's nothing more to be


freed. So again, what you'll see, ultimately, in problem set 5, more on that later, is an opportunity to play
around with not just this syntax, but also these ideas. But again, even though the syntax is admittedly pretty
cryptic, we're still using basics like these for-loops or while loops.
- [2:10:24](https://www.youtube.com/watch?v=undefined&t=43824s) We're just starting to now follow
explicit addresses rather than letting the computer do all of the arithmetic for us, as we previously
benefited from. At the very end of this thing, I'm going to return 0 as though all is well. And I think, then,
we're good to go. All right. Questions on this linked list code now? And again, we'll walk through this
again in the coming week's spec.

- [2:10:46](https://www.youtube.com/watch?v=undefined&t=43846s) Yeah. AUDIENCE: Can you explain


the while loop [INAUDIBLE] starts in other ways? SPEAKER 1: Sure. Can we explain this while loop here
for freeing the list. So notice that, first, I'm just asking the obvious question. Is the list null? Because if it
is, there's no work to be done. However, while the list is not null, according to line 58, what do we want
to do? I want to create a temporary variable that points at the same thing that list arrow next is pointing
at.

- [2:11:15](https://www.youtube.com/watch?v=undefined&t=43875s) So what does that mean? Here is


list. List arrow next is whatever this thing is here. So if my right hand represents the temporary variable,
I'm literally pointing at the same thing as the list is itself. The next line of code, recall, was free the list.
And unlike, in our world of arrays, like half an hour ago where that just meant free the whole darn list,
you now have taken over control over the computer's memory with a linked list, in ways that you didn't
with the array.

- [2:11:43](https://www.youtube.com/watch?v=undefined&t=43903s) The computer knew how to free


the whole array because you malloced the whole thing at once. You are now mallocing the linked list one
node at a time. And the operating system does not keep track of for you where all these nodes are. So
when you free list, you are literally freeing the value of the list variable, which is just this first node here.

- [2:12:04](https://www.youtube.com/watch?v=undefined&t=43924s) Then my last line of code, which


I'll flip back to in a second, updates list to now ignore the free memory and point at 2. And the story then
repeats. So, again, it's just a very pedantic way of using this new syntax of star notation, and the arrow
notation, and the like, to do the equivalent of walking down all of these arrows.

- [2:12:26](https://www.youtube.com/watch?v=undefined&t=43946s) Following all of these


breadcrumbs. But it does take admittedly some getting used to. This syntax, you only have to do for one week.
But, again, next week in Python, we will begin to abstract a lot of this complexity away. But none of this
complexity is going away. It's just that someone else, the authors of Python for instance, will have
automated this stuff for us.

- [2:12:44](https://www.youtube.com/watch?v=undefined&t=43964s) The goal this week is to


understand what it is we're going to get for free, so to speak, next week. All right. Questions on these
length lists. All right. Just, yeah, in the back. AUDIENCE: So are the while loops strictly necessary for the
freeing [INAUDIBLE]. SPEAKER 1: Fair question. Let me summarize as, could we have freed this with a for-
loop? Absolutely.

- [2:13:05](https://www.youtube.com/watch?v=undefined&t=43985s) It just is a matter of style. It's a


little more elegant to do it in a while loop, according to me. But other people will reasonably disagree.
Anything you can do with a while loop you can do with a for-loop, and vice versa. Do while loops, recall,
are a little different. But they will always do at least one thing.
- [2:13:20](https://www.youtube.com/watch?v=undefined&t=44000s) But for-loops and while loops
behave the same in this case. AUDIENCE: Thank you. SPEAKER 1: Sure. Other questions? All right, well
let's just vary things a little bit here. Just to see what some of the pitfalls might now be without getting
into the weeds of code. Indeed, we'll try to save some of that for problem set 5's exploration.

- [2:13:36](https://www.youtube.com/watch?v=undefined&t=44016s) But instead, let's imagine that we


want to create a list here of our own. I can offer, in exchange for a few volunteers, some foam fingers to
bring to the next game, perhaps. Could we get maybe just one volunteer first? Come on up. You will be
our linked list from the get go. What's your name? AUDIENCE: Pedro.

- [2:13:52](https://www.youtube.com/watch?v=undefined&t=44032s) SPEAKER 1: Pedro, come on up.


All right, thank you to Pedro. [AUDIENCE CLAPPING] And if you want to just stand roughly over here. But
you are a null pointer so just point sort of at the ground, as though you're pointing at 0. All right. So
Pedro is our linked list of size 0, which pictorially might look a little something like this for consistency
with our past pictures.

- [2:14:11](https://www.youtube.com/watch?v=undefined&t=44051s) Now suppose that we want to go


ahead and malloc, oh, how about the number 2. Can we get a volunteer to be on camera here? OK. You
jumped out of your seat. Do you want to come up? OK, you really want the foam finger, I say. All right.
Round of applause, sure. [AUDIENCE CLAPPING] OK. And what's your name? AUDIENCE: Caleb.

- [2:14:32](https://www.youtube.com/watch?v=undefined&t=44072s) SPEAKER 1: Say again? AUDIENCE:


Caleb. SPEAKER 1: Halen? AUDIENCE: Caleb. SPEAKER 1: Caleb. Caleb, sorry. All right. So here is your
number 2 for your number field. And here is your pointer. And come on, let's say that there was room for
Caleb like, right there. That's perfect. So Caleb got malloced, if you will, over here.

- [2:14:47](https://www.youtube.com/watch?v=undefined&t=44087s) So now if we want to insert Caleb


and the number 2 into this linked list, well what do we need to do? I already initialized you to 2. And
pointing as you are to the ground means you're initialized to null for your next field. Pedro, what
should you-- perfect. What should Pedro do? That's fine, too.

- [2:15:02](https://www.youtube.com/watch?v=undefined&t=44102s) So Pedro is now pointing at the


list. So now our list looks a little something like this. So far, so good. All is well. So the first couple of these
will be pretty straightforward. Let's insert one more, if anyone really wants another foam finger. Here,
how about right in the middle. Come on down. And just in anticipation, how about let's malloc someone
else.

- [2:15:19](https://www.youtube.com/watch?v=undefined&t=44119s) OK, your friends are pointing at


you. Do you want to come down too, preemptively? This is a pool of memory, if you will. What's your
name? AUDIENCE: Hannah. SPEAKER 1: Hannah. All right, Hannah. You are number 4. [AUDIENCE
CLAPPING] And hang there for just a moment. All right. So we've just malloced Hannah.

- [2:15:34](https://www.youtube.com/watch?v=undefined&t=44134s) And Hannah, how about Hannah,


suppose you ended up over there in just some random location. All right. So what should we now do, if
the goal is to keep these things sorted? How about? So Pedro, do you have to update yourself?
AUDIENCE: No. SPEAKER 1: No. All right. Caleb, what do you have to do? OK. And Hannah what should
you be doing? I would, it's just for you for now, so point at the ground representing null.
- [2:15:55](https://www.youtube.com/watch?v=undefined&t=44155s) OK. So, again demonstrating the
fact that, unlike in past weeks where we had our nice, clean array back, to back, to back, contiguously,
these guys are deliberately all over the stage. So let's malloc another. How about number 5. What's your
name? AUDIENCE: Jonathan. SPEAKER 1: Jonathan. All right, Jonathan.

- [2:16:09](https://www.youtube.com/watch?v=undefined&t=44169s) You are our number 5. And pick


your favorite place in memory. [AUDIENCE CLAPPING] OK. All right. So Jonathan's now over there. And
Hannah is over there. So 5, we want to point Hannah at number 5. So you, of course, are going to point
there. And where should you be pointing? Down to represent null, as well.

- [2:16:27](https://www.youtube.com/watch?v=undefined&t=44187s) OK. So pretty straightforward. But


now things get a little interesting. And here, we'll use a chance to, without the weeds of code, point out
how order of operations is really going to matter. Suppose that I next want to allocate say, the number 1.
And I want to insert the number 1 into this list. Yes. This is what the code would look like.

- [2:16:45](https://www.youtube.com/watch?v=undefined&t=44205s) But if we act this out-- could we


get one more volunteer? How about on the end there in the sweater. Yeah. Come on down. We have,
what's your name? AUDIENCE: Lauren. SPEAKER 1: Lauren. OK. Lauren, come on down. [AUDIENCE
CLAPPING] And how about, Lauren, why don't you go right in here in front, if you don't mind.

- [2:17:05](https://www.youtube.com/watch?v=undefined&t=44225s) Here is your number. Here is your


pointer. So I've initialized Lauren to the number 1. And your pointer will be null, pointing at the ground.
Where do you belong if we're maintaining sorted order? Looks like right at the beginning. What should
happen here? OK. So Pedro has presumed to point now at Lauren.

- [2:17:24](https://www.youtube.com/watch?v=undefined&t=44244s) But how do you know where to


point? AUDIENCE: He's number 2. SPEAKER 1: Pedro's undoing what he did a moment ago. So this was
deliberate. And that was perfect that Pedro presumed to point immediately at Lauren. Why? You literally
just orphaned all of these folks, all of these chunks of memory. Why? Because if Pedro was our only
variable pointing at that chunk of memory, this is the danger of using pointers, and dynamic memory
allocation, and building your own data structures.

- [2:17:49](https://www.youtube.com/watch?v=undefined&t=44269s) The moment you point


temporarily, if you could, to Lauren, I have no idea where he's pointing to. I have no idea how to get back
to Caleb, or Hannah, or anyone else on stage. So that was bad. So you did undo it. So that's good. I think
we need Lauren to make a decision first. Who should you point at? AUDIENCE: Caleb.

- [2:18:05](https://www.youtube.com/watch?v=undefined&t=44285s) SPEAKER 1: So pointing at Caleb.


Why? Because you're pointing at literally who Pedro is pointing at. Pedro, now what are you safe to do?
Good. So order of operations there matters. And if we had just done this line of code in red here, list
equals n. That was like Pedro's first instinct, bad things happen.

- [2:18:20](https://www.youtube.com/watch?v=undefined&t=44300s) And we orphaned the rest of the


list. But if we think through it logically and do this, as Lauren did for us, instead, we've now updated the
list to look a little something more like this. Let's do one last one. We got one more foam finger here for
the number 3. How about on the end? Yeah. You want to come down.
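
The order Lauren and Pedro ended up following can be sketched in code. The function name `prepend` is hypothetical; the two assignments inside it are the point. The new node copies the current value of `list` first, and only then does `list` change.

```c
typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: inserts n at the front of the list and
// returns the new first node.
node *prepend(node *list, node *n)
{
    // First, the new node points at whatever list currently points at
    n->next = list;

    // Only then is it safe to point list at the new node
    list = n;

    return list;
}
```

Doing `list = n` first would lose the only copy of the old first node's address, which is exactly the orphaning acted out on stage.
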
- [2:18:36](https://www.youtube.com/watch?v=undefined&t=44316s) All right. One final volunteer.
[AUDIENCE CLAPPING] All right. And what's your name? AUDIENCE: Miriam. SPEAKER 1: I'm sorry?
AUDIENCE: Miriam. SPEAKER 1: Miriam. All right. So here is your number 3. Here is your pointer. If you
want to go maybe in the middle of the stage in a random memory location. So here, too, the goal is to
maintain sorted order.

- [2:18:57](https://www.youtube.com/watch?v=undefined&t=44337s) So let's ask the audience, who or


what number should point at whom first here? So we don't screw up and orphan some of the memory.
And if we do orphan memory, this is what's called, again per last week, a memory leak. Your Mac, your
PC, your phone can start to slow down if you keep asking for memory but never give it back or lose track
of it.

- [2:19:14](https://www.youtube.com/watch?v=undefined&t=44354s) So we want to get this right. Who


should point at whom? Or what number? Say again. AUDIENCE: 3 to 4. SPEAKER 1: 3 should point at 4.
So 3, do you want to point at 4. And not, so, OK, good. And how did you know, Miriam, whom to point
at? AUDIENCE: Copying Caleb. SPEAKER 1: Perfect. OK, so copying Caleb. Why? Because if you look at
where this list is currently constructed, and you can cheat on the board here, 2 is pointing to 4.

- [2:19:43](https://www.youtube.com/watch?v=undefined&t=44383s) If you point at whoever Caleb,


number 2, is pointing at, that, indeed, leads you to Hannah for number 4. So now what's the next step
to stitch this together? Our voice in the crowd. AUDIENCE: 2 to 3. SPEAKER 1: 2 to 3. So, 2 to 3. So Caleb,
I think it's now safe for you to decouple. Because someone is already pointing at Hannah.

- [2:20:02](https://www.youtube.com/watch?v=undefined&t=44402s) We haven't orphaned anyone. So


now, if we follow the breadcrumbs, we've got Pedro leading to 1, to 2, to 3, to 4, to 5. We need the
numbers back, but you can keep the foam fingers. Thank you to our volunteers here. AUDIENCE: Thank
you. Thank you. [AUDIENCE CLAPPING] SPEAKER 1: You can just put the numbers here.

- [2:20:21](https://www.youtube.com/watch?v=undefined&t=44421s) AUDIENCE: Thank you. SPEAKER


1: Thank you to all. So this is only to say that when you start looking at the code this week and in the
problem set, it's going to be very easy to lose sight of the forest for the trees. Because the code does get
really dense. But the ideas, again, really do bubble up to these higher level descriptions.
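
As one example of how the stage demonstration maps onto code, here is a sketch of inserting the number 3 between 2 and 4. The helper name `insert_after` and the parameter name `prev` are assumptions, not names from the lecture. It assumes `prev` already points at the node that should come just before the new one.

```c
typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: splices n into the list right after prev.
void insert_after(node *prev, node *n)
{
    // "3 to 4": the new node copies where prev is pointing
    n->next = prev->next;

    // "2 to 3": now prev can let go and point at the new node
    prev->next = n;
}
```

Reversing the two lines would make the new node point at itself and orphan everything after `prev`.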

- [2:20:38](https://www.youtube.com/watch?v=undefined&t=44438s) And if you think about data


structures at this level. If you go off and program after a class like CS50 and you're whiteboarding something
with a friend or a colleague, most people think at and talk at this level. And they just assume that, yeah,
if we went back and looked at our textbooks or class notes, we could figure out how to implement this.

- [2:20:54](https://www.youtube.com/watch?v=undefined&t=44454s) But the important stuff is the


conversation and the ideas up here, even though, this week, we will get some practice with the
actual code. So when it comes to analyzing an algorithm like this, let's consider the following. What
might be now the running time of operations like searching and inserting into a linked list? We talked
about arrays earlier.

- [2:21:19](https://www.youtube.com/watch?v=undefined&t=44479s) And we had some binary search


possibilities still, as long as it's an array. But as soon as we have a linked list, these arrows, like our
volunteers, could be anywhere on stage. And so you can't just assume that you can jump arithmetically
to the middle element, to the middle element, to the middle one.
- [2:21:33](https://www.youtube.com/watch?v=undefined&t=44493s) You pretty much have to follow all
of these breadcrumbs again and again. So how might that inform what we see? Well, consider this too.
Even though I keep drawing all these pictures with all of the numbers exposed. And all of us humans in
the room can easily spot where the 1 is, where the 2 is, where the 3 is, the computer, again, just like with
our lockers and arrays, can only see one location at a time.

- [2:21:54](https://www.youtube.com/watch?v=undefined&t=44514s) And the key thing with a linked


list is that the only address we've fundamentally been remembering is what Pedro represented a
moment ago. He was the link to all of the other nodes. And, in turn, each person led to the next. But
without Pedro, we would have lost some of, or all of, the linked list. So when you start with a linked list, if
you want to find an element via search, you have to do it linearly.

- [2:22:18](https://www.youtube.com/watch?v=undefined&t=44538s) Following all of the arrows.


Following all of the pointers on the stage in order to get to the node in question. And only once you hit
null can you conclude, yep, it was there. Or no, it was not. So given that if a computer, essentially, can
only see the number 1, or the number 2, or the number 3, or the number 4, or the number 5, one at a
time, how might we think about the running time of search? And it is indeed Big O of n.

- [2:22:45](https://www.youtube.com/watch?v=undefined&t=44565s) But why is that? Well, in the worst


case, the number you might be looking for is all the way at the end. And so, obviously, you're going to
have to search all of the n elements. And I drew these things with boxes on top of them. Because, again,
even though you and I can immediately see, where the 5 is for instance, the computer can only figure
that out by starting at the beginning and going there.
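
A linear search over the list might be sketched as follows. The function name `search` and the `target` parameter are assumptions for illustration. In the worst case the loop visits all n nodes before reaching null, which is where Big O of n comes from.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper: reports whether target appears anywhere in the list.
bool search(node *list, int target)
{
    // Start at the beginning and follow the arrows one node at a time
    for (node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        if (tmp->number == target)
        {
            return true;
        }
    }

    // Reached null without finding it
    return false;
}
```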

- [2:23:04](https://www.youtube.com/watch?v=undefined&t=44584s) So there, too, is another trade off.


It would seem that, overnight, we have lost the ability to do a very powerful algorithm from week 0
known as binary search, right. It's gone. Because there's no way in this picture to jump mathematically to
the middle node, unless you remember where it is. And then, remember where every other node is.

- [2:23:24](https://www.youtube.com/watch?v=undefined&t=44604s) And at that point, you're back to


an array. Linked lists, by design, only remember the next node in the list. All right. How about something
like insert? In the worst case, perhaps, how many steps might it take to insert something into a linked
list? Someone else. Someone else. Yeah. AUDIENCE: N squared. SPEAKER 1: Say again? AUDIENCE: N
squared.

- [2:23:44](https://www.youtube.com/watch?v=undefined&t=44624s) SPEAKER 1: N squared.


Fortunately, it's not that bad. It's not as bad as n squared. That typically means doing n things, n times.
And I think we can stay under that, but not a bad thought. Yeah. AUDIENCE: Is it n? SPEAKER 1: Why
would it be n? AUDIENCE: Because the [INAUDIBLE]. SPEAKER 1: OK. So to summarize, you're proposing
n.

- [2:24:03](https://www.youtube.com/watch?v=undefined&t=44643s) Because to find where the thing


goes, you have to traverse, potentially, the whole list. Because if I'm inserting the number 6 or the
number 99, that numerically belongs at the very end, I can only find its location by looking for all of
them. At this point, though, in the term. And really, at this point in the story, you should start to question
these very simplistic questions, to be honest.
- [2:24:22](https://www.youtube.com/watch?v=undefined&t=44662s) Because the answer is almost
always going to depend, right. If I've just got a linked list that looks like this, the first question back to
someone asking this question would be, well does the list need to be sorted, right? I've drawn it as
sorted and it might imply as much. So that's a reasonable assumption to have made.

- [2:24:39](https://www.youtube.com/watch?v=undefined&t=44679s) But if I don't care about


maintaining sorted order, I could actually insert into a linked list in constant time. Why? I could just keep
inserting into the beginning, into the beginning, into the beginning. And even though the list is getting
longer, the number of steps required to insert something before the first element is not growing at all.

- [2:24:58](https://www.youtube.com/watch?v=undefined&t=44698s) You just keep inserting. If you


want to keep it sorted though, yes, it's going to be, indeed, Big O of n. But again, these kinds of, now,
assumptions are going to start to matter. So let's for the sake of discussion say it's Big O of n, if we do
want to maintain sorted order. But what about in the case of not caring.

- [2:25:14](https://www.youtube.com/watch?v=undefined&t=44714s) It might indeed be a Big O of 1.


And now these are the kinds of decisions that we'll start to leave to you. What about in the best case
here? If we're thinking about Big Omega notation, then, frankly, we could just get lucky in the best case.
And the element we're looking for happens to be at the beginning.

- [2:25:28](https://www.youtube.com/watch?v=undefined&t=44728s) Or heck, we just blindly insert to


the beginning irrespective of the order that we want to keep things in. All right. So besides then, how
can we improve further on this design? We don't need to stop at linked list. Because, honestly, it's not
been a clear win. Like, linked lists allow us to use more of our memory because we don't need massive
growing chunks of contiguous memory.

- [2:25:50](https://www.youtube.com/watch?v=undefined&t=44750s) So that's a win. But they still


require Big O of n time to find the end of it, if we care about order. We're using at least twice as much
memory for the darn pointer. So that seems like a sidestep. It's not really a step forward. So can we do
better? Here's where we can now accelerate the story by just stipulating that, hey, even if you haven't
used this technique yet, we would seem to have an ability to stitch together pieces of memory just using
pointers.

- [2:26:17](https://www.youtube.com/watch?v=undefined&t=44777s) And anything you could imagine


drawing with arrows, you can implement, it would seem, in code. So what if we leverage a second
dimension. Instead of just stringing together things laterally, left to right, essentially, even though they
were bouncing around on the screen. What if we start to leverage a second dimension here, so to speak.

- [2:26:33](https://www.youtube.com/watch?v=undefined&t=44793s) And build more interesting


structures in the computer's memory. Well it turns out that in a computer's memory, we could create a
tree, similar to a family tree. If you've ever seen or drawn a family tree with grandparents, and parents,
and siblings, and so forth. So an inverted branch of a tree that grows, typically when it's drawn, downward
instead of upward like a typical tree.

- [2:26:57](https://www.youtube.com/watch?v=undefined&t=44817s) But that's something we could


translate into code as well. Specifically, let's do something called a binary search tree. Which is a type of
tree. And what I mean by this is the following. Notice this. This is an example of an array from like week
2, when we first talked about those. And we had the lockers on stage.
- [2:27:14](https://www.youtube.com/watch?v=undefined&t=44834s) And recall that what was nice
about an array is that if, 1, it's sorted, and 2, all of its numbers are indeed contiguous, which is by definition an
array, we can just do some simple math. For instance, if there are 7 elements in this array, and we do 7
divided by 2, that's what? 3 and 1/2, round down through truncation, that's 3.

- [2:27:35](https://www.youtube.com/watch?v=undefined&t=44855s) 0, 1, 2, 3. That gives me the


middle element, arithmetically, in this thing. And even though I have to be careful about rounding, using
simple arithmetic, I can very quickly, with a single line of code or math, find for you the middle of the left
half, of the left half, of the right half, or whatever. That's the power of arrays.

- [2:27:51](https://www.youtube.com/watch?v=undefined&t=44871s) And that's what gave us binary


search. And how did binary search work? Well, we looked at the middle. And then, we went left or right.
And then, we went left or right again, implied by this color scheme here. Wouldn't it be nice if we
somehow preserved the new upsides today of dynamic memory allocation, giving ourselves the ability to
just add another element, add another element, add another element.

- [2:28:14](https://www.youtube.com/watch?v=undefined&t=44894s) But retain the power of binary


search. Because log of n was much better than n, certainly for large data sets, right. Even the phone book
demonstrated as much weeks ago. So what if I draw this same picture in 2 dimensions. And I preserve
the color scheme, just so it's obvious what came where. What do these things look like now? Maybe,
like, things we might now call nodes, right.

- [2:28:39](https://www.youtube.com/watch?v=undefined&t=44919s) A node is just a generic term for


like, storing some data. What if the data these nodes are storing are numbers. So still integers. But what
if we connected these cleverly, like an old family tree. Whereby, every node has not one pointer now, but
as many as 2. Maybe 0, like the leaves at the bottom in green.

- [2:29:00](https://www.youtube.com/watch?v=undefined&t=44940s) But other nodes on the interior


might have as many as 2. Like having 2 children, so to speak. And indeed, the vernacular here is exactly
that. This would be called the root of the tree. Or this would be a parent, with respect to these children.
The green ones would be grandchildren with respect to these. The green ones would be siblings with respect
to each other.

- [2:29:19](https://www.youtube.com/watch?v=undefined&t=44959s) And over there, too. So all the


same jargon you might use in the real world, applies in the world of data structures and CS trees. But this
is interesting because I think we could build this now, this data structure in the computer's memory.
How? Well, suppose that we defined a node to be no longer just this, a number and a next field.

- [2:29:40](https://www.youtube.com/watch?v=undefined&t=44980s) What if we give ourselves a bit


more room here? And give ourselves a pointer called left and another one called right. Each of which is a
pointer to a struct node. So same idea as before, but now we just make sure we think of these things as
pointing this way and this way, not just this way. Not just a single direction, but 2.

- [2:29:59](https://www.youtube.com/watch?v=undefined&t=44999s) So you could imagine, in code,


building something up like this with a node. That creates, in essence, this diagram here. But why is this
compelling? Suppose I want to find the number 3. I want to search for the number 3 in this tree. It would
seem, just like Pedro was the beginning of our linked list, in the world of trees, the root, so to speak, is
the beginning of your data structure.
- [2:30:21](https://www.youtube.com/watch?v=undefined&t=45021s) You can retain and remember this
entire tree just by pointing at the root node, ultimately. One variable can hang on to this whole tree. So
how can I find the number 3? Well, if I look at the root node and the number I'm looking for is less than.
Notice, I can go this way. Or if it's greater than, I can go this way.

- [2:30:40](https://www.youtube.com/watch?v=undefined&t=45040s) So I preserve that property of the


phone book, or just a sorted array in general. What's true over here? If I'm looking for 3, I can go to the
right of the 2 because that number is going to be greater. If I go left, it's going to be smaller instead. And
here's an example of actually recursion.

- [2:30:56](https://www.youtube.com/watch?v=undefined&t=45056s) Recursion in a physical sense


much like Mario's pyramid, which was recursively defined. Notice this. I claim this whole thing is a
tree. Specifically, a binary search tree, which means every node has 2, or maybe 1, or maybe 0 children.
But no more than 2. Hence the bi in binary. And it's the case that every left child is smaller than the root.

- [2:31:20](https://www.youtube.com/watch?v=undefined&t=45080s) And every right child is larger than


the root. That definition certainly works for 2, 4, and 6. But it also works recursively for every sub tree, or
branch of this tree. Notice, if you think of this as the root, it is indeed bigger than this left child. And it's
smaller than this right child. And if you look even at the leaves, so to speak.

- [2:31:39](https://www.youtube.com/watch?v=undefined&t=45099s) The grandchildren here. This root


node is bigger than its left child, if it existed. So it's a meaningless statement. And it's less than its right
child. Or it's not greater than, certainly, so that's meaningless too. So we haven't violated the definition
even for these leaves, as well.

- [2:31:54](https://www.youtube.com/watch?v=undefined&t=45114s) And so, now, how many steps


does it take to find in the worst case any number in a binary search tree, it would seem? So it seems 2,
literally. And the height of this thing is actually 3. And so long story short, especially, if you're a little less
comfy with your logarithms from yesteryear. Log base 2 is the number of times you can divide something
in half, and half, and half, until you get down to 1.

- [2:32:16](https://www.youtube.com/watch?v=undefined&t=45136s) This is like a logarithm in the


reverse direction. Here's a whole lot of elements. And we're halving, we're halving until we get down to 1.
So the height of this tree, that is to say, is log base 2 of n. Which means that even in the worst case, the
number you're looking for maybe it's all the way at the bottom in the leaves.

- [2:32:32](https://www.youtube.com/watch?v=undefined&t=45152s) Doesn't matter. It's going to take


log base 2 of n steps, or log of n steps, to find, maximally, any one of those numbers. So, again, binary
search is back. But we've paid a price, right. This isn't a linked list anymore. It's a tree. But we've gained
back binary search, which is pretty compelling, right.

- [2:32:54](https://www.youtube.com/watch?v=undefined&t=45174s) That's where the whole class


began, on making that distinction. But what price have we paid to retain binary search in this new world.
Yeah. It's no longer sorted left to right, but this is, I claim, sorted according to the binary search tree
definition. Where, again, left child is smaller than root.

- [2:33:13](https://www.youtube.com/watch?v=undefined&t=45193s) And right child is greater than


root. So it is sorted, but it's sorted in a 2-dimensional sense, if you will. Not just 1. But another price
paid? AUDIENCE: [INAUDIBLE] nodes now. SPEAKER 1: Exactly. Every node now needs not one number,
but 2, 3 pieces of data. A number and now 2 pointers. So, again, there's that trade off again.

- [2:33:33](https://www.youtube.com/watch?v=undefined&t=45213s) Where, well, if you want to save


time, you've got to give something. If you start giving space, and you start using more space, you can
speed up time. Like, you've got it. There's always a price paid. And it's very often in space, or time, or
complexity, or developer time, the number of bugs you have to solve.

- [2:33:50](https://www.youtube.com/watch?v=undefined&t=45230s) I mean, all of these are finite


resources that you have to juggle among. So if we consider now the code with which we can
implement this, here might be the node. And how might we actually use something like this? Well, let's
take a look at, maybe, one final program. And see here, before we transition to higher level concepts,
ultimately.

- [2:34:07](https://www.youtube.com/watch?v=undefined&t=45247s) Let me go ahead here and let me


just open a program I wrote here in advance. So let me, in a moment, copy over a file called tree.c. Which
we'll have on the course's website. And I'll walk you through some of the logic here that I've written for
tree.c. All right. So what do we have here first? So here is an implementation of a binary search tree for
numbers.

- [2:34:32](https://www.youtube.com/watch?v=undefined&t=45272s) And as before, I've played around


and I've inserted the numbers manually. So what's going on first? Here is my definition of a node for a
binary search tree, copied and pasted from what I proposed on the board a moment ago. Here are 2
prototypes for 2 functions, that I'll show you in a moment, that allow me to free an entire tree, one node
at a time.

- [2:34:53](https://www.youtube.com/watch?v=undefined&t=45293s) And then, also allow me to print


the tree in order. So even though they're not sorted left to right, I bet if I'm clever about what child I
print first, I can reconstruct the idea of printing this tree properly. So how might I implement a binary
search tree? Here's my main function. Here is how I might represent a tree of size 0.

- [2:35:11](https://www.youtube.com/watch?v=undefined&t=45311s) It's just a null pointer called tree.


Here's how I might add a number to that list. So here, for instance, is me mallocing space for a node.
Storing it in a temporary variable called n. Here is me just doing a safety check. Make sure n does not
equal null. And then, here is me initializing this node to contain the number 2, first.

- [2:35:30](https://www.youtube.com/watch?v=undefined&t=45330s) Then, initializing the left child of


that node to be null. And the right child of that node to be null. And then, initializing the tree itself
to be equal to that particular node. So at this point in the story, there's just one rectangle on the screen
containing the number 2 with no children. All right.

- [2:35:47](https://www.youtube.com/watch?v=undefined&t=45347s) Let's just add manually to this a


little further. Let's add another number to the list, by mallocing another node. I don't need to declare n
as a node* because it already exists at this point. Here's a little safety check. I'm going to not bother with
my, let me do this, free memory here.

- [2:36:03](https://www.youtube.com/watch?v=undefined&t=45363s) Just to be safe. Do I want to do


this? We want to free memory too, which I've not done here, but I'll save that for another time. Here, I'm
going to initialize the number to 1. I'm going to initialize the children of this node to null and null. And
now, I'm going to do this. Initialize the tree's left child to be n.

- [2:36:24](https://www.youtube.com/watch?v=undefined&t=45384s) So what that's essentially doing


here is if this is my root node, the single rectangle I described a moment ago that currently has no
children, neither left nor right. Here's my new node with the number 1. I want it to become the new left
child. So that line of code on the screen there, tree->left = n, is like stitching these 2 together with a
pointer from 2 to the 1.

- [2:36:44](https://www.youtube.com/watch?v=undefined&t=45404s) All right. The next lines of code,


you can probably guess, are me adding another number to the list. Just the number 3. So this is a simpler
tree with 2, 1, and 3, respectively. And this code, let me wave my hands, is almost the same. Except for
the fact that I'm updating the tree's right child to be this new and third node.

- [2:37:04](https://www.youtube.com/watch?v=undefined&t=45424s) Let's now run the code before


looking at those 2 functions. Let me do make tree, ./tree. And voila, 1, 2, 3. So it sounds like the data
structure is sorted, to your concern earlier. But how did I actually print this? And then, eventually, free
the whole thing? Well let's look at the definition of first print tree.

- [2:37:23](https://www.youtube.com/watch?v=undefined&t=45443s) And this is where things get


interesting. Print tree returns nothing so it's a void function. But it takes a pointer to a root element as its
sole argument, node* root. Here's my safety check. If root equals equals null, there's obviously nothing
to print, just return. That goes without saying.

- [2:37:42](https://www.youtube.com/watch?v=undefined&t=45462s) But here's where things get a little


magical. Otherwise, print your left child. Then print your own number. Then, print your right child. What
is this an example of, even though it's not mentioned by name here? What programming technique
here? AUDIENCE: Recursion. SPEAKER 1: Yeah. So this is actually perhaps the most compelling use of
recursion, yet.

- [2:38:06](https://www.youtube.com/watch?v=undefined&t=45486s) It wasn't really that compelling


with the Mario thing because we had such an easy implementation with a for loop weeks ago. But
here is a perfect application of recursion, where your data structure itself is recursive, right. If you take
any snip of any branch, it all still looks like a tree, just a smaller one.

- [2:38:22](https://www.youtube.com/watch?v=undefined&t=45502s) That lends itself to recursion. So


here is this leap of faith where I say, print my left tree, or my left sub tree, if you will, via my child at the
left. Then, I'll print my own root node here in the middle. Then, go ahead and print my right sub tree.
And because we have this base case that makes sure that if the root is null, there's nothing to do, you're
not going to recurse infinitely.

- [2:38:44](https://www.youtube.com/watch?v=undefined&t=45524s) You're not going to call yourself


again, and again, and again, infinitely, many times. So it works out and prints the 1, the 2, and the 3. And
notice what we could do, too. If you wanted to print the tree in reverse order, you could do that. Print
your right tree first, the greater element. Then, yourself.

- [2:39:01](https://www.youtube.com/watch?v=undefined&t=45541s) Then, your smaller sub tree. And


if I do make tree here and ./tree, well now, I've reversed the order of the list. And that's pretty cool. You
can do it with a for-loop in an array. But you can also do it, even with this 2-dimensional structure. Let's
lastly look at this free tree function. And this one's almost the same.

- [2:39:20](https://www.youtube.com/watch?v=undefined&t=45560s) Order doesn't matter in quite the


same way, but it does still matter. Here's what I did with free tree. Well, if the root of the tree is null,
there's obviously nothing to do. Just return. Otherwise, go ahead and free your left child and all of its
descendants. Then free your right child and all of its descendants.

- [2:39:36](https://www.youtube.com/watch?v=undefined&t=45576s) And then, free yourself. And


again, free literally just frees the address in that variable. It doesn't free the whole darn thing. It just
frees literally what's at that address. Why was it important that I did line 72 last, though? Why did I free
the left child and the right child before I freed myself, so to speak? AUDIENCE: [INAUDIBLE].

- [2:39:58](https://www.youtube.com/watch?v=undefined&t=45598s) SPEAKER 1: Exactly. If you free


yourself first, if I had done incorrectly this line higher up, you're not allowed to touch the left child tree
or the right child tree. Because the memory address is no longer valid at that point. You would get some
memory error, perhaps. The program would crash. Valgrind definitely wouldn't like it.

- [2:40:15](https://www.youtube.com/watch?v=undefined&t=45615s) Bad things would otherwise


happen. But here, then, is an example of recursion. And again, just a recursive use of an actual data
structure. And what's even cooler here is, relatively speaking, suppose we wanted to search something
like this. Binary search actually gets pretty straightforward to implement, too.

- [2:40:33](https://www.youtube.com/watch?v=undefined&t=45633s) For instance, here might be the


prototype for a search function for a binary search tree. You give me the root of a tree, and you give me
a number I'm looking for, and I can pretty easily now return true if it's in there or false if it's not. How?
Well, let's first ask a question. If tree equals equals null, then you just return false.

- [2:40:53](https://www.youtube.com/watch?v=undefined&t=45653s) Because if there's no tree, there's


no number, so it's obviously not there. Return false. Else if, the number you're looking for is less than the
tree's own number, which direction should we go? AUDIENCE: Left. SPEAKER 1: OK, left. How do we
express that? Well, let's just return the answer to this question.

- [2:41:12](https://www.youtube.com/watch?v=undefined&t=45672s) Search the left sub tree, by way of


my left child, looking for the same number. And you just assume through the beauty of recursion that
you're kicking the can and let yourself figure it out with a smaller problem. Just that snipped left tree
instead. Else if, the number you're looking for is greater than the tree's own number, go to the right, as
you might infer.

- [2:41:33](https://www.youtube.com/watch?v=undefined&t=45693s) So I can just return the answer to


this question. Search my right sub tree for that same number. And there's a fourth and final condition.
What's the fourth scenario we have to consider, explicitly? Yeah. AUDIENCE: The number. SPEAKER 1: If
the number, itself, is right there. So else if, the number I'm looking for equals the tree's own number,
then and only then, should you return true.

- [2:41:54](https://www.youtube.com/watch?v=undefined&t=45714s) And if you're thinking quickly


here, there's an optimization possible, better design opportunity. Think back to even our scratch days.
What could we do a little better here? You're pointing at it. AUDIENCE: Else. SPEAKER 1: Exactly. An else
suffices. Because if there's logically only 4 things that could happen, you're wasting your time by asking a
fourth gratuitous question.

- [2:42:12](https://www.youtube.com/watch?v=undefined&t=45732s) An else here suffices. So here too,


more so than the Mario example a few weeks ago, there's just this elegance arguably to recursion. And
that's it. This is not pseudocode. This is the code for binary search on a binary search tree. And so,
recursion tends to work in lockstep with these kinds of data structures that have this structure to them
as we're seeing here.

- [2:42:34](https://www.youtube.com/watch?v=undefined&t=45754s) All right. Any questions, then, on


binary search as implemented here with a tree? Yeah. AUDIENCE: About like third years. [INAUDIBLE]
SPEAKER 1: Good question. So when returning a Boolean value, true and false are values that are defined
in a library called Standard Bool, S-T-D-B-O-O-L dot H. With a header file that you can use.

- [2:43:00](https://www.youtube.com/watch?v=undefined&t=45780s) It is the case that true is, it's not


well defined what they are. But they would map indeed, yes. To 0 and 1, essentially. But you should not
compare them explicitly to 0 and 1. When you're using true and false, you should compare them to each
other. AUDIENCE: I meant if it's in a code return.

- [2:43:19](https://www.youtube.com/watch?v=undefined&t=45799s) SPEAKER 1: Oh, sorry. So if I am in


my own code from earlier, a void function, it is totally fine to return. You just can't return something
explicitly. So return just means that's it. Quit out of this function. You're not actually handing back a
value. So it's a way of short circuiting the execution.

- [2:43:37](https://www.youtube.com/watch?v=undefined&t=45817s) If you don't like that, and some


people do frown upon having code return from functions prematurely, you could invert the logic and do
something like this. If the root does not equal null, do all of these things. And then, indent all three of
these lines underneath. That's perfectly fine too. I happen to write it the other way just so that there was
explicitly a base case that I could point to on the screen.

- [2:43:58](https://www.youtube.com/watch?v=undefined&t=45838s) Whereas, now, it's implicitly there


for us only. But a good observation too. All right. So let's ask the question as before about running time
of this. It would look like binary search is back. And we can now do things in logarithmic time, but we
should be careful. Is this a binary search tree? Just to be clear.

- [2:44:19](https://www.youtube.com/watch?v=undefined&t=45859s) And again, a binary search tree is


a tree where the root is greater than its left child and smaller than its right child. That's the essence. So
you're nodding your head. You agree? I agree. So this is a binary search tree. Is this a binary search tree?
[INTERPOSING VOICES] OK. I'm hearing yeses. Or I'm hearing just my delay changing the vote it would
seem.

- [2:44:43](https://www.youtube.com/watch?v=undefined&t=45883s) So this is one of those trick


questions. This is a binary search tree because I've not violated the definition of what I gave you, right. Is
there any example of a left child that is greater than its parent? Or is there any example of a right child
that's smaller than its parent? That's just the opposite way of describing the same thing.

- [2:45:02](https://www.youtube.com/watch?v=undefined&t=45902s) No, this is a binary search tree.


Unfortunately, it also looks like, albeit at a different axis, what? AUDIENCE: A linked list. SPEAKER 1: A
linked list. But you could imagine this happening, right. Suppose that I hadn't been as thoughtful as I was
earlier by inserting 2, and then 1, and then 3. Which nicely balanced everything out.

- [2:45:20](https://www.youtube.com/watch?v=undefined&t=45920s) Suppose that instead, because of


what the user is typing in or whatever you contrive in your own code, suppose you insert a 1, and then a
2, and then a 3. Like, you've created a problem for yourself. Because if we follow the same logic as
before, going left or going right, this is how you might implement a binary search tree accidentally if you
just blindly keep following that definition.

- [2:45:42](https://www.youtube.com/watch?v=undefined&t=45942s) I mean, this would be better


designed as what? If we rotated the whole thing around. And that's totally fine. And those kinds of trees
actually have names. There's trees called AVL trees in computer science. There are red-black trees
in computer science. There are other types of trees that, additionally, add some logic that tell you when
you got to pivot the thing, and rotate it, and snip off the root, and fix things in this way.

- [2:46:04](https://www.youtube.com/watch?v=undefined&t=45964s) But a binary search tree, in and of


itself, does not guarantee that it will be balanced, so to speak. And so, if you consider the worst case
scenario of even using a binary search tree. If you're not smart about the code you're writing and you
just blindly follow this definition, you might accidentally create a crazy, long and stringy binary search
tree that essentially looks like a linked list.

- [2:46:25](https://www.youtube.com/watch?v=undefined&t=45985s) Because you're not even using


any of the left children. So unfortunately, the literal answer to the question here is what's the running
time of search? Well, hopefully, log n. But not if you don't maintain the balance of the tree. Both
insert and search could actually devolve into, instead of big O of log n, literally, big O of n.

- [2:46:44](https://www.youtube.com/watch?v=undefined&t=46004s) If you don't somehow take into


account, and we're not going to do the code for that here. It's a higher level thing you might explore
down the road. It can devolve into something that you might not have intended. And so, now that we're
talking about 2 dimensions, the onus is really on the programmer to consider what kinds of perverse
situations might happen.

- [2:47:02](https://www.youtube.com/watch?v=undefined&t=46022s) Where the thing devolves into a


structure that you don't actually want it to devolve into. All right. We've got just a few structures to go.
Let's go ahead and take one more 5 minute break here. When we come back, we'll talk at this level
about some final applications of this. See you in 5. All right.

- [2:47:18](https://www.youtube.com/watch?v=undefined&t=46038s) So we are back. And as promised,


we'll operate now at this higher level. Where if we take for granted that, even though you haven't had an
opportunity to play with these techniques yet, you have the ability now in code to stitch things together.
Both in one dimension and even 2 dimensions, to build things like lists and trees.

- [2:47:35](https://www.youtube.com/watch?v=undefined&t=46055s) So if we have these building


blocks. Things like now arrays, and lists, and trees, what if we start to amalgamate them such that we
build things out of multiple data structures? Can we start to get some of the best of both worlds by way
of, for instance, something called a hash table. So a hash table is a Swiss army knife of data structures in
that it's so commonly used.
- [2:47:57](https://www.youtube.com/watch?v=undefined&t=46077s) Because it allows you to associate
keys with values, so to speak. So, for instance, it allows you to associate a username with a password. Or a
name with a number. Or anything where you have to take something as input, and get as output a
corresponding piece of information. A hash table is often a data structure of choice.

- [2:48:17](https://www.youtube.com/watch?v=undefined&t=46097s) And here's what it looks like. It


actually looks like an array, at first glance. But for discussion's sake, I've drawn this array vertically, which
is totally fine. It's still just an array. But it allows you, a hash table, to jump to any of these locations
randomly. That is instantly.

- [2:48:32](https://www.youtube.com/watch?v=undefined&t=46112s) So, for instance, there's actually


26 locations in this array. Because I want to, for instance, store initially names of people, for instance.
And wouldn't it be nice if the person's name starts with A, I have a go to place for it. Maybe the first box.
And if it starts with Z, I put them at the bottom.

- [2:48:48](https://www.youtube.com/watch?v=undefined&t=46128s) So that I can jump instantly,


arithmetically, using a little bit of ASCII or Unicode fanciness, exactly to the location where they
need to go. So, for instance, here's our array, 0-indexed. 0 through 25. If I think of this, though, as A
through Z, I'm going to think of these 26 locations, now in the context of a hash table, as what we'll
generally call buckets.

- [2:49:07](https://www.youtube.com/watch?v=undefined&t=46147s) So buckets into which you can put


values. So, for instance, suppose that we want to insert a value, one name into this data structure. And
that name is say, Albus. So Albus starting with A. Albus might go at the very beginning of this list. All
right. And then, we want to insert another name. This one happens to be Zacharias.

- [2:49:25](https://www.youtube.com/watch?v=undefined&t=46165s) Starting with Z, so it goes all the


way at the end of this data structure in location 25 a.k.a. Z. And then, maybe a third name like Hermione,
and that goes at location H according to that position in the alphabet. So this is great because in constant
time, I can insert and conversely search for any of these names, based on the first letter of their name.

- [2:49:45](https://www.youtube.com/watch?v=undefined&t=46185s) A, or Z, or H, in this case. Let's fast


forward and assume we put a whole bunch of other names, which might look familiar, into this hash table. It's
great because every name has its own location. But if you're thinking of names you don't yet see on
the screen, we eventually encounter a problem with this, right.

- [2:50:03](https://www.youtube.com/watch?v=undefined&t=46203s) When could something go wrong


using a hash table like this if we wanted to insert even more names? What's going to eventually happen?
Yeah. There's already someone with the first letter, right. Like I haven't even mentioned Harry, for
instance, or Hagrid. And yet, Hermione's already using that spot.

- [2:50:19](https://www.youtube.com/watch?v=undefined&t=46219s) So that invites the question, well,


what happens? Maybe, if we want to insert Harry next, do we maybe cheat and put him at location I?
But then if there's someone for location I, where do we put them? And it just feels like the situation could very
quickly devolve. But I've deliberately drawn this data structure, that I claim is a hash table, in 2
directions.
- [2:50:37](https://www.youtube.com/watch?v=undefined&t=46237s) An array vertically, here. But what
might this be hinting I'm using horizontally, even though I'm drawing the rectangles a little differently
from before? AUDIENCE: An array. SPEAKER 1: Yeah. Maybe another array, to be fair. But, honestly, arrays
are such a pain with the allocating, and reallocating, and so forth.

- [2:50:52](https://www.youtube.com/watch?v=undefined&t=46252s) These look like the beginnings of


a linked list, if you will. Where the name is where the number used to be, even though I'm drawing it
horizontally now just for discussion's sake. And this seems to be a pointer that isn't pointing anywhere
yet. But it looks like the array is 26 pointers, some of which are null, that is empty.

- [2:51:11](https://www.youtube.com/watch?v=undefined&t=46271s) Some of which are pointing at the


first node in a linked list. So that's really what a hash table might be in your mind. An amalgam of an
array, whose elements are linked lists. And in theory, this gives you the best of both worlds, right. You get
random access with high probability, right. You get to jump immediately to the location you want to put
someone.

- [2:51:30](https://www.youtube.com/watch?v=undefined&t=46290s) But, if you run into this perverse


situation where there's someone already there, OK, fine. It starts to devolve into a linked list, but it's at
least 26 smaller linked lists. Not one massive linked list, which would be Big O of n. And quite slow to
search. So if Harry gets inserted, and Hagrid, yeah, you have to chain them together, so to speak, in this way.

- [2:51:50](https://www.youtube.com/watch?v=undefined&t=46310s) But, at least you've not painted


yourself into a corner. And in fact, if we fast forward and put a whole bunch of familiar names in, the
data structure starts to look like this. So the chains are not terribly long. And some of them are actually of
size 0 because there's just some unpopular letters of the alphabet among these names.

- [2:52:07](https://www.youtube.com/watch?v=undefined&t=46327s) But it seems better than just


putting everyone in one big array, or one big linked list. We're trying to balance these trade offs a little
bit in the middle here. Well, how might we represent something like this? Here's how we could describe
this thing. A node in the context of a linked list could be this.

- [2:52:23](https://www.youtube.com/watch?v=undefined&t=46343s) I have an array called word of


type char. And it's big enough to fit the longest word in the alphabet plus 1. And the plus 1 why,
probably? AUDIENCE: The null. SPEAKER 1: The null character. So I'm assuming that longest word is like a
constant defined elsewhere in the story. And it's something big like 40, 100, whatever.

- [2:52:40](https://www.youtube.com/watch?v=undefined&t=46360s) Whatever the longest word in the


Harry Potter universe is or the English dictionary is. Longest word plus 1 should be sufficient to store any
name in the story here. And then, what else does each of these nodes have? Well, it has a pointer to
another node. So here's how we might implement the notion of a node in the context of storing not
integers, but names.

- [2:53:04](https://www.youtube.com/watch?v=undefined&t=46384s) Instead, like this. But how do we


decide what the hash table itself is? Well, if we now have a definition of a node, we could have a variable
in main, or even globally, called hash table. That itself is an array of node* pointers. That is an array of
pointers to nodes. The beginnings of linked lists. Number of buckets is up to me.
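
The node and the array of node pointers described above could be sketched in C roughly as follows. This is a minimal sketch, not the lecture's own code: the value 40 for `LONGEST_WORD` is an assumed placeholder, and the helper names `insert` and `contains` are hypothetical. Both helpers take the bucket number as a parameter, since how to compute it from a name is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LONGEST_WORD 40        // assumed cap on name length ("something big")
#define NUMBER_OF_BUCKETS 26   // one bucket per letter, A through Z

// One node in a chain: a name plus a pointer to the next node.
typedef struct node
{
    char word[LONGEST_WORD + 1];   // the + 1 leaves room for the terminating '\0'
    struct node *next;
} node;

// The hash table: an array of pointers to the first node of each linked list.
// As a global, every element starts out NULL, that is, an empty chain.
node *table[NUMBER_OF_BUCKETS];

// Prepend a name to the chain in the given bucket. Returns false on failure.
bool insert(unsigned int bucket, const char *name)
{
    assert(bucket < NUMBER_OF_BUCKETS && name != NULL);
    if (strlen(name) > LONGEST_WORD)
    {
        return false;
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, name);
    n->next = table[bucket];   // new node points at the old start of the chain
    table[bucket] = n;         // and becomes the new start
    return true;
}

// Walk only the one chain in the given bucket, looking for the name.
bool contains(unsigned int bucket, const char *name)
{
    assert(bucket < NUMBER_OF_BUCKETS && name != NULL);
    for (node *n = table[bucket]; n != NULL; n = n->next)
    {
        if (strcmp(n->word, name) == 0)
        {
            return true;
        }
    }
    return false;
}
```

With this layout, `insert(7, "Hermione")` followed by `insert(7, "Harry")` leaves both names chained together in bucket 7 (the H bucket) rather than overwriting one another. Freeing the nodes is left out of the sketch.
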
- [2:53:26](https://www.youtube.com/watch?v=undefined&t=46406s) I proposed, verbally, that it be 26.
But honestly, if you get a lot of collisions, so to speak. A lot of H names trying to go to the same place.
Well, maybe, we need to be smarter and not just look at the first letter of their name. But, maybe, the
first and the second. So it's H-A and H-E. But wait, no, then Harry and Hagrid still collide.

- [2:53:42](https://www.youtube.com/watch?v=undefined&t=46422s) But we start to at least make the


problem a little less impactful by tinkering with something like the number of buckets in a hash table like
this. But how do we decide where someone goes in a hash table in this way? Well, it's an old school
problem of input and output. The input to the problem is going to be something like the name.

- [2:54:01](https://www.youtube.com/watch?v=undefined&t=46441s) And the algorithm in the middle,


as of today, is going to be something called a hash function. A hash function is generally something that
takes as input, a string, a number, whatever, and produces as output a location in our context. Like a
number 0 through 25. Or 0 through 16,000. Or whatever the number of buckets you want is, it's going to
just tell you where to put that input at a specific location.

- [2:54:24](https://www.youtube.com/watch?v=undefined&t=46464s) So, for instance, Albus, according


to the story thus far, gave me back 0 as output. Zacharias gave me 25. So the hash function, in the
middle of that black box, is pretty simplistic in this story. It's just looking at the ASCII value, it seems, of
the first letter in their name. And then, subtracting off what capital A is, 65.

- [2:54:43](https://www.youtube.com/watch?v=undefined&t=46483s) So like doing some math to get


back a number between 0 and 25. So that's how we got to this point in the story. And how might we,
then, resolve the problem further and use this notion of hashing more generally? Well, just for
demonstration's sake here, here's actually some buckets, literally. And we've labeled, in advance, these
buckets with the suits from a deck of cards.

- [2:55:05](https://www.youtube.com/watch?v=undefined&t=46505s) So we've got some spades. And


we've got diamonds here. And we've got, what else here? Clubs and hearts. So we have a deck of cards
here, for instance, right. And this is something you, yourself, might do instinctively if you're getting ready
to start playing a game of cards. You're just cleaning up or you want things in order.

- [2:55:29](https://www.youtube.com/watch?v=undefined&t=46529s) Like, here is literally a jumbo deck


of cards. What would be the easiest way for me to sort these things? Well we've got a whole bunch of
sorting algorithms from the past. So I could go through like, here's the 3 of diamonds. And I could, here
let me throw this up on the screen. Just so, if you're far in back.

- [2:55:43](https://www.youtube.com/watch?v=undefined&t=46543s) So here's diamonds. I could put


this here. 3, 4. I could do this in order here. But a lot of us, honestly, if given a deck of cards. And you just
want to clean it up and sort it in order, you might do things like this. Well here's my input, 3 of diamonds,
let's put it in this bucket. 4 of diamonds, this bucket.

- [2:56:01](https://www.youtube.com/watch?v=undefined&t=46561s) 5 of diamonds, this bucket. And if


you keep going through the cards, here's seven of hearts, hearts bucket. 8's bucket. Queen of spades
over here. And it's still going to take you 52 steps. But at the end of it, you have hashed all of the cards
into 4 distinct buckets. And now you have problems of size 13, which is a little more tenable than doing
one massive 52 card problem.
- [2:56:24](https://www.youtube.com/watch?v=undefined&t=46584s) You can now do 4 problems of
size 13. And so hashing is something that even you and I might do instinctively. Taking as input some
card, some name, and producing as output some location. A temporary pile in which you want to stage
things, so to speak. But these collisions are inevitable. And honestly, if we kept going through the Harry
Potter universe, some of these chains would get longer, and longer and longer.
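
The card demonstration can be written down in the same terms: the suit plays the role of the hash value, and each card is dropped into its suit's bucket in one pass. The sketch below is only an illustration; the type and function names (`card`, `suit`, `deal_into_buckets`) are made up here, and cards within a bucket stay in the order they arrived.

```c
#include <assert.h>
#include <stddef.h>

#define SUITS 4
#define CARDS_PER_SUIT 13

typedef enum { CLUBS, DIAMONDS, HEARTS, SPADES } suit;

typedef struct
{
    suit s;
    int rank;   // 1 (ace) through 13 (king)
} card;

// Hash every card into one of 4 buckets by suit.
// buckets[s] receives the cards of suit s; counts[s] says how many landed there.
void deal_into_buckets(const card deck[], size_t n,
                       card buckets[SUITS][CARDS_PER_SUIT], size_t counts[SUITS])
{
    for (int s = 0; s < SUITS; s++)
    {
        counts[s] = 0;
    }
    for (size_t i = 0; i < n; i++)   // one step per card: 52 steps for a full deck
    {
        suit s = deck[i].s;          // the "hash" of a card is simply its suit
        assert(counts[s] < CARDS_PER_SUIT);
        buckets[s][counts[s]] = deck[i];
        counts[s]++;
    }
}
```

After one pass over a full deck, each of the 4 buckets holds 13 cards, which is the set of smaller problems described above.
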

- [2:56:47](https://www.youtube.com/watch?v=undefined&t=46607s) Which means that instead of


getting someone's name quickly, searching for them or inserting them might start taking a decent
amount of time. So what could we do instead to resolve situations like this? If the problem,
fundamentally, is that the first letter is just too darn popular, H, we need to take in more input.

- [2:57:05](https://www.youtube.com/watch?v=undefined&t=46625s) Not just the first letter but maybe


the first 2 letters. So if we do that, we can go from A through Z to something more extreme like maybe H-
A, H-B, H-C, H-D, H-E, H-F, and so forth. So that now Harry and Hermione end up at different locations. But,
darn it, Hagrid still collides with Harry. So it's better than before.

- [2:57:25](https://www.youtube.com/watch?v=undefined&t=46645s) The chains aren't quite as long.


But the problem isn't fundamentally gone. And in this case here, anyone know how many buckets we
just increased to, if we now look at not just A through Z but AA through ZZ, roughly? AUDIENCE: 26
squared. SPEAKER 1: Yeah. OK, good. So the easy answer is 26 squared, or 676.

- [2:57:46](https://www.youtube.com/watch?v=undefined&t=46666s) So that's a lot more buckets. And


this is why I only showed a few of them on the screen. So that's a lot more. And it spreads things out in
particular. What if we take this one step further? Instead of H-A, we do like H-A-A, H-A-B, H-A-C, H-Z-Z,
and so forth. Well now, we have an even better situation.

- [2:58:04](https://www.youtube.com/watch?v=undefined&t=46684s) Because Hermione has her one


spot. Harry has his one spot. Hagrid has his one spot. But there's a trade off here. The upside is now,
arithmetically, we can find their locations in constant time. Maybe, technically 3 steps. But 3 is constant,
no matter how many other names are in here, it would seem. But what's the downside here? Sorry, say
again.

- [2:58:25](https://www.youtube.com/watch?v=undefined&t=46705s) AUDIENCE: Memory. SPEAKER 1:


Memory. So significantly more. We're now up to 17,576 buckets, which itself isn't that big a deal, right.
Computers have a lot of memory these days. But as you can infer, I can't really think of someone whose
name started with H-E-Q, for instance, in the Harry Potter universe.

- [2:58:44](https://www.youtube.com/watch?v=undefined&t=46724s) And if we keep going, definitely


don't know of anyone whose name started with Z-Z-Z or A-A-A. There's a lot of not useful combinations
that have to be there mathematically, so that you can do a bit of math and jump randomly, so to
speak, to the precise location. But they're just going to be empty.

- [2:59:01](https://www.youtube.com/watch?v=undefined&t=46741s) So it's a very sparsely populated


array, so to speak. So what does that really mean for performance, ultimately? Well let's consider, again,
in the context of our Big O notation. It turns out that a hash table, technically speaking, is still just going
to give us Big O of n in the worst case. Why? If you have some crazy perverse case where everyone in the
universe has a name that starts with A, or starts with H, or starts with Z, you just get really unlucky.
- [2:59:27](https://www.youtube.com/watch?v=undefined&t=46767s) And your chain is massively long.
Well then, at that point, it's just a linked list. It's not a hash table. It's like the perverse situation with the
tree, where if you insert without any mind for keeping it balanced, it just devolves. But there's a
difference here between a theoretical performance and an actual performance.

- [2:59:46](https://www.youtube.com/watch?v=undefined&t=46786s) If you look back at the hash


table here, this is absolutely, in practice, going to be faster than a single linked list. Mathematically,
asymptotically, big O notation, sure. It's all the same. Big O of n. But if what we're really caring about is
real humans using our software, there's something to be said for crafting a data structure.

- [3:00:06](https://www.youtube.com/watch?v=undefined&t=46806s) That technically, if this data were


uniformly distributed, is 26 times faster than a linked list alone. And so, there's this tension too between
systems types of CS and theoretical CS. Where yeah, theoretically, these are all the same. But in
practice, for making real-world software, improving this speed by a factor of 26 in this case, let alone 676
or more, might actually make a big difference.
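
One way to make the factor-of-26 claim concrete, assuming $n$ names spread uniformly over $k$ buckets (both symbols are introduced here only for illustration):

$$
\text{expected chain length} = \frac{n}{k},
\qquad
O\!\left(\frac{n}{k}\right) = O(n) \quad \text{for any constant } k .
$$

So with $k = 26$ a search walks roughly $n/26$ nodes instead of $n$, and with $k = 676$ roughly $n/676$. The asymptotic class is unchanged because $k$ is a constant, but the actual number of steps shrinks by that constant factor, provided the names really are spread evenly.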

- [3:00:32](https://www.youtube.com/watch?v=undefined&t=46832s) But there's going to be a trade off.


And that's typically some other resource like giving up more space. All right. How about another data
structure we could build. Let me fast forward to something here called a trie. So a trie, a weird name in
pronunciation. Short for retrieval, pronounced trie typically.

- [3:00:49](https://www.youtube.com/watch?v=undefined&t=46849s) A trie is a tree that actually gives


us constant time lookup, even for massive data sets. What do I mean by this? In the world of a trie, you
create a tree out of arrays. So we're really getting into the Frankenstein territory of just building things
up with spare parts of data structures that we have here.

- [3:01:11](https://www.youtube.com/watch?v=undefined&t=46871s) But the root of a trie is, itself, an


array. For instance, of size 26. Where each element in that array points to another node, which is to say
another array. And each of those locations in the array represents a letter of the alphabet like A through
Z. So for instance, if you wanted to store the names of the Harry Potter universe, not in a hash table, not
in a linked list, not in a tree, but in a trie.

- [3:01:37](https://www.youtube.com/watch?v=undefined&t=46897s) What you would do is hash on


every letter in the person's name one at a time. So a trie is like a multi-tier hash table, in a sense. Where
you first look at the first letter, then the second letter, then the third, and you do the following. For
instance, each of these locations represents a letter A through Z.

- [3:01:57](https://www.youtube.com/watch?v=undefined&t=46917s) Suppose I wanted to insert


someone's name into this that starts with the letter H, like Hagrid for instance. Well, I go to the location
H. I see it's null, which means I need to malloc myself another node or another array. And that's depicted
here. Then, suppose I want to store the second letter in Hagrid's name, an A. So I go to that location in
the second node.

- [3:02:15](https://www.youtube.com/watch?v=undefined&t=46935s) And I see, OK, it's currently null.


There's nothing below it. So I allocate another node using malloc or the like. And now I have H-A-G. And I
continue this with R-I-D. And then, when I get to the bottom of this person's name, I just have to indicate
here in color, but probably with a Boolean value or something.
- [3:02:32](https://www.youtube.com/watch?v=undefined&t=46952s) Like a true value that says, a name
stops here. So that it's clear that the person's name is not H-A, or H-A-G, or H-A-G-R, or H-A-G-R-I. It's H-
A-G-R-I-D. And the D is green, just to indicate there's like some other Boolean value that just says, yes.
This is the node in which the name stops.

- [3:02:53](https://www.youtube.com/watch?v=undefined&t=46973s) And if I continue this logic, here's


how I might insert someone like Harry. And here's how I might insert someone like Hermione. And
what's interesting about the design here is that some of these names share a common prefix. Which
starts to get compelling because you're reusing space. You're using the same nodes for names like H-A-G
and H-A-R because they share H and an A in common.

- [3:03:18](https://www.youtube.com/watch?v=undefined&t=46998s) And they all share an H in


common. So you have this data structure now that, itself, is a tree. Each node in the tree is, itself, an
array. And we, therefore, might implement this thing using code like this. Every node is containing, I'll do
it in reverse order, an array. I'll call it children because that's what it really represents.

- [3:03:39](https://www.youtube.com/watch?v=undefined&t=47019s) Up to 26 children for each of


these nodes. Size of the alphabet. So I might have used just a constant for number 26, to give myself 26
letters of the alphabet. And each of those arrays stores that many node stars. That many pointers to
another node. And here's an example of the Bool. This is what I represented in green on the slide a
moment ago.

- [3:03:58](https://www.youtube.com/watch?v=undefined&t=47038s) I also need another piece of data.


Just a 0 or 1, a true or false, that says yes. A name stops in this node or it's just a path to the rest of the
person's name. But the upside of this is that the height of this tree is only as tall as the person's longest
name. H-A-G-R-I-D or H-E-R-M-I-O-N-E.
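
A trie node along these lines, together with insertion and lookup, might be sketched as follows. The field and function names (`is_word`, `children`, `new_node`, `insert`, `search`) and the constant `SIZE_OF_ALPHABET` are illustrative choices, not necessarily those on the slide, and the sketch assumes names made only of ASCII letters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define SIZE_OF_ALPHABET 26

typedef struct node
{
    bool is_word;                              // true if a name stops at this node
    struct node *children[SIZE_OF_ALPHABET];   // one pointer per letter; NULL if unused
} node;

// Allocate a node with is_word false and all 26 children NULL.
node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n != NULL)
    {
        n->is_word = false;
        for (int i = 0; i < SIZE_OF_ALPHABET; i++)
        {
            n->children[i] = NULL;
        }
    }
    return n;
}

// Insert a name one letter at a time, allocating nodes where the path is missing.
bool insert(node *root, const char *name)
{
    assert(root != NULL && name != NULL);
    node *cursor = root;
    for (int i = 0; name[i] != '\0'; i++)
    {
        assert(isalpha((unsigned char) name[i]));
        int letter = toupper((unsigned char) name[i]) - 'A';
        if (cursor->children[letter] == NULL)
        {
            cursor->children[letter] = new_node();
            if (cursor->children[letter] == NULL)
            {
                return false;   // out of memory
            }
        }
        cursor = cursor->children[letter];
    }
    cursor->is_word = true;   // mark that a name stops here
    return true;
}

// Follow the name's letters down the trie: one step per letter, no matter
// how many other names are stored.
bool search(const node *root, const char *name)
{
    assert(root != NULL && name != NULL);
    const node *cursor = root;
    for (int i = 0; name[i] != '\0'; i++)
    {
        if (!isalpha((unsigned char) name[i]))
        {
            return false;
        }
        cursor = cursor->children[toupper((unsigned char) name[i]) - 'A'];
        if (cursor == NULL)
        {
            return false;
        }
    }
    return cursor->is_word;
}
```

After inserting Hagrid, Harry, and Hermione, a search for "Hag" follows an existing path but returns false, because no name stops at that node. Freeing the trie is omitted from the sketch.
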

- [3:04:22](https://www.youtube.com/watch?v=undefined&t=47062s) And notice that no matter how


many other people are in this data structure, there's 3 at the moment, if there were 3 million, it would
still take me how many steps to search for Hermione? H-E-R-M-I-O-N-E. So, 8 steps total. No matter if
there's 2 other people, 2 million, 10 million other people. Because the path to her name is always on the
same path.

- [3:04:46](https://www.youtube.com/watch?v=undefined&t=47086s) And if you assume that there's a


maximum limit on the length of names in the human world. Maybe it's 40, 100, whatever. Whatever the
longest name in the world is, that's constant. Which is to say that
with a trie, technically speaking, it is the case that your lookup time, in big O notation, would
be big O of 1.

- [3:05:09](https://www.youtube.com/watch?v=undefined&t=47109s) It's constant time, because unlike


every other data structure we've looked at, with a trie, the amount of time it takes you to find one
person or insert one person is completely independent of how many other pieces of data are already in
the data structure. And this holds true even if one name is a prefix of another.

- [3:05:27](https://www.youtube.com/watch?v=undefined&t=47127s) I don't think there was a Daniel or


Danielle in the Harry Potter universe that I could think of. But, D-A-N-I-E-L could be one name. And,
therefore, we have a true there in green. And if there's a longer name like Danielle. Then, you keep going
until you get to the E. So you can still have with a trie, one name that's a substring of another name.
- [3:05:47](https://www.youtube.com/watch?v=undefined&t=47147s) So it's not as though we've
created a problem there. That, too, is still possible. But at the end of the day, it only takes a finite
number of steps to find any of these people. And again, that's what's particularly compelling. That you
effectively have constant time lookup. So that's amazing, right.

- [3:06:02](https://www.youtube.com/watch?v=undefined&t=47162s) We've gone through this whole


story for weeks now of like, linear time. And then, it went up to n squared. And then, log n. And now
constant time, what's the price paid for a data structure like this? This so-called trie? What's the
downside here? There's got to be a catch. And in fact, tries are not actually used that often, amazing as
they might sound on some CS level here.

- [3:06:25](https://www.youtube.com/watch?v=undefined&t=47185s) AUDIENCE: Memory. SPEAKER 1:


Memory. In what sense? AUDIENCE: Much like a [INAUDIBLE]. SPEAKER 1: Exactly. If you're storing all of
these darn arrays it's, again, a sparsely populated data structure. And you can see it here. Granted
there's only 3 names, but most of those boxes, most of those pointers, are going to remain null.

- [3:06:43](https://www.youtube.com/watch?v=undefined&t=47203s) So this is an incredibly wide data


structure, if you will. It uses a huge amount of memory to store the names. But again, you've got to pick
a lane. Either you're going to minimize space or you're going to minimize time. It's not really possible to
get truly the best of both worlds. You have to decide where the inflection point is for the device you're
writing software for, how much memory it has, how expensive it is.

- [3:07:03](https://www.youtube.com/watch?v=undefined&t=47223s) And again, taking all of these


things into account. So lastly, let's do one further abstraction. So even higher level to discuss things
that are generally known as abstract data structures. It turns out we could spend like all day, all week,
talking about different things we could build with these data structures.

- [3:07:19](https://www.youtube.com/watch?v=undefined&t=47239s) But for the most part, now that


we have arrays. Now that we have linked lists or their cousins, trees, which are 2-dimensional. And
beyond that, there's even graphs, where the arrows can go in multiple directions, not just down, so to
speak. Now that we have this ability to stitch things together, we can solve all different types of
problems.

- [3:07:34](https://www.youtube.com/watch?v=undefined&t=47254s) So, for instance, a very common


type of data structure to use in a program, or even our human world, are things called queues. A queue
being a data structure like a line outside of a store. Where it has what's called a FIFO property. First In,
First Out. Which is great for fairness, at least in the human world.

- [3:07:52](https://www.youtube.com/watch?v=undefined&t=47272s) And if you've ever waited outside


of Tasty Burger, or Salsa Fresca, or some other restaurant nearby, presumably, if you're queuing up at the
counter, you want the store to maintain a FIFO system. First in and first out. So that whoever's first in
line gets their food first and gets out first.

- [3:08:09](https://www.youtube.com/watch?v=undefined&t=47289s) So a queue is actually a computer


science term, too. And even if you're still in the habit of printing things on paper, there are things you
might have heard called printer queues, which also do things in order. The first person to send their
essay to the printer should, ideally, be printed before the last person to send their essay to the printer.
- [3:08:26](https://www.youtube.com/watch?v=undefined&t=47306s) Again, in the interest of fairness.
But how can you implement a queue? Well, you typically have to implement 2 fundamental operations,
enqueue and dequeue. So adding something to it and removing something from it. And the interesting
thing here is that how do you implement a queue? Well in the human world, you would just have
literally physical space for humans to line up from left to right, or right to left.

- [3:08:47](https://www.youtube.com/watch?v=undefined&t=47327s) Same in a computer. Like a printer


queue, if you send a whole bunch of jobs to be printed, a whole bunch of essays or documents, well, you
need a chunk of memory like an array. All right. Well, if you use an array, what's a problem that could
happen in the world of printing, for instance? If you use an array to store all of the documents that need
to be printed.

- [3:09:05](https://www.youtube.com/watch?v=undefined&t=47345s) AUDIENCE: It can be filled.


SPEAKER 1: It could be filled, right. So if the programmer decided, HP or whoever makes the printer
decides, oh, you can send like a megabyte worth of documents to this printer at once. At some point you
might get an error message, which says, sorry out of memory. Wait a few minutes.

- [3:09:18](https://www.youtube.com/watch?v=undefined&t=47358s) Which is maybe a reasonable


solution, but a little annoying. Or HP could write code that maybe dynamically resizes the array or so forth.
But at that point, maybe they should just use a linked list. And they could. So there, too, you could
implement the notion of a queue using a linked list instead. You're going to spend more memory, but
you're not going to run out of space in your array.
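
A queue built on a linked list, with the two operations named above, could look like the sketch below. Print jobs are represented simply as integers, and the `queue` struct with `head` and `tail` pointers is an assumption of this illustration rather than something shown in the lecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct node
{
    int job;             // stand-in for a document waiting to be printed
    struct node *next;
} node;

typedef struct
{
    node *head;   // front of the line: the next job to be dequeued
    node *tail;   // back of the line: where new jobs are enqueued
} queue;

// Add a job at the back of the line. Returns false only if malloc fails.
bool enqueue(queue *q, int job)
{
    assert(q != NULL);
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    n->job = job;
    n->next = NULL;
    if (q->tail == NULL)
    {
        q->head = n;         // queue was empty, so n is also the front
    }
    else
    {
        q->tail->next = n;
    }
    q->tail = n;
    return true;
}

// Remove the job at the front of the line (first in, first out).
// Returns false if the queue is empty.
bool dequeue(queue *q, int *job)
{
    assert(q != NULL && job != NULL);
    if (q->head == NULL)
    {
        return false;
    }
    node *n = q->head;
    *job = n->job;
    q->head = n->next;
    if (q->head == NULL)
    {
        q->tail = NULL;      // queue is now empty
    }
    free(n);
    return true;
}
```

Keeping a `tail` pointer lets both operations run in constant time. The cost, as noted above, is one extra pointer of memory per job compared with an array.
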

- [3:09:38](https://www.youtube.com/watch?v=undefined&t=47378s) Which might be more compelling.


This happens even in the physical world. You go to the store and you start having to line up outside and
down the road. And like, for a really busy store, they run out of space so they make do. But in that case,
it tends to be more of an array just because of the physical notion of humans lining up.

- [3:09:54](https://www.youtube.com/watch?v=undefined&t=47394s) But there's other data structures,


too. If you've ever gone to the dining hall and picked up like a Harvard or Yale tray, you're typically
picking up the last tray that was just cleaned, not the first tray that was cleaned. Why? Because these
cafeteria trays stack up on top of each other. And indeed a stack is another type of abstract data
structure.

- [3:10:14](https://www.youtube.com/watch?v=undefined&t=47414s) In the physical world, it's literally


something physical like a stack of trays. Which have what we would call a LIFO property. Last In, First Out.
So as these things come out of the washer, they're putting the most recent ones on the top. And then
you, the human, are probably taking the most recently cleaned one.

- [3:10:31](https://www.youtube.com/watch?v=undefined&t=47431s) Which means in the extreme, no


one on campus might ever use that very first tray. Which is probably fine in the world of trays, but would
really be bad in the world of Tasty Burger lining up for food if LIFO were the property being
implemented. But here, too, it could be an array. It could be a linked list.

- [3:10:47](https://www.youtube.com/watch?v=undefined&t=47447s) And you see this, honestly, every


day. If you're using Gmail and your Gmail inbox. That is actually a stack, at least by default, where your
newest messages, last in, are the first ones at the top of the screen. That's a LIFO data structure. And it
means that you see your most recent emails. But if you have a busy day, you're getting a lot of emails, it
might not be a good thing.

- [3:11:06](https://www.youtube.com/watch?v=undefined&t=47466s) Because now you're ignoring the


people who wrote you way earlier in the day or the week. So LIFO and FIFO are just properties that you
can achieve with these very specific types of data structures. And the parlance in the world of stacks
is to push something onto a stack or pop something out. These are here, for instance, as an example of
why might you always wear the same color.

- [3:11:25](https://www.youtube.com/watch?v=undefined&t=47485s) Well, if you're storing all of your


clothes in a stack, you might not ever get to the different colored clothes at the bottom of the list. And in
fact, to paint this picture, we have a couple-minute video here. Just to paint this here, made by a
faculty member elsewhere. Let's go ahead and dim the lights for just a minute or 2 here.

- [3:11:41](https://www.youtube.com/watch?v=undefined&t=47501s) So that we can take a look at Jack


learning some facts. [VIDEO PLAYING] SPEAKER 2: Once upon a time, there was a guy named Jack. When
it came to making friends Jack did not have the knack. So Jack went to talk to the most popular guy he
knew. He went up to Lou and asked, what do I do? Lou saw that his friend was really distressed.

- [3:12:00](https://www.youtube.com/watch?v=undefined&t=47520s) Well, Lou began, just look how


you're dressed. Don't you have any clothes with a different look? Yes, said Jack. I sure do. Come to my
house and I'll show them to you. So they went off to Jack's. And Jack showed Lou the box, where he
kept all his shirts, and his pants, and his socks. Lou said, I see you have all your clothes in a pile.

- [3:12:19](https://www.youtube.com/watch?v=undefined&t=47539s) Why don't you wear some others


once in a while? Jack said, well, when I remove clothes and socks, I wash them and put them away in the
box. Then comes the next morning and up I hop. I go to the box and get my clothes off the top. Lou
quickly realized the problem with Jack. He kept clothes, CDs, and books in a stack.

- [3:12:39](https://www.youtube.com/watch?v=undefined&t=47559s) When he'd reached for something


to read or to wear, he chose the top book or underwear. Then when he was done he would put it right
back. Back it would go on top of the stack. I know the solution, said a triumphant Lou. You need to learn
to start using a queue. Lou took Jack's clothes and hung them in a closet.

- [3:12:57](https://www.youtube.com/watch?v=undefined&t=47577s) And when he had emptied the


box, he just tossed it. Then he said, now Jack, at the end of the day, put your clothes on the left when
you put them away. Then tomorrow morning when you see the sunshine, get your clothes from the
right, from the end of the line. Don't you see, said Lou, it will be so nice.

- [3:13:13](https://www.youtube.com/watch?v=undefined&t=47593s) You'll wear everything once


before you wear something twice. And with everything in queues in his closet and shelf, Jack started to
feel quite sure of himself. All thanks to Lou and his wonderful queue. SPEAKER 1: So just to help you
realize that these things are everywhere. [AUDIENCE CLAPPING] Even in our human world.
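
The difference between Jack's stack and Lou's queue can be sketched in a few lines of Python. This is only an illustration: the shirt colors are made up, and using a plain list for the stack and `collections.deque` for the queue is one reasonable choice, not the only one.

```python
from collections import deque

# A stack is last in, first out (LIFO): push and pop happen at the same end.
stack = []
for shirt in ["blue", "black", "red"]:
    stack.append(shirt)           # push onto the top of the pile
top_of_stack = stack.pop()        # pop removes the most recently pushed item

# A queue is first in, first out (FIFO): add at one end, remove from the other.
queue = deque()
for shirt in ["blue", "black", "red"]:
    queue.append(shirt)           # enqueue at the back of the line
front_of_queue = queue.popleft()  # dequeue removes the oldest item

print(top_of_stack)    # red
print(front_of_queue)  # blue
```

With the stack, the shirt put away last is worn first, so the ones at the bottom may never come out. With the queue, everything gets worn once before anything is worn twice.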

- [3:13:34](https://www.youtube.com/watch?v=undefined&t=47614s) If you've ever lined up at this


place. Anyone recognize this? OK, so sweetgreen, little salad place in the square. Here, if you order
online or in advance, your food ends up on a shelf according to the first letter in your name. Which actually sounds
awfully reminiscent of something like a hash table. And in fact, no matter whether you implement a hash
table like we did, with an array and linked list.

- [3:13:53](https://www.youtube.com/watch?v=undefined&t=47633s) Or with 3 shelves like this. This is


actually an abstract data type called a dictionary. And a dictionary, just like in our human world, has keys
and values. Words and their definitions. This just has letters of the alphabet and salads as their value.
But here, too, there's a real world constraint. In what kind of scenario does this system at sweetgreen
devolve into a problem, for instance? Because they, too, are using only finite space, finite storage.

- [3:14:20](https://www.youtube.com/watch?v=undefined&t=47660s) What could go wrong? Yeah.


AUDIENCE: Run out of space. SPEAKER 1: Yeah. If they run out of space on the shelf and there's a lot of
people whose names start with D, or E, or whatever. And so, they just pile up. And then, maybe, they
kind of overflow into the E's or the F's. And they probably don't really care because any human is going
to come by, and just eyeball it, and figure it out anyway.
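
As a rough sketch of the shelf idea, here is a Python dictionary whose keys are first letters and whose values are the orders waiting under that letter. The names and salads are hypothetical, chosen only so that two of them collide on the same letter.

```python
# Hypothetical orders: (customer name, salad). Not taken from the lecture.
orders = [("David", "caesar"), ("Carter", "cobb"), ("Doug", "greek")]

# Key: first letter of the name. Value: list of salads on that shelf.
shelves = {}
for name, salad in orders:
    letter = name[0].upper()
    # setdefault creates an empty shelf the first time a letter is seen.
    shelves.setdefault(letter, []).append(salad)

print(shelves["D"])          # ['caesar', 'greek']
print(shelves.get("Z", []))  # [] because nobody's name starts with Z
```

Two names starting with D land on the same shelf, which is the pile-up described above. A list per key can keep growing, whereas a physical shelf cannot.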

- [3:14:34](https://www.youtube.com/watch?v=undefined&t=47674s) But in the world of a computer,


you're the one coding and have to be ever so precise. We thought we would lastly do one final thing
here. In advance, we prepared a linked list of sorts in the audience. Since this has become a bit of a
thing. I am starting to represent the beginning of this linked list,

- [3:14:50](https://www.youtube.com/watch?v=undefined&t=47690s) insofar as I have a pointer here


with seat location G9. Whoever is in G9, would you mind standing up? And what letter is on your sheet
there? AUDIENCE: F15. SPEAKER 1: OK, so you have S15 and your letter-- AUDIENCE: F15. SPEAKER 1: Say
again? AUDIENCE: F. SPEAKER 1: F15. So I see you're holding a C in your node.

- [3:15:09](https://www.youtube.com/watch?v=undefined&t=47709s) You are pointing to, if you could


physically, F15. F15, what do you hold? AUDIENCE: S. SPEAKER 1: You have an S. And who should you be
pointing at? AUDIENCE: F5. SPEAKER 1: F5. Could you stand up, F5. You're holding a 5, I see. What
address? AUDIENCE: F12. SPEAKER 1: F12. Big finale. F12, if you'd like to stand up holding a 0 and null,
which means that was CS50.
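
The audience demonstration can be replayed as a small sketch in Python. Seat labels stand in for memory addresses and `None` stands in for null. The dictionary layout is just one convenient way to model it, not how a linked list would be built in C.

```python
# Each seat holds a value and the seat ("address") of the next node.
nodes = {
    "G9": ("C", "F15"),
    "F15": ("S", "F5"),
    "F5": ("5", "F12"),
    "F12": ("0", None),  # None plays the role of null: end of the list
}

seat = "G9"  # the pointer to the first node
letters = []
while seat is not None:
    value, next_seat = nodes[seat]
    letters.append(value)
    seat = next_seat  # follow the pointer to the next node

message = "".join(letters)
print(message)  # CS50
```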

- [3:15:31](https://www.youtube.com/watch?v=undefined&t=47731s) [AUDIENCE CLAPPING] All right.


We'll see you next time. [MUSIC PLAYING] DAVID J. MALAN: All right, this is CS50, and this is already
week 6.

- [3:16:58](https://www.youtube.com/watch?v=undefined&t=47818s) And this is the week in which you


learn yet another language. But the goal is not just to teach you another language, for language's sake, as
we transition today and in the coming weeks from C, where we've spent the past several weeks, now to
Python. The goal ultimately is to teach you all how to teach yourselves new languages, so that by the end
of this course, it's not in your mind, the fact that you learned how to program in C or learned some
weeks back how to program in Scratch, but really how you learned how to program fundamentally,

- [3:17:25](https://www.youtube.com/watch?v=undefined&t=47845s) in a paradigm known as


procedural programming, as well as with some taste today, and in the weeks to come, of other aspects
of programming languages, like object-oriented programming, and more. So recall, though, back in week
zero, Hello, world looked a little something like this. And the world was quite simple.
- [3:17:40](https://www.youtube.com/watch?v=undefined&t=47860s) All you had to do was drag and
drop these puzzle pieces. But there were still functions and conditionals and loops and variables and all
of those kinds of primitives. We then transitioned, of course, to a much more arcane language that
looked a little something like this. And even now, some weeks later, you might still be struggling with
some of the syntax or getting annoying bugs when you try to compile your code, and it just doesn't work.

- [3:17:59](https://www.youtube.com/watch?v=undefined&t=47879s) But there, too, the past few


weeks, we've been focusing on functions and loops and variables, conditionals, and really all of those
same ideas. And so what we begin to do today is to, one, simplify the language we're using, transitioning
from C now to Python, this now being the equivalent program in Python, and look at its relative
simplicity, but also transitioning to look at how you can implement these same kinds of features, just
using a different language.

- [3:18:24](https://www.youtube.com/watch?v=undefined&t=47904s) So we're going to see a lot of code


today. And you won't have nearly as much practice with Python as you did with C. But that's because so
many of the ideas are still going to be with us. And, really, it's going to be a process of figuring out, all
right, I want to do a loop. I know how to do it in C.

- [3:18:38](https://www.youtube.com/watch?v=undefined&t=47918s) How do I do this in Python? How


do I do the same with conditionals? How do I declare variables, and the like, and moving forward, not
just in CS50, but in life in general, if you continue programming and learn some other language after the
class, if in 5-10 years, there's a new, more popular language that you pick up, it's just going to be a
matter of googling and looking at websites like Stack Overflow and the like, to look at just basic building
blocks of programming languages, because you already speak, after these past 6 plus weeks,

- [3:19:01](https://www.youtube.com/watch?v=undefined&t=47941s) you already speak programming


itself fundamentally. All right, so let's do a few quick comparisons, left and right, of what something
might have looked like in Scratch, and what it then looked like in C, but now, as of today, what it's going
to look like in Python. Then we'll turn our attention to the command line, ultimately, in order to
implement some actual programs.

- [3:19:19](https://www.youtube.com/watch?v=undefined&t=47959s) So in Scratch, we had functions


like this, say Hello, world, a verb or an action. In C it looked a little something like this, and a bit of a
cryptic mess the first week, you had the printf, you had the double quotes. You had the semicolon, the
parentheses. So there's a lot more syntax just to do the same thing.

- [3:19:35](https://www.youtube.com/watch?v=undefined&t=47975s) We're not going to get rid of all of


that syntax now, but as of today, in Python, that same statement is going to look a little something like
this. And just to perhaps call out the obvious, what is different or, now, simpler in Python versus C, even
in this simple example here? Yeah. AUDIENCE: Now print, instead of printf would be, something like that.

- [3:19:54](https://www.youtube.com/watch?v=undefined&t=47994s) DAVID J. MALAN: Good, so it's


now print instead of printf. And there's also no semicolon. And there's one other subtlety, over here.
AUDIENCE: No new line. DAVID J. MALAN: Yeah, so no new line, and that doesn't mean it's not going to
be printed. It just turns out that one of the differences we'll see is that, with print, you get the new line
for free.
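
A quick sketch of that newline behavior, using the built-in `print` and its `end` keyword argument, which is what lets you override the default:

```python
# print adds a newline automatically, so no "\n" is needed.
print("hello, world")

# Passing end="" suppresses the newline, so the next print continues the line.
print("hello, ", end="")
print("world")
```

Both halves of the example produce the same line of output. The second one just builds it from two calls.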
- [3:20:08](https://www.youtube.com/watch?v=undefined&t=48008s) It automatically gets outputted by
default, being sort of a common case. But you can override it, we'll see, ultimately, too. How about in
Scratch? We had multiple functions like this, that not only said something on the screen, but also asked a
question, thereby being another function that returned a value, called answer.

- [3:20:24](https://www.youtube.com/watch?v=undefined&t=48024s) In C we saw code that looked a


little something like this, whereby that first line declares a variable called answer, sets it equal to the
return value of get_string, one of the functions from the CS50 library, and then the same double quotes
and parentheses and semicolon. Then we had this format code in C that allowed us, with %s, to actually
print out that same value.

- [3:20:44](https://www.youtube.com/watch?v=undefined&t=48044s) In Python, this, too, is going to


look a little bit simpler. Instead, we're going to have answer equals get_string, quote unquote "What's
your name," and then print, with a plus sign and a little bit of new syntax. But let's see if we can't just
infer from this example what it is that's going on.

- [3:20:59](https://www.youtube.com/watch?v=undefined&t=48059s) Well, first missing on the left is


what? To the left of the equal sign, there's no what this time? Feel free to just call it out. AUDIENCE:
Type. DAVID J. MALAN: So there's no type. There's no type, like the word string, which even though that
was a type in CS50, for every other variable in C we did use int or string or float or bool or something else.

- [3:21:18](https://www.youtube.com/watch?v=undefined&t=48078s) In Python, there are still going to


be data types, today onward, but you, the programmer, don't have to bother telling the computer what
types you're using. The computer is going to be smart enough, the language, really, is going to be smart
enough, to just figure it out from context. Meanwhile, on the right hand side, get_string is going to be a
function we'll use today and this week, which comes from a Python version of the CS50 library.

- [3:21:38](https://www.youtube.com/watch?v=undefined&t=48098s) But we'll also start to take off


those training wheels, so that you'll see how to do things without any CS50 library moving forward, using
a different function instead. As before, no semicolon, but the rest of the syntax is pretty much the same
here. This starts, of course, to get a little bit different, though.

- [3:21:53](https://www.youtube.com/watch?v=undefined&t=48113s) We're using print instead of printf.


But now, even though this looks a little cryptic, perhaps, if you've never programmed before CS50, what
might that plus be doing, just based on inference here. What do you think? AUDIENCE: Adding answer to
the string Hello. DAVID J.

- [3:22:12](https://www.youtube.com/watch?v=undefined&t=48132s) MALAN: Yeah, so adding answer


to the string Hello, and adding, so to speak, not mathematically, but in the form of joining them together,
much like we saw the join block in Scratch, or concatenation was the term of art there. This plus sign
appends, if you will, whatever's in answer to whatever is quoted here. And I deliberately left a space
there, so that grammatically it looks nice, after the comma as well.
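
Here is a minimal sketch of that concatenation. To keep it runnable without the CS50 library or any typing at the keyboard, the name is hard-coded as a stand-in for what `get_string` would return.

```python
# Stand-in for: answer = get_string("What's your name? ")
answer = "David"

# No type on the left of the equal sign, and no semicolon at the end.
# The plus sign joins (concatenates) the two strings.
greeting = "hello, " + answer
print(greeting)  # hello, David
```

The space after the comma inside the quotes is deliberate. Concatenation adds nothing on its own, so without it the output would read `hello,David`.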

- [3:22:30](https://www.youtube.com/watch?v=undefined&t=48150s) Now there's another way to do


this. And it, too, is going to look cryptic at first glance. But it just gets easier and more convenient over
time. You can also change this second line to be this, instead. So what's going on here? This is actually a
relatively new feature of Python in the past couple of years, where now what you're seeing is, yes, a
string, between these same double quotes, but this is what Python would call a format string, or f-string.
- [3:22:54](https://www.youtube.com/watch?v=undefined&t=48174s) And it literally starts with the
letter F, which admittedly looks, I think, a little weird. But that just indicates that Python should assume
that anything inside of curly braces inside of the string should be interpolated, so to speak, which is a
fancy term saying, substitute the value of any variables therein.

- [3:23:13](https://www.youtube.com/watch?v=undefined&t=48193s) And it can do some other things


as well. So answer is a variable, declared, of course, on this first line. This f-string, then, says to Python,
print out Hello comma space, and then the value of answer. If, by contrast, you omitted the curly braces,
just take a guess, what would happen? What would the symptom of that bug be, if you accidentally
forgot the curly braces, but maybe still had the f there? AUDIENCE: It would print below it, too.

- [3:23:38](https://www.youtube.com/watch?v=undefined&t=48218s) DAVID J. MALAN: Yeah, it would


literally print Hello, comma answer, because it's going to take you literally. So the curly braces just kind of
allow you to plug things in. And, again, it looks a little more cryptic, but it's just going to save us time
over time. And if any of you programmed in Java in high school, for instance, you saw plus in that
context, too, for concatenation.
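
To make the curly-brace point concrete, here is a small sketch comparing an f-string with and without the braces. The hard-coded name is again a stand-in for user input.

```python
answer = "David"  # stand-in for what the user would type

# With curly braces, the variable's value is substituted (interpolated).
with_braces = f"hello, {answer}"

# Without them, Python takes you literally, even though the f is still there.
without_braces = f"hello, answer"

# Braces can hold expressions too, not just variable names.
shouted = f"hello, {answer.upper()}"

print(with_braces)     # hello, David
print(without_braces)  # hello, answer
print(shouted)         # hello, DAVID
```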

- [3:23:53](https://www.youtube.com/watch?v=undefined&t=48233s) This just kind of makes your code


a little tighter, a little more succinct. So it's a convenient feature now in Python. All right, this was an
example in Scratch of a variable, setting a variable like counter equal to 0. In C it looked like this, where
you specify the type, the name, and then the value, with a semicolon.

- [3:24:09](https://www.youtube.com/watch?v=undefined&t=48249s) In Python, it's going to look like


this. And I'll state the obvious here. You don't need to mention the type, just like before with string. And
you don't need a semicolon. So it's a little simpler. If you want a variable, just write it and set it equal to
some value. But the single equal sign still behaves the same as in C.

- [3:24:25](https://www.youtube.com/watch?v=undefined&t=48265s) Suppose we wanted to increment


counter by one. In Scratch, we use this puzzle piece here. In C, we could do this, actually, in a few
different ways. There was this way, if counter already exists, you just say counter equals counter plus 1.
There was the slightly less verbose way, where you could say, oops, sorry.

- [3:24:41](https://www.youtube.com/watch?v=undefined&t=48281s) Let me do the first sentence first.


In Python, that same thing, as you might guess, is actually going to be almost the same, you just throw
away the semicolon. And the mathematics are ultimately the same, copying from right to left, via the
assignment operator. Now, recall, in C, that we had this shorthand notation, which did the same thing.

- [3:24:59](https://www.youtube.com/watch?v=undefined&t=48299s) In Python, you can similarly do


the same thing, just no need for the semicolon. The only step backwards we're taking, if you were a big
fan of counter plus plus, that doesn't exist in Python, nor minus minus. You just can't do it. You have to
do the plus equals 1 or minus equals 1 to achieve that same result.
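
Put together, the variable and increment forms described above look like this in Python:

```python
counter = 0            # no type and no semicolon

counter = counter + 1  # the long form, same idea as in C
counter += 1           # the shorthand form

counter -= 1           # decrementing works the same way

# counter++ and counter-- do not exist in Python; use += 1 and -= 1 instead.
print(counter)  # 1
```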

- [3:25:20](https://www.youtube.com/watch?v=undefined&t=48320s) All right, how about in Python, too?


Here in Scratch, recall, was a conditional, asking a silly question like is x less than y, and if so, just say as
much. In C, that looked a little something like this, printf and if with the parentheses, the curly braces,
the semicolon, and all of that. In Python, this is going to get a little more pleasant to type, too.
- [3:25:37](https://www.youtube.com/watch?v=undefined&t=48337s) It's going to be just this. And if
someone wants to call out some of the obvious changes here, what has been simplified now in Python
for a conditional, it would seem? Yeah, what's missing, or changed? AUDIENCE: Braces. DAVID J. MALAN:
So no curly braces. AUDIENCE: Colon is back. DAVID J.

- [3:25:52](https://www.youtube.com/watch?v=undefined&t=48352s) MALAN: I'm sorry? AUDIENCE:


Using the colon instead. DAVID J. MALAN: And we're using the colon instead. So I got rid of the curly
braces in Python. But I'm using a colon instead. And even though this is a single line of code, so long as
you indent subsequent lines along with the print, that's going to imply that everything indented below it
should be executed if the if condition is true, until you start to un-indent and start writing a different line of code
altogether.

- [3:26:15](https://www.youtube.com/watch?v=undefined&t=48375s) So indentation in Python is


important. So this is among the reasons we've emphasized axes like style, just how well styled your code
is. And honestly, we've seen, certainly, in office hours, and you've seen in your own code, sort of a
tendency sometimes to be a little lax when it comes to indentation, right? If you're one of those folks
who likes to indent everything on the left hand side of the window, yeah, it might compile and run.

- [3:26:38](https://www.youtube.com/watch?v=undefined&t=48398s) But it's not particularly readable


by you or anyone else. Python actually addresses this by just requiring indentation, when logically
needed. So Python is going to force you to start indenting properly now, if that's been, perhaps, a
tendency otherwise. What else is missing? Well, we have no semicolon here.

- [3:26:56](https://www.youtube.com/watch?v=undefined&t=48416s) Of course, it's print instead of


printf. But otherwise, those seem to be the primary differences. What about something larger in
Scratch? With an if-else block like this, you can perhaps guess what it's going to look like. In C it looks like
this, curly braces, semicolons, and so forth. In Python, it's going to now look like this, almost the same,
but indentation is important.

- [3:27:15](https://www.youtube.com/watch?v=undefined&t=48435s) The colons are important. And


there's one other difference that's now again visible here, but we didn't call it out a second ago. What
else is different in Python versus C for these conditionals? Yeah. AUDIENCE: You don't have any
parentheses around the condition. DAVID J. MALAN: Perfect. We don't have any parentheses around the
condition, the Boolean expression itself.

- [3:27:32](https://www.youtube.com/watch?v=undefined&t=48452s) And why not? Well, it's just


simpler to type. It's less to type. You can still use parentheses. And, in fact, you might want to or need to,
if you want to combine thoughts and do this and that, or this or that. But by default, you no longer need
or should have those parentheses. Just say what you mean.

- [3:27:49](https://www.youtube.com/watch?v=undefined&t=48469s) Lastly, with conditionals, we had


something like this, an if else if else statement. In C, it looked a little something like this. In Python, it's
going to get really tighter now. It's just if, and this is the curiosity, elif x greater than y. So it's not else if,
it's literally one keyword, elif, and the colons remain now on each of the three lines.
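
The three-way conditional can be sketched as follows. Wrapping it in a function named `compare` is only for illustration, so that it can be called with different values. The lecture's version prints directly.

```python
def compare(x, y):
    """Describe how x relates to y."""
    # No parentheses around the condition, a colon at the end,
    # and indentation instead of curly braces.
    if x < y:
        return "x is less than y"
    elif x > y:  # elif, not "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))  # x is less than y
print(compare(2, 1))  # x is greater than y
print(compare(2, 2))  # x is equal to y
```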

- [3:28:10](https://www.youtube.com/watch?v=undefined&t=48490s) But the indentation is important.


And if we did want to do multiple things, we could just indent below each of these conditionals, as well.
All right, let me pause there first, to see if there's any questions on these syntactic differences. Yeah.
AUDIENCE: My thought is maybe like, it's good, though, does it matter if there's this in between thing
like that, but and why.

- [3:28:29](https://www.youtube.com/watch?v=undefined&t=48509s) DAVID J. MALAN: In between,


between what and what? AUDIENCE: So like the left-hand side and like the right side spaces? DAVID J.
MALAN: Ah, good question, is Python sensitive to spaces and where they go? Sometimes no, sometimes
yes, is the short answer. Stylistically, though, you should be practicing what we're preaching here,
whereby you do have spaces to the left and right of binary operators, as they're called. Something like
less than or greater than is a binary operator, because there's

- [3:28:55](https://www.youtube.com/watch?v=undefined&t=48535s) two operands to the left and to


the right of them. And in fact, in Python, more so than the world of C, there's actually formal style
conventions. Not only within CS50 have we had a style guide on the course's website, for instance, that
just dictates how you should write your code so that it looks like everyone else's.

- [3:29:11](https://www.youtube.com/watch?v=undefined&t=48551s) In the Python community, they


take this one step further, and there's an actual standard whereby you don't have to adhere to it, but
generally speaking, in the real world, someone would reprimand you, would reject your code, if you're
trying to contribute it to another project, if you don't adhere to these standards.

- [3:29:25](https://www.youtube.com/watch?v=undefined&t=48565s) So while you could be lax with


some of this white space, do make things readable. And that's Python's theme, for the code to be as
readable as possible. All right, so let's take a look at a couple of other constructs before transitioning to
some actual code. This, of course, in Scratch was a loop, meowing forever.

- [3:29:41](https://www.youtube.com/watch?v=undefined&t=48581s) In C, the closest we could get was


doing something while true, because true never changes. So it's sort of a simple way of just saying do
this forever. In Python, it's pretty much the same thing, but a couple of small differences here. The
parentheses are gone. The colon is there. The indentation is there.

- [3:29:57](https://www.youtube.com/watch?v=undefined&t=48597s) No semicolon, and there's one


other subtle difference. What do you see? AUDIENCE: True is capitalized? DAVID J. MALAN: True is
capitalized, just because. Both True and False are Boolean values in Python. But you've got to start
capitalizing them, just because. All right, how about a loop like this, where you repeat something a finite
number of times, like meowing three times.

- [3:30:15](https://www.youtube.com/watch?v=undefined&t=48615s) In C, we could do this a few


different ways. There's this very mechanical way, where you initialize a variable like i to zero. You then
use a while loop and check if i is less than 3, the total number of times you want to meow. Then you
print what you want to print. You increment i using this syntax, or the longer, more verbose syntax, with
plus equals or whatnot.

- [3:30:34](https://www.youtube.com/watch?v=undefined&t=48634s) And then you do it again and


again and again. In Python, you can do it functionally the same way, same idea, slightly different syntax.
You just don't bother saying what type of variable you want. Python will infer from the fact that there's a
0 right there. You don't need the parentheses. You do need the colon.
- [3:30:50](https://www.youtube.com/watch?v=undefined&t=48650s) You do need the indentation. You
can't do the i plus plus, but you can do this other technique, as we could have done in C, as well. How
else might we do this, though, too? Well, it turns out in C, we could do something like this, which, again,
sort of cryptic at first glance, became perhaps more familiar, where you have initialization, a conditional,
and then an update that you do after each iteration.

- [3:31:11](https://www.youtube.com/watch?v=undefined&t=48671s) In Python, there isn't really an


analog. There is no analog in Python, where you have the parentheses and the multiple semicolons in
the same line. Instead, there is a for loop, but it's meant to read a little more like English, for i in 0, 1, and
2. So we'll see in a bit, these square brackets represent an array, now to be called a list in Python.

- [3:31:34](https://www.youtube.com/watch?v=undefined&t=48694s) So lists in Python are more like


linked lists than they are arrays. More on that soon. So this just means for i in the following list of three
values. And on each iteration of this loop, Python automatically, for you, it first sets i to zero. Then it sets
i to one. Then it sets i to two, so that you effectively do things three times.
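
Both ways of meowing three times, the mechanical while loop and the for loop over a hard-coded list, can be sketched like this. The `seen` list is an addition for illustration, recording the values `i` takes.

```python
# while loop: initialize, check, print, increment (there is no i++ in Python).
i = 0
while i < 3:
    print("meow")
    i += 1

# for loop over an explicit list: i is set to 0, then 1, then 2.
seen = []
for i in [0, 1, 2]:
    print("meow")
    seen.append(i)

print(seen)  # [0, 1, 2]
```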

- [3:31:54](https://www.youtube.com/watch?v=undefined&t=48714s) But this doesn't necessarily scale,


as I've drawn it on the board. Suppose you took this at face value as the way you iterate some number of
times in Python, using a for loop. At what point does this approach perhaps get bad, or bad design? Let
me give folks just a moment to think. Yeah, in back. AUDIENCE: If you don't know how many times, last
time, you know, you've got the link in there.

- [3:32:18](https://www.youtube.com/watch?v=undefined&t=48738s) DAVID J. MALAN: Sure, if you


don't know how many times you want to loop or iterate, you can't really create a hard-coded list like
that, of 0, 1, 2. Other thoughts? AUDIENCE: So you want to say raise a large number of allowances.
DAVID J.

- [3:32:32](https://www.youtube.com/watch?v=undefined&t=48752s) MALAN: Yeah, if you're iterating a


large number of times, this list is going to get longer and longer, and you're just kind of stupidly going to
be typing out like comma 3, comma 4, comma 5, comma dot dot dot, comma 99, comma 100. I mean,
your code would start to look atrocious, eventually. So there is a better way. In Python, there is a
function, or technically a type, called range, that essentially magically gives you back a range of values
from 0 on up to, but not through a value.

- [3:32:54](https://www.youtube.com/watch?v=undefined&t=48774s) So the effect of this line of code,


for i in the following range, essentially hands you back a list of three values, thereby letting you do
something three times. And if you want to do something 99 times instead, you, of course, just change
the 3 to a 99. Question. AUDIENCE: Is there a way to start the beginning point of that range at a number
or an integer that's higher than zero, or is there never a really any point to do so? DAVID J.

- [3:33:18](https://www.youtube.com/watch?v=undefined&t=48798s) MALAN: A really good question,


can you start counting at a higher number. So not 0, which is the implied default, but something larger
than that. Yes, so it turns out the range function takes multiple arguments, not just one but maybe two
or even three, that allows you to customize this behavior. So you can customize where it begins. You can
customize the increment.

- [3:33:34](https://www.youtube.com/watch?v=undefined&t=48814s) By default, it's one, but if you


want to do every two values, for like evens or odds, you could do that as well, and a few other things.
And before long, we'll take a look at some Python documentation that will become your authoritative
source for answers like that. Like, what can this function do.
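
As a short sketch of those `range` arguments: one argument gives the stopping point, two give a start and a stop, and three add the increment. Wrapping each range in `list()` is only there to make the values visible.

```python
up_to_three = list(range(3))        # start defaults to 0, stop is excluded
from_one = list(range(1, 4))        # custom starting point
evens = list(range(0, 10, 2))       # custom increment of 2

print(up_to_three)  # [0, 1, 2]
print(from_one)     # [1, 2, 3]
print(evens)        # [0, 2, 4, 6, 8]

# The usual loop: do something three times without typing out a list.
for i in range(3):
    print("meow")
```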

- [3:33:47](https://www.youtube.com/watch?v=undefined&t=48827s) Other questions on this thus far?


Seeing none, so what else might we compare and contrast here. Well, in the world of C, recall that we
had a whole bunch of built-in data types, like these here, bool and char and double and float, and so
forth, and string, which happened to come from the CS50 library. But the language C itself certainly
understood the idea of strings, because the backslash 0 and the support for %s in printf, that's all native,
built into C, not a CS50 simplification.

- [3:34:20](https://www.youtube.com/watch?v=undefined&t=48860s) All we did, and revealed, as of a


couple of weeks ago, is that string, this data type, is just a synonym, a typedef, for char star, which is
part of the language natively. In Python now, this list actually gets a little shorter, at least for these
common primitive data types. We're still going to have bools, we're going to have floats and ints, and we're
going to have strings, but we're going to call them strs.

- [3:34:39](https://www.youtube.com/watch?v=undefined&t=48879s) And this is not a CS50 thing from


the library, str, S-T-R, is, in fact, a data type in Python, that's going to do a lot more than strings did for
us automatically in C. Ints and floats, meanwhile, don't need the corresponding longs and doubles,
because, in fact, among the problems Python solves for us, too, ints can get as big as you want.

- [3:34:59](https://www.youtube.com/watch?v=undefined&t=48899s) Integer overflow is no longer


going to be an issue, per week 1. The language solves that for us. Floating point imprecision,
unfortunately, is still a problem that remains. But there are libraries, code that other people have
written, as we briefly discussed in weeks past, that allow you to do scientific or financial computing,
using libraries that build on top of these data types, as well.
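
Both claims are easy to check in a few lines. The `decimal` module used at the end is part of Python's standard library. It is shown here as one example of the kind of library being described, not necessarily the one the course has in mind.

```python
from decimal import Decimal

# ints grow as large as needed, so there is no overflow at 32 or 64 bits.
big = 2 ** 100
print(big)  # 1267650600228229401496703205376

# Floating point imprecision is still with us.
print(0.1 + 0.2 == 0.3)  # False

# A library type built for exact decimal arithmetic avoids that surprise.
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))  # True
```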

- [3:35:19](https://www.youtube.com/watch?v=undefined&t=48919s) So there's other data types, too,


in Python, which we'll see actually gives us a whole bunch of more power and capability, things called
ranges, like we just saw, lists, like I called out verbally, with the square brackets, things called tuples, for
things like x comma y, or latitude, longitude, dictionaries, or Dicts, which allow you to store keys and
values, much like our hash tables from last time, and then sets in the mathematical sense, where they
filter out duplicates for you, and you can just put a whole bunch of numbers,

- [3:35:46](https://www.youtube.com/watch?v=undefined&t=48946s) a whole bunch of words or


whatnot, and the language, via this data type, will filter out duplicates for you. Now there's going to be a
few functions we give you this week and beyond, training wheels that we're then going to very quickly
take off, just because, as we'll see today, they just simplify the process of getting user input correctly,
without accidentally writing buggy code, just when you're trying to get Hello, World, or something
similar, to work.
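
As a quick illustration of the data types just listed, the sketch below builds one of each. The coordinates, names, and words are made-up sample values, not anything from the lecture.

```python
# range: a sequence of numbers, here 0, 1, 2
numbers = list(range(3))

# list: an ordered, resizable collection, written with square brackets
scores = [72, 73, 33]
scores.append(90)

# tuple: a fixed-size grouping, such as a latitude, longitude pair (sample values)
location = (42.37, -71.12)

# dict: keys mapped to values, much like a hash table (sample data)
phone_book = {"Alice": "555-0100", "Bob": "555-0101"}

# set: a collection that filters out duplicates for you
unique_words = set(["cat", "dog", "cat", "dog", "bird"])

print(numbers, scores, location, phone_book["Alice"], len(unique_words))
```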

- [3:36:09](https://www.youtube.com/watch?v=undefined&t=48969s) And we'll give you functions, not


as many as in this list from C, but a subset of these, get_float, get_int, and get_string, that'll automate
the process of getting user input in a way that's more resilient against potential bugs. But we'll see what
those bugs might be. And the way we're going to do this is similar in spirit to C.
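
To give a sense of the kind of bug such a helper guards against, here is a rough sketch of a `get_int`-style function. This is not the CS50 library's actual implementation; the `read` parameter is a hypothetical addition so that the function can be exercised without a keyboard.

```python
def get_int(prompt, read=input):
    """Keep prompting until the text can be parsed as an integer."""
    while True:
        text = read(prompt)
        try:
            return int(text)
        except ValueError:
            # Input such as "cat" or an empty line is not an int; ask again
            # instead of letting the program crash.
            continue


# Demonstration with canned answers in place of real keyboard input.
canned = iter(["cat", "", "42"])
number = get_int("Number: ", read=lambda prompt: next(canned))
print(number)  # 42
```

Calling plain `int(input("Number: "))` would raise a `ValueError` on the first non-numeric answer, which is the sort of accident the training-wheel functions are meant to prevent.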

- [3:36:27](https://www.youtube.com/watch?v=undefined&t=48987s) Instead of doing #include <cs50.h>,


like we did in C, you're going to now start saying import cs50. Python supports libraries, similar to C, but
there aren't header files anymore. You just use the name of the library in Python. And if you want to
import CS50's functions, you just say import cs50.

- [3:36:45](https://www.youtube.com/watch?v=undefined&t=49005s) Or, if you want to be more


precise, and not just import the whole thing, which could be slow, if you've got a really big library with a
lot of functionality in it, you can be more precise and say from cs50 import get_float, from cs50 import
get_int, from cs50 import get_string, or you can just separate them by commas and import three and only
three things from a particular library, like ours.
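
The two import styles can be sketched with the standard-library `math` module, used here only because it is always available; the same shapes apply to `import cs50` and `from cs50 import get_float, get_int, get_string`.

```python
# Style 1: import the whole library, then prefix each use with its name.
import math

root = math.sqrt(16)

# Style 2: import only specific names, separated by commas, and use them directly.
from math import ceil, floor

rounded_down = floor(2.5)
rounded_up = ceil(2.5)

print(root, rounded_down, rounded_up)  # 4.0 2 3
```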

- [3:37:07](https://www.youtube.com/watch?v=undefined&t=49027s) But starting today and onward,


we're going to start making much more heavy use of libraries, code that other people wrote, so that
we're no longer reinventing the wheel. We're not making our own linked lists, our own trees, our own
dictionaries. We're going to start standing on the shoulders of others, so that you can get real work
done, so to speak, faster, by building your software on top of others' code as well.

- [3:37:28](https://www.youtube.com/watch?v=undefined&t=49048s) All right, so that's it for the


syntactic tour of the language, and the sort of core features. Soon we'll transition to application thereof.
But let me pause here to see if there's any questions on syntax or primitives or otherwise.
Oh, yes, in back. AUDIENCE: Why doesn't Python have the increment operators?

- [3:37:53](https://www.youtube.com/watch?v=undefined&t=49073s) DAVID J. MALAN: I'm sorry, say it


again, why doesn't Python have what kind of operators? AUDIENCE: Why doesn't Python have the
increment operator? DAVID J. MALAN: Sorry, someone coughed when you said something operators.
AUDIENCE: The increment. DAVID J. MALAN: Oh, the increment operator? I'd have to check the history,
honestly.

- [3:38:07](https://www.youtube.com/watch?v=undefined&t=49087s) Python has tended to be a fairly


minimalist language. And if you can do something one way, the community, arguably, has tended to not
give you multiple ways to do the same thing syntactically. There's probably a better answer. And I'll see if
I can dig in and post something online, to follow up on that.
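
For reference, the usual way to increment in Python is the augmented assignment below. This is only a sketch of the syntax, not an answer to the historical question.

```python
counter = 0

# The idiomatic increment: augmented assignment.
counter += 1

# The longhand equivalent.
counter = counter + 1

print(counter)  # 2

# Note: writing counter++ is a syntax error in Python, and ++counter is just
# two unary plus signs applied to counter, which leaves it unchanged.
```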

- [3:38:22](https://www.youtube.com/watch?v=undefined&t=49102s) All right, so before we transition


to now writing some actual code, let me go ahead and consider exactly how we're going to write code. In
the world of C, recall that it's generally been a two-step process. We create a file called, like, hello.c, and
then, step one, make hello, step two, ./hello. Or, if you think back to week two, when we sort of peeled
back the layer of what make was doing, you could more verbosely type out the name of
the actual compiler, clang in our case, command line arguments like -o hello,

- [3:38:54](https://www.youtube.com/watch?v=undefined&t=49134s) to specify what name you want to


create. And then you can specify the file name. And then you can specify what libraries you want to link
in. So that was a very verbose approach. But it was always a two-step approach. And so, even as you've
been doing recent problem sets, odds are you've realized that, any time you want to make a change to
your code and then test your code again, you're constantly doing
those two steps.

- [3:39:19](https://www.youtube.com/watch?v=undefined&t=49159s) Moving forward in Python, it's


going to become simpler, and it's going to be just this. The file name is going to change, but that might
go without saying. It's going to be something like hello.py, P-Y, instead of hello.c. And that's just a
convention, using a different file extension. But there's no compilation step per se.
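
As a minimal sketch, a complete hello.py could be as short as the following, and it would be run in one step with `python hello.py` (assuming an interpreter named `python` is installed).

```python
# hello.py: no main function, no header files, no compilation step.
message = "hello, world"
print(message)
```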

- [3:39:37](https://www.youtube.com/watch?v=undefined&t=49177s) You jump right to the execution of


your code. And so Python, it turns out, is the name, not only of the language we're going to start using,
it's also the name of a program on a Mac, a PC, assuming it's been pre-installed, that interprets the
language for you. This is to say that Python is generally described as being interpreted, not compiled.

- [3:39:58](https://www.youtube.com/watch?v=undefined&t=49198s) And by that, I mean you get to


skip, from the programmer's perspective, that compilation step. There is no manual step in the world of
Python, typically, of writing your code and then compiling it to zeros and ones, and then running the
zeros and ones. Instead, these kind of two steps get collapsed into the illusion of one, whereby you,
instead, are able to just run the code, and let the computer figure out how to actually convert it to
something the computer understands.

- [3:40:25](https://www.youtube.com/watch?v=undefined&t=49225s) And the way we do that is via this


old process, input and output. But now, when you have source code, it's going to be passed into an
interpreter, not a compiler. And the best analog of this is just to perhaps point out that, in the human
world, if you don't speak multiple human languages, it can be a pretty slow process going
from one language to another.

- [3:40:44](https://www.youtube.com/watch?v=undefined&t=49244s) For instance, here are step-by-


step instructions for finding someone in a phone book, unfortunately, in Spanish. Unfortunately, that is, if you
don't speak or read Spanish. You could figure this out. You could run this algorithm, but you're going to
have to do some googling, or you're going to have to open up a literal Spanish-to-English dictionary
and convert this.

- [3:41:00](https://www.youtube.com/watch?v=undefined&t=49260s) And the catch with translating any


language, human or computer or otherwise, is that you're going to pay a price, typically some time. And
so converting this in Spanish to this in English is just going to take you longer than if this were already in
your native language. And that's going to be one of the subtleties with the world of Python.

- [3:41:18](https://www.youtube.com/watch?v=undefined&t=49278s) Yes, it's a feature that you can just


run the code without having to bother compiling it manually first. But we might pay a price. And things
might be a little slower. Now, there's ways to chip away at that. But we'll see an example thereof. In fact,
let me transition now to just a couple of examples that demonstrate how Python is not only easier for
many people to use, perhaps yourselves too, because it throws away a lot of the annoying syntax, it
shortens the number of lines you have to write, and also it comes with so many darn libraries,

- [3:41:46](https://www.youtube.com/watch?v=undefined&t=49306s) you can just do so much more


without having to write the code yourself. So, as an example of this, let me switch over here to this
image from problem set 4, which is the Weeks Bridge down by the Charles River here in Cambridge. And
this is the original photo, pretty clear, and it's even higher res if we looked at the original version of the
photo.

- [3:42:07](https://www.youtube.com/watch?v=undefined&t=49327s) But there have been no filters, a


la Instagram, applied to this photo. Recall, for problem set four, you had to implement a few filters. And
among them might have been blur. And blur was probably among the more challenging of the ones,
because you had to iterate over all of the pixels, you had to take into account what's above, what's
below, to the left, to the right.

- [3:42:24](https://www.youtube.com/watch?v=undefined&t=49344s) I mean, there was a lot of math


and arithmetic. And if you ultimately got it, it was probably a great sense of satisfaction. But that was
probably several hours later. In a language like Python, where there might be libraries that had been
written by others, on whose shoulders you can stand, we could perhaps do something like this.

- [3:42:40](https://www.youtube.com/watch?v=undefined&t=49360s) Let me go ahead and run a


program, or write a program, called blur.py here. And in blur.py, in VS Code, let me just do this. Let me
import from a library, not the CS50 library, but the Pillow library, so to speak, a name called Image and
another one called ImageFilter. Then let me go ahead and say, let me open the current version of this
image, which is called bridge.bmp.

- [3:43:04](https://www.youtube.com/watch?v=undefined&t=49384s) So the before version of the


image will be the result of calling Image.open, quote unquote "bridge.bmp," and then, let me create an
after version. So you'll see before and after. After equals the before version .filter of ImageFilter. And
if I read the documentation, I'll see that there's something called a BoxBlur, that allows you to
blur in box format, like one pixel above, below, left, and right.

- [3:43:30](https://www.youtube.com/watch?v=undefined&t=49410s) So I'll do one pixel there. And


then, after that's done, let me go ahead and save the file as something like out.bmp. That's it. Assuming
this library works as described, I am opening the file in Python, using line 3. And this is somewhat new
syntax. In the world of Python, we're going to start making use of the dot operator more, because in the
world of Python, you have what's called object-oriented programming, or OOP, as a term of art.

- [3:43:56](https://www.youtube.com/watch?v=undefined&t=49436s) And what this means is that you


still have functions, you still have variables, but sometimes those functions are embedded inside of the
variables, or, more specifically, inside of the data types themselves. Think back to C. When you wanted to
convert something to uppercase, there was a toupper function that takes as input an argument that's a
char.

- [3:44:15](https://www.youtube.com/watch?v=undefined&t=49455s) And you can pass in any char you


want, and it will uppercase it for you and give you back a value. Well, you know what, if that's such a
common paradigm, where upper-casing chars is a useful thing, what the world of Python does is it
embeds into the string data type, or char if you will, the ability just to uppercase any char by treating the
char, or the string, as though it's a struct in C.

- [3:44:39](https://www.youtube.com/watch?v=undefined&t=49479s) Recall that structs encapsulate


multiple types of values. In object-oriented programming, in a language like Python, you can encapsulate
not just values, but also functionality. Functions can now be inside of structs. But we're not going to call
them structs anymore. We're going to call them objects. But that's just a different vernacular.
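
The idea of functions living inside the data can be sketched as follows. The `upper` method is built into Python's str type; the `Point` class is a hypothetical example, not something from the lecture.

```python
# In C you would pass a char to toupper; in Python the string carries
# its own upper function, reached with the dot operator.
word = "hello"
shouted = word.upper()
print(shouted)  # HELLO


class Point:
    """A struct-like object that holds values and functionality together."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        # A function (method) stored alongside the data it operates on.
        return (self.x ** 2 + self.y ** 2) ** 0.5


point = Point(3, 4)
distance = point.distance_from_origin()
print(distance)  # 5.0
```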

- [3:44:56](https://www.youtube.com/watch?v=undefined&t=49496s) So what am I doing here? Inside


of the image library, there's a function called open, and it takes an argument, the name of the file, to
open. Once I have a variable called before, that is a struct, or technically an object, inside of which is
now, because it was returned from this function, a function called filter, that takes an argument.
- [3:45:15](https://www.youtube.com/watch?v=undefined&t=49515s) The argument here happens to be
ImageFilter.BoxBlur(1), which itself is a function. But it just returns the filter to use. And then, after, dot save
does what you might think. It just saves the file. So instead of using fopen and fwrite, you just say dot
save, and that does all of that messy work for you. So it's just, what, four lines of code total? Let me go
ahead and go down to my terminal window.
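
Reconstructed from the description above, the program is roughly the sketch below. It assumes the Pillow library is installed; wrapping the lines in a function named `blur` with a `radius` parameter is an addition for illustration, since the lecture version simply hard-codes the file names and radius at the top level.

```python
from PIL import Image, ImageFilter


def blur(source, destination, radius=1):
    """Open an image, apply a box blur, and save the result."""
    before = Image.open(source)
    after = before.filter(ImageFilter.BoxBlur(radius))
    after.save(destination)
```

Calling `blur("bridge.bmp", "out.bmp")` from the folder containing bridge.bmp would correspond to the version described in the lecture, and passing `radius=10` would give the heavier blur.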

- [3:45:37](https://www.youtube.com/watch?v=undefined&t=49537s) Let me go ahead and show you


with LS that, at the moment, whoops, sorry, let me not bother showing that, because I have other
examples to come. I'm going to go ahead and do Python of Blur.py, nope, sorry, wrong place. I did need
to make a command. There we go. OK, let me go ahead and type LS inside of my filter directory, which is
among the sample code online today.

- [3:45:58](https://www.youtube.com/watch?v=undefined&t=49558s) There's only one file called


bridge.bmp, dammit, I'm trying to get these things ready at the same time. Let me rewind. Let me move
this code into place. All right, I've gone ahead and moved this file, blur.py, into a folder called filter, inside
of which there's another file called bridge.bmp, which we can confirm with ls.

- [3:46:19](https://www.youtube.com/watch?v=undefined&t=49579s) Let me now go ahead and run


Python, which is my interpreter, and also the name of the language, and run Python on this file. So much
like running the Spanish algorithm through Google Translate, or something like that, as input, to get back
the English output, this is going to translate the Python language to something this computer, or this
cloud-based environment, understands, and then run the corresponding code, top to bottom, left to
right.

- [3:46:42](https://www.youtube.com/watch?v=undefined&t=49602s) I'm going to go ahead and hit Enter.


No error message is generally a good thing. If I type ls, you'll now see out.bmp. Let me go ahead and
open that. And, you know what, just to make clear what's really happening, let me blur it even further.
Let's make a box that's not just one pixel around, but 10.

- [3:46:57](https://www.youtube.com/watch?v=undefined&t=49617s) So let's make that change. And let


me just go ahead and rerun it with python of blur.py. I still have out.bmp. Let me go ahead and open
out.bmp and show you first the before, which looks like this. That's the original. And now, crossing my
fingers, four lines of code later, the result of blurring it, as well.

- [3:47:16](https://www.youtube.com/watch?v=undefined&t=49636s) So the library is doing all of the


same kind of legwork that you all did for the assignment, but it's encapsulated it all into a single library,
that you can then use instead. Those of you who might have been feeling more comfortable, might have
done a little something like this. Let me go ahead and open up one other file, called edges.py.

- [3:47:33](https://www.youtube.com/watch?v=undefined&t=49653s) And in edges.py, I'm again going


to import from the Pillow library the Image name, and ImageFilter. Then I'm going to go ahead
and create a before image, that's the result of calling Image.open of the same thing, bridge.bmp. Then I'm
going to go ahead and run a filter on that, called, whoops, ImageFilter.FIND_EDGES,

- [3:47:58](https://www.youtube.com/watch?v=undefined&t=49678s) which is like a constant,


if you will, defined inside of this library for us. And then I'm going to do after.save, quote unquote
"out.bmp," using the same file name. I'm now going to run python of edges.py, after, sorry, user error.
We'll see what syntax error means soon. Let me go ahead and run the code now, edges.py. Let me now
open that new file, out.bmp.
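
A sketch of the edge-detection program, again assuming Pillow is installed. The function name `find_edges` is a hypothetical wrapper; the lecture version runs the same three calls at the top level on bridge.bmp.

```python
from PIL import Image, ImageFilter


def find_edges(source, destination):
    """Open an image, apply Pillow's predefined edge filter, and save it."""
    before = Image.open(source)
    # FIND_EDGES is a ready-made filter, so it is passed by name with no
    # arguments, unlike BoxBlur, which is called with a radius.
    after = before.filter(ImageFilter.FIND_EDGES)
    after.save(destination)
```

Calling `find_edges("bridge.bmp", "out.bmp")` would correspond to the run described above.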

- [3:48:21](https://www.youtube.com/watch?v=undefined&t=49701s) And before we had this, and now,


which will look especially familiar if you did the more comfortable version of pset 4, we now get this,
after just four lines of code. So again, suggesting the power of using a language that's better optimized
for the tool at hand. And at the risk of really making folks sad, let's go ahead and re-implement, if we
could, problem set five, real quickly here.

- [3:48:43](https://www.youtube.com/watch?v=undefined&t=49723s) Let me go ahead and open


another version of this code, wherein I have a C version, just from problem set five, wherein you
implemented a spell checker, loading 100,000 plus words into memory. And then you kept track of just
how much time and memory it took. And that probably took a while, implementing all of those functions
in dictionary.c.

- [3:49:03](https://www.youtube.com/watch?v=undefined&t=49743s) Let me instead now go into a new


file, called dictionary.py. And let me stipulate, for the sake of discussion, that we already wrote in
advance, speller.py, which corresponds to speller.c. You didn't write either of those. Recall for problem
set five, we gave you speller.c. Assume that we're going to give you speller.py.

- [3:49:22](https://www.youtube.com/watch?v=undefined&t=49762s) So the onus on us right now is


only to implement dictionary.py. All right, so I'm going to go ahead and define a few functions.
And we're going to see now the syntax for defining functions in Python. I want to go ahead and define
first, a hash table, which was the very first thing you defined in dictionary.c.

- [3:49:41](https://www.youtube.com/watch?v=undefined&t=49781s) I'm going to go ahead, then, and


say words gets this, give me a dictionary, otherwise known as a hash table. All right, now let me define a
function called check, which was the first function you might have implemented. Check is going to take a
word, and you'll see in Python, the syntax is a little different.

- [3:49:57](https://www.youtube.com/watch?v=undefined&t=49797s) You don't specify the return type.


You use the word def instead to define. You still specify the name of the function and any arguments
thereto. But you omit any mention of types. But you do use a colon and indent. So how do I check if a
word is in my dictionary, or in my hash table? Well, in Python, I can just say, if word in words, go ahead
and return True, else go ahead and return False, done, with the check function.
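
The function syntax just described can be sketched like this. The two-word dictionary is sample data, and `check_short` is a hypothetical name for an equivalent, more compact spelling.

```python
# A tiny stand-in for the real dictionary (sample data only).
words = {"cat": True, "dog": True}


def check(word):
    # No return type, no parameter types; a colon and indentation mark the body.
    if word in words:
        return True
    else:
        return False


def check_short(word):
    # The expression `word in words` is already a bool, so it can be returned directly.
    return word in words


print(check("cat"), check("bird"))  # True False
```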

- [3:50:24](https://www.youtube.com/watch?v=undefined&t=49824s) All right, now I want to do like


load. That was the heavy lift, where you had to load the big file into memory. So let me define a function
called load. It takes a string, the name of a file to load. So I'll call that Dictionary, just like in C, but no data
type. Let me go ahead and open a file by using an open function in Python, by opening that Dictionary in
read mode.

- [3:50:43](https://www.youtube.com/watch?v=undefined&t=49843s) So this is a little similar to fopen, a


function in C you might recall. Then let me iterate over every line in the file. In Python, this is pretty
pleasant, for line in file colon indent. How, now, do I get at the current word, and then strip off the new
line, because in this file of words, 140,000 words, there's word backslash n, word backslash n, all right?
Well, let me go ahead and get a word from the current line, but strip off, from the right end of the string,
the new line, which the rstrip function in Python does for me.
- [3:51:14](https://www.youtube.com/watch?v=undefined&t=49874s) Then let me go ahead and add to
my dictionary, or hash table, that word, done. Let me go ahead and close the file for good measure. And
then let me go ahead and return True, because all was well. That's it for the load function in Python. How
about the size function? This did not take any arguments, it just returns the size of the hash table or
dictionary in Python.

- [3:51:33](https://www.youtube.com/watch?v=undefined&t=49893s) I can do that by returning the


length of the dictionary in question. And then lastly, gone from the world of Python is malloc and free.
Memory is managed for you. So no matter what I do, there's nothing to unload. The computer will do
that for me. So I give you, in these functions, problem set five in Python.

- [3:51:51](https://www.youtube.com/watch?v=undefined&t=49911s) So, I'm sorry, we made you write


it in C first. But the implication now is that, what are you getting for free, in a language like Python? Well,
encapsulated in this one line of code is much of what you wrote for problem set five, implementing your
array for all of your letters of the alphabet or more, all of the linked lists that you implemented to create
chains, to store all of those words.

- [3:52:12](https://www.youtube.com/watch?v=undefined&t=49932s) All of that is happening. It's just


someone else in the world wrote that code for you. And you can now use it by way of a dictionary. And
actually, I can change this a little bit, because add is technically not the right function to use here. I'm
actually treating the dictionary as something simpler, a set.

- [3:52:28](https://www.youtube.com/watch?v=undefined&t=49948s) So I'm going to make one tweak,


set, recall, was another data type in Python. But a set just handles duplicates for me, and it allows me to
just throw things into it by literally using a function as simple as add. And I'm going to make one other
tweak here, because, when I'm checking a word, it's possible it might be given to me in uppercase or
capitalized.

- [3:52:49](https://www.youtube.com/watch?v=undefined&t=49969s) It's not going to necessarily come


in in the same lowercase format that my dictionary did. I can force every word to lowercase by using
word.lower. And I don't have to do it character for character, I can do the whole darn string at once, by
just saying word.lower. All right, let me go ahead and open up a terminal window here.
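
Pulling the pieces above together, dictionary.py might look roughly like the sketch below, with the set and lowercase tweaks applied. The `unload` function is included on the assumption that speller.py still expects one to exist, even though there is nothing to free.

```python
# The whole "hash table": a set that stores each word once.
words = set()


def check(word):
    # Lowercase the whole string at once before looking it up.
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    # Open the dictionary file in read mode and add one word per line.
    file = open(dictionary, "r")
    for line in file:
        word = line.rstrip()  # strip the trailing newline
        words.add(word)
    file.close()
    return True


def size():
    return len(words)


def unload():
    # Memory is managed for us, so there is nothing to free.
    return True
```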

- [3:53:09](https://www.youtube.com/watch?v=undefined&t=49989s) And let me go into, first, my C


version, on the left. And actually I'm going to go ahead and split my terminal window into two. And on
the right, I'm going to go into a version that I essentially just wrote. But it's also available online, if you
want to play along afterward. I'm going to go ahead and make speller in C on the left, and note that it
takes a moment to compile.

- [3:53:29](https://www.youtube.com/watch?v=undefined&t=50009s) Then I'm going to be ready to run


speller of dictionaries, let's do like the Sherlock Holmes text, which is pretty big. And then over here, let
me get ready to run python of speller.py on texts/holmes.txt, too. So the syntax is a little different at the
command prompt. I just, on the left, have to compile the code, with make, and then run it with ./speller.

- [3:53:51](https://www.youtube.com/watch?v=undefined&t=50031s) On the right, I don't need to


compile it. But I do need to use the interpreter. So even though the lines are wrapping a little bit here, let
me go ahead and run it on the left. And I'm going to count how long it takes, verbally, for
demonstration's sake. One Mississippi, two Mississippi, three Mississippi, OK, so it's like three seconds,
give or take.

- [3:54:08](https://www.youtube.com/watch?v=undefined&t=50048s) Now running it in Python, keeping


in mind, I spent way fewer hours implementing a spell checker in Python than you might have in problem
set five. But what's the trade-off going to be, and what kinds of design decisions do we all now need to
be making consciously? Here we go, on the right, in Python. One Mississippi, two Mississippi, three
Mississippi, four Mississippi, five Mississippi, six Mississippi, seven Mississippi, eight Mississippi, nine
Mississippi, 10 Mississippi, 11 Mississippi, all right, so 10 or 11 seconds.

- [3:54:37](https://www.youtube.com/watch?v=undefined&t=50077s) So which one is better? Let's go to


the group here, which of these programs is the better one? How might you answer that question, based
on demonstration alone? What do you think? AUDIENCE: I think Python's better for the programmer,
more comfortable for the programmer, but C is better for the user.

- [3:54:54](https://www.youtube.com/watch?v=undefined&t=50094s) DAVID J. MALAN: OK, so Python,


to summarize, is better for the programmer, because it was way faster to write, but C is maybe better for
the computer, because it's much faster to run. I think that's a reasonable formulation. Other opinions?
Yeah. AUDIENCE: I think it depends on the size of the project that you're dealing with.

- [3:55:10](https://www.youtube.com/watch?v=undefined&t=50110s) So if it's going to be something


that's relatively quick, I might not care that it takes 10 seconds to do it. And it could be way faster to do it
with Python. Whereas with C, if I'm dealing with something like a massive data set or something huge,
then that time is going to really build up on, it might be worth it to put in the upfront effort and just load
it into C, so the process continually will run faster over a longer period of time.

- [3:55:33](https://www.youtube.com/watch?v=undefined&t=50133s) DAVID J. MALAN: Absolutely, a


really good answer. And to summarize, it depends on the workload, if you will. If you have a very
large data set, you might want to optimize your code to be as fast and performant as it can be, especially
if you're running that code again and again. Maybe you're a company like Google.

- [3:55:47](https://www.youtube.com/watch?v=undefined&t=50147s) People are searching a huge


database all the time. You really want to squeeze every bit of performance you can out of the
computer. You might want to have someone smart take a language like C and write it at a very low level.
It's going to be painful. They're going to have bugs. They're going to have to deal with memory
management and the like.

- [3:56:03](https://www.youtube.com/watch?v=undefined&t=50163s) But if and when it works correctly,


it's going to be much faster, it would seem. By contrast, if you have a data set that's big, and 140,000
words is not small, but you don't want to spend like 5 hours, 10 hours, a week of your time, building a
spell checker or a dictionary, you can instead leverage a different language with different libraries and
build on top of it, in order to prioritize the human time instead.

- [3:56:25](https://www.youtube.com/watch?v=undefined&t=50185s) Other thoughts? AUDIENCE:


Would you, because with Python, doesn't it also like convert the words, or like convert the words, for a
lesson? When we convert that into the same version again, do we just take that into view? DAVID J.
MALAN: That's a perfect segue to exactly the next point we wanted to make, which was, is there
something in between? And indeed there is.
- [3:56:47](https://www.youtube.com/watch?v=undefined&t=50207s) I'm oversimplifying what this
language is actually doing. It's not as stark a difference as saying, like, hey, Python is four times slower
than C. Like that's not the right takeaway. There are absolutely ways that engineers can optimize
languages, as they have already done for Python. And in fact, I've configured my settings in such a way
that I've kind of dramatized just how big the difference is.

- [3:57:05](https://www.youtube.com/watch?v=undefined&t=50225s) It is going to be slower, Python,


typically, than the equivalent C program. But it doesn't have to be as big of a gap as it is here, because,
indeed, among the features you can turn on in Python is to save some intermediate results. Technically
speaking, yes, Python is interpreting dictionary.py

- [3:57:23](https://www.youtube.com/watch?v=undefined&t=50243s) and these other files,


translating them from one language to another. But that doesn't mean it has to do that every darn time
you run the program. As you propose, you can save, or cache, C-A-C-H-E, the results of that process. So
that the second time and the third time are actually notably faster. And, in fact, Python itself, the
interpreter, the most popular version thereof, itself is actually implemented in C.

- [3:57:43](https://www.youtube.com/watch?v=undefined&t=50263s) So you can make sure that your


interpreter is as fast as possible. And what then is maybe the high level takeaway? Yes, if you are going to
try to squeeze every bit of performance out of your code, and maybe code is constrained. Maybe you
have very small devices. Maybe it's like a watch nowadays. Or maybe it's a sensor that's installed in some
small format in an appliance, or in infrastructure, where you don't have much battery life and you don't
have much size, you might want to minimize just how much work is being done.

- [3:58:10](https://www.youtube.com/watch?v=undefined&t=50290s) And so the faster the code runs,


the better it's going to be, if it's implemented in something low level. So C is still very commonly used
for certain types of applications. But, again, if you just want to solve real world problems, and get real
work done, and your time is just as, if not more, valuable than the device you're running it on, long term,
you know what, Python is among the most popular languages as well.

- [3:58:32](https://www.youtube.com/watch?v=undefined&t=50312s) And frankly, if I were


implementing a spell checker moving forward, I'm probably starting with Python. And I'm not going to
waste time implementing all of that low-level stuff, because the whole point of using newer, modern
languages is to use abstractions that other people have created for you. And by abstraction, I mean
something like the dictionary function, that just gives you a dictionary, or hash table, or the equivalent
version that I used, which in this case was a set.

- [3:58:56](https://www.youtube.com/watch?v=undefined&t=50336s) All right, any questions, then, on


Python thus far? No, all right. Oh, yeah, in the middle. AUDIENCE: Could you compile the Python code, or
is there some, I'd imagine that with the audience that can happen, but it feels like if you can just come
up with a Python compiler, that would give you the best of both worlds.

- [3:59:17](https://www.youtube.com/watch?v=undefined&t=50357s) DAVID J. MALAN: Really good


question or observation, could you just compile Python code? Yes, absolutely, this idea of compiling code
or interpreting code is not native to the language itself. It tends to be native to the conventions that we
humans use. So you could actually write an interpreter for C that would read it top to bottom, left to
right, converting it to, on the fly, something the computer understands, but historically that's not been
the case.

- [3:59:38](https://www.youtube.com/watch?v=undefined&t=50378s) C is generally a compiled


language. But it doesn't have to be. What Python nowadays is actually doing is what you described
earlier. It technically is, sort of unbeknownst to us, compiling the code, technically not into 0's and 1's,
technically into something called byte code, which is this intermediate step that just doesn't take as
much time as it would to recompile the whole thing.
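
The byte code step described above can be observed from within Python itself. The following is a minimal sketch using the standard library's `compile` built-in and `dis` module; the one-line `source` string and the `<example>` filename are placeholders chosen for illustration, not something from the lecture.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = 'print("hello, world")'

# compile() turns source text into a code object that holds byte code.
code = compile(source, "<example>", "exec")

# The raw byte code is a bytes object, not native 0's and 1's for the CPU.
bytecode = code.co_code

# dis prints a human-readable listing of the byte code instructions.
dis.dis(code)
```

The exact instructions listed vary between Python versions, which is one reason byte code is treated as an internal, intermediate format.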

- [3:59:58](https://www.youtube.com/watch?v=undefined&t=50398s) And this is an area of research for


computer scientists working in programming languages, to improve these kinds of paradigms. Why?
Well, honestly, for you and I, the programmer, it's just much easier to, one, run the code and not worry
about the stupid second step of compiling it all the time. Why? It's literally half as many steps for me, the
human.

- [4:00:15](https://www.youtube.com/watch?v=undefined&t=50415s) And that's a nice thing to optimize


for. And ultimately, too, you might want all of the fancy features that come with these other languages.
So you should really just be fine-tuning how you can enable these features, as opposed to shying away
from them here. And, in fact, the only time I personally ever use C is from like September to October of
every year, during CS50.

- [4:00:34](https://www.youtube.com/watch?v=undefined&t=50434s) Almost every other month do I


reach for Python, or another language called JavaScript, to actually get real work done, which is not to
impugn C. It's just that those other languages tend to be better fits for the amount of time I have to
allocate, and the types of problems that I want to solve. All right, let's go ahead and take a five minute
break here.

- [4:00:51](https://www.youtube.com/watch?v=undefined&t=50451s) And when we come back, we'll


start writing some programs from scratch. All right. So let's go ahead and start writing some code from
the beginning here, whereby we start small with some simple examples, and then we'll build our way up
to more sophisticated examples in Python. But what we'll do along the way is first, look side by side at
what the C code looked like way back in week 1 or 2 or 3 and so forth, and then write the corresponding
Python code at right.

- [4:01:13](https://www.youtube.com/watch?v=undefined&t=50473s) And then we'll transition just to


focusing on Python itself. What I've done in advance today is I've downloaded some of the code from the
course's website, my source 6 directory, which contains all of the pre-written C code from weeks past.
But it'll also have copies of the Python code we'll write here together and look at.

- [4:01:28](https://www.youtube.com/watch?v=undefined&t=50488s) So first, here is Hello.c back from


week 0. And this was version 0 of it. I'm going to go ahead and do this. I'm going to go ahead and split
my code window up here. I'm going to go ahead and create a new file called Hello.py. And this isn't
something you'll typically have to do, laying your code out side by side.

- [4:01:45](https://www.youtube.com/watch?v=undefined&t=50505s) But I've just clicked the little icon


in VS Code that looks like two columns, that splits my code editor into two places, so that we can, in fact,
see things, for now, side by side, with my terminal window down below. All right, now I'm going to go
ahead and write the corresponding Python program on the right, which, recall, was just print, quote
unquote, "Hello, world," and that's it.
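
For reference, the Python program dictated above might look like the sketch below. The `message` variable is an addition for illustration only; the lecture's version passes the string straight to `print`.

```python
# hello.py: no main function, no includes, no semicolon.
message = "hello, world"
print(message)
```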

- [4:02:04](https://www.youtube.com/watch?v=undefined&t=50524s) Now down in my terminal


window, I'm going to go ahead and run Python of Hello.py, Enter, and voila, we've got Hello.py working.
So again, I'm not going to play any further with the C code. It's there just to jog your memory left and
right. So let's now look at a second version of Hello, world from that first week, whereby if I go and get
Hello1.c,

- [4:02:21](https://www.youtube.com/watch?v=undefined&t=50541s) I'm going to drag that over to


the right. Whoops, I'm going to go ahead and drag that over to the left here. And now, on the right, let's
modify Hello.py to look a little more like this second version in C, all right? I want to get an answer from
the user as a return value, but I also want to get some input from them.

- [4:02:38](https://www.youtube.com/watch?v=undefined&t=50558s) So from CS50, I'm going to import


the function called getString for now. We're going to get rid of that eventually, but for now, it's a helpful
training wheel. And then down here, I'm going to say, answer equals getString quote unquote, "What's
your name"? Question mark, space.

- [4:02:53](https://www.youtube.com/watch?v=undefined&t=50573s) But no semicolon, no data type.


And then I'm going to go ahead and print, just like the first example on the slide, Hello, comma space
plus answer. And now let me go ahead and run this. Python, of Hello.py, all right, it's asking me what's
my name. David. Hello comma David. But it's worth calling attention to the fact that I've also simplified
further.

- [4:03:13](https://www.youtube.com/watch?v=undefined&t=50593s) It's not just that the individual


functions are simpler. What is also now glaringly omitted from my Python code at right, both in this
version, and the previous version. What did I not bother implementing? AUDIENCE: The main code.
DAVID J. MALAN: Yeah, so I didn't even need to implement main. We'll revisit the main function, because
having a main function actually does solve problems sometimes.

- [4:03:31](https://www.youtube.com/watch?v=undefined&t=50611s) But it's no longer required. In C


you have to have that to kick-start the entire process of actually running your code. And in fact, if you
were missing main, as you might have experienced if you accidentally compiled Helpers.c instead of the
file that contained main, you would have seen a compiler error.

- [4:03:45](https://www.youtube.com/watch?v=undefined&t=50625s) In Python it's not necessary. In


Python you can just jump right in, start programming, and boom, you're good to go. Especially if it's a
small program like this, you don't need the added overhead or complexity of a main function. So that's
one other difference here. All right, there are a few other ways we could say Hello, world.

- [4:04:00](https://www.youtube.com/watch?v=undefined&t=50640s) Recall that I could use a format


string. So I could put this whole thing in quotes, I could use this f prefix. And then let me go ahead and
run Python of Hello.py again. You can perhaps see where we're going with this. Let me type my name,
David, and here we go. OK, that's the mistake that someone identified earlier, you need the curly braces.

- [4:04:18](https://www.youtube.com/watch?v=undefined&t=50658s) Otherwise no variables are


interpolated, that is substituted, with their actual values. So if I go back in and add those curly braces to
the f-string, now let me run Python of Hello.py, type in my name, and there we go. We're back in
business. Which one's better? I mean, it depends. But generally speaking, making shorter, more concise
code tends to be a good thing.
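
The three variations just described can be compared side by side. This is a small sketch in which `answer` is hard-coded as a stand-in for whatever the user would type, so that it runs without prompting.

```python
answer = "David"  # stand-in for the user's input

# Version 1: join two strings with the + operator.
concatenated = "hello, " + answer

# Version 2: an f-string without curly braces; nothing is interpolated,
# so the word "answer" appears literally.
missing_braces = f"hello, answer"

# Version 3: curly braces tell the f-string to substitute the variable's value.
with_braces = f"hello, {answer}"

print(concatenated)
print(missing_braces)
print(with_braces)
```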

- [4:04:38](https://www.youtube.com/watch?v=undefined&t=50678s) So stylistically, the f-string is


probably a reasonable instinct to have. All right, well, what more can we do besides this? Well, let me go
ahead here and let's get rid of the training wheel altogether, actually. So same C code at left. Let me get
rid of the CS50 library, which we will ultimately, in a couple of weeks, anyway.

- [4:04:56](https://www.youtube.com/watch?v=undefined&t=50696s) I can't use getString, but I can use


a function that comes with Python called input. And, in fact, this is actually a one-for-one substitution,
pretty much. There's really no downside to using input instead of getString. We implement getString just
for consistency with what you saw in C. Python of Hello.py, what's your name, David.

- [4:05:14](https://www.youtube.com/watch?v=undefined&t=50714s) Still actually works the same. So


gone are the CS50 specific training wheels. But we're going to bring them back shortly, just to deal with
integers or floats or other values, too, because it's going to make our lives a little simpler, with error
checking. All right, any questions, before we now pivot to revisiting other examples from week 1, but
now in Python? All right, let me go ahead and open up now.

- [4:05:35](https://www.youtube.com/watch?v=undefined&t=50735s) Let's say Calculator0.c, which was


one of the first examples we did involving math and operators like that, as well as functions like getInt,
let me go ahead and create a new file now called Calculator.py, at right, so that I have my C code at left
still, and my Python code at right. All right, let me go dive into a translation of this code into Python.

- [4:05:57](https://www.youtube.com/watch?v=undefined&t=50757s) I am going to use getInt from the


CS50 library. So let me import that. I'm going to go ahead now and get an Int from the user. So x equals
getInt, and I'll ask them for an x value, just like we did weeks ago. No need to specify a semicolon,
though, or an Int for the x. It will just figure it out. Y is going to get another Int via y colon, and then
down here, I'm going to go ahead and say print of x plus y.

- [4:06:23](https://www.youtube.com/watch?v=undefined&t=50783s) So this is already a bit new. Recall,


the C version required that I use this format string, as well as printf itself. Python is just a little more user-
friendly. If all you want to do is print out a value, like x plus y, just print it. Don't futz with any percent
signs or format codes. It's not printf, it's indeed just print now.

- [4:06:42](https://www.youtube.com/watch?v=undefined&t=50802s) All right, let me go ahead and run


Python of Calculator.py, Enter, just do a quick sample, 1 plus 2 indeed equals 3. As an aside, suppose I
had taken a different approach and imported the whole CS50 library. Functionally, it's the same. You're not
going to notice any performance impact here. It's a small library.

- [4:06:59](https://www.youtube.com/watch?v=undefined&t=50819s) But notice what does not work


now, whereas it did work in C. Python of Calculator.py, Enter, we see our first traceback deliberately here.
So a traceback is just a term of art that says, here is a trace back through all of the functions that just got
executed. In the world of C, you might call this a stack trace, stack being the operative word.

- [4:07:19](https://www.youtube.com/watch?v=undefined&t=50839s) Recall that when we talked about


the stack and the heap, the stack, like a stack of trays, was all of the functions that might get called, one
after the other. We had main, we had swap, then swap went away, and then main finished, recall. So
here's a trace back of all of the functions or code that got executed.

- [4:07:35](https://www.youtube.com/watch?v=undefined&t=50855s) There's not really any functions


other than my file itself. Otherwise there'd be more detail. But even though it's a little cryptic, we can
perhaps infer from the output here, name error, so something related to the name of something, name,
getInt is not defined. And this of course, happens on line 3 over there.

- [4:07:51](https://www.youtube.com/watch?v=undefined&t=50871s) All right, so why is that? Well,


Python essentially allows us to namespace our functions that come from libraries. There was a problem
in C. If you were using the CS50 library, and thus had access to getInt, getString, and so forth, you could
not use another library that had the same function names. They would collide, and the compiler would
not know how to link them together correctly.

- [4:08:13](https://www.youtube.com/watch?v=undefined&t=50893s) In Python, and other languages


like JavaScript, and in Java, you have support for effectively what would be called namespaces. You can
isolate variables and function names to their own namespace, like their own container in memory. And
what this means is, if you import all of CS50, you have to say that the getInt you want is inside the CS50
library.

- [4:08:36](https://www.youtube.com/watch?v=undefined&t=50916s) So just like with the image


blurring, and the image edges before, where I had to specify image dot and image filter dot, similarly
here, am I specifying with a dot operator, albeit a little differently, that I want CS50.getInt in both places.
And now if I rerun Python of Calculator.py, 1 and 2, now we're back in business.
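
The same namespacing behavior can be reproduced without the CS50 library installed. The sketch below uses the standard `math` module purely as a stand-in for the course library (an assumption made so the example is self-contained); the names `root`, `bare_name_error`, and `direct` are illustrative.

```python
import math

# After a plain "import math", functions live inside the math namespace.
root = math.sqrt(16)

# The bare name has not been imported yet, so using it raises NameError,
# the same kind of error shown in the traceback above.
try:
    sqrt(16)
    bare_name_error = None
except NameError as error:
    bare_name_error = str(error)

# Importing the name directly makes the unprefixed form available.
from math import sqrt

direct = sqrt(16)

print(root, direct)
print(bare_name_error)
```

Importing the whole module keeps names from different libraries apart, at the cost of typing the prefix each time.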

- [4:08:56](https://www.youtube.com/watch?v=undefined&t=50936s) Which one is better? Generally


speaking, it depends on just how many functions you're using from the library. If you're using a whole
bunch of functions, just import the whole thing. If you're only using maybe one or two, import them line
by line. All right, so let's go ahead and make a little tweak here.

- [4:09:12](https://www.youtube.com/watch?v=undefined&t=50952s) Let's get rid of this library and


take this training wheel off, too, as quickly as we introduced it, though for problem set 6 you'll be
able to use all of these same functions. Suppose I get rid of this, and I just use the input function, just like
I did by replacing getString earlier. Let me go ahead now and run this version of the code.

- [4:09:31](https://www.youtube.com/watch?v=undefined&t=50971s) Python of Calculator.py, OK, how


about 1 plus 2 equals 3. Huh. All right, obviously wrong, incorrect. Can anyone explain what just
happened, based on instincts? What just happened here. Yeah. AUDIENCE: You want an answer? DAVID
J. MALAN: Sure, yeah. AUDIENCE: Say you have a number of strings that don't have Ints, so you would
part with them and say, printing one, two, better.

- [4:09:58](https://www.youtube.com/watch?v=undefined&t=50998s) DAVID J. MALAN: Exactly, Python


is interpreting, or treating, both x and y as strings, which is actually what the input function returns by
default. And so plus is now being interpreted as concatenation, as we defined it earlier. So x plus y isn't x
plus y mathematically, but in terms of string joining, just like in Scratch.

- [4:10:15](https://www.youtube.com/watch?v=undefined&t=51015s) So that's why we're getting 12, or


really one two, which isn't itself a number. It, too, is another string. So we somehow need to convert
things. And we didn't have this ability quite as easily in C. We did have like the atoi function, ASCII to
integer, which did allow you to do this. The analog in Python is actually just to do a cast, a typecast, using
Int.

- [4:10:36](https://www.youtube.com/watch?v=undefined&t=51036s) So just like in C, you can use the


keyword Int, but you use it a little differently. Notice that I'm not doing parenthesis Int close parenthesis
before the value. I'm using Int as a function. So indeed, in Python, Int is a function. Float is a function,
that you can pass values into, to do this kind of conversion.
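
A short sketch of the difference between joining strings and adding numbers follows. The values of `x` and `y` are hard-coded strings, standing in for what `input` would return if the user typed 1 and 2.

```python
# input() always returns a str, so these stand in for the user's keystrokes.
x = "1"
y = "2"

# With two strings, + means concatenation.
joined = x + y

# Calling int() on each string converts it before the addition.
total = int(x) + int(y)

# float() works the same way for values with a decimal point.
half = float(x) / float(y)

print(joined)
print(total)
print(half)
```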

- [4:10:55](https://www.youtube.com/watch?v=undefined&t=51055s) So now, if I run Python of


Calculator.py, 1 and 2, now we're back in business, and getting the answer of 3. But there's kind of a
catch here. There's always going to be a trade-off. Like that sounds amazing that it just works in this way.
We can throw away the CS50 library already. But what if the user accidentally types, or maliciously types
in, like a cat, instead of a number.

- [4:11:16](https://www.youtube.com/watch?v=undefined&t=51076s) Damn, well, there's one of these


tracebacks. Like, now my program has crashed. This is similar in spirit to the kinds of segfaults that you
might have had in C. But they're not segfaults per se. It doesn't necessarily relate to memory. This time it
relates to actual runtime values, not being as expected.

- [4:11:32](https://www.youtube.com/watch?v=undefined&t=51092s) So this time it's not a name error,


it's a value error, invalid literal for Int with base 10 quote unquote "cat." So, again, it's written for sort of
a programmer, more than sort of a typical person, because it's pretty arcane, the language here. But let's
try to interpret it.

- [4:11:47](https://www.youtube.com/watch?v=undefined&t=51107s) Invalid literal, a literal is just


something someone typed for Int, which is the function name, with base 10. It's just defaulting to
decimal numbers. Cat is apparently not a decimal number. It doesn't look like it, therefore it can't be
treated like it. Therefore, there's a value error. So what can we do? Unfortunately, you would have to
somehow catch this error.

- [4:12:07](https://www.youtube.com/watch?v=undefined&t=51127s) And the only way to do that in


Python really is by way of another feature that C did not have, namely, what are called exceptions. An
exception is exactly what just happened, name error, value error. They are things that can go wrong
when your Python code is running, that aren't necessarily going to be detected until you run your code.

- [4:12:27](https://www.youtube.com/watch?v=undefined&t=51147s) So in Python, and in JavaScript,


and in Java, and other more modern languages, there's this ability to actually try to do something,
except if something goes wrong. And in fact, I'm going to introduce a bit of syntax here, even though we
won't have to use this much just yet. Instead of just blindly converting x to an Int, let me go ahead and
try to do that.

- [4:12:48](https://www.youtube.com/watch?v=undefined&t=51168s) And if there's an exception, go


ahead and say something like print, that is not an Int. And then I'm going to do something like exit, right
there. And let me go ahead and do this here. Let me try to get y, except if there's an exception. Then let
me go ahead and say, again, that is not an Int exclamation point.

- [4:13:13](https://www.youtube.com/watch?v=undefined&t=51193s) And then I'm going to exit from


there, too. Otherwise I'll go ahead and print x plus y. If I run Python of Calculator.py now, whoops, oh,
forgot my close quote, sorry. All right, so close quote, Python of Calculator.py, 1 and 2 still work. But if I
try to type in something wrong like cat, now it actually detects the error.
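
The try and except idea can be sketched as follows. To keep the example runnable without a keyboard, the text `"cat"` is hard-coded in place of the user's input, and the program records the error instead of exiting; both are simplifications relative to what was typed in the lecture.

```python
text = "cat"  # stand-in for what the user typed

try:
    # int() raises ValueError when the text does not look like a number.
    x = int(text)
    message = None
except ValueError as error:
    # Control jumps here instead of the program crashing with a traceback.
    x = None
    message = str(error)
    print("That is not an int!")

print(message)
```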

- [4:13:36](https://www.youtube.com/watch?v=undefined&t=51216s) So what is the CS50 library in


Python doing? It's actually doing that try and except for you, because suffice it to say, otherwise your
programs for something simple, like a calculator, start to get longer and longer. So we factored that kind
of logic out to the CS50 getInt function and get float function.

- [4:13:51](https://www.youtube.com/watch?v=undefined&t=51231s) But underneath the hood, they're


essentially doing this, try except, but they're being a little more precise. They're detecting a specific
error, and they are doing it in a loop, so that these functions will get executed again and again. In fact,
the best way to do this is to say except if there's a value error, then print that error message out to the
user.
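
A rough sketch of that loop is shown below. It is not the CS50 library's actual implementation: the name `get_int_sketch`, the `read` parameter, and the error message are all hypothetical, and `read` exists only so the function can be exercised without typing at a prompt.

```python
def get_int_sketch(prompt, read=input):
    """Keep asking until the reply can be converted to an int."""
    while True:
        reply = read(prompt)
        try:
            return int(reply)
        except ValueError:
            # Catch only the specific error, then loop and ask again.
            print("That is not an int!")


# Simulated user who first types "cat" and then "2".
replies = iter(["cat", "2"])
value = get_int_sketch("x: ", read=lambda prompt: next(replies))
print(value)
```

Catching `ValueError` specifically, rather than every exception, means an unrelated bug would still surface as a traceback instead of being silently swallowed.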

- [4:14:11](https://www.youtube.com/watch?v=undefined&t=51251s) And again, let's not get too into


the weeds here with this feature. We've already put it into the CS50 library. But that's why, for instance,
we bootstrap things, by just using these functions out of the box. All right, let's do something more with
our calculator here. How about this. In the world of C, we had another version of this code, which
actually did division of numbers, not just the addition here.

- [4:14:38](https://www.youtube.com/watch?v=undefined&t=51278s) So let me go ahead and close the


C version, and let's focus only on Python now, doing some of these same lines of code. But I'm going to
go ahead and just assume that the user is going to cooperate and use proper input. So from CS50, import
getInt, that will deal with any errors for me. X gets getInt, ask the user for an Int x, y equals getInt, ask
the user for an Int y.

- [4:15:02](https://www.youtube.com/watch?v=undefined&t=51302s) And then, let's go ahead and do


this. Let's declare a variable called z, set it equal to x divided by y. Then let's go ahead and print z. Still no
need for a format string, I can just print out the variable's value. Let me go ahead and run Python of
Calculator.py. Let me do 1, 10, and I get 0.1.

- [4:15:20](https://www.youtube.com/watch?v=undefined&t=51320s) What did I get in C, though, if you


think back. What would have happened in C? AUDIENCE: Zero? DAVID J. MALAN: Yeah, we would
have gotten zero in C. But why, in C, when you divide one Int by another, and those Ints are like 1 and 10
respectively? AUDIENCE: It'll give you an integer back. DAVID J.

- [4:15:40](https://www.youtube.com/watch?v=undefined&t=51340s) MALAN: It will give you what?


AUDIENCE: An integer back. DAVID J. MALAN: It will give you an integer back, and, unfortunately, 0.1, the
integer part of it is indeed zero. So this was an example of truncation. So truncation was an issue in C.
But it would seem as though this is no longer a problem in Python, insofar as the division operator
actually handles that for us.

- [4:15:58](https://www.youtube.com/watch?v=undefined&t=51358s) As an aside, if you want the old


behavior, because it actually is sometimes useful for rounding or flooring values, you can actually use
two slashes. And now you get the C behavior. So that now 1 divided by 10 is zero. So you don't give up
that capability, but at least it does a more sensible default. Most people, especially new programmers,
when dividing one value by another, would want to get 0.1,

- [4:16:21](https://www.youtube.com/watch?v=undefined&t=51381s) not 0, for reasons that indeed


we had to explain weeks ago. But what about another problem we had with the world of floats before,
whereby there is imprecision? Let me go ahead and, somewhat cryptically, print out the value of z as
follows. I'm going to format it using an f-string. And I'm going to go ahead and format, not just z, because
this is essentially the same thing.

- [4:16:40](https://www.youtube.com/watch?v=undefined&t=51400s) Notice this, if I do Python of


Calculator.py, 1 and 10, I get, by default, just one significant digit. But if I use this syntax in Python, which
we won't have to use often, I can actually do, like I did in C before, 50 significant digits after the decimal
point. So now let me rerun Python of Calculator.py,

- [4:17:01](https://www.youtube.com/watch?v=undefined&t=51421s) 1 and 10, and let's see if


floating point imprecision is still with us. Unfortunately, it is. And you can see as much here, the f-string,
the format string, is just showing us now 50 digits instead of the default one. So we've not solved all
problems. But we have solved at least some. All right, before we pivot away from a mere calculator, any
questions now on syntax or concepts or the like? Yeah.
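
The division behavior and the lingering imprecision can be sketched together. The inputs 1 and 10 are hard-coded here rather than read from the user, and the format specifier `.50f` is one way (assumed to match what was typed on screen) to request 50 digits after the decimal point.

```python
x = 1
y = 10

# A single slash performs true division, even on two ints.
z = x / y

# Two slashes floor the result, which matches C's behavior for these values.
floored = x // y

# Ask for 50 digits after the decimal point to expose the imprecision.
precise = f"{z:.50f}"

print(z)
print(floored)
print(precise)
```

The last line printed is not a one followed by zeros, because 0.1 cannot be represented exactly in binary floating point.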

- [4:17:23](https://www.youtube.com/watch?v=undefined&t=51443s) AUDIENCE: Do you think the


double slash you get has merit, how do you comment on that? DAVID J. MALAN: How do you what? Oh,
how do you comment. Really good question, if you're using double slash for division with flooring or
truncation, like I described, how do you do a comment in Python. This is a comment.

- [4:17:40](https://www.youtube.com/watch?v=undefined&t=51460s) And the convention is actually to


use a complete sentence, like with a capital T here. You don't need a period unless there's multiple
sentences. And technically, it should be above the line of code by convention. So you would use a hash
symbol instead. Good question. I haven't seen those yet. All right, let's go ahead and make something
else here, how about.

- [4:17:57](https://www.youtube.com/watch?v=undefined&t=51477s) Let me go ahead and open up, for


instance, an example called Points1.c, which we saw a few weeks back. And let me go ahead on the
other side and create a file called Points.py. This was a program, recall, that asked the user how many
points they lost on the first assignment. And then it went ahead and just printed out whether they lost
fewer points than me, because I lost two, if you recall the photo, more points than me, or the same
points as me.

- [4:18:24](https://www.youtube.com/watch?v=undefined&t=51504s) Let me go ahead and zoom out so


we can see a bit more of this. And let me now, on the top right here, go about implementing this in
Python. So I want to first prompt the user for some number of points. So from CS50 let's import getInt,
so it handles the error-checking. Let's then do points equals getInt, and ask the user, how many points
did you lose, question mark.

- [4:18:44](https://www.youtube.com/watch?v=undefined&t=51524s) Then let's go ahead and say, if


points less than two, which was my value, print, you lost fewer points than me. Otherwise, if it's else if
points greater than 2, go ahead and print, you lost more points than me. Else let's go ahead and handle
the final scenario, which is you lost the same number of points as me.

- [4:19:11](https://www.youtube.com/watch?v=undefined&t=51551s) Before I run this, does anyone


want to point out a mistake I've already made? Yeah. AUDIENCE: Else if has to be elif. DAVID J. MALAN:
Yeah, so else if in C is actually now elif in Python. It's a single word. So let me change this to elif, and now
cross my fingers, Python of Points.py, suppose you lost three points on some assignment.
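
The corrected conditional might look like the sketch below. Wrapping the logic in a function named `compare_points` and using a constant `MY_POINTS` are choices made here so the branches can be exercised without a prompt; the lecture's version reads `points` from the user and prints directly.

```python
MY_POINTS = 2  # the instructor's own score, per the lecture


def compare_points(points):
    """Return the message for a given number of lost points."""
    if points < MY_POINTS:
        return "You lost fewer points than me."
    elif points > MY_POINTS:  # elif, not "else if"
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


print(compare_points(3))
print(compare_points(1))
```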

- [4:19:30](https://www.youtube.com/watch?v=undefined&t=51570s) You lost more points than my two.


If you only lost one point, you lost fewer points than me. So the logic is the same. But notice the code is
much tighter. In 10 total lines, we did what took 24 lines, because we've thrown away a lot of the
syntax. The curly braces are no longer necessary. The parentheses are gone, the semicolons.

- [4:19:47](https://www.youtube.com/watch?v=undefined&t=51587s) So this is why it just tends to be


more pleasant pretty quickly, using a language like this. All right, let's do one other example here. In C,
recall that we were able to determine the parity of some number, if something is even or odd. Well, in
Python, let me go ahead and create a file called Parity.py,

- [4:20:06](https://www.youtube.com/watch?v=undefined&t=51606s) and let's look for a moment at


the C version at left. Here was the code in C that we used to determine the parity of a number. And,
really, the key takeaway from all these lines was just the remainder operator. And that one is still with us.
So this is a simple demonstration, just to make that point, if in Python, I want to determine whether a
number is even or odd.

- [4:20:25](https://www.youtube.com/watch?v=undefined&t=51625s) Well, let's go ahead and from


CS50, import getInt, then let's go ahead and get a number like n from the user, using getInt, and ask
them for n. And then let's go ahead and say, if n percent sign 2 equals 0, then let's go ahead and print
quote unquote "Even." Else let's go ahead and print out Odd, but before I run this, anyone want to
instinctively, even though we've not talked about this, point out a mistake here? What I did wrong?
AUDIENCE: Double equals.

- [4:20:57](https://www.youtube.com/watch?v=undefined&t=51657s) DAVID J. MALAN: Yeah, so double


equals. Again, so even though some of the stuff is changing, some of the ideas are the same. So
this, too, should be a double equal sign, because I'm comparing for equality here. And why is this the
right math? Well, if you divide a number by 2, it's either going to have 0 or 1 as a remainder.

- [4:21:13](https://www.youtube.com/watch?v=undefined&t=51673s) And that's going to determine if


it's even or odd for us. So let's run Python of Parity.py, type in a number like 50, and hopefully we get,
indeed, even. So again, same idea, but now we're down to eight lines of code instead of the 20. Well,
let's now do something a little more interactive and a little representative of tools that actually ask the
user questions.
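
The parity check, with the double equals sign in place, can be sketched as a small function. The name `parity` and the returned strings are illustrative; the lecture's version reads `n` with the CS50 library and prints the result.

```python
def parity(n):
    """Return "even" or "odd" based on the remainder after dividing by 2."""
    if n % 2 == 0:  # == compares; a single = would be assignment
        return "even"
    else:
        return "odd"


print(parity(50))
print(parity(7))
```

In Python the remainder takes the sign of the divisor, so `-3 % 2` is 1 and negative odd numbers are still reported as odd.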

- [4:21:31](https://www.youtube.com/watch?v=undefined&t=51691s) In C, recall that we had this


agreement program, Agree.c. And then let's go ahead and implement a corresponding version in Python,
in a file called Agree.py. And let's look at the C version first. On the left, we used get char here. And then
we used the double vertical bars to check if C is equal to capital Y or lowercase y.

- [4:21:53](https://www.youtube.com/watch?v=undefined&t=51713s) And then we did the same thing


for n for no. And so let's go over here and let's do from CS50, import get-- OK, get char is not a thing. And
this here is another difference with Python. There is no data type for individual characters. You have
strings, STRs, and, honestly, those are fine, because if you have a STR that's just one character, for all
intents and purposes, it is just a single character.

- [4:22:17](https://www.youtube.com/watch?v=undefined&t=51737s) So it's just a simplification. You


don't have to think as much. You don't have to worry about double quotes, single quotes. In fact, in
Python, you can use double quotes or single quotes, so long as you're consistent. So long as you're
consistent, the single quotes do not mean something different, like they do in C.

- [4:22:32](https://www.youtube.com/watch?v=undefined&t=51752s) So I'm going to go ahead and use


getString here, although, strictly speaking, I could just use the input function, as we saw before. I'm
going to get a string from the user that asks them this, getString, quote unquote, "Do you agree," like a
little checkbox or interactive prompt, where you have to say yes or no, you want to agree to the
following terms, or whatnot.

- [4:22:51](https://www.youtube.com/watch?v=undefined&t=51771s) And then let's translate the


conditionals to Python, now, too. So if S equals equals quote-unquote "Y," or S equals equals lowercase y,
let's go ahead and print out agreed, just like in C, elif S equals equals N or S equals equals little n. Let's go
ahead, then, and print out not agreed.
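
A minimal sketch of this first translation. The function name `respond` and the exact message strings are illustrative assumptions, not taken from the lecture's file, and the built-in `input` stands in for CS50's `get_string` so the sketch runs without the library.

```python
def respond(s):
    """Return the message for a one-character answer, or None if unrecognized."""
    # Python spells logical OR as the word `or`, where C uses `||`.
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    return None
```

To try it interactively, one could write `print(respond(input("Do you agree? ")))`.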

- [4:23:15](https://www.youtube.com/watch?v=undefined&t=51795s) And you can already see, perhaps,


one of the differences here, too: Python is a little more English-like, in that you just literally use the
English word or, instead of the two vertical bars. But it's ultimately doing the same thing. Can we simplify
this code a bit, though? This would be a little annoying if we wanted to add support, not just for big Y and
little y, but Yes or big Yes or little yes or big Y, lowercase e, capital S, right? There's a lot of permutations
of Y-E-S or just y, that we ideally should tolerate.

- [4:23:45](https://www.youtube.com/watch?v=undefined&t=51825s) Otherwise, the user is going to


have to type exactly what we want, which isn't very user-friendly. Any intuition for how we could
logically, even if you don't know how to do it in code, make this better? Yeah. AUDIENCE: Write way over
the list, and then up, it's like the things in the list. DAVID J. MALAN: Nice, yeah, we saw an example of a
list before, just 0, 1, 2.

- [4:24:04](https://www.youtube.com/watch?v=undefined&t=51844s) Why don't we take that same idea


and ask a similar question. If S is in the following list of values, Y or little y, or heck, let me add to the list
now, yes, or maybe all capital YES. And it's going to get a little annoying, admittedly, but this is still better
than the alternative, with all the or's.

- [4:24:20](https://www.youtube.com/watch?v=undefined&t=51860s) I could do things like this, and so


forth. There's a whole bunch more permutations. But let's leave this alone, and let me just go into here
and change this to, if S is in the following list of N or little n or no, and I won't do as, let's just not worry
about the weird capitalizations there, for now.
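
The list-based version described above might look like the following sketch. As before, `respond` is a hypothetical helper name used only so the logic can be exercised without typing at a prompt.

```python
def respond(s):
    """Check the answer against lists of accepted spellings."""
    # `in` tests membership, replacing a chain of `or` comparisons.
    if s in ["Y", "y", "YES", "yes"]:
        return "Agreed."
    elif s in ["N", "n", "no"]:
        return "Not agreed."
    return None
```

Any spelling missing from the lists, such as a capitalized `"Yes"`, falls through both branches and produces neither message.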

- [4:24:38](https://www.youtube.com/watch?v=undefined&t=51878s) Let's go ahead and run this.


Python of Agree.py, do I agree? Y. OK, how about yes? All right, how about big Yes. OK, that does not
seem to work. Notice it did not say agreed, and it did not say not agreed. It didn't detect it. So how can I
do this? Well, you know what I could do? I don't really need the uppercase and lowercase.
- [4:24:59](https://www.youtube.com/watch?v=undefined&t=51899s) Let me tighten this list up a little
bit. And why don't I just force S to be lowercase. S.lower, recall, whether it's one character or more, is a
function built into STRs now, strings in Python, that forces the whole thing to lowercase. So now, watch
what I can do. Python of Agree.py, little y, that works, big Y, that works.

- [4:25:19](https://www.youtube.com/watch?v=undefined&t=51919s) Big Yes, that works, big Y, little e,


big S, that also works. So we've now handled, in one fell swoop, a whole bunch more logic. And you
know what, we can tighten this up a bit. Here's an opportunity, in Python, for slightly better design.
What have I done in here that's a little redundant? Does anyone see an opportunity to eliminate a
redundancy, doing something more times than you need.

- [4:25:43](https://www.youtube.com/watch?v=undefined&t=51943s) Is a stretch here, no. Yep.


AUDIENCE: You can do S dot lower, above. DAVID J. MALAN: We could move the S dot lower above.
Notice that I'm using S dot lower twice. But it's going to give me the same answer both times. So I could
do a couple of things here. I could, first of all, get rid of this lower, and get rid of this lower, and then
above this, maybe I could do something like this, S equal-- I can't just do this, because that throws the
value away.

- [4:26:08](https://www.youtube.com/watch?v=undefined&t=51968s) It does the math, but it doesn't


convert the string itself. It's going to return a value. So I have to say S equals s.lower. I could do that. Or,
honestly, I can chain these things together. And this is not something we saw in C. If getString returns a
string, and strings have functions like lower in them, you can chain these functions together, like this,
and do dot this, dot that, dot this other thing.

- [4:26:30](https://www.youtube.com/watch?v=undefined&t=51990s) And eventually you want to stop,


because it's going to become crazy long. But this is reasonable, still fits on the screen. It's pretty tight. It
does in one place what I was doing in two. So I think that's OK. Let me go ahead and do Python of
Agree.py one last time. Let's try it one last time.
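
A sketch of the tightened version, assuming the same hypothetical `respond` helper. The key point is that `str.lower()` returns a new string, so its result has to be assigned (or used directly) rather than called and discarded.

```python
def respond(s):
    """Lowercase the answer once, then compare against lowercase spellings only."""
    s = s.lower()  # a bare `s.lower()` would compute the value and throw it away
    if s in ["y", "yes"]:
        return "Agreed."
    elif s in ["n", "no"]:
        return "Not agreed."
    return None


# Methods can be chained because each call returns a string.
# `strip` is not mentioned in the lecture; it is used here only to show a longer chain.
cleaned = "  YeS ".strip().lower()
```

At an interactive prompt, the chained form from the lecture would read `s = input("Do you agree? ").lower()`, with `input` standing in for CS50's `get_string`.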

- [4:26:44](https://www.youtube.com/watch?v=undefined&t=52004s) And it's still working as intended.


Also if I tried those other inputs as well. Yeah, question. AUDIENCE: Could you add on like a for
uppercase as well, for like upper, and then cover all the functions where it's lowercase, for all the
functions where it's uppercase as well, or could you not just do this again.

- [4:27:02](https://www.youtube.com/watch?v=undefined&t=52022s) DAVID J. MALAN: Let me


summarize. Could we handle uppercase and lowercase together in some form? I'm actually doing that
already. I just have to pick a lane. I have to either be all lowercase in my logic or all uppercase, and not
worry about what the human types in, because no matter what the human types in, I'm forcing their
input to lowercase.

- [4:27:21](https://www.youtube.com/watch?v=undefined&t=52041s) And then I am using a lowercase


list of values. If I want to flip that, fine. I just have to be self-consistent. But I'm handling that already.
Yeah. AUDIENCE: Are strings no longer an array of characters? DAVID J. MALAN: A really good, loaded
question: are strings no longer an array of characters? Conceptually, yes, underneath the hood, no.

- [4:27:41](https://www.youtube.com/watch?v=undefined&t=52061s) They're a little more sophisticated


than that, because with strings, you have a few changes. Not only do they have functions built into them,
because strings are now what we call objects, in what's called object-oriented programming. And we're
going to keep seeing examples of this dot operator.

- [4:27:54](https://www.youtube.com/watch?v=undefined&t=52074s) They are also immutable, so to


speak, I-M-M-U-T-A-B-L-E. Immutable means they cannot be changed, which means, unlike C, you can't
go into a string and change its individual characters. You can make a copy of the string that makes a
change, but you can't change the original string itself. This is both a little annoying, maybe, sometimes.
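
A small illustration of immutability, as a sketch rather than anything shown in the lecture: assigning to one character of a `str` raises a `TypeError`, while string methods hand back a modified copy and leave the original alone.

```python
s = "hi!"

try:
    s[0] = "H"  # item assignment is not supported on str objects
except TypeError as error:
    print("Cannot modify in place:", error)

t = s.capitalize()  # builds a new string; s itself is untouched
print(s, t)
```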

- [4:28:12](https://www.youtube.com/watch?v=undefined&t=52092s) But it's also pretty protective,


because you can't do screw-ups like I did weeks ago, when I was trying to copy S and call it T. And then
one affected the other. Python, underneath the hood, is handling all of the memory management and
the pointers and all of that. There are no pointers in Python.

- [4:28:28](https://www.youtube.com/watch?v=undefined&t=52108s) So if that wasn't clear, all of that


pain, if you will, all of that power, is now handled by the language itself, not by us, the programmers. All
right, so let's introduce maybe some loops, like we've been in the habit of doing. Let me open up
Meow.c, which was an example in C, just meowing a bunch of times textually.

- [4:28:46](https://www.youtube.com/watch?v=undefined&t=52126s) Let me create a file called


Meow.py here on the right. And notice on the left, this was correct code in C, but it was kind of poorly
designed. Why? Because it was a missed opportunity for a loop. Why say something three times when
you can say it just once? So in Python, let me do it the poorly designed way first.

- [4:29:03](https://www.youtube.com/watch?v=undefined&t=52143s) Let me print out meow. And, like I


generally should not, let me copy, paste it three times, run Python of Meow.py, and it works. OK, but not
good practice. So let me go ahead and improve this a little bit. And there's a few ways to do this. If I
wanted to do this three times, I could instead do something like this.

- [4:29:21](https://www.youtube.com/watch?v=undefined&t=52161s) For i in range of 3, recall that that


was the better version, rather than arbitrarily enumerate numbers yourself, let me go ahead and print
out quote unquote "Meow." Now if I run Python of Meow, still seems to work. So it's a little tighter, and,
my God, like, programs can't really get much shorter than this.

- [4:29:36](https://www.youtube.com/watch?v=undefined&t=52176s) We're down to two lines of code,


no main function, no gratuitous syntax. Let's now improve the design further, like we did in C, by
introducing a function called meow, that actually does the meowing. So this was our first abstraction,
recall, both in Scratch and in C.

- [4:29:55](https://www.youtube.com/watch?v=undefined&t=52195s) Let me focus now entirely on the


Python version here. Let me go ahead and first define a function. Let me first go ahead and do this, for i
in range of 3, let's assume for the moment that there's a meow function, that I'm just going to call. Let's
now go ahead and define, using the def keyword, which we saw briefly with the speller demonstration,
a function called meow that takes no arguments.

- [4:30:19](https://www.youtube.com/watch?v=undefined&t=52219s) And all it does for now is print


meow. Let me now go ahead and run Python of Meow.py Enter, huh, one of those trace backs. So this is
another name error. And, again, name meow is not defined. What's your instinct here, even though
we've not tripped over this yet in Python? Where does your mind go here? Yeah.
- [4:30:40](https://www.youtube.com/watch?v=undefined&t=52240s) AUDIENCE: Does it read top to
bottom, left to right? I'm guessing we could find a new case. DAVID J. MALAN: Perfect. As smart as
Python seems to be, it still makes certain assumptions. And if it hasn't seen a name yet, it
just doesn't exist. So if you want it to exist, we have to be a little clever here.

- [4:30:58](https://www.youtube.com/watch?v=undefined&t=52258s) I could just put it, flip it around,


like this. But this honestly isn't particularly good design. Why? Because now, if you, the reader of your
code, whether you wrote it or someone else, you kind of have to go fishing now. Like where does this
program begin? And even though, yes, it's obvious that it begins on line four, logically, like, if the file
were longer, you're going to be annoyed and fishing visually for the right lines of code.

- [4:31:20](https://www.youtube.com/watch?v=undefined&t=52280s) So let's reintroduce main. And


indeed, this would be a common paradigm. When you want to start having abstractions in your own
functions, just put your own code in main, so that, one, you can leave it up top, and two, you can solve
the problem we just encountered. So let me define a function called main that has that same loop,
meowing three times.

- [4:31:37](https://www.youtube.com/watch?v=undefined&t=52297s) But now watch what happens. Let


me go into my terminal and run Python of Meow.py, Enter. Nothing. All right, investigate this. What could
explain this symptom. I have not told you the answer yet. So all you have is your instinct, assuming
you've never touched Python before. What might explain this symptom, where nothing is meowing?
Yeah? AUDIENCE: Didn't run the main function.

- [4:32:05](https://www.youtube.com/watch?v=undefined&t=52325s) DAVID J. MALAN: Yeah, I didn't


run the main function. So in C, this is functionality you get for free. You have to have a main function.
But, heck, so long as you make it, it will be called for you. In Python, this is just a convention, to create a
main function, borrowing a very common name for it. But if you want to call that main function, you
have to do it.

- [4:32:23](https://www.youtube.com/watch?v=undefined&t=52343s) So this looks a little weird,


admittedly, that you have to call your own main function now, and it has to be at the bottom of the file,
because only once the interpreter gets to the bottom of the file, have all of your functions been defined,
higher up. But this solves both problems. It keeps your code, that's the main part of your code, at the
very top of the file.

- [4:32:40](https://www.youtube.com/watch?v=undefined&t=52360s) So it's just obvious to you, and a


TF, or any reader in the future, where the program logically starts. But it also ensures that main is not
called until everything else, main included, has been defined. So this is another perfect example of we're
learning a new language for the first time. You're not going to have heard all of the answers before.
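
The arrangement described above can be sketched as follows. Calling `meow()` inside `main` is fine even though `meow` is defined further down, because the name is only looked up when `main` actually runs, and that happens on the last line.

```python
def main():
    for i in range(3):
        meow()  # resolved when main runs, not when main is defined


def meow():
    print("meow")


main()  # by this line, both functions have been defined
```

Moving the call to `main()` above `def meow()` would bring back the NameError, since `meow` would not exist yet at the moment `main` runs.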

- [4:32:58](https://www.youtube.com/watch?v=undefined&t=52378s) Just apply some logic, as to, like,


all right, what could explain this symptom. Start to infer how the language does or doesn't work. If I now
go and run this, Python of Meow.py, now we're back in business. And just so you have seen it, there is a
quote unquote "better" way of doing this, that solves different problems that we are not going to
encounter, certainly in these initial days.

- [4:33:19](https://www.youtube.com/watch?v=undefined&t=52399s) Typically, you would see in online


tutorials or books, something that looks like this, where you actually have a weird conditional with
multiple underscores. That's functionally the same thing, but it solves problems with libraries, if we
ourselves were implementing a library or something similar in spirit.
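
The transcript does not show the code on screen here, but the "weird conditional with multiple underscores" is presumably the standard Python idiom below. Under that assumption, `main` runs only when the file is executed directly, and not when the file is imported by another program.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# __name__ is "__main__" when this file is run as a script,
# and the module's own name when it is imported instead.
if __name__ == "__main__":
    main()
```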

- [4:33:34](https://www.youtube.com/watch?v=undefined&t=52414s) But we're going to keep things


simpler and just write main at the bottom, because we're not going to encounter that problem just yet.
All right, let's make one change to this, just to show how it's done. In C, the last version of meow also
took command line argument, sorry, also took arguments to the function meow.

- [4:33:50](https://www.youtube.com/watch?v=undefined&t=52430s) So suppose that I want to factor


this out. And I want to just call meow as a better abstraction, where I just say meow this number of
times. And I figure out how many times by just, like, putting in number 3 or using getInt or something
like that, to figure out how many times to say meow. Well, now, I have to define inside my meow
function, in input, let's call it n, and then use that, as by doing this, for i in range of n, let me go ahead
and print out meow that many times.
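
A sketch of the parameterized version. The hardcoded `3` is one of the two options mentioned above; reading the count from the user would be the other.

```python
def main():
    meow(3)


def meow(n):
    # No return type and no parameter type are declared, unlike in C.
    for i in range(n):
        print("meow")


main()
```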

- [4:34:18](https://www.youtube.com/watch?v=undefined&t=52458s) So again, the only thing that's


different from C is we don't bother specifying return types for any of these functions, and we don't bother
specifying the type of our arguments or our variables. So same ideas, simpler in some sense. We're just
throwing away keystrokes. All right, let me run this one final time, Python of Meow.py,

- [4:34:36](https://www.youtube.com/watch?v=undefined&t=52476s) and we still have the same


program. All right, let me pause here. Any questions? And I know this is going fast. But hopefully, the C
code is still somewhat familiar. Yeah. AUDIENCE: Is there any difference between global and local
variables. DAVID J. MALAN: Good question. Is there any difference between global and local variables?
Short answer, yes, and we would run into that same problem, if we declare a variable in one function,
another function is not going to have access to it.

- [4:35:04](https://www.youtube.com/watch?v=undefined&t=52504s) We can solve that by putting


variables globally. But we don't have all of the features we had in C, like there's no such thing as a
constant in Python. The mentality in the Python community is, if you don't want some value to change,
don't touch it. Like just don't screw up. So there's trade-offs here, too.

- [4:35:19](https://www.youtube.com/watch?v=undefined&t=52519s) Some languages are stronger or


more defensive than that. But that, too, is part of the mindset with this particular language. [SIREN]
DAVID J. MALAN: Yeah. AUDIENCE: There is really only one green line, in the-- DAVID J. MALAN: Oh,
sorry, where's-- say it louder. AUDIENCE: There has only been one green line printed at a time.

- [4:35:35](https://www.youtube.com/watch?v=undefined&t=52535s) DAVID J. MALAN: That is an


amazing segue. Let's come to that in just a moment, because we're going to recreate also that Mario
example, where we had like the question marks for the coins and the vertical bars. So let's come back to
that in a second. And your question? AUDIENCE: If strings are immutable, and every time you like make a
copy.

- [4:35:50](https://www.youtube.com/watch?v=undefined&t=52550s) DAVID J. MALAN: Correct, strings


are immutable. Any time you seem to be modifying it, as with the lower function, you're getting back a
copy. So it's taking a little more memory somewhere. But you don't have to deal with it; Python's doing
that for you. AUDIENCE: So you don't free anything.
- [4:36:05](https://www.youtube.com/watch?v=undefined&t=52565s) DAVID J. MALAN: Say it again? You
don't need what? AUDIENCE: You don't free like taking leave on stuff. DAVID J. MALAN: You don't free
anything. So if you weren't a big fan, over the past couple of weeks, of malloc or free or memory or
addresses, or all of those low level implementation details, Python is the language for you, because all of
that is handled for you automatically.

- [4:36:26](https://www.youtube.com/watch?v=undefined&t=52586s) Java does the same. JavaScript


does the same. Yeah. AUDIENCE: Each up for the variable, you put it before the name, use of the body
before the name, correct? Well, if there isn't a main function in Python, how do you define those words?
DAVID J. MALAN: How do you define a global variable if there's no main function in Python? Global
variables, by definition, always need to be outside of main, as well.

- [4:36:48](https://www.youtube.com/watch?v=undefined&t=52608s) So that's not a problem. If I


wanted to have a function that's outside of, and, therefore, global to all of these, like global-- actually,
don't use the word global, that's a special word in Python-- variable equals Foo, F-O-O, just as an
arbitrary string value that a computer scientist would typically use, that is now global.
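
A sketch of what a module-level variable looks like in practice. The function names are hypothetical, and the behavior of assignment shown here is probably one of the caveats alluded to, although the lecture does not spell them out at this point.

```python
variable = "foo"  # defined outside every function, so it is global


def show():
    return variable  # reading a global needs no special syntax


def shadow():
    variable = "bar"  # plain assignment creates a new local variable
    return variable   # the global is left untouched


def rebind():
    global variable   # opt in to changing the module-level name
    variable = "baz"
```

After `shadow()` runs, `show()` still returns `"foo"`; only after `rebind()` does it return `"baz"`.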

- [4:37:08](https://www.youtube.com/watch?v=undefined&t=52628s) There are some caveats, though,


as to how you access that. But let's come back to that another time. But that problem is solvable, too. All
right. So let's go ahead and do this. To come back to the question about the print command, let me go
ahead and create a file now called Mario.py. Won't bother showing the C code anymore.

- [4:37:24](https://www.youtube.com/watch?v=undefined&t=52644s) We'll focus just on the new


language here. But recall that, in Python, in Mario, we wanted to first do something like this. This was a
random screen from the side scroller version 1 of Super Mario Brothers. And we just want to print like
three hashes to represent those three blocks. Well, in Python, we could do something like this, print, oh,
sorry, for i in the range of 3, go ahead and print out quote unquote "hash."

- [4:37:48](https://www.youtube.com/watch?v=undefined&t=52668s) And I think this is pretty


straightforward. Python of Mario.py, we get our three hashes. You could imagine parameterizing this
now, though, and getting actual user input. So let's do that. Let me go up here and let me go and say
from CS50, import getInt, and then let's get the input from the user. So it actually is a value n, like, all
right, getInt the height of the column of bricks that you want to do.

- [4:38:15](https://www.youtube.com/watch?v=undefined&t=52695s) And then, let's go ahead and print


out n hashes instead of three. So let me run this. Let's print out like five hashes. OK, one, two, three,
four, five, that seems to work, too. And it's going to work for any positive value. But it's not going to work
for, how about negative 1? That just doesn't do anything.
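
Why negative input prints nothing can be seen from `range` itself: for any `n` less than or equal to zero, `range(n)` is empty, so the loop body never runs. A minimal sketch, with `print_column` as a hypothetical helper name:

```python
def print_column(n):
    """Print a column of n hashes, one per line."""
    for i in range(n):
        print("#")


empty = list(range(-1))  # an empty list: there is nothing to iterate over

print_column(5)   # five hashes
print_column(-1)  # no output at all, and no error either
```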

- [4:38:31](https://www.youtube.com/watch?v=undefined&t=52711s) But that seems OK. But also recall


that it's not going to work if the user types in something weird, like, oh, sorry, it is going to work if the
user types in something weird like cat, why? We're using CS50's getInt function, which is handling all of
those headaches for us. But, what if the user indeed types a negative number? We're tolerating that.

- [4:38:53](https://www.youtube.com/watch?v=undefined&t=52733s) So that was the bug I wanted to


highlight. It would be nice to re-prompt them and re-prompt them. And in C, what was the programming
construct we used when we wanted to ask the user a question. And then, if they didn't cooperate,
prompt them again, prompt them again. What was that? Yeah. AUDIENCE: Do while loop.
- [4:39:07](https://www.youtube.com/watch?v=undefined&t=52747s) DAVID J. MALAN: Yeah, do while
loop, right? That was useful, because it's almost the same as a while loop. But instead of checking a
condition, and then doing something, you do something and then check a condition, which makes sense
with user input, because what are you even going to check if the user hasn't done anything yet? You
need that inverted logic.

- [4:39:23](https://www.youtube.com/watch?v=undefined&t=52763s) Unfortunately in Python, there is


no do while loop. There is a for loop. There is a while loop. And frankly, those are enough to recreate this
idea. And the way to do this in Python, the Pythonic way, which is another term of art in the community,
is to say this. Deliberately induce an infinite loop, while True, with capital T for true.

- [4:39:43](https://www.youtube.com/watch?v=undefined&t=52783s) And then do what you got to do,


like get an Int from a user, asking them for the height of this thing. And then, if that is what you want,
like a number greater than zero, go ahead and break out of the loop. So this is how, in Python, you could
recreate the idea of a do while loop. You deliberately induce an infinite loop.

- [4:40:04](https://www.youtube.com/watch?v=undefined&t=52804s) So something's going to happen


at least once. Then, if you get the answer you want, you break out of it, effectively achieving the same
logic. So this is the Pythonic way of doing a do while loop. Let me go ahead and run Python of Mario.py,
type in 3 this time. And now I get back just the 3 hashes as well.
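
The do-while idiom can be sketched as below. Two assumptions are made for the sake of a self-contained example: the loop is wrapped in a function named `get_height`, and `int(input(...))` stands in for CS50's `get_int`, which means non-numeric input such as "cat" would raise a `ValueError` in this version.

```python
def get_height():
    while True:  # deliberately infinite, so the body runs at least once
        n = int(input("Height: "))
        if n > 0:
            break  # got a usable answer; leave the loop
    return n


def print_column(n):
    for i in range(n):
        print("#")
```

Calling `print_column(get_height())` keeps prompting until a positive number is entered, then prints that many hashes.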

- [4:40:21](https://www.youtube.com/watch?v=undefined&t=52821s) What if, though, I wanted to get


rid of, how about ultimately that CS50 library function, and also encapsulate this in a function. Well, let's
go ahead and tweak this a little bit. Let me go ahead and remove this temporarily. Give myself a main
function, so I don't make the same mistake as I did initially earlier.

- [4:40:40](https://www.youtube.com/watch?v=undefined&t=52840s) And let me give myself a function


called get height that takes no arguments. And inside of that function is going to be that same code. But I
don't want to break in this case, I want to return n. So, recall, that if you return from a function, you're
done, you're going to exit from right at that point.

- [4:40:56](https://www.youtube.com/watch?v=undefined&t=52856s) So this would be fine. You can just


say return n inside of the loop, or, if you would prefer to break out, you could do something like this
instead. Break, and then down here, you could return, down here, you could return n as well. And let me
make one point here before we go back up to main. This is a little different from C. And this one's subtle.

- [4:41:18](https://www.youtube.com/watch?v=undefined&t=52878s) What have I done here that in C


would have been a bug, but is apparently not, I claim, in Python. It's super subtle, this one. Yeah.
AUDIENCE: So aren't we like defining mostly object, like we're using it first, defining an object?
[INAUDIBLE] DAVID J. MALAN: So similar, it's not quite that we're using it first.

- [4:41:44](https://www.youtube.com/watch?v=undefined&t=52904s) So it's OK not to declare a variable


with like the data type. We've addressed that before, but on line 9, we're assigning n a value, it seems.
And then we return n on line 12. But notice the indentation. In the world of C, if we had declared a
variable inside of a loop, on line 9, it would have been scoped to that loop, which means as soon as you
get out of that loop, like further down in the program, n would not exist.
- [4:42:10](https://www.youtube.com/watch?v=undefined&t=52930s) It would be local to the curly
braces therein. Here, logically, curly braces are gone, but the indentation makes clear that n is still inside
of this loop, between lines 8 through 11. But n is actually still in scope in Python. The moment you create
a variable in Python, for better or for worse, it is available everywhere within that function, even outside
of the loop in which you defined it.

- [4:42:32](https://www.youtube.com/watch?v=undefined&t=52952s) So this logic is actually OK in


Python. In C, recall, to solve this same problem, we would have had to do something a little hackish like
this, like define n up here on line 8, so that it exists, now, on line 10, and so that it exists on line 13. That
is no longer an issue or need, in Python. Once you create a variable, even if it's nested, nested, nested
inside of some loops or conditionals, it still exists within the function itself.
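
A non-interactive sketch of the same scoping rule. The function name and the arithmetic are illustrative only; what matters is that `square` is first assigned inside the loop and is still usable after it.

```python
def first_square_above(limit):
    """Return the smallest perfect square that is greater than limit."""
    i = 0
    while True:
        square = i * i  # first created here, inside the loop
        if square > limit:
            break
        i += 1
    return square  # still in scope: Python scopes variables to the function


result = first_square_above(10)
```

In C, the equivalent variable would have to be declared before the loop in order to be returned after it.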

- [4:43:00](https://www.youtube.com/watch?v=undefined&t=52980s) All right, any questions then on


this, before we now run this and then get rid of the CS50 library again? OK, so let me go ahead and get
the height from the user. Let's go ahead and create a variable in main called height. Let's call this get
height function. And then let's use that height value, instead of something hardcoded there.

- [4:43:20](https://www.youtube.com/watch?v=undefined&t=53000s) And let me see if this all works


now. Python of Mario.py. Hopefully, I haven't messed up, but I did. But this is an easy fix now. Yeah.
AUDIENCE: Got to call main. DAVID J. MALAN: I got to call main. So again, I deleted that earlier. But let
me bring it back. So I'm actually calling main. Let me rerun Python of Mario.py, there we go, height 3.

- [4:43:39](https://www.youtube.com/watch?v=undefined&t=53019s) Now it seems to be working. So


let's do one last thing with Mario, just to tie together that idea now of exceptions from before. Again,
exceptions are a feature of Python, whereby you can try to do something. And if there's a problem, you
can handle it in any way you see fit. Previously, I handled it by just yelling at the user that that's not an
Int.

- [4:43:57](https://www.youtube.com/watch?v=undefined&t=53037s) But let's actually use this to


re-implement CS50's own getInt function. Let me throw away CS50's getInt function. And now let me go
ahead and replace getInt with input. But it's not sufficient to just use input. What do I have to add to this
line of code on line 8? If I want to get back an Int? AUDIENCE: The Int function.

- [4:44:18](https://www.youtube.com/watch?v=undefined&t=53058s) DAVID J. MALAN: Yeah, I have to


cast it to an Int by calling the Int function around that value, or I could do it on a separate line, just to be
clear. I could also do n equals Int of n. That would work too, but it's sort of an unnecessary extra line.
This is not sufficient, because that does not change the value.

- [4:44:35](https://www.youtube.com/watch?v=undefined&t=53075s) It creates the value. But then it


throws it away. We need to assign it. So the conventional way to do this would probably be in one line,
just to keep things nice and tight. So that works fine now. If I run Python of Mario.py, I can still type in 3,
and all is well. I can still type in negative 1, because that is an Int that I am handling.

- [4:44:52](https://www.youtube.com/watch?v=undefined&t=53092s) What I'm not yet handling is


weird input like cat or some string that is not a base 10 number. So here, again, is my traceback. And
notice that here, let me scroll up a little bit, here we can actually see more detail in the traceback. Notice
that, just like in C, or just like in the debugger in VS Code, you can see a few things.
- [4:45:15](https://www.youtube.com/watch?v=undefined&t=53115s) You can see mention of module,
that just means your file, main, which is my main function, and get height. So notice, it's kind of
backwards. It's top to bottom instead of bottom up, as we drew it on the board the other day, and as we
envisioned stacks of trays in the cafeteria. But this is your stack, of functions that have been called, from
top to bottom.

- [4:45:31](https://www.youtube.com/watch?v=undefined&t=53131s) Get height is the most recent,


main is the very first, value error is the problem. So let's try to do, let's try to do this literally, except if
there's an error. So what do I want to do? I'm going to go in here, and I'm going to say, try to do the
following. Whoops, try to do the following, except if there's a value error, value error, then go ahead and
say something, well, like before, print, that's not an integer exclamation point.

- [4:46:00](https://www.youtube.com/watch?v=undefined&t=53160s) But the difference this time is


because I'm in a loop, the user is going to have a chance to recover from this issue. So if I run Mario.py, 3
still works as before. If I run Mario.py and type in cat, I detect it now, and because I'm still in that loop,
and because the program hasn't crashed, because I've caught, so to speak, the value error, using this line
of code here, that's the way in Python to detect these kinds of errors, that would otherwise end up being
on the user's own screen.
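
Putting the pieces together, a sketch of `get_height` without the CS50 library might look like this. The exact placement of the `if` relative to the `try` block is an assumption, since the transcript does not show the code on screen.

```python
def get_height():
    while True:
        try:
            n = int(input("Height: "))  # raises ValueError for input like "cat"
            if n > 0:
                return n  # returning also ends the loop
        except ValueError:
            # The error is caught, so the program keeps running and re-prompts.
            print("That's not an integer!")
```

Input such as "cat" triggers the message and another prompt; a non-positive integer silently re-prompts; a positive integer is returned.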

- [4:46:26](https://www.youtube.com/watch?v=undefined&t=53186s) If I type in cat, dog, that doesn't


work. If I type in, though, 2, I get my two hashes, because that's, indeed, an Int. Any questions on
this? We're not going to spend too much time on exceptions, but just wanted to show you what's
involved with getting rid of those training wheels.

- [4:46:40](https://www.youtube.com/watch?v=undefined&t=53200s) Yeah. AUDIENCE: Then the hash


marks in line. DAVID J. MALAN: OK, so let's do this. That actually comes to the earlier question about
printing the hashes on the same line, or maybe something like this, where we have the little bricks in the
sky, or little question marks. Let's recreate this idea, because the problem with print, as was noted
earlier, is you're automatically printing out new lines.

- [4:46:57](https://www.youtube.com/watch?v=undefined&t=53217s) But what if we don't want that.


Well, let's change this program entirely. Let me throw away all the functions. Let's just go to a simpler
world, where we're just doing this. So let me start fresh in Mario.py. I'm not going to bother with
exceptions or functions. Let's just do a very simple program, to create this idea, for i in range of 4 this
time, because there are four of these things in the sky.

- [4:47:19](https://www.youtube.com/watch?v=undefined&t=53239s) Let's go ahead and just print out a


question mark to represent each of those bricks. Odds are you know this is not going to end well, because
these are unfortunately, as you've predicted, on separate lines. So it turns out that the print function
actually takes in multiple arguments, not just the thing you want to print, but also some additional
arguments, that allow you to specify what the default line ending should be.

- [4:47:43](https://www.youtube.com/watch?v=undefined&t=53263s) But what's interesting about this


is that, if you want to change the line ending to be something like, quote unquote, "that is nothing,"
instead of backslash n, this is not sufficient, because in Python, you can have two types of arguments, or
parameters. Some arguments are positional, which is the fancy way of saying it's a comma separated list
of arguments.
- [4:48:03](https://www.youtube.com/watch?v=undefined&t=53283s) And that's what we did all the
time in C. Something comma, something comma, something, we did it in printf all the time, and in other
functions that took multiple arguments. In Python, you have, not only positional arguments, where you
just separate them by commas, to give one or two or three or more arguments.

- [4:48:19](https://www.youtube.com/watch?v=undefined&t=53299s) There are also named arguments,


which looks weird but is helpful for reasons like this. If you read the documentation, you will see that
there is a named argument that Python accepts, called end. And if you set that equal to something, that
will be used as the end of every line, instead of the default, which the documentation will also say is
quote unquote backslash n.

- [4:48:41](https://www.youtube.com/watch?v=undefined&t=53321s) So this line here has no effect on


my logic at the moment. But if I change it to just quote unquote, essentially overriding the default new
line character, and now run Mario again, now I get all four on the same line. There's a bit of a bug,
though. My prompt is not meant to be on the same line. So I can fix that by just printing nothing.

- [4:49:02](https://www.youtube.com/watch?v=undefined&t=53342s) But, really, it's not nothing,


because you get the new line for free. So let me run Python of Mario.py again, and now we have what I
intended in the first place, which was a little something that looked like this. And this is just one example
of an argument that has a name. But this is a common paradigm in Python, too, to not just separate things
by commas, but to be very specific, because the print function might take 5, 10, even 20 different
arguments.

- [4:49:27](https://www.youtube.com/watch?v=undefined&t=53367s) And my God, if you had to


enumerate like 10 or 20 commas, you're going to screw up. You're going to get things in the wrong order.
Named arguments allow you to be resilient against that. So you only specify arguments by name, and it
doesn't matter what order they are in. All right, any questions, then, on this, and the overriding of new
line.
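
As a rough sketch of the named `end` argument discussed above: the helper name `print_row` is made up for illustration, but `end` is a real keyword parameter of Python's built-in `print`, defaulting to `"\n"`.

```python
def print_row(width):
    """Print `width` question marks on one line."""
    for i in range(width):
        # end="" overrides the default newline after each call
        print("?", end="")
    # A bare print() emits just the newline, moving the prompt down
    print()


print_row(4)
```

Because `end` is passed by name, it can sit after any number of positional arguments without disturbing their order.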

- [4:49:47](https://www.youtube.com/watch?v=undefined&t=53387s) And to be clear, you can do


something like, very weird, but logically expected, like this, by just changing the line ending, too. But the
right way to solve the Mario problem would be just to override it to be nothing like this. All right, how
about this for cool. And this is why a lot of people like Python.

- [4:50:06](https://www.youtube.com/watch?v=undefined&t=53406s) Suppose you don't really like


loops. You don't really like three-line programs, because that was kind of three times longer than it
needs to be. What if you just printed out a question mark four times? Python, whoops, Python of
Mario.py, that also works. So it turns out that, just like the plus operator in Python can join things
together, the multiply operator is not arithmetic in this case.

- [4:50:28](https://www.youtube.com/watch?v=undefined&t=53428s) It actually means, take this and


concatenate it four times over. So that's a way of just distilling into one line what would have otherwise
taken multiple lines in C, fewer, but still multiple lines in Python, but is really now rather succinct in
Python, by doing that instead. Let's do one last Mario example, which looked a little something like this.

- [4:50:48](https://www.youtube.com/watch?v=undefined&t=53448s) If this is another part of the Mario


interface, this is like a grid of like 3 by 3 bricks, for instance. So two dimensions now, not just vertical,
not just horizontal, but now both. Let's print out something like that, using hashes. Well, how about, how do
I do this. So how about for i in range of 3. Then I could do for j in range of 3, just because j comes after i
and that's reasonable for counting.

- [4:51:12](https://www.youtube.com/watch?v=undefined&t=53472s) I could now print out a hash


symbol, well, let's see what this does. Python of Mario.py, OK, that's just one crazy long column. What
do I need to fix and where here, to make this look like this? So 3 by 3 bricks, instead of one long column.
Any instincts? AUDIENCE: Why don't we create a line and then we'll skip it.

- [4:51:37](https://www.youtube.com/watch?v=undefined&t=53497s) DAVID J. MALAN: OK, so after


printing 3, we want to skip a line. So maybe like print out a blank line here. OK, let's try that. I like that
instinct, right, print 3, new line, print 3, new line. Let's go ahead and run Python of Mario.py. OK, it's
more visible, what I'm doing, but still wrong.

- [4:51:53](https://www.youtube.com/watch?v=undefined&t=53513s) What can I, what's the remaining


fix, though? Yeah. AUDIENCE: So right behind the two. DAVID J. MALAN: Yeah, I'm getting an extra new
line here, which I don't want while I'm on this row. So let me do end equals quote unquote, and now,
together, your solutions might take us the whole way there.

- [4:52:10](https://www.youtube.com/watch?v=undefined&t=53530s) Python of Mario.py, voila, now


we've got it, in two dimensions. And even this, we can tighten up. Like, we could just use the little trick
we learned. So we could just say, print a hash times 3 times, and we can get rid of one of those loops
altogether. All it's doing is, whoops, all it's doing is automating that process.

- [4:52:27](https://www.youtube.com/watch?v=undefined&t=53547s) But, no, I don't want to do that.


What do I, how do I fix this here. I don't think I want this anymore, right? Because that's giving me an
extra new line. So now this program is really tightened up. Same thing, two lines of code. But we're now
implementing this same two dimensional structure here.
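
Here is one way the two versions of the grid could be written side by side. The function names are hypothetical; the lecture simply edits the same file in place.

```python
def grid_nested(size):
    """Nested loops: the inner loop prints one row, one hash at a time."""
    for i in range(size):
        for j in range(size):
            print("#", end="")  # stay on the same row
        print()                 # end the row after the inner loop finishes


def grid_short(size):
    """Same output, with the inner loop replaced by string repetition."""
    for i in range(size):
        print("#" * size)       # print's default newline ends the row


grid_nested(3)
grid_short(3)
```

Both functions print the same 3 by 3 block; the second relies on print's default line ending, which is why the extra blank `print()` is no longer needed.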

- [4:52:44](https://www.youtube.com/watch?v=undefined&t=53564s) All right, any questions here on


these? Yeah. AUDIENCE: Is there any practical reason why, when we write end, I mean, in the print
function, you don't put any spaces in it? DAVID J. MALAN: If I print end, any spaces? Say that once more.
AUDIENCE: Whenever we write end, for example, in the print function, you know, in order to stop it from
going to a new line, it seems like there weren't any spaces. We did, like, end equals and then two quotes.

- [4:53:14](https://www.youtube.com/watch?v=undefined&t=53594s) There were no spaces. Did you do


that on purpose? DAVID J. MALAN: Oh. yes, good question. I see what you're saying. So in a previous
version, let me rewind in time, when we had this, I did not put spaces. The convention in Python is not to
do that. Why? It just starts to add too much space. And this is a little inconsistent, because, earlier, when
we talked about like pluses or spaces around the less than or equal signs, I did say add it.

- [4:53:37](https://www.youtube.com/watch?v=undefined&t=53617s) Here it's actually clearer and


recommended to keep them tighter together. Otherwise it just becomes harder to read where the gaps
are. Good observation. All right, let's do, how about, another five minute break. Let's do that. And then
we're going to dive into some more sophisticated problems, and then ultimately build with some audio
and visual examples, as well.

- [4:53:58](https://www.youtube.com/watch?v=undefined&t=53638s) See you in five. All right, so almost


all of the examples we just did were recreations of what we did in week 1. And recall that week 1 was
like our most syntax-heavy week. It was when we were first learning how to program in C. But after week
1, we began to focus a bit more on ideas, like arrays, and other higher-level constructs.

- [4:54:18](https://www.youtube.com/watch?v=undefined&t=53658s) And we'll do that again here,


condensing some of those first early weeks into a fewer set of examples in Python. And we'll culminate
by actually taking Python out for a spin, and doing things that would be way harder to do, and way more
time-consuming to do in C, even more so than the speller example.

- [4:54:33](https://www.youtube.com/watch?v=undefined&t=53673s) But how do you go about figuring


out what functions exist, if you didn't hear it in class, you don't see it online, but you want to see it
officially, you can go to the Python documentation, docs.python.org here. And I will disclaim that,
honestly, the Python documentation is not terribly user-friendly.

- [4:54:49](https://www.youtube.com/watch?v=undefined&t=53689s) Google will often be your friend,


so googling something you're interested in, to find your way to the appropriate page on Python.org, or
StackOverflow.com is another popular website. As always, though, the line should be googling things
like, how do I convert a string to lowercase. Like that's reasonable to Google.

- [4:55:06](https://www.youtube.com/watch?v=undefined&t=53706s) Or how to convert to uppercase


or how to implement a function in Python. But googling, of course, things like how to implement problem set
6 in CS50, of course, crosses the line. But moving forward, and really with programming in general, like
Google and Stack Overflow are your friends, but the line is between the reasonable and the
unreasonable.

- [4:55:23](https://www.youtube.com/watch?v=undefined&t=53723s) So let me officially use the Python


documentation search, just to search for something like the lowercase function. Like, I know I can
lowercase things in Python. I don't quite remember how. So let me just search for the word lower. You're
going to get, often, an overwhelming number of results, because Python is a pretty big language, with
lots of functionality.

- [4:55:40](https://www.youtube.com/watch?v=undefined&t=53740s) And you're going to want to look


for familiar patterns. For whatever reason, str.lower, which is probably more popular or more
commonly used than these other ones, is third on the list. But it's purple, because I clicked it a moment
ago, when looking for it. So

- [4:55:55](https://www.youtube.com/watch?v=undefined&t=53755s) str.lower is probably what I want,


because I am interested at the moment in lower casing strings. When I click on that, this is an example of
what Python's documentation tends to look like. It's in this general format. Here's my str.lower function.
This returns a copy of the string, with all of the cased characters converted to lowercase, and the lower-
casing algorithm, dot dot dot.

- [4:56:12](https://www.youtube.com/watch?v=undefined&t=53772s) So that doesn't give me much. It


doesn't give me sample code. But it does say what the function does. And if we keep looking, you'll see
mention of lstrip, which is left strip. I used its analog, rstrip, before, right strip, which allows you to
remove, that is strip, from the end of a string, something like white space, like a new line, or even
something else.
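
Since the documentation page shows no sample code, a small sketch of these string methods may help. The sample strings are invented for illustration; `lower`, `rstrip`, and `lstrip` are real `str` methods.

```python
line = "Hello, World\n"

# rstrip() with no argument removes trailing whitespace, including newlines
stripped = line.rstrip()

# lower() returns a lowercased copy; the original string is left untouched
lowered = stripped.lower()

# lstrip() does the same as rstrip(), but from the left side
indented = "   hi".lstrip()

print(lowered)
print(indented)
```

Each call returns a new string, so `line` still ends with its newline afterward.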
- [4:56:29](https://www.youtube.com/watch?v=undefined&t=53789s) And if you scroll through string,
this web page here. And we're halfway down the page already. If you see my scroll bar, tiny on the right,
there's a huge amount of functionality built into string objects, here. And this is just testament to just
how rich the language itself is. But it's also reason for reassurance that the goal, when playing around with
some new language and learning it, is not to learn it exhaustively.

- [4:56:53](https://www.youtube.com/watch?v=undefined&t=53813s) Just like in English or any human


language, there's always going to be vocab words you don't know, ways of presenting the same
information in some language. That's going to be the case with Python. And what we'll do today and this
week in problem set 6 is really get your footing with this language.

- [4:57:07](https://www.youtube.com/watch?v=undefined&t=53827s) But you won't know all of Python,


just like you won't know all of C. And, honestly, you won't know all of any of these languages on your
own, unless you're, perhaps, using them full time professionally, and even then, there's more libraries
than one might even retain themselves. So let's actually now pivot to a few other ideas, that we'll
implement in Python, in a moment.

- [4:57:24](https://www.youtube.com/watch?v=undefined&t=53844s) Let me switch back over to VS


Code here. And let me whip up, say, a recreation of our scores example from week two, where we
averaged like three scores together. And that was an opportunity in week 2 to play with arrays, to realize
how constrained arrays are. They can't grow or shrink. You have to decide in advance.

- [4:57:42](https://www.youtube.com/watch?v=undefined&t=53862s) But let's see what's different here


in Python. So let me do Scores.py, and let me give myself an array in Python called scores, sorry, let me
give myself a variable in Python called scores. Set it equal to a list of three scores, which are the same
ones we've used before, 72, 73, 33, in this context meant to be scores, not ASCII values.

- [4:58:01](https://www.youtube.com/watch?v=undefined&t=53881s) And then let's just do the average


of these. So average will be another variable. And it turns out I can do, well, how did I sum these before?
I probably had a for loop to add one, then I knew how long they were. Turns out in Python, you can just
say sum of scores divided by the length of scores. That's going to give me my average.

- [4:58:20](https://www.youtube.com/watch?v=undefined&t=53900s) So sum is a function that takes a


list, in this case, as input, and it just does the sum for you, with a for loop or whatever underneath the
hood. Len gives you the length of the list, how many things are in it. So I can dynamically figure that out.
Now let me go ahead and print out, using print, the word average, and then, in curly braces, the actual
average, close quote.
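
Put together, the hard-coded version of the program might look like this sketch:

```python
scores = [72, 73, 33]

# sum() adds up the list; len() reports how many items it holds
average = sum(scores) / len(scores)

# The f-string substitutes the value of average inside the curly braces
print(f"Average: {average}")
```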

- [4:58:40](https://www.youtube.com/watch?v=undefined&t=53920s) All right, so let's run this code,


Python of Scores.py. And there is my average, in this case, 59.33333 and so forth, based on the math.
Well, let's actually, now, change this a little bit and make it a little more interesting, and actually get input
from the user rather than hard coding this. Let me go back up here and use from cs50 import get_int,
because I don't want to deal with all the exceptions and the loops.

- [4:59:02](https://www.youtube.com/watch?v=undefined&t=53942s) Like, I just want to use someone


else's function here. Let me give myself an empty list called scores. And this is not something we were
able to do in C, right? Because in C, if you tried to make an empty array, well, that's pretty stupid,
because you can't add things to it. It's a fixed size.
- [4:59:17](https://www.youtube.com/watch?v=undefined&t=53957s) So it wouldn't even let you do
that. But I can just create an empty list in Python, because lists, unlike arrays, are really lengthless. They'll
grow and shrink. But you and I are not dealing with all the pointers underneath the hood. Python's doing
that for us. So now, let's go ahead and get a whole bunch of scores from the user.

- [4:59:35](https://www.youtube.com/watch?v=undefined&t=53975s) How about three of them in total.


So for i in range of 3, let's go ahead and grab a score from the user, using get_int, asking them for score.
And then let's go ahead and append, to the scores list, that particular score. So it turns out that a list,
and I could read the Python documentation to confirm as much, lists have a function built into them, and
functions built into objects are generally known as methods, if you've heard that term before.

- [5:00:03](https://www.youtube.com/watch?v=undefined&t=54003s) Same idea, but whereas a


function kind of stands on its own, a method is a function built into an object, like a list here. That's
going to achieve the same result. Strictly speaking, I don't need the variable. Just like in C, I could tighten
this up and do something like this as well. But, I don't know, I kind of like it this way.

- [5:00:19](https://www.youtube.com/watch?v=undefined&t=54019s) It's more clear, to me, at least,


that what I'm doing here, getting the score and then appending it to the list. Now the rest of the code
can stay the same. Python of Scores.py, score will be 72, 73, 33. And I get back the math. But now the
program's a little more dynamic, which is nice. But there's other syntax I could use here.

- [5:00:37](https://www.youtube.com/watch?v=undefined&t=54037s) Just so you've seen it, Python


does have some neat syntactic tricks, whereby, if you don't want to do scores.append, you can actually
say scores plus equals this score. So you can actually concatenate lists together in Python, too. Just as we
used plus to join two strings together, you can use plus to join two lists together.

- [5:00:58](https://www.youtube.com/watch?v=undefined&t=54058s) The catch is, you need to put the


one score I'm adding here in a list of its own, which is kind of silly. But it's necessary, so that this thing
and this thing are both lists. To do this more verbosely, which most programmers wouldn't do, but just
for clarity, this is the same thing as saying scores equals scores plus this score.

- [5:01:15](https://www.youtube.com/watch?v=undefined&t=54075s) So now maybe it's a little more


clear that scores and brackets score plural, sorry, singular, are both lists themselves, being concatenated
or joined together. So two different ways, not sure one is better than the other. This way is pretty
common, but .append is also quite reasonable as well. All right, how about another example from week
two.
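
The three spellings can be compared directly in a short sketch; the list names `a`, `b`, and `c` are just placeholders.

```python
score = 33

a = [72, 73]
a += [score]        # concatenate a one-element list onto a

b = [72, 73]
b = b + [score]     # the more verbose spelling of the same idea

c = [72, 73]
c.append(score)     # the method-based approach

print(a == b == c)
```

Note that `a += score` without the brackets would raise a TypeError, because an int is not a list (or any other iterable).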

- [5:01:37](https://www.youtube.com/watch?v=undefined&t=54097s) This one was called uppercase. So


let me do this in Uppercase.py, though, this time. And let me import from cs50, get_string again. And let
me go ahead and say, before will be my first variable. Let me get a string from the user, asking them for a
before string. And then let me go ahead and say, after, just to demonstrate some changes, upper-casing
to this string.

- [5:02:02](https://www.youtube.com/watch?v=undefined&t=54122s) Let me change my line ending to


be that, using our new trick. And this is where things get cool in Python, relatively speaking. If I want to
iterate over all of the characters in a string, and print them out in uppercase, one way to do that would
be this. For c in the before string, go ahead and print out c.uppercase, sorry,
- [5:02:23](https://www.youtube.com/watch?v=undefined&t=54143s) c.upper, but don't end the line yet,
because I want to keep these all on the same line until I'm all done. So what am I doing? Python of
Uppercase.py, let me type in Hello in all lowercase. I've just upper-cased the whole string. How? I first
get string, calling it before. I then just print out some fluffy text that says after colon, and I get rid of the
line ending, just so I can kind of line these up.

- [5:02:41](https://www.youtube.com/watch?v=undefined&t=54161s) Notice I hit the spacebar a couple


of times just so letters line up to be pretty. For c in before, this is new. This is powerful in C, sorry, in
Python, whereby you don't have to do like int i equals 0 and i less than this, you could just say, for c in
the string in question, for c in before. And then here is just upper-casing that specific character, and
making sure we don't output a new line too soon.
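
A sketch of the character-by-character version, assuming the input has already been read into `before`; the wrapper function `shout` is a hypothetical name used only to make the snippet self-contained.

```python
def shout(before):
    """Print the uppercased string one character at a time."""
    print("After:  ", end="")          # label, with no newline yet
    for c in before:                   # iterate directly over the characters
        print(c.upper(), end="")       # uppercase each one, stay on the line
    print()                            # finally end the line


shout("hello")
```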

- [5:03:04](https://www.youtube.com/watch?v=undefined&t=54184s) But this is actually more work


than I need to do. Based on what we've seen thus far, like from our agreement example, can I tighten
this up further? Can I collapse lines 5 and 6, maybe even 7, all together? If the goal of this program is just
to uppercase the before string, how might I do this? Yeah, in back.

- [5:03:27](https://www.youtube.com/watch?v=undefined&t=54207s) AUDIENCE: Would it be str.upper?


DAVID J. MALAN: str.upper, yeah, so I could do something like this, after gets before.upper. So it's not str
literally dot upper; str just represents the string in question. So it would be before.upper, but right idea
otherwise. And so let me go ahead and just tweak my print statement a little bit.

- [5:03:45](https://www.youtube.com/watch?v=undefined&t=54225s) Let me just go ahead and print


out the after variable here, after creating it. So this line is the same, I'm getting a string called before. I'm
creating another variable called after, and, as you propose, I'm calling upper on the whole string, not one
character at a time. Why? Because it's allowed.

- [5:04:00](https://www.youtube.com/watch?v=undefined&t=54240s) And, again, in Python, there


aren't technically characters individually. There's only strings, anyway. So I might as well do them all at
once. So if I rerun the code now, Python of Uppercase.py. Now I'll type in Hello in all lowercase, and, oh,
so close, I think I can get rid of this override, because I'm printing the whole thing out at once, not
character by character.
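
The tightened version might look like this sketch, with a hard-coded string standing in for the user's input:

```python
before = "hello"            # stand-in for get_string("Before: ")

# upper() works on the whole string at once and returns a new string
after = before.upper()

# No end="" override is needed now that everything prints in one call
print(f"After:  {after}")
```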

- [5:04:22](https://www.youtube.com/watch?v=undefined&t=54262s) So now if I type in Hello before,


now I have an even tighter version of the program here. All right, any questions, then, on lists or on
strings, and what this kind of function, upper, represents, with its docs. No? All right, so a couple other
building blocks before we start. Oh. Where was that? AUDIENCE: To the right.

- [5:04:45](https://www.youtube.com/watch?v=undefined&t=54285s) DAVID J. MALAN: To the right,


right. Yes, thank you. AUDIENCE: Could you write, very close to variable string, and then print upper, you
start creating a variable upper. DAVID J. MALAN: Yes, do I have to create this variable, upper? No, I don't.
I could actually tighten this up, and, if you really want to see something neat, inside of the curly braces,
you don't have to just put the names of variables.

- [5:05:08](https://www.youtube.com/watch?v=undefined&t=54308s) You can put a small amount of


logic, so long as it doesn't start to look stupid and kind of overwhelmingly complex, such that it's sort of
bad design at that point. I can tighten this up like this. And now we're in Python of Uppercase.py, writing
Hello again. And that, too, works. But I would be careful about this.
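
For illustration, an expression inside the curly braces could be written as follows; `message` is a name introduced here only so the result can be inspected.

```python
before = "hello"            # stand-in for the user's input

# The braces may hold a small expression, not just a variable name
message = f"After:  {before.upper()}"
print(message)
```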
- [5:05:24](https://www.youtube.com/watch?v=undefined&t=54324s) You want to resist the temptation
of having like a long line of code that's inside the curly braces, because it's just going to be harder to
read. But, absolutely, you could indeed do that, too. All right, how about command line arguments,
which was one thing we introduced in week two also, so that we could actually have the ability to take
input from the user, whoops.

- [5:05:43](https://www.youtube.com/watch?v=undefined&t=54343s) So we could actually take input


from the user at the command line, so as to take literally command line arguments. These are a little
different, but it follows the same paradigm. There's no main by default. And there's no def main, int argc,
char, or we called it string, argv by default. There's none of this.

- [5:06:03](https://www.youtube.com/watch?v=undefined&t=54363s) So if you want access to the


argument vector, argv, you import it. And it turns out, there's another module in Python, or library in
Python called sys, and you can import from the system this thing called argv. So same idea, different
place. Now I'm going to go ahead and do this. Let's write a program that just requires that the user types
in a word after the program's name, or none at all.

- [5:06:27](https://www.youtube.com/watch?v=undefined&t=54387s) So if the length of argv equals 2,


let's go ahead and print out, how about, Hello comma argv bracket 1 close quote, else if they don't type
two words total at the prompt, let's just say the default, like we did weeks ago, Hello, world. So the
only thing that's new here is we're importing argv from sys, and we're using this fancy f-string format,
which kind of to your point, too, it's putting more complex logic in the curly braces.

- [5:06:55](https://www.youtube.com/watch?v=undefined&t=54415s) But that's OK. In this case, it's a


list called argv, and we're getting bracket 1 from it. Let's do Python of Argv.py, Enter, Hello, world. What if
I do Argv.py David at the command line. Now I get Hello, David. So there's one curiosity here. Python is
not included in argv, whereas in C, dot slash whatever was the first thing.
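
A rough sketch of the program just described. Wrapping the logic in a function named `greeting` is an assumption made here so it can be tried with any list; the lecture writes the `if`/`else` at the top level of the file.

```python
import sys


def greeting(argv):
    """Return the greeting for a given argument list."""
    if len(argv) == 2:
        # argv[0] is the script's name, so the typed word is argv[1]
        return f"Hello, {argv[1]}"
    return "Hello, world"


print(greeting(sys.argv))
```

Run as `python argv.py David`, the list would be `["argv.py", "David"]`; the word python itself never appears in it.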

- [5:07:18](https://www.youtube.com/watch?v=undefined&t=54438s) The analog in Python is that the


name of your Python program is the first thing, in bracket 0, which is why David is in bracket 1, the word
Python does not appear in the argv list, just to be clear. But otherwise, the idea of these arguments is
exactly the same as before. And in fact, what you can do, which is kind of cool, is, because argv is a list,
you can do things like this.

- [5:07:42](https://www.youtube.com/watch?v=undefined&t=54462s) For arg in argv, go ahead and print


out each argument. So instead of using a for loop and i and all of this, if I do Python of argv Enter, it just
writes the program's name. If I do Python of argv Foo, it puts Argv.py and Foo. If I do, sorry, if I do Foo
and bar, those words all print out. If I do Foo bar baz, those print out too.

- [5:08:05](https://www.youtube.com/watch?v=undefined&t=54485s) And Foo and bar or baz are like a


mathematician's x and y and z for computer scientists, when you just need some placeholder words. So
this is just nice. It reads a little more like English, and a for loop is just much more concise, allows you to
iterate very quickly when you want something like that.

- [5:08:20](https://www.youtube.com/watch?v=undefined&t=54500s) Suppose I only wanted the real


words that the human typed after the program's name. Like, suppose I want to ignore Argv.py. I mean I
could do something hackish like this. If arg equals Argv.py, I could just ignore, you know, let's invert the
logic. I could do this, for instance. So if the arg does not equal the program name, then go ahead and
print out the word.

- [5:08:44](https://www.youtube.com/watch?v=undefined&t=54524s) So I get Foo, bar, and baz only. Or,


this is what's kind of neat about Python, too; let me undo that. And let me just take a slice of the array of
the list instead. So it turns out, if argv is a list, I can actually say, you know what, go into that list, start at
element 1, instead of 0, and then go all the way to the end.

- [5:09:06](https://www.youtube.com/watch?v=undefined&t=54546s) And we have not seen this syntax


in C. But this is a way of slicing a list in Python. So now watch what happens. If I run Python of Argv.py,
Foo bar baz Enter, I get only a subset of the list, starting at position 1, going all of the way to the end.
And you can even do kind of the opposite. If, for whatever reason, you want to ignore the last element,
you can say colon, we could say colon negative 1, and use a negative number, which we've not seen
before, which slices off the end of the list, as well.
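
The slicing syntax can be tried on a hard-coded list standing in for `sys.argv`; the list contents below are assumed for illustration.

```python
# What sys.argv would hold after: python argv.py foo bar baz
argv = ["argv.py", "foo", "bar", "baz"]

# [1:] starts at index 1 and runs to the end, skipping the program name
words = argv[1:]

# [:-1] runs from the start up to, but not including, the last element
all_but_last = argv[:-1]

for arg in words:
    print(arg)
```

Slicing produces a new list, so `argv` itself is unchanged.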

- [5:09:39](https://www.youtube.com/watch?v=undefined&t=54579s) So there's some syntactic tricks


that tend to be powerful in Python, too, even if at first glance, you might not need them for typical things.
All right, let's do one other example with exit, and then we'll start actually applying some algorithms, to
make things interesting. So in one last program here, let's do

- [5:09:57](https://www.youtube.com/watch?v=undefined&t=54597s) Exit.py, just to do one more mechanic,


before we introduce some algorithms. And let's do this. Let's say from sys import argv. Let's now do
this. Let's make sure the user gives me one command line argument. So if the length of argv does not
equal 2 in total, then let's go ahead and print out something like missing command line argument, just to
explain what the problem is.

- [5:10:21](https://www.youtube.com/watch?v=undefined&t=54621s) And then let's do this. We can


exit. But I'm going to use a better version of exit here. Let me import two functions from sys. Turns out
the better way to do this is with sys.exit, because I can then exit specifically, too, with this exit code.
Otherwise, down here, I'm going to go ahead and print out, something like Hello, comma argv bracket 1,
same as before.

- [5:10:43](https://www.youtube.com/watch?v=undefined&t=54643s) And then I'm going to exit with


zero. So, again, this was a subtle thing we introduced in week two, where you can actually have your
programs exit, with some number, where 0 signifies success, and anything else signifies error. This is just
the same idea in Python. So if I, for instance, just run the program like this, oops, I screwed up.

- [5:11:00](https://www.youtube.com/watch?v=undefined&t=54660s) I meant to say exit here and exit


here. Let me do that again. If I run this like this, I'm missing a command line argument. So let me rerun it
with like my name at the prompt. So I have exactly two command line arguments, the file name and my
name, Hello comma David. And if I do David Malan, it's not going to work either, because now the length of
argv does not equal 2.

- [5:11:19](https://www.youtube.com/watch?v=undefined&t=54679s) But the difference here is that


we're exiting with 1, so that special programs can detect an error, or 0 in the event of success. And now
there's one other way to do this, too. Suppose that you're importing a lot of functions, and you don't
really want to make a mess of things and just have all of these function names available, without it being
clear where they came from.
- [5:11:38](https://www.youtube.com/watch?v=undefined&t=54698s) Let's just import all of sys. And
let's just change our syntax, kind of like I proposed for CS50, where we just prepend to all of these library
functions, sys, just to be super-explicit where they came from, and if there's another exit or argv value
that we want to import from a library, this is one way to avoid collision.
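
A minimal sketch of that last version, with the whole sys module imported and every use prefixed. Wrapping the logic in a function named main is an assumption made here so the example can be called with different argument lists; the lecture describes the same statements at the top level of the file.

```python
import sys


def main(argv):
    """Greet the name given as the single command-line argument."""
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # any nonzero status signals an error
    print(f"hello, {argv[1]}")
    sys.exit(0)  # zero signals success


# In a script, the last line would be: main(sys.argv)
```

sys.exit works by raising SystemExit, which is why code after it in the same branch never runs.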

- [5:11:58](https://www.youtube.com/watch?v=undefined&t=54718s) So if I do it one last time here,


missing command line argument. But David still actually worked. All right, only to demonstrate how we
can implement that same idea. Let's now do something more powerful, like a search algorithm, like
binary search. I'm going to go ahead and open up a file called Numbers.

- [5:12:13](https://www.youtube.com/watch?v=undefined&t=54733s) py, and let's just do some


searching or linear search, rather, on a list of numbers. Let's go ahead and do this. How about import sys
as before. Let me give myself a list of numbers, like 4, 6, 8, 2, 7, 5, 0, so just a bunch of integers. And then
let's do this. If you recall from week three, we searched for the number 0 at the end of the lockers on
stage.

- [5:12:38](https://www.youtube.com/watch?v=undefined&t=54758s) So let's just ask that question in


Python. No need for a loop or anything like that. If 0 is in the numbers, go ahead and print out found.
And then let's just exit successfully, with 0, else, if we get down here, let's just say print not found. And
then we'll sys.exit with 1. So this is where Python starts to get powerful again.

- [5:12:58](https://www.youtube.com/watch?v=undefined&t=54778s) Here's your list. Here is your loop,


that's doing all of the checking for you. Underneath the hood, Python is going to use linear search. You
don't have to implement it yourself. No while loop, no for loop, you just ask a question. If 0 is in
numbers, then do the following. So that's one feature we now get with Python, and get to throw away a
lot of that code.
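
To make the comparison concrete, here is a small sketch that puts a hand-written linear search next to the in operator. The helper name contains is hypothetical and only exists to show what the operator saves you from writing.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]


def contains(values, target):
    """Explicit linear search; gives the same answer as `target in values`."""
    for value in values:
        if value == target:
            return True
    return False


# The membership operator asks the same question without a hand-written loop.
if 0 in numbers:
    print("Found")
else:
    print("Not found")
```

For a list, both forms check the elements one at a time, so the running time still grows with the length of the list.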

- [5:13:17](https://www.youtube.com/watch?v=undefined&t=54797s) We can do it with strings, too. Let


me open a file called Names.py instead, and do something that was even more involved in C, because we
needed strcmp and the for loop, and so forth. Let me import sys for this file. Let's give myself a bunch
of names like we did in C. And those were Bill and Charlie and Fred and George and Ginny, and two
more, Percy, and lastly Ron.

- [5:13:42](https://www.youtube.com/watch?v=undefined&t=54822s) And recall, at the time, we looked


for Ron. And so we had to iterate through the whole thing, doing strcmp and i plus plus and all of that.
Now just ask the question, if Ron is in names, then let's go ahead and, whoops, let me hide that. I hit the
command too soon. Let me go ahead and say print, found, as before.

- [5:14:03](https://www.youtube.com/watch?v=undefined&t=54843s) sys.exit 0, just to indicate success,


and then down here, if we get to this point, we can say not found. And then we'll just sys.exit 1 instead.
So, again, this just does linear search for us by default, Python of Names.py, we found Ron, because,
indeed, he's there, and at the end of the list. But we don't need to deal with all of the mechanics of it.
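
The same idea for strings, as a short sketch. The function name find is an assumption for illustration. The in operator compares elements with ==, so strings are matched exactly, including case, with no strcmp needed.

```python
names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def find(name):
    """Report whether name appears in the list, compared with ==."""
    return "Found" if name in names else "Not found"


print(find("Ron"))  # prints Found
print(find("ron"))  # prints Not found, because the comparison is case-sensitive
```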

- [5:14:25](https://www.youtube.com/watch?v=undefined&t=54865s) All right, let's take things one step


further. In week three, we also implemented the idea of a phone book, that actually associated keys with
values. But remember, the phone book in C, was kind of a hack, right? Because we first had two arrays,
one with names, one with numbers. Then we introduced structs, and so we gave you a person structure.
- [5:14:44](https://www.youtube.com/watch?v=undefined&t=54884s) And then we had an array of
persons. You can do this in Python, using objects and things called classes. But we can also just use a
general purpose dictionary, because just like in P set 5, you can associate keys with values, using a hash
table, using a trie. Well, similarly, Python can just do this for us.

- [5:15:03](https://www.youtube.com/watch?v=undefined&t=54903s) From CS50, let's import get string.


And now let's give myself a dictionary of people, D-I-C-T () open paren closed paren gives you a
dictionary. Or you can simplify the syntax, actually, and a dictionary again is just keys and values, words
and definitions. You can also just use curly braces instead.

- [5:15:22](https://www.youtube.com/watch?v=undefined&t=54922s) That gives me an empty


dictionary. But if I know what I want to put in it by default, let's put Carter in there, with a number of
plus 1-617-495-1000, just like last time, and put myself, David, with plus 1-949-468-2750. And it came to
my attention, tragically, after class that day, that we had a bug in our little Easter egg.

- [5:15:45](https://www.youtube.com/watch?v=undefined&t=54945s) If today, you would like to call me


or text me, at that number, we have fixed the code that underlies that little Easter egg. Spoiler ahead. All
right, so this now gives me a variable called people, that's associating keys with values. There is some
new syntax here in Python, not just the curly braces, but the colons, and the quotes on the left and the
right.

- [5:16:05](https://www.youtube.com/watch?v=undefined&t=54965s) This is a way, in Python, of


associating keys with values, words with definitions, anything with anything else. And it's going to be a
super-common paradigm, including in week seven, when we look at CSS and HTML and web
programming, keys and values are like this omnipresent idea in computer science and programming,
because it's just a really useful way of associating one thing with another.

- [5:16:26](https://www.youtube.com/watch?v=undefined&t=54986s) So, at this point in the story, we


have a dictionary, a hash table, if you will, of people, associating names with phone numbers, just like a
real world phone book. So let's write a program that gets a string from the user and asks them whose
number they would like to look up. Then, let's go ahead and say, if that name is in the people dictionary,
go ahead and print out that person's number, by going into the people dictionary and going to that
specific name, within there, using an f-string for the whole thing.

- [5:16:56](https://www.youtube.com/watch?v=undefined&t=55016s) So this is similar in spirit to


before. Linear search and dictionary lookups will just happen automatically for you in Python, by just
asking the question, if name in people. And this line is just going to print out, whoever is in the people
dictionary, at that name. So I'm using square brackets, because here's the interesting thing in Python,
just like you can index into an array, or a list in Python, using numbers, 0, 1, 2, you can very conveniently
index into a dictionary in Python, using square brackets, as well.
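
A sketch of the lookup just described. The names and numbers are the ones from the lecture; the lookup function, its "Not found" reply, and passing the name as a parameter instead of calling get_string are assumptions so the example runs without the CS50 library.

```python
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}


def lookup(name):
    """Return a line describing the number stored under name, if any."""
    if name in people:  # for a dict, `in` checks the keys
        number = people[name]  # square brackets index by key, not position
        return f"Number: {number}"
    return "Not found"  # hypothetical reply; the lecture does not show one


print(lookup("David"))
```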

- [5:17:30](https://www.youtube.com/watch?v=undefined&t=55050s) And just to make clear what's


going on here, let me go and create a temporary variable, person equals people bracket name. And then
let's just, or, sorry, let's say, number equals people bracket name. And that will just print out the number
in question. In C, and previously in Python, anything with square brackets like this would have been go to
a location in a list or an array, using a number.
- [5:17:53](https://www.youtube.com/watch?v=undefined&t=55073s) But that can actually be a string,
like a word the human has typed. And this is what's amazing about dictionaries, it's not like a big line, a
big linear thing. It's this table, that you can look up in one column the name, and get back in the other
column the number. So let's go ahead and run Python of Phonebook.

- [5:18:10](https://www.youtube.com/watch?v=undefined&t=55090s) py, found, not that, oh, wait.


That's not what's supposed to happen at all. I think I'm in the wrong place. Phonebook.py. What's going
on? Print found. I am confused. OK, let's run this again. Python of Phonebook.py, what the-- OK, stand
by. [KEYS CLICKING] What the heck? What am I not understanding here? OK, Roxanne, Carter, do you see
what I'm doing wrong? AUDIENCE: I don't.

- [5:19:06](https://www.youtube.com/watch?v=undefined&t=55146s) DAVID J. MALAN: What the--


[LAUGHTER] Say again? SPEAKER 47: When you found the test results, it was doing both commands.
DAVID J. MALAN: Oh, yeah, found, OK, we're going to do this. One sec. [KEYS CLICKING] Whoa, OK. All
this is coming out of the video. So. [LAUGHTER] [APPLAUSE] Thanks. All right. I will try to figure out what
was going wrong.

- [5:19:45](https://www.youtube.com/watch?v=undefined&t=55185s) The best I can tell, it was running


the wrong program. I don't quite understand why. So we will diagnose this later. I just put the file into a
temporary directory, for now, to run it. So let me go ahead and just run this, Python of Phonebook.py,
type in, for instance, my name. And there's my corresponding number.

- [5:20:03](https://www.youtube.com/watch?v=undefined&t=55203s) Have no idea what was just


happening. But I will get to the bottom of it and update you, if we can put our finger on it. So this was
just an example, now, of implementing a phone book. Let's now consider what we can do that's a little
more powerful, in these examples, like a phone book that actually keeps this information around.

- [5:20:19](https://www.youtube.com/watch?v=undefined&t=55219s) Thus far, these simple phone book


examples throw the information away. But using CSV files, comma separated values, maybe we could
actually keep around the names and numbers, so that, like on your phone, you can actually keep your
contacts around long-term. So I'm going to go ahead now and do a slightly different example.

- [5:20:36](https://www.youtube.com/watch?v=undefined&t=55236s) And let me just hide this detail, so


it's not confusing. Whoops, I'm going to change my prompt temporarily. So let me go ahead now and
refine this example as follows. I'm going to go into Phonebook.py, and I'm going to import a whole library
called CSV. And this is a powerful one, because Python comes with a library that just handles CSV files for
you.

- [5:20:58](https://www.youtube.com/watch?v=undefined&t=55258s) A CSV file is just a file with comma


separated values. And, in fact, to demonstrate this, let me check on one thing here, just to make this a
little more real. To demonstrate this, let's go ahead and do this. Let me import the CSV library. From CS50,
let me import get_string. Let me then open a file, using the open function, open a file called Phonebook.

- [5:21:29](https://www.youtube.com/watch?v=undefined&t=55289s) csv, in append format, in contrast


with read format and write format. Write just blows it away if it exists, append adds to the bottom of it.
So I keep this phone book around, just like you might keep adding contacts to your phone. Now let me
go ahead and get a couple of values from the user. Let me say getString and ask the user for a name.
- [5:21:45](https://www.youtube.com/watch?v=undefined&t=55305s) Then let me getString again, and
ask the user for their number. And now, let me go ahead and do this. And this is new, and this is Python-
specific. And you would only know this by following a tutorial, or reading the documentation. Let me
give myself a variable called writer, and ask the CSV library for a writer to that file.

- [5:22:06](https://www.youtube.com/watch?v=undefined&t=55326s) Then, let me go ahead and use


that writer variable, use a function or a method inside of it, called write row, to write out a list containing
that person's name and number. Notice the square brackets inside the parentheses, because I'm just
printing a list to that particular row in the file. And then I'm just going to close the file.
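
Those steps can be sketched as follows. The function add_contact and its path parameter are assumptions made so the example does not depend on get_string or on a fixed file name. The newline="" argument is not mentioned in the lecture; it is what the csv module's documentation recommends when opening files for a writer.

```python
import csv


def add_contact(path, name, number):
    """Append one name,number row to the CSV file at path."""
    file = open(path, "a", newline="")  # "a" appends instead of overwriting
    writer = csv.writer(file)
    writer.writerow([name, number])  # one list becomes one row
    file.close()


# Example: add_contact("phonebook.csv", "Carter", "+1-617-495-1000")
```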

- [5:22:28](https://www.youtube.com/watch?v=undefined&t=55348s) So what is the effect of all of this?


Well, let me go ahead and run this version of Phonebook.py, and I'm prompted for a name. Let's do
Carter's first, plus 1-617-495-1000, and then, let's go ahead and LS. Notice in my current directory,
there's two files now, Phonebook.py, which I wrote, and apparently Phonebook.csv.

- [5:22:51](https://www.youtube.com/watch?v=undefined&t=55371s) CSV just stands for comma


separated values. And it's like a very simple way of storing data in a spreadsheet, if you will, where the
comma represents the separation between your columns. There's only two columns here, name and
number. But, because I'm writing to this file in append mode, let me run it one more time, Python of
Phonebook.

- [5:23:10](https://www.youtube.com/watch?v=undefined&t=55390s) py, and let me go ahead and do


David and plus 1-949-468-2750, Enter. And notice what happened in the CSV file. It automatically
updated, because I'm now persisting this data to the file in question. So if I wanted to now read this file
in, I could actually go ahead and do linear search on the data, using a read function to actually read from
the CSV.

- [5:23:35](https://www.youtube.com/watch?v=undefined&t=55415s) But, for now, we'll just leave it a


little simply as write. And let me make one refinement here. It turns out that, if you're in the habit of re-
opening a file, you don't have to even close it explicitly. You can instead do this. You can instead say, with
the opening of a file called Phonebook.

- [5:23:53](https://www.youtube.com/watch?v=undefined&t=55433s) csv in append mode, calling the


thing file, go ahead and do all of these lines here. So the with keyword is a new thing in Python. And it's
used in a few different ways, but one of the ways it's used is to tighten up code here. And I'm going to
move my variables to the outside, because they don't need to be inside of the with statement, where
the file is open.

- [5:24:10](https://www.youtube.com/watch?v=undefined&t=55450s) This just has the effect of ensuring


that you, the programmer, don't screw up, and accidentally don't close your file. In fact, you might recall,
from C, Valgrind might have complained at you, if you had a file that, you didn't close a file, you might
have had a memory leak as a result.
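
Here is a sketch of the same append written with the with keyword. As before, add_contact and its parameters are hypothetical names; returning file.closed is only there to make the automatic close visible.

```python
import csv


def add_contact(path, name, number):
    """Append one row; the file is closed when the with block ends."""
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])
    # No explicit close is needed; leaving the block already closed the file.
    return file.closed
```

The file is closed even if writerow raises an exception, which a manual close at the end of the function would not guarantee.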

- [5:24:24](https://www.youtube.com/watch?v=undefined&t=55464s) The with keyword takes care of all


of that for you, as well. How about let's do, want to do this. How about, let's do one other thing. Let's do
this. Let me go ahead and propose, that on your phone or laptop here, or online, go to this URL here,
where you'll find a Google form. And just to show that these CSVs are actually kind of omnipresent, and
if you've ever like used a Google Form or managed a student group, or something where you've collected
data via Google Forms, you can actually export all of that data via CSV files.

- [5:24:55](https://www.youtube.com/watch?v=undefined&t=55495s) So go ahead to this URL here. And


those of you watching on demand later, will find that the form is no longer working, since we're only
doing this live. But that will lead to a Google Form that's going to let everyone input their answer to a
question, like what house do you want to end up in, sort of an approximation of the sorting hat in
Harry Potter.

- [5:25:13](https://www.youtube.com/watch?v=undefined&t=55513s) And via this form, will we then


have the ability to export, we'll see, a CSV file. So let's give you a moment to do that. In just a moment,
I'll share my version of the screen, which is going to let me actually open the file, the form itself. And in
just a moment, I'll switch over. OK, so this is now my version of the form here, where we have 200 plus
responses to a simple question of the form, what house do you belong in, Gryffindor, Hufflepuff,
Ravenclaw, or Slytherin.

- [5:25:45](https://www.youtube.com/watch?v=undefined&t=55545s) If I go over to responses, I'll see all


of the responses in the GUI form here. So graphical user interface, and we could flip through this. And it
looks like, interestingly, 40% of Harvard students want to be in Gryffindor, 22% in Slytherin, and everyone
else in between the others. But you might have noticed, if ever using a Google Form, this Google
Spreadsheets link.

- [5:26:05](https://www.youtube.com/watch?v=undefined&t=55565s) So I'm going to go ahead and click


that. And that's going to automatically open, in this case, Google Spreadsheets. But you can do the same
thing with Office 365 as well. And now you see the raw data as a spreadsheet. But in Google
Spreadsheets, if I go to File and then I go to Download, notice I can download this as an Excel file, a PDF,
and also a CSV, comma separated values.

- [5:26:25](https://www.youtube.com/watch?v=undefined&t=55585s) So let me go ahead and do that.


That gives me a file in my Downloads folder on my computer. I'm going to now go back to my code editor
here. And what I'm going to go ahead and do is upload this file, from my Downloads folder to VS Code,
so that we can actually see it within here. And now you can see this open file.

- [5:26:45](https://www.youtube.com/watch?v=undefined&t=55605s) And I'm going to shorten its


name, just so it's a little easier to read. I'm going to rename this using the MV command, to just
Hogwarts.csv. And then we can see, in the file, that there's two columns, Timestamp comma House,
where you have a whole bunch of time stamps when people filled out the form, with someone very early
in class.

- [5:27:01](https://www.youtube.com/watch?v=undefined&t=55621s) And then everyone else just a


moment ago. And the second value, after each comma, is the name of the house. Well, let me go ahead
here and implement a program in a file called Hogwarts.py, that processes this data. So in Hogwarts.py,
let's just write a program that now reads a CSV, in this case not a phone book, but everyone's sorting hat
information.

- [5:27:20](https://www.youtube.com/watch?v=undefined&t=55640s) And I'm going to go ahead and


Import CSV. And suppose I want to answer a reasonable question, ignoring the fact that Google's GUI or
graphical user interface, can do this for me. I just want to count up who's going to be in which house. So
let me give myself a dictionary called houses, that's initially empty, with curly braces.

- [5:27:37](https://www.youtube.com/watch?v=undefined&t=55657s) And let me pre-create a few keys.


Let me say Gryffindor is going to be initialized to 0, Hufflepuff will be initialized to 0 as well, Ravenclaw
will be initialized to 0. And finally, Slytherin will be initialized to 0. So here's another example of a
dictionary, or a hash table, just being a very general-purpose piece of data.

- [5:27:59](https://www.youtube.com/watch?v=undefined&t=55679s) You can have keys and values. The


keys, in this case, are the houses. The values are initially zero, but I'm going to use this, instead of like
four separate variables, to keep track of everyone's answer to this form. So I'm going to do this. With
opening Hogwarts.csv, in read mode, not append, I don't want to change it.

- [5:28:20](https://www.youtube.com/watch?v=undefined&t=55700s) I just want to read it, as file as my


variable name. Let's go ahead and create a reader this time, that is using the reader function in the CSV
library, by opening that file. I'm going to go ahead and ignore the first line of the file, because, recall, that
the first line is just timestamp and house.

- [5:28:37](https://www.youtube.com/watch?v=undefined&t=55717s) I want to get the real data. So this


next function is just a little trick for ignoring the first line of the file. Then let's do this. For every other
row in the reader, that is line by line, get the current person's house, which is in row bracket 1. This is
what the CSV reader library is doing for us.

- [5:28:55](https://www.youtube.com/watch?v=undefined&t=55735s) It's handling all of the reading of


this file. It figures out where the comma is, and, for every row in the file, it hands you back a list of size 2.
In bracket 0 is the time stamp, in bracket 1 is the house name. So, in my code, I can say house equals row
bracket 1. I don't care about the time stamp for this program.

- [5:29:13](https://www.youtube.com/watch?v=undefined&t=55753s) And then let's go into my


dictionary called houses, plural, index into it at the house location, by its name, and increment that 0 to
1. And now, at the end of this block of code, that has the effect of iterating over every line of the file,
updating my dictionary in four different places, based on whether someone typed Gryffindor or Slytherin
or anything else.

- [5:29:36](https://www.youtube.com/watch?v=undefined&t=55776s) And notice that I'm using the


name of the house to index into my dictionary, to essentially go up to this little cheat sheet and change
the 0 to a 1, the 1 to a 2, the 2 to a 3, instead of having like four separate variables, which would just be
much more annoying to maintain. Down at the bottom, let's just print out the results.

- [5:29:53](https://www.youtube.com/watch?v=undefined&t=55793s) For each house in those houses,


iterating over the keys therein by default in Python, let's go ahead and print out an f-string that says,
the current house has the current count. And count will be the result of indexing into houses, for that
given house. And let me close my quote. So let's run this to summarize the data, Hogwarts.

- [5:30:18](https://www.youtube.com/watch?v=undefined&t=55818s) py, 140 of you answered


Gryffindor, 54 Hufflepuff, 72 Ravenclaw, and 80 of you Slytherin. And that's just now by way of code, and
this is, oh, my God, so much easier than C, to actually analyze data in this way. And one of the reasons
that Python is so popular for data science and analytics, more generally, is that it's actually really easy to
manipulate data, and run analytics like this.
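
A sketch of the counting program described above. To keep it self-contained, it reads from an in-memory sample instead of Hogwarts.csv; the sample rows and timestamps are made up for illustration, and count_houses is a hypothetical name. In the lecture, the file comes from opening Hogwarts.csv in read mode inside a with statement.

```python
import csv
import io


def count_houses(file):
    """Tally the second column of a Timestamp,House CSV."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        house = row[1]  # row[0] is the timestamp, row[1] is the house
        houses[house] += 1
    return houses


# Made-up sample data standing in for the exported form responses.
sample = io.StringIO(
    "Timestamp,House\n"
    "2021/10/13 1:00:00 PM,Gryffindor\n"
    "2021/10/13 1:00:05 PM,Slytherin\n"
    "2021/10/13 1:00:09 PM,Gryffindor\n"
)
counts = count_houses(sample)
for house in counts:  # iterating over a dict yields its keys
    print(f"{house}: {counts[house]}")
```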

- [5:30:37](https://www.youtube.com/watch?v=undefined&t=55837s) And let me clean this up slightly.


It's a little annoying that I just have to know and trust that the house name is in bracket 1 and timestamp
is in bracket 0. Let's clean this up. There's something called a Dictionary Reader in the CSV library that I
can use instead. Capital D, capital R, this means I can throw away this next thing, because what a
dictionary reader does is it still returns to me every row from the file, one after the other, but it doesn't
just give me a list of size 2 representing each row.

- [5:31:09](https://www.youtube.com/watch?v=undefined&t=55869s) It gives me a dictionary. And it


uses, as the keys in that dictionary, timestamp and house, for every row in the file, which is just to say it
makes my code a little more readable, because instead of doing this little trickery, bracket 1, I can say
bracket, quote unquote, "House", with a capital H, because it's capitalized in the Google Form itself.

- [5:31:29](https://www.youtube.com/watch?v=undefined&t=55889s) So the code now is just minorly


different, but it's way more resilient, especially if I'm using Google Spreadsheets, and I'm moving the
columns around or doing something like that, where the numbers might get messed up. Now I can run
this on Hogwarts.py again, and I get the same answers. But I now don't have to worry about where those
individual columns are.
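
The DictReader version can be sketched like this. The sample data is again made up, and its columns are deliberately swapped to show that the code no longer depends on column order; count_houses is a hypothetical name.

```python
import csv
import io


def count_houses(file):
    """Tally the House column, wherever it appears in the file."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    reader = csv.DictReader(file)  # uses the header row as dictionary keys
    for row in reader:
        houses[row["House"]] += 1  # look up by column name, not position
    return houses


# Made-up rows with House first and Timestamp second.
swapped = io.StringIO(
    "House,Timestamp\n"
    "Hufflepuff,2021/10/13 1:00:00 PM\n"
    "Ravenclaw,2021/10/13 1:00:05 PM\n"
)
print(count_houses(swapped))
```

Because DictReader consumes the header row itself, the call to next that skipped the first line is no longer needed.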

- [5:31:46](https://www.youtube.com/watch?v=undefined&t=55906s) All right, any questions on those


capabilities there. And that's a teaser of sorts, for some of the manipulation we'll do in P set 6. All right,
so some final examples and flair, to intrigue with what you can do with Python. I'm going to actually
switch over to a terminal window on my own Mac, so that I can actually use audio a little more
effectively.

- [5:32:08](https://www.youtube.com/watch?v=undefined&t=55928s) So here's just a terminal window


on Mac OS. I before class have preinstalled some additional Python libraries, that won't really work in VS
Code in the cloud, because they require audio that the browser won't necessarily support. But I'm going
to go ahead and write an example here that involves writing a speech-based program, that actually does
something with speech.

- [5:32:27](https://www.youtube.com/watch?v=undefined&t=55947s) And I'm going to go ahead and


import a library, that, again, I pre-installed, called Python text to speech, and I'm going to go ahead and,
per its documentation, give myself a speech engine, by using that library's init function, for initialize. I'm
then going to use this engine's say function to do something fun, like Hello, world.

- [5:32:46](https://www.youtube.com/watch?v=undefined&t=55966s) And then I'm going to go ahead


and tell this engine to run and wait, while it says those words. All right, I'm going to save this file. I'm not
using VS Code at the moment. I'm using another popular program that we used in CS50 back in my day,
called Vim, which is a command line program that's just in this black and white window.

- [5:33:01](https://www.youtube.com/watch?v=undefined&t=55981s) Let me go ahead now and run


Python of Speech.py, and-- COMPUTER: Hello, world. DAVID J. MALAN: All right, so it's a little
computerized, but it is speech that has been synthesized from this example. Let's change it a little bit to
be more interesting. Let's do something like this. Let's ask the user for their name, like what's your name
question mark.
- [5:33:20](https://www.youtube.com/watch?v=undefined&t=56000s) And then, let's use the little F
string, and say, not Hello, world, but Hello to that person's name. Let me save my file, run Python of
Speech.py, Enter. David. COMPUTER: Hello, David. DAVID J. MALAN: All right, so we pronounce my name
OK, might struggle with different names, depending on the phonetics.
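
A rough sketch of that program, assuming the third-party pyttsx3 package is the "Python text to speech" library being described and that it is installed along with working audio. The function names greeting and speak are hypothetical, and the import is placed inside speak only so the file can be loaded on a machine without the package.

```python
def greeting(name):
    """Build the sentence to be spoken."""
    return f"hello, {name}"


def speak(text):
    """Say text aloud through the default speech engine."""
    import pyttsx3  # third-party; assumed to be installed

    engine = pyttsx3.init()
    engine.say(text)  # queue the words
    engine.runAndWait()  # block until they have been spoken


# In a script: speak(greeting(input("What's your name? ")))
```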

- [5:33:39](https://www.youtube.com/watch?v=undefined&t=56019s) But that one seemed to be OK.


Let's do something else with Python, using similarly, just a few lines of code. Let me go into today's
examples. And I'm going to go into a folder called Detect, whoops, a folder called Faces.py. Sorry, Faces.
And in this folder, that I've written in advance, are a few files, Detect.py, Recognize.

- [5:34:02](https://www.youtube.com/watch?v=undefined&t=56042s) py, and two full of photos,


Office.jpeg and Toby.jpeg. If you're familiar with the show, here, for instance, is the cast photo from The
Office here. So here's a photo as input. Suppose I want to do something very Facebook-style, where I
want to analyze all of the faces, or detect all of the faces in there.

- [5:34:19](https://www.youtube.com/watch?v=undefined&t=56059s) Well, let me go ahead and show


you a program I wrote in advance, that's not terribly long. Much of it is actually comments. But let's see
what I'm doing. I'm importing the Pillow library, again, to get access to images. I'm importing a library
called face recognition, which I downloaded and installed in advance.

- [5:34:35](https://www.youtube.com/watch?v=undefined&t=56075s) But it does what it says. According


to its documentation, you go into that library and you call a function called load image file, to load
something like Office.jpeg, and then you can use the line of code like this. Call a function called face
locations, passing the image as input, and you get back a list of all of the faces in the image.

- [5:34:54](https://www.youtube.com/watch?v=undefined&t=56094s) And then down here, a for loop,


that iterates over all of those face locations. And inside of this loop, I just do a bit of trickery. I figure out
the top, right, bottom, and left corners of those locations. And then, using these lines of code here, I'm
using that image library, to just draw a box, essentially.

- [5:35:11](https://www.youtube.com/watch?v=undefined&t=56111s) And the code looks cryptic.


Honestly, I would have to look this up to write it again. But per the documentation, this just draws a nice
little box around the image. So let me go ahead and zoom out here, and run this now on Office.jpeg. All
right, it's analyzing, analyzing, and you can see in the sidebar here, here's the original.

- [5:35:31](https://www.youtube.com/watch?v=undefined&t=56131s) And here is every face that my,


what, 10 lines of Python code found, within that file. What's a face? Presumably the library is looking for
something, maybe without a mask, that has two eyes, a nose, and a mouth, in some kind of
arrangement, some kind of pattern. So it would seem pretty reliable, at least on these fairly easy-to-read
faces here.
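
Detect.py itself is not shown in the transcript, so the following is only a sketch of the approach described, assuming the third-party face_recognition and Pillow packages are installed. The function names, the red outline, and the line width are assumptions. Each location comes back as (top, right, bottom, left), which has to be reordered for Pillow's rectangle call.

```python
def to_box(location):
    """Reorder (top, right, bottom, left) into Pillow's [left, top, right, bottom]."""
    top, right, bottom, left = location
    return [left, top, right, bottom]


def draw_faces(path):
    """Return a copy of the image at path with a box around each detected face."""
    import face_recognition  # third-party; assumed to be installed
    from PIL import Image, ImageDraw  # Pillow; assumed to be installed

    image = face_recognition.load_image_file(path)
    picture = Image.fromarray(image)
    draw = ImageDraw.Draw(picture)
    for location in face_recognition.face_locations(image):
        draw.rectangle(to_box(location), outline="red", width=3)
    return picture


# Example: draw_faces("office.jpg").show()
```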

- [5:35:50](https://www.youtube.com/watch?v=undefined&t=56150s) What if we want to look for


someone specific, for instance, someone that's always getting picked on. Well, we could do something
like this. Recognize.py, which is taking two files as input, that image and the image of one person in
particular. And if you're trying to find Toby in a crowd, here I conflated the program, sorry, this is the
version that draws a box around the given face.
- [5:36:08](https://www.youtube.com/watch?v=undefined&t=56168s) Here we have Toby as identified.
Why? Because that program, Recognize.py, has a few more lines of code, but long story short, it
additionally loads as input Toby.jpeg, in order to recognize that specific face. And that specific face is a
completely different photo, but it looks similar enough to the person, that it all worked out OK.

- [5:36:29](https://www.youtube.com/watch?v=undefined&t=56189s) Let's do one other that's a little


sensitive to microphones. Let me go into, how about my listen folder here, which is available online, too.
And let's just run Python of Listen0.py. I'm going to type in like David. Oh, sorry, no, I'm going to-- Hello,
world. Oh, no, that's the wrong version.

- [5:36:54](https://www.youtube.com/watch?v=undefined&t=56214s) [CHUCKLES] OK, I looked like an


idiot. OK, hello, there we go. Hello to you, too. And if I say goodbye, I'm talking to my laptop like an idiot,
OK. Now it's detecting what I'm saying here. So this first version of the program is just using some
relatively simple, if elif elif, and it's just asking for input, forcing it to lowercase.

- [5:37:13](https://www.youtube.com/watch?v=undefined&t=56233s) And that was my mistake with the


first example. And then, I'm just checking, is Hello in the user's words? Is how are you in the user's
words? Didn't see that, but it's there. Is goodbye in the user's words? Now let's do a cooler version, using
a library, just by looking at the effect.
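
A minimal sketch of the keyword-matching idea described here, assuming the program simply lowercases the input and tests for substrings. The function name `respond` and every reply other than "Hello to you, too" are hypothetical, not taken from the lecture's file.

```python
def respond(words: str) -> str:
    """Return a canned reply based on keywords found in the input."""
    # Force the input to lowercase so "Hello" and "hello" match alike.
    words = words.lower()
    if "hello" in words:
        return "Hello to you, too!"
    elif "how are you" in words:
        return "I am well, thanks!"    # hypothetical reply
    elif "goodbye" in words:
        return "Goodbye to you, too!"  # hypothetical reply
    else:
        return "Huh?"                  # hypothetical fallback


print(respond("Hello, world"))
```

Because the check is a plain substring test, any sentence containing "hello" triggers the first branch, which is why this version involves no real language understanding.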

- [5:37:26](https://www.youtube.com/watch?v=undefined&t=56246s) Python of Listen1.py. Hello, world.


Huh. Let's do version 2 of this, that uses an audio speech-to-text library. Hello, world. OK, so now it's
artificial intelligence. Now let's do something a little more interesting. The third version of this program
that actually analyzes the words that are said.

- [5:37:53](https://www.youtube.com/watch?v=undefined&t=56273s) Hello, world, my name is David.


How are you? OK, so that time, it not only analyzed what I said, but it plucked my name out of it. Let's do
two final examples. This one will generate a QR code. Let me go ahead and write a program called QR.py,
that very simply does this. Let me import a library called os.

- [5:38:17](https://www.youtube.com/watch?v=undefined&t=56297s) Let me import a library called


qrcode. Let me grab an image here, that's qrcode.make. And let me give you the URL of like a lecture
video on YouTube, or something like that, with this ID. Let me just type this, so I don't get it wrong. OK,
so if I now use this URL here, of a video on YouTube, making sure I haven't made any typos, I'm now
going to go ahead and do two lines of code in Python.

- [5:38:46](https://www.youtube.com/watch?v=undefined&t=56326s) I'm going to first save that as a file


called QR.png, which is a two dimensional barcode, a QR code. And, indeed, I'm going to use this format.
And I'm going to use the os.system function to open QR.png automatically. And if you'd like to take out
your phone at this point, you can see the result of my barcode, that's just been dynamically generated.
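
A rough sketch of the steps just described, assuming the third-party `qrcode` package is installed. The helper name `make_qr` and the placeholder URL are hypothetical, since the transcript does not give the video ID.

```python
def make_qr(url: str, path: str = "qr.png") -> str:
    """Save a QR code that encodes `url` as a PNG image at `path`."""
    # Imported here so the sketch loads even where the package is missing.
    import qrcode  # third-party package, e.g. pip install "qrcode[pil]"

    img = qrcode.make(url)  # build the QR code image
    img.save(path, "PNG")   # write it to disk in PNG format
    return path


# Usage (placeholder URL; "open" is the macOS command for opening a file):
#   import os
#   make_qr("https://www.youtube.com/watch?v=VIDEO_ID")
#   os.system("open qr.png")
```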

- [5:39:09](https://www.youtube.com/watch?v=undefined&t=56349s) Hopefully from afar that will scan.


[UPROAR] And I think that's an appropriate line to end on. So that's it for CS50. We will see you next
time. [APPLAUSE] [MUSIC PLAYING] [MUSIC PLAYING] DAVID J. MALAN: This is CS50.

- [5:40:45](https://www.youtube.com/watch?v=undefined&t=56445s) And this is week 7, the week,


here, of Halloween. Indeed, special thanks to CS50's own Valerie and her mom for having created this
very festive scenery, and all past ones as well. Today, we pick up where we left off last time, which, recall,
we introduced Python. And that was our big transition from C, where suddenly things started to look
new again, probably, syntactically.

- [5:41:06](https://www.youtube.com/watch?v=undefined&t=56466s) But also, probably things


hopefully started to feel easier. Well, with that said, problem set 6 certainly added some challenges, and
you did some new things. But hopefully you've begun to appreciate that with Python, just a lot more
stuff is easier to do. You get more out of the box with the language itself.

- [5:41:22](https://www.youtube.com/watch?v=undefined&t=56482s) And that's going to be so useful


over the coming weeks as we transition further to introducing something called databases today, web
programming next week and the week after. So that by term's end, and perhaps even for your final
project, you really are building something from scratch using all of these various tools somehow
together.

- [5:41:41](https://www.youtube.com/watch?v=undefined&t=56501s) So before we do that, though,


today, let's consider what we weren't really able to do last week, which was actually create and store
data ourselves. In Python, we've played around with the CSV, comma-separated values library. And
you've been able to read in CSVs from disk, so to speak, that is, from files in your programming
environments.

- [5:42:03](https://www.youtube.com/watch?v=undefined&t=56523s) But we haven't necessarily started


saving data, persisting data ourselves. And that's a huge limitation, because pretty much all of the
examples we've done thus far with a couple of exceptions have involved my providing input at the
keyboard or even vocally. But then nothing happens to it. It disappears the moment the program quits,
because it was only being stored in memory.

- [5:42:21](https://www.youtube.com/watch?v=undefined&t=56541s) But today, we'll start to focus all


the more on storing things on disk, that is, storing things in files and folders so that you can actually
write programs that remember what it is the human did last time. And ultimately, you can actually make
mobile or web apps that actually begin to grow, and grow, and grow their data sets, as might happen if
you get more and more users, for instance, on a website.

- [5:42:41](https://www.youtube.com/watch?v=undefined&t=56561s) To play, then, with this new


capability of being able to write files, let's go ahead and just collect some data. In fact, those of you here
in person, if you want to pull up this URL on your phone or laptop, that's going to lead you to a Google
Form. And that Google Form is going to ask you in just a moment for really just your favorite TV show.

- [5:43:01](https://www.youtube.com/watch?v=undefined&t=56581s) And it's going to ask you to


categorize it according to a genre, like comedy, or drama, or action, or musical, or something like that.
And this is useful, because if you've ever used a Google Form before, or Microsoft's equivalent with
Office 365, it's a really useful mechanism for just collecting data from users, and then ultimately, putting it
into a spreadsheet form.

- [5:43:19](https://www.youtube.com/watch?v=undefined&t=56599s) So this is a screenshot of the form


that those of you here in person or tuning in on Zoom are currently filling out. It's asking only two
questions. What's the title of your favorite TV show? And what are one or more genres into which your
TV show falls? And I'll go ahead and pivot now to the view that I'll be able to see as the person who
created this form, which is quite simply a Google spreadsheet.
- [5:43:43](https://www.youtube.com/watch?v=undefined&t=56623s) Google Forms has this nice
feature, if you've never noticed, that allows you to export your data to a Google Spreadsheet. And then
from there, we can actually grab the file and download it to my own Mac or your own PC so that we can
actually play around with the data that's come in. So in fact, let me go ahead and slide over to this, the
live Google Spreadsheet.

- [5:44:02](https://www.youtube.com/watch?v=undefined&t=56642s) And you'll see, probably, a whole


bunch of familiar TV shows here, all coming in. And if we keep scrolling, and scrolling, and scrolling-- only
46, 47. There we go, up to 50 plus already. If you need that URL again here, if you're just tuning in, you
can go to this URL here. And in just a moment, we'll have a bunch of data with which we can start to
experiment.

- [5:44:25](https://www.youtube.com/watch?v=undefined&t=56665s) I'll give you a moment or so there.


All right. Let me hang in there a little longer. OK, we've got over 100 submissions. Good. Good, even
more coming in now. And we can see them coming in live. Here, let me switch back to the spreadsheet.
The list is growing, and growing, and growing. And in just a moment-- let me give Carter a moment to
help me export it in real time.

- [5:44:52](https://www.youtube.com/watch?v=undefined&t=56692s) Carter, just give me a heads up


when it's reasonable for me to download this file. All right, and I'll begin to do this very slowly. So I'm
going to go up to the File menu, if you've never done this before. Download-- you can download a whole
bunch of formats, one in Excel. But more simply, and the one we'll start to play with here, is comma-
separated values.

- [5:45:09](https://www.youtube.com/watch?v=undefined&t=56709s) So CSV files we used this past


week, why are they useful? Now that you've played with them or used them in the past in the real world, what's
the utility of a CSV file versus something like Excel, for instance? Why CSV in the first place? Any
instincts? Yeah? AUDIENCE: Because it's just a text file? DAVID J. MALAN: OK, so storage is compelling.

- [5:45:30](https://www.youtube.com/watch?v=undefined&t=56730s) A simple text file with ASCII or


Unicode text is probably pretty small. I like that. Other thoughts? AUDIENCE: Structure of it? DAVID J.
MALAN: Yeah, well said. It's just a simple text format, but using conventions like commas you can
represent the idea of columns. Using new lines, backslash n's invisibly at the end of your lines, you can
create the idea of rows.

- [5:45:47](https://www.youtube.com/watch?v=undefined&t=56747s) So it's a very simple way of


implementing what we might call a flat-file database. It's a way of storing data in a flat, that is, very
simple file that's just pure ASCII or Unicode text. And more compellingly, I dare say, is that with a CSV file,
it's completely portable. Something is portable in the world of computing if it means you can use it on a
Mac or a PC running this operating system, or this other one.

- [5:46:08](https://www.youtube.com/watch?v=undefined&t=56768s) And portability is nice because if I


were to download an Excel file, there'd be a whole bunch of people in this room and online who couldn't
download it because they haven't bought Microsoft Excel or installed it. Or if they have a Mac, or if it's
a .numbers file in the Mac world, a PC user might not be able to download it.

- [5:46:23](https://www.youtube.com/watch?v=undefined&t=56783s) So a CSV is indeed very portable.


So I'm going to go ahead and download, quite simply, the CSV version of this file. That's going to put it
onto my own Mac's Downloads folder. And let me go ahead here, and in just a moment, let me just
simplify the name. Because it actually downloads it with a pretty long name.

- [5:46:40](https://www.youtube.com/watch?v=undefined&t=56800s) And give me just one moment


here, and you'll see that, indeed, on my Mac I have a file called favorites.csv. I shortened the name real
quick. And now what I'm going to do is go over to VS Code, and in VS Code, I'm going to open my File
Explorer. And if I minimize my window here for a moment, a handy feature of VS Code is that you can
just drag and drop a file, for instance, into your Explorer.

- [5:47:03](https://www.youtube.com/watch?v=undefined&t=56823s) And voila, it's going to


automatically upload it for you. So let me go ahead and full screen here, close my Explorer, temporarily
close my Terminal window. And you'll see here a CSV file, favorites.csv. And the first row, by convention,
has whatever the columns were in Google Spreadsheets, or Office 365, in Excel online, timestamp,
comma, title, comma, genres.

- [5:47:24](https://www.youtube.com/watch?v=undefined&t=56844s) Then, we have timestamps, which


indicates when people started submitting. Looks like a couple of people were super eager to get started
an hour or two ago. And then, you have the title next, after a comma. But there's kind of a curiosity after
that. Sometimes I see the genre like comedy, comedy, comedy, but sometimes it's like crime, comma,
drama, or action, comma, crime, comma, drama.

- [5:47:46](https://www.youtube.com/watch?v=undefined&t=56866s) And those things are quoted. And


yet, I didn't do any quotes. You probably didn't type any quotes. Where are those quotes coming from in
this CSV file? Why are they there if we infer? Yeah? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah, so
you have a corner case, if you will. Because if you're using commas, as you described, to separate your
data into what are effectively columns, well, you've painted yourself into a corner if your actual data has
commas in it itself.

- [5:48:14](https://www.youtube.com/watch?v=undefined&t=56894s) So what Google has done, what


Microsoft does, what Apple does is, they quote any strings of text that themselves have commas so that
these are now English grammatical commas, not CSV specific commas. So it's a way of escaping your
data, if you will. And escaping just means to call out a symbol in a special way so it's not misinterpreted
as something else.
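
To make the quoting behavior concrete, here is a small sketch using Python's `csv` module and an in-memory buffer. The two sample rows are hypothetical, not actual form responses.

```python
import csv
import io

# Two hypothetical form responses; the second lists two genres.
rows = [
    ["10/31/2022 10:00:00", "The Office", "Comedy"],
    ["10/31/2022 10:00:05", "Breaking Bad", "Crime, Drama"],
]

# Write the rows as CSV into an in-memory buffer instead of a real file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()
print(text)  # the "Crime, Drama" field is wrapped in double quotes

# Reading it back yields three fields per row; the quoted comma survives.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed[1])
```

The writer only quotes fields that need it, which matches what the downloaded file shows: single genres are bare, multi-genre values are quoted.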

- [5:48:34](https://www.youtube.com/watch?v=undefined&t=56914s) All right, so this is all to say that


we now have all of this data with which we can play in the form of what we'll start calling a flat-file
database. So suppose I wanted to now start manipulating this data, and I want to store it ultimately,
indeed, in this CSV format. How can I actually start to read this data, maybe clean it up, maybe do some
analytics on it and actually figure out, what's the most popular show among those who submitted here
over the past few minutes? Well, let me go ahead and close this.

- [5:49:00](https://www.youtube.com/watch?v=undefined&t=56940s) Let me go ahead, then, and open


up, for instance, just my Terminal window. And let's code up a file called favorites.py. And let's go ahead
and iteratively start simple by just opening up this file and printing out what's inside of it. So you might
recall that we can do this by doing something like import CSV to give myself some CSV reading
functionality.
- [5:49:20](https://www.youtube.com/watch?v=undefined&t=56960s) Then, I can go ahead and do
something like with open, the name of the file that I want to open in read mode. Quote, unquote, "r"
means to read it. And then, I can say as file, or whatever other name for a variable to say that I want to
open this file, and essentially store some kind of reference to it in that variable called file.

- [5:49:39](https://www.youtube.com/watch?v=undefined&t=56979s) Then, I can give myself a reader,


and I can say csv.reader, passing in that file as input. And this is the magic of that library. It deals with the
process of opening it, reading it, and giving you back something that you can just iterate over, like with a
for loop. I do want to skip the first row, and recall that I can do this.

- [5:49:56](https://www.youtube.com/watch?v=undefined&t=56996s) next(reader) is this little trick that


just says, ignore the first row. Because the first one is special. It said timestamp, title, genres. That's not
your data, that was mine. But this means now that I've skipped that first row. Everything hereafter is
going to be the title of a show that you all like, so let me do this.

- [5:50:12](https://www.youtube.com/watch?v=undefined&t=57012s) For row in the reader, let's go


ahead and print out the title of the show each of you typed in. How do I get at the title of the show each
of you typed in? It's somewhere inside of row. Row, recall, is a list. So what do I want to type next in order
to get at the title of the current row just as a quick check here? What do I want to type to get at the title
of the row, keeping in mind, again, that it was timestamp, title, genres? Yeah? AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: So row bracket 1 would give me

- [5:50:44](https://www.youtube.com/watch?v=undefined&t=57044s) the second column, zero-indexed, that


is, the one in the middle with the title. So this program isn't that interesting yet, but it's a quick and dirty
way to figure out, all right, what's my data look like? Let me actually just do a little bit of a check here
and see if it contains the data I think it does.
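
The first version of favorites.py might look roughly like this. To keep the sketch self-contained, it reads from a hypothetical in-memory sample rather than the real favorites.csv.

```python
import csv
import io

# Hypothetical stand-in for favorites.csv so the sketch runs on its own.
SAMPLE = """Timestamp,title,genres
10/31/2022 10:00:00,How i met your mother,Comedy
10/31/2022 10:00:05,Friends,Comedy
10/31/2022 10:00:09,Breaking Bad,"Crime, Drama"
"""

# With the real file this would be: with open("favorites.csv", "r") as file:
with io.StringIO(SAMPLE) as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    # Each row is a list; index 1 (zero-indexed) is the title column.
    titles = [row[1] for row in reader]

for title in titles:
    print(title)
```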

- [5:50:58](https://www.youtube.com/watch?v=undefined&t=57058s) Let me maximize my Terminal


window here. Let me run Python of favorites.py, hitting Enter. And you'll see now a purely textual list of
all of the shows you all seem to like here. But what's noteworthy about it? Specific shows aside,
judgment aside as to people's TV tastes, what's interesting or noteworthy about the data that might
create some problems for us if we start to analyze this data, and figure out what's the most popular?
How many people like this or that? What do you think? Yeah?

- [5:51:29](https://www.youtube.com/watch?v=undefined&t=57089s) AUDIENCE: User errors


[INAUDIBLE]. DAVID J. MALAN: Yeah, there might be user errors, or just stylistic differences that give the
appearance that one show is different from the other. For instance, here. Let's see if I can see an
example on the screen here. Yeah, so friends here is in all lowercase, Friends here is capitalized.

- [5:51:50](https://www.youtube.com/watch?v=undefined&t=57110s) No big deal. We can sort of


mitigate that. But this is just a tiny example of where data in the real world can get messy fast. And that
probably wasn't even a typo. It was just someone not caring as much to capitalize it, and that's fine. Your
users are going to type what they're going to type.

- [5:52:05](https://www.youtube.com/watch?v=undefined&t=57125s) So let's see if we can't now begin


to get at more specific data, and maybe even clean some of this data up. Let me go back into my file
called favorites.py here, and let's actually do something a little more user friendly for me. Instead of a
reader, recall that there was this dictionary reader that's just a little more user friendly.
- [5:52:25](https://www.youtube.com/watch?v=undefined&t=57145s) And it means I can type in
dictionary reader here, passing in the same file. But now, when I iterate over this reader variable, what is
each row? When using a DictReader instead of a reader, recall, and this is just a peculiarity of the CSV
library, this gives me back, not a list of cells, but what instead, which is marginally more user friendly for
me? Yeah? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah.

- [5:52:54](https://www.youtube.com/watch?v=undefined&t=57174s) I can now use open bracket,


quotes, and the title. Because what's coming back now is a dict object, that is, a dictionary which has
keys and values. The keys of which are the column headings. The values of which are the data I actually
care about. So this is just marginally better because, one, it's just way more obvious to me, the author of
this code, what it is I'm getting at.

- [5:53:13](https://www.youtube.com/watch?v=undefined&t=57193s) I don't remember what column


the title was. Was it 0? Was it 1? Was it 2? That's something you're going to forget over time. And God
forbid someone changes the data by just dragging and dropping the columns in Excel, or Apple Numbers,
or Google Spreadsheets. That's going to break all of your numeric indices.

- [5:53:27](https://www.youtube.com/watch?v=undefined&t=57207s) And so a dictionary reader is


arguably just better design because it's more robust against changes and potential errors like that. Now
the effect of this change isn't going to be really any different. If I run Python of favorites.py, voila, I get all
of the same results. But I've now not made any assumptions as to where each of the columns actually is
numerically.
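
A short sketch of why looking up columns by name is more robust. The sample data and the helper name `read_titles` are hypothetical. The same code returns the same titles even after the columns are rearranged.

```python
import csv
import io


def read_titles(text):
    """Return the title column from CSV text, looked up by header name."""
    reader = csv.DictReader(io.StringIO(text))  # first row supplies the keys
    return [row["title"] for row in reader]


ORIGINAL = (
    "Timestamp,title,genres\n"
    "10/31/2022 10:00:00,Friends,Comedy\n"
    "10/31/2022 10:00:05,The Office,Comedy\n"
)

# Same data, but with the columns dragged into a different order.
REORDERED = (
    "title,genres,Timestamp\n"
    "Friends,Comedy,10/31/2022 10:00:00\n"
    "The Office,Comedy,10/31/2022 10:00:05\n"
)

print(read_titles(ORIGINAL))
print(read_titles(REORDERED))
```

With `csv.reader` and `row[1]`, the reordered file would have returned genres instead of titles.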

- [5:53:48](https://www.youtube.com/watch?v=undefined&t=57228s) All right. Well, let's go ahead and


now filter out some duplicates. Because there's a lot of commonality among some of the shows here, so
let's see if we can't filter out duplicates. If I'm reading a CSV file top to bottom, what intuitively might be
the logic I want to implement to filter out duplicates? It's not going to be quite as simple as a simple
function that does it for me.

- [5:54:10](https://www.youtube.com/watch?v=undefined&t=57250s) I'm going to have to build this. But


logically, if you're reading a file from top to bottom, how might you go about, in Python or just any
context, getting rid of duplicate values? Yeah, what do you think? AUDIENCE: [INAUDIBLE] DAVID J.
MALAN: Sure. I could use a list and I could add each title to the list, but first check if I put this into the list
before.

- [5:54:38](https://www.youtube.com/watch?v=undefined&t=57278s) So let's try a little something like


that. Let me go ahead and create a variable at the top of my program here. I'll call it titles, for instance,
initialize to an empty list, open bracket, close bracket. And then, inside of my loop here, instead of
printing it out, let's start to make a decision.

- [5:54:55](https://www.youtube.com/watch?v=undefined&t=57295s) So if the current row's title is in


the titles list I don't want to put it there. And actually, let me invert the logic so I'm doing something
proactively. So if it's not the case that row bracket title is in titles, then, go ahead and do something like
titles.append the current row's title.

- [5:55:22](https://www.youtube.com/watch?v=undefined&t=57322s) And recall that we saw .append a


week or so ago, where it just allows you to append to the current list. And then, what can I do at the very
end, after I'm all done reading the whole file? Why don't I go ahead and say, for title in titles, go ahead
and print out the current title? So it's two loops now, and we can come back to the quality of that design.
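
The list-based approach to filtering duplicates can be sketched as follows, again over a hypothetical in-memory sample rather than the real file.

```python
import csv
import io

# Hypothetical sample with one exact duplicate and one near duplicate.
SAMPLE = """Timestamp,title,genres
10/31/2022 10:00:00,Friends,Comedy
10/31/2022 10:00:05,The Office,Comedy
10/31/2022 10:00:09,Friends,Comedy
10/31/2022 10:00:12,friends,Comedy
"""

titles = []
with io.StringIO(SAMPLE) as file:
    reader = csv.DictReader(file)
    for row in reader:
        # Only remember a title the first time it is seen.
        if row["title"] not in titles:
            titles.append(row["title"])

# Second loop: print what was collected.
for title in titles:
    print(title)
```

The exact duplicate is dropped, but "friends" survives next to "Friends" because the comparison is case-sensitive.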

- [5:55:42](https://www.youtube.com/watch?v=undefined&t=57342s) But let me go ahead here and


rerun Python of favorites.py. Let me increase the size of my Terminal window so we can focus just on
this, and hit Enter. And now, I'm just skimming. I don't think I'm seeing duplicates, although I am seeing
some near duplicates. For instance, there's Friends again.

- [5:56:02](https://www.youtube.com/watch?v=undefined&t=57362s) And if we keep going, and going,


and going, and going, there's Friends again. Oh, interesting, so that's curious that I seem to have multiple
Friends, and I have this one here, too. So how might we clean this up further? I like your instincts, and
it's a step closer to it. What are we going to have to do to really filter out those near duplicates? Any
thoughts? AUDIENCE: You could set everything to lower [INAUDIBLE].

- [5:56:29](https://www.youtube.com/watch?v=undefined&t=57389s) DAVID J. MALAN: Yeah. What are


the common mistakes to summarize? We could ignore the capitalization altogether and maybe just force
everything to lowercase, or everything to uppercase. Doesn't matter which, but let's just be consistent.
And for those of you who might have accidentally or instinctively hit the spacebar at the beginning of
your input or even at the end, we can strip that off, too.

- [5:56:47](https://www.youtube.com/watch?v=undefined&t=57407s) Stripping whitespace is a common


thing just to clean up user input. So let me go back into my code here, and let me go ahead and tweak
the title a little bit. Let me say that the current title inside of this loop is not going to be just the current
row's title. But let me go ahead and strip off, from the left and the right implicitly, any whitespace.

- [5:57:07](https://www.youtube.com/watch?v=undefined&t=57427s) If you read the documentation for


the strip function, it does just that. It gets rid of whitespace to the left, whitespace to the right. And
then, if I want to force everything to maybe uppercase, I can just uppercase the entire string. And
remember, what's handy about Python is you can chain some of these function calls together by just
using dots again and again.

- [5:57:25](https://www.youtube.com/watch?v=undefined&t=57445s) And that just takes whatever just


happened, like the whitespace got stripped off, then, it additionally uppercases the whole thing as well.
So now, I'm going to just check whether this specific title is in titles. And if not, I'm going to go ahead and
append that title, massaged into this different format, if you will.
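
A minimal illustration of chaining `strip` and `upper` before the duplicate check. The list of raw inputs is hypothetical.

```python
# Hypothetical raw inputs with stray spaces and mixed capitalization.
raw_titles = ["Friends", " friends", "FRIENDS ", "The Office"]

titles = []
for raw in raw_titles:
    # strip() removes leading/trailing whitespace; upper() then uppercases.
    title = raw.strip().upper()
    if title not in titles:
        titles.append(title)

print(titles)
```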

- [5:57:43](https://www.youtube.com/watch?v=undefined&t=57463s) So I'm throwing away some


information. I'm sacrificing all of the nuances of your grammar and input to the form itself. But at least
I'm trying to canonicalize, that is, standardize what the data actually looks like. So let me go ahead
and run Python of favorites.py again and hit Enter. Oh, and this is just user error.

- [5:58:01](https://www.youtube.com/watch?v=undefined&t=57481s) Maybe you haven't seen this


before. This just looks like a mistake on my part. I meant to say not even uppercase. That's completely
wrong. The function is called upper, now that I think of it. All right. Let's go and increase the size of the
Terminal window again. Run Python of favorites.py. And now, it's a little more overwhelming to look at
because it's not sorted yet and it's all capitalized.
- [5:58:23](https://www.youtube.com/watch?v=undefined&t=57503s) But I don't think I'm seeing
multiple Friends, so to speak. There's one Friends up here and that's it. I'm back up at my prompt
already. So we seem now to be filtering out duplicates. Now, before we dive in further and clean this up
further than this, what else could we have done? Well, it turns out that in Python, too, you often do get a lot
of functionality built into the language.

- [5:58:45](https://www.youtube.com/watch?v=undefined&t=57525s) And I'm kind of implementing


myself the idea of a set. If you think back to mathematics, a set is typically something with a bunch of
values that has duplicates filtered out. Recall that Python already has this for us. And we saw it really
briefly when I whipped up the dictionary implementation a couple of weeks back.

- [5:59:02](https://www.youtube.com/watch?v=undefined&t=57542s) So I could actually define my titles


to be a set instead of a list, and this would just modestly allow me to refine my code here, such that I
don't have to bother checking for duplicates anyway. I can instead just say something like, titles.add the
current title, like this. Marginally better design if you know that a set exists because you're just getting
more functionality out of this.
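
The same cleanup with a set instead of a list, sketched over hypothetical inputs. The membership check disappears because a set ignores repeated additions.

```python
# Hypothetical raw inputs with stray spaces and mixed capitalization.
raw_titles = ["Friends", " friends", "FRIENDS ", "The Office"]

titles = set()
for raw in raw_titles:
    # Adding a value that is already present has no effect.
    titles.add(raw.strip().upper())

print(titles)
```

One trade-off worth noting: a set does not preserve the order in which titles were added, whereas the list version did.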

- [5:59:26](https://www.youtube.com/watch?v=undefined&t=57566s) All right, so let's clean the data up


further. We've now gone ahead and fixed the problem of case sensitivity. We threw away whitespace in
case someone had hit the spacebar with some of the input. Let's go ahead now and sort these things by
the titles themselves. So instead of just printing out the titles in the same order you all inputted them,
but filtering out duplicates as we go, let me go ahead and use another function in Python you might not
have seen, which is literally called sorted, and will

- [5:59:52](https://www.youtube.com/watch?v=undefined&t=57592s) take care of the process of


actually sorting titles for you. Let me go ahead and increase the font size of my Terminal, run Python of
favorites.py, and hit Enter. And now you can really see how many of these shows start with the word
"the" or do not. Now it's a little easier to wrap our minds around, just because it's at least sorted
alphabetically.
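
A brief sketch of the sorting step, using a hypothetical set of already-cleaned titles. `sorted` returns a new list and leaves the original collection unchanged.

```python
# Hypothetical cleaned-up titles.
titles = {"THE OFFICE", "FRIENDS", "AVATAR", "BREAKING BAD"}

# sorted() accepts any iterable and returns a new, alphabetized list.
for title in sorted(titles):
    print(title)
```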

- [6:00:12](https://www.youtube.com/watch?v=undefined&t=57612s) But now you can really see some


of the differences in people's inputs. So far, so good. But a few of you decided to stylize Avatar in three
different ways here. Brooklyn 99 is a couple of different ways here. And I think if we keep going we'll see
further and further variances that we did not fix by focusing on whitespace and capitalization alone.

- [6:00:32](https://www.youtube.com/watch?v=undefined&t=57632s) So already here, this is only, what,


100 plus, 200 rows. Already real-world data starts to get messy quickly, and that might not bode well
when we actually want to keep around real data from real users. You can imagine an actual website or a
mobile application dealing with this kind of thing on scale. Well, let's go ahead and do this.

- [6:00:48](https://www.youtube.com/watch?v=undefined&t=57648s) Let's actually figure out the


popularity of these various shows by now iterating over my data, and keeping track of how many of you
inputted a given title. We're going to ignore the problems like Brooklyn 99 and the Avatar. Sorry, yeah,
Avatar, where there were things that were different beyond just whitespace and capitalization.

- [6:01:12](https://www.youtube.com/watch?v=undefined&t=57672s) But let's go ahead and keep track


of, now, how many of you inputted each of these titles. So how can I do this? I'm still going to take this
approach of iterating over the CSV file from top to bottom. We've used a couple of data structures thus
far, a list to keep track of titles, or a set to keep track of titles.

- [6:01:30](https://www.youtube.com/watch?v=undefined&t=57690s) But what if I now want to keep


around a little more information? For each title, I want to keep around how many times I've seen it
before. I'm not doing that yet. I'm throwing away the total number of times I see these shows. How
could I start to keep that around? AUDIENCE: Use a dictionary. DAVID J.

- [6:01:49](https://www.youtube.com/watch?v=undefined&t=57709s) MALAN: We could use a


dictionary, and how? Elaborate on that. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Perfect, really good
instincts. Using a dictionary, insofar as it lets us store keys and values, that is, associate something with
something else. This is why a dictionary or hash tables more generally are such a useful, practical data
structure. Because they just let you remember stuff in some kind of structured way.

- [6:02:08](https://www.youtube.com/watch?v=undefined&t=57728s) So if the keys are going to be the


titles I've seen, the values could be the number of times I've seen each of those titles. And so it's kind of
like just having a two-column table on paper. For instance, if I were going to do this on a piece of paper, I
might just have two columns here, where maybe this is the title that I've seen, and this is the count over
here.

- [6:02:30](https://www.youtube.com/watch?v=undefined&t=57750s) This is, in effect, a dictionary in


Python. It's two columns, keys on the left, values on the right. And this, if I can implement in code, will
actually allow me to store this data, and then maybe do some simple arithmetic to figure out which is
the most popular. So let's do this. Let me go ahead and change my titles to not be a list, not be a set.

- [6:02:50](https://www.youtube.com/watch?v=undefined&t=57770s) Let's have it be a dictionary


instead, either doing this, or more succinctly, two curly braces that are empty gives me an empty
dictionary automatically. What do I now want to do? I think most of my code can stay the same. But
down here, I don't want to just blindly add titles to the data structure.

- [6:03:08](https://www.youtube.com/watch?v=undefined&t=57788s) I somehow need to keep track of


the count. And unfortunately, if I just do this-- let's do titles, bracket, title, plus equals 1. This is a
reasonable first attempt at this. Because what am I doing? If titles is a dictionary and I want to look up
the current title therein, the syntax for that, like before, is titles, bracket, and then the key you want to
use to index into the dictionary.

- [6:03:34](https://www.youtube.com/watch?v=undefined&t=57814s) It's not a number in this case, it's


an actual word, a title. And you're just going to increment it by one, and then eventually I'll come back
and finish my second loop and do things in terms of the order. But for now, let's just keep track of the
total counts. Let me go ahead and increase my Terminal window.

- [6:03:51](https://www.youtube.com/watch?v=undefined&t=57831s) Let me do Python of favorites.py


and hit Enter. Huh. How I Met Your Mother is giving me a key error. What does that mean? And why am I
seeing this? And in fact, just to give a little bit of a breadcrumb here, let me zoom out here. Let me open
up the CSV file again real quickly. And wow, we didn't even get past the second row in the file or the first
show in the file.
- [6:04:18](https://www.youtube.com/watch?v=undefined&t=57858s) Notice that How I Met Your
Mother, somewhat lowercased, is the very first show therein. What's your instinct for why this is
happening? AUDIENCE: You don't have a starting point. DAVID J. MALAN: I don't have a starting point.
I'm adding one to what? I'm blindly indexing into the dictionary using a key, How I Met Your Mother, that
doesn't yet exist in the dictionary.
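To make the failure concrete, here is a minimal sketch that reproduces it on a small hypothetical list of titles (the real program reads them from the CSV file instead). The `try`/`except` is only there so the example prints the problem rather than stopping with a traceback.

```python
# Hypothetical sample data standing in for the titles read from the CSV.
sample_titles = ["HOW I MET YOUR MOTHER", "THE OFFICE", "THE OFFICE"]

titles = {}  # empty dictionary: no keys exist yet
failed_key = None

try:
    for title in sample_titles:
        # += has to read titles[title] before adding 1,
        # and that read fails when the key is not there yet.
        titles[title] += 1
except KeyError as error:
    failed_key = error.args[0]
    print(f"KeyError: {failed_key!r} is not in the dictionary yet")
```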

- [6:04:37](https://www.youtube.com/watch?v=undefined&t=57877s) And so Python throws what's


called a key error because the key you're trying to use just doesn't exist yet. So logically, how could we fix
this? We're close. We got half of the problem solved, but I'm not handling the obvious, now, case of
nothing being there. Yeah? AUDIENCE: Creating a counter.

- [6:04:54](https://www.youtube.com/watch?v=undefined&t=57894s) DAVID J. MALAN: Creating a--


AUDIENCE: Counter. DAVID J. MALAN: Creating the counter itself. So maybe I could do something like
this. Let me close my Terminal window and let me ask a question first. If the current title is in the
dictionary already, if title in titles, that's going to give me a true-false answer it turns out.

- [6:05:13](https://www.youtube.com/watch?v=undefined&t=57913s) Then, I can safely say, titles,


bracket, title, plus equals 1. And recall, this is just shorthand notation for the same thing as in C, title plus
1. Whoops, typo. Don't do that. That's the same thing as this but it's a little more succinct just to say plus
equals 1. Else, if it's logically not the case that the current title is in the titles dictionary, then I probably
want to say titles, bracket, title equals? Feel free to just shout it out.

- [6:05:41](https://www.youtube.com/watch?v=undefined&t=57941s) AUDIENCE: Zero. DAVID J.


MALAN: Zero. I just have to put some value there so that the key itself is also there. All right. So now that
I've got this going on, let me go ahead and undo my sorting temporarily. And now let me go ahead and
do this. I can, as a quick check, let me go ahead and just run the code as is, Python of favorites.py.

- [6:06:00](https://www.youtube.com/watch?v=undefined&t=57960s) I'm back in business. It's printing


correctly, no key errors, but it's not sorted. And I'm not seeing any of the counts. Let me just quickly add
the counts, and there's a couple of ways I could do this. I could, say, print out the title, and then, maybe,
let's do something like-- how about just, comma, titles, bracket, title? So I'm going to print two things at
once, both the current title in the dictionary, and whatever its value is by indexing into it.

- [6:06:30](https://www.youtube.com/watch?v=undefined&t=57990s) Let me increase my Terminal


window. Let me run Python of favorites.py, Enter, and OK. Huh. Huh. None of you said a whole lot of TV
shows, it seems. What's the logical error here? What did I do wrong if I look back at my code here?
Yeah? Why so many 0s? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Exactly. To summarize, I initialized the
count to 0 the first time I saw it, but I should have initialized it at least to 1 because I just saw it.

- [6:07:04](https://www.youtube.com/watch?v=undefined&t=58024s) Or I should change my code a bit.


So for instance, if I go back in here, the simplest fix is probably to initialize to 1, because on this iteration
of the loop, obviously, I'm seeing this title for the very first time. Or I could change my logic a little bit. I
could do something like this instead. If the current title is not in titles, then I could initialize it to 0.

- [6:07:24](https://www.youtube.com/watch?v=undefined&t=58044s) And then I could get rid of the


else, and now blindly index into the titles dictionary. Because now, on line 11, I can trust that lines 9 and
10 took care of the initialization for me if need be. Which one is better? I don't know. This one's a little
nicer, maybe because it's one line fewer. But I think both approaches are perfectly reasonable and well-
designed.

- [6:07:46](https://www.youtube.com/watch?v=undefined&t=58066s) But the key thing, no pun


intended, is that we have to make sure the key exists before we presume to actually increment-- Oh, this is
wrong. This is incorrect code. What did I do wrong? OK, yes. There we go. So otherwise, everyone would
have liked this show once, and no matter how many people said the same thing.
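The two fixes discussed here can be sketched side by side. The sample list is hypothetical; the point is only that both versions end up with the same counts, and that in the second version the increment must sit outside the `if`.

```python
# Hypothetical sample data standing in for the titles read from the CSV.
sample_titles = ["THE OFFICE", "FRIENDS", "THE OFFICE", "THE OFFICE"]

# Approach 1: if/else, initializing to 1 the first time a title is seen.
counts_a = {}
for title in sample_titles:
    if title in counts_a:
        counts_a[title] += 1
    else:
        counts_a[title] = 1

# Approach 2: initialize to 0 if needed, then always increment.
counts_b = {}
for title in sample_titles:
    if title not in counts_b:
        counts_b[title] = 0
    counts_b[title] += 1  # not indented under the if, so it runs every time

print(counts_a)
print(counts_b)
```

If the increment in the second approach were indented under the `if`, every title would be counted exactly once no matter how often it appeared.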

- [6:08:05](https://www.youtube.com/watch?v=undefined&t=58085s) Now the code is as it should be.


So let me go ahead and open up my Terminal window again. Let me run Python of favorites.py, and now
we see more reasonable counts. Some shows weren't that popular. There's just 1s and maybe 2s. But I
bet if we sort these things we can start to see a little more detail.

- [6:08:22](https://www.youtube.com/watch?v=undefined&t=58102s) So how else can we do this? Well,


turns out, when dealing with a dictionary like this-- let's go ahead and just sort the titles themselves. So
let's reintroduce the sorted function as I did before, but no other changes. Let me go ahead now and run
Python of favorites.py. Now it's just a little easier to wrap your mind around it because at least it's
alphabetical.

- [6:08:44](https://www.youtube.com/watch?v=undefined&t=58124s) But it's not sorted by value, it's


sorted by key. But sure enough, if we scroll down, there's something down here, for instance, like, let's
see, The Office. That's definitely going to be a contender for most popular, 15 responses. But let's see
what's actually going to bubble up to the top.

- [6:09:01](https://www.youtube.com/watch?v=undefined&t=58141s) Unfortunately, the sorted


function only sorts dictionaries by keys by default, not by values. But it turns out, in Python, if you read
the documentation for the sorted function, you can actually pass in other arguments that tell it how to
sort things. For instance, if I want to do things in reverse order, I can add a second parameter to the
sorted function called reverse.

- [6:09:27](https://www.youtube.com/watch?v=undefined&t=58167s) And it's a named parameter. You


literally say, reverse equals true, so that the position of it in the comma-separated list doesn't matter. If I
now rerun this after increasing my Terminal window, you'll see now that it's in the opposite order. Now
adventure and Anne with an E is at the bottom of the output instead of the top.
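A minimal sketch of that behavior, using a small made-up dictionary of counts: passing a dictionary to `sorted` yields its keys in alphabetical order, and `reverse=True` flips that order.

```python
# Hypothetical counts; the real ones come from the CSV file.
counts = {"FRIENDS": 2, "ANNE WITH AN E": 1, "THE OFFICE": 3}

# Sorting a dictionary sorts its keys, here alphabetically.
alphabetical = sorted(counts)

# reverse is a named parameter, so it is passed as reverse=True.
reverse_alphabetical = sorted(counts, reverse=True)

for title in reverse_alphabetical:
    print(title, counts[title])
```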

- [6:09:44](https://www.youtube.com/watch?v=undefined&t=58184s) How can I tell it to sort by values


instead of by key? Well, let's go ahead and do this. Let me go ahead and define a function. I'm just going
to call it f to keep things simple. And this f function is going to take a title as input. And given a
title, it's going to return the value of that title.

- [6:10:06](https://www.youtube.com/watch?v=undefined&t=58206s) So actually, maybe a better name


for this would be get value, and/or we could come up with something else as well. The purpose of the
get value function, to be clear, is to take as input a title and then return the corresponding value. Why
is this useful? Well, it turns out that the sorted function in Python, according to its documentation, also
takes a key parameter, where you can pass in, crazy enough, the name of a function that it will use in
order to determine what it should sort by, by the key,
- [6:10:37](https://www.youtube.com/watch?v=undefined&t=58237s) or by the value, or in other cases,
even other types of data as well. So there's a curiosity here, though, that's very deliberate. Key is the
name of the parameter, just like reverse was the name of this other parameter. The value of it, though, is
not a function call. It's a function name. Notice I am not doing this, no parentheses.

- [6:10:56](https://www.youtube.com/watch?v=undefined&t=58256s) I'm instead passing in get value,


the function I wrote, by its name. And this is a feature of Python and certain other languages. Just like
variables, you can actually pass whole functions around so that they can be called for you later on by
someone else. So what this means is that the sorted function written by Python, they didn't know what
you're going to want to sort by today.

- [6:11:17](https://www.youtube.com/watch?v=undefined&t=58277s) But if you provide them with a


function called get value, or anything else, now their sorted function will use that function to determine,
OK, if you don't want to sort by the key of the dictionary, what do you want to sort by? This is going to
tell it to sort by the value by returning the specific value we care about.
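Here is a sketch of sorting by value with a named function. The spoken "get value" is written as `get_value` below, which is an assumption about the on-screen spelling, and the counts are made up.

```python
# Hypothetical counts; the real ones come from the CSV file.
counts = {"FRIENDS": 2, "ANNE WITH AN E": 1, "THE OFFICE": 3}


def get_value(title):
    """Given a title (a key), return its count (the value)."""
    return counts[title]


# Pass the function by name, with no parentheses: sorted calls it once
# per key and orders the keys by whatever it returns.
ranked = sorted(counts, key=get_value, reverse=True)

for title in ranked:
    print(title, counts[title])
```

Writing `key=get_value()` instead would call the function immediately (and fail, since it expects a title) rather than handing the function itself to `sorted`.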

- [6:11:34](https://www.youtube.com/watch?v=undefined&t=58294s) So let me go ahead now and


rerun this after increasing my Terminal, Python of favorites.py, Enter. Here we have now an example of
all of the titles you all typed in, albeit forced to uppercase and with any whitespace thrown out. And
now, The Office is an easy win over Friends, versus Community, versus Game of Thrones, Breaking Bad,
and then a lot of variants thereafter.

- [6:11:55](https://www.youtube.com/watch?v=undefined&t=58315s) So there's a lot of steps to go


through. This isn't that bad once you've done it once, and you know what these functions are, and you
know that these parameters exist. But it's a lot of work. That's 17 lines of code just to analyze a CSV file
that you all created by way of those Google Form submissions.

- [6:12:11](https://www.youtube.com/watch?v=undefined&t=58331s) But it took me a lot of work just


to get simple answers out of it. And indeed, that's going to be among the goals for today, ultimately, is,
how can we just make this easier? It's one thing to learn new things in Python, but if we can avoid
writing code, or this much code, that's going to be a good thing.

- [6:12:24](https://www.youtube.com/watch?v=undefined&t=58344s) And so one other technique we


can introduce here that does allow us to write a little less code is, we can actually get rid of this function.
It turns out, in Python, if you just need to make a function but it's going to be used and then essentially
thrown away, it's not something you're going to be reusing in multiple places-- it's not like a library
function that you want to keep around-- you can actually just do this.

- [6:12:45](https://www.youtube.com/watch?v=undefined&t=58365s) You can change the value of this


key parameter to be what's called a lambda function, which is a fancy way of saying a function that
technically has no name. It's an anonymous function. Why does it have no name? Well, it's kind of stupid
that I invented this name on line 13. I used it on line 16, and then I never again used it.

- [6:13:04](https://www.youtube.com/watch?v=undefined&t=58384s) If it's only being used in one


place, why bother giving it a name at all? So if you instead, in Python, say lambda, and then type out the
name of the parameter you want this anonymous function to take, you can then say, go ahead and
return this value. Now let's notice the inconsistencies here. When you use this special lambda keyword
that says, hey Python, give me an anonymous function, a function with no name, it then says, Python,
this anonymous function will take one parameter.

- [6:13:32](https://www.youtube.com/watch?v=undefined&t=58412s) Notice there's no parentheses.


And that's deliberate, if confusing. It just tightens things up a little bit. Notice that there's no return
keyword, which similarly tightens things up a bit, albeit inconsistently. But this line of code I've just
highlighted is actually identical in functionality to this.

- [6:13:52](https://www.youtube.com/watch?v=undefined&t=58432s) But it throws away the word


[INAUDIBLE]. It throws away the word get value. It throws away the parentheses, and it throws away the
return keyword just to tighten things up. And it's well suited for a problem like this where I just want to
pass in a tiny little function that does something useful. But it's not something I'm going to reuse.
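The equivalence can be checked directly. This sketch sorts the same made-up counts twice, once with a named function and once with a lambda, and compares the results; `get_value` is an assumed spelling of the spoken name.

```python
# Hypothetical counts; the real ones come from the CSV file.
counts = {"FRIENDS": 2, "ANNE WITH AN E": 1, "THE OFFICE": 3}


def get_value(title):
    return counts[title]


# Named function, passed by name.
ranked_named = sorted(counts, key=get_value, reverse=True)

# Anonymous function: no name, no parentheses around the parameter,
# and no return keyword. The expression after the colon is the result.
ranked_lambda = sorted(counts, key=lambda title: counts[title], reverse=True)

print(ranked_named == ranked_lambda)
```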

- [6:14:08](https://www.youtube.com/watch?v=undefined&t=58448s) It doesn't need multiple lines to


take up space. It's just a nice, elegant one liner. That's all a lambda function does. It allows you to create
an anonymous function right then and there. And then the function you're passing it to, like sorted, will
use it as before. Indeed, if I run Python of favorites.py

- [6:14:26](https://www.youtube.com/watch?v=undefined&t=58466s) after growing my Terminal


window, the result is exactly the same. And we see at the bottom here all of those small results. Are there any
questions, then, on this syntax, on these ideas? The goal here has been to write a Python program that
just starts to analyze or clean up data like this. Yeah? AUDIENCE: [INAUDIBLE] DAVID J.

- [6:14:52](https://www.youtube.com/watch?v=undefined&t=58492s) MALAN: Could you use the


lambda if it's just returning immediately? It's really meant for one line of code, generally. So you don't
use the return keyword. You just say what it is you want to return. AUDIENCE: [INAUDIBLE] DAVID J.
MALAN: Good question. Could you do more in that one line if it's got to be a more involved algorithm?
Yes, but you would just ultimately return the value in question.

- [6:15:11](https://www.youtube.com/watch?v=undefined&t=58511s) In short, if it's getting at all


sophisticated you don't use the lambda function in Python. You go ahead and actually just define a name
for it, even if it's a one-off name. JavaScript, another language we'll look at in a few weeks, makes
heavier use, I dare say, of lambda functions. And those can actually be multiple, multiple lines, but
Python does not support that instinct.

- [6:15:31](https://www.youtube.com/watch?v=undefined&t=58531s) All right. So let's go ahead and do


one other thing. Office was clearly popping out of the code here quite a bit. Let's go ahead and write a
slightly different program that maybe just focuses on The Office for the moment, just focuses on The
Office. So let me go ahead and throw most of this code away, up until this point when I'm inside of my
inner loop.

- [6:15:48](https://www.youtube.com/watch?v=undefined&t=58548s) And let me go ahead, and I don't


even want the global variable here. All I want to do is focus on the current title. How could I detect if
someone likes The Office? Well, I could say something like-- how about this? So counter equals 0. We'll
just focus on The Office. If title equals, equals The Office, I could then go ahead and say, counter plus
equals 1.
- [6:16:13](https://www.youtube.com/watch?v=undefined&t=58573s) I don't need a key. There's no
dictionary involved now. It's just a simple integer variable. And then, down here I'll say something like,
number of people who like The Office is, whatever this value is. And I'll put in counter in curly braces,
and then I'll turn this whole thing into an F string.
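A minimal sketch of this simpler program, again with a hypothetical list in place of the CSV rows. The titles are uppercase because the earlier code forced them to uppercase.

```python
# Hypothetical sample data standing in for the cleaned-up titles from the CSV.
sample_titles = ["THE OFFICE", "FRIENDS", "THE OFFICE", "COMMUNITY"]

counter = 0  # a plain integer, no dictionary needed
for title in sample_titles:
    if title == "THE OFFICE":
        counter += 1

# An f-string substitutes the value of counter for {counter}.
message = f"Number of people who like The Office: {counter}"
print(message)
```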

- [6:16:31](https://www.youtube.com/watch?v=undefined&t=58591s) All right, let me go ahead and run


this. Python of favorites.py, Enter. Number of people who like The Office is 15. All right, so that's great.
But let's go ahead now and deliberately muddy the data a bit. All of you were very nice in that you typed
in The Office. But you can imagine someone just typing Office, for instance, maybe there, maybe there.

- [6:16:51](https://www.youtube.com/watch?v=undefined&t=58611s) And many people might just write


Office, you could imagine. Didn't happen here, but suppose it did, and probably would have if we had
even more and more submissions over time. Now let's go ahead and rerun this program, no changes to
the code. Now only 13 people like The Office. So let's fix this.

- [6:17:05](https://www.youtube.com/watch?v=undefined&t=58625s) The data is now as I mutated it to


have a couple Offices, and many The Offices. How could I change my Python code to now count both of
those situations? What could I change up here in order to improve this situation? Any thoughts? Yeah?
AUDIENCE: You write the title [INAUDIBLE]. DAVID J. MALAN: Yeah, so I could just ask two questions like
that.

- [6:17:30](https://www.youtube.com/watch?v=undefined&t=58650s) If title equals The Office, or title


equals, equals just Office, for instance. And I still don't have to worry about capitalization. I don't have
to worry about spaces because I at least threw that all away. Now I can go ahead and rerun this code. Let
me go run it a third time. OK, so we're back up to 15.
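Asking both questions at once might look like the sketch below. The list is hypothetical and simply mixes the two spellings.

```python
# Hypothetical sample data with both spellings of the show.
sample_titles = ["THE OFFICE", "OFFICE", "FRIENDS", "THE OFFICE", "OFFICE"]

counter = 0
for title in sample_titles:
    # Each side of "or" is a complete comparison in its own right.
    if title == "THE OFFICE" or title == "OFFICE":
        counter += 1

print(f"Number of people who like The Office: {counter}")
```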

- [6:17:48](https://www.youtube.com/watch?v=undefined&t=58668s) So I like that. You could imagine


this not scaling very well. Avatar had three different permutations, and there were some others if we dug
deeper that there might have been more variants. Could we do something a little more general purpose?
Well, we could do something like this. If Office in the title-- this is kind of a cool thing you can do with
Python.

- [6:18:09](https://www.youtube.com/watch?v=undefined&t=58689s) It's very English-like, just ask the


question, albeit tersely. This, interesting, just got me into trouble. Now, all of a sudden, we're up to 16.
Does anyone know what the other one is? AUDIENCE: Someone put V Office. DAVID J. MALAN: What
Office? AUDIENCE: Someone entered a V Office, [INAUDIBLE]. DAVID J. MALAN: Oh, interesting.

- [6:18:31](https://www.youtube.com/watch?v=undefined&t=58711s) Yes, so they hit The. OK.


[APPLAUSE] DAVID J. MALAN: OK. Someone did that, sure. So The V Office. OK, this one's actually going
to be hard to correct for. I can't really think of a general-- well, this is actually a good example of data
gets messy fast. And you could imagine doing something where, OK, we could have like 26 conditions if
someone said The A Office, or The B Office, right? You could imagine doing that.
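The trade-off between exact comparison and the `in` operator can be sketched as follows. The exact spelling of the typo is an assumption (written here as `THEVOFFICE`, with a V where the space should be), and `OFFICE SPACE` is a made-up title added to show the risk of over-matching.

```python
# Hypothetical sample data. THEVOFFICE is an assumed spelling of the typo;
# OFFICE SPACE is invented to show a title that merely contains the word.
sample_titles = ["THE OFFICE", "OFFICE", "THEVOFFICE", "OFFICE SPACE", "FRIENDS"]

exact = 0
substring = 0
for title in sample_titles:
    if title == "THE OFFICE" or title == "OFFICE":
        exact += 1
    if "OFFICE" in title:  # true whenever OFFICE appears anywhere in the title
        substring += 1

print(exact, substring)
```

The substring test picks up the typo without 26 extra conditions, but it would also count any unrelated title that happens to contain the word.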

- [6:19:00](https://www.youtube.com/watch?v=undefined&t=58740s) But then there's surely going to


be other typos that are possible. So that's actually a hard one to fix. But it turns out we got lucky and
now this is actually the accurate count. But the data is itself messy. Let me show another way that just
adds another tool to our toolkit. It turns out that there's this feature in many programming languages,
Python among them, called regular expressions.
- [6:19:22](https://www.youtube.com/watch?v=undefined&t=58762s) And this is actually a really
powerful technique that we'll just scratch the surface of here. But it's going to be really useful, actually,
maybe toward final projects, in web programming, any time you want to clean up data or validate data.
And actually, just to make this clear, give me a moment before I switch screens here.

- [6:19:40](https://www.youtube.com/watch?v=undefined&t=58780s) And let me open up a Google


Form from scratch. Give me just a moment to create something real quick. If you've never noticed this
before when creating a Google Form, you can do a question. And if you want the user to type in
something very specific as a short text answer like this, you might know that there's toggles like this in
Google's world, like you can require it.

- [6:20:02](https://www.youtube.com/watch?v=undefined&t=58802s) Or you can do response


validation. You could say, what's your email? And then you could say something like, text is an email. So
here's an example in Google Forms how you can validate users' input. But a feature most of you have
probably never noticed, or cared about, or used, is this thing called a regular expression, where you can
actually define a pattern.

- [6:20:27](https://www.youtube.com/watch?v=undefined&t=58827s) And I could actually reimplement


that same idea by doing something like this. I can say, let the user type in anything represented by .star,
then an at sign, then something else, then a literal period, then, for instance, something else. So it's very
cryptic, admittedly, at first glance. But this means any character 0 or more times.

- [6:20:49](https://www.youtube.com/watch?v=undefined&t=58849s) This means any character 0 or more


times. This means a literal period, because apparently dot means any character in the context of these
patterns. Then this thing means any character 0 or more times. So I should actually be a little more nitpicky.
You don't want 0 or more times, you want 1 or more times. So this with the plus means any character 1
or more times.

- [6:21:11](https://www.youtube.com/watch?v=undefined&t=58871s) So there has to be something


there. And I think I want the same thing here 1 or more times, 1 or more times. Or heck, if I want to
restrict this form in some sense to edu addresses, I could change that last thing to literally .edu. And so
long story short, even though this looks, I'm sure, pretty cryptic, there's this mini language built into
Python, and JavaScript, and Java, and other languages that allows you to express patterns in a
standardized way.
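The same patterns can be tried in Python. This is a sketch, not what Google Forms runs internally: it assumes the form requires the whole answer to match, which `re.fullmatch` imitates, and the addresses are made up.

```python
import re

# Patterns described above: . is any character, * is 0 or more, + is 1 or more,
# and \. is a literal period.
LOOSE = r".*@.*\..*"     # lets each part be empty
STRICT = r".+@.+\..+"    # requires at least one character in each part
EDU_ONLY = r".+@.+\.edu"  # additionally requires the address to end in .edu


def matches(pattern, text):
    """Return True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# Hypothetical inputs for illustration.
print(matches(LOOSE, "@."))                     # nothing before or after the @
print(matches(STRICT, "@."))
print(matches(STRICT, "malan@example.com"))
print(matches(EDU_ONLY, "malan@example.com"))
print(matches(EDU_ONLY, "malan@harvard.edu"))
```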

- [6:21:38](https://www.youtube.com/watch?v=undefined&t=58898s) And this pattern is actually


something we can implement in code, too. And let me switch back to Python for a second just to do the
same kind of idea. Let me toggle back to my code here. Let me put up, for instance, a summary of what it
is you can do. And here's just a quick summary of some of the available symbols.

- [6:21:58](https://www.youtube.com/watch?v=undefined&t=58918s) A period may represent any


character. .star or .asterisks means 0 or more characters. So the dot means anything, so it can be A or
nothing. It can be B or nothing. It can be A, B, A, B, C. It can be any combination of 0 or more characters.
Change that to a plus and you now express one or more characters. Question mark means something is
optional.

- [6:22:21](https://www.youtube.com/watch?v=undefined&t=58941s) Caret symbol means start


matching at the beginning of the user's input. Dollar sign means stop matching at the end of the user's
input. So we won't play with all of these just now. But let me go over here and actually tackle this Office
problem. Let me go ahead and import a new library called the regular expression library, import re.
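Each symbol in that summary can be tried in isolation. The strings below are made up for illustration; `found` is a hypothetical helper that just turns the result of `re.search` into True or False.

```python
import re


def found(pattern, text):
    """Return True if the pattern occurs somewhere in the text."""
    return re.search(pattern, text) is not None


# .* is 0 or more of any character; .+ is 1 or more.
zero_ok = found("^A.*C$", "AC")        # nothing between A and C is allowed
one_needed = found("^A.+C$", "AC")     # at least one character is required
one_present = found("^A.+C$", "ABC")

# ? makes the preceding character optional.
optional_s = found("^OFFICES?$", "OFFICE")

# ^ anchors the match to the start, $ to the end.
at_start = found("^OFFICE", "OFFICE SPACE")
not_at_start = found("^OFFICE", "THE OFFICE")
at_end = found("OFFICE$", "THE OFFICE")

print(zero_ok, one_needed, one_present, optional_s, at_start, not_at_start, at_end)
```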

- [6:22:43](https://www.youtube.com/watch?v=undefined&t=58963s) And then, down here, let me say


this. If re.search, this pattern. Let's just search for Office, quote, unquote, in the current title. Then we're
going to go ahead and increase the counter. So it turns out that the regular expression library has a
function called search that takes as its first argument a pattern, and then, as its second argument the
string you want to analyze for that pattern.

- [6:23:10](https://www.youtube.com/watch?v=undefined&t=58990s) So it's sort of looking for a needle


in this haystack, from left to right. Let me go ahead now and run this version of the program, Enter. And
now I screwed up because I forgot my colon, but that's old stuff. Enter. Huh. Number of people who like
The Office is now 0. So this seems like a big-- thank you-- big step backwards.

- [6:23:31](https://www.youtube.com/watch?v=undefined&t=59011s) What did I do wrong? Yeah?


AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah. I forced all my input to uppercase, so I probably need to
do this. So we'll come back to other approaches there. Let me rerun it now. OK, now we're back up to
16. But I could even, let's say-- I could tolerate just The Office. How about this, or how about something
like, or The Office? Let me do this instead.
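The step backwards and its fix can be reproduced in a few lines. `re.search` takes the pattern first and the string to search second, and it is case-sensitive by default, so a mixed-case pattern never matches titles that were forced to uppercase. The sample list is hypothetical.

```python
import re

# Hypothetical sample data, already forced to uppercase as in the lecture code.
sample_titles = ["THE OFFICE", "OFFICE", "FRIENDS"]

mixed_case_count = 0
upper_case_count = 0
for title in sample_titles:
    # First argument: the pattern (the needle). Second: the string (the haystack).
    if re.search("Office", title):
        mixed_case_count += 1
    if re.search("OFFICE", title):
        upper_case_count += 1

print(mixed_case_count, upper_case_count)
```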

- [6:23:58](https://www.youtube.com/watch?v=undefined&t=59038s) And let me use these other


special characters. This caret sign means the beginning of the string. This dollar sign weirdly represents
the end of the string. I'm adding in some parentheses just like in math, just to add another symbol here,
the or symbol here. And this is saying start matching at the beginning of the user string.

- [6:24:16](https://www.youtube.com/watch?v=undefined&t=59056s) Check if the beginning of the


string is Office, or the beginning of the string is The Office. And then, you better be at the end of the
string. So they can't keep typing words before or after that input. Let me go ahead and rerun the
program. And now we're down to 15, which used to be our correct answer, but then we noticed The V
Office.

- [6:24:36](https://www.youtube.com/watch?v=undefined&t=59076s) How can we deal with that? It's


going to be messier to deal with that. How about if I tolerate any character represented by dot in
between The and Office? Now if I rerun it, now I really have this expressive capability. So this is only to
say, there are so many ways in languages, in general, to solve problems.
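A sketch of the two anchored patterns follows. The patterns are reconstructed from the spoken description, so their exact on-screen form is an assumption, as is the spelling of the typo (`THEVOFFICE`); `OFFICE SPACE` is a made-up title that should not be counted.

```python
import re

# Hypothetical sample data; THEVOFFICE is an assumed spelling of the typo.
sample_titles = ["THE OFFICE", "OFFICE", "THEVOFFICE", "OFFICE SPACE", "FRIENDS"]

# ^ and $ pin the match to the whole string; | inside parentheses means "or".
STRICT = "^(OFFICE|THE OFFICE)$"
# A dot between THE and OFFICE tolerates any single character there,
# including the space itself.
TOLERANT = "^(OFFICE|THE.OFFICE)$"

strict_count = 0
tolerant_count = 0
for title in sample_titles:
    if re.search(STRICT, title):
        strict_count += 1
    if re.search(TOLERANT, title):
        tolerant_count += 1

print(strict_count, tolerant_count)
```

Because of the anchors, neither pattern counts a longer title that merely contains the word, which the plain substring test would.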

- [6:24:59](https://www.youtube.com/watch?v=undefined&t=59099s) And some of these tools are more


sophisticated than others. This is one that you've actually probably glanced at but never used in the
context of Google Forms for years if you're in the habit of creating these for student groups or other
activities. But it's now something you can start to leverage.

- [6:25:11](https://www.youtube.com/watch?v=undefined&t=59111s) And we're just scratching the


surface of what's actually possible with this. But let's now do one final example just using some Python
code here. And let's actually write a program that's a little more general purpose that allows me to
search for any given title and figure out its popularity.

- [6:25:27](https://www.youtube.com/watch?v=undefined&t=59127s) So let me go ahead and simplify


this. Let's get rid of our regular expressions. Let's go ahead and continue capitalizing the title. And let's
go ahead to-- at the beginning of this program, and first ask the user for the title they want to search for.
So title equals, let's ask the user for input, which is essentially the same thing as our CS50 get_string
function.

- [6:25:49](https://www.youtube.com/watch?v=undefined&t=59149s) Ask them for the title. And then


whatever they type in, let's go ahead and strip whitespace and uppercase the thing again. And now,
inside of my loop, I could say something like this. If the current row's title after stripping whitespace and
forcing it to uppercase, too, equals the user's title, then, go ahead and maybe increment a counter.

- [6:26:15](https://www.youtube.com/watch?v=undefined&t=59175s) So I still need that counter back.


So let me go ahead and define this maybe in here, counter equals 0. And then, at the very end of this
program, let me go ahead and print out just the popularity of whatever the human typed in. So again,
the only difference is I'm asking the human for some input this time.

- [6:26:32](https://www.youtube.com/watch?v=undefined&t=59192s) I'm initializing my counter to 0,


then I'm searching for their title in the CSV file by doing the same massaging of the data by forcing it to
uppercase and getting rid of the whitespace. So now, when I run Python of favorites.py, Enter, I could
type in the office all lowercase even, and now we're down to 13.
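The final program can be sketched as a function so that it runs without a real file or keyboard input. The column name `title`, the helper name `count_title`, and the sample rows are all assumptions for illustration; the lecture's version reads the actual CSV and calls `input` directly.

```python
import csv
import io

# Hypothetical stand-in for the CSV file, with messy spacing and capitalization.
SAMPLE_CSV = "title\nThe Office\n  the office \nOffice\nFriends\n"


def count_title(csv_file, wanted):
    """Count rows whose cleaned-up title equals the cleaned-up search term."""
    wanted = wanted.strip().upper()
    counter = 0
    reader = csv.DictReader(csv_file)
    for row in reader:
        # Apply the same massaging to the data as to the user's input.
        if row["title"].strip().upper() == wanted:
            counter += 1
    return counter


popularity = count_title(io.StringIO(SAMPLE_CSV), "the office")
print(popularity)
```

In an interactive version, the second argument could come from `input("Title: ")`, and the first from `open("favorites.csv")`, assuming that is the file's name.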

- [6:26:55](https://www.youtube.com/watch?v=undefined&t=59215s) 13, why? Oh, that's correct.


Because I'm the one that went in and removed those The keywords a bit ago. If we fixed those, we
would be back up to 15. If we added support for The V Office, we would be up to 16 as well. All right, any
questions then on these various manipulations? And if you're feeling like, oh, my god, this is so much
Python code just to do simple things, that's the point.

- [6:27:20](https://www.youtube.com/watch?v=undefined&t=59240s) And indeed, even though it's a


powerful language and can solve these kinds of problems, we had to write almost 20 lines of code just to
ask a single question like this. But any questions on how we did this, or on any of these building blocks
along the way? Anything here? No? All right. That was a lot. Let's take a five-minute break here.

- [6:27:40](https://www.youtube.com/watch?v=undefined&t=59260s) When we come back, we'll do it


better. So we are back. And the rest of today is ultimately about, how can we store, and manipulate, and
change, and retrieve data more efficiently than we might by just writing raw code? This isn't to say that
you shouldn't use Python to do the kinds of things that we just did.

- [6:27:58](https://www.youtube.com/watch?v=undefined&t=59278s) And in fact, it might be super


common if you're getting a lot of messy input from users that you might want to clean it up. And maybe
the best way to do that is to write a program so that step-by-step you can make all of the requisite
changes and fixes like we did with The Office, for instance, again and again, and reuse that code,
especially if more and more submissions are coming through.

- [6:28:16](https://www.youtube.com/watch?v=undefined&t=59296s) But another theme of today,


ultimately, is that sometimes there are different, if not better tools for the same job. And in fact, now at
this point in the term, as we begin to introduce not just Python, but in a moment a language called SQL,
and next week, a language called JavaScript, and the week after that, synthesizing a whole lot of these
languages together is to just kind of paint a picture of how you might decide what the trade-offs are
between using this tool, or this tool, or this other tool.
- [6:28:42](https://www.youtube.com/watch?v=undefined&t=59322s) Because undoubtedly you can
solve problems moving forward in many different ways with many different tools. So let's give you
another tool, one with which you can implement a proper relational database. What we just saw in the
form of CSV files are what we might call flat-file databases. Again, just a very simple file, flat in that
there's no hierarchy to it.

- [6:29:03](https://www.youtube.com/watch?v=undefined&t=59343s) It's just rows and columns. And


that is all ultimately storing ASCII or Unicode text. A relational database, though, is something that's
actually closer to a proper spreadsheet program. A CSV is an individual sheet, if you will, from a
spreadsheet when you export it. If you had multiple sheets in a spreadsheet, you would have to export
multiple CSVs.

- [6:29:25](https://www.youtube.com/watch?v=undefined&t=59365s) And that gets annoying quickly in


code if you have to open up this CSV, this CSV, all of which represent different sheets or tabs in a proper
spreadsheet. A relational database is more like a spreadsheet program that you, a programmer, now can
interact with. You can write data to it. You can

- [6:29:45](https://www.youtube.com/watch?v=undefined&t=59385s) read data from it, and you can


have multiple sheets, a.k.a. tables, storing all of your data. So whereas Excel, and Numbers, and Google
Spreadsheets are meant to be used really by humans with their mouse and their keyboard, clicking, and
pointing, and manipulating things graphically, a relational database using a language called SQL is one in
which the programmer has similar capabilities, but doing so in code.

- [6:30:04](https://www.youtube.com/watch?v=undefined&t=59404s) Specifically, using a language


called SQL, and at a scale that's much grander than spreadsheets alone. In fact, if you try on your Mac or
PC to open a spreadsheet that's got tens of thousands of rows, it'll probably work fine. Hundreds of
thousands of rows, millions of rows? No way. At some point your Mac or PC is going to struggle to open
particularly large data sets.

- [6:30:24](https://www.youtube.com/watch?v=undefined&t=59424s) And that, too, is where proper


databases come into play and proper languages for databases come into play, when it's all about scale.
And indeed, most any mobile app or web app today that you or someone else might write should
probably plan on lots of data if it's successful. So we need the right tools for that problem.

- [6:30:41](https://www.youtube.com/watch?v=undefined&t=59441s) So fortunately, even though we're


about to learn yet another language, it only does four things fundamentally, known by this silly acronym,
CRUD. SQL, this language for databases, supports the ability to create data, read data, update data, and
delete data. That's it. There's a few more keywords that exist in this language called SQL that we'll soon
see.

- [6:31:03](https://www.youtube.com/watch?v=undefined&t=59463s) But at the end of the day, even if


you're starting to feel like this is a lot very quickly, it all boils down to these four basic operations. And
the four commands in SQL, if you will, functions in a sense that implement those four ideas happen to be
these. They're almost the same but with some slight variance.

- [6:31:20](https://www.youtube.com/watch?v=undefined&t=59480s) The ability to create or insert data


is the C. The ability to select data is the R, or read. Update is the same. Delete is the same, but drop is
also a keyword as well. So we'll see these and a few other keywords in SQL that, at the end of the day,
just allow you to create, read, and update data using verbs, if you will, like these.
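
As a rough illustration of the four CRUD verbs just listed, here is a minimal sketch using Python's built-in `sqlite3` module. The table, columns, and rows are invented for the example, and an in-memory database is assumed so that nothing is written to disk; the lecture itself types SQL directly at the SQLite prompt.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
db = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for illustration.
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")

# C: create rows with INSERT.
db.execute("INSERT INTO favorites (title, genres) VALUES (?, ?)", ("The Office", "Comedy"))
db.execute("INSERT INTO favorites (title, genres) VALUES (?, ?)", ("Friends", "Comedy"))

# R: read rows with SELECT.
after_insert = db.execute("SELECT title FROM favorites ORDER BY title").fetchall()

# U: change existing rows with UPDATE.
db.execute("UPDATE favorites SET genres = ? WHERE title = ?", ("Comedy, Drama", "The Office"))
updated = db.execute("SELECT genres FROM favorites WHERE title = ?", ("The Office",)).fetchone()

# D: remove rows with DELETE.
db.execute("DELETE FROM favorites WHERE title = ?", ("Friends",))
after_delete = db.execute("SELECT title FROM favorites").fetchall()

print(after_insert)
print(updated)
print(after_delete)
```

The `?` placeholders are how the `sqlite3` module passes values into a statement; at the SQLite prompt you would write the quoted values inline instead.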

- [6:31:40](https://www.youtube.com/watch?v=undefined&t=59500s) So to do that, what's the syntax


going to be? Well, we won't get into the weeds too quickly on this. But here's a representative syntax of
how you can create using this language called SQL, in your very own database, a brand new table. This is
so easy in Excel, and Google Spreadsheets, and Apple Numbers.

- [6:31:56](https://www.youtube.com/watch?v=undefined&t=59516s) You want a new sheet, you click


the plus button. You get a new tab. You give it a name, and boom, you're done. In the world of
programming, though, if you want to create the analogue of that spreadsheet in the computer's
memory, you create something called a table, like a sheet, that has a name, and then in parentheses has
one or more columns.

- [6:32:14](https://www.youtube.com/watch?v=undefined&t=59534s) But unlike Google Spreadsheets,


and Apple Numbers, and Excel, you have to decide as the programmer what types of data you're going to
be storing in each of these columns. Now even though Excel, and Google Spreadsheets, and Numbers
do allow you to format or present data in different ways, it's not strongly typed data like it is, for
instance, when we were using C.

- [6:32:33](https://www.youtube.com/watch?v=undefined&t=59553s) And heck, even in Python there's


underlying data types, even if you don't have to type them explicitly. Databases are going to want to
know, are you storing integers? Are you storing real numbers or floats? Are you storing text? Why?
Because especially as your data scales, the more hints you give the database about your data, the more
performant it can be, the faster it can help you get at and store that data.
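
To make the idea of a table with typed columns concrete, here is a small sketch, again assuming Python's `sqlite3` module and an in-memory database. The `shows` table and its three columns are hypothetical; they are chosen only to show one integer, one text, and one real-number column.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical table: the name, columns, and types are illustrative only.
db.execute(
    """
    CREATE TABLE shows (
        id INTEGER,
        title TEXT,
        rating REAL
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type)
    for _, name, col_type, *_ in db.execute("PRAGMA table_info(shows)")
]
print(columns)
```

The general shape is `CREATE TABLE name (column type, ...)`, which is the table-plus-parenthesized-columns form described above.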

- [6:32:53](https://www.youtube.com/watch?v=undefined&t=59573s) So types are about to be


important again, but there's not going to be that many of them, fortunately. Now how can I go about
converting, for instance, some real data, like that from you in my favorites.csv file, into a proper relational
database? Well, it turns out that using SQL I can do this in VS Code on my own Mac, or PC, or in the
cloud here by just importing the CSV into a database.

- [6:33:14](https://www.youtube.com/watch?v=undefined&t=59594s) We'll see eventually how to do


this manually. For now, I'm going to use more of an automated process. So let me go over to VS Code
here. Let me type ls to see where we left off before. I had two files: favorites.csv, which I downloaded
from Google Spreadsheets. Recall that I made a couple of changes.

- [6:33:28](https://www.youtube.com/watch?v=undefined&t=59608s) We deleted a couple of Thes from


the file for The Office. But this is the same file as before, and then we have favorites.py, which we'll set
aside for now. I'm going to go ahead now and run a command SQLite3. So in the world of relational
databases, there's many different products out there, many different software that implements the SQL
language.

- [6:33:51](https://www.youtube.com/watch?v=undefined&t=59631s) Microsoft has their own. There's


something called MySQL that's been very popular for years. Facebook, for instance, used it early on.
PostgreSQL, Microsoft Access, SQL Server, Oracle, and maybe a whole bunch of other product names you
might have encountered over time, which is to say there's many different types of tools, and servers, and
software in which you can use SQL.
- [6:34:10](https://www.youtube.com/watch?v=undefined&t=59650s) We're going to use a very
lightweight version of the SQL language today called SQLite. This is the version of SQL that's generally
used on iPhones and Android devices these days. If you download an app that stores data like your own
contacts, it's typically stored using SQLite. Because it's fairly lightweight, but you can still store hundreds,
thousands, even tens of thousands of pieces of data even using this lightweight version thereof.

- [6:34:33](https://www.youtube.com/watch?v=undefined&t=59673s) SQLite3 is like version 3 of this


tool. We're going to go ahead and run SQLite3 with a file called favorites.db. It's conventional in the
world of SQLite to name your file something.db. I'm going to create a database called favorites.db. Once
I'm inside of the program, now I'm going to go ahead and enter CSV Mode.

- [6:34:52](https://www.youtube.com/watch?v=undefined&t=59692s) Again, not something you have to


memorize, just something you can look up as needed. And then, I'm going to import favorites.csv into a
table, that is, a sheet, if you will, called favorites as well. Now I'm going to hit Enter and I'm going to go
ahead and exit the program altogether and type ls.

- [6:35:11](https://www.youtube.com/watch?v=undefined&t=59711s) Now I have three files in my


current directory-- the CSV file, the Python file from before, and now favorites.db. But if I did this right,
all of the data you all typed into the CSV file has now been loaded into a proper database where I can
now use this SQL language to access it instead. So let's go ahead again and run SQLite3 of favorites.db,
which now exists.
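
The transcript does not spell out the commands typed at the prompt; in the `sqlite3` shell, CSV mode and the import are usually done with the `.mode csv` and `.import favorites.csv favorites` dot-commands. As a hedged sketch of roughly what such an import amounts to, the Python below loads a CSV into a table whose columns are all text. The sample rows are invented, and an in-memory database stands in for `favorites.db`.

```python
import csv
import io
import sqlite3

# Stand-in for favorites.csv; these rows are invented for the example.
sample_csv = io.StringIO(
    "Timestamp,title,genres\n"
    "10/19/2020 13:00:00,The Office,Comedy\n"
    '10/19/2020 13:01:00,Game of Thrones,"Action, Adventure, Drama"\n'
)

db = sqlite3.connect(":memory:")  # the lecture uses a file, favorites.db

# Every column is TEXT, as in the automated import described in the lecture.
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")

# DictReader uses the header row as keys, which match the named placeholders.
reader = csv.DictReader(sample_csv)
db.executemany(
    "INSERT INTO favorites (Timestamp, title, genres) "
    "VALUES (:Timestamp, :title, :genres)",
    reader,
)

# sqlite_master keeps the CREATE statement, similar to what .schema prints.
schema = db.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'favorites'"
).fetchone()[0]
rows = db.execute("SELECT title, genres FROM favorites ORDER BY rowid").fetchall()

print(schema)
print(rows)
```

Note that the quoted genres field keeps its internal commas as a single value, which is exactly the design issue the lecture returns to later.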

- [6:35:33](https://www.youtube.com/watch?v=undefined&t=59733s) And now, at the SQLite prompt I


can start to play around and see what this data is. For instance, I can look, by typing .schema, at what the
schema is of my data, what's the design. Now no thought was put into the design of this data at the
moment because I automated the whole process. Once we start creating our own databases we'll give
more thought to the data types and the columns that we have.

- [6:35:55](https://www.youtube.com/watch?v=undefined&t=59755s) But we can see what SQLite


presumed I wanted just by importing the data by default. What the import command did for me a
moment ago is essentially this syntax. It automated the process of creating a table, if it doesn't exist,
called favorites. And then notice, in parentheses it gave me three columns-- timestamp, title, and genres,
which were inferred, obviously, from the CSV.

- [6:36:19](https://www.youtube.com/watch?v=undefined&t=59779s) All three of which have been


decreed to be text. Again, once we're more comfortable we'll create our own tables, choose our own
types and column names. But for now, I just automated the whole process just to get us started by using
this built-in import command as well. All right. So what now can I begin to do? Well, if I wanted to, for
instance, start playing around with data therein, I might execute a couple of different commands.

- [6:36:45](https://www.youtube.com/watch?v=undefined&t=59805s) Let me find the right one here--


one of which would be select. Select being one of our most versatile tools to select data from this
database. So if I have these three columns here-- timestamp, title, and genres, suppose I want to select
all of the titles. Doing that earlier in Python required importing the CSV library, opening the file, creating
a reader or a DictReader, iterating over every row, adding every title to a dictionary or just printing it out,
and dot, dot, dot.
- [6:37:18](https://www.youtube.com/watch?v=undefined&t=59838s) There was a dozen or so lines of
code when we first began. Now, how about this? Select title from favorites, semicolon, done. So now,
with this particular language, the output is very textual and it's simulating what it looks like if it were
more graphical by creating this table, so to speak. Select title from favorites is a distillation in a different
language called SQL of all the lines of code I wrote early on when we first started playing with
favorites.py.

- [6:37:46](https://www.youtube.com/watch?v=undefined&t=59866s) SQL is therefore optimized for


reading, and creating, and updating, and ultimately, deleting data. So here's perhaps a better tool for the
job once you have the data. Tossing it into a more powerful, versatile format might allow you now to get
more work done more quickly without having to reinvent the wheel.

- [6:38:04](https://www.youtube.com/watch?v=undefined&t=59884s) Someone else has figured out


how to select data like this. What more can I do here? Well, let me go ahead and pull up, in a moment,
just a little bit of a cheat sheet here. Give me one second to find this. So suppose I want to now select
data a little more powerfully. So here's what I just did in a canonical way.

- [6:38:26](https://www.youtube.com/watch?v=undefined&t=59906s) So select typically works like this.


You select columns from a specific table, semicolon. Unfortunately, stupid semicolons are back. Select
columns from table then, is the generic form of what I just did. More specifically, I selected one column
called title from favorites. Favorites is the name of the table.
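
The generic form, `SELECT columns FROM table;`, can be tried out with a short sketch like the one below. It assumes Python's `sqlite3` module and a tiny invented `favorites` table; at the SQLite prompt you would type only the SQL strings.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")

# Invented sample rows standing in for the class's submissions.
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("10/19/2020 13:00:00", "The Office", "Comedy"),
        ("10/19/2020 13:01:00", "Community", "Comedy, Drama"),
    ],
)

# One column: each result row is a 1-tuple.
titles = [row[0] for row in db.execute("SELECT title FROM favorites ORDER BY rowid")]

# Two columns, separated by a comma: each result row is a 2-tuple.
pairs = db.execute("SELECT title, genres FROM favorites ORDER BY rowid").fetchall()

print(titles)
print(pairs)
```

Compared with opening the file, creating a `DictReader`, and looping over rows, the query itself is a single line.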

- [6:38:44](https://www.youtube.com/watch?v=undefined&t=59924s) Semicolon ends my thought.


Suppose I wanted to get two things, like the genres that each of you inputted. I could instead do select
title, comma, genres from favorites, and then, a semicolon, and Enter. It's going to look a little ugly on my
screen because some of these titles and-- OK, one of you really went all out with Community.

- [6:39:03](https://www.youtube.com/watch?v=undefined&t=59943s) You can see that it's just wrapping


in an ugly way, but it's just now showing me two columns. If we scroll up to the very top again, the left
most of one, Black Mirror went all out, too. Thank you. And now, OK, we're going to have to clean some
of these up. Game of Thrones, good comedy, yes. Keep going, keep going, keep going.

- [6:39:25](https://www.youtube.com/watch?v=undefined&t=59965s) So now we've selected two of the


columns that we care about. There it is. OK, so it's crazy wide because of all of those genres. But it allows
me to select exactly the data I want. Let's go back to the titles, though, and perhaps start playing around
with some modifiers here. For instance, it turns out, using SQL there's a lot of functionality built into the
language.

- [6:39:45](https://www.youtube.com/watch?v=undefined&t=59985s) You've got a lot of functions,


similar to Excel or Google Spreadsheets where you can have formulas. SQL provides you with some of
the same heuristics that allow you to apply operations like these on entire columns. For instance, you
can take averages, count the total, get the distinct values, force things to lowercase, uppercase, min, and
max, and so forth.

- [6:40:03](https://www.youtube.com/watch?v=undefined&t=60003s) So let's try distinct, for instance.


Let me go back to my Terminal, and let's say, select, how about the distinct titles from the favorites
table? Enter. I didn't bother selecting the genres because I want it to be a little prettier. And you can see
here that we have just the distinct titles, except for issues of formatting.
- [6:40:26](https://www.youtube.com/watch?v=undefined&t=60026s) So whitespace is going to be an
issue again. Capitalization is going to be a thing again. So there's a trade-off. One of the things I was
doing in Python was forcing everything to uppercase and then getting rid of whitespace. But we could
combine some of these. I could do something like force every title to uppercase, then get the distinct
value.

- [6:40:42](https://www.youtube.com/watch?v=undefined&t=60042s) And that's actually going to get rid


of some of those values as well. And again, I did it all in one simple line that was fast. So let me pull up at
the bottom of the screen again. I selected distinct upper titles from favorites, and that did everything for
me at once in just one breath. Suppose I want to get the total number of counts of titles.

- [6:40:59](https://www.youtube.com/watch?v=undefined&t=60059s) How about select count of all of


those titles from favorites? Semicolon, Enter, and now you get back a mini table that contains just your
answer, 158 in this case. So that's the total number of, not distinct, but total titles that we had in the file.
And we could continue to manipulate the data further using, again, functions like these here.
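
Here is a small sketch of `DISTINCT`, `UPPER`, and `COUNT` on invented data, assuming Python's `sqlite3` module. The `TRIM` call at the end is an addition that is not shown at this point in the lecture; it is included only to illustrate how the leftover whitespace problem could be handled in SQL as well.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT)")

# Invented titles with inconsistent capitalization and a trailing space.
db.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("The Office",), ("the office",), ("The Office ",), ("Friends",)],
)

# Uppercasing before DISTINCT merges titles that differ only in case.
upper_distinct = sorted(
    row[0] for row in db.execute("SELECT DISTINCT(UPPER(title)) FROM favorites")
)

# Assumption beyond the lecture: TRIM also removes surrounding whitespace.
trimmed_distinct = sorted(
    row[0] for row in db.execute("SELECT DISTINCT(UPPER(TRIM(title))) FROM favorites")
)

# COUNT returns a one-row, one-column result: the total, not the distinct total.
total = db.execute("SELECT COUNT(title) FROM favorites").fetchone()[0]

print(upper_distinct)
print(trimmed_distinct)
print(total)
```

In this sample, uppercasing alone still leaves two spellings of the same show because of the trailing space, which is the whitespace trade-off mentioned above.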

- [6:41:24](https://www.youtube.com/watch?v=undefined&t=60084s) But there's also additional


filtration we can do. We can also qualify our selections by saying where some condition is true. So just as
in Scratch, and C, and Python, you have Boolean expressions, you can have the same in SQL as well,
where I can filter my data where something is true or false. Like allows me to do approximations.

- [6:41:47](https://www.youtube.com/watch?v=undefined&t=60107s) If I want to get something that's


like The Office but not necessarily T-H-E, space, Office, I could do pattern matching using like here. Order
by, limit, and group by are other commands I can execute, too. So let me go back and do a couple of
these here. How about, let me just get, oh, I don't know, all of the titles from favorites but limit it to 10
results.

- [6:42:10](https://www.youtube.com/watch?v=undefined&t=60130s) That might be one thing that's


helpful to see if you just care about some of the data at the top there instead. How about, select all of
the titles from favorites, where the title itself is like, quote, unquote, "Office?" And this will give me only
two answers. Those are the two rows, recall, that I mutated by getting rid of the word The.

- [6:42:33](https://www.youtube.com/watch?v=undefined&t=60153s) Notice that like allows me to


tolerate uppercase and lowercase. Because if I instead just use the equal sign, and in SQL a single equal
sign does, in fact, mean equality for comparison's sake, it's not doing assignment. This is not how you
assign data in SQL. I got back no answers there. So indeed, the equal sign is giving me literal answers that
searches just for what I typed in.

- [6:42:59](https://www.youtube.com/watch?v=undefined&t=60179s) How could I get all of these? Well,


similar in spirit to regular expressions but not quite as powerful in SQL, I could do something like this. I
can select the title from favorites where the title is like, quote, unquote, "Office." But I can add, a bit
weirdly, percent signs to the left and the right.

- [6:43:18](https://www.youtube.com/watch?v=undefined&t=60198s) So the language SQL supports the


same notion of pattern matching but much more limited out of the box. If we want more powerful
regular expressions we probably do want to use Python instead. But the percent sign here means 0 or
more characters on the left, 0 or more characters on the right. So this will just grab any title that contains
O-F-F-I-C-E in it in that order.
- [6:43:40](https://www.youtube.com/watch?v=undefined&t=60220s) And now I get all 16, it would
seem, of those results, again. How do I know it's 16? Well, I can just get the count of those titles and get
back that answer instead as well. So again, it takes some getting used to, the vocabulary and the syntax
that you can use. There's these building blocks and others.
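
The difference between `=`, `LIKE`, and `LIKE` with percent signs can be sketched as follows. The data is invented, and Python's `sqlite3` module is assumed; the counts will differ from the 16 seen in the lecture because the sample table is much smaller.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT)")

# Invented titles; note the mixed capitalization and the trailing space.
db.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("The Office",), ("Office",), ("OFFICE",), ("the office ",), ("Friends",)],
)

def titles(query):
    """Run a query that selects one column and return its values, sorted."""
    return sorted(row[0] for row in db.execute(query))

# = compares literally, so lowercase 'office' matches nothing here.
exact = titles("SELECT title FROM favorites WHERE title = 'office'")

# LIKE without wildcards ignores ASCII case but still needs the whole string.
like_plain = titles("SELECT title FROM favorites WHERE title LIKE 'office'")

# % means zero or more characters, on either side of the word.
like_wild = titles("SELECT title FROM favorites WHERE title LIKE '%office%'")

# COUNT gives the number of matches directly.
count_wild = db.execute(
    "SELECT COUNT(title) FROM favorites WHERE title LIKE '%office%'"
).fetchone()[0]

# LIMIT caps how many rows come back.
first_two = db.execute("SELECT title FROM favorites LIMIT 2").fetchall()

print(exact, like_plain, like_wild, count_wild, len(first_two))
```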

- [6:43:58](https://www.youtube.com/watch?v=undefined&t=60238s) But SQL is really designed, again,


for creating, reading, updating, and deleting data. For instance, I've never really been a fan of Friends, for
instance. So right now if I do select, how about title from favorites where title like, quote, unquote,
Friends with the percent signs? We can see that there's a whole bunch of them.

- [6:44:21](https://www.youtube.com/watch?v=undefined&t=60261s) That's how many exactly. Let's


just do a quick count. So that's nine of them. Well, delete from favorites. OK, you and me, delete from
favorites, where title like Friends, Enter. Nothing seems to happen, but bye-bye Friends. [APPLAUSE]
DAVID J. MALAN: Thank you. So now we've actually changed the data.
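
A sketch of the same count-then-delete sequence, assuming Python's `sqlite3` module and a few invented rows. The cursor's `rowcount` attribute is used here to confirm how many rows were removed, since, as noted, the delete itself prints nothing.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT)")

# Invented titles, including several spellings of the same show.
db.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("Friends",), ("FRIENDS",), ("Friends ",), ("The Office",)],
)

# Count first, to see how many rows the pattern would affect.
before = db.execute(
    "SELECT COUNT(title) FROM favorites WHERE title LIKE '%friends%'"
).fetchone()[0]

# DELETE with the same WHERE clause removes exactly those rows.
deleted = db.execute("DELETE FROM favorites WHERE title LIKE '%friends%'").rowcount

remaining = [row[0] for row in db.execute("SELECT title FROM favorites")]

print(before, deleted, remaining)
```

Running the `SELECT` or `COUNT` with the same `WHERE` clause before the `DELETE` is a cheap way to check what is about to disappear.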

- [6:44:47](https://www.youtube.com/watch?v=undefined&t=60287s) And this is what's compelling


about a proper database. Yes, you could technically write Python code that not only reads the CSV file,
but also writes it. You can change using quote, unquote, "A" for append, or quote, unquote, "W" for
write, instead of quote, unquote, "R" for read alone.

- [6:45:03](https://www.youtube.com/watch?v=undefined&t=60303s) But it's definitely a little more


involved to do that in Python. But with SQL, you can update the data in real time. And if I were actually
running a web application here or a database for a mobile app, that change, theoretically, would be
reflected everywhere on your own devices if you're somehow talking to this application.

- [6:45:18](https://www.youtube.com/watch?v=undefined&t=60318s) So that's the direction we're


headed. This other thing has been bothering me. So select, how about title from favorites, where title
equals, what was it? The V Office, was it? Yeah, it was that one. How about we update favorites by
setting title equal to The Office, where title equals quote, unquote, "The V Office" semicolon? And now,
if I select the same thing again I can go up and down with my arrow keys quickly.

- [6:45:50](https://www.youtube.com/watch?v=undefined&t=60350s) Now there is no The V Office.


We've actually changed that value. How about genres? Select genres from favorites, where the title is
title equals Game of Thrones, semicolon. These were kind of long, and I don't really agree with all of
that. So how about we update favorites, set genres equal to, sure, action, adventure, sure, drama? OK,
so it's a decent list.

- [6:46:20](https://www.youtube.com/watch?v=undefined&t=60380s) Fantasy, sure, thriller, war. OK,


anything really but comedy, I would say. Let's go ahead and hit Enter now. And now, if I select genres
again, same query, now we've canonicalized that. We've thrown data away. So whether or not that is
right is probably a bit subjective and argumentative. But I have at least cleaned up my data, which is,
again, the U in CRUD.

- [6:46:42](https://www.youtube.com/watch?v=undefined&t=60402s) Create, read, update, delete, you


can do it that easily. Beware using delete. Beware worse using drop, whereby you can drop an entire
table. But via these kinds of commands, can we actually now manipulate our data much more rapidly
and with single thoughts. And in fact, if you're an aspiring statistician, or data scientist, or analyst in the
real world, SQL is such a commonly used language because it allows you to really dive into data quickly,
and ask questions of the data, and get back answers quite quickly.

- [6:47:11](https://www.youtube.com/watch?v=undefined&t=60431s) And this is a simple data set. You


can do this with much larger data sets as we soon will, too. Or any questions on what we've seen of SQL
thus far? Only scratched the surface, but again, it boils down to creating, reading, updating, and deleting
data. Questions here? All right. Well, let's consider the design of this data.

- [6:47:33](https://www.youtube.com/watch?v=undefined&t=60453s) Recall that if I do .schema, that


shows me the design of my table, the so-called schema of my data. This is OK. It gets the job done, and
frankly, everything the user typed in was arguably text, including the timestamp, which is the date and
time. But so the data set itself is somewhat simple. But if we look at the data set itself, especially genres,
let's do this.

- [6:47:55](https://www.youtube.com/watch?v=undefined&t=60475s) Select genres from favorites. And


let me point out one other thing stylistically, too. I am very deliberately capitalizing all of the special SQL
keywords, and I'm lowercasing all of the column names and the table names. This is a convention, and
honestly, it just helps you read, I think, the code when you're co-mingling your names for columns and
tables with proper SQL keywords.

- [6:48:18](https://www.youtube.com/watch?v=undefined&t=60498s) But I could just as easily do select


genres from favorites, but again, the SQL specific keywords don't quite jump out as much. So stylistically,
we would recommend this, selecting genres from favorites, semicolon. So here is where-- oh. OK, that
was not intended. I accidentally made every show, including The Office about action, adventure, drama,
fantasy, thriller, and war.

- [6:48:46](https://www.youtube.com/watch?v=undefined&t=60526s) How did I do that accidentally?


What did I do wrong? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah. So beware, this is funny. I think I
did say beware around this time. So the SQL database took me-- literally, I updated favorites, setting
genres equal to that, semicolon, end of thought. I really wanted to say where title equals, quote,
unquote, "Game of Thrones."

- [6:49:11](https://www.youtube.com/watch?v=undefined&t=60551s) Unfortunately, there isn't an


undo command or time machine with a SQL database, so the best we can do here is, let's actually get rid
of favorites.db. Let's run SQLite of favorites.db again, which now will be recreated. Let me change myself
into CSV mode. Let me import, into my favorites table, the CSV file.

- [6:49:35](https://www.youtube.com/watch?v=undefined&t=60575s) And now, Friends is back, for


better or for worse, but so are all of our genres. If I now reload the file and do select, star, from-- sorry.
Select genres from favorites, that was the result I was getting. It's much messier, but that's because some
of these are quite long. But now we're back to the original data.
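
The accident above comes down to an `UPDATE` with no `WHERE` clause. The sketch below, on invented rows and assuming Python's `sqlite3` module, contrasts the two forms and uses the cursor's `rowcount` to show how many rows each statement touched.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")

# Invented rows for illustration.
db.executemany(
    "INSERT INTO favorites (title, genres) VALUES (?, ?)",
    [
        ("The V Office", "Comedy"),
        ("Game of Thrones", "Action, Comedy, Drama"),
        ("Friends", "Comedy"),
    ],
)

new_genres = "Action, Adventure, Drama, Fantasy, Thriller, War"

# With WHERE: only the matching row changes.
fixed_title = db.execute(
    "UPDATE favorites SET title = ? WHERE title = ?", ("The Office", "The V Office")
).rowcount
careful = db.execute(
    "UPDATE favorites SET genres = ? WHERE title = ?", (new_genres, "Game of Thrones")
).rowcount

# Without WHERE: every row in the table is overwritten.
careless = db.execute("UPDATE favorites SET genres = ?", (new_genres,)).rowcount

distinct_genres = db.execute("SELECT DISTINCT genres FROM favorites").fetchall()

print(fixed_title, careful, careless)
print(distinct_genres)
```

After the last statement, every show in the sample has the same genres, which mirrors what happened to The Office in the lecture.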

- [6:49:56](https://www.youtube.com/watch?v=undefined&t=60596s) Lesson here, be sure to back up


your work. All right. So what more can we now do with this data? Well, I don't love the design of the
genres table for a couple of reasons. One, we didn't have any sort of validation, but user input is going to
be messy. There's just a lot of redundancy in here. Let's go ahead and do this.
- [6:50:15](https://www.youtube.com/watch?v=undefined&t=60615s) Let me select all the comedies
you all typed in. So select title from favorites, where genres equals, quote, unquote, "comedy." OK, so
there's all of the shows that are explicitly comedies. But I think there might actually be others. Let me
scroll back up here. Comedy, drama. What was a comedy and a drama? How about let's search for the--
oops, let me copy paste comedy, comma, drama.

- [6:50:44](https://www.youtube.com/watch?v=undefined&t=60644s) OK, so The Office, in this case,


was considered comedy and drama, Billions, It's Always Sunny in Philadelphia, and Gilmore Girls as well.
But notice that I get many more when I just search for comedy. So the catch here is that, because I have
all of these genres implemented the way Google did, as a comma-separated list, it's actually really hard
and messy to get at any show, all of the shows that are somewhere described as comedy.

- [6:51:12](https://www.youtube.com/watch?v=undefined&t=60672s) Because if I search for quote,


unquote, "comedy," the only answers I'm going to get are this one, whatever that show is, this one,
whatever that show is, this one. But I'm not going to get this one. I'm not going to get this one. Why? If
I'm searching for, where genres equals, quote, unquote, "comedy," why am I missing those other shows?
Why am I missing? Yeah? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Exactly.

- [6:51:37](https://www.youtube.com/watch?v=undefined&t=60697s) It's not just a comedy, it's a


comedy and a drama, and a comedy or a news show, and so forth. So I have to search for these commas,
so this gets messy quickly, right? Let me copy this so I can do this. Let me search for where genres equals
comedy. How about, or genres equals comedy, drama, or genres equals this whole thing, comedy, news,
talk show? I'm going to get more and more results.

- [6:52:04](https://www.youtube.com/watch?v=undefined&t=60724s) But that's not going to scale well.


What could I do instead of enumerating with ors all of the different permutations of genres, do you
think? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah. So I could use the keyword is, similar in Python to
the word in. I could use the like keyword so that so long as the genres is like comedy somewhere in
there, that's going to give me all of them, so long as the word comedy is in there.

- [6:52:31](https://www.youtube.com/watch?v=undefined&t=60751s) But let me go ahead and just


open the form from earlier. Let me see if I can open this real quick before I toggle over. If we look back at
the form, recall that there were all of those radio buttons asking for the specific genres into which
something fell. And if I open this, let me full screen here and now open the original form.

- [6:52:55](https://www.youtube.com/watch?v=undefined&t=60775s) You'll see all of the genres here,


none of which are that worrisome except for a corner case that is jumping out at me. Where might the like
keyword alone get me into trouble? It's not with comedy. I'm OK with comedy. AUDIENCE: Music and
musical? DAVID J. MALAN: Yeah, music and musical are deliberately on the list here.

- [6:53:17](https://www.youtube.com/watch?v=undefined&t=60797s) Because, one, they're separate


genres. But if I just search for something that's like music, I'm going to accidentally suck in all of the
musicals, which might not be what I intend. If music is a music video or whatever, and musical is actually
a different type of show, I don't want to just do that.
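
The two failure modes discussed here, `=` matching too little and `LIKE` matching too much, can be reproduced on a few invented rows. This is a sketch assuming Python's `sqlite3` module; the show-to-genre assignments are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")

# Invented rows; genres are stored as one comma-separated string per show.
db.executemany(
    "INSERT INTO favorites (title, genres) VALUES (?, ?)",
    [
        ("The Office", "Comedy"),
        ("Gilmore Girls", "Comedy, Drama"),
        ("Glee", "Comedy, Musical"),
        ("Hamilton", "Musical"),
        ("MTV Unplugged", "Music"),
    ],
)

def titles(where, value):
    """Return titles, sorted, whose genres column satisfies the given test."""
    query = f"SELECT title FROM favorites WHERE genres {where} ? ORDER BY title"
    return [row[0] for row in db.execute(query, (value,))]

# = finds only shows whose entire genres string is exactly 'Comedy'.
only_comedy = titles("=", "Comedy")

# LIKE with % finds 'Comedy' anywhere in the string.
any_comedy = titles("LIKE", "%Comedy%")

# But the same trick for 'Music' also pulls in every 'Musical'.
any_music = titles("LIKE", "%Music%")

print(only_comedy)
print(any_comedy)
print(any_music)
```

Only one of the three shows returned for `%Music%` is actually tagged Music, which is the corner case pointed out by the audience.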

- [6:53:32](https://www.youtube.com/watch?v=undefined&t=60812s) So it seems just very messy. I


could probably hack something together with-- maybe add some commas in there, or something like
this. But this is just not a good design for the data. Google has done it this way because it's just simple to
actually keep the user's data all in a single column, and just as they did, separate it by commas.

- [6:53:50](https://www.youtube.com/watch?v=undefined&t=60830s) But this is a real messy way to use


CSVs, by putting comma-separated values in your comma-separated values. Arguably, the folks at
Google probably just did this because it's just simpler. And they didn't want to give people multiple
sheets or complicate things using some other weirder character than commas alone.

- [6:54:08](https://www.youtube.com/watch?v=undefined&t=60848s) But I bet there's a better way for


us to do this. And let me go ahead and do this. Let me go back into my code here. And in just a moment,
I'm going to grab a program that I wrote in advance that's going to use Python to open up the CSV file,
iterate over all of the rows, and load the data into two tables this time, two tables, one called shows, and
one called genres, so as to actually separate these two things out.
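
The lecture's actual program (version 8) is not reproduced in the transcript and uses the CS50 library. As a rough sketch of the same idea using only Python's standard library, the code below reads a CSV, inserts each title into a `shows` table, and then inserts one row per genre into a `genres` table. The table definitions, the sample rows, and the title cleanup are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for favorites.csv; these rows are invented for the example.
sample_csv = io.StringIO(
    "Timestamp,title,genres\n"
    '10/19/2020 13:00:00,The Office,"Comedy, Drama"\n'
    "10/19/2020 13:01:00,Hamilton,Musical\n"
    "10/19/2020 13:02:00,MTV Unplugged,Music\n"
)

db = sqlite3.connect(":memory:")  # the lecture writes to favorites.db

# Assumed schema: one row per show, and one row per (show, genre) pair.
db.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
db.execute(
    "CREATE TABLE genres ("
    "show_id INTEGER, genre TEXT NOT NULL, "
    "FOREIGN KEY(show_id) REFERENCES shows(id))"
)

for row in csv.DictReader(sample_csv):
    # Assumed cleanup, echoing the earlier Python canonicalization.
    title = row["title"].strip().upper()

    # lastrowid is the id SQLite assigned to the newly inserted show.
    show_id = db.execute("INSERT INTO shows (title) VALUES (?)", (title,)).lastrowid

    # Split the comma-separated genres into individual rows.
    for genre in row["genres"].split(", "):
        db.execute(
            "INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, genre)
        )

# With one genre per row, = is enough; Musical no longer matches Music.
music = db.execute(
    "SELECT title FROM shows WHERE id IN "
    "(SELECT show_id FROM genres WHERE genre = 'Music')"
).fetchall()
genre_count = db.execute("SELECT COUNT(*) FROM genres").fetchone()[0]

print(music)
print(genre_count)
```

Because each genre now lives in its own row, an exact comparison is enough and there is no need for `LIKE` patterns or extra commas.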

- [6:54:30](https://www.youtube.com/watch?v=undefined&t=60870s) Give me just a moment to grab


the code. And when I run this, I'll only have to run it once. Let me go ahead and run Python in a moment,
and I'll reveal the results in a sec. This is going to be version 8 of the code online. When I do this, let me
go ahead and open up this file. Give me a second to move it into this directory.

- [6:54:52](https://www.youtube.com/watch?v=undefined&t=60892s) Version 8, OK. So here we have


version 8 of this that's available online that's going to do the following. And I'll gloss over some of the
details just so that we don't get stuck in the weeds of some of this code. I'm going to be using, at the top
of this program, as we'll soon see, a CS50 library, not for the sake of get_string, or get_int, or get_float,
but because there's some built-in SQL functionality that we didn't discuss a couple of weeks back with
the CS50 library itself.

- [6:55:18](https://www.youtube.com/watch?v=undefined&t=60918s) But inside of the CS50 library we'll


see there is a special function called SQL that gives you the ability using this weird URL-like looking thing,
technically called a URI, that allows me to open a file called favorites.db. And long story short, all of the
subsequent code is going to iterate over this favorites.csv file that we downloaded.

- [6:55:38](https://www.youtube.com/watch?v=undefined&t=60938s) And it's going to import it into the


SQLite database, but it's going to use two tables instead of just one. So give me just a moment to run
this, and then I'll reveal the actual results. This is going to be run on favorites.csv. And taking a look here,
give me just a moment. Oh, give me a sec. Come on.

- [6:56:11](https://www.youtube.com/watch?v=undefined&t=60971s) Come on. This program should


not be taking this long. Sorry. Let's open this real fast. Whoops, not that file. OK. Let me just skim this
code real quick to see where we've gone wrong. [INAUDIBLE] reader. Reader, title, show ID, insert into
shows. [INAUDIBLE] genres split, db.execute. All right. This is me debugging in real time.

- [6:56:43](https://www.youtube.com/watch?v=undefined&t=61003s) All those times we encourage you


to use print, this is me actually using print. We'll see how quickly I can recover from this. Python of
favorites version 8. OK, so here's me debugging in real time. It's printing it. Oh, maybe I just didn't wait
long enough. OK, so here we go. What I'm doing is printing out the dictionary that represents each row
that you all typed in.
- [6:57:07](https://www.youtube.com/watch?v=undefined&t=61027s) And we're actually making
progress. All right. I was too impatient and didn't wait long enough. So in a moment-- there we go. All
right, so all we have to do sometimes is wait. Let me go ahead now and open this file using SQLite3. So in
SQLite3 I now have a different version of favorites.db. I named it number 8 for consistency.

- [6:57:25](https://www.youtube.com/watch?v=undefined&t=61045s) Once I've run the program I can


do .schema to look inside of it. And here's what the two tables in this database are going to look like. I've
created a table called shows, this time to represent all of the TV shows that are favorites, that has two
columns. One is called ID, one is called Title.

- [6:57:42](https://www.youtube.com/watch?v=undefined&t=61062s) But now I'm going to start taking


out for a spin some of the other features of SQL. And besides there being text, it turns out there's a data
type called integer. Besides there being a data type called text, there's also a special key phrase that you
can specify that the title can never be null.

- [6:57:57](https://www.youtube.com/watch?v=undefined&t=61077s) Think back to our use of NULL in C.


Think back to the keyword None in Python. This is a database constraint that allows you to ensure that
none of you can leave your favorite TV show blank. If you submit the form, you have to have typed in a title for it
to end up in our database here. And you'll notice one other new feature.

- [6:58:16](https://www.youtube.com/watch?v=undefined&t=61096s) It turns out, on this table I'm


defining what's called a primary key, specifically to be the ID column. More on that in just a moment.
Meanwhile, the second table my code has created for me, as we'll soon see, gives me a column called
show ID, and then, a genre, the value of which is text that can also not be null.

- [6:58:36](https://www.youtube.com/watch?v=undefined&t=61116s) And then more on this in a


moment. This table has what we're going to call a foreign key, specifically the show ID column that
references shows ID. So before we get into the weeds of this, this is now a way of creating the relation in
relational database. If I have two tables now, not just one, they can somehow be linked together by a
common column.
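
The two-table design described here can be sketched with Python's built-in `sqlite3` module. The transcript does not show the exact `CREATE TABLE` statements from version 8, so the statements below are an assumption reconstructed from the spoken description (an `id` and `title` in `shows`, a `show_id` and `genre` in `genres`), and an in-memory database stands in for `favorites8.db`.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
db = sqlite3.connect(":memory:")

# Assumed schema, reconstructed from the lecture's description.
db.executescript("""
    CREATE TABLE shows (
        id INTEGER,
        title TEXT NOT NULL,
        PRIMARY KEY(id)
    );
    CREATE TABLE genres (
        show_id INTEGER,
        genre TEXT NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id)
    );
""")

# NOT NULL means a row without a title is refused.
try:
    db.execute("INSERT INTO shows (title) VALUES (NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

tables = [row[0] for row in db.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables, rejected)
```

The `show_id` column in `genres` is what links each genre row back to one row in `shows`.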

- [6:59:00](https://www.youtube.com/watch?v=undefined&t=61140s) In other words, the shows


column-- shows table is going to give me a table with two columns-- an ID and a title. Every title you gave
me, I'm going to assign a unique value. The genres table, meanwhile, is going to associate individual
genres, singular, with that same ID. And the result of this, to pop back to the Terminal here, is, let's do
this.

- [6:59:26](https://www.youtube.com/watch?v=undefined&t=61166s) Select star from shows of this new


database, and you'll see that I've given, indeed, all of the shows you all typed in unique identifiers. I
didn't filter out duplicates or do anything beyond just forcing everything to uppercase. So there's going
to be some duplicates here because I didn't want to get rid of anyone's data.

- [6:59:44](https://www.youtube.com/watch?v=undefined&t=61184s) But you'll see that, indeed, I've


given everyone a unique identifier, from the very first person who typed How I Met Your Mother, all the
way down to input number 158. Meanwhile, if I do select star from genres, which is now a table, not just
a column in the original data, now you'll see a much better design for this data.
- [7:00:09](https://www.youtube.com/watch?v=undefined&t=61209s) Notice what I've done here. Let
me go all the way to the top and you'll see two columns, one of which is called show ID, the other of
which is called genre. And again, I wrote some code to do this because I had to take Google's messy
output where everything was separated by commas. I had to tear away the commas and then put each
genre into this table by itself.
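
A rough sketch of what such an import script might do is below. It is not the lecture's version 8 file, which the transcript only skims. It uses the standard `sqlite3` and `csv` modules instead of the CS50 library, and the two-row sample (including its `title` and `genres` column names) is hypothetical, standing in for favorites.csv.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for favorites.csv; column names are assumed.
SAMPLE = io.StringIO(
    "title,genres\n"
    "How I Met Your Mother,Comedy\n"
    'the office,"Comedy, Drama"\n'
)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.execute("CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
           "FOREIGN KEY(show_id) REFERENCES shows(id))")

for row in csv.DictReader(SAMPLE):
    # Force titles to uppercase, as the lecture says the script does.
    title = row["title"].strip().upper()
    cursor = db.execute("INSERT INTO shows (title) VALUES (?)", (title,))
    show_id = cursor.lastrowid  # the id SQLite just assigned to this show

    # Tear the comma-separated genres apart: one row per genre.
    for genre in row["genres"].split(","):
        db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)",
                   (show_id, genre.strip()))

genre_rows = db.execute(
    "SELECT show_id, genre FROM genres ORDER BY show_id, genre").fetchall()
print(genre_rows)
```

Each submission keeps its own `id`, so duplicate titles survive the import, just with different identifiers.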

- [7:00:27](https://www.youtube.com/watch?v=undefined&t=61227s) Even though we haven't


introduced the syntax via which we can reconstitute the data and reassociate your genres with your
titles, why, at a glance, might this be a better design now? Even though I've doubled the number of
tables from one to two, why is this probably a step in the direction of a better design? What might your
instincts be? Why is this cleaner? Again, first time with SQL, why is it better, perhaps, that we've done
this with our genres table? Can I come to you? Why might this be better?

- [7:01:02](https://www.youtube.com/watch?v=undefined&t=61262s) Yeah. Oh, just because we had the


conversation before about the commas. AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Exactly. It's as simple
as that. We've cleaned up the data by giving every genre, every word in the genres column in the original
Google Spreadsheet its own cell in this table, if you will. And now notice show ID might appear multiple
times.

- [7:01:22](https://www.youtube.com/watch?v=undefined&t=61282s) Whoever typed in How I Met Your


Mother, they only associated one genre with it. And so we see that show ID 1 is a comedy. But whoever
typed in-- I forget the name of the second show offhand. But that person, whoever was assigned show ID
2 checked off a whole bunch of the genre boxes. That happened again with show IDs 3 and 4.

- [7:01:43](https://www.youtube.com/watch?v=undefined&t=61303s) Persons 5, 6, 7 only checked one


box. And so you can see now that we've associated the data with what we might call a one-to-many
relationship. A one-to-many relationship, whereby for every one show in the shows table, it can now
have many genres associated with it, each of which is represented by a separate row here.

- [7:02:07](https://www.youtube.com/watch?v=undefined&t=61327s) So again, if I go ahead and select


star from shows-- let's limit it to the first 10 just to focus on a subset of the data. How I Met Your Mother,
The Sopranos was the second input there. It would seem that now that I've created the data in this way, I
could ideally somehow search the data, but a little more correctly.

- [7:02:25](https://www.youtube.com/watch?v=undefined&t=61345s) I don't have to worry about the


commas. I don't have to worry about the hackish approach of music being a substring of musical. But
how can I actually get back at this data? Well, let's go ahead and do this. Suppose I did want to get back
maybe all of the comedies. All of the comedies, no matter whether the person checked just the comedy
box or multiple boxes instead.

- [7:02:45](https://www.youtube.com/watch?v=undefined&t=61365s) How now, given that I have two


tables, could I go about selecting only the titles of comedies? I've actually made the problem a little
harder, but again, SQL is going to give me a solution for this. The problem is that if I want to search for
comedies, I have to check the genres table first. And then what's that going to give me? If I search the
genres table for comedies, what's that going to give me back potentially? Yeah? AUDIENCE: Show ID.

- [7:03:13](https://www.youtube.com/watch?v=undefined&t=61393s) DAVID J. MALAN: Maybe show ID.


So let me try that. Let me do select show ID from genres, where the genre in a given row equals quote,
unquote, "comedy." No commas, no like, no percent signs. Because literally, that column now is singular
words, like comedy, or drama, or the like. Let me go ahead and hit Enter here.

- [7:03:33](https://www.youtube.com/watch?v=undefined&t=61413s) OK, so I got back a whole bunch


of ID numbers. Now this could very quickly get annoying. It looks like show ID 1, 2, 4, 5, 6, 7, 9, and so
forth, are all comedies. So I could do something really crazy like, select title from shows, where ID equals
1, or ID equals 2. This is not going to scale very well, but this is why SQL is especially powerful.

- [7:04:00](https://www.youtube.com/watch?v=undefined&t=61440s) You can actually compose one


SQL question from multiple ones. So let's do this. Why don't I select the title where the ID of the show is
in the following list of IDs? Select show ID from genres, where the specific genre is, quote, unquote,
"comedy." So I've got two SQL queries. One is deliberately nested inside of parentheses.

- [7:04:27](https://www.youtube.com/watch?v=undefined&t=61467s) That's going to give me back that


whole list of show IDs. But that's exactly what I want to then look up the titles for by selecting title from
shows where the ID of the show is in that big, tall list. And so now if I hit Enter, I get back only those
shows that were somehow flagged as comedy, whether you in the audience checked one box for
comedy, two boxes, or all of the boxes.
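
The nested query can be tried on a small scale as follows. This is a sketch using Python's `sqlite3` module; the three shows and their genres are made-up sample rows, and the capitalized spelling `'Comedy'` is an assumption about how the genre is stored.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
    -- Hypothetical sample data, not the class's real submissions.
    INSERT INTO shows (id, title) VALUES
        (1, 'HOW I MET YOUR MOTHER'), (2, 'THE SOPRANOS'), (3, 'THE OFFICE');
    INSERT INTO genres (show_id, genre) VALUES
        (1, 'Comedy'), (2, 'Crime'), (2, 'Drama'), (3, 'Comedy'), (3, 'Drama');
""")

# Step 1: the inner query alone returns the matching show IDs.
comedy_ids = [row[0] for row in db.execute(
    "SELECT show_id FROM genres WHERE genre = 'Comedy'")]

# Step 2: wrap it in parentheses and use it as the list for IN.
titles = [row[0] for row in db.execute(
    "SELECT title FROM shows WHERE id IN "
    "(SELECT show_id FROM genres WHERE genre = 'Comedy')")]

print(sorted(comedy_ids), sorted(titles))
```

THE OFFICE is found even though it was also tagged as a drama, because each genre sits in its own row.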

- [7:04:52](https://www.youtube.com/watch?v=undefined&t=61492s) Somehow we teased out comedy,


again, just by using that Python script, which loaded this data not into one big table, but instead, two.
And if we want to clean this up, let's do a couple of things. Let's, outside of the parentheses, do order by
title. This is a way of sorting the data in SQL very easily.

- [7:05:09](https://www.youtube.com/watch?v=undefined&t=61509s) Now we have a whole list of the


same titles that are now sorted. And what was the keyword with which I could filter out duplicates?
Yeah, distinct. So let's try this. Same query, but let's select only the distinct titles from that whole query.
And notice, I've very deliberately done it this way.
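
Adding `DISTINCT` and `ORDER BY` to the outer query might look like this sketch. The sample rows are hypothetical and include a deliberate duplicate title so that the effect of `DISTINCT` is visible.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
    -- Two different people submitted THE OFFICE (ids 1 and 3).
    INSERT INTO shows (id, title) VALUES
        (1, 'THE OFFICE'), (2, 'FRIENDS'), (3, 'THE OFFICE'), (4, 'BREAKING BAD');
    INSERT INTO genres (show_id, genre) VALUES
        (1, 'Comedy'), (2, 'Comedy'), (3, 'Comedy'), (4, 'Drama');
""")

SUBQUERY = "(SELECT show_id FROM genres WHERE genre = 'Comedy')"

# ORDER BY goes outside the parentheses and sorts the final titles.
ordered = [row[0] for row in db.execute(
    f"SELECT title FROM shows WHERE id IN {SUBQUERY} ORDER BY title")]

# DISTINCT additionally collapses the duplicate submissions.
distinct = [row[0] for row in db.execute(
    f"SELECT DISTINCT title FROM shows WHERE id IN {SUBQUERY} ORDER BY title")]

print(ordered)
print(distinct)
```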

- [7:05:27](https://www.youtube.com/watch?v=undefined&t=61527s) And to this day, any time I'm using


SQL, I don't just start at the beginning and type out my whole thought, and just get it right on the first
try. I very commonly start with the subquery, if you will, the thing in parentheses, just to get myself one
step toward what I care about. Then I add to it.

- [7:05:41](https://www.youtube.com/watch?v=undefined&t=61541s) Then I add to it. Then I add to it,


just like we've encouraged in Python and C, taking baby steps in order to get to the answer you actually
care about, like this one now. And other than this mistake, which we didn't fix because I re-imported the
data after accidentally changing everyone's genre, we now have an alphabetized list of all of the same
data.

- [7:06:01](https://www.youtube.com/watch?v=undefined&t=61561s) But now it's better designed,


because we have it split across these two tables. Oh, thank you. OK, just thanks. What questions do we
have, if any here? Questions on this approach? Yeah? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Oh, now
that we have a database, how do we transfer it to a CSV? There are ways to do that.

- [7:06:31](https://www.youtube.com/watch?v=undefined&t=61591s) And in fact, there's a command


within SQLite that allows you to export your data back to a CSV file. If you want to email it to someone
and you want them to be able to open it in Excel, or Google Spreadsheets, or Apple Numbers, or the like,
you can go in the other direction. Generally though, once you're in the world of SQL you're probably
storing your data there long term.
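
The lecture refers to a command inside SQLite itself for exporting, which the transcript does not name. As an alternative, here is a sketch of going from a table back to CSV in Python. It writes to an in-memory buffer so it runs anywhere; swapping the buffer for a real file (for example a hypothetical shows.csv) is left as noted in the comment.

```python
import csv
import io
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.executemany("INSERT INTO shows (title) VALUES (?)",
               [("HOW I MET YOUR MOTHER",), ("THE SOPRANOS",)])

cursor = db.execute("SELECT id, title FROM shows ORDER BY id")

# Stand-in for: open("shows.csv", "w", newline="")
out = io.StringIO()
writer = csv.writer(out)

# cursor.description holds one tuple per column; item 0 is the column name.
writer.writerow([column[0] for column in cursor.description])
writer.writerows(cursor)

exported = out.getvalue()
print(exported)
```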

- [7:06:50](https://www.youtube.com/watch?v=undefined&t=61610s) And you're probably updating it,


maybe deleting it, adding to it, and so forth. For instance, the one command I did not show earlier is,
suppose someone forgot a show. Let's see, did I see this in the output? All right, so Curb Your
Enthusiasm. Saw that last night. It was just, yeah. Did anyone see it last night? No? All right, well, just the
one person that checked that box, so you and me.

- [7:07:10](https://www.youtube.com/watch?v=undefined&t=61630s) What's another show that didn't


make the list? How about Seinfeld? It's now on Netflix, apparently. So insert into shows. What do we
want to insert? Well, we want to insert maybe an ID and a title. But I don't actually care what the ID is,
so I'm just going to insert a title. And the value I'm going to give to that title is going to be, quote,
unquote, "Seinfeld."

- [7:07:34](https://www.youtube.com/watch?v=undefined&t=61654s) And then, let me go ahead and


hit semicolon. Nothing seems to happen, but let me rerun the big query from before looking for
comedies. And unfortunately, Seinfeld has not yet been flagged as a comedy, so let's get this right, too.
What, intuitively, am I going to have to do now to associate Seinfeld with my comedies? I just inserted into
the shows table.

- [7:07:55](https://www.youtube.com/watch?v=undefined&t=61675s) What more needs to happen


before we can flag Seinfeld as a comedy? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Say again?
AUDIENCE: [INAUDIBLE] DAVID J. MALAN: Yeah. So I need to insert into the genres table two things now,
a show ID, like this, and then, the name of the genre, which presumably is comedy. What values do I
want to insert? Well, the show ID, I better grab that.

- [7:08:18](https://www.youtube.com/watch?v=undefined&t=61698s) Oh, I don't even know what it is.


I'm going to have to figure out what that is. So I could do this in a couple of ways. Let me do a one-time
thing. Select star from shows, where title equals, quote, unquote, "Seinfeld," semicolon. It's 159. So now I
could do, insert into genres a show ID and a genre name, the values 159, and, quote, unquote, "comedy"
semicolon, Enter.
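
The two inserts can be sketched in Python as below. In the lecture the new ID (159) is found by hand with a `SELECT`; the sketch shows that lookup and also the cursor's `lastrowid` attribute, which reports the ID that was just assigned. The two pre-existing shows are hypothetical sample rows, so the new ID here is 3 rather than 159.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.execute("CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
           "FOREIGN KEY(show_id) REFERENCES shows(id))")
db.executemany("INSERT INTO shows (title) VALUES (?)",
               [("HOW I MET YOUR MOTHER",), ("THE SOPRANOS",)])

# Insert only a title; SQLite picks the next id on its own.
cursor = db.execute("INSERT INTO shows (title) VALUES (?)", ("Seinfeld",))
new_id = cursor.lastrowid

# The manual route from the lecture: look the id up by title.
looked_up = db.execute(
    "SELECT id FROM shows WHERE title = ?", ("Seinfeld",)).fetchone()[0]

# Second insert: tie the genre to the show through its id.
db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (new_id, "Comedy"))

seinfeld_genres = [row[0] for row in db.execute(
    "SELECT genre FROM genres WHERE show_id = ?", (new_id,))]
print(new_id, looked_up, seinfeld_genres)
```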

- [7:08:47](https://www.youtube.com/watch?v=undefined&t=61727s) And now, if I scroll back in my


history and execute that really big query again, looking for all distinct comedies, now Seinfeld has made
the list. But I did this manually so I didn't actually capitalize it. Let's clean that up. Let's do update. Let's
do update my shows. Set title equals to Seinfeld semicolon.

- [7:09:09](https://www.youtube.com/watch?v=undefined&t=61749s) No? OK, thank you, where title


equals, quote, unquote, "Seinfeld." Let's not make that mistake again. Enter. And now, if I execute that
really big query, now Seinfeld is, indeed, considered a comedy. So where are we going with this? Well,
thus far we've been doing all of this pretty manually.
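
The near miss with `UPDATE` is worth seeing in numbers. The sketch below, on two hypothetical rows, counts how many rows an `UPDATE` touches with and without a `WHERE` clause. It relies on Python's `sqlite3` module holding changes in an open transaction until `commit()`, which is what makes the `rollback()` possible here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.executemany("INSERT INTO shows (title) VALUES (?)",
               [("THE OFFICE",), ("Seinfeld",)])
db.commit()

# Without WHERE, the UPDATE renames every show in the table.
careless = db.execute("UPDATE shows SET title = 'SEINFELD'")
rows_hit_without_where = careless.rowcount
db.rollback()  # undo it; only possible because nothing was committed yet

# With WHERE, only the matching row changes.
careful = db.execute("UPDATE shows SET title = 'SEINFELD' WHERE title = 'Seinfeld'")
rows_hit_with_where = careful.rowcount
db.commit()

final_titles = [row[0] for row in db.execute("SELECT title FROM shows ORDER BY id")]
print(rows_hit_without_where, rows_hit_with_where, final_titles)
```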

- [7:09:26](https://www.youtube.com/watch?v=undefined&t=61766s) And this is absolutely what an


analyst, a data scientist type person might do if just manipulating a pretty large data set just to get at
interesting answers that might be across one, two, or even many more tables. Eventually, in a few weeks,
we're going to start to automate all of this by writing code in Python that generates SQL to do this.
- [7:09:42](https://www.youtube.com/watch?v=undefined&t=61782s) If you go to most any website on
the internet today, and you, for instance, log in, odds are you're typing a username and password,
clicking Submit. What's then happening? Well, the website might not be implemented in Python but it's
probably implemented in some language, Python, JavaScript, Java, Ruby, something else.

- [7:09:58](https://www.youtube.com/watch?v=undefined&t=61798s) And that language is probably


using something like a relational database to use SQL to get your username, get your password, and
compare the two against what you've typed in. And actually, it's hopefully not getting your actual
password, but something called the hash thereof. But there's probably a database involved doing that.

- [7:10:16](https://www.youtube.com/watch?v=undefined&t=61816s) When you buy something on


Amazon.com and you click Check Out, odds are there's some code on Amazon's server that's looking at
what it is you added to your shopping cart, and then maybe using a for loop of some sort, in Python or
another language. It's doing a whole bunch of SQL inserts to store in their database what it is you
bought.

- [7:10:35](https://www.youtube.com/watch?v=undefined&t=61835s) There's other types of databases,


too, but SQL databases, or relational databases are quite popular. So let's go ahead and write one other
program here in Python that now merges these two languages together, whereby I'm going to use SQL
inside of a Python program so I can implement my logic of my program in Python, step-by-step, line-by-
line.

- [7:10:55](https://www.youtube.com/watch?v=undefined&t=61855s) But when I want to get at some


data I can actually talk to a SQL database. So let me go ahead and open favorites.py. And let me go ahead
and throw away some of what we did earlier and really just now add SQL to the mix. From the CS50
library, let's import the SQL function. This will be useful to use because most third-party libraries that
deal with SQL and Python are more complicated than they need to be.

- [7:11:23](https://www.youtube.com/watch?v=undefined&t=61883s) So I think you'll find this library


easier to use. Let's then do the following. Create a variable called db for database. But I could call it
anything I want. Let's use that URI, which is a fancy way of saying something that looks like a URL,
but that actually opens up a database on disk, that is, in the current folder.

- [7:11:44](https://www.youtube.com/watch?v=undefined&t=61904s) Let's now ask the user for a title


by prompting them for a, quote, unquote, "title" like this. And let's strip off any whitespace just so that
the data is not messy. And then, let's go ahead and do this. And this is the new logic. I'm going to go
ahead now and write a line of code that uses Python to talk to the original favorites.db.

- [7:12:05](https://www.youtube.com/watch?v=undefined&t=61925s) So again, I'm not using the two-


table database, which is in favorites8.db. I'm using the original that we imported from your own data,
and I'm going to do the following. I'm going to use db.execute to execute a SQL command inside of
Python. I'm going to select the count of shows from the favorites table, where the title the user typed in
is like this question mark.

- [7:12:35](https://www.youtube.com/watch?v=undefined&t=61955s) And why I'm doing that is as


follows. Just like in C, when we had percent S, in SQL for now, the analogue is going to be a question
mark. So same idea, different syntax. Instead of percent S, it's just a question mark. And using a comma
outside of this first string, using CS50's execute function I can pass in a SQL string, a command, then any
arguments I want to plug into the question marks therein.

- [7:12:59](https://www.youtube.com/watch?v=undefined&t=61979s) So the goal at hand is to actually


write a program that's going to search favorites.csv, a.k.a., favorites.db for the total number of people
that liked a particular show. So this is going to select the count of people from the favorites table where
the title they typed in is like whatever the user has just now typed in.

- [7:13:19](https://www.youtube.com/watch?v=undefined&t=61999s) This db execute function returns a


list. It returns a list of rows. And you would only know that by my telling you or reading the
documentation. And therefore, if I want to get back to the total count, I'm going to go ahead and grab
the first row from those rows. Because it's only going to give me back the count.

- [7:13:37](https://www.youtube.com/watch?v=undefined&t=62017s) And then I'm going to go ahead


and print out that row's first value. But it's going to be a little weird. Technically the column is going to be
called, quote, unquote, "COUNT(*)," which is a little weird. Let me add one more feature to the mix. You
can actually give nicknames to columns that are coming back, especially if they are the result of
functions like this.

- [7:13:55](https://www.youtube.com/watch?v=undefined&t=62035s) I can just call that column counter,


in all lowercase. That means I can now say get back the counter key inside of this dictionary. So just to
recap, what have we done? We've imported the CS50 library SQL function. We've, with this line of code,
opened the favorites.db file that you and I created earlier by importing your CSV into SQLite.

- [7:14:20](https://www.youtube.com/watch?v=undefined&t=62060s) I'm now just asking the user for a


title they want to search for. I'm now executing this SQL query on that database, plugging in whatever
the human typed in as their title in order to get back a total count. And I'm giving the count a nickname,
an alias of counter, just so it's more self-explanatory.

- [7:14:39](https://www.youtube.com/watch?v=undefined&t=62079s) This function, db execute, no


matter what, always returns a list of rows, even if there's only one row inside of it. So this line of code
just gives me the first and only row. And then, this goes inside of that row, which it turns out is a
dictionary, and gives me the key counter and the value it corresponds to.
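
The same steps can be sketched without the CS50 library, using only Python's standard `sqlite3` module. That substitution is an assumption made so the example runs with no extra installs: setting `row_factory` to `sqlite3.Row` gives rows that can be indexed by column name, much like the dictionaries described above. The three sample rows and the `count_title` helper are hypothetical.

```python
import sqlite3


def count_title(db, title):
    """Return how many favorites have a title LIKE the given one."""
    rows = db.execute(
        # The ? is a placeholder; the value is passed separately.
        "SELECT COUNT(*) AS counter FROM favorites WHERE title LIKE ?",
        (title.strip(),),
    ).fetchall()
    row = rows[0]           # always a list, here with exactly one row
    return row["counter"]   # the alias makes the key readable


db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # rows support access by column name
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")
db.executemany("INSERT INTO favorites (title, genres) VALUES (?, ?)", [
    ("The Office", "Comedy"),
    ("the office", "Comedy"),
    ("Friends", "Comedy"),
])

office_count = count_title(db, "  The Office ")
print(office_count)
```

In an interactive program the title would come from `input("Title: ")` rather than a hard-coded string.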

- [7:14:59](https://www.youtube.com/watch?v=undefined&t=62099s) So what, to be clear, is this doing?


Let's go ahead and run this manually in my Terminal window first. Let me run SQLite3 on favorites-- Well,
let's do this. On favorites.db, let me import the data again. So .mode csv, then .import from favorites.csv into
a favorites table. So I've just recreated the same data set that you all gave me earlier in favorites.db.

- [7:15:25](https://www.youtube.com/watch?v=undefined&t=62125s) If I were to do this manually, let's


search for The Office again. Select, count star from favorites, where title like, and let's just manually type
it in for now, The Office. We'll search for the one with the word The, semicolon. I get back 12. But
technically, notice what I get back. I technically get back a miniature table containing one column and
one row.

- [7:15:50](https://www.youtube.com/watch?v=undefined&t=62150s) What if I want to rename that


column? That's where the as keyword comes in. So select count star as counter. Notice what happens,
Enter. I just get back-- same simple table, but I've renamed the column to be counter just because it's a
little more self-explanatory as to what it is. So what am I doing with this line of code? This line of code is
returning to me that miniature temporary table in the form of a list of dictionaries.

- [7:16:16](https://www.youtube.com/watch?v=undefined&t=62176s) The list contains one row, as we'll


see, and it contains one column, as we'll see, the key for which is counter. So let's now run the code
itself. I'm going to get out of SQLite3 and I'm going to run Python of favorites.py. Enter. I'm being
prompted for a title. I'm going to type in The Office and cross my fingers, and there's that 12.

- [7:16:39](https://www.youtube.com/watch?v=undefined&t=62199s) Why is it 12? Well, there's a typo


again because I re-imported the CSV. I had deleted two of the Thes, so we're back at the original data
set. So there's 12 total that have, quote, unquote, "The Office" in the title like that. So what have we
done? We've combined some Python with some SQL, but we've relegated to SQL all of the complexity of
searching for something, the selecting of something, and gotten rid of the with keyword, the open
function, the for loop, the reader, the DictReader, and all of that.

- [7:17:07](https://www.youtube.com/watch?v=undefined&t=62227s) And it's just one line of SQL now,


using the best of both worlds. All right, any questions on what we've just done here or how any of this
works? Any questions here? Yeah? AUDIENCE: [INAUDIBLE] DAVID J. MALAN: When does this function
return more than one row? Was that the question? AUDIENCE: Yeah. DAVID J. MALAN: Yeah.

- [7:17:31](https://www.youtube.com/watch?v=undefined&t=62251s) So let's do that by changing the


problem at hand. This program was designed just to select the total count. Let's go ahead and select, for
instance, all of the ways you all typed in The Office by selecting the title this time. If I do this in SQLite3,
let me go ahead and do this again after increasing my Terminal window.

- [7:17:55](https://www.youtube.com/watch?v=undefined&t=62275s) Let's do it manually. Select title


from favorites, where the title is like, quote, unquote, "The Office," semicolon. I get back all of these
different rows, and we didn't even notice this one. There's actually another little typo in there with some
capitalization of the E, and the C, and the E.

- [7:18:13](https://www.youtube.com/watch?v=undefined&t=62293s) That would be an example of a


query that therefore gives me back multiple rows. So let's now change my Python program. If I now,
in my Python program, do this, I get back a whole bunch of rows containing all of those titles. I can now
do, for row in rows, I can print out the current row's title, and now manipulate all of those things
together.

- [7:18:35](https://www.youtube.com/watch?v=undefined&t=62315s) Let me keep both on the screen.


Let me run Python of favorites.py. And that for loop now should iterate, what, 10 or more times, once for
each of those titles. And indeed, if I type in The Office again, Enter. Whoops. Row title. What did I do
wrong? Oh, I should not be renaming title to counter this time. So that's just a dumb mistake on my part.

- [7:18:57](https://www.youtube.com/watch?v=undefined&t=62337s) Let me rerun it again. And now I


should see after typing in The Office, Enter, a whole bunch of The Offices. And because I'm using like,
even the miscapitalizations are coming through, because LIKE is case-insensitive. It doesn't matter if
it's uppercase or lowercase. Whereas had I used the equal sign I would get back only the same ones
capitalized correctly.
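
The difference between `LIKE` and `=` when several rows come back can be sketched as follows, again with the standard `sqlite3` module standing in for the CS50 library and with hypothetical sample titles.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE favorites (title TEXT)")
db.executemany("INSERT INTO favorites (title) VALUES (?)", [
    ("The Office",), ("the office",), ("ThE OffiCE",), ("Friends",),
])

# LIKE ignores ASCII case in SQLite, so all three spellings match.
like_titles = []
for row in db.execute("SELECT title FROM favorites WHERE title LIKE ?",
                      ("The Office",)):
    like_titles.append(row["title"])

# = is an exact comparison, so only the identical spelling matches.
equal_titles = []
for row in db.execute("SELECT title FROM favorites WHERE title = ?",
                      ("The Office",)):
    equal_titles.append(row["title"])

print(like_titles)
print(equal_titles)
```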
- [7:19:17](https://www.youtube.com/watch?v=undefined&t=62357s) All right, any questions on this
next? All right, so let's transition to a larger, juicier data set, and consider some of the issues that arise
when actually now using SQL and skating toward a world in which we're using SQL for mobile apps, web
apps, and generally speaking, very large data sets. So let's start with a larger data set just like that.

- [7:19:39](https://www.youtube.com/watch?v=undefined&t=62379s) Give me just a moment to switch


screens over to what we have for you today, which is an actual relational database that we've created
out of a real-world data set from IMDb. So InternetMovieDatabase.com is a website where you can
search for TV shows, and movies, and actors, and so forth, all using their database behind the scenes.

- [7:20:00](https://www.youtube.com/watch?v=undefined&t=62400s) IMDb wonderfully makes their


data set available as not CSV files, but TSV files, tab-separated values. And so what we did is, before class
we downloaded those TSV files. We wrote a Python program similar to my favorites8.py file earlier that
read in all of those TSV files, created some SQL tables in an IMDb database for you in SQLite that has
multiple tables and multiple columns.

- [7:20:29](https://www.youtube.com/watch?v=undefined&t=62429s) So let's go and wrap our minds


around what's actually in this data set. Let me go back to VS Code here, and in just a moment, I'm going
to go ahead and copy the file, which we've named shows.db. And I'm going to go ahead and increase my
Terminal and do SQLite3 of shows.db. Whenever playing around with a SQLite database for the first time,
typing

- [7:20:52](https://www.youtube.com/watch?v=undefined&t=62452s) .schema is perhaps a good place


to start to give you a sense of what's in there. And things just escalated quickly. There's a lot in this data
set, because, indeed, there's going to be tens or hundreds of thousands of rows in this data set, and also in
problem set 7, where we'll look at the movie side of things and not just the TV shows. So what is the
schema that we have created for you from IMDb's actual real-world data? One, there's a table called
shows.

- [7:21:14](https://www.youtube.com/watch?v=undefined&t=62474s) And notice we've just added


whitespace by hitting Enter a bunch of times to make it a little more stylistically readable. The shows
table has an ID column, a title column, a year, and the total number of episodes for a given show. And
the types of those columns are integer, text, numeric, and integer.

- [7:21:31](https://www.youtube.com/watch?v=undefined&t=62491s) So it turns out there's actually a


few different data types that are worth being aware of when it comes to creating tables themselves. In
fact, in SQLite there's five data types, and only five, fortunately, one of which is, indeed, integer, negative
or positive, numeric, which is kind of a catchall for dates and times, things that are numeric but are not
just integers, and not just real numbers, for instance.

- [7:21:55](https://www.youtube.com/watch?v=undefined&t=62515s) Real number is what we've


generally thought of as float up until now. Text, of course, is just text, but notice that you don't have to
worry about how big it is. Like in Python, it will size to fit. And then there's BLOB, which is binary large
object, which is for just raw 0s and 1s, like for files or things like that.
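A minimal sketch of those five column types, using Python's built-in sqlite3 module rather than the sqlite3 terminal program from the lecture. The table name `demo` and its one-letter column names are made up for illustration; `typeof()` is SQLite's own function for reporting how a value ended up being stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# One column per declared type named in the lecture (hypothetical table).
conn.execute("CREATE TABLE demo (i INTEGER, n NUMERIC, r REAL, t TEXT, b BLOB)")
conn.execute(
    "INSERT INTO demo (i, n, r, t, b) VALUES (?, ?, ?, ?, ?)",
    (188, "2005", 0.021, "The Office", b"\x00\x01"),
)

# Ask SQLite how each value was actually stored.
row = conn.execute(
    "SELECT typeof(i), typeof(n), typeof(r), typeof(t), typeof(b) FROM demo"
).fetchone()
print(row)  # ('integer', 'integer', 'real', 'text', 'blob')
```

Note that the string "2005" went into the NUMERIC column and came back stored as an integer, which is the catchall behavior described above.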

- [7:22:11](https://www.youtube.com/watch?v=undefined&t=62531s) But we'll generally use the other


four of these. And so, indeed, when we imported this data for you we decided that every show would be
given an ID, which is just an integer. Every show has, of course, a title, which should not be null.
Otherwise, why is it in the database? Every show has a year, which is numeric according to that
definition a moment ago.

- [7:22:32](https://www.youtube.com/watch?v=undefined&t=62552s) And the total number of episodes


for a show is going to be an integer. What now is with these primary keys that we mentioned earlier,
too? A primary key is the column that uniquely identifies each row of the data. In our case, with the favorites, I
automatically gave each of your submissions a unique ID so that even if two or more of you typed in The
Office, your submission still had a unique identifier, a number that allowed me to then correlate it with
your genres, just as we saw a moment ago.
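As a rough sketch, the shows table described here could be created as follows. The column names follow the lecture's description, but the exact spelling, the ID values, and the second row are assumptions for illustration. The two rejected inserts show what the primary key and the "should not be null" rule buy you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE shows (
        id INTEGER,
        title TEXT NOT NULL,
        year NUMERIC,
        episodes INTEGER,
        PRIMARY KEY(id)
    )
    """
)

insert = "INSERT INTO shows (id, title, year, episodes) VALUES (?, ?, ?, ?)"
conn.execute(insert, (1, "The Office", 2005, 188))
# Same title again is fine; only the ID has to be unique (illustrative row).
conn.execute(insert, (2, "The Office", 2001, None))

errors = []
for bad_row in [(1, "Reused ID", 2020, 1), (3, None, 2020, 1)]:
    try:
        conn.execute(insert, bad_row)
    except sqlite3.IntegrityError as e:
        errors.append(str(e))  # duplicate primary key, then missing title

count = conn.execute("SELECT COUNT(*) FROM shows").fetchone()[0]
print(errors, count)
```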

- [7:23:02](https://www.youtube.com/watch?v=undefined&t=62582s) In this version of IMDb, there's


also genres. But they don't come from us, they come from IMDb.com. And so the genres table has a show ID and
a genre, just like our database. But these are real-world genres with a bit more filtration. Notice, though,
just like my version, there's a foreign key. A foreign key is the appearance of another table's primary key
in its own table.

- [7:23:27](https://www.youtube.com/watch?v=undefined&t=62607s) So when you have a table like


genres, which is somehow cross referencing the original shows table, if shows have a primary key called
ID, and those same numbers appear in the genres table under the column called show ID, by definition,
show ID is a foreign key. It's the same numbers but it's foreign in the sense that the number is being used
in this table, even though it's officially defined primarily in this other table.

- [7:23:54](https://www.youtube.com/watch?v=undefined&t=62634s) This is what we mean by


relational databases. You have multiple tables with some column in common, numbers typically. And
those numbers allow you to line the two tables up in such a way that you can reconnect the shows with
their genres, just like we did with our smaller data set a moment ago. This logic is extended further.
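Here is a small sketch of that primary key and foreign key relationship. The column names `show_id` and `genre`, and the sample rows, are assumptions. One detail worth knowing: SQLite only enforces foreign keys if you turn that on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (
        show_id INTEGER NOT NULL,
        genre TEXT NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id)
    );
    INSERT INTO shows (id, title) VALUES (1, 'The Office');
    INSERT INTO genres (show_id, genre) VALUES (1, 'Comedy');
    """
)

# Line the two tables up on the shared number.
genres = [
    g
    for (g,) in conn.execute(
        "SELECT genre FROM genres WHERE show_id = (SELECT id FROM shows WHERE title = ?)",
        ("The Office",),
    )
]

# A genre pointing at a show ID that does not exist is rejected.
try:
    conn.execute("INSERT INTO genres (show_id, genre) VALUES (999, 'Drama')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(genres, orphan_rejected)
```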

- [7:24:14](https://www.youtube.com/watch?v=undefined&t=62654s) Notice that the IMDb database


we've created for you has a stars table, like TV show stars, the actors therein. And that table,
interestingly, has no mention of people and no mention of shows, per se. It only has a column called
show ID, which is an integer, and a person ID, which is an integer. Meanwhile, if we scrolled down to the
bottom, you will see a table called people.

- [7:24:43](https://www.youtube.com/watch?v=undefined&t=62683s) And we have decided in IMDb's


world that every person in the TV show world will have a unique identifier that's a number, a name that's
text, a birth date, which is numeric, and then, again, specifying that ID is going to be their primary key. So
what's going on here? Well, it turns out that TV stars and writers are both types of people.

- [7:25:08](https://www.youtube.com/watch?v=undefined&t=62708s) So using this relational database,


notice the road we're going down. We're factoring out commonalities. And if a person can be different
things in life, well, we're defining them first and foremost as people. And then, notice these two tables
are almost the same. The stars table has a show ID, which is a number, and a person ID, which is a
number, which allows us via this middleman table, if you will, to link people with TV shows.
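A sketch of the middleman idea, assuming column names `show_id` and `person_id`. The show, the person, and all ID numbers below are invented. The person is defined once in people and then referenced by number from both stars and writers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE stars (
        show_id INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id),
        FOREIGN KEY(person_id) REFERENCES people(id)
    );
    CREATE TABLE writers (
        show_id INTEGER NOT NULL,
        person_id INTEGER NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id),
        FOREIGN KEY(person_id) REFERENCES people(id)
    );
    -- Made-up rows: one person who both stars in and writes one show.
    INSERT INTO shows (id, title) VALUES (1, 'Example Show');
    INSERT INTO people (id, name, birth) VALUES (7, 'Alex Example', 1980);
    INSERT INTO stars (show_id, person_id) VALUES (1, 7);
    INSERT INTO writers (show_id, person_id) VALUES (1, 7);
    """
)

stars_rows = conn.execute("SELECT * FROM stars").fetchall()
people_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(stars_rows)    # only integers, no names or titles
print(people_count)  # the person is stored once, not once per role
```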

- [7:25:36](https://www.youtube.com/watch?v=undefined&t=62736s) Similarly, the writers table allows


us to connect shows with people, too, by just recording those numbers. So if we go into this data set,
let's do the following. Let's do select star from people semicolon. So a huge amount of data is coming
back. This is hundreds of thousands of rows now based on the ID numbers alone.

- [7:25:57](https://www.youtube.com/watch?v=undefined&t=62757s) So this is real-world data now


flying across the screen. There's a lot of people in the TV show business, not just actors and writers, but
others as well. It's still going. There's a lot of data there. So my god, if you had to do anything manual in
this data set it's probably not going to work out very well.

- [7:26:12](https://www.youtube.com/watch?v=undefined&t=62772s) And actually, we're up to, what, a


million people in this data set, plus, which would mean this probably isn't even going to open very well
in Excel, or Google Spreadsheets, or Apple Numbers. SQL probably is the better approach here. Let's
search for someone specific, like select star from people, where name equals Steve Carell, for instance,
sticking with comedies.

- [7:26:32](https://www.youtube.com/watch?v=undefined&t=62792s) All right, so there's Steve Carell.


He is person number 136,797, born in 1962. And that's as much data as we have on Steve Carell here.
How do we figure out what shows, for instance, he's in? Well, let's see, select star from shows,
semicolon. There's a crazy number of shows out there in the IMDb database.

- [7:26:53](https://www.youtube.com/watch?v=undefined&t=62813s) And you can see it here again


flying across the screen. Feels like we're going to have to employ some techniques in order to get at all of
Steve Carell's shows. So how are we going to do that? Well, god, this is a lot of data here. And in fact,
yeah, we have, what, 15 million shows plus in this data set, too.

- [7:27:13](https://www.youtube.com/watch?v=undefined&t=62833s) So doing things efficiently is now


going to start to matter. So let's actually do this. Let me select a specific show. Select star from shows
where title equals, quote, unquote, "The Office." And there presumably shouldn't be typos in this data
because it comes from the real website IMDb.com.

- [7:27:29](https://www.youtube.com/watch?v=undefined&t=62849s) Let's get back the show. Turns out


there's been a lot of The Offices out in the world. The one that started in 2005 is the one that we want,
presumably the most popular with 188 episodes. How can we get just that? Maybe we could do and year
equals, how about 2005? All right, so now we've got back just the ID of The Office that we care about.
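The narrowing step just described can be sketched like this. Only the title, the year 2005, and the 188 episodes come from the lecture; the ID numbers and the other two rows are made up so that there is something to filter out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, year NUMERIC, "
    "episodes INTEGER, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO shows (id, title, year, episodes) VALUES (?, ?, ?, ?)",
    [
        (1, "The Office", 2001, None),      # illustrative same-titled row
        (2, "The Office", 2005, 188),
        (3, "Some Other Show", 2005, None),  # illustrative same-year row
    ],
)

# Title alone matches more than one show.
by_title = conn.execute(
    "SELECT * FROM shows WHERE title = ?", ("The Office",)
).fetchall()

# Adding AND year = ... narrows it to the one we want.
by_title_and_year = conn.execute(
    "SELECT * FROM shows WHERE title = ? AND year = ?", ("The Office", 2005)
).fetchall()

print(len(by_title), by_title_and_year)
```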

- [7:27:52](https://www.youtube.com/watch?v=undefined&t=62872s) And let's do this, too. Let me turn


on a timer within SQLite just to get a sense of running time now. Let me do that again. Select star from
shows, where title equals The Office, and year equals 2005. And let's keep it simple. Let's just do titles
for now. Enter. All right, so not terribly long. It found it pretty fast, but it looks like it took how much real
time? 0.02 seconds, not bad for just a title.

- [7:28:15](https://www.youtube.com/watch?v=undefined&t=62895s) But just to plant a seed, it turns


out that we can probably speed even this up. Let me do this. Let me create something called an index,
which is another use of the C in CRUD for creating something. And I'm going to call this title index. And
I'm going to create it on the shows table, specifically on the title column.

- [7:28:35](https://www.youtube.com/watch?v=undefined&t=62915s) And we'll see in a moment what


this is going to do for me. Enter. Took a moment, like 0.349 seconds, to create something called an index.
But now watch, if I select star from shows searching for The Office again, previously it took me 0.021
seconds. Not bad, but now, wow. Literally no time at all, or so low that it wasn't really measurable.

- [7:28:57](https://www.youtube.com/watch?v=undefined&t=62937s) And I'll do it again just to get a


sense of things. Still quite low. Now even though 0.021 seconds, not crazy long, imagine now having a lot
of data, a lot of users running a real website or real mobile app. Every millisecond we can start to shave
off is going to be compelling. So what is it we just did? Well, we actually just created something called an
index.
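A minimal sketch of creating that index, with the name written as `title_index` (the underscore is an assumption about how the spoken "title index" was typed) and with made-up show titles. Rather than timing the query, this asks SQLite for its query plan before and after, which shows the switch from scanning the table to searching the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, year NUMERIC, "
    "episodes INTEGER, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    ((f"Show {i}",) for i in range(100_000)),  # made-up titles
)


def plan(sql, args=()):
    """Return the human-readable steps SQLite says it would take."""
    # The last column of each EXPLAIN QUERY PLAN row is the description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, args)]


query = "SELECT * FROM shows WHERE title = ?"
before = plan(query, ("Show 4242",))

conn.execute("CREATE INDEX title_index ON shows (title)")
after = plan(query, ("Show 4242",))

print(before)  # a SCAN of the whole table
print(after)   # a SEARCH that uses title_index
```

The exact wording of the plan text varies between SQLite versions, so treat the printed strings as a hint rather than a stable format.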

- [7:29:17](https://www.youtube.com/watch?v=undefined&t=62957s) And this is a nice way to tie in,


now, some of our week 5 discussion of data structures, and our week 3 discussion of running times. An
index in a database is some kind of fancy data structure that allows the database to do better than linear
search. Literally, as you just saw, these tables are crazy long or tall right now, very linear, that is.

- [7:29:37](https://www.youtube.com/watch?v=undefined&t=62977s) And so when I first searched for


The Office, it was literally doing linear search, top to bottom, looking at as many as, what, a million plus
rows. That's relatively slow. It's not that slow, 0.021 seconds. But that's relatively slow just theoretically,
algorithmically, doing anything linearly.

- [7:29:54](https://www.youtube.com/watch?v=undefined&t=62994s) But if you instead create an index


using syntax like this, which I just did, creating an index on the title column of the shows table, that's like
giving the database a clue in advance saying, hey, I know I'm going to search on this column in this table
a lot. Do something with data structures to speed things up.

- [7:30:13](https://www.youtube.com/watch?v=undefined&t=63013s) And so if you think back to our


discussion of data structures, maybe it's using a tree. Maybe it's using a trie or a hash table. Some fancier
two-dimensional data structure is generally going to lift the data up, creating, maybe, a tree
structure. So it's just much faster to find data, especially if it's sorting it now based on title, and not just
storing it in one long list.

- [7:30:33](https://www.youtube.com/watch?v=undefined&t=63033s) And in fact, in the world of


relational databases, the type of structure that's often used in a database is something called a B-tree.
It's not a binary tree. Different use of the letter B, but it looks a little something like the trees we've seen.
It's not binary because some of the nodes might have more than two children or fewer, but it's a very
wide but relatively shallow tree.

- [7:30:54](https://www.youtube.com/watch?v=undefined&t=63054s) It's not very tall. And the upside


of that is that if your data is stored in this tree, the database can find it more quickly. And the reason it
took half a second, a third of a second to build the index is because SQLite needed to take some non-
zero amount of time to just build up this tree in memory.

- [7:31:13](https://www.youtube.com/watch?v=undefined&t=63073s) And it has algorithms for doing so


based on alphabetization or other techniques. But you spend a bit of time up front, a third of a second.
And then thereafter, wow. Every subsequent query, if I keep doing it again and again, is going to be crazy
low, 0.000, maybe 0.001. But an order of magnitude, a factor of 10 or 100 faster than it previously was
earlier.
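To get a feel for why sorted, tree-like storage beats one long list, here is a sketch that counts steps instead of seconds. It is not SQLite's B-tree: it uses plain binary search over a sorted Python list as a simpler stand-in, and the titles are made up.

```python
def linear_steps(items, target):
    """Count how many items a top-to-bottom search looks at."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_steps(sorted_items, target):
    """Count how many items binary search looks at in a sorted list."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


# Zero-padded so that alphabetical order matches numeric order.
titles = [f"Show {i:06d}" for i in range(100_000)]
target = titles[-1]  # worst case for the linear search

linear = linear_steps(titles, target)
binary = binary_steps(titles, target)
print(linear, binary)
```

The linear search has to look at every one of the 100,000 titles, while the binary search needs at most 17 looks, which is the same kind of gap the timer showed.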

- [7:31:36](https://www.youtube.com/watch?v=undefined&t=63096s) So we have these indexes which


allow us to get at data faster. But what if we want to actually get data that's now across these multiple
tables? How can we do that? And how might these indices or indexes help further? Well, it turns out
there is a way that we've seen already indirectly to join two tables together.

- [7:31:55](https://www.youtube.com/watch?v=undefined&t=63115s) Previously, when I selected the ID


of The Office, and then I searched for it in the other table using select in a nested query, I was joining
two tables together. And it turns out there's a couple of ways to do this. Let's go ahead now and, for
instance, find all of Steve Carell's TV shows. Not just The Office but all of them, too.

- [7:32:14](https://www.youtube.com/watch?v=undefined&t=63134s) Unfortunately, if we look at our


schema, shows up here have no mention of TV-- oh, shows over here has no mention of the TV stars in
them. And people have no mention of shows. We somehow need to use this table here to connect the
two. And this is called a join table, in the sense that using two integer columns-- it joins the two tables
together logically.

- [7:32:43](https://www.youtube.com/watch?v=undefined&t=63163s) And so if you're savvy enough


with SQL, you can do what I did with my hands earlier and like recombine tables by using these common
IDs, these integers together. So let me do this. Let me go ahead and figure out, step-by-step, Steve
Carell's shows. So how am I going to do this? Well, if I select star from people, where name equals Steve
Carell, fortunately, there's only one of them.

- [7:33:06](https://www.youtube.com/watch?v=undefined&t=63186s) So this gives me back his name,


his ID, and his birth year. But it's really only his ID that I care about. Why? Because in order to get back
his shows, I need to link person ID with show ID. So I need to know his ID number. So what could I do
with this? Well, remember the schema and the stars table. I've just gotten, from the people table, Steve
Carell's ID.

- [7:33:33](https://www.youtube.com/watch?v=undefined&t=63213s) I bet by transitivity I could now


use his person ID, his ID, to get back all of his show IDs. And then once I've got all of his show IDs, I can
take it one step further and get back all of his shows' titles. So the answer is actually English words and
not just random, seemingly, integers. So let me go ahead and do this.

- [7:33:53](https://www.youtube.com/watch?v=undefined&t=63233s) Let me, again, get Steve Carell's ID


number, but not star. Star represents everything. It's a wildcard character in SQL. Let me just select the
ID of Steve Carell. And that gives me back 136,797. And it's only giving me back one value. The thing
called ID is just the column heading up above. Now, suppose I want to select all of the show IDs that
Steve Carell is affiliated with.

- [7:34:19](https://www.youtube.com/watch?v=undefined&t=63259s) Let me select Show ID from stars,


where the person ID in stars happens to equal Steve Carell's ID. So again, I'm building up my answer in
reverse and taking these baby steps. On the right, in parentheses, I'm getting Steve Carell's ID. On the
left, I am now selecting all of the show IDs that have some connection with that person ID in the stars
table.

- [7:34:45](https://www.youtube.com/watch?v=undefined&t=63285s) This answer, too, is not going to


be that illuminating. It's just a whole bunch of integers that have no meaning to me as a human. But let's
take this one step further. And even though my code is getting long, I could hit Enter and format it nicely,
especially if I were doing this in a code file. But I'm just doing it interactively for now.

- [7:35:01](https://www.youtube.com/watch?v=undefined&t=63301s) Let's now select all of the titles


from the shows table, where the ID of the show is in this following previous query. So again, the query is
getting long. But notice, it's the third and last step. Select title from the shows table, where the ID of the
show is in the list of all of the show IDs that came back from the stars table searching for Steve Carell's
person ID.

- [7:35:27](https://www.youtube.com/watch?v=undefined&t=63327s) How did we get that person ID?


Let me scroll to the end. Well, I selected, in my innermost parentheses, Steve Carell's own ID. So now,
when I hit Enter, voila. I get all of Steve Carell's TV shows up until now. And if I want to tidy this up
further, I can use the same tricks as before. Order by title, semicolon.

- [7:35:48](https://www.youtube.com/watch?v=undefined&t=63348s) Now I've got it all alphabetized as


before. So again, with SQL comes the ability to search-- I mean, look how quickly we do this, 0.094
seconds to search across three different tables to get back this answer. But my data is now all neatly
designed in individual tables, which is going to be important now that the data set is so large.
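The three baby steps can be sketched as one nested query. This runs against a tiny stand-in database, not the real shows.db: Steve Carell's ID and birth year are the ones read out in the lecture, while the show IDs, the second person, and the column spellings `show_id` and `person_id` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people (id, name, birth) VALUES
        (136797, 'Steve Carell', 1962), (2, 'Someone Else', NULL);
    INSERT INTO shows (id, title) VALUES
        (1, 'The Office'), (2, 'Space Force'), (3, 'The Morning Show'),
        (4, 'Unrelated Show');
    INSERT INTO stars (show_id, person_id) VALUES
        (1, 136797), (2, 136797), (3, 136797), (4, 2);
    """
)

titles = [
    t
    for (t,) in conn.execute(
        """
        SELECT title FROM shows WHERE id IN
            (SELECT show_id FROM stars WHERE person_id =
                (SELECT id FROM people WHERE name = ?))
        ORDER BY title
        """,
        ("Steve Carell",),
    )
]
print(titles)  # ['Space Force', 'The Morning Show', 'The Office']
```

Reading from the innermost parentheses outward gives the same order as the walkthrough: name to person ID, person ID to show IDs, show IDs to titles.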

- [7:36:07](https://www.youtube.com/watch?v=undefined&t=63367s) But let me take this one step


further. Let me go ahead and do this. Let me go ahead and point out that with this query, notice that I'm
searching on-- let's say I'm searching on a person ID here. And at the end here, I'm searching on a name
column here. So let me actually go ahead and do this.

- [7:36:31](https://www.youtube.com/watch?v=undefined&t=63391s) Let me go ahead and see if we


can't speed this up. This query at the moment takes 0.092 seconds. Let's see if we can't speed this up
further by just quickly creating a few more of those B-trees in the database's memory. Create an index
called person index, and I'm going to do this on the stars table on the person ID column.

- [7:36:51](https://www.youtube.com/watch?v=undefined&t=63411s) Enter. It's taking a moment, taking


a moment. That's almost a full second because that's a big table. Let's create another index called show
index on the stars table. Why? Because I want to search by the show ID also. That was part of my big
query. Takes a moment. OK, just more than about 2/3 of a second.

- [7:37:09](https://www.youtube.com/watch?v=undefined&t=63429s) Now let's create one last one,


another index called name index, but I could call these things anything I want, on the people table. Why?
Because I'm also searching on the name column. So in short, I'm creating indexes on each of the columns
that are somehow involved in my search query, going from one table to the other.

- [7:37:25](https://www.youtube.com/watch?v=undefined&t=63445s) Now let's go back to the previous


query, which, recall, took-- I think I erased it, 0.091. All right. Well, it was roughly this order of
magnitude. We're not seeing the data now. But let me go ahead and run my original big query once. And
boom, we're down to almost nothing. So again, creating these indexes in memory has the effect of
rapidly speeding up our computation time.
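A sketch of those three statements, with the index names written as `person_index`, `show_index`, and `name_index` (the underscores and the column spellings are assumptions). The tables here are empty stand-ins, so this only shows the syntax and confirms the indexes exist; it does not reproduce the timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);

    -- One index per column that the big query filters on.
    CREATE INDEX person_index ON stars (person_id);
    CREATE INDEX show_index ON stars (show_id);
    CREATE INDEX name_index ON people (name);
    """
)

# sqlite_master lists everything in the schema, including indexes.
index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
]
print(index_names)  # ['name_index', 'person_index', 'show_index']
```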

- [7:37:53](https://www.youtube.com/watch?v=undefined&t=63473s) Now if you've ever used, for


instance, the my.harvard course shopping tool here on campus, or Yale's analogue, you might wonder,
why is the thing so slow? This could be one of the reasons why large data sets with thousands of rows,
thousands of courses tend to be slow, if, and I'm only conjecturing, if the database isn't properly indexed.

- [7:38:10](https://www.youtube.com/watch?v=undefined&t=63490s) If you're building your own web


application and you're finding that users are waiting and waiting, and things are spinning and spinning,
what might be among the problems? Well, it could absolutely just be bad algorithms and bad code that
you wrote. Or it might be that you haven't thought about, well, what column should be optimized for
searches and filtration like I've done here in order to speed up subsequent queries? Again, from the
outside in, we can only conjecture.

- [7:38:33](https://www.youtube.com/watch?v=undefined&t=63513s) But ultimately, this is just one of


the things that explains performance problems as well. All right, let's point out just a couple of final
syntactic things, and then we'll consider, bigger picture, some problems that might arise in this world. If
these nested, nested queries start to get a little much, there are other ways, just so you've seen it, that
you can execute similar logic in SQL.

- [7:38:57](https://www.youtube.com/watch?v=undefined&t=63537s) For instance, if I know in advance


that I want to connect Steve Carell to his show IDs and to their titles, we can do something more like
this. Select title from the people table, joined with the stars table on people.ID equals stars.personID. So
what am I doing? New syntax. And again, this is not something you'll have to memorize or ingrain right
away.

- [7:39:25](https://www.youtube.com/watch?v=undefined&t=63565s) But just so you've seen other


approaches, select title from people join stars. This is an explicit way to say, take the people table in one
hand, the stars table in the other hand, and somehow join them as I keep doing with my fingertips here.
How to join them? Join them so that the people, the ID column in the people table lines up with the
person ID in the stars table.

- [7:39:49](https://www.youtube.com/watch?v=undefined&t=63589s) But that's not quite everything. I


could also say, join further on the shows table, where the stars show ID equals the shows ID column. So
what am I doing here? That's saying, go further and join the stars table with the shows table, joining the
show ID column with the ID column. Again, this starts to get a little messy to think about.

- [7:40:17](https://www.youtube.com/watch?v=undefined&t=63617s) But now I can just say, where


name equals, quote, unquote, "Steve Carell." I can do in one query what previously took me three
nested queries and get back the same answers. And I can still add in my order by title to get back the
result. And if I do this a little more neatly, let me type this out a little differently.
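Spread over several lines, the explicit JOIN version might look like the following sketch. It uses the same kind of tiny stand-in data as before: only Steve Carell's ID and birth year come from the lecture, and the remaining IDs, the second person, and the column spellings are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people (id, name, birth) VALUES
        (136797, 'Steve Carell', 1962), (2, 'Someone Else', NULL);
    INSERT INTO shows (id, title) VALUES
        (1, 'The Office'), (2, 'Space Force'), (3, 'The Morning Show'),
        (4, 'Unrelated Show');
    INSERT INTO stars (show_id, person_id) VALUES
        (1, 136797), (2, 136797), (3, 136797), (4, 2);
    """
)

joined_titles = [
    t
    for (t,) in conn.execute(
        """
        SELECT title FROM people
        JOIN stars ON people.id = stars.person_id
        JOIN shows ON stars.show_id = shows.id
        WHERE name = ?
        ORDER BY title
        """,
        ("Steve Carell",),
    )
]
print(joined_titles)  # ['Space Force', 'The Morning Show', 'The Office']
```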

- [7:40:36](https://www.youtube.com/watch?v=undefined&t=63636s) Let me type this out by adding a


new line-- ah, I can't do that here. I'm going to leave it alone for now. We can type it on multiple lines in
other contexts. And let me do one last thing. Do I want to show that? I'm going to show it, but this is not
something you should ingrain just yet either.

- [7:40:54](https://www.youtube.com/watch?v=undefined&t=63654s) Select title from people, stars, and


shows. If you know in advance that you want to do something with all three tables, you can just
enumerate them, one table name after the other. And then you can say where people.ID equals
stars.personID. And now I'm hitting Enter so that it formats a little more readably on my screen.

- [7:41:12](https://www.youtube.com/watch?v=undefined&t=63672s) And stars.showID equals


shows.ID, and lastly, name equals Steve Carell. In short, you specify that you want to select data from all
three of these tables. And then you tell the database how to combine foreign keys with primary keys,
that is, the columns that have those integers in common. If I hit Enter now, I get the same exact results,
ever more so if I also add in an order by title.

- [7:41:41](https://www.youtube.com/watch?v=undefined&t=63701s) Oops. All right. That's why I didn't


want to do this earlier. I have to go back through my history multiple times to actually get back the multi-
line query this time. All right. That was a lot all at once. But this is only to say that, even as we make the
design of the data more sophisticated, and we put some of it over here, some of it over here, some of it
over here so as to avoid duplication of data, weird hacks like putting commas in the data, we can still get
back all of the answers that we might want across these several tables.
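The comma-separated form just shown can be sketched the same way: list all three tables, then say in the WHERE clause which columns must match. As before, this is a small stand-in database with assumed column spellings and mostly made-up IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE people (id INTEGER, name TEXT NOT NULL, birth NUMERIC, PRIMARY KEY(id));
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
    INSERT INTO people (id, name, birth) VALUES
        (136797, 'Steve Carell', 1962), (2, 'Someone Else', NULL);
    INSERT INTO shows (id, title) VALUES
        (1, 'The Office'), (2, 'Space Force'), (3, 'The Morning Show'),
        (4, 'Unrelated Show');
    INSERT INTO stars (show_id, person_id) VALUES
        (1, 136797), (2, 136797), (3, 136797), (4, 2);
    """
)

implicit_titles = [
    t
    for (t,) in conn.execute(
        """
        SELECT title FROM people, stars, shows
        WHERE people.id = stars.person_id
        AND stars.show_id = shows.id
        AND name = ?
        ORDER BY title
        """,
        ("Steve Carell",),
    )
]
print(implicit_titles)  # ['Space Force', 'The Morning Show', 'The Office']
```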

- [7:42:10](https://www.youtube.com/watch?v=undefined&t=63730s) And using indexes, we can


significantly speed up these processes so as to handle 10 times as many, a 100 times as many users on
the same actual database. There is going to be a downside. And thinking back to our discussion of
algorithms and data structures in past weeks, what might be a downside of creating these indexes?
Because as of now, I created four separate indexes on the name column, the title column, and some
other columns, too.

- [7:42:35](https://www.youtube.com/watch?v=undefined&t=63755s) Why wouldn't I just go ahead and


index everything if it's clearly speeding things up? Memory, so space. Any time you're starting to benefit
time wise in computer science, odds are you're sacrificing space, or vice versa. And probably indexing
absolutely everything is a little dumb because you're going to waste way more space than you might
actually need.
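One rough way to see that space cost is to write a database to disk and compare the file size before and after creating an index. This sketch uses a temporary file and made-up titles; the exact byte counts will differ from machine to machine, so it prints them without promising particular numbers.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical file name
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
    conn.executemany(
        "INSERT INTO shows (title) VALUES (?)",
        ((f"Made-up show {i}",) for i in range(50_000)),
    )
    conn.commit()
    size_before = os.path.getsize(path)

    # The index stores its own sorted copy of every title.
    conn.execute("CREATE INDEX title_index ON shows (title)")
    conn.commit()
    size_after = os.path.getsize(path)
    conn.close()

print(size_before, size_after)
```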

- [7:42:55](https://www.youtube.com/watch?v=undefined&t=63775s) So figuring out where the right


inflection point is is part of the process of designing and just getting better at these things. Now
unfortunately, a whole lot of things can go wrong in this world, and they continue to in the real world
with people using SQL databases. And in fact, here on out, if you're reading something technical about
SQL databases, and websites being hacked in some form, and passwords leaking out, unfortunately, all
too often it is because of what are called SQL injection attacks.

- [7:43:23](https://www.youtube.com/watch?v=undefined&t=63803s) And just to give you a sense now


to counterbalance, maybe [INAUDIBLE] enthusiasm for like, oh, that was neat how we can do things so
quickly. With great power comes responsibility in this world, too. And so many people introduce bugs
into their code by not quite appreciating how it is the data is getting into your application.

- [7:43:43](https://www.youtube.com/watch?v=undefined&t=63823s) So what do I mean by that? Here,


for instance, is a typical login screen for Yale. And here's the analogue for Harvard where you're
prompted, every day probably, for your username and your password, your email address and your
password here. Suppose, though, that behind this login page, whether Harvard's or Yale's, there's some
website.

- [7:44:00](https://www.youtube.com/watch?v=undefined&t=63840s) And that website is using SQL


underneath the hood to store all of the Harvard or Yale people's usernames, passwords, ID numbers,
courses, transcripts, all of that stuff. So there's a SQL database underneath the website. Well, what might
go wrong with this process? Unfortunately, there's some special syntax in SQL just like there is in C and
Python.

- [7:44:20](https://www.youtube.com/watch?v=undefined&t=63860s) For instance, there are comments


in SQL, too. If you do two hyphens, dash, dash, that's a comment in SQL. And if you, the programmer,
aren't sufficiently distrustful of your users, such that you defend against potentially adversarial attacks,
you might do something like this. Suppose that I somewhat maliciously or curiously log in by typing my
username, malan@harvard.edu,

- [7:44:44](https://www.youtube.com/watch?v=undefined&t=63884s) and then maybe a single


quote and a dash, dash. Why? Because I'm trying to suss out if there is a vulnerability here to a SQL
injection attack. Do not do this in general. But if I were the owner of the website trying to see if I've
made any mistake, I might try using potentially dangerous characters in my input. Dangerous how?
Because single quote is used for quoting things in SQL, as we've seen-- single quotes or double quotes.

- [7:45:07](https://www.youtube.com/watch?v=undefined&t=63907s) Dash, dash, I claim now, is used


for commenting. But let's now imagine what the code underneath the hood might be for something like
Yale's login or Harvard's login. What if it's code that looks like this? So let me read it from left to right.
Suppose that they are using something like CS50's own execute function, and they've got some SQL
typed into the website that says select star from users, where username equals this, and password
equals that.

- [7:45:34](https://www.youtube.com/watch?v=undefined&t=63934s) And they're plugging in username


and password. So what am I doing here? Well, when the user types their username and password, hits Enter,
I probably want to select that user from my database to see if the username and passwords match. So
the underlying SQL might be, select star from users, where username equals question mark, and
password equals question mark.

- [7:45:53](https://www.youtube.com/watch?v=undefined&t=63953s) Users is the table. One column is


username. One column is password. All right. And if we get back one row, presumably
malan@harvard.edu exists with that password. We should let him proceed from there on out. So that's
some pseudo code, if you will, for this scenario. What if, though, this code is not as well written as it
currently is, and isn't using question marks? So the question mark syntax is a fairly common SQL thing,
where the question marks are used as placeholders, just like in printf, percent S was.
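A minimal sketch of that login check with question-mark placeholders. It uses Python's built-in sqlite3 module as a stand-in for CS50's db.execute, since both accept `?` placeholders; the `login` function, the users table layout, and the stored password are made up. Storing a plain-text password is a simplification for the demo only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT NOT NULL, password TEXT NOT NULL)")
# Plain-text password purely for illustration.
conn.execute(
    "INSERT INTO users (username, password) VALUES (?, ?)",
    ("malan@harvard.edu", "correct-password"),
)


def login(username, password):
    """Return True if exactly one row matches both values."""
    rows = conn.execute(
        "SELECT * FROM users WHERE username = ? AND password = ?",
        (username, password),  # values are passed separately from the SQL text
    ).fetchall()
    return len(rows) == 1


print(login("malan@harvard.edu", "correct-password"))  # True
print(login("malan@harvard.edu", "wrong-password"))    # False
print(login("malan@harvard.edu'--", "anything"))       # False
```

Because the values travel separately from the query text, the quote and the dashes in the last call are treated as part of an odd-looking username, not as SQL.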

- [7:46:25](https://www.youtube.com/watch?v=undefined&t=63985s) But this function, db.execute from


CS50's library and third-party libraries as well, is also doing some good stuff with these question marks,
and defending against the following attack. Suppose that you were not using a third-party library like
ours and you were just manually constructing your SQL queries like this.

- [7:46:42](https://www.youtube.com/watch?v=undefined&t=64002s) You were to do something like


this instead using an f-string in Python. You're comfortable with format strings now. You've gotten into
the habit of using curly braces and plugging in values. Suppose that you, the aspiring programmer, are just
using techniques that you've been taught. So you have an f-string with select star from users, where
username equals, quote, unquote, "username" in curly braces.

- [7:47:02](https://www.youtube.com/watch?v=undefined&t=64022s) And password equals, quote,


unquote, "password" in curly braces. As of what, two weeks ago, this was a perfectly legitimate technique
in Python to plug in values into a string. But notice if you are using single quotes yourself and the user
has typed in single quotes to their input, what could go wrong here? Where are we going with this if
you're just blindly plugging user input into your own prepared string of text? Yeah? AUDIENCE:
[INAUDIBLE] DAVID J. MALAN: Yeah.

- [7:47:41](https://www.youtube.com/watch?v=undefined&t=64061s) Worst case, they could insert


what is actually SQL code into your database as follows. Generally speaking, if you're using special syntax
like single quotes to surround the user's input, you'd better hope that they don't have an apostrophe in
their name. Or you better hope that they don't type a single quote as well.

- [7:47:57](https://www.youtube.com/watch?v=undefined&t=64077s) Because what if their single quote


finishes your single quote instead, and then the rest of this is somehow ignored? Well, let's consider how
this might happen. Let me go ahead in here. This got a little blurry here, but let me plug in here-- wow,
that looks awful. Let me fix the red. Just change this to white so it's more readable.

- [7:48:16](https://www.youtube.com/watch?v=undefined&t=64096s) What happens if the user does


this instead? They type in, like I did into the screenshot, malan@harvard.edu, single quote, dash, dash.
What has just happened logically, even though we've only just begun with SQL today? Well, select star
from users, where username equals malan@harvard.edu,

- [7:48:37](https://www.youtube.com/watch?v=undefined&t=64117s) end quote. What's bad about


the rest of this? Dash, dash, I claim, means a comment, which means my color coding is going to be a
little blurry again. But everything after the dash, dash is just ignored. The logic, then, of the SQL query,
then, is to just say, select malan@harvard.edu from the database, not even checking the password
anymore.

- [7:48:59](https://www.youtube.com/watch?v=undefined&t=64139s) Therefore, you will get back at


least one row. So length of rows will equal 1, and so presumably the rest of the pseudo code logs the
user in, gives them access to my my.harvard account, or whatever it is. And they've pretended to be me
simply by using a single quote and a dash, dash in the username field.
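
To make the attack concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than the course's own library. The `users` table, the stored password, and the function names `login_unsafe` and `login_safe` are assumptions made up for illustration; only the f-string query and the single-quote-dash-dash input come from the lecture. The second function shows the question-mark placeholder style, which hands the user's input to the database as a value instead of pasting it into the SQL text.

```python
import sqlite3

# Throwaway in-memory database with one hypothetical user.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('malan@harvard.edu', 'secret')")


def login_unsafe(username, password):
    # Vulnerable: user input becomes part of the SQL text itself.
    query = (
        f"SELECT * FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return db.execute(query).fetchall()


def login_safe(username, password):
    # Placeholders: the input is passed separately and treated only as data.
    query = "SELECT * FROM users WHERE username = ? AND password = ?"
    return db.execute(query, (username, password)).fetchall()


attack = "malan@harvard.edu'--"  # closes the quote, then starts a SQL comment
unsafe_rows = login_unsafe(attack, "wrong password")
safe_rows = login_safe(attack, "wrong password")
print(len(unsafe_rows), len(safe_rows))
```

With the f-string version, the typed-in quote ends the string early and `--` comments out the password check, so one row comes back. With placeholders, the same input is compared literally and nothing matches.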

- [7:49:18](https://www.youtube.com/watch?v=undefined&t=64158s) Again, please don't go start doing


this later today on Harvard, Yale, or other websites. But it could be as simple as that. Why? Because the
programmer practiced what they were taught, which was just to use curly braces to plug in, in f-strings,
values. But if you don't understand how the user's input is going to be used, and if you don't distrust
your users fundamentally, for every good person out there there's going to be, unfortunately, some
adversary who just wants to try to find fault in your code or hack

- [7:49:45](https://www.youtube.com/watch?v=undefined&t=64185s) into your data set. This is what's


known as a SQL injection attack, because the user can type something that happens to be or look like
SQL, and trick your database into doing something it didn't intend to, like, for instance, logging the user
in. Worst case, they could even do something else. Maybe the user types a semicolon, then the word
drop, or the word update.

- [7:50:07](https://www.youtube.com/watch?v=undefined&t=64207s) You could imagine doing


semicolon update table grades, where name equals Malan, and set the grade equal to A instead of B, or
something like that. The ability to inject SQL into the database means you can do anything you want with
the data set, either constructively, or worse, destructively. And now, just a quick, little cartoon that
should now make sense.

- [7:50:28](https://www.youtube.com/watch?v=undefined&t=64228s) OK, to, like, one of us, two of us.


Awkwardly somewhat funny. All right, so let's move on to one last condition. There's one other problem
that can go awry here. Oh, and I should explain this. So this is an allusion to the son, Robert, having
typed in semicolon. The word drop, table, students, and doing some of the same technique.

- [7:50:54](https://www.youtube.com/watch?v=undefined&t=64254s) This is humor that only CS people


would understand because it's the mom realizing, oh, her son's doing a SQL injection attack onto the
database. Less funny when you explain it, but once you notice the syntax, that's all this is an allusion to.
All right. So one final threat, now that you are graduating to the world of proper databases and away
from CSV files alone.

- [7:51:15](https://www.youtube.com/watch?v=undefined&t=64275s) Things can go wrong when using


databases, and honestly, even using CSV files if you have multiple users. And thus far, you and I have had
the luxury in almost every program we've written that it's just me using my code. It's just you using your
code. And even if your teaching fellow or TA is using it, probably not at the same time.

- [7:51:31](https://www.youtube.com/watch?v=undefined&t=64291s) But the world gets interesting if


you start putting your code on phones, on websites, such that now you might have two users literally
trying to log in at the same time, literally clicking a button at the same, or nearly the same time. What
happens, then, if a computer is trying to handle requests from two different people at once, as might
happen all the time on a website? You might get what are called race conditions.

- [7:51:55](https://www.youtube.com/watch?v=undefined&t=64315s) And this is a problem in


computing in general, not just with SQL, not just with Python, really just any time you have shared data,
like a database, as follows. This apparently is one of the most liked Instagram posts ever. It is literally just
a picture of an egg. Has anyone clicked on this egg? Like, a couple? Oh, OK.

- [7:52:14](https://www.youtube.com/watch?v=undefined&t=64334s) Wow. All right, so yes. So go


search for this photo if you'd like to add to the likes on Instagram. The account is world_record_egg. This
is just a screenshot of Instagram of that picture of an egg. If you're in the habit of using Instagram, or like
any social media site, there's some equivalent of a like button or a heart button these days.

- [7:52:30](https://www.youtube.com/watch?v=undefined&t=64350s) And that's actually a really hard


problem. Such a simple idea to count the number of likes something has, but that means someone has
to click on it. Your code has to detect the click. Your code has to update the database, and then do it
again and again, even if multiple people are perhaps right now clicking on that same egg.

- [7:52:48](https://www.youtube.com/watch?v=undefined&t=64368s) And unfortunately, bad things can


happen if two people try to do something at the same time on a computer. How might this happen? So
here's some more code, half pseudocode, half Python code here, as follows. Suppose that what happens
when you, literally, right now, maybe click on the like button on the Instagram post.

- [7:53:09](https://www.youtube.com/watch?v=undefined&t=64389s) Suppose that code, like the


following, is executed on Facebook servers. db.execute of select likes from posts where ID equals
question mark. All right. So what am I assuming here? I'm assuming that that photograph has a unique
ID. It's some big integer, whatever it was, randomly assigned. I'm assuming that when you click on the
heart the unique ID is somehow sent to Instagram servers so that their code can call it ID.

- [7:53:36](https://www.youtube.com/watch?v=undefined&t=64416s) And I'm assuming that Instagram


is using its SQL database and selecting, from a posts table, the current number of likes of that egg for
that given ID number. Why? Because I need to know how many likes it already has if I want to add one to
it and then update the database. I need to select the data, then I need to update the data here.

- [7:53:55](https://www.youtube.com/watch?v=undefined&t=64435s) All right. So in some Python code


here, let's store, in a variable called likes, whatever comes back in the first row from the likes column.
Again, this is new syntax specific to our library, but a common way of getting back the first row and the
column called likes therein. So at this point in the story, likes is storing the total number of likes, in the
millions or whatever it is, of that particular egg.

- [7:54:16](https://www.youtube.com/watch?v=undefined&t=64456s) Then I do this. Execute update


posts, set the number of likes equal to this value, where the ID of the post equals this value. What do I
want to update the likes to? Whatever likes currently is plus 1, and then plugging in the ID. So a simple
idea, right? I'm checking the value of the likes, and maybe it's 10.

- [7:54:37](https://www.youtube.com/watch?v=undefined&t=64477s) I'm changing 10 to 11 and then


updating the table. But a problem can arise if two people have clicked on that egg at roughly the same
time, or literally, the same time. Why is that? Well, in the world of databases and servers, the
Instagrams of the world have thousands of physical servers nowadays.

- [7:54:56](https://www.youtube.com/watch?v=undefined&t=64496s) So they can support millions,


billions even, of users nowadays. What can go wrong? Well, typically code like this is not what we'll call
atomic. To be atomic means that it all executes together or not at all. Rather, code typically is executed,
as you might imagine, line by line. And if your code is running on a server that multiple people have
access to, which is absolutely the case for an app like Instagram, if you and I click on the heart at roughly
the same time, for efficiency, the computer, the server, owned by Instagram,

- [7:55:28](https://www.youtube.com/watch?v=undefined&t=64528s) might execute this line of code for


me. Then it might execute this line of code for you. Then this line of code for me, then this line of code
for you, then this line of code for me, then this line of code for you. That is to say, our queries might get
intermingled chronologically. Because it'd be a little obnoxious if, when you're using Instagram, I'm
blocked out while you're interacting with the site.

- [7:55:47](https://www.youtube.com/watch?v=undefined&t=64547s) It'd be a lot nicer for efficiency


and fairness if somehow they do a little bit of work for me, a little bit of work for you, and back and
forth, and back and forth, equitably on the server. So that's what typically happens by default. These
lines of code get executed independently. And they can happen in alternating order with other users.

- [7:56:05](https://www.youtube.com/watch?v=undefined&t=64565s) You can get them combined like


this. Same order top to bottom, but other things might happen in between. So suppose that the number
of likes at the very beginning was 10. And suppose that Carter and I both click on that egg at roughly the
same time. And suppose this line of code gets executed for me, and that gives me a value in likes,
ultimately, of 10.

- [7:56:25](https://www.youtube.com/watch?v=undefined&t=64585s) Suppose, then, that the computer


takes a break from dealing with my request, does the same code for Carter, and gets back what value for
the current number of likes? Also 10 for Carter. Because mine has not been recorded yet. At this point in
the story, somewhere in the computer's memory there's a likes variable for me, storing 10.
- [7:56:42](https://www.youtube.com/watch?v=undefined&t=64602s) There's a likes variable storing 10
for Carter. Then this line of code executes for me. It updates the database to be likes plus 1, which stores
11 in the database. Then Carter's code is executed, updating the same row in the database to 11,
unfortunately. Because his value of likes happened to be the same value of mine.
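
The interleaving just described can be replayed by hand. This is a sketch under assumptions: it uses Python's built-in `sqlite3` module with an in-memory database instead of Instagram's real systems or the course's library, and the helper names `read_likes` and `write_likes` are hypothetical. Both reads happen before either write, exactly as in the story.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def read_likes(post_id):
    # First half of the "like" logic: look up the current count.
    row = db.execute("SELECT likes FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]


def write_likes(post_id, likes):
    # Second half: store a new count.
    db.execute("UPDATE posts SET likes = ? WHERE id = ?", (likes, post_id))


# The server does a little work for each user in turn.
likes_david = read_likes(1)       # sees 10
likes_carter = read_likes(1)      # also sees 10; nothing has been written yet
write_likes(1, likes_david + 1)   # stores 11
write_likes(1, likes_carter + 1)  # stores 11 again; one like is lost

final_likes = read_likes(1)
print(final_likes)  # 11, even though two people clicked
```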

- [7:57:04](https://www.youtube.com/watch?v=undefined&t=64624s) And so the metaphor here, that if


we had a refrigerator on stage we would actually act out, is something that was taught to me years ago
in an operating systems class, whereby the most similar analogue in the real world would be if you've got
a mini fridge in your dorm room. And one of you and your roommates comes home, opens the fridge,
and realizes, oh, we're out of milk, was how the story went in my day.

- [7:57:27](https://www.youtube.com/watch?v=undefined&t=64647s) So you close the refrigerator, and


you walk across the street, go to CVS, and get in line to buy some milk. Meanwhile, your roommate
comes home. They, too, inspect the state of your refrigerator, a.k.a., a variable, open the door, and
realize, oh, we're out of milk. I'll go get more milk. Close the fridge, go across the street, and head to
maybe a different store, or the line is long enough that you don't see each other at the store.

- [7:57:47](https://www.youtube.com/watch?v=undefined&t=64667s) So long story short, you both


eventually get home, open the door, and damn it, now there's milk from your other roommate there
because you both made a decision on this based on the state of a variable that you independently
examined. And you didn't somehow communicate. Now in the real world, this is absolutely solvable.

- [7:58:06](https://www.youtube.com/watch?v=undefined&t=64686s) How would you fix this or avoid


this problem in the real world? Literally, own roommate, own fridge. AUDIENCE: Text your roommate
[INAUDIBLE]. DAVID J. MALAN: Perfect. Let them know, so somehow communicate. And in fact, the
terminology here would be multiple threads can somehow intercommunicate by having shared state,
like the iMessage thread on your phone.

- [7:58:23](https://www.youtube.com/watch?v=undefined&t=64703s) You could leave a note. You could,


more dramatically, lock the refrigerator somehow, thereby making the milk purchasing process atomic.
The fundamental problem is that for efficiency, again, computers tend to intermingle logic that needs to
happen when it's happening across multiple users just for fairness' sake, for scheduling sake.

- [7:58:43](https://www.youtube.com/watch?v=undefined&t=64723s) You need to make sure that all


three of these lines of code execute for me, and then for Carter, and then for you if you want to ensure
that this count is correct. And for years, when social media was first getting off the ground, this was a
super hard problem. Twitter used to go down all of the time, and tweets, and retweets were a thing that
were similarly happening with a very high frequency.

- [7:59:03](https://www.youtube.com/watch?v=undefined&t=64743s) These are hard problems to solve.


And thankfully, there are solutions. And we won't get into the weeds of how you might use these things,
but know that there are solutions in the form of things called locks, which I use that word deliberately
with the fridge. Software locks can allow you to protect a variable so no one else can look at it until
you're done with it.
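
As a rough illustration of a software lock, the sketch below uses `threading.Lock` from Python's standard library. It is an in-memory analogy, not how a database or Instagram implements locking, and the names `likes` and `like_many` are invented for the example. Only one thread at a time can hold the lock, so each read, add, and store of the counter finishes before another begins.

```python
import threading

likes = 0
lock = threading.Lock()  # the "lock on the refrigerator"


def like_many(times):
    global likes
    for _ in range(times):
        # Only one thread may be inside this block at any moment.
        with lock:
            likes += 1


# Four hypothetical users, each clicking the heart 10,000 times.
threads = [threading.Thread(target=like_many, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(likes)  # 40000: no increments are lost while the lock is used
```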

- [7:59:21](https://www.youtube.com/watch?v=undefined&t=64761s) There are things called


transactions, which allow you to do the equivalent of sending a message to, or really locking out your
roommate from accessing that same variable, too, but for a slightly shorter amount of time. There are
solutions to these problems. So for instance, in Python, the same code now in green might look a little
something like this.

- [7:59:40](https://www.youtube.com/watch?v=undefined&t=64780s) When you know that something


has to happen all at once, altogether, you first begin a transaction, and you do your thing, and then you
commit the transaction at the very end. Here, too, though, there's going to be a downside. Typically, the
more you use transactions in this way, potentially the higher the probability is that you're going to box
someone out or make Carter's request a little slower.

- [8:00:01](https://www.youtube.com/watch?v=undefined&t=64801s) Why? Because we can't interact


at the same time. Or you might make his request fail if he tries to update something that's already been
updated. So you generally want to have as few lines of code together in between these transactions so
that you get in and you get out. And you go to CVS and you get back really fast so as to not cause these
kinds of performance problems.
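
The slide's code is not reproduced in this transcript, so the following is only a sketch of the begin, do your thing, commit pattern, written with Python's built-in `sqlite3` module. The `posts` table layout and the function name `like` are assumptions. `BEGIN IMMEDIATE` is SQLite-specific syntax that takes the write lock up front; other databases spell this differently.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and COMMIT statements below are issued explicitly.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def like(post_id):
    db.execute("BEGIN IMMEDIATE")  # start the transaction and take the write lock
    try:
        row = db.execute(
            "SELECT likes FROM posts WHERE id = ?", (post_id,)
        ).fetchone()
        db.execute(
            "UPDATE posts SET likes = ? WHERE id = ?", (row[0] + 1, post_id)
        )
        db.execute("COMMIT")  # both steps take effect together
    except Exception:
        db.execute("ROLLBACK")  # or neither does
        raise


like(1)
like(1)
total = db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0]
print(total)  # 12
```

Keeping only the select and the update between `BEGIN` and `COMMIT` follows the advice above to get in and get out quickly. For a simple counter, a single statement such as `UPDATE posts SET likes = likes + 1 WHERE id = ?` avoids the separate read altogether.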

- [8:00:18](https://www.youtube.com/watch?v=undefined&t=64818s) So things indeed escalated quickly


today. The original goal was just to solve problems using a different language more effectively than
Python. But as soon as you have these more powerful techniques, a whole new set of problems arises.
Takes practice to get comfortable with. But ultimately, this is all leading us toward the introduction next
week of web programming with HTML, CSS, and some JavaScript.

- [8:00:38](https://www.youtube.com/watch?v=undefined&t=64838s) The week after, bringing Python


and SQL back into the mix. So that by term's end, we've really now used all of these different languages
for what they're best at. And over the next few weeks, the goal is to make sure you're understanding and
comfortable with what each of these things is good and bad for.

- [8:00:51](https://www.youtube.com/watch?v=undefined&t=64851s) Let's go ahead and wrap here. I'll


stick around for questions. We'll see you next time. [MUSIC PLAYING] SPEAKER 1: All right.

- [8:02:15](https://www.youtube.com/watch?v=undefined&t=64935s) This is CS50, and this is already


week 8. And if we think back to the past several weeks now, recall that things started pretty interestingly,
pretty interactively, in like week 0, when we were using Scratch, because with Scratch we had a GUI, a
graphical user interface. So even as we explored variables and loops and conditionals and all of that, you
had kind of a fun environment in which to express those ideas.

- [8:02:36](https://www.youtube.com/watch?v=undefined&t=64956s) And then in week 1, we sort of


took a lot of that away, when we introduced C, and a terminal window, and a command line, because
now, all of your programs became very textual, very keyboard-based, and gone was the mouse, the
animations, the menus, and so forth. And so now, fast forward to week 8, we're going to bring those
kinds of user interface, UI, elements back, in the form of web programming.

- [8:02:57](https://www.youtube.com/watch?v=undefined&t=64977s) And this goes beyond just laying


out websites. This will, this week and next week, combine elements of the back-end server stuff that
we've been doing for the past several weeks, using Python, using SQL, and now introducing a couple of
other languages, on the so-called client side, on your own Mac, your own PC, your own phone, that's
going to talk to those back-end services.
- [8:03:15](https://www.youtube.com/watch?v=undefined&t=64995s) So indeed, at this end of CS50,
does everything rather come together into a user interface that's just super familiar. All of us are on our
phones, desktops, laptops, every day. And increasingly, even the mobile apps that you all are using are
implemented, not necessarily in languages like Swift or Java, if you're familiar with those, but with
languages called HTML, CSS, and JavaScript, which we'll focus on here today.

- [8:03:39](https://www.youtube.com/watch?v=undefined&t=65019s) But before we do that, let's


provide a foundation on which these apps can run, because indeed, we'll start to look underneath the
hood of how the internet itself works, albeit quickly, so that we have kind of a mental model for where
all of this code is running, how you can troubleshoot issues, and how, really, ultimately, after CS50, you
can learn, by just poking around other actual websites.

- [8:03:59](https://www.youtube.com/watch?v=undefined&t=65039s) So the internet, we're all on it.


Literally, right now, what is it, in your own words? What is the internet? It's this utility nowadays, that we
all rather take for granted. How would you describe it? AUDIENCE: Big storage. SPEAKER 1: OK, big
storage, and indeed, that's how the cloud is described, which is kind of an abstraction if you will, for a
whole lot of wires and cables and hardware.

- [8:04:22](https://www.youtube.com/watch?v=undefined&t=65062s) And the internet, other


formulations of the term, how else? AUDIENCE: Bunch of data that we can reach. SPEAKER 1: OK, a
bunch of data that we can all reach, by way of being interconnected somehow with wires or wirelessly.
And so really, the internet, too, is a hardware thing. There's a whole lot of servers out there, that are
somehow interconnected, via physical cables, via internet service providers, via wireless connectivity,
and the like.

- [8:04:43](https://www.youtube.com/watch?v=undefined&t=65083s) And once you start to have


networks of networks of networks, do you get the internet. Indeed, Harvard has its own network and
Yale has its own network, and your own home probably has its own network. But once you start
connecting those networks, do you get the interconnected network that is the internet as we now know
it. So there's this whole alphabet soup that goes with the internet, some of whose acronyms and terms
you've probably seen before.

- [8:05:05](https://www.youtube.com/watch?v=undefined&t=65105s) But let's at least peel back some


of those layers and consider what some of the building blocks are. So here's a picture of the internet
before it was known as the internet, back in 1969, when it was something called ARPANET, from the
Advanced Research Projects Agency. And the intent, originally, was just to interconnect a few universities
here in Utah and California, literally servers, or computers, in each of those areas, somehow
interconnected with wires, so that people could start to share data.

- [8:05:30](https://www.youtube.com/watch?v=undefined&t=65130s) A year later, it expanded to


include MIT and Harvard and others. And now fast forward to today, you have a huge number of systems
around the world that are on this same network. And, in fact, if I just pull up a web page here, that's sort
of constantly changing, a visualization of the internet as it might now be today, this here, in the abstract,
all of these lines and interconnections represent just how interconnected the world is today.

- [8:05:56](https://www.youtube.com/watch?v=undefined&t=65156s) And it just means that there's all


the more servers, all the more cabling, all of the more hardware giving us this underlying infrastructure.
But if we focus, really, on just these nodes, these individual dots, whether back in 1970, or now in 2021,
each of these dots you can think of as, yes, a server, but a certain type of server, namely known as a
router.

- [8:06:15](https://www.youtube.com/watch?v=undefined&t=65175s) And a router, as the name implies,


just routes data left to right, top to bottom, from one point to another. And so there's all these servers
here on campus at Harvard, on Yale's campus, in Comcast's network, Verizon's network, your own home
network, you have your own routers out there, whose purpose in life is to take in data and then decide,
should I send it this way, or this way, or this way, so to speak, assuming there are multiple options with
multiple cables.

- [8:06:39](https://www.youtube.com/watch?v=undefined&t=65199s) You, in your home, probably have


just one cable coming in or going out. But certainly, if you're a place like Harvard or Yale or Comcast or
the like, there's probably a whole bunch of interconnections that the data can then travel across
ultimately. So how do we get data among these routers? For instance, if you want to send an email to
someone at Stanford, in California, from here, on the East Coast, or if you want to visit www.stanford.

- [8:07:05](https://www.youtube.com/watch?v=undefined&t=65225s) edu, how does your laptop, your


phone, your desktop, actually get data from point A to point B? Well, essentially, your laptop or phone
knows when it boots up at the beginning of the day, what the local router is, what the address of that
local router is. So if I want to send an email from my laptop over here, my laptop is essentially going
to hand it to the nearest Harvard router.

- [8:07:24](https://www.youtube.com/watch?v=undefined&t=65244s) And then, from there, I don't


know, I don't care how it gets the rest of the distance. But hopefully, within some small number of steps
later, Harvard's router is going to send it to maybe Boston's router is going to send it to California's
router is going to send it to Stanford's router, until finally it reaches Stanford's email server.
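
The hop-by-hop hand-off can be sketched as a lookup table in which each machine knows only where to send data next. The table below is hypothetical: the router names are taken loosely from the example above, and real routers choose among several possible next hops rather than following one fixed chain.

```python
# Each entry answers one question: "who do I hand this to next?"
# (An assumed, simplified topology with a single path.)
NEXT_HOP = {
    "laptop": "harvard-router",
    "harvard-router": "boston-router",
    "boston-router": "california-router",
    "california-router": "stanford-router",
    "stanford-router": "stanford-mail-server",
}


def route(source, destination, max_hops=30):
    """Follow next hops from source until destination is reached."""
    path = [source]
    current = source
    while current != destination:
        if current not in NEXT_HOP or len(path) > max_hops:
            raise RuntimeError(f"cannot reach {destination} from {source}")
        current = NEXT_HOP[current]
        path.append(current)
    return path


path = route("laptop", "stanford-mail-server")
print(" -> ".join(path))
```

The laptop never needs the whole path. It only needs the address of its local router, which matches the "I don't know, I don't care how it gets the rest of the distance" idea.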

- [8:07:38](https://www.youtube.com/watch?v=undefined&t=65258s) And we can depict this, actually,


how about a bit playfully. Thankfully, the course's staff kindly volunteered to create a visualization for
this, using a familiar technology. So here we have some of our TFs and TAs and CAs present and past. Let
me go ahead and full screen this window here. Give me just a moment to pull it up on my screen here.

- [8:07:58](https://www.youtube.com/watch?v=undefined&t=65278s) And we'll consider what happens


if we want to send a packet of information from one person or router, namely Phyllis in this case, in the
bottom right hand corner, up to Brian, in this case, in the top left hand corner. So each of the staff
members here represents exactly one of these routers on the internet.

- [8:08:17](https://www.youtube.com/watch?v=undefined&t=65297s) [MUSIC PLAYING] [APPLAUSE] The


applause is appreciated. It actually took us a significant number of attempts to get that ultimately right.
So what was it the staff were all passing here? Here we have just, physically, what it was the staff
were passing around. So Phyllis started with an envelope, inside of which was that email, presumably, on
the East Coast, and she wanted to send it to Brian on the West Coast, top left hand corner.

- [8:09:05](https://www.youtube.com/watch?v=undefined&t=65345s) And so she had all of these


different options, different connections, between her and point B, namely Brian. She could go up, down,
in her case, and then each of those subsequent routers could go up, down, left, or right, until it finally
reaches Brian. And long story short, there's algorithms that figure out how you decide to send a packet
up, down, left, or right, so to speak.

- [8:09:25](https://www.youtube.com/watch?v=undefined&t=65365s) But they do so by taking an input,


and the form of that input is this envelope. And there's at least a couple of things on the outside of this,
because all of these routers and, in turn, all of our Macs and PCs and phones these days, speak
something called TCP/IP, a set of acronyms you've probably seen somewhere on your phone, your Mac
or PC, in print somewhere, which refers to two protocols, two conventions, that computers use to
intercommunicate these days.

- [8:09:51](https://www.youtube.com/watch?v=undefined&t=65391s) Now what's a protocol? A


protocol is like a set of rules for how you behave. In healthier times, I might extend my hand and someone
like Carter might extend his hand, thereby interacting with me, based on a human protocol of like
literally physically shaking hands. Nowadays, we have mask protocols, whereby what you need to do is
wear a mask indoors.

- [8:10:09](https://www.youtube.com/watch?v=undefined&t=65409s) But that, too, is just a set of rules


that we all follow and adhere to, that's somewhere standardized and documented. So computers use
protocols all the time to govern how they are sending information and receiving information. And TCP
and IP are two such protocols that standardize this as follows. What TCP/IP tells someone like Phyllis to
do, if she wants to send an email to Brian, is put the email in a virtual envelope, so to speak.

- [8:10:32](https://www.youtube.com/watch?v=undefined&t=65432s) But on the outside of that virtual


envelope, put Brian's unique address. And I'll describe this as destination on the middle of the envelope,
just like in our human world, you would write the destination address on the envelope. And then she's
going to put her own source address in the top left hand corner, just like you, the sender, would put your
own source address in the human world.

- [8:10:53](https://www.youtube.com/watch?v=undefined&t=65453s) But, instead of these addresses


being like something Kirkland Street, Cambridge, Massachusetts 02138, USA, you probably know that
computers on the internet have unique addresses of their own, known as IP addresses. And an IP
address is just a numeric identifier on the internet, that allows computers, like Phyllis and Brian, to
address these envelopes to and from each other.

- [8:11:14](https://www.youtube.com/watch?v=undefined&t=65474s) And you've probably seen the


format at some point. Typically, the format of IP addresses is something dot something dot something
dot something. Each of those somethings, represented here with a hash symbol, is a number from 0
through 255. And, based on that little hint, if each of these hashes represents a number from 0 to 255,
each of those hashes is represented with how many bytes or bits? Eight bits or one byte, which is to say,
we can extrapolate from there, an IP address must use 32 bits or 4 bytes,

- [8:11:47](https://www.youtube.com/watch?v=undefined&t=65507s) if we rewind now to some of the


primitives we looked at in week 0. And what that means is, at least at a glance, it looks like we have 4
billion some odd IP addresses available to us. Now, unfortunately, there's a huge number of humans in
the world these days, many of whom have multiple devices, certainly in places like
this, where you have a laptop, and a phone, and you have other internet of things-type devices, all of
which need to be addressed.
- [8:12:10](https://www.youtube.com/watch?v=undefined&t=65530s) So there's another type of IP
address that's starting to be used more commonly. This is version 4 of IP. There's also version 6 which,
instead of 32 bits, uses 128 bits, which gives us a crazy number of possible addresses for computers, so
we can at least handle all of the additional devices we now have today.
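
The arithmetic behind those numbers can be checked directly. This sketch uses a hypothetical helper, `dotted_to_int`, and an arbitrary example address to show that four numbers from 0 through 255 pack into exactly 32 bits. It cross-checks the result against Python's standard `ipaddress` module and then counts how many addresses 32 bits and 128 bits allow.

```python
import ipaddress


def dotted_to_int(address):
    """Pack four numbers in the range 0-255 into one 32-bit integer."""
    parts = [int(part) for part in address.split(".")]
    if len(parts) != 4 or any(part < 0 or part > 255 for part in parts):
        raise ValueError(f"not a valid IPv4 address: {address}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each part occupies 8 bits, i.e. 1 byte
    return value


example = "192.0.2.1"  # an arbitrary address from a range reserved for documentation
as_int = dotted_to_int(example)
matches_stdlib = as_int == int(ipaddress.IPv4Address(example))

ipv4_total = 2 ** 32   # 4 bytes of address space
ipv6_total = 2 ** 128  # 16 bytes of address space

print(as_int, matches_stdlib)
print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
```

The IPv4 total is 4,294,967,296, the "4 billion some odd" figure. The IPv6 total is a 39-digit number.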

- [8:12:29](https://www.youtube.com/watch?v=undefined&t=65549s) So this is to say, what ultimately is


going on this envelope is the destination address, that is Brian's IP address, and the source address, that
is Phyllis's IP address, so that this packet can go from point A to point B, and if need be, back, by just
flipping the source and the destination. But on the internet, you presumably know that there's not just
email servers.

- [8:12:50](https://www.youtube.com/watch?v=undefined&t=65570s) There's web servers, there's chat


servers, video servers, game servers. Like there's all of these different functions on the internet
nowadays. And so, when Brian receives that envelope, how does he know it's an email, versus a web
page, versus a Skype call, versus something else altogether?

- [8:13:08](https://www.youtube.com/watch?v=undefined&t=65588s) Well, it turns out that we can look


at the other part of this acronym, the TCP in TCP/IP. And what TCP allows us to do, for instance, is specify
a couple of things. One, the type of service whose data is in this envelope, that is, it does this with a
numeric identifier. And I'm going to go ahead and write down a colon, and the word port, P-O-R-T.

- [8:13:32](https://www.youtube.com/watch?v=undefined&t=65612s) And I'm going to write that in the


source address, too, colon and port. So technically, now, what's on this envelope is not just the
addresses, but also a unique number that represents what kind of service is being sent from point A to
point B, whether it's email, or web traffic, or Skype, or something else.

- [8:13:49](https://www.youtube.com/watch?v=undefined&t=65629s) These numbers are standardized,


and here are just two of the most common ones, not even in the context of email, but in the context of
the web. Port 80 is typically used whenever an envelope contains a web page, or a request therefor, or
the number 443, when that request is actually encrypted, using that thing you probably know, in URLs,
known as HTTPS, where the S literally means secure.

- [8:14:11](https://www.youtube.com/watch?v=undefined&t=65651s) More on what the HTTP means


later. If it's email, the number might be 25 or 465, or 587. These are the kinds of things you Google if you
ultimately care about. But if you've ever had to configure, like, Outlook or even Gmail to talk to another
account, you might very well have seen these numbers, by typing in something like smtp.gmail.com

- [8:14:31](https://www.youtube.com/watch?v=undefined&t=65671s) and then a number, which is


only to say these numbers are omnipresent. But they're typically not things you and I have to care about,
because servers and computers nowadays automate much of this process. But that's all it takes,
ultimately, for Phyllis to get this message to Brian. But what if it's a really big message? If it's a short
email, it might fit perfectly in one single packet, so to speak.
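Looking back at the port numbers named above, they amount to a small lookup table from number to kind of service. A minimal sketch follows; the dictionary and the `service_for` helper are hypothetical names, and the table only lists the numbers mentioned in the lecture, not the full standardized registry.

```python
# Only the port numbers mentioned in the lecture; many more are standardized.
WELL_KNOWN_PORTS = {
    80: "web (HTTP)",
    443: "web, encrypted (HTTPS)",
    25: "email",
    465: "email",
    587: "email",
}


def service_for(port: int) -> str:
    """Return the kind of service conventionally found on a port."""
    return WELL_KNOWN_PORTS.get(port, "unknown")


for port in (80, 443, 587, 12345):
    print(port, "->", service_for(port))
```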

- [8:14:51](https://www.youtube.com/watch?v=undefined&t=65691s) But suppose that Phyllis wants to


send Brian a picture of a cat, like this, or worse, a video of a cat. It would be kind of inequitable if no one
else could do anything on the internet, just because Phyllis wants to send Brian a really big picture, a
really big video of a cat. It would be nice if we could kind of time-share the interconnections, across
these routers, so that we can give a little bit of time to Phyllis, a little bit of time to someone else, a little
bit of time to someone else, so that eventually, Phyllis' entire cat

- [8:15:19](https://www.youtube.com/watch?v=undefined&t=65719s) gets through the internet. But in


terms of fairness, she doesn't monopolize the bandwidth of the network in question. And this, then,
allows us to do one other feature of TCP/IP, which is fragmentation, where we can temporarily, and
Phyllis's computer would do this automatically, fragment the big packet in question, or the big file in
question, and then use, not just a single envelope, but maybe a second, a third, and a fourth, or more.

- [8:15:48](https://www.youtube.com/watch?v=undefined&t=65748s) If we do that, though, we're


probably going to need one other piece of information, just logically, on these envelopes. Like, if you
were implementing this, chopping up this picture of a cat into four parts, like, intuitively, what might you
want to put virtually on the outside of this envelope now? Yeah.

- [8:16:04](https://www.youtube.com/watch?v=undefined&t=65764s) AUDIENCE: The order. SPEAKER 1:


The order of them, somehow. So probably something like part one of four, part two of four, part three of
four, and so forth. So I'm going to write one more thing in like the memo line of the envelope here. I put
some kind of sequence number, that's just a little bit of a clue to Brian, to know in what order to
reassemble these things.
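The "part one of four" idea can be sketched in a few lines. This is a simplified model of the envelopes in the demonstration, assuming each piece carries its part number and the total; real TCP numbers bytes rather than envelopes, and the function names here are invented for the example.

```python
import random


def fragment(data: bytes, size: int) -> list:
    """Chop data into pieces labeled (part number, total parts, bytes)."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    total = len(chunks)
    return [(n, total, chunk) for n, chunk in enumerate(chunks, start=1)]


def reassemble(packets: list) -> bytes:
    """Sort by part number, then glue the pieces back together."""
    return b"".join(chunk for _, _, chunk in sorted(packets))


cat = b"pretend these bytes are a picture of a cat"
packets = fragment(cat, 12)      # four envelopes for this input
random.shuffle(packets)          # the network may deliver them out of order
print(reassemble(packets) == cat)
```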

- [8:16:22](https://www.youtube.com/watch?v=undefined&t=65782s) And even more powerfully than


that, this actually gives us this simple primitive of just using INTs on these envelopes, in these packets. If
Brian receives envelopes like these, with numbers like these in the memo field, what other feature does
TCP apparently enable Brian and Phyllis to implement? This is a bit subtle.

- [8:16:44](https://www.youtube.com/watch?v=undefined&t=65804s) But it's not just the ordering of


the packets. What else might be useful about putting numbers on these things, might you think? What
might be useful here? Yeah, in back. AUDIENCE: How about if you like missed. SPEAKER 1: If you missed
something that was intended to be sent, if I heard that correct. So short answer, exactly, yes, TCP,
because of this simple little integer that we're including, can quote unquote "guarantee" delivery.

- [8:17:07](https://www.youtube.com/watch?v=undefined&t=65827s) Why? Because if Brian receives


one out of four, two out of four, four out of four, but not three out of four, he now knows, predictably,
that he needs to ask Phyllis, somehow, to resend that packet. And so this is why pretty much always, if
you receive an email, you either receive the whole thing, or nothing at all.

- [8:17:24](https://www.youtube.com/watch?v=undefined&t=65844s) Like sentences and words and


paragraphs should never really be missing from an email. Or if you download a photograph on the web,
it shouldn't just have a blank hole in the middle, just because that packet of information happened to be
lost. TCP, if it is the protocol being used to transmit data from point A to point B, ensures that it either all
gets there, or ultimately, none of it at all.
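The same numbering is what lets the receiver notice a gap. Here is a minimal sketch of that check, again using the "part n of total" labels from the envelope demonstration rather than TCP's actual byte-based sequence numbers and acknowledgments; `missing_parts` is a hypothetical helper.

```python
def missing_parts(received: list) -> list:
    """Given the (part, total) labels that arrived, list the parts to re-request."""
    if not received:
        return []
    total = received[0][1]
    seen = {part for part, _ in received}
    return [n for n in range(1, total + 1) if n not in seen]


# Brian got one, two, and four out of four, but not three.
arrived = [(1, 4), (2, 4), (4, 4)]
print(missing_parts(arrived))  # [3]
```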

- [8:17:45](https://www.youtube.com/watch?v=undefined&t=65865s) So this is an important property,


but, just as a teaser there's other protocols out there. There's something called UDP, which is an
alternative to TCP, that doesn't guarantee delivery. And just as a taste of why you might ever not want to
guarantee delivery, maybe you're watching like a streaming video, like a sports event online.
- [8:18:03](https://www.youtube.com/watch?v=undefined&t=65883s) You probably don't necessarily
want the thing to buffer and buffer and buffer, just because you have a slow connection, because you're
going to start to miss things. And then you're going to be the only one in the world watching the game
that ended 20 minutes ago, when everyone else is sort of up to speed.

- [8:18:16](https://www.youtube.com/watch?v=undefined&t=65896s) Similarly for a voice call, it would


be really annoying if our voice is constantly buffered. So UDP might be a good protocol for making sure
that, even if the person on the other end sounds a little crappy, at least you can hear them. It's not
pausing and resending and resending, because that would really slow down that sort of human
interaction.
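To make the contrast concrete, this sketch sends one UDP datagram to the same machine using Python's standard `socket` module. It assumes a working loopback interface; the payload and the one-second timeout are arbitrary choices. The point is only that nothing in the code asks for a resend: if the datagram were lost, the receiver would simply time out and move on.

```python
import socket

# Receiver: a UDP (datagram) socket on the local machine.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0 asks the OS for any free port
receiver.settimeout(1.0)
address = receiver.getsockname()

# Sender: no connection setup, just fire the datagram at the address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame 1", address)

try:
    data, _ = receiver.recvfrom(1024)
except socket.timeout:
    data = None                   # lost datagram: nobody resends it
finally:
    sender.close()
    receiver.close()

print(data)
```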

- [8:18:34](https://www.youtube.com/watch?v=undefined&t=65914s) So, in short, IP handles the


addressing of these packets, and standardizes numbers that every computer, your own included, gets,
and TCP handles the standardization of like what services can be used, between points A and point B. All
right, this is great, but presumably, when Phyllis sends a message to Brian, she doesn't really know and
probably shouldn't care what his IP address is, right? These days it's, like, I don't know most of the phone
numbers that my friends have.

- [8:19:03](https://www.youtube.com/watch?v=undefined&t=65943s) I instead look them up in some


way. And, indeed, when you visit a website, what do you type in? It's typically not something dot
something dot something dot something, where each of those somethings is a number. What do you
typically type in to a browser? So a domain name, right? Something like Stanford.edu, Harvard.edu,
Yale.edu, gmail.com,

- [8:19:20](https://www.youtube.com/watch?v=undefined&t=65960s) or any other such domain


name. And so, thankfully, there's another system on the internet, one more acronym for today, called
DNS, domain name system. And pretty much every network on the internet, Harvard's, Yale's, Comcast's,
your own home network, somewhere, somehow has a DNS server. You probably didn't have to configure
it yourself.

- [8:19:40](https://www.youtube.com/watch?v=undefined&t=65980s) Someone else did, your campus,


your job, your internet service provider. But there is some server connected somehow to the network
you're on, via wires or wirelessly, that just has a really big table in its memory, a big spreadsheet, if you
will, or, if you prefer, a hash table, that has at least two columns of keys and values respectively.

- [8:20:01](https://www.youtube.com/watch?v=undefined&t=66001s) Where on the left hand side is


what we'll call the domain name, something like Harvard.edu or Yale.edu, and on the right hand side is an
IP address. That is to say, a DNS server's purpose in life is just to translate domain names to IP addresses,
and vice versa, if you want to go in the other direction. And technically, just to be precise, it translates fully
qualified domain names to IP addresses.
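The two-column table can be modeled as a dictionary of keys and values. This is a toy stand-in, not a real DNS server: the names and the addresses (from a range reserved for documentation) are invented, and `resolve` and `reverse` are hypothetical helpers. For an actual lookup, Python's standard library offers `socket.gethostbyname`, which asks whatever DNS server the machine is configured to use.

```python
from typing import Optional

# Hypothetical table: fully qualified domain name -> IP address.
DNS_TABLE = {
    "www.example.com": "192.0.2.1",
    "mail.example.com": "192.0.2.2",
}


def resolve(fqdn: str) -> Optional[str]:
    """Translate a fully qualified domain name to an IP address."""
    return DNS_TABLE.get(fqdn.lower().rstrip("."))


def reverse(ip: str) -> Optional[str]:
    """Go in the other direction: IP address back to a name."""
    for name, address in DNS_TABLE.items():
        if address == ip:
            return name
    return None


print(resolve("www.example.com"))
print(reverse("192.0.2.2"))
```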

- [8:20:23](https://www.youtube.com/watch?v=undefined&t=66023s) And we'll see what those are in


just a moment. But again, all of this just kind of happens magically when you turn on your phone or your
laptop today, because all of these things are pre-configured for us nowadays. So how can we actually
start to see some of these things in action? Well, let's go ahead and poke around, for instance, at a
couple of URLs here.
- [8:20:45](https://www.youtube.com/watch?v=undefined&t=66045s) Let's see what we can actually do
now with these basic primitives. If we now have the ability to move data from point A to point B, and
what can be in that envelope could be, yes, an email, but today, onward, it's really going to be web
content. There's going to be content that you're requesting, like give me today's home page.

- [8:21:02](https://www.youtube.com/watch?v=undefined&t=66062s) And there's content you're


sending, which would be the contents of that actual home page. And so, just to go one level deeper, now
that we have these packets that are getting from point A to point B using TCP/IP, let's put something
specific inside of them, not just an email and a bunch of text, but something called HTTP, which stands
for hypertext transfer protocol.

- [8:21:24](https://www.youtube.com/watch?v=undefined&t=66084s) You've seen this for decades now,


probably, in the form of URLs, so much so that you probably don't even type it nowadays. Your browser
just adds it for you automatically, and you just type in Harvard.edu, or Yale.edu, or the like. But HTTP is
just a final protocol that we'll talk about here, that just standardizes how web browsers and web servers
inter-communicate.

- [8:21:45](https://www.youtube.com/watch?v=undefined&t=66105s) So this is a distinction now


between the internet and the web. The internet is really like the low-level plumbing, all of the cables, all
of the technology that just moves packets from left to right, right to left, top to bottom, that gets data from
point A to point B.

- [8:22:02](https://www.youtube.com/watch?v=undefined&t=66122s) You can do anything you want on


top of that internet nowadays, email and web and video and chat and gaming, and all of that. So HTTP,
or the web, is just one application that is conceptually on top of, built on top of the internet. Once you
take for granted that there is an internet, you can do really interesting things with it, just like in our
physical world, once you have electricity, you can just assume you can do really interesting things with
that, too, without even knowing or caring how it works.

- [8:22:25](https://www.youtube.com/watch?v=undefined&t=66145s) But now that you'll be


programming for the web, it's useful to understand how some of these things indeed work. So let's take
a peek at the format of the things that go inside of these messages. These days, it's usually actually
HTTPS that's in play, where, again, the S just means secure.

- [8:22:42](https://www.youtube.com/watch?v=undefined&t=66162s) More on that later, but the HTTP


is what standardizes what kinds of messages go inside of these envelopes. And wonderfully, it's just
textual information, typically. There is a simple text format that humans decided on years ago, that goes
inside of these envelopes, that tells a browser how to request information from a server, and how to
respond from the server to that client with information.

- [8:23:06](https://www.youtube.com/watch?v=undefined&t=66186s) So here's, for instance, a


canonical URL, https://www.example.com. What might you see at the end of this? You might sometimes
see a slash. Browsers nowadays kind of simplify things and don't show it to you. But slash, as we'll see,
just represents like the default folder, the root of the web server's hard drive, like whatever the base is of
it.

- [8:23:26](https://www.youtube.com/watch?v=undefined&t=66206s) It's like C colon backslash on


Windows, or it's my computer on Mac OS. But a URL can have more than that. It can have slash path,
where path is just a word, or multiple words, that sort of describe a longer part of the URL. That path
could actually be a specific file, we'll see, like something called file.html.

- [8:23:45](https://www.youtube.com/watch?v=undefined&t=66225s) More on HTML in just a bit, or it


can even be slash folder, maybe with another slash, or maybe it can be /folder/file.html. Now these days
Safari, and even Chrome to some extent, and other browsers, are in the habit of trying to hide more and
more of these details from you and me. Ultimately, though, it'll be useful to understand what URLs
you're at, because it maps directly to the code, that we're ultimately going to write.

- [8:24:11](https://www.youtube.com/watch?v=undefined&t=66251s) But this is only to say that all this


stuff in yellow refers to, presumably, a specific file and/or folder on the web server, on which you're
programming. All right, what's this? Example.com, this is the domain name, as we described it earlier.
Example.com is the so-called domain name. This whole thing, www.example.com, is the fully qualified
domain name.

- [8:24:34](https://www.youtube.com/watch?v=undefined&t=66274s) And what the WWW is referring to


is specifically the name of a specific server in that domain. So back in the day, there was a
www.example.com web server. There might have been a mail.example.com mail server. There might
have been a chat.example.com chat server. Nowadays, this hostname, or subdomain, depending on the
context, can actually refer to a whole bunch of servers, right? When you go to www.facebook.com,

- [8:25:02](https://www.youtube.com/watch?v=undefined&t=66302s) that's not one server, that's


thousands of servers nowadays. So long story short, there's technology that somehow gets your data to
one of those servers, but this whole thing is what we meant by fully qualified domain name. This thing
here, hostname, in the context of an email address it might alternatively be called a subdomain.

- [8:25:17](https://www.youtube.com/watch?v=undefined&t=66317s) This thing here, top level domain,


you probably know that dot com means commercial, although anyone can buy it these days. Dot org is
similar, dot net. Some of them are a bit restricted, dot mil is just for the US military, dot edu is just for
accredited educational institutions. But there are hundreds, if not more, top level domains nowadays,
some more popular than others.

- [8:25:38](https://www.youtube.com/watch?v=undefined&t=66338s) CS50's tools, for instance, use


CS50.io. IO sort of connotes input-output. It actually belongs, though, to a small island nation, a country,
whose country code is .io, and you see other two letter top level domains that are country specific.
Indeed, something.uk, something.jp,

- [8:26:00](https://www.youtube.com/watch?v=undefined&t=66360s) and the like typically refer to


countries. But some of them have been rather co-opted, .tv as well, because they have these meanings
in English as well. Lastly, this is what we'll call the protocol. That specifies how the server uses this URL to
get data from point A to point B. So what is inside of this envelope? Let's now start poking around a little
bit more.
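The pieces of the URL walked through above can be pulled apart with Python's standard `urllib.parse.urlsplit`. The sketch uses the lecture's example.com URL with the /folder/file.html path; splitting the name on dots to get the hostname, domain name, and top-level domain is a naive assumption that happens to work for this example and would not handle every real-world domain.

```python
from urllib.parse import urlsplit

url = "https://www.example.com/folder/file.html"
parts = urlsplit(url)

# Naive split on dots: fine for www.example.com, not a general rule.
labels = parts.hostname.split(".")

breakdown = {
    "protocol": parts.scheme,
    "fully qualified domain name": parts.hostname,
    "hostname": labels[0],
    "domain name": ".".join(labels[-2:]),
    "top-level domain": labels[-1],
    "path": parts.path,
}

for name, value in breakdown.items():
    print(f"{name}: {value}")
```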

- [8:26:19](https://www.youtube.com/watch?v=undefined&t=66379s) What is inside of this envelope?


It's essentially, for our purposes today, one of two verbs, either GET or POST. And if any of you have
dabbled with HTML or made your own website, you might have seen some of these terms before. But
these two verbs describe just how to send information from you to the server.
- [8:26:37](https://www.youtube.com/watch?v=undefined&t=66397s) Long story short, more on this
next week, GET means put any user input in the URL, POST means hide it, so that things you're searching
for, credit card numbers you're typing in, usernames and passwords you're inputting, don't show up in
the URL, where they would be visible to anyone with access to your computer and your search history, but
rather they're somehow provided elsewhere, deeper into that envelope.
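The difference is easier to see side by side. In this sketch the same user input ends up in the URL for GET and deeper in the envelope, after the headers, for POST. The helper names, the /search path, and the `q` parameter are assumptions for illustration, and the exact text format of these messages is covered a little later in the lecture.

```python
from urllib.parse import urlencode


def build_get(host: str, path: str, params: dict) -> str:
    """GET: the user's input is appended to the URL itself."""
    return f"GET {path}?{urlencode(params)} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(host: str, path: str, params: dict) -> str:
    """POST: the user's input travels in the body, after the headers."""
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


get_text = build_get("www.example.com", "/search", {"q": "cats"})
post_text = build_post("www.example.com", "/search", {"q": "cats"})
print(get_text)
print(post_text)
```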

- [8:26:58](https://www.youtube.com/watch?v=undefined&t=66418s) But for now, we'll focus almost


entirely on GET, which is perhaps the most common one that we're always going to use. And what we're
going to do is this. Let me switch over just to a blank screen here. And if we assume that little old me is
this laptop here, and I'm connected to the cloud, and in that cloud is some server that I want to request
the web page of, Harvard.edu or Yale.edu,

- [8:27:20](https://www.youtube.com/watch?v=undefined&t=66440s) it's really going to be a


two-step process. There's going to be a request, that goes from point A to point B, and then, hopefully, the
server that hears that request is going to reply with what we'll typically call a response. And other terms
that are relevant here: my laptop is the so-called client, and Harvard.edu, Yale.edu,

- [8:27:41](https://www.youtube.com/watch?v=undefined&t=66461s) whatever it is, is the


so-called server. And just like in a restaurant, where you might request something to eat, the server might
bring it to you. It's, again, that kind of bidirectional relationship. One request, one response, for each
such web page we request. All right, so what's inside these envelopes, and what do we actually see?
Well, this arrow, this line I just drew from left to right, representing the request, technically looks a little
more like this.

- [8:28:05](https://www.youtube.com/watch?v=undefined&t=66485s) When you visit a web page, using


your browser, on your phone, laptop, or desktop, what's going inside that envelope, and the textual
message your Mac or PC or phone is automatically generating, looks a little something like this. The verb
GET, the URL, or rather the path that you want to get, slash represents the default page on the website.

- [8:28:23](https://www.youtube.com/watch?v=undefined&t=66503s) HTTP/1.1 is just some mention of


what version of HTTP you're speaking. Now we're up to version 2, and version 3, but 1.1 is quite
common. And the envelope contains some mention of the host that was typed in, the fully qualified
domain name. This is because single servers can actually host many different websites.

- [8:28:42](https://www.youtube.com/watch?v=undefined&t=66522s) If you're using Squarespace or


Wix or one of these popular hosting websites nowadays, you don't get your own personal server, most
likely. You're on the same server as dozens, hundreds of other customers. But when your customers,
your users' browsers, include a little mention of your specific, fully qualified domain name in the
envelope, Squarespace and Wix just know to send it to your web page or my web page or some other
customer altogether.
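Here is a sketch of what that request text looks like and why the Host line matters when one server hosts many sites. The site names and page contents are invented, and `route` is a hypothetical stand-in for whatever a hosting company actually does; it only illustrates the idea of choosing a site by the Host line.

```python
from typing import Optional


def build_request(host: str, path: str = "/") -> str:
    """The minimal request text: verb, path, version, then the Host line."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def host_of(request: str) -> Optional[str]:
    """Pull the value of the Host line out of the request text."""
    for line in request.split("\r\n")[1:]:
        if not line:
            break                      # blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


# Hypothetical customers sharing one physical server.
SITES = {
    "alice.example": "<h1>Alice's page</h1>",
    "bob.example": "<h1>Bob's page</h1>",
}


def route(request: str) -> Optional[str]:
    """Pick which customer's page to send back, based on the Host line."""
    return SITES.get(host_of(request))


print(build_request("alice.example"))
print(route(build_request("bob.example")))
```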

- [8:29:07](https://www.youtube.com/watch?v=undefined&t=66547s) Dot dot dot, there's some other


stuff there. But that's really the essence of what's in these requests. Hopefully, then, when your browser
requests this web page from the server, what comes back? Well, hopefully, a response that looks like
this, HTTP/1.1, so the same version, some status code, like a number 200, and then literally a short
phrase like OK, which means exactly that, like, OK, this request was satisfied.
- [8:29:32](https://www.youtube.com/watch?v=undefined&t=66572s) Then it contains some other
information, like the type of content that's coming back. And we'll see that this, too, is standardized.
text/html means here comes some HTML, which is just a text language. It could instead be image/jpeg
or image/png, or video/mp4, there are these different content types, otherwise known as MIME types,
that uniquely identify types of files, that come back, similar in spirit to file extensions, but a little more
standardized this way.
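Because the response is just text, it can be taken apart with ordinary string operations. The sketch below parses a hand-written sample modeled on the response described above; the sample text and the `parse_response_head` helper are assumptions for illustration, not output captured from a real server.

```python
def parse_response_head(text: str):
    """Split a response into version, status code, phrase, and key: value lines."""
    lines = text.split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break                      # blank line ends the key: value lines
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers


# Hand-written sample, not a captured response.
sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
version, code, phrase, headers = parse_response_head(sample)
print(version, code, phrase)
print(headers["content-type"])
```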

- [8:29:59](https://www.youtube.com/watch?v=undefined&t=66599s) Then there's some more stuff, dot


dot dot. But in general, what you see here is a familiar pattern, keys and values. These keys and values
are otherwise known as HTTP headers. And your browser has been sending these every time you visit a
website. And, indeed, we can see this right now ourselves. Let me go over, in just a second, to Chrome
on my computer, though you can do this kind of thing with most any browser today.

- [8:30:24](https://www.youtube.com/watch?v=undefined&t=66624s) I'll go ahead and visit


HTTP://Harvard.edu, Enter. And, voila, I'm at Harvard's home page for today. The content often changes.
But this is what it looks like right now. Well, I typed in the URL, but notice it changed a little bit. It
actually sent me to HTTPS and added the www, even though I didn't type that.

- [8:30:43](https://www.youtube.com/watch?v=undefined&t=66643s) But it turns out we can poke


around at what my browser is actually doing. Let me open another page. I'm going to start to use
incognito mode this time, not because I care that people know I'm visiting Harvard.edu, but because it
throws away any history that I just did. So that every request is going to look like a brand new one, and
that's just useful diagnostically, because we're always going to see fresh information.

- [8:31:03](https://www.youtube.com/watch?v=undefined&t=66663s) My browser is not going to


remember what I previously already requested. But I'm going to go up to View, developer, developer
tools, which is something that all of you have, if you use Chrome. And there's something analogous for
Firefox and Edge and Safari and other browsers. Developer tools is going to open up these tabs down
here.

- [8:31:20](https://www.youtube.com/watch?v=undefined&t=66680s) I don't really care what's new, so


I'm going to close the bottom thing there. And I'm going to hover over the Network tab for just a
moment. And now I'm going to go and say HTTP://Harvard.edu, so the shorter version. I'm going to hit
Enter, and a whole bunch of stuff just flew across the screen.

- [8:31:37](https://www.youtube.com/watch?v=undefined&t=66697s) And it's still coming in. And if I


zoom in down here, my God, visiting Harvard.edu, still going, is downloading, what 17, 18, 19
megabytes, 20 megabytes, millions of bytes of information, over 111 HTTP requests. In other words, a bit
of a simplification, but my browser, unbeknownst to me, sent one envelope initially with the request.

- [8:32:00](https://www.youtube.com/watch?v=undefined&t=66720s) Then the server said, OK, by the


way, there's 110 other things you need, 112 other things you need to get. So my computer went back
and forth, requesting even more content for me. Why? Well, inside of Harvard's web page is a whole
bunch of images, and maybe sound files and videos and other stuff, that all need to be downloaded
to compose what is ultimately the web page.

- [8:32:20](https://www.youtube.com/watch?v=undefined&t=66740s) But I don't care about like 100


plus of these things. Let's focus on the very first one first. The very first request I sent was up here. And
I'm going to click on this row, under the Network tab. And then I'm going to see a bit of diagnostic
information. To an average person using the web, they needn't care about this, just as you probably
didn't care about it until right now.

- [8:32:39](https://www.youtube.com/watch?v=undefined&t=66759s) And even then, perhaps not. But


if I scroll down to request headers, you will see, if I click View source, literally everything that was in the
request my Mac just sent to Harvard.edu. Two of the lines are familiar, GET / HTTP/1.1 and Host: harvard.edu,
and then other stuff that, for now, it's not that interesting for us.

- [8:33:00](https://www.youtube.com/watch?v=undefined&t=66780s) But let's look at the response that


came back from the server. I'm going to scroll up now and see response headers, view source. And this is
interesting. It is not OK. There's no 200, there's no word OK. Curiously, harvard.edu has moved
permanently. What does that mean? Well, there's a whole bunch of stuff here that's not that interesting
for us.

- [8:33:22](https://www.youtube.com/watch?v=undefined&t=66802s) But this line, location, is


interesting. This is an HTTP header, a standardized key value pair, that's part of the HTTP protocol, that is,
conventions. And if I highlight just this one, it's telling me, mm-mmm, Harvard is not at
HTTP://Harvard.edu, Harvard's website is now, and perhaps forever, at HTTPS://www.harvard.edu.

- [8:33:44](https://www.youtube.com/watch?v=undefined&t=66824s) So what's the value here?


Probably someone at Harvard wants you to use a secure connection. So they redirected you from HTTP
to HTTPS. Maybe the marketing people want you to be at www instead of just Harvard.edu. Why? Just to
standardize things, but there are technical reasons to use a hostname, and not just the raw domain
name.

- [8:34:03](https://www.youtube.com/watch?v=undefined&t=66843s) And all this other stuff is sort of


uninteresting for our purposes, now, because a browser that receives a 301 response knows, by design,
by the definition of HTTP, to automatically redirect the user. And that's why, in my browser, all of this
happened in like a split second, because I didn't really know or care about all of those headers.

- [8:34:23](https://www.youtube.com/watch?v=undefined&t=66863s) But that's why and how I ended


up at this URL here. My browser was told to go elsewhere via that new location. And the browser just
followed those breadcrumbs, if you will, at which point it downloaded all of the other images and files,
and so forth, that compose this particular page. Well, let me zoom out.
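The breadcrumb-following behavior can be sketched without touching the network. The table below is canned data modeled on the responses seen in the demonstration, not live output, and `follow` is a hypothetical helper that does roughly what a browser does with a Location header; the trailing slash on the second URL is an assumption.

```python
# Canned responses modeled on the demo: URL -> (status code, headers).
FAKE_WEB = {
    "http://harvard.edu": (301, {"Location": "https://www.harvard.edu/"}),
    "https://www.harvard.edu/": (200, {"Content-Type": "text/html"}),
}

REDIRECT_CODES = (301, 302, 307, 308)


def follow(url: str, max_hops: int = 5):
    """Keep following Location headers until a non-redirect status arrives."""
    hops = [url]
    for _ in range(max_hops):
        status, headers = FAKE_WEB[url]
        if status not in REDIRECT_CODES:
            return status, hops
        url = headers["Location"]      # go where the server says to go
        hops.append(url)
    raise RuntimeError("too many redirects")


status, hops = follow("http://harvard.edu")
print(status, " -> ".join(hops))
```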

- [8:34:41](https://www.youtube.com/watch?v=undefined&t=66881s) And let me actually go into VS


Code, if only because it's a little more pleasant to do things in just a terminal window, without actually
using a full-fledged browser. So now let's just use an equivalent program. It's called Curl, for connecting
to a URL, that's going to allow me to play with websites and just see those headers, without bothering to
download all the images and text and so forth from the website.

- [8:35:01](https://www.youtube.com/watch?v=undefined&t=66901s) It's going to allow me to do


something like this. Let me go ahead and run, for instance, curl -I -X GET, which is just the command line
arguments that say simulate a GET request textually, as though you're a browser. And let's go to
http://harvard.edu, Enter. Now, by way of how curl works, I'm just seeing the headers.
- [8:35:24](https://www.youtube.com/watch?v=undefined&t=66924s) It didn't bother downloading the
whole website. And you see exactly the same thing, 301 moved permanently. Location is, indeed, this
one here. So that's kind of interesting. But let's follow it manually now. Let's now do what it's telling me
to do. Let's go to the location, with HTTPS and the www and hit Enter.

- [8:35:42](https://www.youtube.com/watch?v=undefined&t=66942s) And now, what's a good sign with


this output? Most of it's irrelevant. AUDIENCE: Migrate? SPEAKER 1: 200 OK, that means I'm seeing,
presumably, if I were using a real browser, the actual content of the web page. Looks like Harvard's
version of HTTP is even newer than the one I'm using.

- [8:35:59](https://www.youtube.com/watch?v=undefined&t=66959s) It's using HTTP version 2, which is


fine. But 200 is indeed indicative of things being OK. Well, what if I try visiting some bogus URL, like
Harvard.edu, when this file does not exist, something completely random, probably doesn't exist, and hit
Enter. What do you see now, that's perhaps familiar, in the real world? Yeah.

- [8:36:22](https://www.youtube.com/watch?v=undefined&t=66982s) AUDIENCE: Error 404. SPEAKER 1:


Yeah, error 404. All of us have seen this probably endlessly, from time to time, when you screw up by
mis-typing a URL, or someone deletes the web page in question. But all that is is a status code that a
browser is being sent from the server, that's a little clue as to what the actual problem is, underneath the
hood.

- [8:36:40](https://www.youtube.com/watch?v=undefined&t=67000s) So instead of getting back, for


instance, something like OK, or moved permanently, what I've just gotten back, quite simply, is 404 not
found. Well, it turns out there's other types of status codes that you'll start to see over time, as you start
to program for the web. 200 is OK. 301 is moved permanently.

- [8:36:58](https://www.youtube.com/watch?v=undefined&t=67018s) 302, 304, 307 are all similar in


spirit. They're related to redirecting the user from one place to another. 401, 403, unauthorized or
forbidden. If you ever mess up your password, or you try visiting a URL you're not supposed to look at,
you might see one of these codes, indicating that you just don't have authorization for those.

- [8:37:17](https://www.youtube.com/watch?v=undefined&t=67037s) 404 not found, 418, I'm a teapot,


was an April Fool's joke by the tech community years ago. 500 is bad. And, unfortunately, all of you are
probably on a path now to creating HTTP 500 errors, once, next week, we start writing code, because all
of us are going to screw up. We're going to have typos, logical errors, and this is on the horizon, just like
segfaults were in the world of C, but solvable with the right skills.
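Python's standard library already knows these numbers. The sketch below looks up the standard phrase for a few common codes with `http.HTTPStatus` and groups them by their first digit; the `category` helper and its labels are a convenience invented for this example.

```python
from http import HTTPStatus


def category(code: int) -> str:
    """Group a status code by its first digit."""
    groups = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return groups.get(code // 100, "other")


for code in (200, 301, 404, 500, 503):
    print(code, HTTPStatus(code).phrase, "->", category(code))
```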

- [8:37:42](https://www.youtube.com/watch?v=undefined&t=67062s) 503 service unavailable, means


maybe the server is overloaded, or something like that. And there's other codes there. But those are
perhaps some of the most common ones. Has anyone, we can get away with this here, less so in New
Haven, has anyone ever visited SafetySchool.org? HTTP://SafetySchool.org, dare we do this, Enter.

- [8:38:08](https://www.youtube.com/watch?v=undefined&t=67088s) Oh, look at that. Where did we


end up? [LAUGHTER] OK, so-- [APPLAUSE] --so this has been like a joke for like 10 or 20 years. Someone
out there has been paying for the domain name, safetyschool.org, just for this two second
demonstration. But we can now infer, how did this work? The person who bought that domain name and
somehow configured DNS to point to their web server, the IP address of their web server, what is their
web server presumably spitting out, whenever a browser requests the page? What status code,
perhaps?

- [8:38:43](https://www.youtube.com/watch?v=undefined&t=67123s) Well, we can simulate this. Let me


go over to VS Code. Let me go back over here. Let me increase my terminal window. Let me do curl -I
-X GET http://safetyschool.org, Enter, and that's all this website does. There's not even an actual website
there. No HTML, no CSS languages we're about to see. It literally just exists on the internet to do that
redirect there.

- [8:39:10](https://www.youtube.com/watch?v=undefined&t=67150s) In fairness, there are others. Let


me actually do another one here. Instead of safetyschool.org, turns out someone, some years ago,
bought HarvardSucks.org Enter. And when we do this, you'll see that, oh, they don't need us to be
secure, but I do need the www. Let's do this one, Enter. That one is not found.

- [8:39:37](https://www.youtube.com/watch?v=undefined&t=67177s) This demo actually worked for so


many years. But someone has stopped paying for the Squarespace account recently, apparently. So--
[APPLAUSE] OK, so, fortunately, we did save the YouTube video to which this thing refers. And so, just to
put this into context, since it's been quite a few years, Harvard and Yale, of course, have this long-
standing rivalry.

- [8:40:00](https://www.youtube.com/watch?v=undefined&t=67200s) There is this tradition of pranking


each other. And, honestly, hands down, one of the best pranks ever done in this rivalry was by Yale to
Harvard. It's about a three-minute retrospective. It's one of the earliest videos, I dare say, on YouTube, so
the quality is representative of that. But let me go ahead and full screen my page here.

- [8:40:18](https://www.youtube.com/watch?v=undefined&t=67218s) And what used to live at


HarvardSucks.org is this video here. If we could dim the lights for about three minutes. [VIDEO
PLAYBACK] [MUSIC PLAYING] [CHEERING] - Actually we're going all the way to the top. And you pass it
down. - This is for you, Yale. - We love you, Yale. - We're here to trip up Harvard.

- [8:41:04](https://www.youtube.com/watch?v=undefined&t=67264s) - Go Harvard! - Go Harvard! - Pass


from the top one, pass it down. - Pass them down. - It's nice to say the ERA sucks. - Let's go Harvard. -
Where does? - You see that [BEEP]? Where they're passing. It's going to have to happen. - It's actually
going to happen. I can't [BEEP] believe this.

- [8:41:26](https://www.youtube.com/watch?v=undefined&t=67286s) - What do you think of Yale? -


They don't think good. - Hah-hah. - Because they don't have it. Doesn't run out of stuff. - OK. - Is there
another stuff? - Probably that's going to be legible, very small. - Garbage. - I know, but-- - Well, what
houses? - Says, are we in boats now? - How many extras? - How many extra are there? - Sometimes.

- [8:41:50](https://www.youtube.com/watch?v=undefined&t=67310s) - Yeah. [CHEERING] - OK. - You


guys are from Harvard, right? - No, vote for. Full timer. - Yeah, these are '05s. - Just make sure everyone
has one. - They're probably crummy. - We're still passing. - All the cards needed. - Oh, no. My bad. - Yeah.
All right, cue it up. [CHEERING] Go, Harvard. [APPLAUSE] - Hold up your signs.

- [8:42:21](https://www.youtube.com/watch?v=undefined&t=67341s) [BEEP] - You suck. You suck. You


suck. You suck. You suck. You suck. You suck. You suck. You suck. You suck. You suck. You suck. You suck.
You suck. You suck. [SCREAMING] - What do you think of Yale, sir? - Going to be, do one more time. - One
more time. One more time. [SCREAMING] - Oh, there it goes again. - Oh. - Harvard sucks.

- [8:42:58](https://www.youtube.com/watch?v=undefined&t=67378s) Harvard sucks. Harvard sucks.


Harvard sucks. Harvard sucks. Harvard sucks. Harvard sucks. [END PLAYBACK] SPEAKER 1: All right, so
thanks to our friends at Yale for that one. Let's go ahead here and consider, in just a moment, what
further is deeper down inside of the envelope, because we now have the ability to get data from, oh, OK,
YouTube autoplay again.

- [8:43:24](https://www.youtube.com/watch?v=undefined&t=67404s) Got to stop doing that. Let's


consider for just a moment that we now have this ability to get
data from point A to point B. And we have the ability to specify in those envelopes what it is we want
from the website. We want to get the home page. We want to get back the HTML.

- [8:43:42](https://www.youtube.com/watch?v=undefined&t=67422s) But what is that HTML? In fact,


we don't yet have the language with which the web pages themselves are written, namely HTML and
CSS. But let's go ahead and take a five minute break here. And when we come back, we'll learn those
two languages. All right, we are back. So we've got three languages to look at today.

- [8:43:58](https://www.youtube.com/watch?v=undefined&t=67438s) But two of them are not actually


programming languages. What makes something a programming language, like C or Python and SQL, is
that there are these constructs via which you can express conditionals. You might have variables, you
might have looping constructs. You have the ability, ultimately, to express logic.

- [8:44:13](https://www.youtube.com/watch?v=undefined&t=67453s) HTML and CSS aren't so much


about logic as they are about structure, and the aesthetics of a page. And so we're going to create the
skeleton of a web page using this pair of languages, HTML and CSS. And then toward the end of
today, we'll introduce an actual programming language, that actually is pretty similar in spirit and
syntactically to both C and Python, but that's going to allow us to make these web pages not just static,
things that you look at, but interactive applications as well.

- [8:44:38](https://www.youtube.com/watch?v=undefined&t=67478s) And then next week again, in


week 9, we will reintroduce Python and SQL and tie all of this together, so that you can actually have a
browser or a phone talking to a back-end server, and creating the experience that you and I now take for
granted for most any app or website today. Well, let's go ahead and do this.

- [8:44:55](https://www.youtube.com/watch?v=undefined&t=67495s) Let's quickly whip up something


in this language called HTML. I'm in VS Code here. I'm going to go ahead and create a file quite simply
called, Hello.html. The convention is typically to end your file names in dot html. And I'm going to go
ahead and bang this out real quick. But then we'll more slowly step through what the constructs are
herein.

- [8:45:13](https://www.youtube.com/watch?v=undefined&t=67513s) So I'm going to say doctype html


open bracket html, and then notice I'm going to do open bracket slash html close bracket. And I'm
leveraging a feature of VS Code and programming environments more generally, to do a bit of
autocomplete. So you'll see that there's this symmetry to much of what I'm going to type, but I'm not
typing all of these things.
- [8:45:33](https://www.youtube.com/watch?v=undefined&t=67533s) VS Code is automatically
generating the end of my thought for me, if you will. Let me go ahead and say, Open the head tag. Open
the title tag. I'll say something cute like, Hello, title. And then down here, I'm going to create the body of
this web page and say something like Hello, body. And let me specify at the very top, that all of this is
really in English, Lang equals en.
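
As a sketch of what the file described above might look like once typed out: the tag names, the `lang="en"` attribute, and the two pieces of text come from the spoken description, while the exact indentation and capitalization are assumptions.

```html
<!DOCTYPE html>

<!-- The lang attribute tells the browser the page's text is in English -->
<html lang="en">
    <head>
        <!-- The title is shown in the browser tab, not in the page itself -->
        <title>Hello, title</title>
    </head>
    <body>
        <!-- Everything in the body is rendered in the main window -->
        Hello, body
    </body>
</html>
```

Saved as Hello.html, this is the whole page: one head with a title, and one body with a single line of text.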

- [8:45:55](https://www.youtube.com/watch?v=undefined&t=67555s) So at this moment, I have a file in


my VS Code environment called Hello.html. VS Code as we're using it, of course, is cloud-based. We're
using it in a browser, even though you can also download it and run it on a Mac and PC. So we are in this
weird situation where I'm using the cloud to create a web page, and I want that web page to also live in
the cloud, that is, on the internet.

- [8:46:18](https://www.youtube.com/watch?v=undefined&t=67578s) But the thing about VS Code, or


really any website that you might use in a browser, by default that website is using probably TCP port
number 80 or TCP port number 443, which is HTTP and HTTPS respectively. But here I am, sort of a
programmer myself, trying to create my own website on an existing website.

- [8:46:39](https://www.youtube.com/watch?v=undefined&t=67599s) So it's a bit of a weird situation.


But that's OK, because what's nice about TCP is that you and I can just pick port numbers to use and run
our own web server, on a web server. That is, we can control the environment entirely, by just running
our own web server via this command, http-server, in my terminal window.

- [8:46:59](https://www.youtube.com/watch?v=undefined&t=67619s) This is a command that we


preinstalled in VS Code here. And you'll notice a pop-up just came up. Your application running on port
8080 is available. That's a commonly used TCP port number, when 80 is already used, and 443 is already
used, you can run your own server on your own port, 8080 in this case.

- [8:47:15](https://www.youtube.com/watch?v=undefined&t=67635s) I've opened that tab in advance,


and if I go into another browser tab here, here I see a so-called directory listing of the web server I'm
running. So I don't see any of my other files. I don't see anything belonging to VS Code itself. I only see
the file that I've created in my current directory, called Hello.html.

- [8:47:32](https://www.youtube.com/watch?v=undefined&t=67652s) And so if I click on this file now, I


should see Hello, body. I don't see the title. But that's because the title of a web page nowadays is
typically embedded in the tab. And if I'm full screen in my browser, there are no tabs. So let me minimize
the window a bit. And now you can see just in this single browser window, in my own URL here, that
Hello, body, is in the top left hand corner.

- [8:47:52](https://www.youtube.com/watch?v=undefined&t=67672s) And if I zoom in, there's Hello,


title. So what have I done here? I have gone ahead and created my own web page in HTML, in a file
called Hello.html. And then I have opened up a web server of my own, configured it to listen on TCP port
8080, which just says to the internet, hey, listen for requests from web browsers, not on the standard
port number, 80 or 443, listen on 8080.

- [8:48:19](https://www.youtube.com/watch?v=undefined&t=67699s) And this means I can develop a


website using a web-based tool, like this one here, which is increasingly common today. All right, so now
let's consider what it is I actually just typed out. HTML is characterized really by just two features, two
vocab words, tags and attributes. Most of what I just typed were tags, but there was at least one
attribute already.

- [8:48:38](https://www.youtube.com/watch?v=undefined&t=67718s) Here's the same source code that


I typed out in HTML, from top to bottom. Let's consider what this is. The very first line of code here,
doctype html, is the only anomalous one. It's the only one that starts with an open bracket, a less than
sign, and an exclamation point. There's no more exclamation points thereafter, for now.

- [8:48:55](https://www.youtube.com/watch?v=undefined&t=67735s) This is the document type


declaration, which is a fancy way of saying, it's just got to be there nowadays. It's like a little breadcrumb
at the beginning of a file that says to the browser, you are about to see a file written in HTML version 5.
That line of code has changed over time, over the years.

- [8:49:11](https://www.youtube.com/watch?v=undefined&t=67751s) The most recent version of it is


nice and succinct like this, and it's just a clue to the browser as to what version of HTML is being used by
you, the programmer. All right, what comes after that? Well, after that, and I've highlighted two things in
yellow, this is what we're going to start calling an open tag, or a start tag, open bracket HTML then
something, close bracket, is the so-called start or open tag.

- [8:49:33](https://www.youtube.com/watch?v=undefined&t=67773s) Then the corresponding close or


end tag is down here. And it's almost the same. You use the same tag name, you use the same angle
brackets. But you do add a slash, and you don't repeat yourself with any of the things called attributes,
because, what is this thing here? Lang equals quote unquote "en," means the language of my page is
written in the English language.

- [8:49:54](https://www.youtube.com/watch?v=undefined&t=67794s) The humans have standardized


two and three letter codes for every human language, right now. And so this is just a clue to the browser
for like automatic translation and accessibility purposes, what language the web page itself is written in.
Not the tags, but the words, like Hello, title and Hello, body, which while minimalist, are indeed in
English.

- [8:50:13](https://www.youtube.com/watch?v=undefined&t=67813s) So when you close a tag, you


close the name of it with the slash and the angle brackets. You don't repeat the attribute. That would
just be annoying to have to type everything again. But notice the pattern here. It's new syntax. But this is
another example of key value pairs in computing. The key is Lang, the value is E-N for English.

- [8:50:32](https://www.youtube.com/watch?v=undefined&t=67832s) The attribute is called Lang, the


value is E-N. So again, it's just key value pairs, in just yet another context. Probably the
browser's using a hash table underneath the hood, to keep track of this stuff, like a two column table,
with keys and values. Again, humans keep using the same paradigm in different languages.

- [8:50:48](https://www.youtube.com/watch?v=undefined&t=67848s) What's inside of that? The nesting


is important visually, not to the computer, but to us, the humans, because it implies that there's some
hierarchy here. And, indeed, what is inside of the HTML tag here? Well, we have what we'll call the head
tag. The head tag says, hey, browser, here comes the head of the page.

- [8:51:06](https://www.youtube.com/watch?v=undefined&t=67866s) And then the body tag says, hey,


browser, here comes the body of the page. The body is like 99% of the user's experience, the big
rectangular window. The head is really just the address bar and other such stuff at top, like the title that
we saw a moment ago. Just to introduce the vernacular, then, the HTML tag, otherwise known as an
element, has two children, the head child and the body child, which is to say that head and body are
now siblings.

- [8:51:32](https://www.youtube.com/watch?v=undefined&t=67892s) So you can use the same kind of


family tree terminology that we used, when talking about trees, weeks ago. If we look at the head tag,
how many children does it seem to have? I'm seeing one, and, indeed, at least if we ignore all the white
space, the spaces or tabs or new line characters, there's just one child, a title element.

- [8:51:51](https://www.youtube.com/watch?v=undefined&t=67911s) And an element is the


terminology that includes the start tag and the end tag, and everything in between. So this is the title
element. And the title element has one child, which is just pure text, otherwise known as a text node.
Recall, node, from our discussions of data structures weeks ago. If we jump then to the body, which is
the other child of the HTML tag, it too has one child, which is just another chunk of text, a text node,
that says, quote unquote "Hello, body."

- [8:52:17](https://www.youtube.com/watch?v=undefined&t=67937s) What's nice about this


indentation, even though the browser technically is not going to care, is that it implies this kind of
structure. And this is where we connect, like week 5 and now week 8, here is the tree structure we
began to talk about, even in our world of C. It's not a binary tree, even though this one happens to have
no more than two children.

- [8:52:37](https://www.youtube.com/watch?v=undefined&t=67957s) It's an arbitrary tree that can have


0 or any number of children. But if we have a special node here that refers to the document, the root
node, so to speak, is HTML, drawn with a rectangle here, just for discussion's sake. It has two children,
head and body, also rectangles. Head has a title child, and then it and body have text nodes, which I've
drawn with ovals instead.

- [8:52:58](https://www.youtube.com/watch?v=undefined&t=67978s) Which is only to say that when


your browser, Chrome, Safari, whatever, downloads a web page, opens up that envelope, and sees the
contents that have come back from the server, it essentially reads the code that someone wrote, the
HTML code, top to bottom, left to right, and creates in the browser's memory, in your Mac or your PC or
your phone's memory or RAM, this kind of data structure.

- [8:53:20](https://www.youtube.com/watch?v=undefined&t=68000s) That's what's going on


underneath the hood. And that's why aesthetically, it's just nice, as a human, to indent things stylistically,
because it's very clear then to you, and to other programmers, what the structure actually is. So that's it
for like the fundamentals of HTML. We'll see a bunch of tags and a bunch of examples now.

- [8:53:38](https://www.youtube.com/watch?v=undefined&t=68018s) But HTML is just tags and


attributes. And it's the kind of thing that you look them up when you need to. Eventually, many of them
get ingrained. I constantly check the reference guides or Stack Overflow if I'm trying to figure out, how do
I lay something out. It's really just these building blocks that allow you to assemble the structure of a
web page.

- [8:53:55](https://www.youtube.com/watch?v=undefined&t=68035s) This one is being super simple,


but it's just tags and attributes. Any questions on this framework, before we start to add more tags, more
vocabulary, if you will? In the middle, yeah. AUDIENCE: What would happen if we put the title tag?
SPEAKER 1: If we put the title tag around body, that's a good question.

- [8:54:13](https://www.youtube.com/watch?v=undefined&t=68053s) Let's try it. So let me actually go


to this, and say open bracket title, whoops, sometimes you don't want it to finish your thought for you.
But it did that time. I've gone ahead and changed the file. Let me go and open up, give me a second to
open my terminal window, and go back to the URL that has my page.

- [8:54:34](https://www.youtube.com/watch?v=undefined&t=68074s) Give me a second. There's my


Hello.html. Let me zoom in on this. Let me zoom in on this. And let me go ahead now and click on
Hello.html. And in this case, it looks like we don't actually see anything. So the browser is hiding it.
Technically speaking, browsers tend to be pretty generous. And half the time, when you make mistakes
in HTML, the page might still display, just not as you intended.

- [8:54:59](https://www.youtube.com/watch?v=undefined&t=68099s) It might not display the same on


Macs or PCs or Chrome or on Firefox. There is a tool, though, that we'll see, that can help answer this
question for you. For instance, if I go to Validator.w3.org, W3 is the World Wide Web Consortium, a
group of people that standardize this kind of stuff, I can click on Validate by direct input, and just copy
paste my sample HTML into this box, and click Check.

- [8:55:21](https://www.youtube.com/watch?v=undefined&t=68121s) And I should see, hopefully, that


indeed, it's an error, what you proposed that I do. The browser just did its best to do something, which
was to show me nothing at least, rather than the incorrect information. But if I revert that change, and
let me undo what we just did, let me copy my original code back into this text box, and click Check, now
you can see, conversely, my code is now correct.
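
The transcript does not capture exactly what was typed for this experiment, so the following is only a guess at the invalid variant: it assumes the body text was wrapped in a second title element. A validator should flag the title inside body, and a browser will typically render nothing for it.

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>Hello, title</title>
    </head>
    <body>
        <!-- Invalid: title is only allowed inside head, so a validator
             reports an error and browsers typically hide this text -->
        <title>Hello, body</title>
    </body>
</html>
```

Removing the inner title tags, so that the body contains only the plain text, restores the valid page.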

- [8:55:42](https://www.youtube.com/watch?v=undefined&t=68142s) And there's automated tools to


check that. But we'll encourage you, for problem sets and projects, to use that particular manual tool. All
right, so let's go ahead and enhance this a little bit by introducing a whole bunch of tags, just to give you
a sense of some of the building blocks here. So I'm going to go ahead and create a new file called
Paragraphs.html.

- [8:56:01](https://www.youtube.com/watch?v=undefined&t=68161s) And I'm just going to do a bunch


of copy/paste just to start things off, so I'm not constantly typing all this darn stuff again and again,
because I want everything to be the same here, except I'm going to change my title to be Paragraphs for
this demo. And inside of the body, I need a whole bunch of paragraphs of text.

- [8:56:16](https://www.youtube.com/watch?v=undefined&t=68176s) And I don't really want to come


up with some text. So let me go to some random website here and grab lorem ipsum text, which if you're
involved in like student newspaper or just design, this is placeholder text, kind of looks like Latin, but
technically isn't. Here, though, I have a handy way of just getting three long paragraphs in something
that looks like Latin.

- [8:56:34](https://www.youtube.com/watch?v=undefined&t=68194s) And I've put those, notice, inside


of the body. And they're indeed long. Look how long the made-up words here are. So let me go now into
my browser tab here. Let me reload this page, and you'll see two files have now appeared,
Paragraphs.html, which is my new one, and Hello.html. Let me click on Paragraphs.html,

- [8:56:58](https://www.youtube.com/watch?v=undefined&t=68218s) and what clearly seems to
be wrong? Yeah. AUDIENCE: One paragraph. SPEAKER 1: Yeah, it's obviously one massive paragraph,
instead of three. So that's interesting, but it's just a little hint as to how pedantic HTML is. It will only do
what you say. And each of these tags tells the browser to start doing something, and then maybe stop
doing something, like, hey, browser, here comes my HTML.

- [8:57:16](https://www.youtube.com/watch?v=undefined&t=68236s) Hey, browser, here comes the


head of my page. Hey, browser, here comes the title of my page, Hello, title. Hey, browser, that's it for
the title. That's it for the head, here comes the body tag. So it's kind of having this conversation between
the browser, between the HTML and the browser, doing literally what it says.

- [8:57:30](https://www.youtube.com/watch?v=undefined&t=68250s) So if you want a paragraph, you're


probably going to want to use the P tag for paragraph. And I'm going to go ahead and add this to my
code. I'm going to keep things neat, even though the browser won't care, by indenting things here. Let
me create another paragraph tag here, and close it right after that one, indenting again, and I'm keeping
everything nice and orderly.

- [8:57:51](https://www.youtube.com/watch?v=undefined&t=68271s) Let me do one more here. Let me


indent that, and then let me add it to the end of my page here. So again, a little tedious, but now I have
three paragraphs of text that say, hey, browser, start a paragraph. Hey, browser, stop that paragraph.
Start, stop, and so forth. Let me go back to the browser window here.
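
A minimal sketch of Paragraphs.html after the fix, assuming short stand-in sentences in place of the three long lorem ipsum paragraphs pasted in the demo:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>Paragraphs</title>
    </head>
    <body>
        <!-- Each p element starts and stops one paragraph; without the tags,
             the browser collapses all of the text into a single block -->
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
        </p>
        <p>
            Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
        </p>
        <p>
            Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.
        </p>
    </body>
</html>
```

The indentation is purely for human readers; the browser would render the same three paragraphs if everything were on one line.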

- [8:58:10](https://www.youtube.com/watch?v=undefined&t=68290s) Let me hit Command R or Control


R to reload the page. And voila, now I have three cleaner paragraphs, all right? So there's a P tag for
paragraphs. So now we have that particular building block. What if I want to add, for instance, some
headings to this page? Well, that's something that's possible, too.

- [8:58:26](https://www.youtube.com/watch?v=undefined&t=68306s) Let me go ahead and create a


new file called Headings.html. Let me copy and paste that same code as before. But now, let's preface
each paragraph with maybe H1. And I'm going to just write the word one. And here I'm going to say H2,
two. And down here I might say H3, three. So this is another tag, another three tags, H1, H2, H3.

- [8:58:49](https://www.youtube.com/watch?v=undefined&t=68329s) As you might have inferred by the


file name I chose, this just gives you headings, like in a book, different chapters or sections or
subsections, or in an academic paper, you have different hierarchies to the text that you're writing. So
now that I've added an H1 tag, and the word one, H2 tag, the word two, H3 tag and the word three, let's
go back to the browser, reload the page again, and voila, once the page reloads, I'll do it with the manual
button, reload the page.

- [8:59:17](https://www.youtube.com/watch?v=undefined&t=68357s) Oh, what am I doing wrong? Yeah.


AUDIENCE: Not in headings file. SPEAKER 1: Right, I'm not in the headings file. So let me go back a page.
Now there's Headings.html. Let me click on that. OK, now we see some evidence of this. Again, it's
nonsensical content. But you can kind of see that H1 is apparently big and bold, H2 is slightly less big, but
still bold.

- [8:59:37](https://www.youtube.com/watch?v=undefined&t=68377s) H3 is the same but a little smaller.


And it goes all the way down to H6. After that, you should probably reorganize your thoughts. But there
are six different hierarchies here, as you might use for chapters, sections, subsections, and so forth, all
right? So those are headings, as an HTML tag, in our vocabulary.
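
As a sketch of Headings.html, assuming one placeholder sentence under each heading (the demo used the longer lorem ipsum paragraphs):

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>Headings</title>
    </head>
    <body>
        <!-- h1 is the top-level heading; h2 through h6 are successively
             lower levels, rendered smaller by default -->
        <h1>One</h1>
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
        </p>
        <h2>Two</h2>
        <p>
            Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
        </p>
        <h3>Three</h3>
        <p>
            Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.
        </p>
    </body>
</html>
```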

- [8:59:53](https://www.youtube.com/watch?v=undefined&t=68393s) What's a common thing, too,


well, let me go to VS Code again, let me go ahead and get some boilerplate here, create a file called
List.html. Let's create a simple list inside of my body, and I'll give this a title of List. And let me fix the title
of this one to be Headings, as well. So in List.html,

- [9:00:19](https://www.youtube.com/watch?v=undefined&t=68419s) suppose I want to have a list


of things, foo, bar, and baz, they're like a computer scientist's go-to words, just like a mathematician
might say x, y, z. Foo, bar, baz are in List.html. Let me go back to my browser, hit the Back button. There's
List.html, and, hopefully, I'll see foo, bar, and baz, one on each line like a nice little list, but, of course, I
do not.

- [9:00:38](https://www.youtube.com/watch?v=undefined&t=68438s) And this is not English. Chrome


thinks it might be Arabic. But that's curious, too, because the Lang attribute should be overriding that. So
Google is trying to override it. All right, what's the obvious explanation why we're seeing foo, bar, and
baz on the same line, and not three separate ones? AUDIENCE: We didn't tell it.

- [9:00:55](https://www.youtube.com/watch?v=undefined&t=68455s) SPEAKER 1: We didn't tell it to do


that. So we need paragraph tags, or maybe something else. Turns out there is something else. There is a
UL tag, for an unordered list in HTML, inside of which you can have LI tags, for list item, inside of which
you can put your words. So there's my foo, there's my bar, there's my baz.

- [9:01:14](https://www.youtube.com/watch?v=undefined&t=68474s) And, again, notice that VS Code is


finishing my thought for me. But notice the hierarchy, open UL, open LI, close LI, open LI, close LI, open
LI, close LI, close UL. So it's sort of done in reverse order here. Let me go back to my browser, reload the
same page, List.html, and voila, a default bulleted list, that still seems to be in Arabic.
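
A sketch of List.html with the unordered list described above; the title and the three items come from the demo, and the layout is an assumption:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>List</title>
    </head>
    <body>
        <!-- ul is an unordered (bulleted) list; each li is one list item.
             Tags close in the reverse of the order they were opened. -->
        <ul>
            <li>foo</li>
            <li>bar</li>
            <li>baz</li>
        </ul>
    </body>
</html>
```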

- [9:01:36](https://www.youtube.com/watch?v=undefined&t=68496s) What if I want this list to be


numbered? Well, you can probably guess. If you don't want an unordered list, but an ordered list, what
tag should I use? AUDIENCE: OL. SPEAKER 1: OL, sure, so let's try that. Not always that easy as just
guessing, but in this case, OL is going to do the trick. Let me go back to my other browser.

- [9:01:53](https://www.youtube.com/watch?v=undefined&t=68513s) Let me reload the page, and now


it's going to automatically number for me. It's a tiny thing, but this is actually useful if you have a very
long list of data, and maybe you might add some things in the middle, the beginning, or the end. It
would just be annoying to have to go and renumber it. The computer is doing it for us by, instead, just
numbering from top to bottom here.
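
The numbered variant differs only in the outer tag. In this sketch, "qux" is a hypothetical extra item added in the middle to show that no numbers need to be edited by hand:

```html
<!DOCTYPE html>

<html lang="en">
    <head>
        <title>List</title>
    </head>
    <body>
        <!-- ol is an ordered list; the browser numbers the items itself,
             so inserting one in the middle renumbers the rest -->
        <ol>
            <li>foo</li>
            <li>qux</li>
            <li>bar</li>
            <li>baz</li>
        </ol>
    </body>
</html>
```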

- [9:02:10](https://www.youtube.com/watch?v=undefined&t=68530s) All right, what about another type


of layout, not just paragraphs, not just lists, but what about tabular data? You've got some research data
you want to present, some financial data you want to present, a phone book that you want to present.
How might we go about laying out data, a la a table? Well, let me create a file called Table.html,

- [9:02:26](https://www.youtube.com/watch?v=undefined&t=68546s) and I'll just copy paste


where we started earlier. Let me start to close some of these other files. And in Table.html, this is going
to be a bit more HTML, but I'm going to go ahead and do this. Table and close table, tables can have
table headings. So T head is the name of that tag, and tables can have T bodies, table bodies.

- [9:02:46](https://www.youtube.com/watch?v=undefined&t=68566s) So I'm going to add that tag. And


this is a common technique, sort of start your thought, finish your thought, and then go back and fill in
what's in between. What do I want to put in this table? How about a bunch of names and numbers. So,
for instance, like left column name, right column number. So let's create a table row, with what's called
the TR tag.

- [9:03:06](https://www.youtube.com/watch?v=undefined&t=68586s) Let's create a table heading with


the TH tag, and let's say name here. Let's create another table heading called number here. And all of
that, to be clear, is in one table row. Meanwhile, in the table body, let me create another table row, but
this time, it's not a heading. Now I'm in the guts of my table.

- [9:03:25](https://www.youtube.com/watch?v=undefined&t=68605s) Let's do table data, which is


synonymous with like the cell of the table, in like an Excel spreadsheet or Google spreadsheet. In this TD,
I'm going to say like Carter's name, and then let's grab Carter's number from our past demo, 617-495-
1000. Then let's put me into the mix, and I'll go ahead and copy paste here, which often is not good.

- [9:03:44](https://www.youtube.com/watch?v=undefined&t=68624s) But we'll see that there's a lot of


shared structure with HTML. Let me go ahead and do mine, 949-468-2750, and now save this page. So
we're getting to be a lot of indentation. I'm using four spaces by default. Some people use two spaces by
default. So long as you're consistent, that's considered good style.

- [9:04:02](https://www.youtube.com/watch?v=undefined&t=68642s) But let me go back to my browser


here, and hit back. That then brings me to my directory listing again. Here's Table.html, and this is not
that interesting yet. But you can see that there's two columns, name and number. Because it's a table
heading, TH, the browser made it boldfaced for me. In there, in the table, are two rows below that,
Carter and David.

- [9:04:22](https://www.youtube.com/watch?v=undefined&t=68662s) It's a little, oh, I forgot my number


one, sorry about that. One and one, it's not the prettiest table, right? I feel like I kind of want to separate
things a little more, maybe put some borders or the like. But with HTML alone, I'm really focusing on the
structure alone. So we'll make this prettier soon.

- [9:04:38](https://www.youtube.com/watch?v=undefined&t=68678s) But for now, this is how you might


lay out tabular data. All right, let me pause here just to see if there's any questions. But, again, the goal
right now is just to kind of throw at you some basic building blocks, that, again, can be easily looked up
in a reference. But we're going to start stylizing these things soon, too.

- [9:04:54](https://www.youtube.com/watch?v=undefined&t=68694s) Yeah, in the middle. AUDIENCE:


How to indent? SPEAKER 1: How do you indent paragraphs? Really good question. For that, we're
probably going to want something called CSS, Cascading Style Sheets. So let me come back to that, in
just a little bit. For the stylization of these things, beyond the basics, like big and bold, we're going to
need a different language altogether.

- [9:05:10](https://www.youtube.com/watch?v=undefined&t=68710s) All right, well, let's now create


what the web is full of, which is like photographs and images and the like. Let me go ahead and create a
new file called Image.html, and let me go ahead and change the title here to be, say, Image. And then, in
the body of this page, let's go ahead and put an image.

- [9:05:29](https://www.youtube.com/watch?v=undefined&t=68729s) The interesting thing about an


image is that it's actually not going to have a start tag and an end tag, because that's kind of illogical.
Like, how can you start an image and then eventually finish it? It's either there or it isn't. So some tags
do not have end tags. So let me do image, IMG, source equals Harvard.jpeg.

- [9:05:49](https://www.youtube.com/watch?v=undefined&t=68749s) And let me go ahead, and, in my


terminal window, I actually came with a photo of Harvard. Let me grab this for just a second. Let me grab
Harvard.jpeg and put it into my directory, pretend that I downloaded that in advance. And so I'm
referring to now a file called Harvard.jpeg, that apparently is in the same folder as my Image.html file.

- [9:06:12](https://www.youtube.com/watch?v=undefined&t=68772s) If this image were on the internet,


like Harvard's server, I could also say like https://www.harvard.edu/FolderName, whatever it is,
/Harvard.jpeg, but if you've uploaded a file in advance to your own VS Code environment, like I did
before class, by dragging and dropping this whole file, this photo of Harvard, you can just refer to it
relatively, so to speak.

- [9:06:35](https://www.youtube.com/watch?v=undefined&t=68795s) This would be the same thing as


saying ./Harvard.jpeg, go to the current directory and get the file called Harvard.jpeg. But that's
unnecessary to type. For accessibility purposes, though, for someone who's vision-impaired, it's ideal if
we also give this an alternative text, something like Harvard University, in the so-called alt attribute, and this is
so that screen readers will recite what it is the photo is, for folks who can't see it.
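
A minimal sketch of that image page, assuming the photo is saved as `harvard.jpeg` in the same folder as the page (the lowercase file names are an assumption; the transcript capitalizes them):

```html
<!DOCTYPE html>

<!-- image.html: embeds a photo that sits next to this file -->
<html lang="en">
    <head>
        <title>image</title>
    </head>
    <body>
        <!-- img has no end tag. src is a relative path, equivalent to ./harvard.jpeg. -->
        <!-- alt is the text a screen reader recites in place of the photo. -->
        <img alt="Harvard University" src="harvard.jpeg">
    </body>
</html>
```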

- [9:07:00](https://www.youtube.com/watch?v=undefined&t=68820s) And if you're just on a slow


connection, sometimes you'll see the text of what you're about to see, before the image itself
downloads, especially on a mobile device. So let's now go back to my open browser tab, and let's look in
the directory. I now have Harvard.jpeg, which I downloaded in advance, and Image.html.

- [9:07:17](https://www.youtube.com/watch?v=undefined&t=68837s) Let me click on Image.html, and


here we have a really big picture of Memorial Hall, the building we're currently in. Suffice it to say I
should probably fix this and maybe make it only so wide. But to do that, we're going to probably want to
use this other language, CSS. There are some historical attributes that you can still use to control width
and height, and so forth.

- [9:07:39](https://www.youtube.com/watch?v=undefined&t=68859s) But we're going to do it the better


way, so to speak, with a language designed for just that. How about a video, though. I also came
prepared with, let me grab another file here, let me grab a file called Halloween.mp4, which is an MPEG
file. And let me go ahead and change this now to be a file called Video.html.

- [9:08:03](https://www.youtube.com/watch?v=undefined&t=68883s) I'll change my title to be Video.


And let's go ahead and now introduce another tag, a video tag, open bracket video, and then let me go
ahead and close that tag proactively. And then inside of the video tag, you can say the source of the
video is going to be specifically Halloween.mp4.

- [9:08:23](https://www.youtube.com/watch?v=undefined&t=68903s) The type of this file, I know,


is video/mp4, because I looked up its content type or MIME type. And the video tag actually has a few
attributes. I can have this thing autoplay. I can have it loop forever. I can mute it, so that there's no
sound, which is necessary nowadays. Most browsers, to prevent ads, don't autoplay videos, if they have
sound.

- [9:08:41](https://www.youtube.com/watch?v=undefined&t=68921s) So if you mute your video, it will


autoplay, but presumably not annoy users. And let me set the width of this thing to be like, oh, 1280
pixels wide. But I can make it any size I want. So I know this just from having looked up the syntax for this
tag. But notice one curiosity. Sometimes attributes don't have values.

- [9:09:00](https://www.youtube.com/watch?v=undefined&t=68940s) They're empty attributes. They're


just single words, autoplay, loop, muted, and that kind of makes sense for any attribute that really does
what it says. Like, it doesn't make sense to say muted equals something. Like it's either muted or not.
The attribute is there or not. Similarly, for these others, as well.
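
Putting those pieces together, the video page might look roughly like this. The lowercase file name `halloween.mp4` is an assumption; the attributes are the ones listed above.

```html
<!DOCTYPE html>

<!-- video.html: a muted, looping, autoplaying video -->
<html lang="en">
    <head>
        <title>video</title>
    </head>
    <body>
        <!-- autoplay, loop, and muted are attributes without values: present or absent. -->
        <!-- width takes a value, here 1280 pixels. -->
        <video autoplay loop muted width="1280">
            <!-- type is the file's content type (MIME type) -->
            <source src="halloween.mp4" type="video/mp4">
        </video>
    </body>
</html>
```

Removing `muted` would likely stop the video from autoplaying in most current browsers, for the reason given above.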

- [9:09:16](https://www.youtube.com/watch?v=undefined&t=68956s) So let me go back to my other


browser tab, reload the directory listing. There is both my mp4 and also Video.html, which is the web
page that embeds it. And this is actually a video that was just on Harvard's website yesterday, and it was
amazing. So we included it in this demo here. This is the video that was on Harvard.edu last night, same
photo.

- [9:09:42](https://www.youtube.com/watch?v=undefined&t=68982s) But you can see here that an


image alone probably would not have the same effect. This is actually a movie, a small video file that's
now looping. Now there's some artifacts here, like there's a white border around the top. I feel like it'd
be nice to fill the screen. But again, we'll come back to a language that can allow us to do exactly that.

- [9:09:58](https://www.youtube.com/watch?v=undefined&t=68998s) Well, it's not just videos like this,


that you might want to put into a web page. Let me create another file called iFrame.html. If you've ever
poked around with, if you have your own YouTube account, or if you had your own blog or WordPress
site, or Wix or Squarespace, you might have been in the habit of embedding videos in websites, using
like embedded YouTube players.

- [9:10:18](https://www.youtube.com/watch?v=undefined&t=69018s) Well, this is possible, too, using


what's called an inline frame, an iFrame. And an iFrame is just a tag that is literally iFrame. It has source
equals, and then a URL, and if it happens to be a YouTube video, there's a certain URL format you need
to follow, per YouTube's documentation. So you might do www.youtube.com, embed, and then here's an
ID of a video.

- [9:10:42](https://www.youtube.com/watch?v=undefined&t=69042s) So this is essentially what we do,


if we want to embed CS50's own lecture videos, in the course's website, or the video player does literally
this. If I want to allow full screen, I can add this attribute, too, that I know exists, by just having checked
the documentation. And if I now go back to my browser here, reload my directory listing, there's
iFrame.html.

- [9:11:02](https://www.youtube.com/watch?v=undefined&t=69062s) It's not going to fill the screen,


because I haven't customized the aesthetics yet. But it does seem to embed a tiny little video there for
you to play with later, if you'd like. So we could change the width, change the height, get rid of that
margin, and so forth. But an iFrame is a way of embedding someone else's web page in your web page, if
they allow it, so as to create all the more of an interactive experience for them on, say, your site.
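
A sketch of such an embed follows. `VIDEO_ID` is a placeholder, not a real identifier; the transcript does not say which video was used, so substitute the ID of a video you want to embed.

```html
<!DOCTYPE html>

<!-- iframe.html: embeds another site's page (here, a YouTube player) -->
<html lang="en">
    <head>
        <title>iframe</title>
    </head>
    <body>
        <!-- VIDEO_ID is a hypothetical placeholder for a real YouTube video ID. -->
        <!-- allowfullscreen is another attribute without a value. -->
        <iframe allowfullscreen src="https://www.youtube.com/embed/VIDEO_ID"></iframe>
    </body>
</html>
```
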
- [9:11:26](https://www.youtube.com/watch?v=undefined&t=69086s) All right, well, the web is, of
course, known for things like links. Let's go ahead and create a file called Link.html. And if we want to
create a web page that actually links from itself somewhere else, let's go ahead and do this, something
very simple like visit Harvard.edu period. Now, in like Facebook, Instagram, a lot of websites nowadays, if
you just type in a domain name, or a fully qualified domain name, it automatically becomes a link.

- [9:11:53](https://www.youtube.com/watch?v=undefined&t=69113s) That's because those websites


have code in them that automatically detects something that looks like a URL, and turns it into a proper
link. HTML itself does not do that for you. And so if I go back to my web page here, click on Link.html, if
you type visit Harvard.edu period, that's all you're literally going to see.

- [9:12:12](https://www.youtube.com/watch?v=undefined&t=69132s) But instinctively, even if you've


never written HTML before, what should we probably do here to solve this problem? What could we do
to solve this problem? What do I probably want to add? Yeah. AUDIENCE: Surround your-- SPEAKER 1:
Yeah, so I want to surround the URL with some kind of link text. And you wouldn't necessarily know this
until someone told you, or you looked it up, but the tag for creating a link is somewhat weirdly called the
A tag for anchor.

- [9:12:36](https://www.youtube.com/watch?v=undefined&t=69156s) It has an attribute called HREF for


hypertext reference, which is like a link in the virtual world to a URL. So let me type in Harvard's full and
proper URL here. Then I'm going to close the tag. And then I can still say Harvard.edu, and make that
what the human sees. But the place they're going to go should be a full URL, protocol and all, HTTP or
HTTPS.

- [9:13:01](https://www.youtube.com/watch?v=undefined&t=69181s) Now if I go back here and reload


the page, now it automatically gets underlined. It happens to be purple by default. Why? Because we
visited Harvard.edu a few minutes ago. So my browser, by default, is indicating in purple that I've been
there before. But now I have a link that I can click on, and if I hover over it but don't click, you'll see that,
in most browsers, there's a little clue as to where you will go if you click subsequently on this link.
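
In markup, the link described here might be written as follows (a sketch; the lowercase file name `link.html` is an assumption):

```html
<!DOCTYPE html>

<!-- link.html: plain text plus a clickable link -->
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- href holds the full URL, protocol included. -->
        <!-- The text between the tags is what the human sees. -->
        Visit <a href="https://www.harvard.edu/">Harvard.edu</a>.
    </body>
</html>
```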

- [9:13:24](https://www.youtube.com/watch?v=undefined&t=69204s) And without going too far down a


rabbit hole, but to tie together our discussion of cybersecurity recently, what if I were to do something
like this. Right now you have the beginnings of a phishing attack of sorts, P-H-I-S-H-I-N-G, whereby you
can create clearly a web page, or, heck, even an email using HTML, that tells the user they're going to go
one place, but they're really going to go someplace else altogether.

- [9:13:50](https://www.youtube.com/watch?v=undefined&t=69230s) And that is the essence of


phishing attacks these days. If you've ever gotten a bogus email pretending to be from PayPal or your
bank or some other website, odds are they've just written HTML that says whatever they want, but the
underlying tags might do something very different. And so having the instinct to look in the bottom left
hand corner, or be a little suspicious when you're just told blindly to click on a link, it's this easy to
socially engineer people, that is, deceive them, by just saying one thing and linking to another.
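
To make the mismatch concrete, here is a harmless sketch. The destination `https://example.com/` is a stand-in chosen for illustration; the transcript does not say which URL was used in the demo.

```html
<!DOCTYPE html>

<!-- The visible text claims one destination; href points somewhere else. -->
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- Hovering over the link reveals the real target, example.com. -->
        Visit <a href="https://example.com/">Harvard.edu</a>.
    </body>
</html>
```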

- [9:14:18](https://www.youtube.com/watch?v=undefined&t=69258s) Well, what if I want to link my


page to another page I already created? Well, if I want to link to that photo of Harvard, I can just do href
equals quote unquote and the name of a file, in my same account, that is itself a web page. So this is
how you can create relative links, multi-page web pages, multi-page websites, yourself.
- [9:14:38](https://www.youtube.com/watch?v=undefined&t=69278s) So if I now reload this page, hover
over Harvard.edu, you'll see in the bottom left hand corner a very long URL. But that's because I'm in
Codespaces right now, VS Code, and it's appending automatically to the end of my current URL the file
name, Image.html. But this should work. When I click on this, I go immediately to that file we created
earlier, with a crazy, big version of the image.
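
A sketch of that relative link, assuming the earlier page was saved as `image.html` in the same folder:

```html
<!DOCTYPE html>

<!-- link.html: links to another page in the same folder -->
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- No protocol or host: the browser resolves image.html against the current URL. -->
        Visit <a href="image.html">Harvard.edu</a>.
    </body>
</html>
```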

- [9:15:01](https://www.youtube.com/watch?v=undefined&t=69301s) But that's just a way that one


page on a website can link to another page on a website. Let's do one other thing here, making things
more responsive, because, in fact, that wasn't a particularly responsive website. Responsive means
responding to the size of the user's device, which is so important when someone might be on a screen
like this, or on a screen like this these days.

- [9:15:22](https://www.youtube.com/watch?v=undefined&t=69322s) There are special tags we can use


to tell the browser to modify its display, based on the hardware. So let me create a file called
Responsive.html. I'm going to copy/paste some starting point here, call this title Responsive. And let me
go ahead and just grab, let me grab some of that lorem ipsum text from before, just so that we have a
sizable paragraph to play with here.

- [9:15:46](https://www.youtube.com/watch?v=undefined&t=69346s) And let me go ahead and grab this


text here. And I'm just going to paste this into the body of this page. And that's it. So I just have a big
paragraph, at the moment, inside of my body. Let me go back to my browser. Let me open up this file,
called Responsive.html, to make the point that it is not yet responsive.

- [9:16:05](https://www.youtube.com/watch?v=undefined&t=69365s) Let me go ahead and click on


Responsive.html. That looks fine. But here's another trick you can do, using Chrome or Edge or other
browsers these days. You can pretend to be another device. Let me go to View, developer, developer
tools again. Last time we used this to use the Network tab, which was kind of interesting, because we
could see what the underlying network traffic is.

- [9:16:26](https://www.youtube.com/watch?v=undefined&t=69386s) But notice, we can also click on


this icon, in Chrome, at least, that looks like a mobile phone. I can turn my laptop into what looks like a
mobile device by clicking this. I'm going to click the dot dot dot menu over here, and just move the dock.
Instead of on the bottom, where it might be by default, I'm going to move it to the right hand side.

- [9:16:43](https://www.youtube.com/watch?v=undefined&t=69403s) So that now on the left, you see


what looks more like the shape of a vertical phone. And, in fact, if I go to my dimensions here, I'll choose
something like iPhone X, so a few years back. Here's what that same website might look like on an
iPhone X. You know, that looks pretty damn small, to be able to read it.

- [9:17:01](https://www.youtube.com/watch?v=undefined&t=69421s) And that's because the website


has not automatically responded to the fairly narrow dimensions of the iPhone in question, or Android
device, or whatnot. So let me go ahead and do this. Let me go back into my code. And let me go into the
head of the page, and for the first time, add another tag up here.

- [9:17:18](https://www.youtube.com/watch?v=undefined&t=69438s) This word is now all over the


internet, but there is a meta tag that allows you to specify the name of some kind of
configuration detail here, or property, if you will. Viewport is the technical term for the rectangular
region that the human sees in a browser. It's essentially the body of the page,

- [9:17:35](https://www.youtube.com/watch?v=undefined&t=69455s) but only the part the human is


currently seeing. And you can specify the content of the viewport should have an initial scale of 1. So it
shouldn't be zoomed in or out. And the width that the browser should assume should be equal to the
device's width. These are sort of magical statements that you just have to know or copy/paste or
transcribe, that just express, to the browser, assume that the width of the page is the same thing as the
width of the device.

- [9:18:00](https://www.youtube.com/watch?v=undefined&t=69480s) Don't assume the luxury of a big


laptop or desktop computer. Now, making only that change, let me go back to my pretend iPhone here,
using Chrome's developer tools. Let me reload the page. And now, it's not very effective on this screen, if
I were showing you this on, is there-- well, there we go.

- [9:18:22](https://www.youtube.com/watch?v=undefined&t=69502s) Let's do this. There we go. So if I


zoom in to 100%, this would be on an actual physical device, much more readable than it would have
been a moment ago, even though I realized that demo was not necessarily persuasive. But it's as simple
as telling the browser to resize the thing to the width of the page.
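
The tag in question might look like this in context. This is a sketch: the paragraph text is a shortened stand-in for the lorem ipsum used in the demo, and the lowercase file name is an assumption.

```html
<!DOCTYPE html>

<!-- responsive.html: tells the browser to lay the page out at the device's width -->
<html lang="en">
    <head>
        <!-- initial-scale=1: do not zoom in or out at first. -->
        <!-- width=device-width: assume the page is as wide as the device. -->
        <meta name="viewport" content="initial-scale=1, width=device-width">
        <title>responsive</title>
    </head>
    <body>
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
        </p>
    </body>
</html>
```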

- [9:18:39](https://www.youtube.com/watch?v=undefined&t=69519s) All right, let me pause here to see


if there's any questions, because that feels like enough HTML tags. We'll add just a couple of more in. But
for the most part, like HTML tags are things you Google and figure out over time, just to build up your
vocabulary. The basic building blocks are tags, attributes.

- [9:18:54](https://www.youtube.com/watch?v=undefined&t=69534s) Some attributes have values.


Some do not. And that's sort of the structure of HTML in essence. Questions on any of these, though.
Yeah. AUDIENCE: Do attributes have an order? SPEAKER 1: Do attributes have an order? No, attributes
can be in any order, from left to right. I tend to be a little nit-picky, and so I alphabetize them, if only
because then I can easily spot if something's missing, if it's not there alphabetically.

- [9:19:16](https://www.youtube.com/watch?v=undefined&t=69556s) Most people on the internet don't


seem to do that. Yeah, in the middle. Version. Yeah, good question. I mentioned that HTML is starting to
replace other languages for user interfaces. And it's not just HTML alone. It's HTML with CSS, with
JavaScript, both of which we'll get a taste of here today.

- [9:19:35](https://www.youtube.com/watch?v=undefined&t=69575s) That rather has been the trend for


portability, and the ability for companies, for individual programmers, to write one version of an app and
have it work on Android devices and iPhones and Macs and PCs, and the like. It is very expensive. It is
very time-consuming to learn a language like Java and write an Android app, learn another language
called Swift and make an iOS app, not to mention make them look and behave the same, not to mention
fix a bug in one and then remember to fix it in the other.

- [9:20:01](https://www.youtube.com/watch?v=undefined&t=69601s) I mean, this is just very painful


and time-consuming and costly. So this standardization on HTML, CSS, and JavaScript, even for mobile
apps and web apps, has been increasingly compelling, because it solves problems like that. All right, so
let's go ahead and now do something that's finally interactive.

- [9:20:20](https://www.youtube.com/watch?v=undefined&t=69620s) All of these pages thus far are


really just tastes of static content, content that does not change. Well, let's go ahead and do this. Let me
introduce one other format of URLs, which looks a little something like it did before. So slash path, but it
could actually be something like this, slash path question mark, key equals value.

- [9:20:40](https://www.youtube.com/watch?v=undefined&t=69640s) You might not have noticed, or


cared to notice, the URLs in your URL bar every day. But these things are everywhere. Often when you
type into a search engine like Google a search query, whatever you just typed ends up in the URL. When
you click on a link that contains some information, there might be a question mark, and then some keys
and values.

- [9:20:58](https://www.youtube.com/watch?v=undefined&t=69658s) There might be an ampersand


and more keys and values. Here, again, is that very common programming paradigm of just associating
keys with values. We can see this as follows. Let me actually go to google.com, in a browser here, and let
me search for something the internet is filled with, cats. Enter. Notice now that my URL changed from
google.com

- [9:21:20](https://www.youtube.com/watch?v=undefined&t=69680s) to google.com slash search


question mark, q equals cats, ampersand, and then a bunch of stuff that I don't understand or know. So
let's just delete it for now, and leave it with the essence of that URL. And that still works. If I zoom out
here, years ago you would get pictures of cats. Now you get videos of the movie.

- [9:21:41](https://www.youtube.com/watch?v=undefined&t=69701s) And then that top query there, is


Cats a bad movie. But we can also, of course, click on Images. And there are the adorable cat, creepy
cats. All right, this didn't used to happen when we searched for cats. But anyhow, the point is that the
URL changed to include the user's input. And this is such a simple, but such a powerful thing.
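
The trimmed-down URL from this demo can be written out as an ordinary link, which makes the key and value easy to see. This is only a sketch; the link text is made up for illustration.

```html
<!DOCTYPE html>

<!-- The part after the question mark is the query string: key=value. -->
<html lang="en">
    <head>
        <title>query</title>
    </head>
    <body>
        <!-- Path: /search. Key: q. Value: cats. -->
        <!-- Further pairs would be joined with an ampersand. -->
        <a href="https://www.google.com/search?q=cats">Search Google for cats</a>
    </body>
</html>
```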

- [9:22:01](https://www.youtube.com/watch?v=undefined&t=69721s) This is how humans provide input


to servers. They don't manually create the URLs, like I sort of just did. But when you fill out a form on the
web and you hit Enter, typically the URL suddenly changes to include whatever you typed in, in the URL,
assuming the form is using the verb GET. That's not ideal

- [9:22:20](https://www.youtube.com/watch?v=undefined&t=69740s) if you're typing in a username, a


password, or credit card information, because you don't want the next person to sit down at your laptop
to see literally everything you typed in, saved in your history. So there's another verb, POST, that can
hide all of that. And it's just sent a little differently.

- [9:22:34](https://www.youtube.com/watch?v=undefined&t=69754s) But things like this are typically


sent via GET, and what that means underneath the hood is that your browser is just making a request
like this, GET /search?q= whatever you typed in, the host that you visited, and so forth. And
hopefully what comes back is a page full of search results, including cats.

- [9:22:53](https://www.youtube.com/watch?v=undefined&t=69773s) And what's interesting here now


is, if I go back to VS Code on my own computer, and let me go ahead and create a file called, how about
Search.html. In Search.html, I'm going to start with some copy/paste from before, change my title to
search. And in the body of this page, I'm going to introduce a form tag.

- [9:23:15](https://www.youtube.com/watch?v=undefined&t=69795s) And in this form tag, I'm going to


have a couple of inputs. The type of the first input is going to be text, and the type of the second is going to
be submit. And this isn't that interesting yet, but let's see what is happening in the page itself. Let me go
back to my directory listing. Let me click on Search.html.

- [9:23:37](https://www.youtube.com/watch?v=undefined&t=69817s) I seem to have the beginning of


my own search engine. It's not very interesting. It's just a text box and a submit button. But let's finish
my thoughts here. So let's specifically give this text box a name of Q, which, if you roll back to the late
'90s when Larry and Sergey of Google fame created Google.com, Q represented query, the query that
the human's typing in.

- [9:23:58](https://www.youtube.com/watch?v=undefined&t=69838s) So the name of this text box shall


be q. The form is going to use what method? Technically it uses GET by default, but I'll be
explicit and say method equals quote unquote "get." Stupidly, it's lowercase in HTML, even though
what's in the envelope is indeed uppercase, by convention.

- [9:24:17](https://www.youtube.com/watch?v=undefined&t=69857s) The action of this form,


specifically, would ideally go to my own server. But we don't really have time today to implement Google
itself. So we're just going to send the user's request to google.com/search. So I'm creating a form, the
action of which is to send the data to Google's slash search path, using the GET method.

- [9:24:36](https://www.youtube.com/watch?v=undefined&t=69876s) It's going to send an input called


Q, whenever I click this Submit button. Let me go back to the browser, reload the page. Nothing seems
to have changed yet, but, if I search for, let me zoom out, so we can see the URL bar. Right now I'm in
Search.html. If I zoom out and search for cats now and click Submit, I'm whisked away to google.com.
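
A sketch of the form as described so far, assuming the file is saved as `search.html` and that the action is spelled out as the full URL `https://www.google.com/search`:

```html
<!DOCTYPE html>

<!-- search.html: a form that hands its input to Google's /search path -->
<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <!-- action: where the data is sent. method: which HTTP verb is used. -->
        <form action="https://www.google.com/search" method="get">
            <!-- name="q" becomes the key in the URL, as in ?q=cats -->
            <input name="q" type="text">
            <input type="submit">
        </form>
    </body>
</html>
```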

- [9:24:59](https://www.youtube.com/watch?v=undefined&t=69899s) But notice that the URL is


parameterized, with those key value pairs, that key value pair. And I get back a whole bunch of cat
results. And I can very easily now make this a little prettier. Right now, it's not ideal that like the human
has to move their cursor and click in the box. And it's a little obnoxious that autocomplete is enabled.

- [9:25:15](https://www.youtube.com/watch?v=undefined&t=69915s) If I don't want to search for cats


anymore, well, according to HTML's documentation, I can say something like this. Autocomplete equals
off, to turn off autocomplete, autofocus to automatically put the cursor inside of that text box. If I want
some explanatory text, I can put placeholder text like quote unquote "query."

- [9:25:34](https://www.youtube.com/watch?v=undefined&t=69934s) And now if I go back to this page


and reload, now it's a little more user-friendly. You see query in kind of gray text. The cursor is already
there and blinking. I don't have to even move my cursor. I can search for dogs now, and you didn't see
any autocomplete at all. Hit enter to submit, and now I'm searching for, there we go, adorable dogs,
instead.
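
With those three attributes added, the text input might read as follows. This is a sketch; the capitalization of the placeholder text is a guess.

```html
<!DOCTYPE html>

<!-- search.html: the same form with a friendlier text box -->
<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <form action="https://www.google.com/search" method="get">
            <!-- autocomplete="off": no suggestions from earlier searches. -->
            <!-- autofocus: the cursor starts in this box. -->
            <!-- placeholder: gray hint text shown until the user types. -->
            <input autocomplete="off" autofocus name="q" placeholder="Query" type="text">
            <input type="submit">
        </form>
    </body>
</html>
```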

- [9:25:52](https://www.youtube.com/watch?v=undefined&t=69952s) So what have I done? I've


implemented the front end of Google.com, just not the back end. To implement the back end, we're
obviously going to need like a really big database, maybe something like SQL. We're going to need some
code that like searches the database for dogs or cats, or anything else.

- [9:26:06](https://www.youtube.com/watch?v=undefined&t=69966s) We're going to need Python for


something like that. And in fact, that's the direction we're steering next week, when we implement that
back end. But today it's all about this front end. Or any question, then, about forms, these URL
parameters, before we now transition to making things look a little prettier, with CSS? And then we'll end
by making things a little more functional, with JavaScript.

- [9:26:29](https://www.youtube.com/watch?v=undefined&t=69989s) Anything at all? No? All right, so


let's start to answer a couple of the questions that came up, by making these pages a little more
aesthetically interesting. Let's go ahead now and introduce to the mix one other language, as follows. Let
me go ahead and create a file called Home.html, as though I'm making a home page for the very first
time.

- [9:26:51](https://www.youtube.com/watch?v=undefined&t=70011s) And in this page, I'm going to give


a title of Home. And I'm just going to have like three things. First I'm going to have maybe a paragraph of
text up here at the top, that says something welcoming for my home page, like my name, John Harvard,
for instance, or John Harvard's home page. Then in the middle of the page, I'm going to have some text
like, welcome to my home page exclamation point! And at the bottom of the page, I'm going to have a
final paragraph that says something like copyright, the copyright symbol, John

- [9:27:20](https://www.youtube.com/watch?v=undefined&t=70040s) Harvard, or something like that.


All right, so it's like a web page with three different structural areas, made with text. This isn't that
interesting. If I open this page called Home.html, let me go ahead and create three quick paragraphs, a
first paragraph for John Harvard. Inside the middle, I'm going to say something like welcome to my home
page exclamation point! And at the bottom, whoops, at the bottom, a little footer that says something
like copyright, a little simple copyright symbol, and John Harvard's name.

- [9:27:50](https://www.youtube.com/watch?v=undefined&t=70070s) All right, now let me reload the


page. And there we go. It's a very simple, very underwhelming web page that has three main sections.
Let's start to now stylize this in an interesting way, so that it's a little more aesthetically pleasing. First,
these aren't really paragraphs. They're sort of like areas of the page, divisions, like the header is up here.

- [9:28:08](https://www.youtube.com/watch?v=undefined&t=70088s) There's like the main part of my


screen. And then there's the footer of my screen. So paragraphs aren't quite right, if these aren't really
paragraphs of text. I might more properly call them divs or divisions of the page, which is a very
commonly used tag in HTML, which just has this generic rectangular region to it.

- [9:28:24](https://www.youtube.com/watch?v=undefined&t=70104s) It does not do anything


aesthetically, no bold facing, no size changes. It just creates an invisible rectangular region, inside of
which you can start to style the text. Or I can take this one step further. There's some other tags in HTML,
known as semantic tags, that literally have names that describe the parts of your page, which is all the
more compelling these days for accessibility, too, for screen readers, for search engines, because now, a
screen reader, a search engine can realize that footer is probably a little fluffy.
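
As a rough sketch, the same three areas could be marked up with semantic tags instead of paragraphs or generic divs. The content is the same placeholder text as before, and the title is an assumption.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <body>
        <!-- Each of these could also be a generic <div>, which has no meaning of its own.
             header, main, and footer tell screen readers and search engines what each area is. -->
        <header>John Harvard</header>
        <main>Welcome to my home page!</main>
        <footer>Copyright (c) John Harvard</footer>
    </body>
</html>
```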

- [9:28:53](https://www.youtube.com/watch?v=undefined&t=70133s) The header might be a little


interesting. The main part of the page is probably the juicy part, that I want users to be able to search for
or read aloud, substantively. So let's start to stylize this page somehow. Let's introduce a style attribute in
HTML, inside of which is going to be text like this, font size colon large, text align colon center.

- [9:29:17](https://www.youtube.com/watch?v=undefined&t=70157s) On Main, I'm going to add a style


attribute and say font size medium, text align center. And then on the footer, I'm going to say style equals
font size small, text align center. What's going on here? Well, in blue is the language we promised, called
CSS, for Cascading Style Sheets. We're not really seeing the Cascading Style Sheet of it yet.
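
A sketch of the style attributes being described, one per tag. Only the property names and values come from the lecture; the surrounding page is assumed.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <body>
        <!-- The value of each style attribute is CSS, written inside HTML's quote marks -->
        <header style="font-size: large; text-align: center;">John Harvard</header>
        <main style="font-size: medium; text-align: center;">Welcome to my home page!</main>
        <footer style="font-size: small; text-align: center;">Copyright (c) John Harvard</footer>
    </body>
</html>
```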

- [9:29:42](https://www.youtube.com/watch?v=undefined&t=70182s) But in blue here, notice is another


very common paradigm. It's different syntax now, but how would you describe what you're looking at
here in blue? This is another example of what kind of programming convention? AUDIENCE: Key value.
SPEAKER 1: Yeah, it's just more key value pairs, right? It'd be nice if the world standardized how you
write key value pairs, because we've now seen equal signs and arrows and colons and semicolons, and
all this.

- [9:30:08](https://www.youtube.com/watch?v=undefined&t=70208s) But it's just different languages,


different choices. The key here is font-size, the value is large. The other key is text-align, the colon, the
value is center. The semicolon just separates one key value pair from another. Just like in the URL, the
ampersand did, in the context of HTTP. The designers of CSS used semicolons instead.

- [9:30:27](https://www.youtube.com/watch?v=undefined&t=70227s) Strictly speaking, this semicolon


isn't necessary. I tend to include it just for symmetry, but it doesn't matter, because there's nothing after
that. This is a bit of a weird example. This is the co-mingling of CSS inside of HTML. So as of now, you
can use the CSS language inside of the quote marks in the value of a style attribute.

- [9:30:49](https://www.youtube.com/watch?v=undefined&t=70249s) We did something a little similar


last two weeks, a week plus ago, when we included some SQL inside of Python. So again, languages can
kind of cross barriers together. But we're going to clean this up, because this is going to get messy
quickly, certainly for large web pages, the size of Harvard's or Yale's, or the like.

- [9:31:07](https://www.youtube.com/watch?v=undefined&t=70267s) So let's see what this looks like.


Let me go back to my browser window here, reload the page. And it's not that different. But it's indeed
centered, and it's indeed large, medium, and small text. And let me make one refinement. The copyright
symbol actually can be expressed, but there's no key on my US keyboard here.

- [9:31:25](https://www.youtube.com/watch?v=undefined&t=70285s) I can actually magically say


ampersand hash 169 semicolon, using what's called an HTML entity. It turns out there are numeric
codes, with this weird syntax, that allow you to specify symbols that exist in Macs and PCs and phones,
but that don't exist on most keyboards. If I reload the page now, now it's a proper copyright symbol.
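
For instance, the footer could use the numeric entity like this. It's a small sketch; the style attribute is carried over from the earlier step.

```html
<!-- &#169; is the numeric HTML entity for the copyright symbol -->
<footer style="font-size: small; text-align: center;">
    Copyright &#169; John Harvard
</footer>
```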

- [9:31:46](https://www.youtube.com/watch?v=undefined&t=70306s) So minor aesthetic, but it


introduces us to these HTML entities. So even if you've never seen CSS before, you can probably find
something kind of dumb about what I did here, like poor design. It is correct, if my goal was small,
medium, and large, bottom up. But what looks like bad design, perhaps, even if you've never seen this
language before?

- [9:32:09](https://www.youtube.com/watch?v=undefined&t=70329s) Yeah. AUDIENCE: Same


SPEAKER 1: Yeah, I've used the same style three times, like copy/paste, or typing the exact same thing again and
again. It has rarely been a good thing. Well, here's where we can take advantage of the design of CSS,
because it supports what we might call inheritance, whereby children inherit the properties, the key
value pairs of their parents or ancestors.

- [9:32:31](https://www.youtube.com/watch?v=undefined&t=70351s) And what that means is, I can do


this. Let me get rid of this text align. Let me get rid of this text align. Let me get rid of this one. I could get
rid of the semicolon, too, but I'll leave it for now. And let me add all of that style to the parent element,
the body, so that it sort of cascades down to the header, the main, and the footer tags as well.

- [9:32:52](https://www.youtube.com/watch?v=undefined&t=70372s) And let me close my quotes there,


too. Now, if I go back to my browser and hit reload, nothing changes. But it's a little better designed,
right? Because if I want to change the text alignment to maybe be right aligned, I can now reload the
page, and voila, now it's over there. I change it in one place, not in three different places.
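
Sketched out, the refactoring might look like this: text-align lives once on the body, and the children inherit it. The page content is the same assumed placeholder as before.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <!-- text-align is set once on the parent and inherited by header, main, and footer.
         Changing center to right here moves all three at once. -->
    <body style="text-align: center;">
        <header style="font-size: large;">John Harvard</header>
        <main style="font-size: medium;">Welcome to my home page!</main>
        <footer style="font-size: small;">Copyright &#169; John Harvard</footer>
    </body>
</html>
```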

- [9:33:09](https://www.youtube.com/watch?v=undefined&t=70389s) So that would seem to be


marginally better design. And could we do this any more differently? Well, it's not that elegant that it's
all just in line with my HTML. This generally tends to be bad practice, where you co-mingle your HTML
and your CSS, especially since some of you might be really good at laying out the structure of web pages
and the content and the data, and you might have a horrible sense of design or just not care about the
aesthetics.

- [9:33:33](https://www.youtube.com/watch?v=undefined&t=70413s) You might work with a designer,


an artist, who's much better at all of these fine tunings aesthetically. Wouldn't it be nice if you could
work on the HTML, they could work on the CSS. And you don't have to somehow like literally edit the
same lines of code as each other. Well, just like we can move stuff into header files in C, or packages in
Python, we can do the same in CSS.

- [9:33:54](https://www.youtube.com/watch?v=undefined&t=70434s) So I'm actually going to go ahead


and do this. Let me get rid of all of these style attributes, and let me now start to practice a convention
of not co-mingling CSS with my HTML. Let me instead move it into the head of the page, in a style tag,
instead of an attribute. This is one of the rare examples where there are attributes that have the same
names as tags, and vice versa.

- [9:34:17](https://www.youtube.com/watch?v=undefined&t=70457s) It's not very common, but this


one does exist. Here's a slightly different syntax for expressing the same key value pairs. If I want to apply
CSS properties, that is, key value pairs, to the header of the page, I say header, and then I use curly
braces, and inside of those I say font-size large, text-align center.

- [9:34:38](https://www.youtube.com/watch?v=undefined&t=70478s) Then, if I want to apply some


properties to the main section of the page, I again do font-size, say, medium, and then I can do text-align
center. Then, lastly, on the footer of the page, I can assign some properties like font-size small, and then
text-align center semicolon. And I don't have to do anything more in my HTML.

- [9:35:00](https://www.youtube.com/watch?v=undefined&t=70500s) It all just represents the structure


of my page. But, because of this style tag in the head of the page, the browser knows in advance that the
moment it encounters a header tag, a main tag, or a footer tag, it should apply those properties, those
styles. If I reload the page, other than it being recentered now, there's no other changes.
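
A sketch of that version, with the CSS moved out of the style attributes and into a style tag in the head. The property values are the ones read aloud; the rest of the page is assumed.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            /* Each rule names a tag, then lists key value pairs inside curly braces */
            header {
                font-size: large;
                text-align: center;
            }
            main {
                font-size: medium;
                text-align: center;
            }
            footer {
                font-size: small;
                text-align: center;
            }
        </style>
        <title>home</title>
    </head>
    <body>
        <!-- The body is now pure structure, with no style attributes -->
        <header>John Harvard</header>
        <main>Welcome to my home page!</main>
        <footer>Copyright &#169; John Harvard</footer>
    </body>
</html>
```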

- [9:35:18](https://www.youtube.com/watch?v=undefined&t=70518s) All we're doing is sort of


iteratively improving the design here. But now everything's in the top of the file. But there's still a bad
design here. What could I now do that would be smarter? Similar problem to before. Yeah. AUDIENCE:
Create it. SPEAKER 1: OK, create a new file with just the CSS.

- [9:35:36](https://www.youtube.com/watch?v=undefined&t=70536s) I like that. Let's go there in just


one second. But even as we're here, there's still a redundancy we can probably chip away at. Yeah, get
rid of the text-align center in three different places, which doesn't seem necessary, and perhaps
someone else, if I get rid of text-align center, what should I add to my style tag in order to bring it back,
but apply it to everything in the page? And the page, if I scroll down, looks like this, in HTML.

- [9:36:00](https://www.youtube.com/watch?v=undefined&t=70560s) Yeah. AUDIENCE: The body.


SPEAKER 1: Yeah, so the body tag. So let me go ahead and say body. And then in here, put text-align
center. And that, now, if I reload the page, has no visual effect, but it's just better design, because now I
factored out that kind of commonality. And so, just to make clear what we've been doing here, these are
all, again, CSS properties, these key value pairs.

- [9:36:19](https://www.youtube.com/watch?v=undefined&t=70579s) And there's different types of


ways of using them. And there's this whole taxonomy. What we've been doing thus far are what we're
going to call type selectors, where the type is the name of a tag. And so it turns out there's other ways,
though, to do this. And let's head in this direction.
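
Here is roughly what the factored-out version could look like, using only type selectors. This is a sketch of just the style tag, assuming the same header, main, and footer tags in the body.

```html
<style>
    /* Type selectors: each selector is the name of a tag */
    body {
        text-align: center; /* stated once, inherited by everything inside the body */
    }
    header {
        font-size: large;
    }
    main {
        font-size: medium;
    }
    footer {
        font-size: small;
    }
</style>
```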

- [9:36:35](https://www.youtube.com/watch?v=undefined&t=70595s) Let's go ahead and maybe write


our CSS slightly differently, because you know what would be nice. I bet, after today, once I start creating
other files for my home page, or John Harvard's home page, I might want to have centered text on other
pages. And I might want to have large text or medium text or small text.

- [9:36:52](https://www.youtube.com/watch?v=undefined&t=70612s) It'd be nice if I could reuse these


properties again and again, and kind of create my own library, maybe even ultimately putting it in a
separate file. So let me do this. Instead of explicitly applying text-align center to the body, let me create a
new noun, or an adjective, rather, for myself, called centered.

- [9:37:09](https://www.youtube.com/watch?v=undefined&t=70629s) It has to start with a dot, because


what I'm doing is inventing my own class, so to speak. This has nothing to do with classes in Java or
Python. Class here is this aesthetic feature. And, actually, let me rename these, to be dot large, dot
medium, and dot small. What this is doing for me is it's inventing new words, well-named words, that I
can now use in this file, or potentially in other web pages I make, as follows.

- [9:37:36](https://www.youtube.com/watch?v=undefined&t=70656s) I can now say, if I want to center


the whole body, I can say class equals centered. On the header tag, I can say class equals large. On the
main tag I can say class equals medium. On the footer tag, I can say class equals small. But let me take
this one step further. As you suggested, why don't I go ahead now and let me actually get rid of-- let me
grab all of the CSS, copy it to my clipboard.
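
A sketch of the class-based version. The class names centered, large, medium, and small are the ones mentioned above; the page content is assumed.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            /* Class selectors start with a dot; the names are invented by the page's author */
            .centered {
                text-align: center;
            }
            .large {
                font-size: large;
            }
            .medium {
                font-size: medium;
            }
            .small {
                font-size: small;
            }
        </style>
        <title>home</title>
    </head>
    <!-- In the HTML, the class attribute uses the name without the dot -->
    <body class="centered">
        <header class="large">John Harvard</header>
        <main class="medium">Welcome to my home page!</main>
        <footer class="small">Copyright &#169; John Harvard</footer>
    </body>
</html>
```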

- [9:38:01](https://www.youtube.com/watch?v=undefined&t=70681s) Let me get rid of the style tag


here, and create a new file called Home.css, and let me just save all of that same text in a separate file
ending in .css, nothing else, no HTML whatsoever. But let me go back to my Home.html page, and this is
one of the most annoyingly named tags, because it doesn't really mean what it does, link href equals
Home.css, rel equals stylesheet.

- [9:38:30](https://www.youtube.com/watch?v=undefined&t=70710s) So ideally we would have used


the link tag for links in web pages, but this is link in the sort of conceptual sense. We're linking this file to
this other one, so that they work together, using this hyper-reference, Home.css, the relationship of that
file to this one is that of stylesheet. A stylesheet is a file containing a whole bunch of stylizations, a whole
bunch of properties, as we just did.
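
A sketch of how the two files might fit together. The lowercase file name home.css is an assumption, and the stylesheet's contents are shown in a comment only to keep the example in one place.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <!-- href names the other file; rel says that file is this page's stylesheet -->
        <link href="home.css" rel="stylesheet">
        <title>home</title>
    </head>
    <body class="centered">
        <header class="large">John Harvard</header>
        <main class="medium">Welcome to my home page!</main>
        <footer class="small">Copyright &#169; John Harvard</footer>
    </body>
</html>

<!--
    Assumed contents of home.css (CSS only, no HTML):

    .centered { text-align: center; }
    .large    { font-size: large; }
    .medium   { font-size: medium; }
    .small    { font-size: small; }
-->
```

Any other page on the site could reuse the same classes by including that one link line.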

- [9:38:53](https://www.youtube.com/watch?v=undefined&t=70733s) So here, too, it's underwhelming


the effect. If I reload the page, nothing changed. But now, I not only have a better design here, because I
can now use those same classes in my second page that I might make, my third page, my fourth page, my
bio, my resume page, whatever it is I'm making on my website here, I can reuse those styles by just
including one line of code, instead of copying and pasting all of that style stuff into file after file after file.

- [9:39:22](https://www.youtube.com/watch?v=undefined&t=70762s) And heck, if the rest of the world


is really impressed by my centered class, and my large and medium and small classes, I could bundle this
up, let other people on the internet download it, and I have my own library, my own CSS library, that
other people can use. Why should you ever invent a centered class again, if I already did it for you, stupid
and small as this one is?

- [9:39:41](https://www.youtube.com/watch?v=undefined&t=70781s) But it would be nice now to


package this up in a way that's usable by other people as well. So this is perhaps the best design, when it
comes to CSS. Use classes where you can, use external stylesheets where you can, but don't use the style
attribute where we began, which while explicit, starts to get messy quickly, especially for large files.

- [9:40:07](https://www.youtube.com/watch?v=undefined&t=70807s) All right, any questions, then, on


this? No? All right, so that's class selectors. When you specify dot something, that means you're selecting
all of the tags in the page, that have that particular class, and applying those properties. So there's a
couple of others here, just to give you a taste now of what's possible.

- [9:40:25](https://www.youtube.com/watch?v=undefined&t=70825s) There's so much more that you


can actually do with HTML and CSS together. Let me go ahead and open up a few examples that I did
here in advance. Let me go ahead and open up VS Code. And let me go ahead and copy my source eight
directory. Give me one second to grab the source eight directory for today's lectures, so that I can now
go into my browser, go into some of the pre-made examples in source eight, and let me open up
paragraphs one here.

- [9:40:57](https://www.youtube.com/watch?v=undefined&t=70857s) So here's something, it's a little


subtle. But does anyone notice how this is stylized? This is just some generic lorem ipsum text again. But
what's noteworthy stylistically, a book might do this. Yeah? AUDIENCE: They're bigger. SPEAKER 1: Yeah,
the first paragraph's a little bigger. Why? Who knows, it's just a stylistic thing at the beginning of the
chapter.

- [9:41:19](https://www.youtube.com/watch?v=undefined&t=70879s) The first paragraph is bigger. How


did we do that? Well, we can actually explore this in a couple of ways. One, I can obviously go into VS
Code and show you the code. But, now, that we're using Chrome and we're using these developer tools,
let's again go into them. View developer, developer tools, and now notice, let me turn off the mobile
feature, and let me move the dock back to the bottom, just so that it's fully wide.

- [9:41:40](https://www.youtube.com/watch?v=undefined&t=70900s) We looked at the Network tab


before. We looked at the mobile button before. Now let me click on Elements. What's nice about the
Elements tab is you can see a pretty printed version of the web page's HTML, nicely color-coded, syntax
highlighted for you, so that you can now henceforth learn from, look at, the source code, the HTML
source code, of any web page on the internet.

- [9:42:02](https://www.youtube.com/watch?v=undefined&t=70922s) Notice that my own web page


here, it's not that interesting. There's a bunch of paragraph tags of lorem ipsum text. But notice what I
did. The very first one, I gave an ID to. This is something that you, as a web designer, can do. You can give
an ID attribute to any tag in a page, to give it a unique identifier.

- [9:42:20](https://www.youtube.com/watch?v=undefined&t=70940s) The onus is on you, not to reuse


the word, anywhere else. If you reuse it, you've screwed up. It's incorrect behavior. But I chose an ID of
first, just so that I have some way of referring to the very first paragraph in this file. If I look in the head
of the page, and the style tag here, notice that I have hash first.

- [9:42:40](https://www.youtube.com/watch?v=undefined&t=70960s) So just as I use dot for classes, the


world of CSS uses a hash symbol to represent IDs, unique IDs. And what this is telling the browser,
whatever element has the first ID, F-I-R-S-T, without the hash, apply font-size larger to it. And that's why
the first paragraph, and only the first paragraph, is actually stylized.
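
A sketch of the ID technique. The ID first and the larger font size come from the description above; the paragraph text is placeholder.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            /* The hash means "the element whose id is first" */
            #first {
                font-size: larger;
            }
        </style>
        <title>paragraphs</title>
    </head>
    <body>
        <!-- An id should appear at most once per page -->
        <p id="first">Lorem ipsum dolor sit amet.</p>
        <p>Consectetur adipiscing elit.</p>
        <p>Sed do eiusmod tempor incididunt.</p>
    </body>
</html>
```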

- [9:43:02](https://www.youtube.com/watch?v=undefined&t=70982s) If I actually go into VS Code now,


and let me go into my source eight directory. Let me open up Paragraphs1.html. Here is the actual file. If
I want to change the color of that first paragraph to green, for instance, I can do color colon green. Let
me close the developer tools, reload the page. And now that page is green as well.

- [9:43:23](https://www.youtube.com/watch?v=undefined&t=71003s) You don't have to just use words.


You can use hexadecimal. What was the hex code for green in RGB? Like no red, lots of green, no blue. So
you could do 00 FF 00, using a hash, which, coincidentally, is the same symbol, but it has nothing to do
with IDs. This is just how Photoshop and web pages represent colors.
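
As a small sketch, the hexadecimal form of that rule might look like this. The selector is the same #first ID as before; the comment spells out the red, green, and blue pairs.

```html
<style>
    #first {
        /* #RRGGBB: 00 red, FF green, 00 blue, which is pure green.
           The keyword green is a darker shade, #008000. */
        color: #00FF00;
        font-size: larger;
    }
</style>
```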

- [9:43:45](https://www.youtube.com/watch?v=undefined&t=71025s) Let's go back here and reload. It's


the same, although it's a slightly different version of green. This is pure green here. If I want to change it
to red, that would be, let's see, RGB FF 00 00, and here I can go and reload. Now the first paragraph is red.
This actually gets pretty tedious quickly.

- [9:44:02](https://www.youtube.com/watch?v=undefined&t=71042s) Like, if you're a web designer


trying to make a website for the first time, it actually might be fun to tinker with the website, before you
open up your editor and you start making changes and save and reload. That's just more steps. So notice
what you can do with developer tools, too, in Chrome and other browsers.

- [9:44:17](https://www.youtube.com/watch?v=undefined&t=71057s) When I highlight over this


paragraph, under the Elements tab, notice that, one, it gets highlighted in blue. If I move my cursor, it
doesn't get highlighted. If I move it, it gets highlighted. So it's showing me what that tag represents. But
notice over here on the right, you can also see all of the stylizations of that particular element.

- [9:44:36](https://www.youtube.com/watch?v=undefined&t=71076s) Some of them are built-in. The


italicized ones here at the bottom mean user agent stylesheet. That means this is what Chrome makes all
paragraphs look like by default. But in non-italicized here, you see hash first, which is my code, that I just
changed. And if I want to start tinkering with colors, I can do like 00 00 FF Enter.

- [9:44:56](https://www.youtube.com/watch?v=undefined&t=71096s) I changed it to blue. But notice, if


I go back to VS Code, I didn't change my original VS Code code. This is now purely client side. And this is
a key detail. When I drew that picture earlier of the browser going, making a request to the cloud, the
server in the cloud and the response coming back, the browser, your Mac, your PC, your phone, has a
copy of all the HTML and CSS, so you can change it here, however you actually want.

- [9:45:21](https://www.youtube.com/watch?v=undefined&t=71121s) And, for instance, you can do this


with any website. Let's go, say, on a field trip here, to how about Stanford.edu. So here's Stanford's
website as of today. Let's go ahead here and let's see, there's their admissions page, campus life, and so
forth. Let me go ahead and view developer tools on Stanford's page, developer tools, elements, you can
see all of their HTML.

- [9:45:45](https://www.youtube.com/watch?v=undefined&t=71145s) And notice it's collapsed, so here


is their header. Here's their main part, and I'm using my keyboard shortcuts to just open and close the
tags, to dive in deeper and deeper. Suppose you want to kind of mess with Stanford, you can actually like
right click on any element of a page, or control click, Inspect, and that's going to jump you automatically
to the tag in the Elements tab that shows you that link.

- [9:46:07](https://www.youtube.com/watch?v=undefined&t=71167s) And notice, if I hover over this LI,


notice Stanford's using a list, as an unordered list from left to right. But it doesn't have to be a bulleted
list top to bottom. They've used CSS to change it to be a list, from news, events, academics, research,
health care, campus admission, about. Well, so much for admission, that's gone.

- [9:46:26](https://www.youtube.com/watch?v=undefined&t=71186s) So now, if I close developer tools,


now it's gone from Stanford's website. But, of course, what have I really done? I've just like mutated my
own local copy. So this is not hacking, even though this might be how they do it in TV and the movies. It's
still there if I reload the page. But it's a wonderfully powerful way to, one, just iterate quickly, and try
different things stylistically, figure out how you want to design something, and two, just learn how
Stanford did something.

- [9:46:50](https://www.youtube.com/watch?v=undefined&t=71210s) So, for instance, if I right click or


control click on admission again, go to inspect, and let me go to the LI tag. Let me keep going up, up, up,
up, up to the UL tag. There's going to be a lot going on here. But notice, they have applied all of these
CSS properties to that particular UL tag. But notice, here, this is how, it's something like this.

- [9:47:12](https://www.youtube.com/watch?v=undefined&t=71232s) And we'd have to read more to


learn how this works, list style type none, this is how they probably got rid of the bullets. And what you
can do is just tinker. Like, all right, well, what does this do? Well, let me uncheck it. All right, didn't really
change anything, font weights, uncheck this, there we go.

- [9:47:27](https://www.youtube.com/watch?v=undefined&t=71247s) So now the margin is changed,


the padding around it has changed. Let's get rid of this. We can just start turning things on and off, just to
get a sense of how the web page works. I'm not really learning anything here so far. Let me go to the LI
here for, let's go to the admissions one here. Margin, there we go, OK.

- [9:47:47](https://www.youtube.com/watch?v=undefined&t=71267s) So when there's a display


property in CSS, that's apparently effectively changing things from vertical to horizontal, if I turn that off,
now Stanford's links all look like this. And there are those bullets. So again, just default styles, that
they've somehow overridden, and a good web designer just knows ultimately how to do these kinds of
things.
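
To make the idea concrete, here is a hedged sketch of how an unordered list can be turned into a horizontal menu. This is not Stanford's actual CSS. The lecture does not say which display value they use, so inline-block is an assumption, as are the link targets.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            ul {
                list-style-type: none; /* removes the default bullets */
                padding: 0;
            }
            li {
                display: inline-block; /* assumed value: lays the items out left to right */
                margin-right: 1em;
            }
        </style>
        <title>menu</title>
    </head>
    <body>
        <ul>
            <li><a href="#news">News</a></li>
            <li><a href="#events">Events</a></li>
            <li><a href="#admission">Admission</a></li>
        </ul>
    </body>
</html>
```

Unchecking either property in developer tools would bring back the default vertical, bulleted list.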

- [9:48:07](https://www.youtube.com/watch?v=undefined&t=71287s) All right, how about a couple of


final building blocks, before we'll take one more break. And then we'll dive in with JavaScript to
manipulate this stuff programmatically. Let me go ahead and open up, how about Paragraphs2 here. Let
me close this tab, let me go into Paragraphs2, which is premade. And this one looks the same, except,
when I go ahead and inspect this first paragraph, notice that I was able to get rid of the ID somehow,
which is just to say, there's many, many ways to solve problems in HTML and CSS,

- [9:48:35](https://www.youtube.com/watch?v=undefined&t=71315s) just like there is in C and Python.


Let me look in the head and the style of the page now. This is what we might call another type of
selector, that allows us to specify the paragraph tag, that itself happens to be the first child only. So you
can apply CSS to a very specific child, namely first child. There's also syntax for last child, if just the last
one is supposed to look a little different.
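
A sketch of that selector in a page with no IDs at all. The paragraph text is placeholder, and the larger font size is carried over from the previous example.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            /* Matches a p only if it is the first child of its parent */
            p:first-child {
                font-size: larger;
            }
        </style>
        <title>paragraphs</title>
    </head>
    <body>
        <p>Lorem ipsum dolor sit amet.</p>
        <p>Consectetur adipiscing elit.</p>
        <p>Sed do eiusmod tempor incididunt.</p>
    </body>
</html>
```

Swapping in p:last-child would style the final paragraph instead.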

- [9:49:00](https://www.youtube.com/watch?v=undefined&t=71340s) So, here, I've just gotten out of


the business of creating my own unique identifier and, instead, I'm using this type of selector as well.
Well, what more can we do? Let me go into another example here, called Link1.html, and here we have a
very simple page that just says visit Harvard. But notice it's purple by default, because we've been to
Harvard.edu before.

- [9:49:21](https://www.youtube.com/watch?v=undefined&t=71361s) Let's see if we can't maybe stylize


Harvard's links to be a little different. Let me go into Link version 2, now, which looks like this. And now
Harvard is very red. How did I do that? Well, let me right click on it, click Inspect, and I can start to poke
around. It looks like my HTML is not at all noteworthy.

- [9:49:40](https://www.youtube.com/watch?v=undefined&t=71380s) It's just very simple HTML, anchor


tag with an HREF. So let's look at the style. Let me zoom out. And we can look at it in two different ways.
We can literally look at the style, contents here, or we can look at Chrome's pretty version of it, over
here. It looks like my style sheet, in the style tag, has changed the color to be red, and the text
decoration, which is a new thing, but it's another CSS property, to none.

- [9:50:06](https://www.youtube.com/watch?v=undefined&t=71406s) Notice, if I turn that off, links on


the internet are underlined by default, which tends to be good for familiarity, for visibility, for
accessibility. But, if it's very obvious what is text and what is a link, maybe you change text decoration to
none. But maybe, watch this, maybe the link comes, the line comes back when you hover over it.

- [9:50:27](https://www.youtube.com/watch?v=undefined&t=71427s) Well, let's look at how I did this in


style. Notice that I have stylization, and I put my curly braces on the same line here, as tends to be
convention in CSS. Color is red, text decoration is none. But, whenever an anchor tag is hovered over,
you can change the text decoration to be back to the default, underline.
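
Sketched as a full page, the hover behavior might look like this. The properties are the ones described; the link text and URL are assumed.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            a {
                color: red;
                text-decoration: none; /* no underline by default */
            }
            /* Applies only while the cursor is over the link */
            a:hover {
                text-decoration: underline;
            }
        </style>
        <title>link</title>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/">Harvard</a>.
    </body>
</html>
```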

- [9:50:48](https://www.youtube.com/watch?v=undefined&t=71448s) So, again, just little ways of


playing around with the aesthetics of the page, once you understand that, really, there's just different
types of selectors. And you might have to remind yourself, look them up occasionally, as to what the
syntax is. But it's just another way of scoping your properties to specific tags.

- [9:51:03](https://www.youtube.com/watch?v=undefined&t=71463s) Let's look at version 3 of this here,


which adds Yale to the mix. If I go to Link3.html, maybe I want to have Harvard links red, Yale links blue.
How might I have done this? Well, let's right click, and click Inspect. And here we might have two links,
with a couple of techniques, just to, again, emphasize, you can do this so many different ways.

- [9:51:25](https://www.youtube.com/watch?v=undefined&t=71485s) I gave my Harvard link an ID of


Harvard, my Yale link an ID of Yale. In my CSS, if we go to the head of the page, I then did this. The tag
with the Harvard ID, a.k.a. #Harvard, should be red, #Yale should be blue, and then any anchor tag
should have no text decoration, unless you hover over it, at which point it should be underlined.
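
A sketch of the ID-based version with two links. The lowercase ID names and the exact URLs are assumptions.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            #harvard {
                color: red;
            }
            #yale {
                color: blue;
            }
            /* Shared by every anchor tag, whatever its id */
            a {
                text-decoration: none;
            }
            a:hover {
                text-decoration: underline;
            }
        </style>
        <title>link</title>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/" id="harvard">Harvard</a>
        or <a href="https://www.yale.edu/" id="yale">Yale</a>.
    </body>
</html>
```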

- [9:51:49](https://www.youtube.com/watch?v=undefined&t=71509s) And so, if I hover over Harvard,


it's red underlined, Yale, it's blue underlined. If I want to get rid of the IDs, I can do this a slightly different
way. Let me go into Link4. Same effect, but notice, I got rid of the IDs now. How else can I express
myself? Well, let's look at the CSS here. The anchor tag has no text decoration by default, unless you're
hovering over it.

- [9:52:08](https://www.youtube.com/watch?v=undefined&t=71528s) And this is kind of cool. This is


what we would call, on our list here, an attribute selector, where you select tags using CSS notation,
based on an attribute. So this is saying, go ahead and find any anchor tag whose HREF value happens to
equal this URL, and make it red. Do the same for Yale, and make it blue.
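
A sketch of the attribute selector, matching on the exact value of href. The URLs are assumed to be the ones in the links.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            a {
                text-decoration: none;
            }
            a:hover {
                text-decoration: underline;
            }
            /* Matches only when href equals this string exactly */
            a[href="https://www.harvard.edu/"] {
                color: red;
            }
            a[href="https://www.yale.edu/"] {
                color: blue;
            }
        </style>
        <title>link</title>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/">Harvard</a>
        or <a href="https://www.yale.edu/">Yale</a>.
    </body>
</html>
```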

- [9:52:28](https://www.youtube.com/watch?v=undefined&t=71548s) Now, this might not be ideal,


because if there's something after the slash, these equal signs don't work, because if it's a different
Harvard or different Yale link, this is a little too precise. So let me look at version 5 here, of Link.html.
Look at this style, and I did this a little smarter.

- [9:52:44](https://www.youtube.com/watch?v=undefined&t=71564s) This is new syntax. And, again,


just the kind of thing you look up. Star equals means, change any anchor tag whose HREF contains
anywhere in it Harvard.edu to red, and do the same thing for Yale, based on star equals. So star here
connotes wildcard. So search for Harvard.edu or Yale.edu

- [9:53:05](https://www.youtube.com/watch?v=undefined&t=71585s) anywhere in the HREF, and if


it's there, colorize the link. And, again, we could do this all day long, with diminishing returns, to actually
achieve the same kind of stylizations in different ways. And as projects just get larger and larger, you just
have more and more decisions to make. And so you have certain conventions you start to adopt.
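
The star-equals attribute selector described above could be sketched like this. The admissions path is a made-up example of a link that an exact match would miss.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            a {
                text-decoration: none;
            }
            a:hover {
                text-decoration: underline;
            }
            /* *= matches when the substring appears anywhere in href */
            a[href*="harvard.edu"] {
                color: red;
            }
            a[href*="yale.edu"] {
                color: blue;
            }
        </style>
        <title>link</title>
    </head>
    <body>
        <!-- The path after the slash is hypothetical; the selector still matches -->
        Visit <a href="https://www.harvard.edu/admissions">Harvard</a>
        or <a href="https://www.yale.edu/">Yale</a>.
    </body>
</html>
```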

- [9:53:22](https://www.youtube.com/watch?v=undefined&t=71602s) And, indeed, if I may, you have


the introduction of what are called frameworks, ultimately. If you're a full-time web developer, or you're
working for a company doing the same, you might have internal conventions that you adhere to. For
instance, the company might say, always use classes, don't use IDs.

- [9:53:38](https://www.youtube.com/watch?v=undefined&t=71618s) Or always use attribute selectors,


or don't use this. And it wouldn't be necessarily as draconian as that. But they might have a style guide
of sorts. But, what many people, and many companies, do nowadays, is they do not come up with all of
their own CSS properties. They start with something off the shelf, a framework, typically a free and open
source framework, that just gives them a lot of pretty stylizations for free, just by using a third party
library.

- [9:54:04](https://www.youtube.com/watch?v=undefined&t=71644s) And one of the most popular ones


nowadays is something called Bootstrap, that CS50 uses on all of its websites, super-popular in industry
as well. It's at getbootstrap.com, and this is just to give you a taste of it, a website that documents the
library that they offer. And there's so much documentation here, but let me just go to things like, how
about components.

- [9:54:27](https://www.youtube.com/watch?v=undefined&t=71667s) It just gives you, out of the box,


the CSS with which you can create little alerts. If you've ever noticed on CS50's website, little colorful
warnings at the top of the page, or call outs, to draw your attention to things. How did we do that? It's
probably a paragraph tag or a div tag, and maybe we changed the font color.

- [9:54:43](https://www.youtube.com/watch?v=undefined&t=71683s) We changed the background


color. Or it's a lot of stuff we could absolutely do from scratch, but, you know what, why would we
reinvent the wheel if we can just use Bootstrap. So, for instance, let me just scroll down. If you've ever
seen on CS50's website a yellow warning alert like this, let me just zoom in on this.

- [9:55:01](https://www.youtube.com/watch?v=undefined&t=71701s) We are just using HTML like this.


We're using a div tag, which, again, is an invisible division, a rectangular region of the page. But we're
using a class called alert and another class called alert-warning. Those are classes that the folks at
Bootstrap invented. They associated certain text colors and background colors and padding and margin
and like other aesthetics with them, so all we have to do is use those classes.

- [9:55:25](https://www.youtube.com/watch?v=undefined&t=71725s) Role equals alert, just makes clear


to like a screen reader that this is an alert, that should probably be recited, and whatever's in between
the open tag and close tag, is what the human would see. How do you use something like Bootstrap?
Well, you just read the documentation. Under Getting Started, there is a link tag you copy/paste into
your own.
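To make the alert markup concrete, here is a small sketch that builds it as a string. The class names `alert` and `alert-warning` and the `role="alert"` attribute come from the description above; the helper name `alertHtml` and the sample message are invented for illustration, and the result only looks like a yellow alert on a page that has Bootstrap's CSS loaded.

```javascript
// Hypothetical helper, not part of Bootstrap: returns the HTML for a
// warning alert. Assumes `message` is trusted text, since it is not escaped.
function alertHtml(message) {
  return '<div class="alert alert-warning" role="alert">' + message + "</div>";
}

console.log(alertHtml("This is a warning."));
```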

- [9:55:44](https://www.youtube.com/watch?v=undefined&t=71744s) So let me do this. So in


Table.html, we had code like this. Let me actually read Bootstrap's documentation really fast. And they
tell me... copy/paste this code. I'm going to put this into the head of my page. And it's quite long, but
notice, it's a link tag, which I used earlier for my own CSS file, the HREF of which is this CDN link, content
delivery network, that's referring to a specific version of Bootstrap that's available on this day.

- [9:56:10](https://www.youtube.com/watch?v=undefined&t=71770s) And the file that I'm including is


called bootstrap.min.css. This is an actual file I can visit with my browser. If I open this in a separate tab,
this is the CSS that Bootstrap has made freely available to us. Crazy long, no white space. That's because
it's been minified, just to not waste space on lots of white space and comments.

- [9:56:30](https://www.youtube.com/watch?v=undefined&t=71790s) But this contains a whole lot,


hundreds, of CSS properties that we can reuse, thanks to classes that they invented. If I want to use
some JavaScript code, I can also copy this script tag. But we'll come back to that before long. Let me now
just make a couple of tweaks to this table. If I go into my browser from before, this is what it looked like
previously, where name and number were bold, but centered, and then Carter and David were on the
left, and the numbers were to the right.

- [9:56:56](https://www.youtube.com/watch?v=undefined&t=71816s) It's fine. It's not that pretty, but


it'd be nice if it were a little prettier than that. So if we add Bootstrap into it, notice one thing happens
first, when I reload the page. No longer are Chrome's default styles used. Now Bootstrap's default styles
are used, which is a way of enforcing similarity across Chrome, Edge, Firefox, Safari, and others.

- [9:57:15](https://www.youtube.com/watch?v=undefined&t=71835s) Notice it went from a serif font to


a sans serif font, and something cleaner like this. It still looks pretty ugly, but let me go into Bootstrap's
documentation. Let me go under their content tab, for tables. And if I just kind of start skimming this,
these are some good looking tables, right? Like, there's some underlining here, some bolder font.

- [9:57:38](https://www.youtube.com/watch?v=undefined&t=71858s) There's a dark line. If I keep going,


ooh, that's getting pretty, too, if I want to have a colorful table, like I could figure all of this stuff out
myself if I want some dark mode here, if I want to have alternating highlights, and so forth. There's so
many different stylizations of tables that I could do myself.

- [9:57:54](https://www.youtube.com/watch?v=undefined&t=71874s) But I care about making a phone


book, not about reinventing these wheels. So if I read the documentation closely, it turns out that all I
need to do is add Bootstrap's table class to my table tag, and watch with a simple reload, what my now
Table.html file looks like. Much nicer, right? Might not be what you want, but, my God, with like two lines
of code, I just really prettied things up.
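As a sketch of how little changes in the markup, the helper below renders a phone book as a table that carries Bootstrap's `table` class. The helper name `phoneBookTable` and the phone numbers are placeholders, not values from the lecture, and the styling only applies once Bootstrap's stylesheet is linked in the head of the page.

```javascript
// Hypothetical helper: turns [name, number] pairs into an HTML table.
// The only Bootstrap-specific part is class="table" on the table tag.
function phoneBookTable(rows) {
  const body = rows
    .map(([name, number]) => "<tr><td>" + name + "</td><td>" + number + "</td></tr>")
    .join("");
  return (
    '<table class="table">' +
    "<thead><tr><th>Name</th><th>Number</th></tr></thead>" +
    "<tbody>" + body + "</tbody>" +
    "</table>"
  );
}

// Placeholder numbers for illustration only.
console.log(phoneBookTable([["Carter", "555-0100"], ["David", "555-0101"]]));
```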

- [9:58:18](https://www.youtube.com/watch?v=undefined&t=71898s) And so here, then, is the value of


using something like a framework. It allows you to actually create much prettier, much more user-
friendly websites than you might otherwise be able to make on your own, certainly quickly. In fact, let's
iterate one more time on one other example, before we introduce a bit of that code.

- [9:58:40](https://www.youtube.com/watch?v=undefined&t=71920s) Let me go ahead and open up


Search.html from before, which, recall, looks like this, and Search.html on my browser was this very
simple Google search. And suppose I want to reinvent Google.com's UI a bit more. Here's a screenshot of
Google.com on a typical day. It's got an about link, a store link, Gmail images, these weird dots, sign in,
their logo.

- [9:59:05](https://www.youtube.com/watch?v=undefined&t=71945s) It's not appearing well on the


screen here, but there's a big text box in the middle, and then two buttons, Google search, and I'm
feeling lucky. Well, could I maybe go about implementing this UI myself, using some HTML, some CSS,
and maybe Bootstrap's help, just so I don't have to figure out all of these various stylizations? Well,
here's my starting point.

- [9:59:24](https://www.youtube.com/watch?v=undefined&t=71964s) In Search.html, let's go and add in


Bootstrap, first and foremost, so that we have access to all of their classes that are reusable now. And let
me go ahead and figure out how to do this. Well, just like Stanford's site had like its NAV navigation bar,
using a UL, but they changed it from being a bulleted list to being left to right, I bet I can do something
like this myself.

- [9:59:48](https://www.youtube.com/watch?v=undefined&t=71988s) So let me go into the body of my


page and, first, based on Bootstrap's documentation, let me add a div with a class of
container-fluid. Container-fluid is just a class that comes with Bootstrap that says, make your web page
fluid, that is, grow to fill the window. So that way it's going to resize nicely.

- [0:00:07](https://www.youtube.com/watch?v=undefined&t=72007s) I'm going to go ahead and fix my
indentation here. If you haven't discovered this yet, if you highlight multiple lines in VS Code, you can hit
Tab and indent them all at once. So now, I have all of that inside of this div. Now, just like in Stanford's
site, let's create an unordered list that has maybe an LI with a class of nav-item, and then in here,
whoops, in here, let me go ahead and say, a href equals

- [0:00:43](https://www.youtube.com/watch?v=undefined&t=72043s) https://about.google, which is the real URL of


Google's about page. And I'll put the about text in there. Then I'm going to close my LI tag here, and I
want to do one other thing, because I'm using Bootstrap. Bootstrap's documentation, if I read it closely,
says to add classes to your links, called nav-link and text-dark, to make it dark, like black or dark gray,
instead of the default blue.

- [0:01:05](https://www.youtube.com/watch?v=undefined&t=72065s) All right, so I think I have now an


about link in a navigation part of my screen. Let me go ahead and save this and reload. All right, so not
exactly what I wanted. It's a bulleted list, still, so I need to override this somehow. Let me read
Bootstrap's documentation a little more clearly. And let me pretend to do that, for time's sake.

- [0:01:25](https://www.youtube.com/watch?v=undefined&t=72085s) If I go under content, oops, if I go


under components, and I go to Navs and Tabs, long story short, if you want to create a pretty menu like
this, where your links are from the left to the right, just like Stanford, I essentially need HTML like this.
And this is subtle, but I left off this class. I should have added a class called nav on my UL.

- [0:01:46](https://www.youtube.com/watch?v=undefined&t=72106s) So that was my bad. Let me go in


here and add class equals nav, and then again, this class nav-item, Bootstrap told me to, nav-link
text-dark, Bootstrap told me to. Let me go back to my page here, reload, and OK, still kind of ugly. But at
least the About link is in the top left hand corner, just like it should be in the real google.com.

- [0:02:08](https://www.youtube.com/watch?v=undefined&t=72128s) Now let me whip up a couple of


more links real fast. Let me go and do a little copy/paste, though I bet next week we can avoid this kind
of copy/paste. Let me change this link to be store.google.com. The text will be store. Let me go ahead
and create another one here for Gmail. So this one's going to go to, officially, how about, technically it's
www.google.com/gmail.

- [0:02:36](https://www.youtube.com/watch?v=undefined&t=72156s) Normally it just redirects. And let


me grab one more of these. And for Google Images, and I'm going to paste this, whoops, I'm going to,
come on. I'm going to put this here, too. This is going to be images, and imghp is the URL.
All right, let me go ahead and reload the browser page.

- [0:02:56](https://www.youtube.com/watch?v=undefined&t=72176s) Now it's coming along, right?


About, store, Gmail, images. It's not quite what I want. So I'd have to read the documentation to figure
out how to maybe nudge one of these over, to start right aligning it. And there's a couple of ways to do
this. But one way is if I want Gmail to move all the way over and push everything else, I can add
some margin to the Gmail list item, with ms-auto, margin start auto.

- [0:03:23](https://www.youtube.com/watch?v=undefined&t=72203s) This is in Bootstrap's


documentation, a way of saying whatever space you have, just automatically shove everything apart.
And now, if I reload the page again, now, voila, Gmail and images are over to the right. All right, so now
we're kind of moving along. Let me go ahead and add the big blue button to sign in.

- [0:03:40](https://www.youtube.com/watch?v=undefined&t=72220s) So here with sign in, let me go
ahead and, over in my same nav, yeah, so let's go ahead and do one more LI, class equals nav-item. And
then, inside of this LI tag, what am I going to do? Turns out there is a class that can turn a link into a
button, if you say btn, for button, and then btn-primary makes it blue. The HREF for this one is going
to be

- [0:04:05](https://www.youtube.com/watch?v=undefined&t=72245s) https://accounts.google.com/ServiceLogin, which


is literally where you go if you click on that big blue button. The role of this link is that of button. And
then sign in, is going to be the text on it. If I now reload the page, now we're getting even closer,
although it looks a little stupid. Notice that sign in is way in the top right hand corner, whereas the real

- [0:04:26](https://www.youtube.com/watch?v=undefined&t=72266s) google.com has a little bit of margin


around it? OK, that's an easy fix, too. Let me go back into my HTML here. Let me add m-3. This, too,
is a Bootstrap thing. They have a class called m-something. The something is a number from like 1 to 5, I
believe, that adds just some amount of white space. So if I reload now, OK, it's just a little prettier.
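Pulling the navigation steps together, the sketch below assembles the list as a string so the class names can be seen in one place. The Bootstrap classes (`nav`, `nav-item`, `nav-link`, `text-dark`, `ms-auto`, `btn`, `btn-primary`, `m-3`) are the ones named in the walkthrough. The helper names, the `pushRight` flag, the exact spelling of the sign-in path, and putting `m-3` on the UL are assumptions made for illustration.

```javascript
// Links in the order they were added. `pushRight` is a made-up flag that
// marks the item which should carry ms-auto (margin start auto).
const links = [
  { text: "About", href: "https://about.google/" },
  { text: "Store", href: "https://store.google.com/" },
  { text: "Gmail", href: "https://www.google.com/gmail/", pushRight: true },
  { text: "Images", href: "https://www.google.com/imghp" },
];

// One list item containing one dark-colored nav link.
function navItem(link) {
  const liClass = link.pushRight ? "nav-item ms-auto" : "nav-item";
  return (
    '<li class="' + liClass + '">' +
    '<a class="nav-link text-dark" href="' + link.href + '">' + link.text + "</a>" +
    "</li>"
  );
}

// The whole list, ending with a link styled as a blue button.
// Assumption: m-3 is placed on the UL to add margin around the whole bar.
function navBar(items, signInHref) {
  const signIn =
    '<li class="nav-item">' +
    '<a class="btn btn-primary" href="' + signInHref + '" role="button">Sign in</a>' +
    "</li>";
  return '<ul class="nav m-3">' + items.map(navItem).join("") + signIn + "</ul>";
}

console.log(navBar(links, "https://accounts.google.com/ServiceLogin"));
```

Generating the items from an array is one way to avoid the copy/paste mentioned earlier, though the lecture itself writes the HTML by hand.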

- [0:04:46](https://www.youtube.com/watch?v=undefined&t=72286s) And now let me accelerate. Just


to demonstrate how I can take this home, let me go ahead and open up my premade version of this,
whereby I added to this some final flourishes. If I go to Search2.html, I decided to replace their logo with
just this photo of a cat, and notice that I re-implemented essentially google.com.

- [0:05:08](https://www.youtube.com/watch?v=undefined&t=72308s) Here's a text box, here's two


buttons, even though they're a little washed out on the screen. I even figured out how to get dots that
look pretty similar to Google's. And if we view source, you can see how I kind of finished this code. If I go
to view developer tools, and I go to elements, and I go into this div, and I go into this div, you'll see that
here's an image tag for happy cat.

- [0:05:31](https://www.youtube.com/watch?v=undefined&t=72331s) And I added some classes there to


make it fluid, and width 25% of the screen. If I go into the form tag, this is the same form tag as before.
But, notice, I used button tags this time, with btn and btn-light classes. And then I stylized them in
a certain way. And so in the end result, if I want to go ahead and search now for birds, and click Google
search, voila, I've implemented something that's pretty darn close to

- [0:05:55](https://www.youtube.com/watch?v=undefined&t=72355s) Google.com, without even touching raw


CSS myself. And now here's the value, then, of a framework. You can just start to use off the shelf
functionality that someone else created for you. But if you want to make refinements, you don't really
like the shade of blue that Bootstrap chose, or the gray button, or you want to curve things a bit more,
that's where you can create your own CSS file, and do the last mile, sort of fine tuning things.

- [0:06:16](https://www.youtube.com/watch?v=undefined&t=72376s) And that tends to be best


practice. Stand on the shoulders of others as much as you can, using libraries. And then if you really
don't like what the library is doing, then use your own skills and understanding of HTML and CSS to
refine things a bit further. But still, after all of that, all of these examples we've done thus far are still
static, other than the Google one, which searches on the real Google.com.

- [0:06:38](https://www.youtube.com/watch?v=undefined&t=72398s) Let's take a final 5 minute break


and we'll give you a sense of what we can next do, next week onward, with JavaScript. See you in five. All
right, so I think it's fair to say, we're about to see our very last language. Next week and final projects are
ultimately going to be about synthesizing so many of these.

- [0:06:55](https://www.youtube.com/watch?v=undefined&t=72415s) Thankfully, this language called


JavaScript is quite similar syntactically to both C and Python. And, indeed, if you can imagine doing
something in either of those, you can probably do it in some form in JavaScript. The most fundamental
difference today, though, is that when you have written C code and Python code thus far, you've done it
on the server.

- [0:07:12](https://www.youtube.com/watch?v=undefined&t=72432s) You've done it in the terminal


window environment. And when you run the code, it's running in the cloud on the server. The difference
now today with JavaScript is, even though you're going to write it in the cloud using VS Code, recall that,
when a browser gets the page containing this code, it's going to get a copy of the HTML, the CSS, and the
JavaScript code.

- [0:07:32](https://www.youtube.com/watch?v=undefined&t=72452s) So JavaScript, that we see today,


is all going to be executed in the browser, on users' own Macs, PCs, and phones, not in the server.
JavaScript can be used on the server, using an environment called Node.js. It's an alternative to Python
or Ruby or Java or other languages. We are using it today client side, which is a key difference.

- [0:07:51](https://www.youtube.com/watch?v=undefined&t=72471s) So in Scratch, let's do this one last


time. If you wanted to create a variable in Scratch, you'd set counter equal to 0. In JavaScript, it's going to
look like this. You don't specify the type, but you do use the keyword let, and there's a few others as
well, that say let counter equal 0 semicolon.

- [0:08:08](https://www.youtube.com/watch?v=undefined&t=72488s) If you want to increment that


variable by one, you in JavaScript could say something like, counter equals counter plus 1, or you can do
it more succinctly, with plus equals, or the plus plus is back in JavaScript. You can now say counter plus
plus semicolon again. In Scratch, if you wanted to do a conditional like this, asking if x less than y, it looks
pretty much like C.

- [0:08:31](https://www.youtube.com/watch?v=undefined&t=72511s) The parentheses are,


unfortunately, back. The curly braces here are back, if you have multiple statements in particular. But,
syntactically, it's pretty much the same as it was for if, for if else, and even for if, else if, else. Unlike
Python, it's two words again, else if. So quite, quite like C, nothing new beyond that.

- [0:08:49](https://www.youtube.com/watch?v=undefined&t=72529s) If you want to do something


forever in Scratch, you'd use this block. In JavaScript, you can do it a few ways, similar to Python, similar
to C, you just say while true. In JavaScript, Booleans are lowercase again, just like in C. So it's lowercase
true. If you want to do something a finite number of times, like repeat three times, looks almost like C as
well.

- [0:09:08](https://www.youtube.com/watch?v=undefined&t=72548s) The only difference, really, is using


the word let here, instead of int. And, again, you'll use let to create a string, or an int, or any other type
of variable in JavaScript. The browser will figure out what type you mean from context. In C we would
have said int instead. And that's it for our tour of JavaScript syntax.
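The syntax tour above, written out as a short sketch. The variable names other than `counter`, and the `break` inside the `while (true)` loop, are additions so that the example finishes when run.

```javascript
// Declare a variable with `let`; no type is specified.
let counter = 0;

// Three equivalent ways to add 1.
counter = counter + 1;
counter += 1;
counter++;

// Conditionals use parentheses and curly braces, and `else if` is two words.
function compare(x, y) {
  if (x < y) {
    return "x is less than y";
  } else if (x > y) {
    return "x is greater than y";
  } else {
    return "x is equal to y";
  }
}

// Repeat three times: like C, but with `let` in place of `int`.
let repeats = 0;
for (let i = 0; i < 3; i++) {
  repeats++;
}

// `while (true)` would loop forever; lowercase `true`, as in C.
let spins = 0;
while (true) {
  spins++;
  if (spins === 5) {
    break; // added only so this sketch terminates
  }
}

console.log(counter, compare(1, 2), repeats, spins);
```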
- [0:09:28](https://www.youtube.com/watch?v=undefined&t=72568s) There's bunches of other
features, but syntactically it's going to be that accessible, relatively speaking. The power of JavaScript
running in the user's browser is going to be that you can change this thing in memory. Think about most
any website, that's at all interesting today, that you use.

- [0:09:43](https://www.youtube.com/watch?v=undefined&t=72583s) It's typically very interactive and


dynamic. If you're sitting in front of Gmail on a laptop or desktop with the browser tab open, and
someone sends you an email, all of a sudden, another row appears in your inbox, another row, another
row. How is that implemented? Honestly, it could be an HTML table.

- [0:09:59](https://www.youtube.com/watch?v=undefined&t=72599s) Maybe it's a bunch of divs top to


bottom. The point, though, is, you don't have to hit Command R or Control R to reload the page to see
more email. It automatically appears every few seconds or minutes. How is that working? When you visit
Gmail.com, you are downloading not just HTML and CSS with your initial inbox, presumably.

- [0:10:18](https://www.youtube.com/watch?v=undefined&t=72618s) You're downloading some


JavaScript code, that is designed to keep talking every second, every 10 seconds or something, to Gmail
servers, and they, then, are using their code to add another element, another element, another element,
to the existing DOM, document object model, which is the fancy term for tree in memory that
represents HTML, so that the web page can continue to update in real time.

- [0:10:42](https://www.youtube.com/watch?v=undefined&t=72642s) Google Maps, same thing. If you


click and drag and drag and drag, your browser did not download the entire world to your Mac or PC by
default. It only downloaded what's in your viewport, the rectangular region. But when you click and
drag, it's going to get some more tiles up there, some more images, some more images, as you keep
dragging, using JavaScript, again, behind the scenes.

- [0:11:01](https://www.youtube.com/watch?v=undefined&t=72661s) So let's actually use JavaScript to


start interacting with pages. How can we do this? We can put the JavaScript code in the head of the
page, in the body of the page, or even factor it out to a separate file. So let's take a look. Here is a new
version of Hello.html, that, during the break, I just added a form to, because it'd be nice if this page
didn't just say Hello, title, Hello, body, it said, Hello, David, Hello, Carter, Hello, whoever uses it.

- [0:11:26](https://www.youtube.com/watch?v=undefined&t=72686s) I've got a form that I borrowed


from some of our earlier code, and that form has an input whose ID is name, that also has a submit
button. But there's no code in this yet. So let's add a little bit of JavaScript code as follows. Suppose that,
when this form is submitted, I want to greet the user.

- [0:11:43](https://www.youtube.com/watch?v=undefined&t=72703s) How can I do that? Well, let's do it


the somewhat messy way first. I can add an attribute called onsubmit to the form element, and I can say
onsubmit, call the function called greet, close quotes. Unfortunately, this function doesn't yet exist. But I
can make it exist. But there's another detail here.

- [0:12:01](https://www.youtube.com/watch?v=undefined&t=72721s) When the user clicks submit,


normally forms get submitted to the server. I don't want to do that today. I want to just submit the form
to the browsers, keep on the same page, and just print to the screen, Hello, David, or so forth. So I'm
also going to go ahead and say, return false. And this is a JavaScript way of telling the browser, even
when the user tries to submit the form, return false.

- [0:12:22](https://www.youtube.com/watch?v=undefined&t=72742s) Like, no, don't let them actually
submit the form. But do call this function called greet. In the head of my page, I'm going to add a script
tag, wherein the language is implicitly JavaScript, and has no relationship, for those of you who took
APCS with Java, just a similarly named language, but no relation, I'm going to name a function called
Greet.

- [0:12:42](https://www.youtube.com/watch?v=undefined&t=72762s) Apparently in JavaScript, the way


you create a function is you literally say the word function instead of def. You don't specify a return type.
And in this function, I could do something like this, alert quote unquote, how about, Hello, there. Initially
I'm going to keep it simple, using a built-in function called alert, which is not a good user interface.

- [0:13:03](https://www.youtube.com/watch?v=undefined&t=72783s) There are better ways to do this.


But we're doing something simple first. Let me now go ahead and load this page again. It still looks as
simple as before, with just a simple text box. I'll zoom in to make it bigger. I'm going to type my name,
but I think it's going to be ignored when I click Submit.

- [0:13:17](https://www.youtube.com/watch?v=undefined&t=72797s) It just says, Hello, there. And this


is, again, this is an ugly user interface. It literally says the whole codespace URL of the web page is saying
this to you. It's really just meant for simple interactions like this, for now. All right, let's have it say Hello,
David, somehow. Well, how can I do this? Well, if this element on the page was given by me a unique ID,
it'd be nice if, just like in CSS, I can go grab the value of that text box, using code.

- [0:13:42](https://www.youtube.com/watch?v=undefined&t=72822s) And I actually can. Let me go


ahead and do this. Let me store, in a variable called name, the result of calling a special function called
document.querySelector. This querySelector function is JavaScript's version of what we were doing in
CSS, to select nodes, using hashes or dots or other syntax. It's the same syntax.

- [0:14:04](https://www.youtube.com/watch?v=undefined&t=72844s) So if I want to select the element


whose unique ID is name, I can literally just pass, in single or double quotes, hash name, just like in CSS.
That gives me the actual node from the tree. It gives me one of these rectangles from the DOM, the
document object model. If I actually want to get at the specific value therein, I need to go one step
further and say .value.

- [0:14:27](https://www.youtube.com/watch?v=undefined&t=72867s) So, similar in spirit to Python,


where we saw a lot of dot notation, where you can go inside an object, inside of an object, that's what's
going on. Long story short, in JavaScript, there is a special global variable called document, that lets you
just do stuff with the document, the web page itself.

- [0:14:42](https://www.youtube.com/watch?v=undefined&t=72882s) One of those functions is called


querySelector. That function returns to you whatever it is you're selecting. And dot value means go
inside of that rectangle, and grab the actual text that the human typed in. So if I want to now say, Hello,
to that person, the syntax is a little different from C and Python.

- [0:15:01](https://www.youtube.com/watch?v=undefined&t=72901s) I can use concatenation, which


actually does exist in Python, but we didn't use it much. I can go ahead and say hello, quote unquote
"Hello," plus name. All right, now, if I go back to the browser window, reload the page, to get the latest
version of the code, type in David, and click Submit, now I see, Hello, David.
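A sketch of the greet function as described: select the input by its ID, read its value, and concatenate. The `#name` selector and the `alert` call come from the walkthrough. Splitting the string-building into a separate `greeting` helper is an addition, made so that part can be run outside a browser.

```javascript
// Concatenation with +, kept as its own small function so it can be
// checked without a web page.
function greeting(name) {
  return "Hello, " + name;
}

// Meant to run in a browser when the form is submitted: grabs the node
// whose id is "name", goes inside it with .value, and shows the result.
function greet() {
  let name = document.querySelector("#name").value;
  alert(greeting(name));
}

console.log(greeting("David"));
```

Here `greet` is only defined, not called, because `document` and `alert` exist only inside a browser.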
- [0:15:20](https://www.youtube.com/watch?v=undefined&t=72920s) Not the best website, but it does
demonstrate how I can start to interact with the page. But let me stipulate that this co-mingling of
languages is never a good thing. It's fine to use classes, but using style equals quote unquote and a
whole bunch of CSS, that was not going to scale well, once you have lots and lots of properties.

- [0:15:39](https://www.youtube.com/watch?v=undefined&t=72939s) Same here, once you have more


and more code, you don't want to just put your code inside of this onsubmit handler. So there's a better
way. Let's get rid of that onsubmit attribute, and literally never use it again. That was for
demonstration's sake only. And let's do this. Let me move the script tag, actually, just below the form,
but still inside the body, so that the script tag exists only after the form tag exists, logically.

- [0:16:04](https://www.youtube.com/watch?v=undefined&t=72964s) Just like in Python, your code is


read top to bottom, left to right. And let me now do this. Let me define this function called Greet, and
then let me do this, document.querySelector, let me select the form on the page. It doesn't have a
unique ID. It doesn't need to. I can just reference it by name, form, because there's only one of them.

- [0:16:24](https://www.youtube.com/watch?v=undefined&t=72984s) And let me call this special


function, addEventListener. This is a function that listens for events. Now this is actually a term of art
within programming. Many different languages are governed by events. And pretty much any user
interface is governed by events, especially phones. On phones, you have touches, and you have drags,
and you have long press, and you have pinch, and all of these other gestures.

- [0:16:47](https://www.youtube.com/watch?v=undefined&t=73007s) On your Mac or PC you have click,


you have drag, you have key down, key up, as you're moving your hands up and down on the keyboard.
This is a non-exhaustive list of all of the events that you can listen for in the context of web
programming. And this might be a throwback to Scratch, where, recall, Scratch let you broadcast events.

- [0:17:04](https://www.youtube.com/watch?v=undefined&t=73024s) And we had the two puppets sort


of talking to one another via Events. In the world of web programming, game programming, any human
physical device these days, they're just governed by events. And you write code that listens for these
events happening. So what do I want to listen for? Well, I want to add an event listener for the Submit
event.

- [0:17:21](https://www.youtube.com/watch?v=undefined&t=73041s) And when that happens, I want to


call the Greet function, like this. So this is kind of interesting. Thank you, I have my Greet function as
before, no changes. But I'm adding one line of code down here. I'm telling the browser to use
document.querySelector to select the form. Then I'm adding an event listener, specifically for the Submit
event.

- [0:17:43](https://www.youtube.com/watch?v=undefined&t=73063s) And when that happens, I call


Greet. Notice I am not using parentheses after Greet. I don't want to call Greet right away. I want to tell
the browser to call Greet, when it hears this Submit event. Now let me go ahead and deliberately, I think,
trip over something here, let me type in my name, David, submit, and there we go.
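To illustrate passing Greet without parentheses, this sketch uses a plain `EventTarget` as a stand-in for the form, so it can run without a page. In the browser version, `form` would instead be `document.querySelector("form")`. The `greeted` counter and the manual `dispatchEvent` call are additions that simulate the user clicking Submit.

```javascript
// Stand-in for the form element. An assumption for this sketch only;
// real form elements are also event targets.
const form = new EventTarget();

// Counts how many times greet has actually run.
let greeted = 0;

function greet() {
  greeted++;
  console.log("Hello, there");
}

// Note: greet, not greet(). The function itself is handed over so it can
// be called later, when the event happens.
form.addEventListener("submit", greet);
console.log("before submit:", greeted);

// Simulate the submit event that a browser would fire.
form.dispatchEvent(new Event("submit"));
console.log("after submit:", greeted);
```

Writing `greet()` with parentheses would call the function immediately and register its return value, which is `undefined`, so nothing would run on submit.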

- [0:18:08](https://www.youtube.com/watch?v=undefined&t=73088s) All right, Hello, David. All right,


but let's now make this slightly better designed. Right now, I'm defining a function Greet, which is fine.
But I'm only using it in one place. And you might recall, we stumbled on this in Python, where I was like,
why are we creating a special function called get value when we're only using it like one line later? And
we introduced what type of function in Python the other day? AUDIENCE: Lambda.

- [0:18:31](https://www.youtube.com/watch?v=undefined&t=73111s) SPEAKER 1: Yeah, so lambda


functions, anonymous functions. You can actually do this in JavaScript as well. If I want to define a
function all at once, I can actually do this. Let me cut this onto my clipboard, paste it over here. Let me
fix all of the alignment. Let me get rid of the name. And I can actually, now, do this.

- [0:18:51](https://www.youtube.com/watch?v=undefined&t=73131s) The syntax is a little weird. But


using now just these four lines of code, I can do this. I can tell the browser to add an event listener for
the Submit event. And then when it hears that, call this function that has no name. And unlike Python,
this function can have multiple lines, which is actually a nice thing.

- [0:19:08](https://www.youtube.com/watch?v=undefined&t=73148s) It looks a little weird. There's a lot


of indentation and curly braces going on now. But you can think of this as just being, run these two lines of
code, when the form is submitted. But if I want to block the form from actually being submitted, I've got
to do one other thing. And you would only know this from being told it or reading the documentation.

- [0:19:25](https://www.youtube.com/watch?v=undefined&t=73165s) I need to call this function, preventDefault,


passing in this e argument, which is a variable that represents the event, more on that another
time, that just allows us to prevent whatever the default handling of that particular event is. So long
story short, this is representative of the type of code you might write in JavaScript, whereby you can
actually interact, via your code, with the user's actual form.

- [0:19:48](https://www.youtube.com/watch?v=undefined&t=73188s) And we can do interesting things,


too. Built into browsers nowadays is functionality like this. So here's a very simple example, that has just
three buttons in it, one red, one green, one blue. Well, it turns out using JavaScript, you can control the
CSS of a page programmatically. I can change the background of the body of the page to red, to green, to
blue, just by listening for clicks on these buttons, and then changing CSS properties.

- [0:20:12](https://www.youtube.com/watch?v=undefined&t=73212s) Just to give you a taste of this, if I


view the page's source, similar code here, I can select the red button by an ID that I apparently defined
on it, right up here. I can add an event listener, this time not for submit, but for click. And when it's
clicked, I execute this one line of code. And this one line of code we haven't seen before, but you can go
into the body of the page, its style property, and you can change its background color to red.

- [0:20:36](https://www.youtube.com/watch?v=undefined&t=73236s) This is one example of two


different groups not talking to one another in advance. In CSS, properties that have two words are
usually hyphenated, like background-color. Unfortunately, in JavaScript, if you do something dash
something, that's subtraction, which is logically nonsensical here. So CSS's background-color becomes,
in JavaScript, backgroundColor, where you capitalize the C, and you get rid of the minus sign.

- [0:21:02](https://www.youtube.com/watch?v=undefined&t=73262s) What else can we do here? Well,


back in the day, there used to be a blink tag. And it's one of the few historical examples of a tag that was
removed from HTML, because in the late '90s, early 2000s, this is what the web looked like. There was a
lot of this kind of stuff. There was even a marquee that would move text from left to right over the
screen.
- [0:21:20](https://www.youtube.com/watch?v=undefined&t=73280s) And the web was a very ugly
place. I will admit, my very first web page probably used both of these tags. But how can we bring it
back? Well, this is a version of the blink tag implemented in JavaScript. How? I wrote some code in this
example that, every 500 milliseconds, changes the CSS of the page to be visible, invisible, visible,
invisible, because built into JavaScript is support for a clock.

- [0:21:43](https://www.youtube.com/watch?v=undefined&t=73303s) So you can just do something on


some sort of schedule. Let me go ahead and open up this example, autocomplete. So let me zoom back
out. In autocomplete.html, which I whipped up as an example, there's just a text box, but I also grabbed the
dictionary from problem set 5, speller, so that if I want to search for something like apple, this searches
those 140,000 words, using JavaScript, to create what we know in the world of the web as autocomplete.

- [0:22:07](https://www.youtube.com/watch?v=undefined&t=73327s) When you start searching for


something, you should start to see words that start with that phrase. And sure enough, if I search for
something like banana, here's the three variants of bananas that appear in that file, and so forth. How is
that working? Just JavaScript, when it finds matching words, it's just updating the DOM, the tree in the
computer's memory, to show more and more text, or less.

- [0:22:28](https://www.youtube.com/watch?v=undefined&t=73348s) And for one final example, this is


how programs like DoorDash and Google Maps and Uber Eats and so on work. You have built into browsers
today some fancy APIs, application programming interfaces, whereby you can ask for information about
the user's device. For instance, here, I wrote a program, in geolocation.html,

- [0:22:48](https://www.youtube.com/watch?v=undefined&t=73368s) that's apparently asking to


know my location. All right, let me go ahead and allow it this time, if that's something you're comfortable
with on your own device. It's taking a moment, because sometimes these things take a little while to
analyze. But, hopefully, in just a moment, there are apparently my GPS coordinates, and as a final
flourish today, for what you can do with a little bit of HTML for your structure, CSS for your style, and
now JavaScript for your logic, which we'll tie in again next week, let me go ahead

- [0:23:13](https://www.youtube.com/watch?v=undefined&t=73393s) and search Google for those GPS


coordinates. Zoom in here on Google Maps, and if we zoom in, in, in, OK, we're pretty close. We're not
on that street, but there, oh, there it is, actually. There is the marker it had put for us. We're indeed here
in Memorial Hall. So all that with JavaScript, but the basic understanding of the DOM and the document
object model, we'll pick up where we left off next week.

- [0:23:36](https://www.youtube.com/watch?v=undefined&t=73416s) And now add a back-end. See you


next time. [MUSIC PLAYING] DAVID: All right.

- [0:24:57](https://www.youtube.com/watch?v=undefined&t=73497s) So this is CS50 and this is week


nine, and this is it in terms of programming fundamentals. Today, we come rather full circle with so many
of the languages that we've been looking at over the past several weeks. And with HTML and CSS and
JavaScript last week, we're going to add back into the mix, Python and SQL.

- [0:25:15](https://www.youtube.com/watch?v=undefined&t=73515s) And with that, do we have the


ability to program for the web. And even though this isn't the only user interface out there, increasingly,
people are certainly using laptops and desktops and a browser to access applications that people have
written, but it's also, increasingly, the way that mobile apps are written as well.
- [0:25:31](https://www.youtube.com/watch?v=undefined&t=73531s) There are languages called Swift
for iOS, there are languages called Java for Android, but coding applications in both of those languages
means knowing twice as many languages, building twice as many applications, potentially. So we're
increasingly seeing, for better or for worse, that the world is starting to really standardize, at least for the
next some number of years, on HTML, CSS, and JavaScript coupled with other languages like Python and
SQL on the so-called backend.

- [0:25:55](https://www.youtube.com/watch?v=undefined&t=73555s) And so today, we'll tie all of those


together and give you the last of the tools in your toolkit with which to tackle final projects to go off into
the real world, ultimately, and somehow solve problems with programming. But we need an additional
tool today, and we've sort of outgrown HTTP server.

- [0:26:11](https://www.youtube.com/watch?v=undefined&t=73571s) This is just a program that comes


on certain computers that you can install for free, happens to be written in a language called JavaScript,
but it's a program that we've been using to run a web server in VS Code. But you can run it on your own
Mac or PC or anywhere else. But all this particular HTTP server does is serve up static content like HTML
files, CSS files, JavaScript files, maybe images, maybe video files, but just static content.

- [0:26:37](https://www.youtube.com/watch?v=undefined&t=73597s) It has no ability to really interact


with the user beyond simple clicks. You can create a web form and serve it visually using HTTP server, but
if the human types input into a form and clicks Submit, unless you submit it elsewhere to something
like google.com like we did last time, it's not actually going to go anywhere because this server can't
actually process the requests that are coming in.

- [0:26:59](https://www.youtube.com/watch?v=undefined&t=73619s) So today, we're going to introduce


another type of server that comes with Python that allows us to not only serve web pages but also
process user input. And recall that all that input is going to come ultimately from the URL, or more
deeply inside of those virtual envelopes. So here's the canonical URL we talked about last week for a
random website like www.example.com.

- [0:27:20](https://www.youtube.com/watch?v=undefined&t=73640s) And I've highlighted the slash to


connote the root of the web server, like the default folder where, presumably, there's a file called
index.html or something else in there. Otherwise, you might have a more explicit mention of the actual
file named file.html. You can have folders, as you probably gleaned from the most recent problem set.

- [0:27:38](https://www.youtube.com/watch?v=undefined&t=73658s) You can have files in folders like


this, and these are all examples of what a programmer would typically call a path. So it might not just be
a single word, it might have multiple slashes and multiple folders and some folders and files. But this is
just more generally known as a path. But there's another term of art, that's essentially equivalent, that
we'll introduce today.

- [0:27:56](https://www.youtube.com/watch?v=undefined&t=73676s) This is also synonymously called a


route, which is maybe a better generic description of what these things are because it turns out they
don't have to map to, that is, refer to a specific folder or a specific file, you can come up with your own
routes in a website. And just make sure that when the user visits that, you give them a certain web
page.
- [0:28:15](https://www.youtube.com/watch?v=undefined&t=73695s) If they visit something else, you
give them a different web page. It doesn't have to map to a very specific file, as we'll soon see. And if you
want to get input from the user, just like Google does, like q=cats, you can add a question mark at the
end of this route, then the key, or the HTTP parameter name that you want to define for yourself, and then
an equals sign and some value that, presumably, the human typed in.

- [0:28:35](https://www.youtube.com/watch?v=undefined&t=73715s) If you have more of these, you


can put an ampersand, and then more key=value pairs, ampersand, repeat, repeat, repeat. The
catch, though, is that using the tools that we had last week alone, we don't really have the ability to
parse, that is, to analyze and extract things like q=cats. You could have appended ?q=cats
or anything else to any of the URLs in your home page for problem set eight, but it doesn't
actually do anything useful, necessarily, unless you use some fancy JavaScript.

- [0:29:06](https://www.youtube.com/watch?v=undefined&t=73746s) The server is not going to bother


even looking in that for you. But today, we're going to introduce using a bit of Python. And in fact, we're
going to use a web server implemented in Python, instead of using HTTP server alone, to automatically,
for you, look for any key value pairs after the question mark and then hand them to you in the form of a
Python dictionary.
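
To make the idea of "key value pairs after the question mark, handed to you as a dictionary" concrete, here is a minimal sketch that uses only Python's standard library rather than Flask. The URL and the `lang` parameter are made up for illustration; this is not how Flask does it internally, just the same kind of parsing done by hand.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only for illustration.
url = "https://www.example.com/search?q=cats&lang=en"

parts = urlsplit(url)
path = parts.path    # the route, "/search"
query = parts.query  # everything after the "?", "q=cats&lang=en"

# parse_qs returns a dict mapping each key to a list of values,
# because the same key may appear more than once in a query string.
params = parse_qs(query)
q = params["q"][0]

print(path, params, q)
```

A framework saves you from writing even this much: it does the splitting for every request and gives you a dictionary-like object to look keys up in.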

- [0:29:26](https://www.youtube.com/watch?v=undefined&t=73766s) Recall that a dictionary in Python,


a dict object, is just key value pairs. That seems like a perfect fit for these kinds of parameters. And
you're not going to have to write that code yourself. It's going to be handed to you by way of what's
called a framework. So this will be the second of two frameworks, really, that we look at in the class.

- [0:29:43](https://www.youtube.com/watch?v=undefined&t=73783s) And a framework is essentially a


bunch of libraries that someone else wrote and a set of conventions, therefore, for doing things. So
those of you who really started dabbling with Bootstrap this past week to make your home pages
prettier and nicely laid out, you are using a framework. Why? Well, you're using libraries, code that
someone else wrote, like all the CSS, maybe some of the JavaScript that the Bootstrap people wrote for
you.

- [0:30:05](https://www.youtube.com/watch?v=undefined&t=73805s) But it's also a framework in the


sense that you have to go all in. You have to use Bootstrap's classes, and you have to lay out your divs or
your spans or your table tags in a sort of Bootstrap-friendly way. And it's not too onerous, but you're
following conventions that a bunch of humans standardized on.

- [0:30:22](https://www.youtube.com/watch?v=undefined&t=73822s) So similarly, in the world of


Python, is there another framework we're going to start using today. And whereas Bootstrap is used for
CSS and JavaScript, Flask is going to be used for Python. And it just solves a lot of common problems for
us. It's going to make it easier for us to analyze the URLs and get key value pairs, it's going to make it
easier for us to find files or images that the human wants to see when visiting our website.

- [0:30:45](https://www.youtube.com/watch?v=undefined&t=73845s) It's even going to make it easier to


send emails automatically, like when someone fills out a form. You can dynamically, using code, send
them an email as well. So Flask, and with it some related libraries, it's just going to make stuff like that
easier for us. And to do this, all we have to do is adhere to some pretty minimalist requirements of this
framework.
- [0:31:05](https://www.youtube.com/watch?v=undefined&t=73865s) We're going to have to create a
file for ourselves called app.py, this is where our web app or application is going to live. If we have any
libraries that we want to use, the convention in the Python world is to have a very simple text file called
requirements.txt where you list the names of those libraries, top to bottom, in that text file, similar in
spirit to the include or the import statements that we saw in C and Python, respectively.

- [0:31:30](https://www.youtube.com/watch?v=undefined&t=73890s) We're going to have a static folder


or static directory, which means any files you create that are not ever going to change, like images, CSS
files, JavaScript files, they're going to go in this folder. And then lastly, any HTML that you write, web
pages you want the human to see, are going to go in a folder called templates.

- [0:31:48](https://www.youtube.com/watch?v=undefined&t=73908s) So this is, again, evidence of what


we mean by a framework. Do you have to make a web app like this? No, but if you're using this particular
framework, this is what people decided would be the human conventions. If you've heard of other
frameworks like Django or ASP.NET or bunches of others, there are just different conventions out there
for creating applications.

- [0:32:06](https://www.youtube.com/watch?v=undefined&t=73926s) Flask is a very nice


microframework in that that's it. All you have to do is adhere to these pretty minimalist requirements to
get some code up and running. All right, so let's go ahead and make a web app. Let me go ahead and
switch over to VS Code here, and let me practice what I'm preaching here by first creating app.py.

- [0:32:25](https://www.youtube.com/watch?v=undefined&t=73945s) And let's go ahead and create an


application that very simply, maybe, says hello to the user. So something that, initially, is not all that
dynamic, pretty static, in fact. But we'll build on that as we've always done. So in app.py, what I'm going
to do first is exactly the line of code I had on the screen earlier.

- [0:32:43](https://www.youtube.com/watch?v=undefined&t=73963s) from flask import Flask, with a


capital F second and a lowercase f first. And I'm also going to preemptively import a couple of functions,
render_template and request. More on those in just a bit. And then below that, I'm going to say, go
ahead and do this. Give me a web-- a variable called app that's going to be the result of calling the Flask
function and passing into it this weird incantation here, __name__.

- [0:33:10](https://www.youtube.com/watch?v=undefined&t=73990s) So we've seen this a few weeks


back when we played around with Python and we had that if main thing at the bottom of the screen. For
now, just know that __name__ refers to the name of the current file. And so this line here, simple as it is,
tells Python, hey, Python, turn this file into a Flask application.

- [0:33:30](https://www.youtube.com/watch?v=undefined&t=74010s) Flask is a function that just figures


out, then, how to do the rest. The last thing I'm going to do for this very simple web application is this.
I'm going to say that I'm going to have a function called index that takes no arguments. And whenever
this function is called, I want to return the results of rendering a template called index.html.

- [0:33:50](https://www.youtube.com/watch?v=undefined&t=74030s) And that's it. So let's assume


there's a file somewhere, haven't created it yet, called index.html. But render_template means render
this file, that is, print it to the user's screen, so to speak. The last thing I'm going to do is I have to tell
Flask when to call this index function. And so I'm going to tell it to define a route for, quote unquote,
"/".
- [0:34:12](https://www.youtube.com/watch?v=undefined&t=74052s) And that's it. So let's take a look
at what I just created here. This is slightly new syntax, and it's really the only weirdness that we'll have
today in Python. This is what's known in Python as a decorator. A decorator is a special type
of function that modifies, essentially, another function.
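
As a rough illustration of what a route decorator might be doing, here is a toy sketch in plain Python. It is not Flask's implementation; the names `routes`, `route`, and `handle` are invented for this example. The point is only that a decorator receives the function written below it and can, for instance, remember which path that function should answer.

```python
# Toy illustration only: this is NOT how Flask is implemented.
routes = {}  # maps a path such as "/" to the function that handles it


def route(path):
    """Return a decorator that registers a function under `path`."""
    def decorator(func):
        routes[path] = func
        return func  # hand the function back unchanged
    return decorator


@route("/")
def index():
    return "hello, world"


def handle(path):
    """Look up the function registered for `path` and call it."""
    handler = routes.get(path)
    if handler is None:
        return "404 Not Found"
    return handler()


print(handle("/"))
```

Writing `@route("/")` above `index` is shorthand for `index = route("/")(index)`, which is why the function ends up in the `routes` dictionary without ever being called directly.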

- [0:34:27](https://www.youtube.com/watch?v=undefined&t=74067s) For our purposes, just know that


on line six this says, hey Python, define a route for slash, the default page on my web application. The
next two lines, seven and eight, say, hey Python, define a function called index, takes no arguments. And
the only thing you should ever do is return render_template of, quote unquote, "index.html".

- [0:34:47](https://www.youtube.com/watch?v=undefined&t=74087s) All right, so that's it. So really,


the next question, naturally, should be all right, well, what is in index.html? Well, let me go ahead and do
that next. Let me create a directory called templates, practicing, again, what I preached earlier. So I'm
going to create a new empty directory called templates, I'm going to go and cd into that directory and
then do code index.html.

- [0:35:12](https://www.youtube.com/watch?v=undefined&t=74112s) So here is going to be my index


page. And I'm going to do a very simple web page, DOCTYPE html. I'm just going to borrow some stuff
from last week. HTML language equals English. I'll close that tag. I'll then do a head tag, I'll do a meta tag,
the name of which is viewport. This, recall, makes my site responsive.

- [0:35:28](https://www.youtube.com/watch?v=undefined&t=74128s) That is, it just grows and shrinks to


fit the size of the device. The initial scale for which is going to be one, and the width of which is going to
be device-width. So I'm typing this out, I have it printed here. This is stuff I typically copy paste. But then
lastly, I'm going to add in my title, which will just be hello for the name of this app.

- [0:35:45](https://www.youtube.com/watch?v=undefined&t=74145s) And then the body-- whoops,


Bobby. The body of this tag will be-- there we go. The body of this page, rather, will just be hello comma
world. So very uninteresting and really a regression to where we began last week. But let's go now and
experiment with these two files. I'm not going to bother with a static folder right now, because I don't
have any other files that I want to serve up.

- [0:36:07](https://www.youtube.com/watch?v=undefined&t=74167s) No images, no CSS, nothing like


that. And honestly, requirements.txt is going to be pretty simple. I'm going to go requirements.txt and
just say make sure the system has access to the Flask library itself. All right, but that's the only thing we
can add in there for now. All right, so now I have two files, app.py, and I have index.html.

- [0:36:27](https://www.youtube.com/watch?v=undefined&t=74187s) But index.html-- thank you-- is inside


of my templates directory. So how do I actually start a web server? Last week, I would have said HTTP
server. But HTTP server is not a Python thing. It has no idea about Flask or Python or anything I just
wrote. HTTP server will just spit out static files. So if I ran HTTP server, and then I clicked on app.py,

- [0:36:47](https://www.youtube.com/watch?v=undefined&t=74207s) I would literally see my Python


code. It would not get executed because HTTP server is just for static content. But today, I'm going to run
a different command called flask run. So this framework Flask that I actually preinstalled in advance, so it
wasn't strictly necessary that I create that requirements.txt
- [0:37:05](https://www.youtube.com/watch?v=undefined&t=74225s) file just yet, comes with a
program called flask, takes command line arguments like the word run, and when I do that, you'll see
somewhat similar output to last week whereby you'll see the name-- your URL for your unique preview
of that. You might see a pop up saying that your application is running on TCP port, something or other.

- [0:37:22](https://www.youtube.com/watch?v=undefined&t=74242s) By default, last week, we used


port 8080. Flask, just because, prefers port 5000. So that's fine too. I'm going to go ahead and open up
this URL now. And once it authenticates and redirects me, just to make sure I'm allowed to access that
particular port, let me zoom in. Voila, there's the extent of this application.

- [0:37:40](https://www.youtube.com/watch?v=undefined&t=74260s) If I view source by right-clicking or


control clicking, there's my HTML that's been spit out. So really, I've just reinvented the wheel from last
week because there's no dynamism now, nothing at all. But what if I do this? Let me close the source
and let me zoom out. So you can see my URL bar.

- [0:37:56](https://www.youtube.com/watch?v=undefined&t=74276s) Let me zoom in now, and I have a


very unique cryptic URL. But the point is that it ends with nothing. Or implicitly, it ends with slash. This is
just Chrome being a little helpful. It doesn't bother showing you a slash, even though it's implicitly there.
But let me do something explicit like ?name=David.

- [0:38:15](https://www.youtube.com/watch?v=undefined&t=74295s) So there's a key value pair that


I've manually typed into my URL bar and hit Enter. Nothing happens, nothing changes. It still says hello,
world. But the opportunity today is to now, dynamically, get at the input from that URL and start
displaying it to the user. So let me go back over here to my terminal window and code.

- [0:38:35](https://www.youtube.com/watch?v=undefined&t=74315s) Let me move that down to the


bottom there. And what if I want to say, huh, hello, name. I ideally want to say something like-- I don't
want to hard code David because then it's never going to say hello to anyone else. I want to put like a
variable name here, like name should go here. But it's not an HTML tag, so I need some kind of
placeholder.

- [0:38:55](https://www.youtube.com/watch?v=undefined&t=74335s) Well, here's what I can do. If I go


back to my Python code, I can now define a variable called name. And I can ask Flask to go into the
current request, into its arguments, that is in the URL, as they're called, and get whatever the value of
the parameter called name is. That puts that into a variable for me.

- [0:39:17](https://www.youtube.com/watch?v=undefined&t=74357s) And then, in render_template--


this is one of those functions that can take more than one argument. If it takes another argument, you
can pass in the name of any variable you want. So if I want to pass in my name, I can literally say name
equals name. So this is the name of a variable I want to give to the template.

- [0:39:35](https://www.youtube.com/watch?v=undefined&t=74375s) This is the actual variable that I


want to get the value from. And now lastly, in my index.html, the syntax as of today in Flask, is to do two
curly braces and then put the name of the variable that you want to plug in. So here's what we mean by
a template. A template is like a blueprint in the real world, where it's plans to make something.

- [0:39:59](https://www.youtube.com/watch?v=undefined&t=74399s) This is the plan to make a web


page that has all of this code literally, but there's this placeholder with two curly braces here and here
that says go ahead and plug in the value of the name variable right there. So in this sense, it's similar in
spirit to our f strings or format strings in Python.
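
The comparison with f-strings can be sketched side by side. Flask's templates are handled by a template engine called Jinja; the `fill` function below is only a toy stand-in written for this example, and it supports nothing beyond swapping `{{ name }}`-style placeholders for values.

```python
import re

name = "David"

# An f-string fills its placeholder immediately, in the Python source file.
greeting = f"hello, {name}"

# A template is plain text stored elsewhere; placeholders are filled later.
template = "hello, {{ name }}"


def fill(template, **values):
    """Replace each {{ key }} with the matching value (toy stand-in, not Jinja)."""
    def substitute(match):
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


page = fill(template, name=name)
print(greeting)
print(page)
```

The difference that matters is where the text lives: the f-string is part of the Python code, while the template is a separate HTML file that Python code fills in on each request.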

- [0:40:15](https://www.youtube.com/watch?v=undefined&t=74415s) The syntax is a little different just


because reasonable people disagree, different people, different frameworks come up with different
conventions. The convention in Flask, in their templates, is to use two curly braces here. The hope is that
you, the programmer, will never want to display two curly braces in your actual web page.

- [0:40:32](https://www.youtube.com/watch?v=undefined&t=74432s) But even if you do, there's a


workaround. We can escape that. So now let me go ahead and go back to my browser tab here.
Previously, even though I added name equals David to the end of the URL with a question mark, it still
said hello, world. But now, hopefully, if I made these changes, let me go ahead and open up my terminal
window.

- [0:40:51](https://www.youtube.com/watch?v=undefined&t=74451s) Let me restart Flask so it loads my


changes by default. Let me go back to my hello tab and click reload so it grabs the page anew from the
server. And there we go, hello, David. I can play around now and I can change the URL up here to, for
instance, Carter. Zoom out, hit Enter. And now we have something more dynamic.

- [0:41:11](https://www.youtube.com/watch?v=undefined&t=74471s) So the new pieces here are, in


Python, we have some code here that allows us to access, programmatically, everything that's after the
question mark in the URL. And the only thing we have to do for that is call this function request.args.get. You
and I don't have to bother figuring out where is the question mark, where is the equal sign, where are
the ampersands, potentially.
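
Pulling the spoken steps together, the app at this point might look roughly like the sketch below. It is a reconstruction from the description, not a copy of the lecture's files. One deliberate difference: to keep the example in a single file, the HTML is inlined and rendered with Flask's `render_template_string`, whereas the lecture keeps it in templates/index.html and calls `render_template("index.html", name=name)`. The constant name `INDEX_HTML` is invented for this sketch.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inlined stand-in for templates/index.html (an assumption made so that
# this sketch fits in one file).
INDEX_HTML = """<!DOCTYPE html>
<html lang="en">
    <head>
        <meta name="viewport" content="initial-scale=1, width=device-width">
        <title>hello</title>
    </head>
    <body>
        hello, {{ name }}
    </body>
</html>"""


@app.route("/")
def index():
    # request.args holds the key=value pairs after the "?" in the URL.
    name = request.args.get("name")
    # Left "name": the variable the template sees. Right "name": this value.
    return render_template_string(INDEX_HTML, name=name)
```

Saved as app.py and started with `flask run`, visiting the route with `?name=David` appended should produce a page whose body says hello, David.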

- [0:41:32](https://www.youtube.com/watch?v=undefined&t=74492s) The framework, Flask, does all of


that for us. OK, any questions then on these principles thus far? Yeah, in back. AUDIENCE: Why do you
need the question mark in the URL? DAVID: Why do you need a question mark in the URL? The short
answer is just because that is where key value pairs must go. If you're making a GET request from a
browser to a server, the convention, standardized by the HTTP protocol, is to put them in the URL after
the so-called route or path, then a question mark.

- [0:42:11](https://www.youtube.com/watch?v=undefined&t=74531s) And it delineates what's part of


the route or the path, and what's part of the human input to the right. Other questions? Yeah. AUDIENCE:
Can you go over again why the left and right in the [INAUDIBLE]? DAVID: Sure. This is this annoying
thing about Python. When you pass in parameters to functions that have names, you typically say
something equals something else.

- [0:42:32](https://www.youtube.com/watch?v=undefined&t=74552s) So let me make a slight tweak


here. How about I say name of person here. This allows me to invent my own variable for my template
and assign it the value of name. I now, though, have to go into my index file and say name of person-- did
I get that right? Name of person, yeah. So these two have to match. And so this is just stupid because it's
unnecessarily verbose.

- [0:43:00](https://www.youtube.com/watch?v=undefined&t=74580s) So what typically people do is


they just use the same name as the variable itself, even though it looks admittedly stupid, but it has two
different roles. The thing to the left of the equal sign is the name of the variable you plan to use in the
template, the thing on the right is the actual value you're assigning it.
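
The two roles of the word on each side of the equals sign can be seen without Flask at all. In this small sketch, `render` is a made-up stand-in that simply returns whatever named arguments it was given, so you can inspect what a template would receive.

```python
def render(template_name, **context):
    """Toy stand-in for a template renderer: return the variables it was given."""
    return context


name = "David"

# Left of "=": the name the template will see. Right of "=": the Python value.
same = render("index.html", name=name)
renamed = render("index.html", name_of_person=name)
fixed = render("index.html", name="Emma")

print(same, renamed, fixed)
```

In the `renamed` case the template would have to use `{{ name_of_person }}`, and in the `fixed` case it would show Emma regardless of the URL, which matches the two experiments in the lecture.
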
- [0:43:16](https://www.youtube.com/watch?v=undefined&t=74596s) And this is because it's general
purpose. I could override this and I could say something like name always equals Emma, no matter what
that variable is. And now if I go back to my browser and reload, no matter what's in the URL, David or
Carter, it's always-- OK, Emma broke the server. What did I do? Oh, I didn't change my template back.

- [0:43:37](https://www.youtube.com/watch?v=undefined&t=74617s) There we go. Let me change that


back to be name, so that it's name there and it's name here. But I've hardcoded Emma's name, so now
we're only ever going to see Emma no matter whose name is in the URL. That's all. All right, so this is bad
user interface if, in order to get a greeting for the day, you, the user, have to manually change the URL,
which none of us ever do.

- [0:43:57](https://www.youtube.com/watch?v=undefined&t=74637s) This is not how web pages work.


What is the more normal mechanism for getting input from the user and putting it in that URL
automatically? How did we do that last week? With Google, if you recall. AUDIENCE: We have the search
bar and we [INAUDIBLE] you have to make something in there [INAUDIBLE]. DAVID: OK, so we did make
something in order to get the input from the user.

- [0:44:26](https://www.youtube.com/watch?v=undefined&t=74666s) And specifically, what was the tag


or the terminology we used last week? AUDIENCE: [INAUDIBLE]. DAVID: Sorry, a little louder? Oh, no. But
yeah. AUDIENCE: Is it input? DAVID: So the input tag, inside of the form tag. So in short, forms are, of
course, how the web works and how we typically get input from the user, whether it's a button or a text
box or a dropdown menu or something else.

- [0:44:47](https://www.youtube.com/watch?v=undefined&t=74687s) So let's go ahead and add that


into the mix here. So let's enhance this hello app to do a little something more by, this time, just doing
this. Let me get rid of this name stuff and let me just have a very simple index.html file that, by default, is
going to simply ask the user for some input as follows.

- [0:45:06](https://www.youtube.com/watch?v=undefined&t=74706s) I'm going to go back into my


index.html, and instead of printing out the user's name, this is the page I'm going to use to actually get
input from the user. So I'm going to create a form tag. The method I'm going to use for now is going to
be, quote unquote, "get." Then, inside of that form, I'm going to have an input tag.

- [0:45:23](https://www.youtube.com/watch?v=undefined&t=74723s) And I'm going to turn off


autocomplete like we did last week. I'm going to turn on autofocus, so it puts the cursor in the text box
for me. I'm going to give the name of this input the name, name. Not to be too confusing, but I'm asking
the human for their name. So it makes sense that the name of the input should be, quote unquote,
"name."

- [0:45:40](https://www.youtube.com/watch?v=undefined&t=74740s) The placeholder I want the


human to see in light gray text will be Name with a capital N, just so it's a little grammatical. And then
type of this text fiel-- type of this input is going to be text. Then I'm just going to give myself, like last
week, a submit button. And I don't care what it says, it's just going to say the default submit terminology.

- [0:45:57](https://www.youtube.com/watch?v=undefined&t=74757s) Let me go ahead, now, and open


up my terminal window again. Let me go to that same URL so that I can see-- whoops. There we go. So
that was just cached from earlier. Let me go back to that same URL, my githubpreview.dev URL, and
here I have the form. And now, I can type in anything I want. The catch, though, is when I click Submit,
where is it going to go? Well, let's be explicit.

- [0:46:25](https://www.youtube.com/watch?v=undefined&t=74785s) It does have a default value, but


let me go into my index.html and let me add an action, just like we did last week for Google. Whereas
previously, I said something like www.google.com/search, today we're not going to rely on some
third party. I'm going to implement the so-called backend, and I'm going to have the user submit this
form to a second route, not just slash, how about /greet.

- [0:46:47](https://www.youtube.com/watch?v=undefined&t=74807s) I can make it up, whatever I want.


Greet feels like a nice operative word, so /greet is where the user will be sent when they click Submit on
this form. All right, so let's go ahead now and go back to my browser tab. Let me go ahead, actually, and
let me reload Flask here so that it reloads all of my changes.
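
The form described so far might look roughly like the markup below. It is held in a Python string here so that it can be checked with the standard library's `html.parser`. The attribute order and the `<input type="submit">` button are assumptions; only the attributes named in the lecture are taken from it, and `TagCollector` is a hypothetical helper.

```python
from html.parser import HTMLParser

# Approximation of the form in index.html, as described in the lecture.
INDEX_FORM = """
<form action="/greet" method="get">
    <input autocomplete="off" autofocus name="name" placeholder="Name" type="text">
    <input type="submit">
</form>
"""


class TagCollector(HTMLParser):
    """Record every opening tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


collector = TagCollector()
collector.feed(INDEX_FORM)

form_attrs = collector.tags[0][1]   # attributes of <form>
text_input = collector.tags[1][1]   # attributes of the text box

print(form_attrs)
print(text_input)
```

The `name` attribute on the text box is what later becomes the key in the submitted key/value pair.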

- [0:47:05](https://www.youtube.com/watch?v=undefined&t=74825s) Let me reload this tab so that I get


the very latest HTML and, indeed, quick safety check. If I view page source, we indeed see that my
browser has downloaded the latest HTML. So it definitely has changed. Let's go ahead and type in David.
And when I click Submit here, what's going to happen? Hypotheses.

- [0:47:23](https://www.youtube.com/watch?v=undefined&t=74843s) What's going to happen visually,


functionally, however you want to interpret when I click Submit. Yeah? AUDIENCE: [INAUDIBLE] an
empty page. DAVID: OK, the user's going to go to an empty page. Pretty good instinct, because nowhere
else have I mentioned /greet; it doesn't seem to exist. How's the URL going to change, just to be
clear? What's going to appear, suddenly, in the URL? Yeah? AUDIENCE: 404? DAVID: 404? No, not in the
URL.

- [0:47:54](https://www.youtube.com/watch?v=undefined&t=74874s) Specifically in the URL,


something's going to get added automatically when I click. AUDIENCE: The key value pair? DAVID: The
key value pair, right. That's how forms work. That's why our Google trick last week worked. I sort of
recreated a form on my own website. And even though I didn't get around to implementing google.com

- [0:48:09](https://www.youtube.com/watch?v=undefined&t=74889s) itself, I can still send the


information to Google just relying on browsers standardizing-- to your question earlier-- that whenever
you submit a form, it automatically ends up after a question mark in the URL if you're using GET. So
both of you are right, this is going to break. And all three of you are right, in effect, 404 not found.
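
As a rough illustration of what the browser does with a GET form, the standard library can build and then take apart the same kind of URL. The `fields` dictionary stands in for whatever the user typed; nothing here is Flask-specific.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# What the user typed into the input whose name attribute is "name".
fields = {"name": "David"}

# On submit, the browser appends the key/value pairs after a question mark.
url = "/greet?" + urlencode(fields)

# The server side can split the URL back into a path and its parameters.
parts = urlsplit(url)
params = parse_qs(parts.query)

print(url)
print(parts.path, params)
```
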

- [0:48:27](https://www.youtube.com/watch?v=undefined&t=74907s) You can see it in the tab here.


That's the error that has come back. But what's interesting, and most important, the URL did change.
And it went to /greet?name=david. So I just, now, need to add some logic that actually looks for that
so-called route. So let me go back to my app.py. Let me define another route for, quote unquote,
"slash greet."

- [0:48:49](https://www.youtube.com/watch?v=undefined&t=74929s) And then, inside of-- under this,


let me define another function. I'll call it greet, but I could call it anything I want. No arguments, for now,
for this, and then let me go ahead and do this in my app.py. This time around, I do want to get the
human's name. So let me say request.args.get, quote unquote, "name",
- [0:49:09](https://www.youtube.com/watch?v=undefined&t=74949s) and let me store that in a variable called name.
Then let me return a template, and you know what, I'm
going to give myself a new template, greet.html. Because this has a different purpose, it's not a form. I
want to say hello to the user in this HTML file, and I want to pass, into it, the name that the human just
typed in.
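
A sketch of the route just described. To stay self-contained it renders an inline template string with `render_template_string` instead of the `greet.html` file the lecture goes on to create; that swap, and the template text itself, are assumptions made for illustration.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for templates/greet.html.
GREET_TEMPLATE = "hello, {{ name }}"


@app.route("/greet")
def greet():
    # request.args holds the key/value pairs from the URL's query string.
    name = request.args.get("name")
    return render_template_string(GREET_TEMPLATE, name=name)


# Exercise the route without starting a server.
with app.test_client() as client:
    response = client.get("/greet?name=David")
    body = response.get_data(as_text=True)

print(response.status_code, body)
```
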

- [0:49:28](https://www.youtube.com/watch?v=undefined&t=74968s) All right, so now if I go up and


reload the page, what might happen now? Other logical check here. If I go ahead and hit reload or
resubmit the form, what might happen now? Any instincts? Let me try, so let's try this. Let's go ahead
and reload the page. Previously, it was not found. Now it's worse, and this is the 500 error, internal
server error that I promised last week we would all encounter accidentally, ultimately.

- [0:50:00](https://www.youtube.com/watch?v=undefined&t=75000s) But here we have an internal


server error. Because it's an internal error, this means something's wrong with your code. So the route
was actually found because it's not a 404 this time. But if we go into VS Code here and we look at the
console, the terminal window, you'll see that-- this is actually a bit misleading.

- [0:50:21](https://www.youtube.com/watch?v=undefined&t=75021s) Do I want to do this? Let me


reload this. Let me reload here. Oh, standby. Come on. There we go. Come on. OK, here we have this
error here, and this is where your terminal window is going to be helpful. Into your terminal window, by
default, is typically going to go helpful stuff like a log, L-O-G, of what it is the server is seeing from the
browser.

- [0:50:47](https://www.youtube.com/watch?v=undefined&t=75047s) For instance, here's what the


server just saw in purple. GET /greet?name=david using HTTP version 1.0. Here, though, is the status
code that the server returned, 500. Why, what's the error? Well, here's where we get these annoying,
pretty cryptic Python messages that help50 might ultimately help you with, or here, we might just have a
clue at the bottom.

- [0:51:07](https://www.youtube.com/watch?v=undefined&t=75067s) And this is actually pretty clear,


even though we've never seen this error before. What did I screw up here? I just didn't create
greet.html, right? Template not found. All right, so that must be the last piece of the puzzle. And again,
representative of how you might diagnose problems like these, let me go into my terminal window.

- [0:51:24](https://www.youtube.com/watch?v=undefined&t=75084s) After hitting Control C, which


cancels or interrupts a process, let me go into my templates directory. If I type ls, I only have index.html.
So let's code up greet.html. And in this file let's quickly do doc type. Doc type HTML, open bracket HTML,
language equals English. Inside of this, I'll have the head tag, inside of here, I'll have the meta.

- [0:51:49](https://www.youtube.com/watch?v=undefined&t=75109s) The name is viewport, the


content of which is-- I always forget this too. The content of which is initial scale equals one, width equals
device width. Quote unquote, title is still going to be, I'll call this greet because this is my template. And
then here, in the body, I'm going to have hello comma name.

- [0:52:14](https://www.youtube.com/watch?v=undefined&t=75134s) So I could have kept around the


old version of this, but I just recreated, essentially, my second template. So index.html now is almost the
same, but the title is different and it has a form. greet.html is almost the same, but it does not have a
form. It just has the hello comma name. So let me now go ahead and rerun in the correct directory.
- [0:52:34](https://www.youtube.com/watch?v=undefined&t=75154s) You have to run Flask wherever
app.py is, not in your templates directory. So let me do Flask run to get back to where I was. Let me go
into my other tab. Cross my fingers this time that, when I go back to slash and I get index.html's form,
now I type in David and click Submit, now we get hello, David.

- [0:52:54](https://www.youtube.com/watch?v=undefined&t=75174s) And now we have a full-fledged


web app that has two different routes, slash and /greet, the latter of which takes input like this and then,
using a template, spits it out. But something could go wrong, and let's see what happens here. Suppose I
don't type anything in. Let me go here and just click Submit.

- [0:53:14](https://www.youtube.com/watch?v=undefined&t=75194s) Now, I mean, it looks stupid. So


there's bunches of ways we could solve this. I could require that the user have input on the previous
page, I could have some kind of error check for this. But there's another mechanism I can use that I'll just
show you. It turns out that with this get function, in the context of HTTP and also in general with Python
dictionaries, you can actually supply a default value.

- [0:53:36](https://www.youtube.com/watch?v=undefined&t=75216s) So if there is no name parameter


or no value for a name parameter, you can actually give it a default value like this. So I'll say world, for
instance. Now, let me go back here. Let me type in nothing again and click Submit. And hopefully this
time, I'll do-- oops, sorry. Let me restart Flask to reload the template.

- [0:53:55](https://www.youtube.com/watch?v=undefined&t=75235s) Let me go ahead and type nothing


this time, clicking Submit. And hopefully, we now-- Oh, interesting. I should have faked this. Suppose that
the reason this-- Oh. Suppose I just get rid of name altogether like this and hit Enter. Now I see hello,
world, and this is a subtlety that I didn't intend to get into here.

- [0:54:20](https://www.youtube.com/watch?v=undefined&t=75260s) When you have question mark


name equals nothing, you're passing in what's called-- whoops. When you have greet question mark
name equals nothing, you actually are giving a value to name. It is quote unquote with nothing in
between. That is different from having no value at all. So allow me to just propose that, to catch the
error here, we would want to require this in a different way.
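
The difference between "no name parameter" and "a name parameter that is empty" can be reproduced with a plain dictionary. This is only a sketch: `name_from_query` is a hypothetical helper that imitates the lecture's lookup with a default of world, using the standard library instead of Flask.

```python
from urllib.parse import parse_qs


def name_from_query(query_string):
    """Return the submitted name, or "world" if no name key was sent at all."""
    # keep_blank_values=True keeps "name=" as an empty string instead of dropping it.
    parsed = parse_qs(query_string, keep_blank_values=True)
    params = {key: values[0] for key, values in parsed.items()}
    # The default only applies when the key is missing, not when it is empty.
    return params.get("name", "world")


print(repr(name_from_query("name=David")))
print(repr(name_from_query("name=")))
print(repr(name_from_query("")))
```

Submitting the form with an empty text box corresponds to the second call, which is why the default does not kick in.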

- [0:54:43](https://www.youtube.com/watch?v=undefined&t=75283s) And probably the most robust


way to do this would be to go in here, in my HTML, and say that the name field is required. Now, if I go
back to my form after restarting Flask here, and I go ahead and click reload on my form and type in
nothing and click Submit, now the browser is going to yell at me. But just as a teaser for something we'll
be doing in the next problem set in terms of error checking, you should never, ever, ever rely on client
side safety checks like this.

- [0:55:15](https://www.youtube.com/watch?v=undefined&t=75315s) Because we know, from last week,


that a curious programmer can go to inspect, and let me poke around the HTML here. Let me go into the
body, the form. OK, you say required, I say not required. You can just delete what's in the DOM, in the
browser, and now I can go ahead and submit this form. And it appears to be broken.

- [0:55:34](https://www.youtube.com/watch?v=undefined&t=75334s) Not a big deal with a silly little


greeting application like this. But if you're trying to require that humans actually provide input that is
necessary for the correct operation of the site, you don't want to trust that the HTML is not altered by
some adversary. All right, any questions, then, on this particular app before we add another feature
here? Any questions here? Yeah.
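
One way to avoid trusting the browser is to repeat the check on the server. The lecture defers its own error checking to a later example, so the sketch below is only an assumption about what such a check could look like; `clean_name`, `greeting`, and the error message are hypothetical.

```python
def clean_name(submitted):
    """Return a usable name, or None if the request did not really supply one."""
    if submitted is None:
        # The key was missing entirely.
        return None
    # An empty or whitespace-only value should not count as a name either.
    stripped = submitted.strip()
    return stripped or None


def greeting(submitted):
    """Build the response text, refusing input that fails the server-side check."""
    name = clean_name(submitted)
    if name is None:
        return "error: missing name"
    return f"hello, {name}"


print(greeting("David"))
print(greeting(""))
print(greeting(None))
```

Because this runs on the server, deleting the required attribute in the browser's developer tools does not bypass it.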

- [0:56:00](https://www.youtube.com/watch?v=undefined&t=75360s) AUDIENCE: Do you guys


[INAUDIBLE]. DAVID: Sorry, little louder. In the index function-- AUDIENCE: Oh, sorry. [INAUDIBLE] DAVID:
Sorry? AUDIENCE: [INAUDIBLE] DAVID: Would it be a problem if what? AUDIENCE: You have to
[INAUDIBLE]. DAVID: No. I mean no, this is OK. What you should really do is something we're going to do
with another example where I'm going to start error checking things.

- [0:56:26](https://www.youtube.com/watch?v=undefined&t=75386s) So let me wave my hands at that


and propose that we'll solve this better in just a bit. But it's not bad to do what I just did here, it's only
going to handle one of the scenarios that I was worried about. Not all of them. All right, so even though
this is new to most of us here, consider index.html, my first template, and consider greet.html,

- [0:56:43](https://www.youtube.com/watch?v=undefined&t=75403s) my second template. What


might be arguably badly designed? Even though this might be the first time you've ever touched web
programming like this. What's bad or dumb about this design of these two templates alone? And there's
a reason, too, that I bored us by typing it out that second time. Yeah? AUDIENCE: [INAUDIBLE] you said,
stuff like Notepad and [INAUDIBLE].

- [0:57:11](https://www.youtube.com/watch?v=undefined&t=75431s) DAVID: Yeah, there's so much


repetition. I mean, it was deliberately tedious that I was retyping everything. The doc type, the HTML
tag, the head tag, the title tag. And little things did change along the way, like the title and certainly, the
content of the body. But so much of this, I mean, almost all of the page is a copy of itself in multiple files.

- [0:57:31](https://www.youtube.com/watch?v=undefined&t=75451s) And God forbid we have a third


template, a fourth template, a hundredth template for a really big website. This is going to get very
tedious very quickly. And suppose you want to change something in one place, you're going to have to
change it now in two, three, a hundred different places instead. So just like in programming more
generally, we have this ability to factor out commonalities.

- [0:57:49](https://www.youtube.com/watch?v=undefined&t=75469s) So do you in the context of web


programming, and specifically templating, have the ability to factor out all of those commonalities. The
syntax is going to be a little curious, but it functionally is pretty straightforward. Let me go ahead and do
this. Let me go ahead and copy the contents of index.html. Let me go into my templates directory and
code a file that, by default, is called layout.html.

- [0:58:12](https://www.youtube.com/watch?v=undefined&t=75492s) And let me go ahead, and per


your answer, copy all of those commonalities into this file now instead. So here I have a file called
layout.html. I don't want to give every page the same title, maybe, but for now that's OK. I'm going to
call everything hello. But in the body of the page, what I'm going to do here is just have a placeholder for
actual contents that do change.

- [0:58:35](https://www.youtube.com/watch?v=undefined&t=75515s) So in this layout, I'm going to go


ahead in here and just put in the body of my page, how about this syntax? And this is admittedly new.
Open curly brace, percent sign, block body, and then percent sign, close curly brace. And then I'm going to do endblock. So a curious
syntax here, but this is more template syntax. The other template syntax we saw before was the two
curly braces.
- [0:58:59](https://www.youtube.com/watch?v=undefined&t=75539s) That's for just plugging in values.
There's this other syntax with Flask that allows you to use, say, a single curly brace, a percent sign, and then
some functionality like this defining a block. And this one's a little weird because there's literally nothing
between the close curly and the open curly brace here.

- [0:59:17](https://www.youtube.com/watch?v=undefined&t=75557s) But let's see what this can do for


us. Let me now go into my index.html, which is where I borrowed most of that code from, and let me
focus on what is minimally different. The only thing that's really different in this page, title aside, is the
form. So let me go ahead and just cut that form out to my clipboard.

- [0:59:38](https://www.youtube.com/watch?v=undefined&t=75578s) Let me change the first line of


index.html to say this file is going to extend layout.html, and notice I'm using the curly braces again. And
this file is going to have its own body block inside of which is just the HTML that I actually want to make
specific to this page. And I'll keep my indentation nice and neat here.

- [1:00:02](https://www.youtube.com/watch?v=undefined&t=75602s) And let's consider what I've done.


This is starting to look weird fast, and this is now a mix of HTML with templating code. Index.html, first
line now says, hey, Flask, this file extends layout.html, whatever that is. These next lines, three through 10,
say, hey, Flask, here is what I consider my body block to be.

- [1:00:24](https://www.youtube.com/watch?v=undefined&t=75624s) Plug this into the layout


placeholder. So if I now go back to layout.html, layout.html is almost all HTML by
contrast. But there is this placeholder, and if I want to put a default value, I could say-- whoops. If I want
to put a default value, I could put a default value there just in case some page does not have a body
block.

- [1:00:45](https://www.youtube.com/watch?v=undefined&t=75645s) But in general, that's not going to


be relevant. So this is just a placeholder, albeit a little verbose, that says plug in the page-specific content
right here. So if I go now into greet.html, this one's even easier. I'm going to cut this content and get rid
of everything else. greet.html,

- [1:01:07](https://www.youtube.com/watch?v=undefined&t=75667s) too, is going to extend layout.html--


extends, plural-- and then I'm going to have my body block here simply be this one line of code.
And then I'm going to go ahead and end that block here. These are not HTML tags, this is not HTML
syntax. Technically, the syntax we keep seeing with the curly braces, and these now curly braces with
percent signs, is an example of Jinja syntax, J-I-N-J-A, which is a language that some humans invented
for this purpose of templating.

- [1:01:36](https://www.youtube.com/watch?v=undefined&t=75696s) And the people who invented


Flask decided, we're not going to come up with our own syntax, we're going to use these other people's
syntax called Jinja syntax. So again, there starts to be at this point in the course, and really in computing,
a lot of sharing, now, of ideas and sharing of code.

- [1:01:51](https://www.youtube.com/watch?v=undefined&t=75711s) So Flask is using this syntax, but


other libraries and other languages might, too. All right, so now index.html is half HTML, half
templating code, Jinja syntax. greet.html is almost all Jinja syntax, no tags even, but because they both
extend layout.html, now I think I've improved the design of this thing.
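
A minimal sketch of that layout-plus-child arrangement, using Jinja directly with in-memory templates so it runs without a templates directory. The `DictLoader` setup and the trimmed-down HTML are assumptions for illustration; in the lecture the same template text lives in files that Flask loads.

```python
from jinja2 import DictLoader, Environment

# In-memory stand-ins for the files in templates/.
templates = {
    "layout.html": (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "    <head>\n"
        "        <title>hello</title>\n"
        "    </head>\n"
        "    <body>\n"
        "        {% block body %}{% endblock %}\n"
        "    </body>\n"
        "</html>\n"
    ),
    # The child names its parent, then fills in the body block.
    "greet.html": (
        '{% extends "layout.html" %}\n'
        "{% block body %}hello, {{ name }}{% endblock %}\n"
    ),
}

env = Environment(loader=DictLoader(templates))
page = env.get_template("greet.html").render(name="David")

print(page)
```

The rendered page is plain HTML: the shared markup comes from the layout, and only the block's content comes from the child.
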
- [1:02:15](https://www.youtube.com/watch?v=undefined&t=75735s) If I go back to app.py, none of this
really needs to change. I don't change my templates to mention layout.html, that's already implicit in the
fact that we have the extends keyword. So now if I go ahead and open my terminal window, go back to
the same folder as app.py and do Flask run, all right, my application is running on port 5000.

- [1:02:36](https://www.youtube.com/watch?v=undefined&t=75756s) Let me now go back to the slash route


in my browser and hit Enter, I have this form again. And just as a little check, let me view the source of
the page that my browser is seeing. And there's all of the code. No mention of Jinja, no curly braces, no
percent signs. It's just HTML. It's not quite pretty printed in the same way, but that's fine.

- [1:02:55](https://www.youtube.com/watch?v=undefined&t=75775s) Because now, we're starting to


dynamically generate websites. And by that, I mean this isn't quite indented nicely or perfectly. That's
fine. If it's indented in the source code version, doesn't matter what the browser really sees. Let me now
go ahead and type in my name, click Submit.

- [1:03:08](https://www.youtube.com/watch?v=undefined&t=75788s) I should see, yep, hello, David. Let


me go ahead and view the source of this page. And we'll see almost the same thing with what's plugged
in there. So this is, now, web programming in the literal sense. I did not hard code a page that says hello
comma David, hello comma Carter, hello comma Emma. I hardcoded a page that has a template with a
placeholder, and now I'm using actual logic, some code in app.py,

- [1:03:31](https://www.youtube.com/watch?v=undefined&t=75811s) to actually tell the server what


to send to the browser. All right, any questions, then, on where we're at here? This is now a web
application. Simple though it is, it's no longer just a website. Yeah? AUDIENCE: Is what we did just better
for design or for memory [INAUDIBLE]? DAVID: Is it better for design or for memory? Both.

- [1:03:58](https://www.youtube.com/watch?v=undefined&t=75838s) It's definitely better for design


because, truly, if we had a third page, fourth page, I would really start just resorting to copy paste. And
as you saw with Homepage, often, in the head of your page, you might want to include some CSS files
like Bootstrap or something else. You might want to have other information up there.

- [1:04:13](https://www.youtube.com/watch?v=undefined&t=75853s) If you had to upgrade the version


of Bootstrap or you change libraries, so you want to change one of those lines, you would literally have
to go into three, four, a hundred different files to make one simple change. So that's bad design. And in
terms of memory, yes. Theoretically, the server, because it knows there's this common layout, it can
theoretically do some optimizations underneath the hood.

- [1:04:33](https://www.youtube.com/watch?v=undefined&t=75873s) Flask is probably doing that, but


not in the mode we're using it. We're using it in development mode, which means it's typically reloading
things each time. Other questions on this application? Anything at all? All right, so let me ask a question,
not just in terms of the code design. What about the implications for privacy? Why is this maybe not the
best design for users, how I've implemented this? I've used a web form, but-- Yeah? AUDIENCE: For some
reason, you wanted your name.

- [1:05:06](https://www.youtube.com/watch?v=undefined&t=75906s) So these private people could just


look at the URL. DAVID: Yeah. I mean, if you have a nosy sibling or roommate and they have access to
your laptop and they just go trolling through your autocomplete or your history, like, literally what you
typed into a website is going to be visible. Not a big deal if it's your name, but if it's your password, your
credit card or anything else that's mildly sensitive, you probably don't want it ending up in the URL at all
even if you're in incognito mode or whatnot.

- [1:05:29](https://www.youtube.com/watch?v=undefined&t=75929s) You just don't want to expose


yourself or your users to that kind of risk. So perhaps, we can do better than that. And fortunately, this
one is actually an easy change. Let me go into my index.html where my form is. And in my form, I can
just change the method from GET to POST. It's still going to send key value pairs to the server, but it's not
going to put them in the URL.

- [1:05:53](https://www.youtube.com/watch?v=undefined&t=75953s) The upside of which is that we


can assuage this privacy concern, but I'm going to have to make one other change too. Because now, if I
go ahead and run Flask again after making that change, and I now reload the form to make sure I have
the latest version. You should be in the habit of going to View, Developer, View Source, or Developer
Tools just to make sure that what you're seeing in your browser is what you intend.

- [1:06:15](https://www.youtube.com/watch?v=undefined&t=75975s) And yes, I do see what I wanted.


Method equals POST now. Let me go ahead and type in David and click Submit. Now I get a different
error. This one is HTTP 405, method not allowed. Why is that? Well, in my Flask application, I've only
defined a couple of routes so far. One of which is for slash, and that worked fine.

- [1:06:38](https://www.youtube.com/watch?v=undefined&t=75998s) One of which is for /greet, and


that used to work fine. But apparently, what Flask is doing is it only supports GET by default. So if I want
to change this route to support different methods, I can say, quote unquote "POST" inside of this
parameter here. So that now, I can actually support POST, not just GET.
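
The 405 behavior can be reproduced with Flask's test client. This is a sketch under assumptions: the `/default` route is hypothetical and exists only for comparison, and both functions return a fixed string rather than a template.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/default")
def default_route():
    # No methods argument: the route does not accept POST.
    return "ok"


@app.route("/greet", methods=["POST"])
def greet():
    # Explicitly opted in to POST; GET is no longer accepted here.
    return "ok"


# Send requests without starting a server and record the status codes.
with app.test_client() as client:
    post_to_default = client.post("/default").status_code
    post_to_greet = client.post("/greet").status_code
    get_to_greet = client.get("/greet").status_code

print(post_to_default, post_to_greet, get_to_greet)
```

Listing only POST also turns GET away from that route; a route that should answer both would list both methods.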

- [1:07:02](https://www.youtube.com/watch?v=undefined&t=76022s) And if I now restart Flask, so Flask


run, Enter, and I go back to this URL. Let me go back one screen to the form, reload the page just to make
sure I have the latest even though nothing there has changed. Type David and click Submit now, now I
should see hello, world. Notice that I'm at the greet route, but there's no mention of name equals
anything in the URL.

- [1:07:28](https://www.youtube.com/watch?v=undefined&t=76048s) All right, so that's an interesting


takeaway. It's a simple change, but whereas GET puts things in the URL, POST does not. But it still works
so long as you tweak the backend to look for a POST request, which means look deeper in the envelope.
It's not going to be as simple as looking at the URL itself.
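
To make "deeper in the envelope" concrete, the standard library can build both kinds of request without sending anything. The host and port are placeholders matching the lecture's local server; no network traffic occurs in this sketch.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = urlencode({"name": "David"})

# GET: the key/value pair rides along in the URL itself.
get_request = Request("http://localhost:5000/greet?" + fields)

# POST: the same pair travels in the request body; the URL stays clean.
post_request = Request("http://localhost:5000/greet", data=fields.encode())

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```
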

- [1:07:45](https://www.youtube.com/watch?v=undefined&t=76065s) Why shouldn't we just always use


POST? Why not use POST everywhere? Any thoughts? Right, because it's obnoxious to be putting any
information in URLs if you're leaving these little breadcrumbs in your history and people can poke
around and see what you've been doing. Yeah, what do you think? AUDIENCE: You're supposed to
duplicate [INAUDIBLE].

- [1:08:12](https://www.youtube.com/watch?v=undefined&t=76092s) DAVID: Yeah. I mean, if you get rid


of GET requests and put nothing in the URL, your history, your autocomplete, gets a lot less useful.
Because none of the information is there for storage, so you can't just go through the menu and hit
Enter. You'd have to re-fill out the form. And there's this other symptom that you can see here.
- [1:08:30](https://www.youtube.com/watch?v=undefined&t=76110s) Let me zoom out and let me just
reload this page. Notice that you'll get this warning, and it'll look different in Safari and Firefox and Edge
and Chrome here, confirm form resubmission. So your browser might remember what your inputs were and
that's great, but just while you're on the page. And this is in contrast to GET, where the state
information,

- [1:08:51](https://www.youtube.com/watch?v=undefined&t=76131s) like key value pairs, is embedded


in the URL itself. And if you looked at an email I sent earlier today, I deliberately linked to
https://www.google.com/search?q=what+time+is+it. This is, by definition, a GET request when you click
on it. Because it's going to grab the information, the key value pair, from the URL, send it to Google's
server, and it's just going to work.
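
A link like that one can be assembled by hand, which is what makes GET requests easy to share. A small sketch with the standard library; the search phrase is the one from the lecture.

```python
from urllib.parse import urlencode

# urlencode turns spaces into plus signs, as in the emailed link.
search_url = "https://www.google.com/search?" + urlencode({"q": "what time is it"})

print(search_url)
```
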

- [1:09:18](https://www.youtube.com/watch?v=undefined&t=76158s) And the reason I sent this via


email earlier was I wanted people to very quickly be able to check what is the current time. And so I can
sort of automate the process of creating a Google search for you, but that you induce when you click that
link. If Google did not support GET, they only supported this, the best I could do is send you all to this
URL which, unfortunately, has no useful information.

- [1:09:40](https://www.youtube.com/watch?v=undefined&t=76180s) I would have had to add to my


email, by the way, type in the words what time is it. So it's just bad for usability. So there, too, we might
have design when it comes to the low level code, but also the design when it comes to the user
experience, or UX, as a computer scientist would call it. Just in terms of what you want to optimize for,
ultimately.

- [1:09:58](https://www.youtube.com/watch?v=undefined&t=76198s) So GET and POST both have their


roles. It depends on what kind of functionality you want to provide and what kind of sensitivity there
might be around it. All right, any questions, then, on this, our first web application? Super simple, just
gets someone's name and prints it back out. But we now have all the plumbing with which to create
really most anything we want.

- [1:10:19](https://www.youtube.com/watch?v=undefined&t=76219s) All right, let's go ahead and take a


five minute break. And when we come back, we'll add to this some first year intramural sports. All right,
so we are back. And recall that the last thing we just changed was the route to use POST instead of GET.
So gone is my name and any value in the URL. But there was a subtle bug or change here that we didn't
call out earlier.

- [1:10:44](https://www.youtube.com/watch?v=undefined&t=76244s) I did type David into the form and


I did click Submit, and yet here it is saying hello comma world. So that seems to be broken all of a
sudden, even though we added support for POST. But something must be wrong. Logically, it must be the
case here. Intuitively, that if I'm seeing hello, world, that's the default value I gave the name variable.

- [1:11:07](https://www.youtube.com/watch?v=undefined&t=76267s) It must be that it's not seeing a


key called name in request.args, which is this. Gives you access to everything after the URL. That's
because there's this other thing we should know about, which is not just request.args but request.form.
These are horribly named, but request.args

- [1:11:26](https://www.youtube.com/watch?v=undefined&t=76286s) is for GET requests,


request.form is for POST requests. Otherwise, they're pretty much functionally the same. But the onus is
on you, the user or the programmer, to make sure you're using the right one. So I think if we want to get
rid of the world and actually see what I, the human, typed in, I think I can just change request.args to
request.form.

- [1:11:45](https://www.youtube.com/watch?v=undefined&t=76305s) Still dot get, still quote unquote


"name," and now, if I go ahead and rerun Flask in my terminal window, go back to my browser, go back
to-- and actually, I won't even go back to the form. I will literally just reload, Command R or Control R,
and what this warning is saying is it's going to submit the same information to the website.

- [1:12:05](https://www.youtube.com/watch?v=undefined&t=76325s) When I click Continue, now I


should see hello comma David. So again, you, too, are going to encounter, probably, all these little
subtleties. But if you focus on, really, the first principles of last week, like what is HTTP, how does a GET
request work, how does a POST request work now, you should have a lot of the mental building blocks
with which to solve problems like these.
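As a rough sketch of the bug just described, the two routes below both accept a POST, but only one reads the right place. The route paths `/args` and `/form` and the function names are made up for illustration; the lecture's app has a single greet route. Flask's built-in test client stands in for the browser.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/args", methods=["POST"])
def greet_from_args():
    # Reads the query string, so a POSTed form field is never found here.
    return "hello, " + request.args.get("name", "world")


@app.route("/form", methods=["POST"])
def greet_from_form():
    # Reads the request body, which is where a POSTed form's fields live.
    return "hello, " + request.form.get("name", "world")


# Submit the same form data to both routes, as a browser would via POST.
client = app.test_client()
buggy = client.post("/args", data={"name": "David"}).get_data(as_text=True)
fixed = client.post("/form", data={"name": "David"}).get_data(as_text=True)
print(buggy)
print(fixed)
```

The first route falls back to the default value because nothing was sent in the URL, which matches the "hello, world" seen on screen.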

- [1:12:25](https://www.youtube.com/watch?v=undefined&t=76345s) And let me give you one other


mental model, now, for what it is we're doing. This framework called Flask is just an example of many
different frameworks that all implement the same paradigm, the same way of thinking and the same way
of programming applications. And that's known as MVC, model view controller.

- [1:12:42](https://www.youtube.com/watch?v=undefined&t=76362s) And here's a very simple diagram


that represents the process that you and I have been implementing thus far. And actually, this is more
than we've been implementing thus far. In app.py is what a programmer would typically call the
controller. That's the code you're writing, the so-called business logic that makes all of the decisions,
decides what to render, what values to show, and so forth.

- [1:13:03](https://www.youtube.com/watch?v=undefined&t=76383s) In layout.html, index.html,


greet.html are the so-called view, the templates, that is, the visualizations that the human actually sees, the
user interface. Those things are dumb, they pretty much just say plop some values here. All of the hard
work is done in app.py. So controller, AKA app.py, is where your Python code generally is.

- [1:13:26](https://www.youtube.com/watch?v=undefined&t=76406s) And in your view is where your


HTML and your Jinja code, your Jinja templating, the curly braces, the curly braces with percent signs,
usually is. We haven't added an M to MVC yet. Model, that's going to refer to things like CSV files or
databases. The model, where do you keep actual data, typically long term.

- [1:13:46](https://www.youtube.com/watch?v=undefined&t=76426s) So we'll come back to that, but


this picture, where you have one of these-- each of these components communicating with one another
is representative of how a lot of frameworks work. What we're teaching today, this week, is not really
specific to Python. It's not really specific to Flask, even though we're using Flask.

- [1:14:02](https://www.youtube.com/watch?v=undefined&t=76442s) It really is a very common


paradigm that you could implement in Java, C sharp, or bunches of other languages as well. All right, so
let's now pivot back to VS Code here. Let me stop running Flask, and let me go ahead and create a new
folder altogether after closing these files here. And let me go ahead and create a folder called FroshIMS,
representing freshman intramural sports or first year intramural sports that I can now CD into.
- [1:14:29](https://www.youtube.com/watch?v=undefined&t=76469s) And now I'm going to code an
app.py. And in anticipation, I'm going to create another templates directory. This one in the FroshIMS
folder. And then in my templates directory, I'm going to create a layout.html, and I'm just going to get
myself started here. FroshIMS will go here. I'm just copying my layout from earlier because most of my
interesting work, this time, is now going to be, initially, in app.py.

- [1:14:53](https://www.youtube.com/watch?v=undefined&t=76493s) So what is it we're creating? So


literally, the very first thing I wrote as a web application 20 years ago, was a site that literally looked like
this. So I was like a sophomore or junior at the time. I'd taken CS50 and a follow-on class only. I had no
idea how to do web programming. Neither of those two courses taught web programming back in the
day.

- [1:15:12](https://www.youtube.com/watch?v=undefined&t=76512s) So I taught myself, at the time, a


language called Perl. And I learned a little something about CSV files, and I sort of read enough-- can't
even say googled enough, because Google didn't come out until a couple of years later. Read enough
online to figure out how to make a web application so that students on campus, first years, could
actually register via a website for intramural sports.

- [1:15:32](https://www.youtube.com/watch?v=undefined&t=76532s) Back in my day, you would literally


fill out a piece of paper and then walk it across the yard to Wigglesworth Hall, one of the dorms, slide it
under the door of the proctor or RA, and thus you were registered for sports. So, 1996, 1997. We could
do better by then. There was an internet, just wasn't really being used much on campus or more
generally.

- [1:15:51](https://www.youtube.com/watch?v=undefined&t=76551s) So background images that repeat


infinitely was in vogue, apparently, at the time. All of this was like images that I had to hand make
because we did not have the features that JavaScript and CSS nowadays have. So it was really just HTML,
and it was really just controller code written, not in Python, but in Perl.

- [1:16:10](https://www.youtube.com/watch?v=undefined&t=76570s) And it was really just the same


building blocks that we here already today now have. So we'll get rid of all of the imagery and focus
more on the functionality and the aesthetics, but let's see if we can whip up a web application via which
someone could register for one such intramural sport. So in app.py, let me go ahead and import some
familiar things now.

- [1:16:30](https://www.youtube.com/watch?v=undefined&t=76590s) From Flask, let's import capital


Flask, which is that function we need to kick-start everything; render_template, so we
have the ability to render, that is print out, those templates, and request so that we have the ability to
get at input from the human. Let me go ahead and create the application itself using this magical
incantation here.

- [1:16:49](https://www.youtube.com/watch?v=undefined&t=76609s) And then let's go ahead and


define a route for slash for instance first. I'm going to define a function called index. But just to be clear,
this function could be anything. Foo, bar, baz, anything else. But I tend to name them in a manner that's
consistent with what the route is called. But you could call it anything you want, it's just the function that
will get called for this particular route.
- [1:17:13](https://www.youtube.com/watch?v=undefined&t=76633s) Now, let me go ahead here and
just get things started. Return, render template of index.html. Just keep it simple, nothing more. So
there's nothing really FroshIMS-specific about this here, I just want to make sure I'm doing everything
correctly. Meanwhile, I've got my layout. OK, let me go ahead, and in my templates directory, code a file
called index.html.
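A minimal sketch of the app.py described so far might look like the following. One assumption to flag: the lecture calls render_template on a file named index.html, whereas this sketch uses render_template_string with an inline placeholder so that it runs as a single file without a templates directory.

```python
from flask import Flask, render_template_string, request

# The "magical incantation" that creates the application
app = Flask(__name__)


@app.route("/")
def index():
    # The function name is arbitrary; only the route string ties it to "/".
    # An inline string stands in here for templates/index.html.
    return render_template_string("TODO")


# Flask's test client lets us request the route without starting a server.
response = app.test_client().get("/")
print(response.status_code, response.get_data(as_text=True))
```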

- [1:17:34](https://www.youtube.com/watch?v=undefined&t=76654s) And let's just do extends


layout.html at the top just so that we get benefit from that template. And down here, I'm just going to
say to do. Just so that I have something going on visually to make sure I've not screwed up yet. In my
FroshIMS directory, let me do Flask run. Let me now go back to my previous URL, which used to be my
hello example.

- [1:17:55](https://www.youtube.com/watch?v=undefined&t=76675s) But now, I'm serving up the


FroshIMS site. Oh, and I'm seeing nothing. That's because I screwed up accidentally. What did I do wrong
in index.html? What am I doing wrong? This file extends layout.html, but-- AUDIENCE: You left out the
block tag? DAVID: Yeah. I forgot to tell Flask what to plug into that layout.

- [1:18:23](https://www.youtube.com/watch?v=undefined&t=76703s) So I just need to say block body,


and then in here, I can just say to do or whatever I want to eventually get around to. Then end the block.
Let me end this tag here. OK, so now it looks ugly, more cryptic. But this is, again, the essence of doing
templating. Let me now restart Flask up here, let me go back to the page.
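The blank page can be reproduced outside Flask with Jinja alone. This is a sketch under assumptions: the template names `broken.html` and `fixed.html` are invented, the layout is reduced to a bare body tag, and an in-memory DictLoader replaces the templates directory.

```python
from jinja2 import DictLoader, Environment

# In-memory stand-ins for the files in the templates directory
templates = {
    "layout.html": "<body>{% block body %}{% endblock %}</body>",
    # Extends the layout but never says what goes into the body block
    "broken.html": '{% extends "layout.html" %}\nTODO',
    # Same content, now wrapped in the block the layout expects
    "fixed.html": '{% extends "layout.html" %}\n{% block body %}TODO{% endblock %}',
}

env = Environment(loader=DictLoader(templates))
broken = env.get_template("broken.html").render()
fixed = env.get_template("fixed.html").render()
print(repr(broken))
print(repr(fixed))
```

In a child template, Jinja only keeps content that sits inside a block, which is why the first version shows nothing.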

- [1:18:41](https://www.youtube.com/watch?v=undefined&t=76721s) Let me reload. Crossing my fingers


this time, and there we go. To do. So it's not the application I want, but at least I know I have some of the
plumbing there by default. All right, so if I want the user to be able to register for one of these sports,
let's enhance, now, index.html

- [1:18:56](https://www.youtube.com/watch?v=undefined&t=76736s) to actually have a form that's


maybe got a dropdown menu for all of the sports for which you can register. So let me go into this
template here. And instead of to do, let's go ahead and give myself, how about an H1 tag that just says
register so the user knows what it is they're looking at. How about a form tag that's going to use POST,
just because it's not really necessary to put this kind of information in the URL.

- [1:19:17](https://www.youtube.com/watch?v=undefined&t=76757s) The action for that, how about we


plan to create a register route so that we're sending information from the form to a register route. So we'll have to
come back to that. In here, let me go ahead and create, how about an input with autocomplete equals
off, autofocus on. How about a name equals name, because I'm going to ask the student for their name
using placeholder text of quote unquote "name."

- [1:19:42](https://www.youtube.com/watch?v=undefined&t=76782s) And the type of this box will be


text. So this is pretty much identical to before. But if you've not seen this yet, let's create a select menu,
a so-called dropdown menu in HTML. And maybe the first option I want to be in there is going to be, oh,
how about the current three sports for the fall, which are basketball, and another option is going to be
soccer, and a third option is going to be ultimate frisbee for first year intramurals right now.

- [1:20:13](https://www.youtube.com/watch?v=undefined&t=76813s) So I've got those three options.


I've got my form. I haven't implemented my route yet, but this feels like a good time to go back now and
check if my form has reloaded. So let me go ahead and stop and start Flask. You'll see there's ways to
automate the process of restarting the server that we'll do for you for problem set nine, so you don't
have to keep stopping Flask.

- [1:20:33](https://www.youtube.com/watch?v=undefined&t=76833s) Let me reload my index route and


OK, it's not that pretty. It's not though, maybe-- nor was this. But it now has at least some functionality
where I can type in my name and then type in the sport. Now, I might be biasing people toward
basketball. Like UX wise, user experience wise, it's obnoxious to precheck basketball but not the others.

- [1:20:54](https://www.youtube.com/watch?v=undefined&t=76854s) So there's some little tweaks we


can make there. Let me go back into index.html. Let me create an empty option up here that, technically,
this option is not going to have the name of any sport. But it's just going to have a word I want the
human to see, so I'm actually going to disable this option and make it selected by default.

- [1:21:13](https://www.youtube.com/watch?v=undefined&t=76873s) But I'm going to say sport up


here. And there's different ways to do this, this is just one way of creating, essentially, a-- whoops,
option. Yep, that looks right. Creating a placeholder sport so that the user sees something in the
dropdown. Let me go ahead and restart Flask, reload the page, and now it's just going to be marginally
better.

- [1:21:32](https://www.youtube.com/watch?v=undefined&t=76892s) Now you see sport that's checked


by default, but you have to check one of these other ones ultimately. All right, so that's pretty good. So
let me now type in David. I'll register for ultimate frisbee. OK, I definitely forgot something. Submit
button. So let's add that. All right, so input type equals submit.

- [1:21:53](https://www.youtube.com/watch?v=undefined&t=76913s) All right, let's put that in. Restart


Flask, reload. Getting better. Submit could be a little prettier. Recall that we can change some of these
HTTP-- these HTML attributes. The value of this button should be register, maybe, just to make things a
little prettier. Let me now reload the page and register.

- [1:22:10](https://www.youtube.com/watch?v=undefined&t=76930s) All right, so now we really have


the beginnings of the user interface that I created some years ago to let people actually register for the
sport. So let's go, now, and create maybe the other route that we might need. Let me go into app.py. And
in here, if we want to allow the user to register, let's do a little bit of error checking which I promised
we'd come back to.

- [1:22:29](https://www.youtube.com/watch?v=undefined&t=76949s) What could the user do wrong?


Because assume that they will. One, they might not type their name. Two, they might not choose a
sport. So they might just submit an empty form. So that's two things we could check for, just so that
we're not storing bogus entries in our database, ultimately. So let's create another route called greet,
/greet.

- [1:22:47](https://www.youtube.com/watch?v=undefined&t=76967s) And then in this route, let's create


a function called greet but can be called anything we want. And then let's go ahead, and in the greet
function, let's go ahead and validate the submission. So a little comment to myself here. How about if
there is not a request.form.get("name") value, so that is if that function returns nothing, like quote
unquote, or the special word None in Python.
- [1:23:14](https://www.youtube.com/watch?v=undefined&t=76994s) Or request.form.get("sport") not in
quote unquote, what were they? Basketball, the other one was soccer, and the last was ultimate frisbee.
Getting a little long, but notice what I'm-- the question I'm asking. If the user did not give us a name, that
is, if this function returns the equivalent of false, which is, quote unquote, or literally None if there's no
such parameter.

- [1:23:46](https://www.youtube.com/watch?v=undefined&t=77026s) Or if the sport the user provided


is not some value in basketball, soccer, or ultimate frisbee, which I've defined as a Python list, then let's
go ahead and just yell at the user in some way. Let's return render template of failure.html. And that's
just going to be some error message inside of that file.

- [1:24:06](https://www.youtube.com/watch?v=undefined&t=77046s) Otherwise, if they get this far, let's


go ahead and confirm registration by just returning-- whoops, returning render template quote unquote
"success" dot HTML. All right, so a couple quick things to do. Let me first go in and in my templates
directory, let's create this failure.html file.

- [1:24:28](https://www.youtube.com/watch?v=undefined&t=77068s) And this is just meant to be a


message to the user that they failed to provide the information correctly. So let me go ahead and in
failure.html, not repeat my past mistake. So let me extend layout.html and in the block body, you are not
registered. I'll just yell at them like that so that they know something went wrong.

- [1:24:48](https://www.youtube.com/watch?v=undefined&t=77088s) And then let me create one other


file called success.html, that similarly is mostly just Jinja syntax. And I'm just going to say for now, even
though they're not technically registered in any database, you are registered. That's what we mean by
success. All right, so let me go ahead, and back in my FroshIMS directory, run Flask run.

- [1:25:07](https://www.youtube.com/watch?v=undefined&t=77107s) Let me go back to the form and


reload. Should look the same. All right, so now let me not cooperate and just immediately click Register
impatiently. OK, what did I do wrong. Register-- oh, I'm confusing our two examples. All right, I spotted
the error. What did I do wrong? Unintentional. There's where I am, what did I actually invent over here?
Where did I screw up? Anyone? AUDIENCE: Register, not greet.

- [1:25:46](https://www.youtube.com/watch?v=undefined&t=77146s) DAVID: Thank you. So register, not


greet. I had last example on my mind, so the route should be register. Ironically, the function could be
greet, because that actually doesn't matter. But to keep ourselves sane, let's use the one and the same
words there. Let me go ahead now and start Flask as intended.

- [1:26:00](https://www.youtube.com/watch?v=undefined&t=77160s) Let me reload the form just to


make sure all is working. Now, let me not cooperate and be a bad user, clicking register-- oh my God. OK,
other unintended mistake. But this one we've seen before. Notice that by default, routes only support
GET. So if I want to specifically support POST, I have to pass in, via a methods parameter, a list of allowed
route methods that could be GET comma POST, but if I have no need for a GET in this context, I can
just do POST.
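Putting the two fixes together (the route is /register, and it must allow POST), the validation might be sketched as below. As assumptions for a single-file example, plain strings stand in for failure.html and success.html, and the sport names are capitalized the way they would plausibly appear in the form.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/register", methods=["POST"])
def register():
    # Validate submission: a missing or empty name is falsy, and the sport
    # must be one of the values the form actually offers.
    name = request.form.get("name")
    sport = request.form.get("sport")
    if not name or sport not in ["Basketball", "Soccer", "Ultimate Frisbee"]:
        return "You are not registered."  # stands in for failure.html

    # Confirm registration
    return "You are registered!"  # stands in for success.html


client = app.test_client()
get_status = client.get("/register").status_code
empty = client.post("/register", data={}).get_data(as_text=True)
name_only = client.post("/register", data={"name": "David"}).get_data(as_text=True)
complete = client.post(
    "/register", data={"name": "David", "sport": "Ultimate Frisbee"}
).get_data(as_text=True)
print(get_status, empty, name_only, complete)
```

Because the route lists only POST, a plain GET to /register is rejected with 405 Method Not Allowed, the same error seen a moment ago when the route still allowed only GET.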

- [1:26:32](https://www.youtube.com/watch?v=undefined&t=77192s) All right, now let's do this one last


time. Reload the form to make sure everything's OK, click Register, and you are not registered. So it's
catching that. All right, let me go ahead and at least give them my name. Register. You are not registered.
Fine, I'm going to go ahead and be David with ultimate frisbee register.
- [1:26:50](https://www.youtube.com/watch?v=undefined&t=77210s) Huh. OK. What should I-- what did
I mean to do here? All right, so let's figure this out. How to debug something like this, which is my third
and final unintended, unforced error? How can we go about troubleshooting this? Turn this into the
teachable moment. All right, well first, some safety checks. What did I actually submit? Let me go ahead
and view page source, a good rule of thumb.

- [1:27:21](https://www.youtube.com/watch?v=undefined&t=77241s) Look at the HTML that you


actually sent to the user. So here, I have an input with a name name. So that's what I intended, that
looks OK. Ah, I see it already, even though you, if you've never used a select menu, you might not know
what, apparently, is missing from here that I did have for my text input.

- [1:27:42](https://www.youtube.com/watch?v=undefined&t=77262s) Just intuitively, logically. What's


going through my head, embarrassingly, is, all right, if my form thinks that it's missing a name or a sport,
how did I create a situation in which name is blank or sport is blank? Well, name, I don't think it's going
to be blank because I explicitly gave this text field a name name and that did work last time.

- [1:28:03](https://www.youtube.com/watch?v=undefined&t=77283s) I've now given a second input in


the form of the select menu. But what seems to be missing here that I'm assuming exists here? It's just a
dumb mistake I made. What might be missing here? If request.form gives you all of the inputs that the
user might have typed in, let me go into my actual code here in my form and name equals sport.

- [1:28:30](https://www.youtube.com/watch?v=undefined&t=77310s) I just didn't give a name to that


input. So it exists, and the browser doesn't care. It's still going to display the form to you, it just hasn't
given it a unique name to actually transmit to the server. So now, if I'm not going to put my foot in my
mouth, I think that's what I did wrong.

- [1:28:45](https://www.youtube.com/watch?v=undefined&t=77325s) And again, my process for figuring


that out was looking at my code, thinking through logically, is this right, is this right? No, I was missing
the name there. So let's run Flask, let's reload the form just to make sure it's all defaults again, type in
my name and type in ultimate frisbee, crossing my fingers extra hard this time.

- [1:29:04](https://www.youtube.com/watch?v=undefined&t=77344s) And there. You are registered. So I


can emphasize-- I did not intend to screw up in that way, but that's exactly the right kind of thought
process to diagnose issues like this. Go back to the basics, go back to what HTTP and what HTML forms
are all about, and just rule things in and out. There's only a finite number of ways I could have screwed
that up.

- [1:29:21](https://www.youtube.com/watch?v=undefined&t=77361s) Yeah? AUDIENCE: Are you


[INAUDIBLE]. DAVID: Excuse-- say a little louder? AUDIENCE: I don't understand why name equals sport
[INAUDIBLE].. DAVID: Why did name equal sport address the problem? Well, let's first go back to the
HTML. Previously, it was just the reality that I had this user input dropdown menu, but I never gave it a
name.

- [1:29:45](https://www.youtube.com/watch?v=undefined&t=77385s) But names, or more generally, key


value pairs, is how information is sent from a form to the server. So if there's no name, there's no key to
send, even if the human types a value. It would be like nothing equals ultimate frisbee, and that just
doesn't work. The browser is just not going to send it.
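To make the key-value point concrete, here is a small simulation using only Python's standard library. The helper `encode_form` is hypothetical and only approximates what a browser does: it drops any control that has no name before encoding the rest.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(controls):
    """Mimic a browser: only controls with a name contribute a key=value pair."""
    pairs = [(name, value) for name, value in controls if name]
    return urlencode(pairs)


# The select menu had no name attribute, so its value has no key to travel under.
without_name = encode_form([("name", "David"), (None, "Ultimate Frisbee")])

# After adding name="sport", both pairs are sent.
with_name = encode_form([("name", "David"), ("sport", "Ultimate Frisbee")])

print(without_name)
print(with_name)
print(parse_qs(with_name))
```

On the server side, request.form can only offer the keys that were actually sent, so the lookup for "sport" comes back as None in the first case.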
- [1:30:02](https://www.youtube.com/watch?v=undefined&t=77402s) However, in app.py, I was naively
assuming that in my request.form, there would be a name called quote unquote "sport." It could have
been anything, but I was assuming it was sport. But I never told the form that. And if I really wanted to
dig in, we could do a little something more. Let me go back to the way it was a moment ago.

- [1:30:22](https://www.youtube.com/watch?v=undefined&t=77422s) Let me get rid of the name of the


sport dropdown menu. Let me rerun Flask down here and reload the form itself after it finishes being
served. And now, let me do this. View Developer Tools, and then let me watch the Network tab, which
recall, we played around with a little bit last week. And we also played around with Curl, which let us see
the HTTP requests.

- [1:30:44](https://www.youtube.com/watch?v=undefined&t=77444s) Here's another-- here's what I


would have done if I still wasn't seeing the error and was really embarrassed on stage. I would have
typed in my name as before, I would have chosen ultimate frisbee. I would have clicked register. And
now, I would have looked at the HTTP request. And I would click on Register here.

- [1:31:01](https://www.youtube.com/watch?v=undefined&t=77461s) And just like we did last week, I


would go down to the request down here. And there's a whole lot of stuff that we can typically ignore.
But here, let me zoom in, way at the bottom, what Chrome's developer tools are doing for me, it's
showing me all of the form data that was submitted. So this really would have been my telltale clue.

- [1:31:19](https://www.youtube.com/watch?v=undefined&t=77479s) I'm just not sending the sport,


even if the human typed it in. And logically, because I've done this before, that must mean I didn't give
the thing a name. But another good tool. Like good programmers, web developers are using these kinds
of tools all the time. They're not writing bug-free code.

- [1:31:34](https://www.youtube.com/watch?v=undefined&t=77494s) That's not the point to get to. The


point to get to is being a good diagnostician, I would say, in these cases. OK, other questions on this?
Yeah. AUDIENCE: What if you want to edit one HTML in CSS, [INAUDIBLE].. DAVID: I'm sorry, a little bit
louder? AUDIENCE: If you want to edit in CSS or anything, in HTML, once you have to fix the template,
how do you do that? DAVID: So how would you edit CSS if you have these templates? That process we'll
actually see before long.

- [1:32:08](https://www.youtube.com/watch?v=undefined&t=77528s) It's almost going to be the exact


same. Just to give you a teaser for this, and you'll do this in the problem set, but we'll give you some
distribution code to automate this process. You can absolutely still do something like this. Link href
equals quote unquote "styles" dot CSS rel equals style sheet, that's one of the techniques we showed
last week.

- [1:32:27](https://www.youtube.com/watch?v=undefined&t=77547s) The only difference today, using


Flask, is that all of your static files, by convention, should go in your static folder. So the change you
would make in your layout would be to say that styles dot CSS is in your static folder. And then, if I go
into my FroshIMS directory, I can create a static folder. I can CD into it, nothing's there by default.

- [1:32:49](https://www.youtube.com/watch?v=undefined&t=77569s) But if I now code a file called


styles.css, I could now do something like this body. And in here, I could say background color, say FF0000
to make it red. Let me go ahead now and restart Flask in the FroshIMS directory. Cross my fingers
because I'm doing this on the fly. Go back to my form and reload.
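One way to see where Flask serves the static folder is to ask it for the URL. Note the assumption: the lecture types the path into the layout by hand, while this sketch uses Flask's url_for helper, which was not shown, to compute the same path.

```python
from flask import Flask, url_for

app = Flask(__name__)

# url_for needs a request context; test_request_context provides one
# without starting the server.
with app.test_request_context():
    href = url_for("static", filename="styles.css")

# The link tag that would go in the head of layout.html
link_tag = f'<link href="{href}" rel="stylesheet">'
print(link_tag)
```

By default, files in the static folder are served under the /static path, so the stylesheet's URL is /static/styles.css.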
- [1:33:16](https://www.youtube.com/watch?v=undefined&t=77596s) Voila, now we've tied together
last week's stuff as well. If I answered the right question? AUDIENCE: [INAUDIBLE] change one page and
not the other. DAVID: If you want to change one page and not the other in terms of CSS? AUDIENCE: Yes.
DAVID: That depends. In that case, you might want to have different CSS files for each page if they're
different.

- [1:33:38](https://www.youtube.com/watch?v=undefined&t=77618s) You could use different classes in


one template than you did in the other. There's different ways to do that. You could even have a
placeholder in your layout that allows you to plug in the URL of a specific style sheet in your individual
files. But that starts to get more complicated quickly. So in short, you can absolutely do it.

- [1:33:58](https://www.youtube.com/watch?v=undefined&t=77638s) But typically, I would say most


websites try not to use different style sheets per page. They reuse the styles as much as they can. All
right, let me go ahead and revert this real quick. And let's start to add a little bit more functionality here.
I'm going to go ahead and just remove the static folder just so as to not complicate things just yet.

- [1:34:16](https://www.youtube.com/watch?v=undefined&t=77656s) And let's go ahead and just play


around with a different user interface mechanism. In my form here, the dropdown menu is perfectly
fine. Nothing wrong with it. But suppose that I wanted to change it to checkboxes instead. Maybe I want
students to be able to register for multiple sports instead. Well, it might make sense to clean this up in a
couple of ways.

- [1:34:35](https://www.youtube.com/watch?v=undefined&t=77675s) And let's do this. Before we even


get into the checkboxes, there's one subtle bad design here. Notice that I've hardcoded basketball,
soccer, and ultimate frisbee here. And if you recall, in app.py, I also enumerated all three of those here.
And any time you see copy paste or the equivalent thereof, feels like we could do better.

- [1:34:55](https://www.youtube.com/watch?v=undefined&t=77695s) So what if I instead do this. What


if I instead give myself a global variable of SPORTS, I'll capitalize the word just to connote that it's meant to
be constant even though Python does not have constants, per se. The first sport will be basketball. The
second will be soccer. The third will be ultimate frisbee.

- [1:35:16](https://www.youtube.com/watch?v=undefined&t=77716s) Now I have one convenient place


to store all of my sports if it changes next semester or next year or whatnot. But notice what I could do,
too. I could now do something like this. Let me pass into my index template a variable called sports that's
equal to that global variable SPORTS. Let me go into my index now, and this is really, now, going to hint at
the power of templating and Jinja, in this case here.

- [1:35:42](https://www.youtube.com/watch?v=undefined&t=77742s) Let me go ahead and get rid of all


three of these hard coded options and let me show you some slightly different syntax: for sport in sports.
Then endfor. We've not seen this endfor syntax. There's like endblock syntax, but it's as simple as that.
So you have a start and an end to your block without indentation mattering.

- [1:36:01](https://www.youtube.com/watch?v=undefined&t=77761s) Watch what I can do here. Option


curly brace sport close curly brace. Let me save that. Let me go back into my terminal window, do Flask
run. And if I didn't mess up here, let me go back to this. The red's going to go away because I deleted my
CSS. And now I still have a sport dropdown and all of those sports are still there.
- [1:36:23](https://www.youtube.com/watch?v=undefined&t=77783s) I can make one more
improvement now. I don't need to mention these same sports manually in app.py. I can now just say if
the user's inputted sport is not in my global variable, sports, and ask the same question. And this is really
handy because if there's another sport, for instance, that gets added, like say football, all I have to do is
change my global variable.

- [1:36:44](https://www.youtube.com/watch?v=undefined&t=77804s) And if I reload the form now and


look in the dropdown, boom, now I have support for a fourth sport. And I can keep adding and adding
there. So here's where templating starts to get really powerful in that now, in this template, I'm using
Jinja's for loop syntax, which is almost identical to Python here, except you need the curly brace and the
percent sign and you need the weird ending endfor.
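The refactoring just described can be sketched in one file as follows. The assumptions are that the template is held in an inline string (rendered with render_template_string) instead of templates/index.html, that the layout is omitted, and that plain strings replace failure.html and success.html.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Single source of truth for both the form and the validation
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee", "Football"]

# Inline stand-in for templates/index.html
INDEX = """
<h1>Register</h1>
<form action="/register" method="post">
    <input autocomplete="off" autofocus name="name" placeholder="Name" type="text">
    <select name="sport">
        <option disabled selected value="">Sport</option>
        {% for sport in sports %}
            <option value="{{ sport }}">{{ sport }}</option>
        {% endfor %}
    </select>
    <input type="submit" value="Register">
</form>
"""


@app.route("/")
def index():
    # Pass the global list into the template under the name "sports"
    return render_template_string(INDEX, sports=SPORTS)


@app.route("/register", methods=["POST"])
def register():
    if not request.form.get("name") or request.form.get("sport") not in SPORTS:
        return "You are not registered."
    return "You are registered!"


client = app.test_client()
html = client.get("/").get_data(as_text=True)
result = client.post(
    "/register", data={"name": "David", "sport": "Football"}
).get_data(as_text=True)
print(html.count("<option value="), result)
```

Adding or removing a sport now means editing only the SPORTS list; the dropdown and the check in register follow along.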

- [1:37:07](https://www.youtube.com/watch?v=undefined&t=77827s) But it's the same idea as in


Python. Iterating over something with a for loop lets you generate more and more HTML. And this is like
every website out there. For instance, Gmail. When you visit your inbox and you see all of this big table
of emails, Google has not hardcoded your emails manually. They have grabbed them from a database.

- [1:37:24](https://www.youtube.com/watch?v=undefined&t=77844s) They have some kind of for loop


like this, and are just outputting table row after table row or div after div dynamically. All right, so now,
let's go ahead and change this, maybe, to, oh, how about little checkboxes or radio buttons. So let me go
ahead and do this. Instead of a select menu, I'm going to go ahead and do something like this.

- [1:37:46](https://www.youtube.com/watch?v=undefined&t=77866s) For each of these sports let me go


ahead and output, not an option, but let me go ahead and output an input tag, the name for which is
quote unquote "sport," the type of which is checkbox, the value of which is going to be the current
"sport," quote unquote, and then afterward I need to redundantly, seemingly, output the sport.

- [1:38:10](https://www.youtube.com/watch?v=undefined&t=77890s) So you see a word next to the


checkbox. And we'll look at the result of this in just a moment. So it's actually a little simpler than a
select menu, a dropdown menu, because now watch what happens if I reload my form. Different user
interface, and it's not as pretty, but it's going to allow users to sign up for multiple sports at once now, it
would seem.
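
As a rough sketch of the checkbox version, the same loop can emit one `input` tag per sport. The template string and the `SPORTS` list below are illustrative assumptions, rendered with `jinja2` directly rather than through Flask.

```python
from jinja2 import Template

# Assumed list of sports, standing in for the global variable in app.py.
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]

# Every input shares the name "sport"; only the value differs.
# The sport is printed a second time so a visible label appears next to the box.
CHECKBOXES = Template(
    "{% for sport in sports %}"
    '<input name="sport" type="checkbox" value="{{ sport }}"> {{ sport }}\n'
    "{% endfor %}"
)

html = CHECKBOXES.render(sports=SPORTS)
print(html)
```

Changing `type="checkbox"` to `type="radio"` in this sketch would yield the mutually exclusive variant.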

- [1:38:28](https://www.youtube.com/watch?v=undefined&t=77908s) Now I can click on basketball and


football and soccer or some other combination thereof. If I view the page's source, this is, again, the
power of templating. I didn't have to type out four inputs, I got them now automatically. And these
things all have the same name, but that's OK. It turns out with Flask, if it sees multiple values for the
same name, it's going to hand them back to you as a list if you use the right function.
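
The "right function" is most likely `request.form.getlist`, which Flask provides for exactly this case. The route below is a small sketch under that assumption; the route body and the simulated submission are illustrative and not taken from the lecture's code.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/register", methods=["POST"])
def register():
    # getlist returns every value submitted under the name "sport";
    # request.form.get would return only the first one.
    sports = request.form.getlist("sport")
    return ", ".join(sports)


# Simulate a form submission with two boxes checked.
client = app.test_client()
response = client.post("/register", data={"sport": ["Basketball", "Soccer"]})
body = response.get_data(as_text=True)
print(body)
```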

- [1:38:53](https://www.youtube.com/watch?v=undefined&t=77933s) All right, but suppose we don't


want users registering for multiple sports. Maybe capacity is an issue. Let me go ahead and change this
checkbox to a radio button. Radio buttons are mutually exclusive, so you can only sign up for one. So
now, once I reload the page, there we go. It now looks like this.

- [1:39:12](https://www.youtube.com/watch?v=undefined&t=77952s) And because I've given each of


these inputs the same name, quote unquote, "sport," that's what makes them mutually exclusive. The
browser knows all four of these things are types of sports, therefore I'm only going to let you select one
of these things. And that's simply because they all have the same name.
- [1:39:29](https://www.youtube.com/watch?v=undefined&t=77969s) Again, if I view page source,
notice all of them, name equal sport, name equals sport, name equals sport, but what differs is the value
that each one is going to have. All right, any questions, then, on this approach? All right. Well, let me go
ahead and open a version of this that I made in advance that's going to now start saving the information.

- [1:39:51](https://www.youtube.com/watch?v=undefined&t=77991s) So thus far, we're not quite at the


point of where this website was, which actually allowed the proctors to see, like in a database, everyone
who had registered for sports. Now, we're literally telling students you are registered or you are not
registered, but we're literally doing nothing with this information.

- [1:40:06](https://www.youtube.com/watch?v=undefined&t=78006s) So how might we go about


implementing this? Well, let me go ahead and close these tabs, and let me go into what I call version
three of this in the code for today. And let me go into my source nine directory, FroshIMS3, and let me go
ahead and open up app.py. So this is a premade version. I've gotten rid of football, in this case.

- [1:40:26](https://www.youtube.com/watch?v=undefined&t=78026s) But I've added one thing at the


very top. What, in English, does this represent on line seven? How would you describe what that thing
is? What are we looking at? What do you think? AUDIENCE: It's an empty dictionary. DAVID: Yeah, it's an
empty dictionary, right? Registrants is apparently a variable on the left.

- [1:40:46](https://www.youtube.com/watch?v=undefined&t=78046s) It's being assigned an empty


dictionary on the right. And a dictionary, again, is just key value pairs. Here, again, is where dictionaries
are just such a useful data structure. Why? Because this is going to allow me to remember that David
registered for ultimate frisbee, Carter registered for soccer, Emma registered for something else.
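
In plain Python, the idea looks roughly like this. The names and sports are the ones mentioned above; the re-registration at the end is an added illustration of what a one-sport-per-name model implies.

```python
# An empty dictionary, as on "line seven" of the lecture's app.py.
registrants = {}

# Keys are names, values are sports.
registrants["David"] = "Ultimate Frisbee"
registrants["Carter"] = "Soccer"

# Registering again under the same name overwrites the earlier sport,
# which is why this model allows only one sport per name.
registrants["David"] = "Basketball"

for name in registrants:
    print(name, registrants[name])
```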

- [1:41:01](https://www.youtube.com/watch?v=undefined&t=78061s) You can associate keys with


values, names with sports, assuming a model where you can only register for one sport for now. And so
let's see what the logic is that handles this. Here in my register route in the code I've premade, notice
that I'm validating the user's name. Slightly differently from before but same idea.

- [1:41:20](https://www.youtube.com/watch?v=undefined&t=78080s) I'm using request.form.get to get


the human's name. If not name, so if the human did not type a name, I'm going to output error.html. But
notice I've started to make the user interface more expressive. I'm telling the user, apparently, with a
message what they did wrong. Well how? I'm apparently passing to my error template, instead of just
failure.html, a specific message.

- [1:41:45](https://www.youtube.com/watch?v=undefined&t=78105s) So let's go down this rabbit hole.


Let me actually go into templates/error.html, and sure enough, here's a new file I created here, that
adorably is apparently going to have a grumpy cat as part of the error message, but notice what I've
done. In my block body I've got an H1 tag that just says error, big and bold.

- [1:42:05](https://www.youtube.com/watch?v=undefined&t=78125s) I then have a paragraph tag that


plugs in whatever the error message is that the controller, app.py, is passing in. And then just for fun, I
have a picture of a grumpy cat connoting that there was, in fact, an error. Let's keep looking. How do I
validate sport? I do similarly request.form.get

- [1:42:23](https://www.youtube.com/watch?v=undefined&t=78143s) of sport, and I store it in a


variable called sport. If there's no such sport, that is the human did not check any of the boxes, then I'm
going to render error.html too, but I'm going to give a different message, missing sport. Else, if the sport
they did type in is not in my sports global variable, I'm going to render error.html,

- [1:42:42](https://www.youtube.com/watch?v=undefined&t=78162s) but complain differently,


you gave me an invalid sport somehow. As if a hacker went into the HTML of the page, changed it to add
their own sport like volleyball. Even though it's not offered, they submitted volleyball. But that's OK, I'm
rejecting it, even though they might have maliciously tried to send it to me by changing the DOM locally.
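
The validation flow described in this stretch might look something like the sketch below. It is an approximation: `ERROR_PAGE` is an inline stand-in for `templates/error.html` (without the cat), the exact message strings are guesses based on what is read aloud, and the success response is simplified to plain text.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]
REGISTRANTS = {}

# Stand-in for templates/error.html; the real file extends a layout.
ERROR_PAGE = "<h1>Error</h1><p>{{ message }}</p>"


@app.route("/register", methods=["POST"])
def register():
    # Validate the name.
    name = request.form.get("name")
    if not name:
        return render_template_string(ERROR_PAGE, message="Missing name")

    # Validate the sport: present, and one the server actually offers.
    sport = request.form.get("sport")
    if not sport:
        return render_template_string(ERROR_PAGE, message="Missing sport")
    if sport not in SPORTS:
        return render_template_string(ERROR_PAGE, message="Invalid sport")

    # Remember the registrant.
    REGISTRANTS[name] = sport
    return "You are registered!"


client = app.test_client()
missing = client.post("/register", data={"name": "David"}).get_data(as_text=True)
invalid = client.post(
    "/register", data={"name": "David", "sport": "Volleyball"}
).get_data(as_text=True)
valid = client.post(
    "/register", data={"name": "David", "sport": "Soccer"}
).get_data(as_text=True)
print(missing, invalid, valid, sep="\n")
```

The server-side `sport not in SPORTS` check is what rejects a value that was edited into the page by hand.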

- [1:42:59](https://www.youtube.com/watch?v=undefined&t=78179s) And then really, the magic is just


this. I remember that this person has registered by indexing into the registrants dictionary using the name
the human typed in as the key and assigning it a value of sport. Why is this useful? Well, I added one
final route here. I have a /registrants route with a registrants function that renders a template called
registrants.html.

- [1:43:22](https://www.youtube.com/watch?v=undefined&t=78202s) But it takes as input that global


variable just like before. So let's go down this rabbit hole. Let me go into templates/registrants.html.
Here's this template. It looks a little crazy big, but it extends the layout. Here comes the body. I've got an
H1 tag that says registrants, big and bold.

- [1:43:42](https://www.youtube.com/watch?v=undefined&t=78222s) Then I've got a table that we saw


last week. This has a table head that just says name sport for two columns. Then it has a table body
where in, using this for loop in Jinja syntax, I'm saying, for each name in the registrants variable, output a
table row, start tag, and end tag, inside of which, two table datas, two cells, table data for name, table
data for registrants bracket name.
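
A sketch of that table, rendered with `jinja2` directly. The template string approximates what is described aloud and omits the layout that the real `registrants.html` extends; the sample dictionary is made up.

```python
from jinja2 import Template

# Sample data standing in for the global registrants dictionary.
REGISTRANTS = {"David": "Ultimate Frisbee", "Carter": "Basketball"}

# One table row per key; registrants[name] looks up the sport for that name.
TABLE = Template(
    "<table>\n"
    "  <thead><tr><th>Name</th><th>Sport</th></tr></thead>\n"
    "  <tbody>\n"
    "{% for name in registrants %}"
    "    <tr><td>{{ name }}</td><td>{{ registrants[name] }}</td></tr>\n"
    "{% endfor %}"
    "  </tbody>\n"
    "</table>"
)

html = TABLE.render(registrants=REGISTRANTS)
print(html)
```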

- [1:44:09](https://www.youtube.com/watch?v=undefined&t=78249s) So it's very similar to Python


syntax. It essentially is Python syntax, albeit with these curly braces and the percent sign. So the net
effect here is what? Let me open up my terminal window, run Flask run. Let me now go into the form
that I premade here. So gone is football. Let me go ahead and type in David.

- [1:44:28](https://www.youtube.com/watch?v=undefined&t=78268s) Let me choose, oh, no sport.


Register. Error, missing sport. And there is the grumpy cat. So missing sport, though, specifically was
outputted. All right, fine. Let me go ahead and say no name. But I'll choose basketball. Register. Missing
name. All right, and let me maliciously, now, do this. Now I'm hacking.

- [1:44:48](https://www.youtube.com/watch?v=undefined&t=78288s) Let me go into this. I'll type my


name, sure, but let me go into the body tag down here. Let me maliciously go down to ultimate frisbee.
Heck with that, let's do volleyball. Change that and change this to volleyball. Enter. So now, I can register for
any sport I want to create. Let me click register, but: invalid sport.

- [1:45:12](https://www.youtube.com/watch?v=undefined&t=78312s) So again, that speaks to the


power and the need for checking things on backend and not trusting users. It is that easy to hack
websites otherwise if you're not validating data server side. All right, finally, let's just do this for real.
David is going to register for ultimate frisbee. Clicking register.

- [1:45:27](https://www.youtube.com/watch?v=undefined&t=78327s) And now, the output is not very


pretty, but notice I'm at the registrants route. And if I zoom out, I have an HTML table. Two columns,
name and sport, David and ultimate frisbee. Let me go back to the form, and let me pretend Carter
walked up to my laptop and registered for basketball. Register. Now we see two rows in this table, David,
ultimate frisbee, Carter, basketball.

- [1:45:50](https://www.youtube.com/watch?v=undefined&t=78350s) And if we do this one more time,


maybe Emma comes along and registers for soccer. Register. All of this information is being stored in this
dictionary, now. All right, so that's great. Now we have a database, albeit in the form of a Python
dictionary. But why is this, maybe, not the best implementation? Why is it not great? Yeah.

- [1:46:11](https://www.youtube.com/watch?v=undefined&t=78371s) AUDIENCE: You are storing


[INAUDIBLE]. DAVID: Yeah. So we're only storing this dictionary in the computer's memory, and that's
great until I hit Control C and kill Flask, stopping the web server. Or the server reboots, or maybe I close
my laptop or whatever. If the server stops running, memory is going to be lost.

- [1:46:33](https://www.youtube.com/watch?v=undefined&t=78393s) RAM is volatile. It's thrown away


when you lose power or stop the program. So maybe this isn't the best approach. Maybe it would be
better to use a CSV file. And in fact, some 20 years ago, that's literally what I did. I stored everything in a
CSV file. But let's skip that step, because we already saw last week, or a couple of weeks ago now, how
we can use SQLite.

- [1:46:52](https://www.youtube.com/watch?v=undefined&t=78412s) Let's see if we can't marry in


some SQL here to store an actual database for the program. Let me go back here and let me open up,
say, version four of this, which is almost the same but it adds a bit more functionality. Let me close these
tabs and let me open up app.py now in version four. So notice it's almost the same, but at the top, I'm
creating a database connection to a database called FroshIMS.db.

- [1:47:18](https://www.youtube.com/watch?v=undefined&t=78438s) So that's a database I created in


advance. So let's go down that rabbit hole. What does it look like? Let me make my terminal window
bigger. Let me run sqlite3 of FroshIMS.db. OK, I'm in. Let's do .schema, and let's just infer what I
designed this to be. I have a table called registrants, which has one, two, three columns.

- [1:47:38](https://www.youtube.com/watch?v=undefined&t=78458s) An ID column that's an integer, a


name column that's text but cannot be null, and a sport column that's also text, cannot be null, and the
primary key is just ID. So that I have a unique ID for every registration. Let's see if there's anyone in there
yet. Select star from registrants. OK, there's no one in there.
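
The schema as described can be reconstructed approximately with Python's built-in `sqlite3` module. The exact `CREATE TABLE` text is inferred from the spoken description, and an in-memory database is used here instead of the lecture's `FroshIMS.db` file.

```python
import sqlite3

# In-memory database as a stand-in for FroshIMS.db.
db = sqlite3.connect(":memory:")

# Three columns: an integer id as primary key, plus name and sport,
# both text and both required.
db.execute(
    """
    CREATE TABLE registrants (
        id INTEGER,
        name TEXT NOT NULL,
        sport TEXT NOT NULL,
        PRIMARY KEY(id)
    )
    """
)

columns = [row[1] for row in db.execute("PRAGMA table_info(registrants)")]
rows = db.execute("SELECT * FROM registrants").fetchall()
print(columns)
print(rows)
```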

- [1:47:57](https://www.youtube.com/watch?v=undefined&t=78477s) No one is yet registered for


sports. So let's go back to the code and continue on. In my code now, I've got the same global variable
for validation and generation of my HTML. Looks like my index route is the same. It's dynamically
generating the menu of sports. Interestingly, we'll come back to this.

- [1:48:15](https://www.youtube.com/watch?v=undefined&t=78495s) There's a deregister route that's


going to allow someone to deregister themselves if they want to exit the sport or undo their registration.
But this is the juicy part. Here's my new and improved register route. Still works on POST, so some mild
privacy there. I'm validating the submission as follows.

- [1:48:34](https://www.youtube.com/watch?v=undefined&t=78514s) I'm getting the user's inputted


name, the user's inputted sport, and if there is no name or the sport is not in sports, I'm going to render
failure.html. So I kept it simple. There's no cat in this version. It just says failure. Otherwise, recall how
we co-mingled SQL and Python before.

- [1:48:50](https://www.youtube.com/watch?v=undefined&t=78530s) We're using CS50's SQL library,


but that just makes it a little easier to execute SQL queries, and we're executing this: insert into
registrants name comma sport. What two values? The name and the sport that came from that HTML
form. And then lastly, and this is a new function that we're calling out explicitly now, Flask also gives you
access to a redirect function, which is how safetyschool.org, Harvardsucks.org,

- [1:49:17](https://www.youtube.com/watch?v=undefined&t=78557s) and all these other sites we


played around with last week were all implemented: redirecting the user from one place to another. This
Flask function redirect comes from my just having imported it at the very top of this file. It handles the
HTTP 301 or 302 or 307 code, whatever the appropriate one is.
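
A sketch of the insert-then-redirect pattern follows. It swaps in Python's built-in `sqlite3` module for CS50's SQL library, uses an in-memory database, and returns a plain string where the lecture renders `failure.html`; those substitutions are assumptions made so the example is self-contained.

```python
import sqlite3

from flask import Flask, redirect, request

app = Flask(__name__)

SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]

# In-memory stand-in for FroshIMS.db, using sqlite3 rather than CS50's library.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)


@app.route("/register", methods=["POST"])
def register():
    name = request.form.get("name")
    sport = request.form.get("sport")
    if not name or sport not in SPORTS:
        return "failure"  # the lecture renders failure.html here

    # Placeholders (?) keep user input out of the SQL text itself.
    db.execute("INSERT INTO registrants (name, sport) VALUES (?, ?)", (name, sport))
    db.commit()

    # redirect builds the HTTP redirect response; Flask's default status is 302.
    return redirect("/registrants")


client = app.test_client()
response = client.post("/register", data={"name": "David", "sport": "Soccer"})
stored = db.execute("SELECT name, sport FROM registrants").fetchall()
print(response.status_code, response.headers["Location"], stored)
```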

- [1:49:35](https://www.youtube.com/watch?v=undefined&t=78575s) It does that for me. All right, so


that's it for registering via this route. Let's look at what the registrants route is. Here, we have a new
route for /registrants. And instead of just iterating over a dictionary like before, we're getting back, let's
see, db.execute of select star from registrants.

- [1:49:57](https://www.youtube.com/watch?v=undefined&t=78597s) So that's literally the


programmatic version of what I just did manually. That gives me back a list of dictionaries, each of which
represents one row in the table. Then, I'm going to render registrants.html, passing in literally
that list of dictionaries just like using CS50's library in the past.
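
To see what "a list of dictionaries" means concretely, the shape can be imitated with the built-in `sqlite3` module. This is not CS50's library; setting `row_factory` and converting each row with `dict` is one way to get a similar result, and the sample rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # rows become addressable by column name

db.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)
db.execute(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    ("David", "Ultimate Frisbee"),
)
db.execute(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    ("Carter", "Basketball"),
)

# One dictionary per row, keyed by column name.
registrants = [
    dict(row) for row in db.execute("SELECT * FROM registrants ORDER BY id")
]
print(registrants)
```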

- [1:50:16](https://www.youtube.com/watch?v=undefined&t=78616s) So let's go and look at these-- that


form. If I go into templates and open up registrants.html, oh, OK, it's just a table like before. And actually,
let me change this syntactically for consistency. We have a Jinja for loop that iterates over each registrant
and for each of them, outputs a table row.

- [1:50:40](https://www.youtube.com/watch?v=undefined&t=78640s) Oh, but this is interesting. Instead


of just having two columns with the person's name and sport, notice that I'm also outputting a full-
fledged form. All right, this is starting to get juicy. So let's actually go back to my terminal window, run
Flask, and actually see what this example looks like now.

- [1:50:57](https://www.youtube.com/watch?v=undefined&t=78657s) Let me reload the page. All right.


In the home page, it looks exactly the same. But let me now register for something. David for ultimate
frisbee, register. Oh, damn it. Let's try this again. David registering for ultimate frisbee, register. OK. So
good thing I have deregister. So this is what it should now look like.

- [1:51:17](https://www.youtube.com/watch?v=undefined&t=78677s) I have a page at the route


called /registrants that has a table with two columns, name and sport, David, ultimate frisbee. But oh,
wait, a third column. Why? Because if I view the page source, notice that it's not the prettiest UI. For
every row in this table, I'm also going to be outputting a form just to deregister that user.
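
A per-row form of this kind could be sketched as below. The hidden `id` input and the button label are assumptions about how the template is written; the transcript only establishes that each row has its own form posting to a deregister route.

```python
from jinja2 import Template

# Sample rows shaped like the list of dictionaries the database query returns.
REGISTRANTS = [
    {"id": 1, "name": "David", "sport": "Ultimate Frisbee"},
    {"id": 2, "name": "Carter", "sport": "Basketball"},
]

# Each row carries its own small form; the hidden input (an assumption)
# holds that row's id so the server knows which registrant to remove.
ROWS = Template(
    "{% for registrant in registrants %}"
    "<tr>\n"
    "  <td>{{ registrant.name }}</td>\n"
    "  <td>{{ registrant.sport }}</td>\n"
    "  <td>\n"
    '    <form action="/deregister" method="post">\n'
    '      <input name="id" type="hidden" value="{{ registrant.id }}">\n'
    '      <button type="submit">Deregister</button>\n'
    "    </form>\n"
    "  </td>\n"
    "</tr>\n"
    "{% endfor %}"
)

html = ROWS.render(registrants=REGISTRANTS)
print(html)
```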

- [1:51:37](https://www.youtube.com/watch?v=undefined&t=78697s) But before we see how that


works, let me go ahead and register Carter, for instance. So Carter, we'll give you basketball. Again,
register. The table grows. Now, let me go back and let's register Emma for soccer. And the table should
grow. Before we look at that HTML, let's go back to my terminal window.
- [1:51:55](https://www.youtube.com/watch?v=undefined&t=78715s) Let's go into SQLite FroshIMS. Let
me go into FroshIMS, and let me open up with SQLite 3 FroshIMS.db. And now do select star from
registrants. And whereas, previously, when I executed this there were zero people, now there's indeed
three. So now we see exactly what's going on underneath the hood.

- [1:52:21](https://www.youtube.com/watch?v=undefined&t=78741s) So let's look at this form now--


this page now. If I want to unregister, deregister one of these people specifically, how do we do this?
Clicking one of those buttons will indeed delete the row from the database. But how do we go about
linking a web page with Python code with a database? This is the last piece of the puzzle.

- [1:52:43](https://www.youtube.com/watch?v=undefined&t=78763s) Up until now, everything's been


with forms and also with URLs. But what if the user is not typing anything in, they're just clicking a
button? Well, watch this. Let me go ahead and sniff the traffic, which you could be in the habit of doing
now. Any time you're curious how a website works, let me go to the Network tab.

- [1:53:01](https://www.youtube.com/watch?v=undefined&t=78781s) And Carter, shall we deregister


you from basketball? Let's deregister Carter and let's see what just happened. If I look at the deregister
request, notice that it's a POST. The status code that eventually came back was 302, but let's look at the
request itself. All the headers there we'll ignore.

- [1:53:21](https://www.youtube.com/watch?v=undefined&t=78801s) The only thing that button


submits, cleverly, is an ID parameter, a key equaling two. What does two presumably represent or map
to? Where did this two come from? It doesn't say Carter, it doesn't say basketball. What is it?
AUDIENCE: The second person who registered. DAVID: The second person that registered.

- [1:53:45](https://www.youtube.com/watch?v=undefined&t=78825s) So those primary keys that we


started talking about a couple of weeks ago, why it's useful to be able to uniquely identify a row in a
table, here is just one of the reasons why. It suffices for me just to send the ID number of the person I
want to delete from the database, because I can then have code like this.

- [1:54:03](https://www.youtube.com/watch?v=undefined&t=78843s) If I go into app.py and I look at my


deregister route now, the last of them, notice that I got this. I first go into the form, and I get the ID that
was submitted, hopefully. If there was, in fact, an ID, and the form wasn't somehow empty, I execute this
line of code. Delete from registrants where ID equals question mark, and then I plug in that number,
deleting Carter and only Carter.
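
The delete-by-id step can be illustrated with the built-in `sqlite3` module (again a stand-in for CS50's library). The second Carter is added here as a made-up row to show why the id, not the name, is the safe thing to match on.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)
for name, sport in [
    ("David", "Ultimate Frisbee"),
    ("Carter", "Basketball"),
    ("Carter", "Soccer"),  # a different person who shares the name
]:
    db.execute("INSERT INTO registrants (name, sport) VALUES (?, ?)", (name, sport))


def deregister(registrant_id):
    """Delete exactly one row, identified by its primary key."""
    if registrant_id:
        db.execute("DELETE FROM registrants WHERE id = ?", (registrant_id,))
        db.commit()


deregister(2)
remaining = db.execute("SELECT id, name FROM registrants ORDER BY id").fetchall()
print(remaining)
```

Deleting `WHERE name = 'Carter'` instead would have removed both rows.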

- [1:54:29](https://www.youtube.com/watch?v=undefined&t=78869s) And I'm not using his name,


because what if we have two people named Carter, two people named Emma or David? You don't want
to delete both of them. That's why these unique IDs are so, so important. And here's another reason
why. You don't want to store some things in URLs. Suppose we went to this URL, deregister?ID=3.

- [1:54:50](https://www.youtube.com/watch?v=undefined&t=78890s) Suppose I, maliciously, emailed


this URL to Emma. It doesn't matter so much what the beginning is, but suppose I emailed her this URL,
/deregister?ID=3, and I said, hey, Emma, click this. And it uses GET instead of POST. What did I just trick
her into doing? What's going to happen if Emma clicks this? Yeah? AUDIENCE: Deregistering? DAVID: You
would trick her into deregistering herself.
- [1:55:22](https://www.youtube.com/watch?v=undefined&t=78922s) Why? Because if she's logged into
this FroshIMS website, and the URL contains her ID just because I'm being malicious, and she clicked on
it and the website is using GET, unfortunately, GET URLs are, again, stateful. They have state information
in the URLs. And in this case, it's enough to delete the user and boom, she would have accidentally
deregistered herself.

- [1:55:42](https://www.youtube.com/watch?v=undefined&t=78942s) And this is pretty innocuous.


Suppose that this was her bank account trying to make a withdrawal or a deposit. Suppose that this were
some other website, a Facebook URL, trying to trick her into posting something automatically. Here, too,
is another consideration when you should use POST versus GET, because GET requests can be plugged
into emails sent via Slack messages, text messages, or the like.

- [1:56:04](https://www.youtube.com/watch?v=undefined&t=78964s) And unless there's a prompt


saying, are you sure you want to deregister yourself, you might blindly trick the user into being
vulnerable to what's called a cross-site request forgery. A fancy way of saying you trick them into clicking
a link that they shouldn't have, because the website was using GET alone.
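
A small sketch of the difference: when a route accepts only POST, Flask answers a clicked link (a GET request) with 405 Method Not Allowed, so the emailed URL does nothing. The in-memory `REGISTRANTS` dictionary and the id value are illustrative assumptions. Note that restricting a route to POST only blocks the link-click variant; production sites usually add a CSRF token as well, since another page can submit a POST form on the user's behalf.

```python
from flask import Flask, request

app = Flask(__name__)

# Toy data store keyed by id; stands in for the database table.
REGISTRANTS = {3: "Emma"}


@app.route("/deregister", methods=["POST"])  # GET is deliberately not allowed
def deregister():
    registrant_id = request.form.get("id", type=int)
    REGISTRANTS.pop(registrant_id, None)
    return "deregistered"


client = app.test_client()

# What clicking an emailed link would do: a GET with the id in the URL.
get_response = client.get("/deregister?id=3")
still_registered = 3 in REGISTRANTS

# What the site's own button does: a POST with the id in the request body.
post_response = client.post("/deregister", data={"id": "3"})
print(get_response.status_code, still_registered, post_response.status_code)
```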

- [1:56:21](https://www.youtube.com/watch?v=undefined&t=78981s) All right, any question, then, on


these building blocks? Yeah. AUDIENCE: What do the first thing in the instance of the SQL [INAUDIBLE]
where they have three slashes? What does that mean? DAVID: When three columns, you mean?
AUDIENCE: No, three forward slashes. DAVID: The three forward slashes. I'm not sure I follow.

- [1:56:46](https://www.youtube.com/watch?v=undefined&t=79006s) AUDIENCE: Yeah, so I think it's in


[INAUDIBLE]. DAVID: Sorry, it's in where? Which file? AUDIENCE: It's in [INAUDIBLE] scroll up.
[INAUDIBLE] DAVID: Sorry, the other direction? AUDIENCE: Yeah. DAVID: OK. AUDIENCE: [INAUDIBLE]. So
please scroll a little bit more. DAVID: Keep scrolling more? Oh, this thing.

- [1:57:19](https://www.youtube.com/watch?v=undefined&t=79039s) OK, sorry. This is a URI, it's typical


syntax that's referring to the SQLite protocol, so to speak, which means use SQLite to talk to a file
locally. :// is just like you and I see in URLs. The third slash, essentially, means current folder. That's all. So
it's a weird curiosity, but it's typical whenever you're referring to a local file and not one that's elsewhere
on the internet.
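
Splitting such a string with Python's standard `urllib.parse` shows where the three slashes come from. The file name follows the one mentioned earlier; the interpretation of the path as relative to the current folder is a convention of the libraries that accept these URLs, not something `urlsplit` itself enforces.

```python
from urllib.parse import urlsplit

parts = urlsplit("sqlite:///FroshIMS.db")

print(parts.scheme)        # the "protocol" before ://
print(repr(parts.netloc))  # the host between :// and the next slash (empty here)
print(parts.path)          # the third slash begins the path to the file
```

With no host between `://` and the third slash, only the scheme and a path remain, which is why the form looks odd next to ordinary web URLs.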

- [1:57:45](https://www.youtube.com/watch?v=undefined&t=79065s) That's a bit of an


oversimplification, but that's indeed a convention. Sorry for not clicking earlier. All right, let's do one
other iteration of FroshIMS here, just to show that what I was actually doing, back in the day, was not only
storing these things in CSV files, as I recall. I was also automatically generating an email to the proctor in
charge of the intramural sports program so that they would have sort of a running history of people
registering and they could easily reply to them as well.

- [1:58:10](https://www.youtube.com/watch?v=undefined&t=79090s) Let me go into FroshIMS version


five, which I precreated here, and let me go ahead and open up, say, app.py this time. And this is some
code that I wrote in advance. And it looks a little scary at first glance, but I've done the following. I have
now added the Flask mail library to the picture by adding Flask mail to requirements.txt

- [1:58:33](https://www.youtube.com/watch?v=undefined&t=79113s) and running a command to


automatically install email support for Flask as well. And this is a little bit cryptic, but it's honestly mostly
copy paste from the documentation. What I'm doing here is I'm configuring my Flask application with a
few configuration variables, if you will. This is the syntax for that. app.config

- [1:58:51](https://www.youtube.com/watch?v=undefined&t=79131s) is a special dictionary that


comes with Flask that is automatically created when you create the app up here on line nine, and I just
had to fill in a whole bunch of configuration values for the default sender address that I want to send
email as, the default password I want to use to send email, the port number, the TCP port, that we talked
about last week.

- [1:59:09](https://www.youtube.com/watch?v=undefined&t=79149s) The mail server, I'm going to use


Gmail's smtp.gmail.com server. Use TLS, this means use encryption. So I set that to true. Mail username,
this is going to grab it from my environment. So for security purposes, I didn't want to hard code my own
Gmail username and password into the code. So I'm actually storing those in what are called
environment variables.

- [1:59:27](https://www.youtube.com/watch?v=undefined&t=79167s) You'll see more of these in


problem set nine, and it's a very common convention on a server in the real world to store sensitive
information in the computer's memory so that it can be accessed when your website is running, but not
in your source code. It's way too easy if you put credentials, sensitive stuff in your source code, to post it
to GitHub or to screenshot it accidentally, or for information to leak out.

- [1:59:51](https://www.youtube.com/watch?v=undefined&t=79191s) So for today's purposes, know


that the os.environ dictionary refers to what are called environment variables. And this is like an
out-of-band, special way of defining key value pairs in the computer's memory by running a certain command
but that never show up in your actual code. Otherwise, there would be so many usernames and
passwords accidentally visible on the internet.
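
A sketch of reading credentials from the environment rather than from source code. The `MAIL_*` key names follow Flask-Mail's configuration keys; the port number 587 is an assumption (the transcript does not state it), and `mail_settings` is a hypothetical helper introduced only for illustration.

```python
import os


def mail_settings(env=os.environ):
    """Build mail configuration, pulling secrets from environment variables.

    `env` defaults to the real environment; passing a plain dict makes the
    function easy to try out without setting anything in the shell.
    """
    return {
        "MAIL_DEFAULT_SENDER": env.get("MAIL_DEFAULT_SENDER"),
        "MAIL_PASSWORD": env.get("MAIL_PASSWORD"),
        "MAIL_PORT": 587,  # assumed; commonly used with TLS
        "MAIL_SERVER": "smtp.gmail.com",
        "MAIL_USE_TLS": True,
        "MAIL_USERNAME": env.get("MAIL_USERNAME"),
    }


# Demonstration with made-up values instead of real credentials.
settings = mail_settings(
    {"MAIL_USERNAME": "jharvard", "MAIL_PASSWORD": "not-a-real-password"}
)
print(settings["MAIL_USERNAME"], settings["MAIL_SERVER"])
```

In a Flask application, the resulting dictionary could be applied with `app.config.update(mail_settings())`, so the secrets never appear in the repository.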

- [2:00:12](https://www.youtube.com/watch?v=undefined&t=79212s) So I've installed this in advance.


Let me see if I can do this correctly. Let me go over to another tab in just a moment. And here, I have on
my second screen here, John Harvard's inbox. It's currently empty, and I'm going to go ahead and register
for some sport as John Harvard here, hopefully.

- [2:00:29](https://www.youtube.com/watch?v=undefined&t=79229s) So let me go ahead and run Flask


run on this version five. Let me go ahead and reload the main screen. Not that one. Let me reload the
main screen here. This time, clearly, I'm asking for name and email. So name will be John Harvard.
[email protected]. He'll register for, how about soccer. Register.

- [2:00:53](https://www.youtube.com/watch?v=undefined&t=79253s) And if I did this correctly, not only


is John Harvard, on his screen, seeing you are registered, but when he checks his email on this other
screen, crossing his fingers that this actually works as a demonstration, and I promise it did right before
class. Horrifying. I don't think there's a mistake this time.

- [2:01:25](https://www.youtube.com/watch?v=undefined&t=79285s) Let me try something over here


real quick, but I don't think this is broken. It wouldn't have said success if it were. I just tried submitting
again, so I just did another you are registered. Oh, I'm really sad right now. AUDIENCE: [INAUDIBLE]
DAVID: What's that? AUDIENCE: Check spam. DAVID: I could check spam, but then it's-- not sure we want
to show spam here on the internet that every one of us gets.
- [2:02:02](https://www.youtube.com/watch?v=undefined&t=79322s) Oh, maybe. Oh! [LAUGHTER AND
APPLAUDING] Thank you. OK. Wow, that was a risky click, I worried. All right, so you are registered is the
email that I sent out, and it doesn't have any actual information in it. But back in the day it would have,
because I included the student's name and their dorm and all of the other fields of information that we
asked for.

- [2:02:28](https://www.youtube.com/watch?v=undefined&t=79348s) So let's just take a quick look at


how that code might work. I did have to configure Gmail in a certain way to allow, what they call, less
secure apps using SMTP, which is the protocol used for outbound email. But besides setting these things,
let's look at the register route down here. It's actually pretty straightforward.

- [2:02:45](https://www.youtube.com/watch?v=undefined&t=79365s) In my register route, I validated


the submission just like before. Nothing new there. I then confirmed the registration down here, nothing
new there. All I did was use two new lines of code. And it's this easy to automate the sending of emails. I
apparently have done it too many times, which is why it ended up in spam.

- [2:03:02](https://www.youtube.com/watch?v=undefined&t=79382s) I created a variable called


message. I used a message function that I must have imported higher up, so we'll go back to that. Here's,
apparently, the subject line as the first argument. And the second argument is the named parameter
recipients, which takes a list of emails that should get the confirmation email.

- [2:03:19](https://www.youtube.com/watch?v=undefined&t=79399s) So in brackets, I just put the one


user's email and then mail.send that message. So let's scroll back up to see what message and what mail
actually is. Mail, I think, we saw. Yep, mail is this, which I have as a variable because I followed the
documentation for this library. You simply configure your current app with Mail support, capital M here.

- [2:03:42](https://www.youtube.com/watch?v=undefined&t=79422s) And if you look up here now, on


line seven, here's the new library from Flask mail I imported. Capital Mail, capital Message, so that I had
the ability to create a message and send a mail. So such a simple thing whether you want to confirm
things for users, you want to do password resets. It can be this easy to actually generate emails provided
you have the requisite access and software installed.
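
For a sense of what a library like Flask-Mail automates, here is a minimal sketch that uses only Python's standard library rather than the Flask-Mail calls from the lecture. The subject line, sender address, message body, and the `build_confirmation` and `send` helper names are all assumptions made up for illustration, and the SMTP host and credentials would come from your own mail provider.

```python
import smtplib
from email.message import EmailMessage


def build_confirmation(recipient):
    """Build (but do not send) a confirmation email for one recipient."""
    message = EmailMessage()
    message["Subject"] = "You are registered!"      # placeholder subject
    message["From"] = "registrar@example.com"       # hypothetical sender
    message["To"] = recipient
    message.set_content("Thanks, your registration was received.")
    return message


def send(message, host, port, username, password):
    """Send a message over SMTP, the protocol used for outbound email."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()                 # upgrade the connection to TLS
        server.login(username, password)
        server.send_message(message)


# Only build the message here; sending requires real credentials.
confirmation = build_confirmation("student@example.com")
print(confirmation["To"], "|", confirmation["Subject"])
```

Splitting "build" from "send" keeps the part that needs network access and credentials separate from the part that can be checked locally.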

- [2:04:04](https://www.youtube.com/watch?v=undefined&t=79444s) And just to make clear that I did


add something here, let me open up my requirements.txt file, and indeed, I have both Flask and Flask-Mail
ready to go. But I ran the command in advance to actually install them. All right, any questions, then, on
these examples here? No? All right. So what other pieces might actually remain for us? Let me flip over
here.

- [2:04:30](https://www.youtube.com/watch?v=undefined&t=79470s) It turns out that a key component


of most any web application nowadays that we haven't touched on yet, but it'll be one of our final
flourishes today, is the notion of a session. And a session is actually a feature that derives from all of the
basics we talked about today and last week, and a session is the technical term for what you and I know
as a shopping cart.

- [2:04:47](https://www.youtube.com/watch?v=undefined&t=79487s) When you go to amazon.com and


you start adding things to your shopping cart, they follow you from page to page to page. Heck, if you
close your browser and come back the next day, they're typically still in your shopping cart, which is great for
Amazon because they want your business. They don't want you to have to start from scratch the next
day.

- [2:05:04](https://www.youtube.com/watch?v=undefined&t=79504s) Similarly, when you log into any


website these days, even if it's not an e-commerce thing but it has usernames and passwords, you and I
are not in the habit of logging into every darn page we visit on a website. Typically, you log in once, and
then for the next hour, day, week, year, you stay logged into that website.

- [2:05:21](https://www.youtube.com/watch?v=undefined&t=79521s) So somehow, the website is


remembering that you have logged in. And that is being implemented by way of this thing called a
session, and a perhaps more familiar term that you might know, and worry about, called cookies. Let's
go ahead and take one more five minute break here. And when we come back, we'll look at cookies,
sessions, and these final features.

- [2:05:40](https://www.youtube.com/watch?v=undefined&t=79540s) All right. So the promise now is


that we're going to implement this notion of a session, which is going to allow us to log users in and keep
them logged in and even implement things like a shopping cart. And the overarching goal here is to build
an application that is, quote unquote, "stateful."

- [2:05:56](https://www.youtube.com/watch?v=undefined&t=79556s) Again, state refers to


information, and something that's stateful remembers information. And in this context, the curiosity is
that HTTP is technically a stateless protocol. Once you visit a URL, http://something, hit Enter, and the web page
is downloaded to your browser, that's it. You can unplug from the internet, you can turn off your Wi-Fi,
but you still have the web page locally.

- [2:06:20](https://www.youtube.com/watch?v=undefined&t=79580s) And yet we somehow want to


make sure that the next time you click on a link on that website, it doesn't forget who you are. Or the
next thing you add to your shopping cart, it doesn't forget what was already there. So we somehow want
to make HTTP stateful, and we can actually do this using the building blocks we've seen thus far.

- [2:06:36](https://www.youtube.com/watch?v=undefined&t=79596s) So concretely, here's a form you


might see occasionally, but pretty rarely, when you log into Gmail. And I say rarely because most of you
don't log into Gmail frequently, you just stay logged in, pretty much endlessly, in your browser. And that's
because Google has made the conscious choice to give you a very long session time, maybe a day, a
week, a month, a year, because they don't really want to add friction to using their tool and making you
log in every darn day.

- [2:07:01](https://www.youtube.com/watch?v=undefined&t=79621s) By contrast, there's other


applications on campus, including some of CS50's own, that make you log in every time. Because we
want to make sure that it's indeed you accessing the site, and not a roommate or friend or someone
maliciously. So once you do fill out this form, how does Google subsequently know that you are you, and
when you reload the page even or open a second tab for your same Gmail account, how do they know
that you're still David or Carter or Emma or someone else? Well, let's look underneath the hood of
what's going on.

- [2:07:29](https://www.youtube.com/watch?v=undefined&t=79649s) When you log into Gmail,


essentially, you initially see a form like this using a GET request. And the website responds like we saw
last week with some kind of HTTP response. Hopefully 200 OK with the form. Meanwhile, the website
might also respond with an HTTP header that, last week we didn't care about, this week, we now do.

- [2:07:49](https://www.youtube.com/watch?v=undefined&t=79669s) Whenever you visit a website, it is


very commonly the case that the website is putting a cookie on your computer. And you may generally
know that cookies can be bad and they track you in some way, and that's both a blessing and a curse.
Without cookies, you could not implement things like shopping carts and log-ins as we know them today.

- [2:08:10](https://www.youtube.com/watch?v=undefined&t=79690s) Unfortunately, they can also be


used for ill purposes like tracking you on every website and serving you ads more effectively and so forth.
So with good comes some bad. But the basic primitive for us, the computer scientist, boils down to just
HTTP headers. A cookie is typically a big number, a big, seemingly random value, that a server tells your
browser to store in memory, or even longer term, store on disk.

- [2:08:35](https://www.youtube.com/watch?v=undefined&t=79715s) So you can think of it like a file


that a server is planting on your computer. And the promise that HTTP makes is that if a server sets a
cookie on your computer, you will present that same cookie or that same value on every subsequent
request. So when you visit the website like Gmail, they plop a cookie on your computer like this with
some session equals value, some long random value.

- [2:08:58](https://www.youtube.com/watch?v=undefined&t=79738s) One, two, three, A, B, C,


something like that. And when you then visit another page on gmail.com or any other website, you send
the opposite header, not Set-Cookie but just Cookie, and you send the exact same value. It's similar
to going to a club or an amusement park where you pay once, you go through the gates once, you get
checked by security once, and then they very often take like a little stamp and say, OK, now you can
come and go.

- [2:09:23](https://www.youtube.com/watch?v=undefined&t=79763s) And then for you, efficiency-wise,


if you come back later in the day or later in the evening, you can just present your hand. You've been
stamped, presumably. They've already-- you've already paid, you've already been searched or whatnot.
And so it's this sort of fast track ticket back into the club, back into the park.

- [2:09:37](https://www.youtube.com/watch?v=undefined&t=79777s) That's essentially what a cookie is


doing for you, whereby it's a way of reminding the website we've already done this, you already asked
me for my username and password. This is my pass to now come and go. Now, unlike this hand stamp,
which can be easily copied or transferred or duplicated or kept on over multiple days, these cookies are
really big, seemingly random values, letters and numbers.

- [2:10:00](https://www.youtube.com/watch?v=undefined&t=79800s) So statistically, there's no way


someone else is just going to guess your cookie value and pretend to be you. It's just very low probability,
statistically. But this is all it boils down to is this agreement between browser and server to send these
values back and forth in this way. So when we actually translate this, now, to code, let's do something
like a simple login app.
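
The header exchange described above can be sketched with Python's standard library alone. This is only an illustration of the idea, not how Gmail or Flask actually generate their values: the cookie name `session` follows the lecture's example, while the helper names, the 16-byte length, and the `Path` and `HttpOnly` attributes are assumptions.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_id():
    """Return a long, seemingly random value that is impractical to guess."""
    return secrets.token_hex(16)  # 16 random bytes as 32 hex characters


def set_cookie_header(session_id):
    """Value the server would send in its Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True
    return cookie["session"].OutputString()


def read_cookie_header(header_value):
    """Extract the session value from the Cookie header a browser sends back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get("session")
    return morsel.value if morsel else None


sid = new_session_id()
print("Set-Cookie:", set_cookie_header(sid))   # server -> browser, once
print("Cookie:", f"session={sid}")             # browser -> server, every request
```

The server plants the value once; the browser's only job is to hand the same value back on each later request, like showing the hand stamp at the gate.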

- [2:10:21](https://www.youtube.com/watch?v=undefined&t=79821s) Let me go into a folder I made in


advance today called login. And let me code up app.py and let's take a look in here. So what's going on?
A couple of new things up top. If I want to have the ability to stamp my users hands, virtually, and
implement sessions, I'm going to have to import from Flask support for sessions.
- [2:10:42](https://www.youtube.com/watch?v=undefined&t=79842s) So this is another feature you get
for free by using a framework and not having to implement all this yourself. And from the Flask-Session
library, I'm going to import Session, capital S. Why? I'm going to configure the session as follows. Long
story short, there's different ways to implement sessions.

- [2:10:57](https://www.youtube.com/watch?v=undefined&t=79857s) The server can store these


cookies in a database, in a file, in memory, in RAM, in other places too. We are telling it to store these
cookies on the server's hard drive. So in fact, whenever you use sessions as you will for problem set nine,
you'll actually see a folder suddenly appear called flask_session, inside of which are the cookies,
essentially, for any users or friends or yourself who've been visiting your particular application.

- [2:11:23](https://www.youtube.com/watch?v=undefined&t=79883s) So I'm setting it to use the file


system, and I don't want them to be permanent because I want, when you close your browser, the
session to go away. They could be made to be permanent and last much longer. Then I tell my app to
support sessions. And that's it for now. Let's see what this application actually does before we dissect
the code.
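
The configuration just described would look roughly like the fragment below. It assumes the third-party Flask-Session package is installed alongside Flask; the two config keys are the ones that package documents, but the exact backend name can vary between releases, so check the documentation for the version you install.

```python
from flask import Flask, session
from flask_session import Session  # third-party package: Flask-Session

app = Flask(__name__)

# Let the session end when the browser closes instead of persisting.
app.config["SESSION_PERMANENT"] = False
# Keep each user's session data in files on the server's disk.
app.config["SESSION_TYPE"] = "filesystem"
# Tell the app to support sessions.
Session(app)
```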

- [2:11:40](https://www.youtube.com/watch?v=undefined&t=79900s) Let me go over to my terminal


window, run Flask run, and then let me go ahead and reload my preview URL. Give it a second to kick
back in. Let me go ahead and open my URL. Come on. Oops, let me go ahead. Too long of a break. There
we go. So this website simply has a login form. There's no password, though I could certainly add that
and check for that too.

- [2:12:06](https://www.youtube.com/watch?v=undefined&t=79926s) It just asks for your name. So I'm


going to log in as myself, David, and click Login. And now notice I'm currently at the /login route. But
notice this. If I try to go to the default route, just, slash, which is where most websites live by default,
notice that I magically get redirected to log in. So somehow, my code knows, hey, if you're not logged in,
you're going to /login instead.

- [2:12:27](https://www.youtube.com/watch?v=undefined&t=79947s) Let me type in my name, David,


and click Login. And now notice I am back at slash. Chrome is sort of annoyingly hiding it, but this is the
same thing as just a single slash. And now notice it says you are logged in as David. Log out. What's cool
is notice if I reload the page, it still knows that. If I create a second tab and go to the same URL, it still
knows that.

- [2:12:47](https://www.youtube.com/watch?v=undefined&t=79967s) I could even-- I could keep doing


this in multiple tabs, it's still going to remember me on both of them as being logged in as David. So how
does that work? Especially when I click Log Out, then I get forgotten altogether. All right, so let's see how
this works. And it's some basic building blocks.

- [2:13:04](https://www.youtube.com/watch?v=undefined&t=79984s) Under my / route, notice I have


this. If there is no name in the session, redirect the user to /login. So these two lines together are what
implement that automatic redirection using HTTP 301 or 302 automatically. It's handled for me with
these two lines. Otherwise, show index.html. All right, let's go down that rabbit hole.

- [2:13:25](https://www.youtube.com/watch?v=undefined&t=80005s) What's in index.html? Well, if I


look in my-- let me look in my templates folder for my login demo and look at templates/index.html. All
right, so what's going on here? I extend layout.html, I have a block body, and then I've got some other
syntax. So we haven't seen this yet, but it's more Jinja stuff, which again, is almost identical to Python.

- [2:13:52](https://www.youtube.com/watch?v=undefined&t=80032s) If there's a name in the session


variable, then literally say you are logged in as curly braces session bracket name. And then notice this,
I've got a simple HTML link to log out via /logout. Else, if there is no name in the session, then it
apparently says you are not logged in and it leads me to an HTML link to /login and then endif.

- [2:14:14](https://www.youtube.com/watch?v=undefined&t=80054s) So again, Jinja does not rely on


indentation. Recall that HTML and CSS don't really care about indentation, only the human does. But in
code with Jinja, you need these end tags, end block, end for, end if, to make super obvious that you're
done with that thought. So session is just this magic variable that we now have access to because we've
included these two lines of code and these that handle that whole process of stamping every user's hand
with a different, unique identifier.

- [2:14:44](https://www.youtube.com/watch?v=undefined&t=80084s) If I made my code space public


and I let all of you visit the exact same URL, all of you would be logged out by default. You could all type
your own names individually, all log in at the same URL using different sessions. And in fact, I would then
see, if I go into my terminal window here and my login directory, notice the flask_session directory I
mentioned.

- [2:15:04](https://www.youtube.com/watch?v=undefined&t=80104s) And if I CD into that and type ls,


notice that I had two tabs open, or actually, I think I started the server twice. I have two files in there. I
would ultimately have one file for every one of you. And that's what's beautiful about sessions is it
creates the illusion of per user storage. Inside of my session is my name, inside of your session, so to
speak, is your name.
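
One way to picture this "illusion of per-user storage" is a lookup table keyed by each visitor's cookie value. The sketch below keeps that table in memory, whereas the lecture's setup keeps one file per session on disk; the `SessionStore` class and its method names are invented for illustration.

```python
import secrets


class SessionStore:
    """Maps each visitor's cookie value to that visitor's private dictionary."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        """Stamp a new visitor: make an ID and an empty session for them."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = {}
        return session_id

    def load(self, session_id):
        """Return the session for a returning visitor, or None if unknown."""
        return self._sessions.get(session_id)


store = SessionStore()
my_id = store.create()
your_id = store.create()

store.load(my_id)["name"] = "David"
store.load(your_id)["name"] = "Carter"

# Same code, different cookie value, different data.
print(store.load(my_id)["name"], store.load(your_id)["name"])
```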

- [2:15:26](https://www.youtube.com/watch?v=undefined&t=80126s) And the same is going to apply to


shopping carts, ultimately, as well. Let's see how login works here. My login route supports both GET and
POST, so I could play around if I want. And notice this, this login route is kind of interesting as follows. If
the user got to this route via POST, my inference is that they must have submitted a form.

- [2:15:47](https://www.youtube.com/watch?v=undefined&t=80147s) Why? Because that's how I'm


going to design the HTML form in a second. And if they did submit the form via POST, I'm going to store,
in the session, at the name key, whatever the human's name is. And then, I'm going to redirect them
back to slash. Otherwise, I'm going to show them the login form.

- [2:16:04](https://www.youtube.com/watch?v=undefined&t=80164s) So this is what's cool. If I go to this


login form, which lives at, literally, slash login, by default, when you visit a URL like that, you're visiting
via GET. And so that's why I see the form. However, notice this. The form, very cleverly, submits to itself,
as in the one route, /login, submits to that same /login, but it uses POST when you submit the form.

- [2:16:30](https://www.youtube.com/watch?v=undefined&t=80190s) And this is a nice way of having


one route but for two different types of operations or views. When I'm just there visiting /login via URL,
it shows me the form. But if I submit the form, then this logic, these three lines, kick in, and this just
avoids my having to have both an index route and a greet route, for instance.
- [2:16:50](https://www.youtube.com/watch?v=undefined&t=80210s) I can just have one route that
handles both GET and POST. How about logout? What does this do? Well, it's as simple as this. Change
whatever name is in the session to be None, which is Python's version of null, essentially, and then
redirect the user back to slash. Because now, in index.html, I will not notice a name there anymore.

- [2:17:12](https://www.youtube.com/watch?v=undefined&t=80232s) This will be false. And so I'll tell


the user instead, you are not logged in. So I want to say it's as simple as this, though I realize there's
a bunch of steps involved. This is the essence of every website on the internet that has usernames and
passwords. And we skipped the password step here, more on that in problem set nine, but this is
how every website out there remembers that you're logged in.
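
Putting the three routes together, a minimal sketch of such a login app might look like this. To stay self-contained it departs from the lecture in two labeled ways: it uses Flask's built-in signed-cookie session (which needs a `secret_key`, here a placeholder) instead of Flask-Session's file storage, and it inlines the templates as strings rather than using separate files in a templates folder.

```python
from flask import Flask, redirect, render_template_string, request, session

app = Flask(__name__)
app.secret_key = "dev-only-placeholder"  # assumption: use a real secret in practice

# Inline stand-ins for index.html and login.html.
INDEX_HTML = """
{% if session["name"] %}
    You are logged in as {{ session["name"] }}. <a href="/logout">Log out</a>.
{% else %}
    You are not logged in. <a href="/login">Log in</a>.
{% endif %}
"""

LOGIN_HTML = """
<form action="/login" method="post">
    <input autocomplete="off" autofocus name="name" placeholder="Name" type="text">
    <input type="submit" value="Log In">
</form>
"""


@app.route("/")
def index():
    # No name in the session yet? Send the user to the login form.
    if not session.get("name"):
        return redirect("/login")
    return render_template_string(INDEX_HTML)


@app.route("/login", methods=["GET", "POST"])
def login():
    # POST means the form was submitted: remember the name, then go home.
    if request.method == "POST":
        session["name"] = request.form.get("name")
        return redirect("/")
    # Otherwise (GET), just show the form.
    return render_template_string(LOGIN_HTML)


@app.route("/logout")
def logout():
    # Forget the user by clearing the name stored in their session.
    session["name"] = None
    return redirect("/")
```

Running `flask run` in the same directory and visiting / should bounce you to /login until you submit a name.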

- [2:17:35](https://www.youtube.com/watch?v=undefined&t=80255s) And how this works, ultimately, is


that as soon as you use in Python lines like this and lines like this, Flask takes care of stamping the virtual
hand of all of your users and whenever Flask sees the same cookie coming back from a user, it grabs the
appropriate file from that folder, loads it into the session global variable so that your code is now unique
to that user and their name.

- [2:18:01](https://www.youtube.com/watch?v=undefined&t=80281s) Let's do one other example with


sessions here that'll show how we might use these, now, for shopping carts. Let me go into the store
example here. Let me go ahead and run this thing first. If I run store in my same tab and go back over
here, we'll see a very ugly e-commerce site that just sells seven different books here.

- [2:18:22](https://www.youtube.com/watch?v=undefined&t=80302s) But each of these books has a


button via which I can add it to my cart. All right, well where are these books coming from? Well, let's
poke around. Let me go into my terminal window again. Let me go into this example, which is called
store, and let me open up index dot ht-- whoops. Let's open up, how about,

- [2:18:48](https://www.youtube.com/watch?v=undefined&t=80328s) books.html is the default one, not index


this time. So if I look here, notice that that route that we just saw uses a for loop in Jinja to iterate over a
whole bunch of books, apparently, and it outputs, in an H2 tag, the title of the book, and then another
one of these forms. So that's interesting. Let's go back one step.

- [2:19:08](https://www.youtube.com/watch?v=undefined&t=80348s) Let's go ahead and open up


app.py, because that must be-- excuse me, what's kicking all of this off. Notice that this file is importing
session support. It's configuring sessions down here, but it's also connecting to a store.db file. So it's
adding some SQLite. And notice this, in my / route, I'm selecting star from books, which is going to give
me a list of dictionaries, each of which represents a row of books.

- [2:19:34](https://www.youtube.com/watch?v=undefined&t=80374s) And I'm going to pass that list of


books into my books.html template, which is why this for loop works the way it does. Let's look at this
actual database. Let me increase my terminal window and do sqlite3 of store.db. .schema will show me
everything. There's not much there. It's a book-- it's a table called books with two columns, ID and title.

- [2:19:56](https://www.youtube.com/watch?v=undefined&t=80396s) Let's do select star from books


semicolon. There are the seven books, each of which has a unique ID. And you might see where this is
going. If I go to the UI and I look at each of these buttons for add to cart, just like Amazon might have,
notice that each of these buttons is just a form. And what's magical here, just like deregister, even
though I didn't highlight it at the time, there's another type of input that allows you to specify a value
without the human being able easily to change it.

- [2:20:23](https://www.youtube.com/watch?v=undefined&t=80423s) Instead of type equals text or type


equals submit, type equals hidden will put the value in the form but not reveal it to the user. So that's
how I'm saying that the ID of this book is one, the ID of this book is two, the ID of this book is
three, and so forth. And each of these forms, then, will submit, apparently, to /cart using POST and that
would seem to be what adds things to cart.

- [2:20:46](https://www.youtube.com/watch?v=undefined&t=80446s) So let's try this. Let me click on


one or two of these. Let's add the first book, add to cart. Here's my cart. Notice my route changed to
/cart. All right, let's go back and let's add the book number two. There we have that one. And let's skip
ahead to the seventh book, Deathly Hallows, and now we have all three books here.

- [2:21:06](https://www.youtube.com/watch?v=undefined&t=80466s) So what does the cart route do


at /cart? Well, let's look. If I go back to my terminal window, look at app.py and look at /cart, OK, there's
a lot going on here, but let's see. So the /cart route supports both GET or POST, which is a nice way to
consolidate things into one URL. All right, this is interesting.

- [2:21:27](https://www.youtube.com/watch?v=undefined&t=80487s) If there is not a, quote unquote,


"cart" key in session, we haven't technically seen this syntax. But long story short, these lines here do
ensure that the cart exists. What do I mean by that? It makes sure that there's a cart key in the session
global variable, and it's by default going to be an empty list.

- [2:21:46](https://www.youtube.com/watch?v=undefined&t=80506s) Why? That just means you have


an empty shopping cart. But if the user visits this route via POST and the user did provide an ID, they
didn't muck with the form in any way and try to hack into the website, they gave me a valid ID, then I'm
going to use this syntax. If session bracket cart is a list-- recall from a couple of weeks ago that dot
append just adds something to the list.

- [2:22:09](https://www.youtube.com/watch?v=undefined&t=80529s) So I'm going to add the ID to the


list and return the user to cart. Otherwise, if the user is at /cart via GET, implicitly, we just do this. Select
star from books where ID is in. And this might be syntax you recall from Pset six. It lets you look for
multiple IDs all at once, because if I have a list of session-- list of IDs in my cart, I can get all of those
books at once.
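
The cart logic walked through above can be sketched without a web server at all. Here a plain dictionary stands in for Flask's `session`, Python's built-in `sqlite3` module stands in for the lecture's database library, and the book titles are placeholders rather than the real contents of store.db. With `sqlite3`, one `?` placeholder is needed per ID, so the `IN (...)` clause is built to match the size of the cart.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for store.db with seven placeholder books."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row  # rows behave like dictionaries
    db.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    db.executemany(
        "INSERT INTO books (title) VALUES (?)",
        [(f"Placeholder Book {n}",) for n in range(1, 8)],
    )
    return db


def add_to_cart(session, book_id):
    """What the POST branch does: ensure the cart exists, then append the ID."""
    if "cart" not in session:
        session["cart"] = []
    if book_id:
        session["cart"].append(book_id)


def books_in_cart(db, session):
    """What the GET branch does: fetch every book whose ID is in the cart."""
    cart = session.get("cart", [])
    if not cart:
        return []
    placeholders = ", ".join("?" for _ in cart)  # e.g. "?, ?, ?"
    rows = db.execute(
        f"SELECT * FROM books WHERE id IN ({placeholders}) ORDER BY id", cart
    )
    return [dict(row) for row in rows]


db = make_db()
my_session = {}
for book_id in (1, 2, 7):
    add_to_cart(my_session, book_id)
print(books_in_cart(db, my_session))
```

Only the placeholders are interpolated into the SQL string; the IDs themselves are still passed as parameters, so user input never becomes part of the query text.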

- [2:22:34](https://www.youtube.com/watch?v=undefined&t=80554s) So long story short, what has


happened here? I am storing, in the cart, the books that I myself have added to my cart. My browser is
sending the same hand stamp again and again, which is how this website knows that it's me adding
these books to my cart and not you or not Carter or not Emma. Indeed, if all of us visited the same long
URL and I made it public and allowed that, then we would all have our own illusions of our own separate
carts.

- [2:22:58](https://www.youtube.com/watch?v=undefined&t=80578s) And each of those carts, in


practice, would just be stored in this flask_session directory on the server so that the server can keep
track of each of us using, again, these cookie values that are being sent back and forth via these headers.
All right. I know that's a lot, but again, it's just the new Python way of just leveraging those HTTP headers
from last week in a clever way.
- [2:23:22](https://www.youtube.com/watch?v=undefined&t=80602s) Any questions before we look at
one final set of examples? Yeah. AUDIENCE: [INAUDIBLE] understand how a log in has to do with
[INAUDIBLE]? How does it use [INAUDIBLE], how do you change [INAUDIBLE]? Because in order to use
a GET request dot [INAUDIBLE] equals, there has to be an exchange in [INAUDIBLE].

- [2:23:43](https://www.youtube.com/watch?v=undefined&t=80623s) DAVID: So I think you're asking


about using the GET and POST in the same function. So this is just a nice aesthetic, if you will. If I had to
have separate routes for GET and POST, I mean, it literally might mean I need twice as many routes in my
file. And it just starts to get a little annoying. And these days, too, in terms of user experience, this
maybe only appeals to the geek in us, but having clean URLs is actually a thing.

- [2:24:09](https://www.youtube.com/watch?v=undefined&t=80649s) You don't want to have lots of


words in the URL, it's nice if the URLs are nice and succinct and canonical, if you will. So it's nice if I can
centralize all of my shopping cart functionality in /cart only, and not in multiple routes, one for GET, one
for POST. It's a little nitpicky of me, but this is commonly done here.

- [2:24:30](https://www.youtube.com/watch?v=undefined&t=80670s) So what this code here means is


that this route, this function, henceforth will support both GET requests and POST requests. But then I
need to distinguish between whether it's GET or POST coming in. Because if it's a GET request, I want to
show the cart. If it's a POST request, I want to update the cart.

- [2:24:47](https://www.youtube.com/watch?v=undefined&t=80687s) And the simplest way to do that is


just to check this value here. In the request variable that we imported from Flask up above, you can
check what is the current type of request. Is it a GET, is it a POST, or is it something else altogether?
There are other verbs. If it's a POST, that must mean, because I created the web form that uses POST,
that the user clicked the Add to Cart button.

- [2:25:11](https://www.youtube.com/watch?v=undefined&t=80711s) Otherwise, if it's not POST, it's


implicitly going to be logically GET. Then, I just want to show the user the contents of the cart and I use
these lines instead. So it's just one way of avoiding having two routes for two different HTTP verbs. You
can combine them so long as you have a check like this.

- [2:25:29](https://www.youtube.com/watch?v=undefined&t=80729s) If I really wanted to be pedantic, I


could do this: elif request.method == "GET". This would be more symmetric, but it's not really necessary,
because I know there's only two possibilities. Hope that helps. All right, let's do one final set of examples
here that's going to tie the last of these features together to something that you probably see quite
often in real-world applications.

- [2:25:56](https://www.youtube.com/watch?v=undefined&t=80756s) And that, for better or for worse,


is now going to involve tying back in some JavaScript from last week. The goal at hand of these examples
is not to necessarily master how you yourself would write the Python code, the SQL code, the JavaScript
code, but just to give you a mental model for how these different languages work.

- [2:26:11](https://www.youtube.com/watch?v=undefined&t=80771s) So that for final projects,


especially if you do want to add JavaScript functionality, much more interactive user interface, you at
least have the bare bones of a mental model for how you can tie these languages together. Even though
our focus, generally, has been more on Python and SQL than on JavaScript from last week.
- [2:26:28](https://www.youtube.com/watch?v=undefined&t=80788s) Let me go ahead and open up an
example called shows, version zero of this. And let me do Flask run. And let me go into my URL here and
see what this application looks like by default. This has just a simple query text box with a search box.
Let's take a look at the HTML that just got sent to my browser.

- [2:26:47](https://www.youtube.com/watch?v=undefined&t=80807s) All right, there's not much going


on here at all. So there's a form whose action is /search. It's going to submit via GET. It's going to use a q
parameter, just like Google it seems, and submit it. So this actually looks like the Google form we did last
week. So let's see what goes on here.

- [2:27:04](https://www.youtube.com/watch?v=undefined&t=80824s) Let me search for something like


cat. Enter. OK, so it looks like-- all right, so this is actually a somewhat familiar file. What I've gone ahead
and done is I've grabbed all of the titles of TV shows from a couple of weeks ago when we first
introduced SQL, and I loaded them into this demo so that you can search by keyword for any word you
want.

- [2:27:23](https://www.youtube.com/watch?v=undefined&t=80843s) I just searched for cat. If we were


to do this again with dog, we would see all the titles of TV shows that contain D-O-G, dog, as a substring
somewhere, and so forth. So this is a traditional way of doing this. Just like in Google, it uses /search?q=cat,
/search?q=dog, and so forth. How does that work? Well, let's just take a quick look at app.py here.

- [2:27:45](https://www.youtube.com/watch?v=undefined&t=80865s) Let me go into my zero example


here, show zero, and open up app.py and see what's going on. All right, very simple. Here's the form,
that's how we started today. And here is the /search route. Well, what's going on here? This gets a little
interesting. So I first select a whole bunch of shows by doing this.

- [2:28:06](https://www.youtube.com/watch?v=undefined&t=80886s) Select star from shows, where


title like question mark. And then I'm using some percent signs from SQL on both the left and the right,
and I'm plugging in whatever the user's input was for q. If I didn't use like and I used equal instead, I
could get rid of these percent signs, but then it would have to be a show called cat or
called dog as opposed to it being like cat or like dog.
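
The difference between `LIKE` with percent signs and plain equality can be tried out with a small sketch. It uses Python's built-in `sqlite3` module rather than the lecture's database library, an in-memory table rather than shows.db, and three made-up titles as sample data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
db.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [("Cat Tales",), ("The Dog Hour",), ("Scattered",)],  # made-up titles
)


def search(q):
    """Titles containing q anywhere: % on both sides means 'any characters'."""
    rows = db.execute(
        "SELECT * FROM shows WHERE title LIKE ? ORDER BY id", ("%" + q + "%",)
    )
    return [dict(row) for row in rows]


def search_exact(q):
    """Titles equal to q: no wildcards, so the whole title has to match."""
    rows = db.execute("SELECT * FROM shows WHERE title = ? ORDER BY id", (q,))
    return [dict(row) for row in rows]


print([show["title"] for show in search("cat")])        # substring matches
print([show["title"] for show in search_exact("cat")])  # whole-title matches only
```

The percent signs are added to the parameter in Python, not typed into the SQL string, so the user's input is still passed safely through the `?` placeholder.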

- [2:28:32](https://www.youtube.com/watch?v=undefined&t=80912s) This whole line returns to me a


list of dictionaries, each of which represents a show in the database. And then, I'm passing all of those
shows to a template called search.html. So let's just follow that breadcrumb, let's open up shows dot--
sorry, search.html. All right, so this is where templating gets cool.

- [2:28:50](https://www.youtube.com/watch?v=undefined&t=80930s) So I just passed back hundreds of


results, potentially, but the only thing I'm outputting is an unordered list and using a Jinja for loop and li
tag containing the titles of each of those shows. And just to prove that this is indeed a familiar data set
and I actually simplified it a bit, if I look at

- [2:29:10](https://www.youtube.com/watch?v=undefined&t=80950s) shows.db with SQLite, I threw away all


the other stuff like ratings and actors and everyone else and I just have, for instance, select star from
shows limit 10, just so we can see 10 of them. There are 10 of the shows from that database. So that's all
that's in the database itself. So it would look like this is a pretty vanilla web application.

- [2:29:30](https://www.youtube.com/watch?v=undefined&t=80970s) It uses GET, it submits it to the
server, the server spits out a response, and that response, then, looks like this, which is a huge number
of li tags, one for each cat or one for each dog match. But everything else comes from a layout.html. All
the stuff at the top and at the bottom. All right, so these days, though, we're in the habit of seeing
autocomplete.

- [2:29:51](https://www.youtube.com/watch?v=undefined&t=80991s) And you start typing something


and you don't have to hit Submit, you don't have to click a button, you don't have to go to a new page.
Web applications, nowadays, are much more dynamic. So let's take a look at this version one of this
thing. Let me go into shows one and close my previous tabs and run flask run in here.

- [2:30:10](https://www.youtube.com/watch?v=undefined&t=81010s) And it's almost the same thing,


but watch the behavior change a little bit. I'm reloading the form, there's no button now. So gone is the
need for a submit button. I want to implement autocomplete now. So let's go ahead and type in C. OK,
there's every show that starts with C.

- [2:30:26](https://www.youtube.com/watch?v=undefined&t=81026s) A, there's every show that has C-A


in it, rather. T, there's every show with C-A-T in it. I can start it again and do dog, but notice how
instantaneous it was. And notice my URL never changed, there's no /search route, and it's just
immediate. With every keystroke, it is searching again and again and again.

- [2:30:44](https://www.youtube.com/watch?v=undefined&t=81044s) That's a nice UX, user experience,


because it's immediate. This is what users are used to these days. But if I look at the source code here,
notice that in the source code, there's just an empty UL by default but there is some fancy JavaScript
code. So let's see what's going on here. This JavaScript code is doing the following.

- [2:31:04](https://www.youtube.com/watch?v=undefined&t=81064s) Let me zoom in a little bit more.


This JavaScript code is first selecting, with query selector, which you used this past week, quote unquote
"input," all right, so that's just getting the text box. Then it's adding an event listener to that input for
the input event. We didn't talk about this last week, but literally, when you provide any kind of input by
typing, by pasting, by any other user interface mechanism, it triggers an event called input.

- [2:31:30](https://www.youtube.com/watch?v=undefined&t=81090s) So similar to key press or key up. I


then have a function, no worries about this async function for now. Then what do I do inside of this? All
right, so this is new, and this is the part where we'll just focus on the ideas and not the syntax. JavaScript,
nowadays, comes with a function called fetch that allows you to GET or POST information to a server
without reloading the whole page.

- [2:31:50](https://www.youtube.com/watch?v=undefined&t=81110s) You can sort of secretly do it


inside of the page. What do I want to fetch? slash search question mark q equals whatever the value of
that input is. When I get back a response, I want to get the text of that response and store it in a variable
called shows. And I'm deliberately bouncing around, ignoring special words like await and await here,
but for now, just focus on what came back.
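
To make the shape of that request concrete, here is a minimal sketch in Python (not the lecture's JavaScript) of how a URL like /search?q=... is put together and taken apart again. The helper names search_url and query_value are hypothetical, and the percent-encoding step is an addition that the spoken description does not mention.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(value):
    # urlencode escapes spaces, ampersands, etc. so they survive in the URL.
    return "/search?" + urlencode({"q": value})


def query_value(url):
    # What the server sees: parse the query string back and pull out q.
    # parse_qs drops blank values, so an empty q falls back to "".
    return parse_qs(urlsplit(url).query).get("q", [""])[0]
```

Each keystroke would produce a new URL of this form, for instance search_url("c"), then search_url("ca"), then search_url("cat").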

- [2:32:11](https://www.youtube.com/watch?v=undefined&t=81131s) A response came back from the


server, I'm getting the text from it, storing it in a variable called shows. What am I then doing? I'm using
query selector to select my UL, which is empty by default, and I'm changing its innerHTML to be equal to
the shows that came back from the server. So let's poke around.

- [2:32:30](https://www.youtube.com/watch?v=undefined&t=81150s) Here's where, again, developer
tools are quite powerful. Let me go ahead and reload this page to get rid of everything. And let me now
open up inspect. Let me go to the Network tab and let's just sniff the traffic going between my browser
and server. I'm going to search for C.

- [2:32:49](https://www.youtube.com/watch?v=undefined&t=81169s) Notice that immediately triggered


an HTTP request to /search?q=c. So I didn't even finish my cat thought, but notice what came back. A
bunch of response headers, but let's actually click on the raw response. This is literally the response from
the server, just a whole bunch of li tags. No UL, no HTML, no title, no body, nothing. Just li tags. And we
can actually simulate this.

- [2:33:13](https://www.youtube.com/watch?v=undefined&t=81193s) Let me manually go to that same


URL, q=c, Enter. We are just going to get back-- whoops, sorry. /search?q=c, we are just going
to get back this stuff, which if I view source, it's not even a complete web page. The browser is trying to
show it to me as a complete web page with bullets, but it's really just partial HTML.
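
A minimal sketch of what such a partial response looks like when built by hand in Python, assuming each show is a dictionary with a title key as described earlier. The function name render_items is hypothetical; the lecture's version produces the same kind of output with a Jinja for loop in a template.

```python
from html import escape


def render_items(shows):
    # One <li> per show, with no surrounding <ul>, <html>, or <body>:
    # the page that asked for it already has the empty <ul> waiting.
    return "".join(f"<li>{escape(show['title'])}</li>" for show in shows)
```

The result is a fragment, not a page, which is exactly why it can be dropped straight into the empty UL.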

- [2:33:34](https://www.youtube.com/watch?v=undefined&t=81214s) But that's perfect, because this is


literally what I essentially want my Python code to copy paste into the otherwise empty UL tag. And
that's what this JavaScript code then, here, is doing. Once it gets back that response from the server, it's
using these lines of code to plug all of those li's into the UL after the fact.

- [2:33:56](https://www.youtube.com/watch?v=undefined&t=81236s) Again, changing the so-called


DOM. But there's a slightly better way to do this because, honestly, this is not the best design. Because if
you've got a hundred shows or more, you're sending all of these tags unnecessarily. Why do I need to
send all of these stupid HTML tags? Why don't I just create those when I'm ready to create them? Well,
here's the final flourish.

- [2:34:16](https://www.youtube.com/watch?v=undefined&t=81256s) Whenever making a web


application nowadays, where client and server keep talking to one another (Google Maps does this,
Gmail does this, literally every cool application nowadays: you load the page once and then it keeps on
interacting with you without you reloading or having to change the URL), let's actually use a format
called JSON, JavaScript Object Notation, which is to say there's just a more efficient, better
designed way to send that same data.

- [2:34:42](https://www.youtube.com/watch?v=undefined&t=81282s) I'm going to go into shows two


now and do Flask run. And I'm going to go back to my page here. The user interface is exactly the same,
and it still works exactly the same. Here's C, C-A, C-A-T, and so forth. But let's see what's coming back
now. If I go to /search?q=cat, Enter, notice that I get this crazy-looking syntax.

- [2:35:09](https://www.youtube.com/watch?v=undefined&t=81309s) But the fact that it's so compact is


actually a good thing. This is actually going to-- let me format it a little nicer, well, or a little worse. This is
what's called JavaScript Object Notation. In JavaScript, an angle-- a square bracket means here comes an
array. In JavaScript, a curly bracket says here comes an object, AKA, a dictionary.

- [2:35:33](https://www.youtube.com/watch?v=undefined&t=81333s) And you might recall from-- did


we do? Yes, sort of recall that you can now have keys and values in JavaScript notation using colons like
this. So long story short, cryptic as this is to you and me and not very human friendly, it's very machine
friendly. Because for every title in that database, I get back its ID and its title, its ID and its title, its ID and
its title.
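
As a minimal sketch of that format, assuming Python's built-in json module and two made-up rows (the real ones would come from the database query), here is the same list of dictionaries in its compact form and in a more readable form.

```python
import json

# Made-up rows; each dictionary has the id and title keys described above.
shows = [
    {"id": 1, "title": "Cat"},
    {"id": 2, "title": "Cat Stories"},
]

# Square brackets mark the list (array); curly braces mark each dictionary
# (object); colons separate keys from values.
compact = json.dumps(shows, separators=(",", ":"))

# Same data, indented for human eyes. Parsing either string gives back shows.
pretty = json.dumps(shows, indent=2)
```

The compact string is what travels over the network; the indented one is only a convenience when reading it.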

- [2:36:02](https://www.youtube.com/watch?v=undefined&t=81362s) And this is a very generic format


that an API, an application programming interface, might return to you. And this is how APIs, nowadays,
work. You get back very raw textual data in this format, JSON format, and then you can write code that
actually programmatically turns that JSON data into any language you want, for instance, HTML.

- [2:36:22](https://www.youtube.com/watch?v=undefined&t=81382s) So here's the third and final


version of this program. I, again, select my input. I, again, listen for input. I then, when I get input, call
this function. I fetch /search?q= whatever that input was, C or C-A or C-A-T. I then wait for the
response, but instead of getting text, I'm calling this other function that comes with JavaScript these
days, called json, that just parses that.

- [2:36:47](https://www.youtube.com/watch?v=undefined&t=81407s) It turns it into a dictionary for me,


or really a list of dictionaries for me, and stores it in a variable called shows. And this is where you start
to see the convergence of HTML with JavaScript. Let me initialize a variable called html to nothing,
quote unquote, using single quotes, but I could also use double quotes.

- [2:37:04](https://www.youtube.com/watch?v=undefined&t=81424s) This is JavaScript syntax for a loop.


Let me iterate over every ID in the shows list that I just got back from the server, that big chunk of JSON
data. Let me create a variable called title that's equal to the title of the show at that ID. But
for reasons we'll come back to, let me replace a couple of scary characters.

- [2:37:26](https://www.youtube.com/watch?v=undefined&t=81446s) Then let me dynamically add to


this variable, an li tag, the actual title, and a close li tag. And then very lastly, after this for loop, let me
update the UL's innerHTML to be the HTML I just created on the fly. So in short, don't worry too much
about the syntax because you won't need to use this unless you start playing with more advanced
features quite soon.
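
The loop described above can be sketched in Python rather than JavaScript, assuming the server sent back a JSON list of dictionaries with a title key. The function name build_list_items is hypothetical, and the two characters replaced here (the ampersand and the less-than sign) are an assumption about which "scary characters" are meant.

```python
import json


def build_list_items(payload):
    shows = json.loads(payload)  # a list of dictionaries, one per show
    html = ""  # start with nothing and grow the string inside the loop
    for show in shows:
        # Replace & first, so the & inside &lt; is not escaped a second time.
        title = show["title"].replace("&", "&amp;").replace("<", "&lt;")
        html += "<li>" + title + "</li>"
    return html
```

The replacements matter because a title containing a less-than sign would otherwise be read by the browser as the start of a tag once the string is assigned to the list.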

- [2:37:49](https://www.youtube.com/watch?v=undefined&t=81469s) But what we're doing is, with


JavaScript, we're creating a bigger and bigger and bigger string of HTML containing all of the open
brackets, the li tags, the closed brackets, but we're just grabbing the raw data from the server. And so in
fact in problem set nine, you're going to use a real world third party API, application programming
interface, for which you sign up.

- [2:38:08](https://www.youtube.com/watch?v=undefined&t=81488s) The data you're going to get back


from that API is not going to be show titles, but actually stock quotes and stock ticker symbols and the
prices at which stocks were last bought or sold, and you're going to get that data back in JSON
format. And you're going to write a bit of code that's then going to convert that to the requisite HTML on
the page.

- [2:38:28](https://www.youtube.com/watch?v=undefined&t=81508s) So the final result here is literally


the kind of autocomplete that you and I see and take for granted every day, and that's ultimately how it
works. HTML and CSS are used to present the data, your so-called view. Python might be used to send or
get the data on the backend server. And then lastly, JavaScript is going to be used to make things
dynamic and interactive.

- [2:38:49](https://www.youtube.com/watch?v=undefined&t=81529s) So I know that's a whole bunch of
building blocks, but the whole point of problem set nine is to tie everything together and set the stage for
hopefully a very successful final project. Why don't we go ahead and wrap up there, and we'll see you
one last time next week for emoji.
