Binary Conversion
1 byte = 8 bits
1. Alice has 600 MB of data. Bob has 2000 MB of data. Will it all fit on Alice's 4 GB thumb drive?
2. Alice has 100 small images, each of which is 500 KB. How much space do they take up overall in MB?
3. Your ghost hunting group is recording the sound inside a haunted classroom for 20 hours as MP3 audio files. About how
much data will that be, expressed in GB?
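One way to check estimates for the three questions above is a quick calculation. The sketch below assumes round decimal units (1 GB is about 1000 MB, 1 MB is about 1000 KB). It also assumes MP3 audio takes roughly 1 MB per minute, which is a common rule of thumb and not a figure stated on this worksheet.

```python
# Rough estimates for the three word problems.
# Assumptions (not stated on the worksheet): decimal units,
# and about 1 MB per minute of MP3 audio.
KB_PER_MB = 1000
MB_PER_GB = 1000
MP3_MB_PER_MINUTE = 1  # rule-of-thumb assumption

# 1. Do 600 MB + 2000 MB fit on a 4 GB drive?
total_mb = 600 + 2000
drive_mb = 4 * MB_PER_GB
fits = total_mb <= drive_mb

# 2. 100 images at 500 KB each, expressed in MB.
images_mb = 100 * 500 / KB_PER_MB

# 3. 20 hours of MP3 audio, expressed in GB.
audio_gb = 20 * 60 * MP3_MB_PER_MINUTE / MB_PER_GB

print(fits, images_mb, audio_gb)
```

A different assumed MP3 rate changes the third answer proportionally, so a rough estimate is all that can be expected here.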
2. 4000 bytes = ____________ KB
3. 6000 bytes = ___________ KB
4. 75000 bytes = ____________ KB
5. 26000 bytes = _________ KB
6. 360,000 bytes = _____ KB
7. 2,000,000 bytes = ________________ MB
8. 50,000,000,000 bytes = _________________ GB
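The conversions above can be checked with a short sketch. It assumes the round decimal meanings of the prefixes (1 KB = 1000 bytes, 1 MB = 1,000,000 bytes, 1 GB = 1,000,000,000 bytes), which is what the round numbers in the exercises suggest. The helper name convert_bytes is made up for this illustration.

```python
# Convert a byte count to a larger unit using round decimal factors.
# Assumption: 1 KB = 1000 bytes, 1 MB = 1000 KB, 1 GB = 1000 MB.
BYTES_PER_UNIT = {"KB": 1000, "MB": 1000**2, "GB": 1000**3}


def convert_bytes(num_bytes, unit):
    """Return num_bytes expressed in the given unit ("KB", "MB", or "GB")."""
    return num_bytes / BYTES_PER_UNIT[unit]


print(convert_bytes(4000, "KB"))
print(convert_bytes(2_000_000, "MB"))
print(convert_bytes(50_000_000_000, "GB"))
```

Under the 1024-based meaning of the prefixes the results would come out slightly smaller.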
Overview
In this lesson students are introduced to the standard units for measuring the sizes of digital files, from a single byte, all the way up to
terabytes and beyond. Students begin the lesson by comparing the size of a plain text file containing “hello” to a Word document with the
same contents. Students are introduced to the units kilobyte, megabyte, gigabyte, and terabyte, and research the sizes of files they make
use of every day, using the appropriate terminology. This lesson foreshadows an investigation of compression as a means for combatting
the rapid growth of digital data.
Purpose
The simple purposes of this lesson are:
The 8-bit byte has become the de-facto fundamental unit with which we measure the “size” of data on computers, and in fact, today most
computers only let you save data as combinations of whole bytes; even if you only want to store 1 bit of information, you have to use a
whole byte to do it. And many computer systems will require you to store even more than that. Messages sent over the Internet are also
typically structured in whole bytes, with their parts located at byte offsets.
Paralleling the explosion of computing power and speed, the sheer size of the digital data now created and consumed every day is
staggering. Units of measure (terabytes) that previously seemed unfathomably large are now making their way into personal computing.
This rapid growth of digital data presents many new opportunities and also poses new challenges to engineers and programmers. The
implications of so-called Big Data will not be investigated until later in the course, but it's good and interesting to be thinking about the size
of things now.
Agenda
Getting Started (10 mins)
Review worksheet
Foreshadow Compression
Assessment
Objectives
Students will be able to:
Preparation
You should verify that you know how to look at the sizes of files on computers that your students are using (see activity).
For the getting started activity you might want a word processing program (such as MS Word) and a plain text editor (such as
Notepad or TextEdit) open and ready.
The teaching remarks and content corners in this lesson contain lots of little bits of history that you might choose to share at
various points in the lesson.
Links
Heads Up! Please make a copy of any documents you plan to share with students.
Teaching Guide
Getting Started (10 mins)
Content Corner
Why is a Byte 8 bits?: The 8-bit byte was not always standard. Computers used many different "byte" sizes over the course of history,
depending on hardware and how addressable memory worked. However, much of the early computing world relied on representing data
and computer instructions encoded in ASCII text where every character is 8 bits. Thus, 8-bits was such a common chunk-size for
representing information that it stuck and they gave it its own name - byte.
There are various accounts about why it was called a “byte” but most point to early days at IBM where “bite” was used to refer to groups
of 8 bits that a computer was processing, as in it could “bite” off 8 bits at a time. The spelling was changed to “byte” to avoid confusion with
“bit”.
Bytes became the fundamental unit with which we measure the “size” of data on computers, and in fact, today most computers only let you
save data as combinations of whole bytes; even if you only want to store 1 bit of information, you have to use a whole byte to do it.
Remarks
As we start a new unit about Data and Digital Information we need to get familiar with terminology about data and different types of data
files.
Vocabulary: Recall that a single character of ASCII text requires 8 bits. The technical term for 8 bits of data is a byte.
A byte is the standard fundamental unit (or “chunk size”) underlying most computing systems today. You may have heard "megabyte",
"kilobyte", "gigabyte", etc., which are all different amounts of bytes. We're going to learn more about them today.
If a single ASCII character is one byte then if we were to store the word “hello” in a plain ASCII text file in a computer, we would expect it to
need 5 bytes (or 40 bits) of memory.
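The byte count above can be checked in a few lines of Python. This is a minimal sketch that encodes the word as ASCII and counts the resulting bytes and bits. The variable names are only illustrative.

```python
# Encode "hello" as ASCII and count how much space the raw text needs.
word = "hello"
data = word.encode("ascii")  # one byte per ASCII character

num_bytes = len(data)
num_bits = num_bytes * 8  # 8 bits in each byte

# Show each byte as its 8-bit pattern.
bits = " ".join(format(b, "08b") for b in data)

print(num_bytes, "bytes,", num_bits, "bits")
print(bits)
```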
What about a Microsoft Word document that contains the single word "hello"? How many more bytes will a Word document require to store
the word “hello” than a plain text document?
Discuss: Have students silently make their prediction, then share with a partner, then share with the group. Prompt a couple students to
share why they chose the size they did.
Teaching Tip
Try a Live Demo: If you wish, it might be more fun to create these files in front of your students, saving them on the desktop for a quick
demo. To make a plain ASCII text file you’ll need to use the correct program:
Demonstrate: Do a live demo where you show the size of the different files. Here are some files you can download to use.
Content Corner
NOTE: A 5-byte file is so small that some computers won't allocate a chunk of memory that small. For example you might see something
like this:
This indicates that even though the file is 5 bytes, it is taking up 4 kilobytes of storage on your computer.
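The gap between a file's size and the space it occupies can be sketched as rounding up to whole allocation blocks. The 4096-byte block size below is an assumption chosen to match the 4 kilobyte figure. Real systems use different block sizes, and some store very small files in other ways.

```python
import math

# Assumed allocation block size (4 KB); actual values vary by system.
BLOCK_SIZE = 4096


def size_on_disk(file_size, block_size=BLOCK_SIZE):
    """Round a file size up to a whole number of allocation blocks."""
    return math.ceil(file_size / block_size) * block_size


print(size_on_disk(5))     # a 5-byte file still occupies one full block
print(size_on_disk(4097))  # one byte over a block needs a second block
```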
To find the actual size of a file on your computer, do one of the following:
In general, the Word Doc should be thousands of times larger than the plain text. For the files above:
hello.txt - 5 bytes
hello.docx - 21,969 bytes
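For a demo that does not rely on the file browser, a file's size can also be read from a program. The sketch below writes "hello" to a temporary plain text file, reads its size with os.path.getsize, and compares it with the 21,969 bytes reported above for hello.docx. The temporary folder and variable names are illustrative only.

```python
import os
import tempfile

# Write the word "hello" to a plain ASCII text file in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hello.txt")
    with open(path, "w", encoding="ascii", newline="") as f:
        f.write("hello")
    txt_size = os.path.getsize(path)  # size of the file contents in bytes

# Size of the sample hello.docx given in this lesson.
docx_size = 21_969

ratio = docx_size / txt_size
print(txt_size, "bytes of text; the .docx is about", round(ratio), "times larger")
```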
Remarks
The big difference in file size between .txt and .docx is due to the extensive formatting information included along with the actual text in
.docx. Modern data files typically measure in the thousands, millions, billions or trillions of bytes. Let's get a little practice looking at files
and how big they are.
Activity (30 mins)
Content Corner
There are some discrepancies in common usage of the kilo, mega, giga prefixes.
It's convenient within the computer to organize things in groups of powers of 2. For example, 2^10 is 1024, and so a program might group
1024 items together, as a sort of "round" number of things within the computer. The term "kilobyte" above refers to this group size of 1024
things. However, people also group things by thousands -- 1 thousand or 1 million items.
There's this problem with the word "megabyte": does it mean 1024 * 1024 bytes, i.e. 2^20, which is 1,048,576, or does it mean exactly 1
million, 1000 * 1000? It's just a 5% difference, but marketers tend to prefer the 1 million interpretation, since it makes their hard drives etc.
appear to hold a little bit more. In an attempt to fix this, the terms "kibibyte" "mebibyte" "gibibyte" "tebibyte" have been introduced to
specifically mean the 1024 based units (see wikipedia kibibyte article). These terms do not seem to have caught on very strongly thus far.
If nothing else, remember that terms like "megabyte" have this little wiggle room in them between the 1024 and 1000 based meanings. For
purposes of CS Principles the distinction is not important - "about a million bytes" is a fine, close-enough interpretation for "megabyte".
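To see how much wiggle room there is, the 1024-based and 1000-based meanings can be compared directly. The sketch below is only an illustration; the function name percent_gap is made up for this example.

```python
# Compare the 1024-based and 1000-based meaning of each prefix.
def percent_gap(power):
    """Percent by which 1024**power exceeds 1000**power."""
    binary = 1024 ** power
    decimal = 1000 ** power
    return (binary - decimal) / decimal * 100


for name, power in [("kilo", 1), ("mega", 2), ("giga", 3), ("tera", 4)]:
    print(f"{name}: {1024 ** power:,} vs {1000 ** power:,} "
          f"({percent_gap(power):.1f}% difference)")
```

The gap grows with each larger prefix, which is why the difference is more noticeable on large drives than on small files.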
Teaching Tip
Finding Solutions: Note that answers to 3 of the 6 questions on the activity guide can be found on the Stanford CS 101 page linked to in
the activity guide.
Perfect accuracy is not important for some sections in this activity, but using the correct terminology and achieving a rough estimate of size
(one million bytes vs. one billion) is important. Encourage students to practice using terms like megabyte, gigabyte, and terabyte to gain
comfort with them.
Has questions and space for students to write answers to questions like:
There are 6 practice questions on the 2nd page of the activity guide.
Wrap-up
Review worksheet
Share: Provide students an opportunity to clear up any remaining confusion and share interesting pieces of information they came across.
Foreshadow Compression
Teaching Tip
Time Saving Tip: Time permitting you could do the warm up activity from the next lesson (Text Compression) here. That warm up activity
asks students to write down common abbreviations they use when sending text messages to friends and family, and then asks why they
do that. The answer is compression: to save time and space.
Remarks
As you have seen, data files can grow very quickly in size. In the modern world there is a lot of data around us and usually we want it
transmitted over the Internet.
There is a problem though: If you want to transmit a lot of data you are limited by the speed of your Internet connection. Even if you have a
fast Internet connection there is a physical limit to how fast you can transmit bits.
What if the data you want to send is big enough that it takes an unreasonable amount of time to transmit it, even with a really fast Internet
connection? Assuming you can't make the Internet connection any faster, could you still transmit the data faster somehow?
The answer is yes and it's probably something you've done, or do every day!
Assessment
Standards Alignment
CT - Computational Thinking
2.1 - A variety of abstractions built upon binary sequences can be used to represent all digital data.
3.3 - There are trade-offs when representing information as digital data.
Below is a list of the accepted disk drive space values. It is important to realize that not all manufacturers and
developers list their values using binary, which is base 2. For example, a manufacturer may list a product's capacity as
one gigabyte meaning 1,000,000,000 bytes (a metric value) rather than 1,073,741,824 bytes (a gibibyte). For this
page, we are using the "common names" and listing all values in base 2.
Note
All values are listed as whole numbers, which means a GB shows it can only contain one 650 MB CD. Technically, 1
GB could hold 1.5753 CDs worth of data, but this document isn't meant to show you how many "parts" of an object a
value can hold. Therefore, we are omitting decimal values. More plainly, you can only fit one complete 650 MB CD on a
1 GB drive since two full 650 MB discs exceed 1 GB.
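The whole-number rule in this note amounts to integer division. A minimal sketch, assuming the base-2 values used on this page (1 GB = 1024 MB, 1 TB = 1024 GB):

```python
# Count how many complete discs fit, dropping any fractional part.
MB_PER_GB = 1024  # base-2 convention used on this page
GB_PER_TB = 1024

CD_MB = 650
BLURAY_GB = 25

whole_cds = MB_PER_GB // CD_MB           # complete 650 MB CDs in 1 GB
fractional_cds = MB_PER_GB / CD_MB       # the value with its decimal part
whole_blurays = GB_PER_TB // BLURAY_GB   # complete 25 GB Blu-ray discs in 1 TB

print(whole_cds, fractional_cds, whole_blurays)
```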
Tip
Except for a bit and a nibble, all values explained below are in bytes and not bits. For example, a kilobyte (KB) is
different than a kilobit (Kb). When referring to storage, bytes are used whereas data transmission speeds are measured
in bits.
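Because file sizes are given in bytes and connection speeds in bits per second, a factor of 8 appears whenever the two are combined. The sketch below uses hypothetical round numbers (a 1,000,000-byte file and a 1,000,000 bits-per-second link) purely for illustration.

```python
BITS_PER_BYTE = 8


def transfer_seconds(file_bytes, link_bits_per_second):
    """Estimate the time to send a file, ignoring any network overhead."""
    return file_bytes * BITS_PER_BYTE / link_bits_per_second


# Hypothetical example: a 1,000,000-byte file over a 1,000,000 bit/s link.
seconds = transfer_seconds(1_000_000, 1_000_000)
print(seconds, "seconds")
```

The file takes 8 seconds rather than 1 because each byte is 8 bits.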
Bit
Nibble
A nibble is 4 bits.
Byte
Today, a byte is 8 bits.
Kilobyte (KB)
2 or 3 paragraphs of text.
Megabyte (MB)
Gigabyte (GB)
One 650 MB CD.
Terabyte (TB)
40 Blu-ray discs of 25 GB each.
Petabyte (PB)
A petabyte is 1,125,899,906,842,624 (2^50) bytes, 1,024 terabytes, 1,048,576 gigabytes, or 1,073,741,824 megabytes.
Exabyte (EB)
An exabyte is 1,152,921,504,606,846,976 (2^60) bytes, 1,024 petabytes, 1,048,576 terabytes, 1,073,741,824 gigabytes,
or 1,099,511,627,776 megabytes.
Zettabyte (ZB)
Yottabyte (YB)
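Since every unit in this list is 1,024 times the one before it, the base-2 byte counts can be generated from powers of 2. A minimal sketch, with the helper name unit_bytes chosen only for this example:

```python
# Each unit is 2**10 (1,024) times the previous one, starting from the kilobyte.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def unit_bytes(unit):
    """Return the number of bytes in one of the given base-2 unit."""
    step = UNITS.index(unit) + 1  # KB is 2**10, MB is 2**20, and so on
    return 2 ** (10 * step)


for unit in UNITS:
    print(f"{unit}: {unit_bytes(unit):,} bytes")
```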