UTeach CS Principles
Table of Contents
Welcome to Computer Science Principles
UNIT 1: Computational Thinking
PROJECT: Password Generator Project
Password Generator Project
TOPIC: Algorithmic Thinking
Problem Solving
Building Blocks
Problem Solving
Flow Patterns
Flow Patterns
Flowcharts
Algorithms
CODING SKILLS: Encryption
Encryption
TOPIC: Cybersecurity
Confidentiality
Integrity
Availability
Social Engineering
BIG PICTURE: Electronic Voting
Electronic Voting
TOPIC: Programming Languages
Grammar, Vocabulary, and Syntax
Clarity and Ambiguity
Artificial Languages
High-level vs. Low-level Languages
Idea to Execution
Idea to Execution
CODING SKILLS: Pseudocode
Pseudocode
Pseudocode for the AP Exam
TOPIC: Solvability and Performance
Decidability and Efficiency
Moore's Law
Heuristics
Distributed Computing
PROJECT: Password Generator Project
Password Generator Project: Rubric Check
UNIT 2: Programming
PROJECT: Scratch Programming Project
Scratch Programming Project
BIG PICTURE: The Who, What, and Why of Programming
Who Programs?
Why Program?
What is a Program?
TOPIC: Visual Programming
Welcome to Scratch
Personal Scratch Pages
The Cat's Meow
Save Early and Often
Programming with Blocks
Experimenting with "play" Blocks
Different Ways to Broadcast
Remixing Scratch Projects
Remixing Scratch Projects
CODING SKILLS: Choreography Notation
Let's Dance
Choreography
Animated Movie
TOPIC: Program State
User Input
Stored State
User Input and Interaction
Show Me Your State
Text Input
Variables
Variables
Names Are Important
Game of Tag
Randomness
Custom Variables
Drawing Commands
Reviewing Variables
TOPIC: Selection Statements
“if...else” Statements
Decisions
How Many Days...?
Switching and Nesting
Quiz Show
Quiz Show
PROJECT: Scratch Programming Project
Scratch Project: Documentation
TOPIC: Repetition
Repeat
Repeat
Repeat After Me
Tempo
Regular Polygon Generator
Repeat Until
Repeat Until
Conditional Loops Compared
Draw a "Squiral"
Loops and Variables
Loops and Variables Mini-Project
Option I: Draw a Picture
Option II: Electronic Keyboard
Option III: Countdown
TOPIC: Procedures
Abstraction
Procedures
Procedures in Scratch
CODING SKILLS: Rock, Paper, Scissors
Rock, Paper, Scissors
PROJECT: Scratch Programming Project
Scratch Project: Rubric Check
UNIT 3: Data Representation
PROJECT: Unintend'o Controller Project
Unintend'o Controller Project
TOPIC: Binary Encoding of Information
Binary
What Is Binary?
Twenty Questions
State Space
High/Low Guessing
Base Conversions
The Amazing Binari
Binary Finger Counting
ASCII vs. Unicode
Alphanumeric Representation
Digital Scavenger Hunt
Reading and Writing in ASCII
Unicode vs. ASCII
CODING SKILLS: Binary Birthday Cake
Binary Birthday Cake
Floating Point Numbers
TOPIC: Hardware Abstraction
Logic Gates and Hardware Abstraction
PROJECT: Unintend'o Controller Project
Unintend'o Project: Binary Mapping
TOPIC: Digital Approximations
Digitization
Variable vs. Fixed-Width Encodings
Analog vs. Digital Data
Approximating Physical Media
Digital vs. Analog
Perfect Copies
Digital Copies
Perfect Imperfection
BIG PICTURE: Legality of Reselling Digital Music
Reselling Digital Music
PROJECT: Unintend'o Controller Project
Unintend'o Project: Programming
TOPIC: Lists
Making a List
Making a List
Weird Cases in Lists
Reading a List
Processing a List
Processing a List
Index Variables
Remove from a List
Sentences as Lists
Sorting a List
Swaps
Reorder
Lists in Action
MultiSet to Set
PROJECT: Unintend'o Controller Project
Unintend'o Project: Rubric Check
Unintend'o Project: Gallery Walk
UNIT 4: Digital Media Processing
PROJECT: Image Filter Project
Image Filter Project
TOPIC: Procedural Programming
Introduction to Processing
Writing Code
Scratch vs. Processing
Scratch Constructs Revisited
Punctuation
Drawing
Draw Shapes
Draw a Figure
Debugging with println()
Mouse Interaction
Movement
Animate Your Figure
Keyboard Interaction
Keyboard Input
Loops
TOPIC: Image Manipulation
RGB Color
Calculating Colors
Hexadecimal
Color
Raster Images
Raster Images
Eliminating Digital Noise
Raster Image Manipulation
Raster Image Manipulation
Encoding Schemes
Digital Image File Extensions
Encoding Schemes
Picture Logic
Manipulating Digital Images
Filters
TOPIC: Audio Manipulation
Digital Audio
Digitizing Audio
Audio Generation
Audio Processing
Post-Processing Audio
Audio Processing
Audio Compression
X Marks the Spot
Compression Algorithms
BIG PICTURE: Ethics of Digital Manipulation
Original or Manipulated
Original or Manipulated (Answers)
Ethics of Digital Manipulation
Creative Commons
PROJECT: Image Filter Project
Image Filter Project: Rubric Check
UNIT 5: Big Data
PROJECT: TEDxKinda Project
TEDxKinda Project
TOPIC: Data Science
Introduction to Big Data
What Is Big Data?
Exploring US Employment Data
Usability and Usefulness of Data
Usability vs. Usefulness
PROJECT: TEDxKinda Project
TEDxKinda Project: Topics
TEDxKinda Project: Big Data Sets
TEDxKinda Project: Tools
TOPIC: Data Aggregation
Collection
Big Data Collection
Creating Structure from Unstructured Data Sets
Digitizing Business Cards
Extraction
The Internet's Data Structure
Spiderbots
Fetching Flutter-bys
Storage
Indexing Julius Caesar
Data Persistence
Your Filter Bubble
Privacy vs. Utility
BIG PICTURE: Data Breaches
Data Breaches
My Data Rules
TOPIC: Data Analysis
Statistical Analysis
Statistical Analysis
Justin Who?
Data Mining
Data Mining
Association Rule Mining
Clustering
TEDxKinda Project: Identifying Clusters
Anomaly Detection
Outliers
TEDxKinda Project: Identifying Outliers
Regression
TEDxKinda Project: Making Predictions
Classification and Summarization
Classify Me
TEDxKinda Project: Automated Summarization
TOPIC: Data Visualization
Interactive Infographics
BIG PICTURE: Wisdom of the Crowd
ReCAPTCHA
Crowdsourcing
TOPIC: Models and Simulations
Models and Simulations
PROJECT: TEDxKinda Project
TEDxKinda Project: Rubric Check
UNIT 6: Innovative Technologies
PROJECT: Future Technology Project
Future Technology Project
TOPIC: Everyday Computing
Social Networking and Communication
Social Networking
Models of Sharing
Search, Wikis, Commerce, and News
Essential Services
Search
Wikis
Commerce
Cloud Computing
Cloud Computing
Ownership of Cloud Data
The Digital Divide
The Digital Divide
TOPIC: The Internet
Network Infrastructure
Network Infrastructure
Communication Protocols
Internet Protocol
Domain Name System
World Wide Web
World Wide Web
BIG PICTURE: Net Neutrality
Net Neutrality
TOPIC: Interconnectedness in Computing
Internet of Things
Ethics of Autonomous Technology
PROJECT: Future Technology Project
Future Technology Project: Rubric Check
AP CSP Course Assessments
Explore Performance Task
Guidelines
Chief Reader Report
Student Examples
Sample Response A
Sample Response B
Sample Response C
Sample Response D
Sample Response E
Create Performance Task
Guidelines
Chief Reader Report
Definitions
Student Examples
Sample Response A
Sample Response B
Sample Response C
Sample Response D
Sample Response E
Sample Response F
Sample Response G
Sample Response H
Sample Response I
Sample Response J
UNIT A1: Artificial Intelligence: Turing Test
Artificial Intelligence: Turing Test Project
Chatterbots
What Is a Chatterbot?
Black-Box Testing Chatterbots
The “Humanity” of AI
Consider L’il Johnny McPixel
L’il Johnny McPixel’s Humanity
Artificial Intelligence: Turing Test Project Draft
Programming Intelligence
Creating Intelligence
Artificial Intelligence: Turing Test Experiment
Beyond Text
Experimenting with Visual Identification
Multi-modal Approaches
Dasher
Difficulties with Text
Ambiguity
Ambiguity Rocks
Artificial Intelligence: Project
Artificial Intelligence: Turing Test: Trials
BETA UNIT 6: Innovative Technologies
PROJECT: Prototyping the Future Project
Future Technology Project
TOPIC: The Implications of Computing
Social Networking
Search
Wikis and Commerce
Cloud Computing
Ownership of Cloud Data
The Digital Divide
TOPIC: The Internet
Network Infrastructure
Communication Protocols
Internet Protocol
Domain Name System
TOPIC: Cryptography
Public Key Encryption
Open Vs Closed Standards
Steganography
BIG PICTURE: Net Neutrality
Net Neutrality
TOPIC: Interconnectedness in Computing
World Wide Web
Internet of Things
Ethics of Autonomous Technology
PROJECT: Prototyping the Future Project
Prototyping the Future Project: Rubric Check
Welcome to UTeach Computer Science Principles
https://www.youtube.com/embed/DTF3fM--XM0
Throughout this course, you will be introduced to this amazing world and the many ways that
computer science has helped to shape nearly every aspect of your life. Whether it is the cell
phone in your pocket, the game console connected to your television, the self-checkout
register at the store, the robot-assisted surgery that saves your life, or the self-driving car
that brings you to school, we are surrounded by the products made possible by centuries’
worth of technological advances in math, science, logic, and design.
Humans have long striven to develop the knowledge and skills needed to harness the varied
resources of our world. With those resources, we have created tools and other technological
artifacts that have allowed us to manage the challenges and complexities of life. Whether it
was spears and stone axes to help us hunt or ink and paper to help us record our thoughts
for posterity, technological advances have shaped the ways that individuals have related to
the world around them and enabled us to achieve the seemingly impossible.
By the mid-20th century, these advances entered a new arena and the age of digital
computing was born. The advent of the modern computer and its modular design, stored-
program architecture, and ability to perform complex computations with both speed and
accuracy opened the doors to a wealth of incredible innovations and fundamentally changed
the way that we interact with others and the world around us. It is truly a remarkable time to
live in, both as a consumer of digital computing and as an innovator of the new ideas that
will shape tomorrow’s digital world.
innovation
The process of imagining something that does not yet exist, but that
has potential value, and making it real through the application of
design, implementation, and production.
Over the next several units, you will be introduced to a number of innovations in computing
and digital media that have come to form the backbone of nearly all of our online and offline
interactions. As you explore each new component of this digital landscape, you should get in
the habit of asking yourself questions like the following:
As this course guides you through the art and science of digital technology and helps you
develop robust computational thinking skills, you will become a master of innovation and be
fully prepared to thrive in our digital world!
UTeach CS Principles Unit 1: Computational Thinking
UNIT 1
Computational Thinking
First, you will explore a number of techniques for analyzing common problems
and visualizing their solutions. You will use these techniques to investigate a
number of real-world applications, such as searching, sorting, and encryption.
Next, you will examine how programmers utilize various levels of abstraction in
the languages that they use to write programs and communicate their intentions
in a form that can be executed by a computer. Finally, you will turn your
attention to the question of whether various problems are solvable and
investigate the factors that affect the efficiency of a solution to a given problem.
UNIT PROJECT:
Password Generator
Highlights
Password Generator Project
“Choosing a hard-to-guess, but easy-to-remember password is
important!”—Kevin Mitnick
Video link
https://www.youtube.com/embed/qpxLusH4quY
Password Vulnerabilities
In March 2013, the popular online note-taking service, Evernote, issued a Security Notice
alerting users of a “service wide password reset” that they were enforcing as a result of a
“coordinated attempt to access secure areas of the Evernote Service.”
Fortunately, no personal information or data was breached; by issuing a required
password reset, Evernote was merely taking proactive steps out of an abundance of caution.
However, such widespread password leaks and security breaches are becoming all too
common as online services play a larger and larger role in our digital lives. Several times
every year, the news reports yet another hack of a popular site or a leak of passwords and
other sensitive user data.
In its announcement, Evernote offered its users advice that could help to secure
their data on any site:
Never use the same password on multiple sites or services
Never click on ‘reset password’ requests in e-mails—instead go
directly to the service
The first two bullet points on this list relate to the two most likely mistakes that users make
when choosing a password. They relate to the dilemma of trying to balance the security of
complex passwords with the convenience of having a single, easy-to-remember password.
Unfortunately, despite their concerns over privacy, people often choose convenience over
security. Let’s take a closer look at two ways that users often leave themselves vulnerable to
attack.
The goal of any password should be to choose something that is difficult to guess, especially
when some automated systems are capable of making millions of guesses per second.
People use a number of techniques to increase the complexity of their passwords, such as
mixing upper- and lowercase letters, substituting digits and punctuation for letters,
appending extra characters or numbers, etc. Unfortunately, while these techniques might
increase the effort required to guess a password, they also increase the difficulty of
remembering it.
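To get a feel for what “millions of guesses per second” means, the arithmetic can be sketched as follows. This is only a rough estimate: the guess rate and the two character pools are assumed figures chosen for illustration, not measurements of any real attack.

```python
# Illustrative estimate only. The guess rate is an assumed figure,
# not a measurement of any real password-cracking tool.
GUESSES_PER_SECOND = 1_000_000  # assumption: one million guesses per second


def search_space(pool_size: int, length: int) -> int:
    """Number of possible passwords of exactly `length` characters."""
    return pool_size ** length


def worst_case_seconds(pool_size: int, length: int) -> float:
    """Seconds needed to try every candidate at the assumed guess rate."""
    return search_space(pool_size, length) / GUESSES_PER_SECOND


if __name__ == "__main__":
    # Compare 26 lowercase letters with 62 letters and digits, 8 characters each.
    for label, pool in [("lowercase only", 26), ("mixed case + digits", 62)]:
        seconds = worst_case_seconds(pool, 8)
        print(f"{label}: {search_space(pool, 8):,} candidates, "
              f"about {seconds / 86_400:,.1f} days to try them all")
```

Enlarging the pool of characters or lengthening the password both make the number of candidates grow very quickly, which is why these techniques slow down an automated guesser.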
A popular xkcd comic by Randall Munroe addressed the issue of how difficult it is to crack a
password vs. how easy it is to remember by attempting to measure password strength in
terms of “bits of entropy.”
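One common way to put a number on “bits of entropy” is to assume every part of the password is picked uniformly at random and take the base-2 logarithm of the number of possibilities. The sketch below does that. The 2,048-word list and the 94-symbol character set are assumptions made for illustration, and the estimate only holds when the choices really are random.

```python
import math


def entropy_bits(choices: int, picks: int) -> float:
    """Bits of entropy when `picks` items are each chosen uniformly at
    random from `choices` possibilities."""
    return picks * math.log2(choices)


# Assumption: four words drawn at random from a 2,048-word list (11 bits each).
four_random_words = entropy_bits(2048, 4)
# Assumption: eight characters drawn at random from 94 printable symbols.
eight_random_chars = entropy_bits(94, 8)

print(f"four random words:  {four_random_words:.1f} bits")
print(f"eight random chars: {eight_random_chars:.1f} bits")
```

Each additional bit doubles the number of guesses an attacker needs in the worst case.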
Unfortunately, as security expert Bruce Schneier notes in his article, “Choosing Secure
Passwords,” despite Munroe’s logic, his suggested solution is actually quite vulnerable to
attack due to its reliance upon common dictionary words that are easy to guess by brute
force.
So, just how does one create a password that is strong but memorable? What steps could
you take to create a secure password that is convenient for you to use? Discuss some ideas
with your partner and be prepared to share your suggestions with the class.
There are many reasons why users might reuse the same password for every site or service
they visit, with the most obvious being that it is simply easier to remember one password
than it is to remember many different passwords. It is common sense. But it is also highly
insecure.
The problem is that a password is meant to be a secret credential that you use to identify
yourself to someone else. This method of authentication relies on the basic assumption that
there is a one-to-one relationship between knowing the password and having the right to
access an account. That is, in theory, only the owner of an account can provide the secret
piece of information (i.e., the password) that can confirm the individual’s identity. However, if
any third party should ever possess that secret information, then the initial assumption
fails, as knowledge of the password is no longer guaranteed to be limited to the account owner.
The real problem lies in the fact that in order for someone else (e.g., Facebook, Amazon,
your bank) to be able to authenticate you, they need to know what the secret piece of
information is that only you can tell them—which itself violates the first assumption and
invalidates your use of that password with any other party.
For example, imagine you sign up with Facebook and set your password to be
fuzzybunny123 . Now you know your password. And Facebook knows your password. So
far, so good. Whenever you visit Facebook, you can prove to the site that you are you by
identifying yourself with the fuzzybunny123 password that only you and Facebook know.
However, you also sign up for an account with some other social sharing site, like
InstaChatOmatic, and you use the same fuzzybunny123 password that you use with
Facebook. Now, three different parties all know the same information that, in theory, only you
should know. A malicious employee, rogue software, or database leak at InstaChatOmatic
can result in an unknown party attempting to access your Facebook account using your
InstaChatOmatic password. And since the passwords are the same, Facebook will
authenticate the intruder as if it were you because they will have provided the information that
supposedly only you should have known.
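The scenario above can be acted out with a toy model. This is a deliberately simplified sketch: the two dictionaries stand in for each site's account records, and storing passwords as plain text is an assumption made only to keep the example short.

```python
# Toy model only. Storing plain-text passwords is an assumption made to keep
# the sketch short; the site names and password come from the example above.
facebook = {"you": "fuzzybunny123"}
instachatomatic = {"you": "fuzzybunny123"}  # the same password, reused


def authenticate(service: dict, user: str, password: str) -> bool:
    """A service can only check that the secret matches, not who typed it."""
    return service.get(user) == password


# Suppose InstaChatOmatic's account records leak.
leaked_user, leaked_password = next(iter(instachatomatic.items()))

# The intruder replays the leaked credentials against Facebook.
print(authenticate(facebook, leaked_user, leaked_password))  # True

# With a different password per site, the same leak is useless elsewhere.
facebook["you"] = "a-different-secret"
print(authenticate(facebook, leaked_user, leaked_password))  # False
```

The `authenticate` function has no way to tell the account owner from an intruder; all it can do is compare the secret it was given with the one it has on record.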
In short, by reusing the same password with more than one service, you’ve undermined the
security of your password at both sites—all in the name of making it easier for you to
remember. This is flawed thinking. Passwords are meant for security and the strength of that
security should be prioritized above anything else that undermines it.
The important thing to note about this approach is that rather than trying to remember an
obscure collection of odd and difficult-to-guess letters, digits, and punctuation, you need
only remember a personalized phrase or other mnemonic that will remind you how to
easily reconstruct the password.
And if you customize the key phrase to match the site or service, a single, simple set of rules
can be created that will allow you to easily reconstruct the password anytime you visit the
site. For example, consider the following phrase/password scheme:
The single algorithm that you would need to memorize for this scheme would be something
like the following:
What are the advantages to this particular algorithm? Are there any problems or
weaknesses to this algorithm? What other algorithms can you think of that might work
better? Be prepared to discuss your ideas with the class. (Of course, if you have a really
clever idea, you might want to save that one for yourself and only discuss your rejected
ideas.)
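To make the idea of a reproducible password algorithm concrete, here is one hypothetical scheme written as code. It is not the scheme from this lesson; the function name, the rules, and the sample phrases are all invented for illustration, and it is not meant as a recommendation.

```python
def make_password(site: str, phrase: str) -> str:
    """Hypothetical scheme: site tag + phrase initials + site-name length."""
    # Step 1: take the first letter of every word in the key phrase.
    initials = "".join(word[0] for word in phrase.split())
    # Step 2: take the first two letters of the site name, first one capitalized.
    tag = site[:2].capitalize()
    # Step 3: append the number of letters in the site name.
    return f"{tag}{initials}{len(site)}"


print(make_password("facebook", "my friends post too many photos"))  # Famfptmp8
print(make_password("amazon", "i buy too many books here"))          # Amibtmbh6
```

Because every step is deterministic, the same site and phrase always rebuild the same password. Whether the result is strong enough is exactly what the questions above ask you to judge.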
Assignment
Once you’ve designed your solution, write out each step of your password-generating
algorithm in some form of pseudocode. No specific format is required for your algorithm, but
your pseudocode should be clear enough and detailed enough that anyone who is not
familiar with how your algorithm is supposed to work can still follow along and apply its steps
in generating a valid password.
Submission
Your submission will be in the form of a written algorithm (i.e., pseudocode) that explicitly
states each of the discrete steps and decisions that must be made in generating a valid
password. Also, you must provide at least five examples of passwords that your algorithm
would generate for five different sites. One of those examples must be thoroughly annotated,
showing how each step of the algorithm contributes to the final password.
Learning Goals
Over the course of this module and this project, you will learn to:
encode and decode messages using common cryptographic techniques
examine a number of common threats to cybersecurity, including distributed denial of
service attacks (DDoS), phishing, viruses, and social engineering
examine the implications of Moore’s Law on the research and development of new and
existing technologies
Rubric
Content Area | Performance Quality
Readability | Algorithm is typed, organized, and nicely formatted for easy use. | Algorithm is organized and nicely formatted for easy use, but is not typed. -OR- Algorithm is typed, but the formatting and organization make it somewhat difficult to use. | Algorithm has formatting and organization that make it somewhat difficult to use AND is not typed. -OR- Algorithm may be typed, but the formatting and organization make it extremely difficult to use. | Not enough criteria are met in order to award any credit.
-OR- The algorithm generates a unique password for all sites; however, it is not reproducible.
Examples | There are five sample passwords generated correctly based on the algorithm. | There are four sample passwords generated correctly based on the algorithm. | There are three or fewer sample passwords generated correctly based on the algorithm. | Not enough criteria are met in order to award any credit.
-OR- There is one annotated example documented at most steps of the process, but the organization and formatting make it difficult to follow.
UTeach Computer Science—http://uteachcs.org © 2016 The University of Texas at Austin
UNIT TOPIC:
Algorithmic Thinking
Problem Solving
Flow Patterns
You will identify and examine a number of common features of algorithms, including
sequencing, selection, and repetition.
UNIT TOPIC:
Algorithmic Thinking
Problem Solving
Building Blocks
People Programs
One of the major goals of this unit is to help you develop a better understanding of the
methods and techniques computer scientists rely on to construct simple and elegant
solutions to potentially complex problems.
Today, in order to give you a feel for what this is like, you’ll be participating in a group
exercise in which you’ll write and execute your first program. However, most of you have not
learned any programming languages yet, so for now, we’ll stick with using a language that
everybody in your group likely does know—English. And instead of executing your programs
on an actual computer (which probably wouldn’t understand English as well as people do),
you and your group will role-play the parts of a simulated computer as you attempt to
execute your program in much the same way a real computer would run a real program. And
if programs that are run by computers are called computer programs, then it only seems
appropriate that our programs today that are run by people should be called people
programs.
Instructions
Working in groups of four, each team member should assume one of the roles listed below.
Each group will be given a construction site (grid with directions and “X,” “A,” and “B” spaces
marked), a bag of building blocks, and a set of criteria describing what they are to build.
Each group should write/type a list of detailed instructions that the Supervisor can read off to
the Supplier, Worker, and Inspector so that the required set of towers is properly
constructed.
Roles
Supervisor: This is the “construction foreperson” whose job it is to follow the steps in the
plan exactly, ask “Yes/No” questions of the Inspector, make decisions when appropriate, and
tell the Worker and Supplier what to do. The Supervisor can be thought of as being
analogous to the “processor” within a computer.
Supplier: This individual is responsible for supplying and disposing of the raw materials (i.e.,
blocks) for construction. Think of the Supplier as an “input/output device” like a keyboard,
mouse, or computer display.
When instructed by the Supervisor, can randomly remove a single block from the
supply bag and place it on the “Loading Zone” (X).
When instructed by the Supervisor, can remove a single block from the “Loading Zone”
(X) and randomly place it back into the supply bag.
Worker: This individual is solely responsible for manipulating, moving, and placing the
construction materials into their designated locations. The Worker can be seen as
performing the role of a “bus” that transfers data between components inside a computer.
Inspector: This is the “answer person” of the construction team who is responsible for giving
the Supervisor essential feedback about materials and the status of constructions. The
Inspector functions like a computer’s “logic unit” (part of the ALU—Arithmetic Logic Unit).
When asked a question by the Supervisor, can only answer “YES” or “NO” about the
block(s) in one or both of the two “Inspection Areas” (A, B).
Cannot answer questions about anything not in one of the two “Inspection Areas” (A,
B).
May not communicate anything else.
If at any point, a question cannot be answered strictly “YES” or “NO,” then the question
is considered faulty. Stop the construction process, go back to the drawing board, and
rethink the set of instructions your team has written for the Supervisor.
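The division of labor among the four roles can be mirrored in a small simulation. This is a loose sketch under stated assumptions: the class and method names are invented, blocks are represented only by a color (a hypothetical property), and the plan inside `supervisor` is a made-up three-step example rather than a solution to the activity.

```python
import random


class Supplier:
    """Input/output device: moves one random block from the bag to X."""

    def __init__(self, bag):
        self.bag = list(bag)

    def supply(self, site):
        block = self.bag.pop(random.randrange(len(self.bag)))
        site["X"] = block


class Worker:
    """Bus: moves a block from one location to another."""

    def move(self, site, source, target):
        site[target] = site.pop(source)


class Inspector:
    """Logic unit: answers only yes/no questions about areas A and B."""

    def is_color(self, site, area, color):
        if area not in ("A", "B"):
            raise ValueError("Inspector can only look at areas A and B")
        return site.get(area) == color


def supervisor(supplier, worker, inspector):
    """Processor: follows the plan step by step and makes the decisions."""
    site = {}
    supplier.supply(site)                     # tell the Supplier what to do
    worker.move(site, "X", "A")               # tell the Worker what to do
    if inspector.is_color(site, "A", "red"):  # ask a yes/no question
        return "red block is in A"
    return "some other block is in A"


print(supervisor(Supplier(["red", "blue", "green"]), Worker(), Inspector()))
```

Only the `supervisor` function makes decisions; the other three objects each do one narrow job when told to, just as the role descriptions require.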
Problem Solving
The Big Problem
From the time you wake up in the morning to the time you go to sleep at night, you face
problems of one sort or another throughout the day. Whether it’s the challenge of dragging
yourself out of a cozy bed when you just want to keep sleeping, deciding what to wear for
the day, making sure you arrive at each of your classes on time and with the proper
materials, completing your assigned homework, or carrying on a texting conversation with
your best friend, everything you do can be seen as a problem that you must solve.
Every task that you deal with on a day-to-day basis, no matter how large or small, requires
that you perform a specific set of actions and make a particular set of decisions in very
precise ways. And if you manage to perform the right sequence of steps in just the right
order, you’ll succeed in achieving what you set out to accomplish. That is, you’ll have solved
your problem.
Executing Solutions
For example, the very act of reading this paragraph right now is a problem that you are
currently solving, likely without even realizing that you’re solving a problem. Just a few
moments ago, you had no idea what this paragraph was about, what it says, or what you’re
supposed to be taking away from it. But as you started reading it, your mind began
subconsciously following an algorithm that you were taught many years ago for solving just
this very type of problem. First, your eyes focused on the upper left-hand corner of the block
of text (i.e., where it says “For example”). Then your eyes scanned left to right across the
text, noting the spaces that separated the various words and then decoding the meaning of
each of those words as you encountered it. When your eyes reached the right-hand margin
of a line of text, they dropped down to the next row and quickly repositioned themselves
back on the left-hand margin. You then repeated this left-to-right scanning/decoding process
line by line until you reach the end of the paragraph. And when you reach that point, without
even thinking about how you were doing it, you will have actually solved the “big problem” of
reading and, hopefully, comprehending this paragraph. Congratulations!
Generalizing Problems
Also note that the algorithm you learned for reading the above paragraph (as well as this
and every other paragraph) was a generalized algorithm. It wasn’t specific to the paragraph
above. Instead, that single algorithm of scanning through the text of a paragraph word-by-
word and line-by-line is perfectly applicable to any paragraph. Imagine how difficult life would
be if it only worked for a single paragraph of text and that reading every other paragraph
required its own unique set of instructions.
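The reading procedure described above can be written once and reused for any text. The following is a minimal sketch: the function name is invented, and simply collecting each word stands in for the much richer step of decoding its meaning.

```python
def read_paragraph(paragraph: str) -> list[str]:
    """Scan a block of text line by line, left to right, word by word."""
    words_read = []
    for line in paragraph.splitlines():   # drop down to the next row
        for word in line.split():         # spaces separate the words
            words_read.append(word)       # stand-in for "decoding" a word
    return words_read


first = "For example, the very act\nof reading this paragraph"
second = "Big problems are BIG!\nThey're hard."

# The same function handles both paragraphs without any changes.
print(read_paragraph(first))
print(read_paragraph(second))
```

Nothing in `read_paragraph` refers to a particular paragraph, which is what makes it a generalized algorithm.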
Instead, the mark of a good problem solver is the ability to use abstraction to generalize a
variety of related problems and develop a common solution that can be applied to solve any
of them. Much of computer science involves identifying these generalized solutions,
codifying them in a programming language, and then letting a computer do all the work of
executing your plan.
Breaking it Down
Big problems are BIG! They’re hard. They’re challenging. They can be overwhelming. If they
weren’t, we wouldn’t call them “big problems.” But, how does one solve a “big problem”?
The easy answer is that you don’t. DO NOT TRY TO SOLVE A BIG PROBLEM! At least not
directly.
Smaller problems are much easier to handle. They’re simple. They’re easy. They’re trivial.
Or at least they can be if you just make them small enough.
In fact, the best way to solve a big problem is to recognize that every big problem is really
just a combination of many smaller problems. And those smaller problems, themselves,
might be made up of even smaller problems that can be broken down even further until all
you’re left with are a bunch of trivially simple problems that you already know how to solve.
And here’s the important thing to realize: If you can solve all of the small problems, you’ll
have solved the big problem!
Building Towers
The challenge is to take a good, long look at your big problem and try to reimagine it as a
series of smaller problems. More than likely, you and your group members intuitively used
some very clever problem-solving strategies, whether or not you were aware of it.
For example, in the previous “Building Blocks” activity, you were tasked with building a
series of precisely located towers, each with a specific set of properties. Building multiple
towers simultaneously (big problem) is much more challenging than building a single tower
(small problem). So, we can redefine the original problem (hard) as being equivalent to
building a sequence of individual towers (easy).
But even though building a single tower is an easier problem to solve than building multiple
towers, it’s still a pretty big problem in and of itself. Perhaps we can break it down into even
smaller problems.
Place 2nd block
...
Place Mth block
For each tower we want to build, we can think of that as merely placing a series of properly
oriented blocks. But how might we break down the task of placing a block?
For each block, we only need to supply it, orient it, and move it into its final position. But
“Supply a block to X” is one of the two actions the Supplier already knows how to do! And
moving a block to a desired location is something the Worker already knows how to do! Two
of the three steps to “Place a block” involve built-in functionality and there’s no need to
redefine either of those steps any further. But, unfortunately, none of the Supplier, Worker,
or Inspector already knows how to “Orient a block,” so that’s still a “big problem”; break it
down further!
Again, the first and third steps of this operation involve manipulating the block, which is built-
in functionality that the Worker can already do. But in order to “Determine the current
orientation of a block,” we’ll need to ask a number of carefully chosen questions.
And since the Inspector already knows how to answer each of these Y/N questions, we can
stop here. There’s no need to break this big problem down any further. Instead, we can start
constructing a solution.
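The breakdown described above maps naturally onto functions that call smaller functions. The sketch below is one possible rendering, not the activity's official solution: every function name is invented, and representing a block as a dictionary with an `upright` flag is an assumption made so that “orientation” has something to check.

```python
def supply_block():
    """Built in: the Supplier places a block on X."""
    return {"location": "X", "upright": False}  # assumed starting state


def move_block(block, location):
    """Built in: the Worker moves the block."""
    block["location"] = location


def is_upright(block):
    """Built in: a yes/no question the Inspector can answer."""
    return block["upright"]


def orient_block(block):
    """Not built in, so it is broken down into built-in steps."""
    move_block(block, "A")         # move it to an inspection area
    if not is_upright(block):      # determine the current orientation
        block["upright"] = True    # the Worker turns the block


def place_block(tower):
    """Supply a block, orient it, and move it into position."""
    block = supply_block()
    orient_block(block)
    move_block(block, "tower")
    tower.append(block)


def build_tower(height):
    """A tower is just a series of placed blocks."""
    tower = []
    for _ in range(height):
        place_block(tower)
    return tower


def build_towers(heights):
    """The big problem: a sequence of individual towers."""
    return [build_tower(height) for height in heights]


print([len(tower) for tower in build_towers([2, 3])])  # [2, 3]
```

Reading from the bottom up, each function solves its problem only by calling functions that solve smaller ones, until everything rests on the three built-in actions.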
For example, depending on the particular building criteria, the above analysis and
redefinition of each problem into simpler tasks might result in a typical solution similar to the
following:
UNIT TOPIC:
Algorithmic Thinking
Flow Patterns
You will identify and examine a number of common features of algorithms, including
sequencing, selection, and repetition.
Flow Patterns
Familiar Patterns
The algorithmic solutions developed by programmers and problem solvers can take on a
virtually infinite number of diverse forms. They can be brief, requiring only a few operations
to implement, or they can be massive and complex, spanning millions of lines of code and
requiring a team of programmers to keep it all straight. But despite this range of diversity, all of these
algorithms are characterized by the same basic elements of sequencing, selection, and
iteration.
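Even a very short piece of code can contain all three elements. The following sketch uses an invented function that counts the even numbers in a list, with a comment marking where each pattern appears.

```python
def count_even(numbers):
    count = 0               # sequencing: steps run one after another
    for n in numbers:       # iteration: repeat the steps for every item
        if n % 2 == 0:      # selection: decide whether to act
            count += 1
    return count


print(count_even([1, 2, 3, 4]))  # 2
```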
A good problem solver can learn to recognize these three features and easily use them to
construct simple, effective, and elegant solutions to the problems they might want/need to
solve.
With repeated practice throughout this course, you’ll develop a trained eye that can quickly
recognize each of these common patterns within your everyday actions. In fact, if you stop to
take a closer look at a few of the common, everyday tasks that you perform all the time,
you’ll likely discover that you already make use of these patterns without even noticing.
In the next few units, we’ll see that all programming languages are designed to make it easy
for programmers to write code that uses these three basic patterns.
Sequencing
The simplest of all of the patterns consists of each step of an algorithm following the
previous step. When the steps of an algorithm are sequentially arranged, each instruction is
executed one at a time in the order specified by the algorithm.
1) put on pants
2) put on left shoe
3) put on right shoe
4) put on shirt
One of the most important things to recognize about the sequencing of instructions is that it
enables one to explicitly state the order in which certain instructions are executed.
Depending on the situation, sometimes an instruction needs to precede one or more other
instructions in the same algorithm. This is often the case if a “later” step is dependent upon
the successful completion of an “earlier” step.
In the above example, put on pants is sequenced to come before put on left
shoe and put on right shoe which makes sense, since it would be difficult to put on
pants after putting on shoes.
However, just because sequencing imposes an order in which instructions are executed
doesn’t mean that the order is important or even necessary.
Again, in the previous example, the order of put on left shoe versus put on right
shoe probably doesn’t matter. That is, the algorithm could probably work just as well with
the right shoe preceding the left shoe. Similarly, put on shirt , which is currently the last
step, could probably have been placed anywhere in the algorithm, either before or after the
pants and shoes, without affecting its correctness.
Selection
Algorithms that consist only of sequencing will always perform exactly the same, regardless
of external conditions, which is often good for designing specific solutions to specific
problems.
However, most problems would benefit from a more generalized solution that can
dynamically adapt to changes in conditions. More specifically, problems that handle variable
data, such as user-supplied input, oftentimes need to execute different sequences of
instructions depending on the particular conditions.
The classic metaphor used for this type of situation is an “if... else...” sort of structure.
Literally (since most languages actually use the keyword if ), the algorithm asks a question
(specifically a “YES/NO” question) and the answer to that question is used to determine
which path of the algorithm to execute and which path of the algorithm to skip.
1) if wearing pants...
2) ...put on left shoe
3) ...put on right shoe
4) else...
5) ...put on pants
In the above example, the if condition at Step 1 creates a fork in the algorithm that
enables one of two possible sequences of instructions to be executed. If the condition
wearing pants is true (i.e., the pants have already been put on at some earlier time in
the algorithm), then it’s safe to put on one’s shoes. In that case, Steps 2 and 3 ( put on
left shoe and put on right shoe ) will be executed while Step 5 ( put on
pants ) will be skipped. However, if the condition was false (i.e., not wearing pants), then
someone better put on some pants! In that case only Step 5 will be executed while Steps 2
and 3 will be skipped.
It’s important to note that the nature of an if/else question, or condition, always allows
for only two possible answers (e.g., yes/no, true/false, etc.) because each answer is used to
select one of the two specified sequences – the first sequence/branch if, and only if, the
condition is true or, alternatively, the second sequence/branch if, and only if, the condition is
false.
If more than two branches are ever needed, they can be achieved by “nesting” multiple
if/else clauses together. That is, one or both of the two branches can contain its own internal
if/else clause. For example:
1) if wearing pants...
2) ...if wearing shoes...
3) ......put on shirt
4) ...else...
5) ......put on left shoe
6) ......put on right shoe
7) else...
8) ...put on pants
Note that in the first and second situations, Step 8 will be ignored completely because of the
pants-wearing status. Then, depending on the shoes-wearing status, either Step 3 will be
executed or Steps 5 and 6 will be executed. Likewise, in the third situation (no pants!), the
condition about shoes is irrelevant as Step 2 (and its subsequent Steps 3-6) will be ignored.
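To make the branching concrete, here is a minimal Python sketch of the nested if/else algorithm above. The function name next_steps and its two true/false parameters are illustrative assumptions, not part of the original algorithm.

```python
def next_steps(wearing_pants: bool, wearing_shoes: bool) -> list[str]:
    """Return the steps that the nested if/else algorithm would execute."""
    if wearing_pants:
        if wearing_shoes:
            return ["put on shirt"]                            # Step 3
        else:
            return ["put on left shoe", "put on right shoe"]   # Steps 5 and 6
    else:
        return ["put on pants"]                                # Step 8


# Try every combination of the two YES/NO answers.
for pants in (True, False):
    for shoes in (True, False):
        print(f"pants={pants}, shoes={shoes}: {next_steps(pants, shoes)}")
```

Notice that when wearing_pants is false, the answer to the shoes question never affects the result, just as described above.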
Iteration
The third common pattern that you’ll see in most algorithms, especially with large and
complex solutions, is repetition – more properly referred to as iteration.
For example, consider the very simple act of hammering a nail into a block of wood:
1) grab a hammer
2) raise
3) aim
4) swing
5) raise
6) aim
7) swing
8) raise
9) aim
10) swing
11)
...
N-1)
When you look at it this way, the repetition becomes immediately obvious. Steps 2–4 are
repeated again as Steps 5–7 and Steps 8–10, etc. Each of these repeated sections of steps
is called an iteration. If we were to continue this algorithm, what do you think Steps 11–13
will be? What about Steps 14–16? 17–19? How many iterations will we need? When will
these instructions stop? When should they stop?
And more importantly, if we ever needed to actually write out these detailed instructions for
someone else, like maybe a robot, to follow, even a simple task like hammering a nail might
become extremely long and tedious to write. But we don’t normally think of tasks like this in
such overwhelming ways. Instead, we simplify the tasks in our mind. That is, we apply
abstraction (a topic we’ll cover in more detail throughout this course).
For example, in the real world, if you were actually driving a nail into a block of wood, you’d
know to repeatedly swing that hammer over and over until you knew it was time to stop (i.e.,
when the nail head was flush with the wood surface).
Now think about that. You intuitively know to repeat a certain set of instructions ("Raise,”
“Aim,” “Swing") until a certain condition is met ("Nail head is flush with the wood surface").
And that condition that helps you decide whether to repeat those “Raise-Aim-Swing”
instructions one more time or to stop and put the hammer down is just like the condition we
saw above in the section on selection.
This familiar pattern of repeating sections of instructions until a condition is met can be seen
as a special case of selection in which one of the branches loops back to test the condition
again.
Putting this into words, we might use a phrase like Go back to Step #__ to indicate
that the flow of a program should not continue sequentially to the next step, but should
instead jump back to an earlier step.
In the above example, whatever action is being performed in Step 2 will continue to be
repeated over and over again until the problem is solved. Notice how Step 3 explicitly states
that the algorithm should go back to Step 1 where the condition (i.e., “Does the problem still
exist?”) can be tested again. The condition then determines whether another iteration is
needed or if the loop can be exited and the program can continue on with whatever comes
after this loop. That is, Step 4 is like an implicit “else” branch of an “if/else” clause. It will only
be executed when the “if” condition in Step 1 is false.
Indefinite Loops
The previous example is an example of an “indefinite loop”—a form of iteration in which the
number of times that the loop will execute is not explicitly stated in the algorithm. In these
types of problems, we can use a repeat until notation to specify that the loop should
continue indefinitely until a particular condition is reached.
1) repeat until...
2) ...
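As a rough illustration, the hammer-and-nail task can be written as an indefinite loop in Python. Python has no repeat until statement, so this sketch uses while with the opposite condition (keep going while the nail is not yet flush). The function name, the measurements in millimeters, and the idea that each swing drives the nail a fixed distance are all assumptions made for the example.

```python
def hammer_nail(nail_height_mm: int, drive_per_swing_mm: int) -> int:
    """Repeat raise/aim/swing until the nail head is flush; return the swing count.

    Assumes drive_per_swing_mm is positive; otherwise the loop would never end.
    """
    swings = 0
    while nail_height_mm > 0:  # the condition is tested before every iteration
        # One iteration of the loop body: raise, aim, swing.
        nail_height_mm = max(0, nail_height_mm - drive_per_swing_mm)
        swings += 1
    return swings  # the loop has exited, so put the hammer down


print(hammer_nail(nail_height_mm=30, drive_per_swing_mm=8))
```

The algorithm never states how many swings are needed. The number of iterations depends entirely on when the condition is finally met.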
Definite Loops
1) repeat N times...
2) ...
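A definite loop states the number of iterations up front. The following sketch shows one way to express repeat N times in Python using for and range. The helper name repeat_n_times and the log it builds are hypothetical, chosen only to make the iterations visible.

```python
def repeat_n_times(n: int, action: str) -> list[str]:
    """Perform the same action exactly n times and record each iteration."""
    log = []
    for i in range(n):  # the loop body runs exactly n times
        log.append(f"{action} (iteration {i + 1} of {n})")
    return log


for line in repeat_n_times(3, "swing"):
    print(line)
```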
Combinations
Finally, it’s important to note that most algorithms are not limited to only one of these three
common flow patterns. In fact, most programs make use of a combination of these three
patterns. Oftentimes, a number of selection blocks are arranged together in sequence. Or,
as we saw earlier in the hammer/nail example, a sequential series of steps may make up the
body of the loop (i.e., the instructions that keep getting repeated).
Flowcharts
When attempting to describe the structure and operation of an algorithm, it is often useful to
visualize its components and their relationships to one another. Simple diagrams can be
constructed that depict each step of the algorithm (usually represented by rectangles,
diamonds, etc.) interconnected with arrows showing the general flow of the process, hence
the name flowchart. You’ve most likely seen and used flowcharts for years in your daily life,
but perhaps you didn’t consider them analog “programs” of sorts. Like programs, flowcharts
serve different purposes and have different qualities.
First, as a class, develop general guidelines for the flowchart. Flowcharts should contain a
variety of components, such as a minimum number of steps, decisions, and inputs/outputs.
Brainstorm an idea for your flowchart with a partner quickly, then begin diagramming the
flowchart. Focus on the content and function of the flowchart, but consider the usability of
the flowchart, too. Be effective and efficient with your design, test it out (perhaps others will
help you), and do your best.
For example, this flowchart illustrates how to sing the Beatles’ song Hey Jude (lyrics).
However, it is imperfect, like many flowcharts and programs.
Consider that as you design your own flowchart, and be prepared to share, illustrate, and
discuss your work afterward.
Algorithms
Recipes for Success
While computer science encompasses a lot more than just writing code, the process of
constructing and expressing a detailed sequence of well-specified operations is a key
component of harnessing the raw, computing power of our digital world.
As problem solvers, computer scientists and programmers must be able to develop careful
and precise algorithmic solutions to whatever problem they are faced with and then
communicate those solutions to other individuals and/or machines with clarity and 100%
assurance that the instructions not only solve the problem, but that they do so efficiently.
Just like with a cook following a recipe in a cookbook, an algorithm is intended to spell out
the step-by-step process for turning an available set of raw data (i.e., the ingredients) into a
working solution (i.e., the finished dish). Each instruction represents one incremental step in
that transformation, and the instructions as a whole are carefully orchestrated and
arranged to perform a very specific, and oftentimes complex, task.
But even non-programmers who might never write a program in their life can benefit from the
ability to think computationally. Algorithms, with their logical and structured order, enable
individuals to more effectively organize their thoughts and manage the complexity of modern
life in this digital age.
In fact, much of this course centers on the task of helping you develop the skills to analyze a
task at hand, identify what the real problem might be, recognize what you have to work with,
and design an effective and efficient solution.
To help get you started, we’ll examine a number of well-known solutions to common, data-
oriented problems that computer scientists regularly have to deal with, such as searching
and sorting algorithms. While the examples we’ll look at are all well-known solutions,
they also provide an excellent look into the processes and techniques that are used in
designing such solutions—techniques that you will learn to master and use in designing your
own algorithms.
A well-written algorithm includes all of the relevant information needed to properly implement
(i.e., turn ideas into code) and execute (i.e., turn code into actions) a given solution. As such,
it is critical that the algorithm be communicated in some form that ensures that the individual
steps of the algorithm are perfectly understood and executed exactly the way they were
intended.
1) put on pants
2) put on left shoe
3) put on right shoe
4) put on shirt
Both algorithms contain much the same information, but the second version is much more
clearly expressed as a sequence of understandable instructions than the first version. Let’s
examine a few of the features that the second version has that make it more useful as an
algorithm than the first version.
1) Format: The first version merely lists four words in succession, but doesn’t make clear
that this is an algorithm consisting of four discrete steps that must be performed. In the
second version, each individual step is written on its own line, helping to make it visually
clear each step is separate and unique from the other steps.
3) Descriptive qualifiers: The first version only lists two shoes, but does not make any
distinction between the left shoe and the right shoe. Maybe it doesn’t matter whether you
interpret the instructions as “left shoe, right shoe” or “right shoe, left shoe.” But it probably
does matter if you interpret it as “left shoe, left shoe.” However, the second version of the
algorithm eliminates this possible confusion by explicitly stating which shoe should be put on
at each step in the process.
It is important to understand that every step of an algorithm is critically important and that
each step must be executed exactly as intended. If a step were unimportant, it would just be left out
of the algorithm in the first place. The fact that it’s there means it needs to be there. And it
needs to be executed exactly as it was intended. If, for example, an algorithm for getting
from Point A to Point Z says to turn left at Point F and you decide to turn right instead, it is
highly unlikely that the remaining instructions will get you to your intended destination
because you’ve deviated from the intended plan laid out by the algorithm.
In all of these cases, it’s important that whoever (or whatever, in the case of computers)
reads an algorithm be able to understand with 100% certainty exactly what was intended by
a given algorithm.
Fortunately, algorithms can be expressed in many different visual and/or textual formats,
including flow diagrams, pseudocode, mathematical notation, programming languages, etc.
Each of these formats dictates its own specific style and syntactic structure that is designed
to ensure that any algorithm can be expressed in an understandable and clearly
unambiguous way. Throughout this course, we’ll examine a number of these formats for
writing algorithms that are clear and readable.
CODING SKILLS:
Encryption
Highlights
You will identify the needs and applications of cryptography in our digital world.
You will analyze the differences between symmetric (single-key) encryption and
asymmetric (public key) encryption.
You will examine the mathematical foundation of cryptography.
You will encode and decode messages using common cryptographic techniques.
Encryption
The Need for Secure Communication
For more than three millennia, humans have sought ways to protect and secure information
so that it is only available to a limited set of select individuals. Whether they were ancient
military leaders seeking to keep their battle plans out of enemy hands or modern-day
financial investors trading stocks online, there is an unlimited number of scenarios in which
access to information needs to be restricted.
One way to ensure that information is not accessible to others is to completely destroy the
information. Imagine writing a private message down on a sheet of paper and then
shredding the page into a million tiny pieces. The information contained within the original
message will be all but lost, ensuring that no bystander might piece it all back together
again. Unfortunately, it also means that you will not be able to reconstruct your data,
rendering this an undesirable solution to the problem.
Instead, the real problem is to obscure the data, rather than destroy it. If only there were a
way to scramble the information in some way that renders it unreadable to anyone who does
not know how to unscramble it. This is the ultimate goal of cryptology, the study of securing
(or encrypting) information such that it is inaccessible by third parties.
This hypothetical scenario is a stylized version of what happens every time you make a
private phone call, access a web page, or send an email or text message. You, like Alice,
are reliant upon the computers, routers, online services, and other parts of the network
infrastructure to deliver the data of your message to your intended recipient. Unfortunately,
like with Eve, you have no oversight of these components and cannot guarantee that an
interested third party is not accessing the private data that you are transmitting.
Caesar Cipher
One of the earliest and simplest attempts at encryption is the Caesar cipher, employed by
Julius Caesar in the 1st century BC. This scheme is known as a substitution cipher because
it substitutes each letter of the original, unencrypted message (called the plaintext) with a
corresponding letter in the final, encrypted message (called the ciphertext).
The Caesar cipher works by aligning two alphabets against one another and offsetting them
by a number of positions. Caesar himself used a “left rotation” of three spaces, causing an
a of the plaintext to align with an x in the ciphertext.
Plaintext: abcdefghijklmnopqrstuvwxyz
Ciphertext: xyzabcdefghijklmnopqrstuvw
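The alignment above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the interactive demo. It assumes lowercase text, treats a left rotation of three as a forward offset of 23, and leaves spaces and punctuation untouched. The function name caesar is made up for this example.

```python
import string

ALPHABET = string.ascii_lowercase  # "abcdefghijklmnopqrstuvwxyz"


def caesar(text: str, offset: int) -> str:
    """Shift each lowercase letter forward by `offset` positions, wrapping around."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + offset) % 26])
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)


print(caesar(ALPHABET, 23))            # reproduces the ciphertext alphabet above
ciphertext = caesar("this is a caesar cipher", 23)
print(ciphertext)
print(caesar(ciphertext, 3))           # 23 + 3 = 26, so a further shift of 3 decrypts
```

Because the alphabet wraps around after 26 letters, shifting by 23 and then by 3 returns every letter to where it started.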
Try playing around with an interactive demo of the Caesar cipher to see how messages can
be encrypted and decrypted.
For example, if you enter a plaintext message of this is a caesar cipher and set
the offset to be 23 (left shift of 3), then clicking on “Encipher Plaintext” will produce the
following ciphertext:
qefp fp x zxbpxo zfmebo
Decoding Exercise
See if you can decipher these messages. Note that you might need to try a number of
different offsets to find the right key to decrypt the message back into an intelligible plaintext.
PGGTFU CZ POF
LUJYFWAPVU PZ MBU
Keys
Notice how in order to decrypt each of these messages, you must know three things:
The last of these items, the offset, serves as the key that effectively locks and unlocks the
message. Using the key to encrypt the message into a ciphertext secures the message and
protects it from prying eyes, much like locking the message in a box or safe. Likewise, you
use the key to properly align the alphabets and unlock the message in order to read the
original plaintext.
In the earlier scenario, as long as Bob knows which key Alice used to encrypt her message,
he can use the same key to decrypt the message once he receives it. However, without
being told which key was used, Eve cannot decrypt the message as easily as either Bob or
Alice.
Obviously, as you have previously seen for yourself, a Caesar cipher’s key can be easily
deduced through simple trial and error. After all, there are only 25 possible keys and it is
easy to try each one through brute force.
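To see how little work brute force takes here, consider this short Python sketch, which simply tries all 25 offsets and prints every candidate. The helper names shift and brute_force are hypothetical, and the sample ciphertext is an assumption chosen for illustration (a lowercase message shifted forward by 23).

```python
import string

ALPHABET = string.ascii_lowercase


def shift(text: str, offset: int) -> str:
    """Shift each lowercase letter forward by `offset`, leaving other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + offset) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def brute_force(ciphertext: str) -> dict[int, str]:
    """Return every candidate plaintext, keyed by the offset that produced it."""
    return {offset: shift(ciphertext, offset) for offset in range(1, 26)}


candidates = brute_force("qefp fp x zxbpxo zfmebo")
for offset, candidate in candidates.items():
    print(offset, candidate)
```

A person (or a program with a dictionary) only has to scan the 25 lines of output for the one that reads as intelligible text.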
Today, modern encryption schemes are far more sophisticated than those used in Julius
Caesar’s day and they use far more complex keys that are much harder to guess through
brute force techniques. But the model used more than 2000 years ago is still more or less
the one we use today.
The only difference between the ancient Roman times and now is that computational
processing power has made Eve’s job much easier and the need for stronger encryption
algorithms and keys much greater.
Encoding Exercise
Using the Caesar cipher simulator, create five new ciphertexts of your own and exchange
them with a partner. See who can decrypt the other’s five messages first.
Consider designing a more sophisticated algorithm that would be more secure than the
Caesar cipher. What sequence of steps could you perform to securely encode your message
that would make it harder for an “Eve” to crack your code?
UNIT TOPIC:
Cybersecurity
Highlights
Cybersecurity
A Caesar cipher may be sufficient for obscuring notes passed in class, but modern
technology makes circumventing such simple algorithms quite easy. After all, there are only 25
possible offsets to attempt in a simple Caesar cipher based on the English alphabet. More
sophisticated algorithms were developed centuries ago, such as the Vigenère cipher—which
effectively rotates the Caesar cipher offset used to encrypt each new letter in a text.
Traditionally, the CIA Triad defines the target areas when developing a secure system. CIA is
an initialism representing the following concepts:
Confidentiality
Integrity
Availability
Although the field of information security has grown to include other areas of focus, such as
authentication, these three core concepts remain central.
Confidentiality
Confidentiality is the ability to limit access to information to a certain set of users.
Confidentiality is what most people equate with cybersecurity, and at the heart of
confidentiality protocols lies encryption. We examined encryption—through the Caesar
cipher—as an example application of an algorithm. Although the basic algorithm of the
Caesar cipher—using mathematics to convert one character to another—remains relevant
today, the protocols and mathematics used to apply the algorithm have gotten much more
sophisticated.
Keys
The first departure from the Caesar cipher algorithm is the use of more sophisticated keys in
generating an encrypted text. The Vigenère cipher took the basic idea of a Caesar cipher
and added an extra piece of information to make it more secure—a keyphrase.
In this simple example comparing the two methods, imagine that a plaintext message, IT
IS BURIED IN THE BACKYARD is encrypted first with a Caesar cipher with an offset of 3:
and second with the Vigenère cipher using the keyphrase DIG :
In reality, this Vigenère cipher is just using three different Caesar ciphers in succession,
each with an offset corresponding to the letters of the keyphrase, DIG :
It is worth noting that the Vigenère cipher with the keyphrase of simply the letter D is the
same as our original Caesar cipher in this example—each time offsetting by 3.
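The comparison can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the message is uppercase, each keyphrase letter stands for an offset (A = 0, so D = 3, I = 8, G = 6), and the keyphrase advances only when a letter is encrypted, so spaces are skipped. The function name vigenere is made up for this example.

```python
from itertools import cycle


def vigenere(plaintext: str, keyphrase: str) -> str:
    """Encrypt letters with offsets taken from the keyphrase, repeated as needed.

    Assumes a non-empty, alphabetic keyphrase. Non-letters are copied unchanged
    and do not use up a keyphrase letter.
    """
    offsets = cycle([ord(k) - ord("A") for k in keyphrase.upper()])
    out = []
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + next(offsets)) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


message = "IT IS BURIED IN THE BACKYARD"
print(vigenere(message, "D"))    # same as a Caesar cipher with an offset of 3
print(vigenere(message, "DIG"))  # three Caesar ciphers used in rotation
```

With the keyphrase DIG, the two occurrences of I in IT IS are encrypted to different letters, which is what makes patterns in the text harder to spot.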
Restricted Knowledge
Notice that each of these encryption methods relies on restricted knowledge.
With the Caesar cipher, anyone who knows the algorithm and the offset can decrypt the
message. However, we have seen that even if you know just the algorithm, it is not hard to
decrypt without also knowing the offset—there are only 25 possibilities. With the Vigenère
cipher, anyone who knows the algorithm and the key can decrypt the message. It is much
more difficult to decrypt the message without the key than before because patterns in the
text are less obvious, and there are many more possibilities than 25.
Scenario 1:
Alice sends Bob a locked box with her message inside. Although it gets passed through
many hands before reaching Bob (e.g., the courier system), it is locked and so Bob receives
it securely. However, to unlock it, he needs the key. How does Alice send Bob the key in a
secure way?
Scenario 2:
Alice sends Bob a locked box with her message inside. When it reaches Bob, he puts his own
lock on it and sends it back. Then Alice removes her lock, and sends the box back to Bob.
When Bob receives the box, it is locked only with the lock that he put on it. He unlocks it and
retrieves Alice’s message—or does he?
Scenario 3:
Bob has invented a special lock. It is special because it costs nothing to duplicate and send,
and it is virtually impossible to analyze the lock and create a key. The key and the original
lock must be created at the same time. He sends out his locks to anyone who wants to send
him a message. Alice locks her box with one of Bob’s special locks and sends it to him.
When he receives it, he unlocks it with his special unique key and reads the message.
The third scenario is how modern secure message passing works. For example, when a website
asks a consumer to send credit card information through a web form, the website first sends
the user a lock with which to encrypt that information. Information locked this way can be
read only by the website that created the original key-lock pair. This protocol was originally
called Secure Sockets Layer (SSL); its modern successor is Transport Layer Security (TLS). A
secure connection is typically indicated by a padlock icon in the browser’s address bar.
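The text above does not name the mathematics behind Bob’s special lock. One well-known scheme with this property is RSA, and the toy sketch below uses it purely as an illustration. The tiny primes, the names lock and unlock, and the sample message are all assumptions. Real systems use numbers hundreds of digits long, and this example offers no actual security.

```python
# Toy key pair built from two tiny primes (illustrative only, trivially breakable).
p, q = 61, 53
n = p * q                    # shared by the lock and the key
phi = (p - 1) * (q - 1)
e = 17                       # public exponent: the "lock" Bob hands out freely
d = pow(e, -1, phi)          # private exponent: the "key" only Bob holds (Python 3.8+)


def lock(message: int) -> int:
    """Anyone can do this using only the public values e and n."""
    return pow(message, e, n)


def unlock(ciphertext: int) -> int:
    """Only the holder of the private value d can do this."""
    return pow(ciphertext, d, n)


secret = 65                  # a message encoded as a number smaller than n
locked = lock(secret)
print(locked, unlock(locked))
```

The lock (e and n) and the key (d) are created together from the same primes, which mirrors the statement that the key and the original lock must be created at the same time.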
Integrity
Integrity is the certainty that information is accurate.
Although confidentiality is the most well-known aspect of information security, integrity may
be the most important. In considering the difference between the two, imagine an online
banking system. Confidentiality dictates that only the account’s owner has access to the
information. Integrity guarantees that the information is correct. If every time the page is
refreshed, a new (and incorrect) bank balance is shown, then the confidentiality of the
information is arguably less important than its integrity.
We will discuss bits (binary digits) in detail in Unit 3: Data Representation, but for now, at a
high level, imagine that your bank balance is encoded using only zeroes and ones.
So, your balance of $1,000 might be stored in the bank’s database as:
0000001111101000
However, in transmission between the bank’s database in Switzerland and your home
computer, one of the bits gets flipped from a 1 to a 0 :
      X
0000000111101000
When you view your bank balance, instead of the expected $1,000, you see $488!
One little binary digit just cost you more than half of your money!
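The arithmetic can be checked with a few lines of Python. This sketch assumes the balance is stored as a plain 16-bit unsigned binary number, as in the example above. The variable names are illustrative.

```python
stored = "0000001111101000"

balance = int(stored, 2)            # interpret the string as a base-2 number
flipped = balance ^ (1 << 9)        # flip the single bit worth 2**9 = 512

print(balance)                      # the original balance
print(format(flipped, "016b"))      # the corrupted 16-bit pattern
print(flipped)                      # the balance after the bit flip
print(balance - flipped)            # how much one flipped bit cost
```

Each bit position is worth a different power of two, so the damage done by a single flipped bit depends entirely on which bit it is.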
Computer scientists have created an entire subfield called “coding theory” devoted to solving
problems such as these.
Can you brainstorm some ways to protect against this type of error?
Repetition
One of the simplest ways to detect errors such as this is to simply repeat what you send
multiple times.
How might this help resolve errors?
Is it guaranteed to resolve errors?
What is a drawback of using this method?
Can you brainstorm any other methods to present data to guarantee its correctness?
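One possible way to explore these questions is to simulate the repetition idea. The sketch below assumes the sender transmits three copies of the same bit string and the receiver takes a majority vote at each bit position. The function names and the choice of three copies are assumptions for illustration, and other designs are certainly possible.

```python
from collections import Counter


def send_with_repetition(bits: str, copies: int = 3) -> list[str]:
    """Transmit the same bit string several times."""
    return [bits] * copies


def majority_vote(received: list[str]) -> str:
    """Rebuild the message by taking the most common bit at each position."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*received)
    )


received = send_with_repetition("0000001111101000")
received[1] = "0000000111101000"    # one copy is corrupted in transit
recovered = majority_vote(received)
print(recovered, int(recovered, 2))
```

If two of the three copies happen to be corrupted at the same bit position, the vote picks the wrong bit, and sending every message three times triples the amount of data transmitted.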
Availability
Availability is the reliability of access to information.
Moreover, in this scenario, the consumer interacts with a bunch of websites, not just the
retailer. Perhaps she visits a search engine to find and visit specialized websites for price
comparison, reviews, and retail. The entire process as a whole involves two-way
communication between multiple computer systems.
Imagine that one of these pieces suddenly becomes unavailable—how does that affect the
scenario?
An Analogy
Imagine that you are in a classroom of about 30 students. Alice and Bob are the class
“know-it-alls” who always raise their hands to answer questions posed by the teacher, Mr.
Garrison. A fellow student, Eve, decides to play a trick on them. She visits each of the other
students in the class and convinces them to raise their hands anytime Mr. Garrison asks a
question. Assuming he calls on any of the other students, not Alice or Bob, they agree to
respond, “May I go to the restroom?” or “Did you lose some weight?” or something else
unrelated.
Think About It
Can you imagine any solutions to the DDoS attack? How might Mr. Garrison solve and/or
prevent the analogous attack in his classroom? What are some potential drawbacks to any
solutions he implements? How well do these solutions carry over to DDoS attacks on web
servers?
Social Engineering
Information security is not just focused on technical attacks on computer systems. Many
malicious attackers use social engineering techniques as well. Social engineering refers to
the “psychological manipulation of people into performing actions or divulging confidential
information.”
However, there are many other types of attacks that rely on manipulating people into
believing something false or not entirely true. Most of these rely on the use of malware
(malicious software).
Viruses/Worms
Botnets
Keyloggers
Backdoors
Trojan Horses
Time Bombs
Spyware
Exercise
Select, research, and report on one of these examples of malware. Include the following
information in your report:
How does it work?
How does social engineering figure into its distribution (i.e., what is misleading about
its installation or use)?
What are some real-world examples?
How is it prevented and/or removed?
BIG PICTURE:
Electronic Voting
Highlights
You will integrate the cybersecurity concepts of confidentiality, integrity, availability, and
social engineering in the greater context of electronic voting.
Electronic Voting
From Paper to Bits
One of the biggest cybersecurity concerns is electronic voting. Paper ballots are increasingly
giving way to electronically submitted, tallied, and stored votes. We have seen ways the CIA
Triad is applied to electronic messaging, data storage, and account authentication to protect
against malicious attacks. However, given the importance and far-reaching implications of
election results, electronic voting must adhere to the CIA Triad perhaps more than any other
application.
Your Task
You will select one of the components of the CIA Triad (confidentiality, integrity, or
availability) and explore its implications for electronic voting. Your group will present to the
class on the following:
UNIT TOPIC:
Programming Languages
Grammar, Vocabulary, and Syntax
You will examine the need for clarity and precision in communicating an algorithmic
solution to a problem.
Idea to Execution
You will examine the process in which a program is written in a high-level language,
compiled into a low-level language, loaded into memory, and then executed by a
processor.
Clarity and Ambiguity
Communication Is Key
Algorithms are just lists of instructions for solving a given problem. But in order for the
instructions to work, they must be followed exactly as intended. Fortunately, computers are
very good at doing what they’re told. In fact, they’re designed to execute a sequence of
operations in a completely predictable manner. They do exactly what they’re told.
When computers misbehave, it’s not the fault of the computer, but the fault of the program.
More specifically, it’s the fault of the programmer—the human who designed the algorithm in
the first place. If the computer does something the programmer didn’t intend, it’s because it
was given the wrong set of instructions. In order to get a computer to do what’s intended,
you must specify what you mean very precisely.
Exercise
Sketch a picture of the scene described by the following statement: “She saw the man on the
hill with the telescope.” Then, compare your drawing to those of your neighbors. Are they the
same? Are they different? Why?
Avoiding Ambiguity
If you ask someone for directions and they say, “Third door on the left,” do they mean their
left or your left? Without clarifying “my” or “your” left, the statement is ambiguous; its exact
meaning can be interpreted in different ways.
In the previous exercise, you and your neighbors likely did not all draw the same thing. And
yet, all of your drawings were probably perfectly correct interpretations of the phrase, “She
saw the man on the hill with the telescope.”
So why the difference in the drawings? Because the statement is ambiguous! Computer
scientists would say that the sentence has multiple parse trees. That means that there are
multiple ways in which the sentence can be interpreted, or parsed.
If we look more closely at the sentence, the heart of the problem stems from the
prepositional phrases, “on the hill” and “with the telescope.” Which of these nouns does “with
the telescope” refer to? The woman? The man? Or the hill? Without additional information, it
could modify any of them.
In fact, there are six different interpretations of the sentence that are all equally valid from a
grammatical standpoint.
Which one of these did you draw? Can you draw the six different interpretations?
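The same kind of ambiguity can show up in arithmetic. As a rough illustration (not part of the original exercise), the Python sketch below evaluates the expression 8 - 3 - 2 under two possible groupings. Programming languages avoid this ambiguity by fixing one grouping in their grammar; Python groups subtraction from left to right.

```python
# Two possible ways to group (parse) the expression 8 - 3 - 2.
left_grouping = (8 - 3) - 2   # subtract left to right
right_grouping = 8 - (3 - 2)  # subtract right to left

# Python's grammar allows only one interpretation: left to right.
python_result = 8 - 3 - 2

print(left_grouping)                    # 3
print(right_grouping)                   # 7
print(python_result == left_grouping)   # True
```

Both groupings are reasonable readings of the same symbols, yet they give different answers. A language that allowed both would be ambiguous in the same way the telescope sentence is.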
UTeach Computer Science - http://uteachcs.org © 2016 The University of Texas at Austin
Artificial Languages
Natural Languages
English, French, Spanish, Arabic, Chinese, Hebrew, Italian… Languages spoken throughout
the world are known as natural languages. These are languages that evolved naturally
through everyday communication between people over hundreds of years. These languages
are complex, but also structured. Each has its own set of specialized vocabulary,
grammatical structure, and syntax that distinguishes it from another. Because these
languages grew around human communication, they usually have both written and spoken
forms. In addition, the formality of the language can vary by usage (e.g., the way you might
talk to your grandmother vs. your friends, “lol t3xtsp34k”).
Despite all their differences, the one thing that natural languages have in common is their
potential for ambiguity. Ambiguity makes natural languages a very poor means of
communicating clear and precise instructions to be interpreted and executed exactly as
intended. In short, natural languages make lousy programming languages. As computer
scientists, we need something better.
Over the years, a number of attempts have been made to develop and popularize a single,
“universal” written/spoken language that could cross cultures and political borders.
Esperanto is one such attempt. To date, only about two million people speak the language
(and most of those as a second language).
10 PRINT "Hello"
print "Hello"
System.out.print("Hello");
Notice how even without knowing the language or what all of the special punctuation might
mean, each of these statements is relatively readable to any English speaker (yes, most
keywords in these programming languages tend to be English, as opposed to another
natural language). Compare these to the way we’ve seen algorithms informally expressed
throughout this unit:
We will revisit textual programming languages later in Unit 4: Digital Media Processing,
where you’ll get to try your hand at writing programs in the Processing programming
language.
To combat this problem, a number of visual programming languages have been developed
to allow programmers to drag and drop pictures or icons into organized blocks that represent
the different parts of a program. Blocks-based programming tools automatically handle all
the grammar and syntax the programming language requires. This allows novice
programmers to focus on the logic of their programs, rather than the syntax, spelling,
capitalization, etc.
The example above shows how individual programming “blocks” can be assembled in a
language like Scratch. In Unit 2: Programming, you will get to build your first interactive
programs using Scratch.
High-Level vs. Low-Level Languages
Language Hierarchy
A wide range of languages exists, both natural (like English) and artificial (programming
languages like Scratch). Ultimately, language is a tool for communicating an idea. In the
case of computing, this communication may be human, machine, or anywhere in between.
At the high end of that spectrum lie the languages that are best for human communication.
These languages are complex and use a high level of abstraction (a concept we’ll explore in
greater depth in Unit 3: Data Representation). They are symbolic and rely on strings of
letters to form words that represent abstract concepts, ideas, actions, and objects. Our
brains are optimized for this form of symbolic expression, so these languages are easy for
us to read, write, and understand. Different programming languages offer different levels of
abstraction. Examples can be found below. With all of their abstraction and inconsistent
grammar and usage, natural languages are difficult to use in ways that will allow machines
to understand our intent. With products like Apple’s “Siri” interface or Google’s “OK Google”
voice search feature, the technology has made great strides in recent years, but there’s still
a long way to go before machines can truly understand natural languages.
Low-level languages, however, are optimized for machines. Computers are built using highly
structured circuitry that responds to very logical and clear signals. Because low-level
languages are much more concrete and straightforward, with limited vocabulary and very
structured syntax, nothing is left to interpretation.
Machine languages are the most basic type of programming language. They represent the
actual binary instructions issued to computer processors and are very difficult for humans to
read. Moreover, they may vary from computer to computer! Believe it or not, in the early
days of computer programming, all programs were written in binary machine language.
Today, this is very rare.
The table below outlines this language hierarchy. Notice that the natural languages in the
first row (e.g., English) are suited only for humans, and processor-specific languages in the
last row (i.e., ones and zeroes, 10000010) are suited only for machines. The languages in
between bridge that gap. High-level programming languages (like Scratch, C++, and Python)
and low-level programming languages, which tell the computer processor what to do but are
still somewhat readable by humans, allow humans and computers to communicate with
each other.
Low-level Programming (Assembly)
Guaranteed to be unambiguous. Less natural for humans, but still readable to the trained
eye.
LD R0 2
LD R1 3
ADD R0 R1
ST R0 X
The high-level and low-level programming languages in the middle of the table are uniquely
well-suited for computer scientists because they are abstract enough to feel natural and
intuitive to humans, but basic and structured enough for machines to process with the level
of precision that’s required to have them do what we want.
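To make the contrast concrete, the following Python sketch simulates the four assembly-style instructions shown in the table. The meanings assumed here for LD (load a value into a register), ADD (add the second register into the first), and ST (store a register into a named memory location) are guesses made for illustration, since the text does not define them. The function `run` is hypothetical and is not a real assembler.

```python
def run(program):
    """Simulate a tiny, made-up instruction set (an assumption for illustration)."""
    registers = {}  # register name -> value, e.g. {"R0": 5}
    memory = {}     # named memory locations, e.g. {"X": 5}
    for line in program:
        op, a, b = line.split()
        if op == "LD":       # LD Rn value: load a constant into a register
            registers[a] = int(b)
        elif op == "ADD":    # ADD Rn Rm: Rn = Rn + Rm
            registers[a] = registers[a] + registers[b]
        elif op == "ST":     # ST Rn name: copy a register into memory
            memory[b] = registers[a]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

low_level = ["LD R0 2", "LD R1 3", "ADD R0 R1", "ST R0 X"]
print(run(low_level))  # {'X': 5}

# The same idea in a high-level language takes one line:
x = 2 + 3
print(x)  # 5
```

Under these assumptions, four low-level steps do the work of one high-level statement, which is the kind of abstraction that high-level languages provide.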
language, allowing you to drag-and-drop blocks to communicate with the computer without
worrying about spelling, punctuation, etc., whereas Processing is a textual programming
language.
One of the key features of most high-level programming languages is that, because of the
abstraction they employ, people can program without needing to worry about (or even know
about) the specific configuration or design of the computer’s underlying circuitry. This allows
the programmer to focus solely on the task of designing and coding a logical solution to
whatever problem they might be working on.
Languages like Scratch and Processing are platform-independent, meaning that the
programs you write will run on just about any modern computer, regardless of the model,
version of the operating system (e.g., Windows, Mac OS, Linux, etc.), or maker of the
hardware.
Fortunately, most programmers never need to actually work at such a low level or program
directly with 1s and 0s. Instead, software development tools, such as the Scratch and
Processing interfaces you will use in upcoming units, automatically generate the low-level,
binary code that the processor requires. They do this by translating your high-level, abstract
instructions into lower-level machine code through a process known as compilation, which
we’ll explore in the next section.
UNIT TOPIC:
Programming Languages
Idea to Execution
You will examine the process by which a program is written in a high-level language,
compiled into a low-level language, loaded into memory, and then executed by a
processor.
Idea to Execution
Life Cycle of a Program
How does a program become a program? And what does a computer actually do when it
“runs” a program? Let’s take a peek under the hood and examine the process of how an
idea gets turned into a dynamic, interactive program running on your computer.
The Idea
It all starts with an idea. Well, actually, it starts with a problem that necessitates a solution
that then gives rise to the idea for how to solve the problem.
Smart problem solvers begin the process of turning an idea into something real by first
analyzing the problem at hand. They ask questions, collect information, gather data, look at
existing solutions, and think through the situation as thoroughly as they can before diving
into a solution.
It might seem difficult to restrain yourself when you are struck with a brilliant idea. But the
best ideas are almost never the first ones that come up. Remember the encryption exercise.
Most students could intuitively come up with the Caesar cipher all on their own, but that
proves to be one of the least robust solutions to that problem.
In practice, the first idea to come to mind is almost always an inferior solution. But with
careful thought, consideration, and analysis of the idea’s strengths and weaknesses, a good
designer will always come up with something better than their first instinct. So, avoid those
knee-jerk reactions to initial ideas and look for the better idea. It’s out there. You just need to
make the effort to seek it out.
Construct an Algorithm
Once a developer has identified a solution that is worth pursuing, the next step is to begin
turning it into a well-specified plan of action.
As we’ve seen earlier in this unit, computer scientists—whether they are programmers
writing code or designers developing algorithms—apply their computational thinking skills to
the process of mapping out a logical sequence of instructions that will solve the problem.
They recognize and use the sequencing, selection, and iteration patterns we’ve looked at in
the past. They anticipate and identify all of the unseen and unexpected complications that
their solution will need to handle. And they clearly define the requirements, expectations,
and specifications for how the resulting program should perform.
Smart programmers also design thorough test cases to verify that all aspects of their
program perform as intended. Sometimes the design of the algorithm is flawed, leading to a
program that misbehaves. Other times, the algorithm is sound, but the program fails to
accurately implement it in the chosen programming language. Either way, it is important that
these bugs (i.e., errors) in the program are identified and corrected before the software is
distributed to end-users.
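As a small illustration of what test cases can look like, the Python sketch below tests a Caesar cipher like the one from the encryption exercise. The function name `caesar_encrypt` and the particular test values are assumptions chosen for this example, not part of the course materials.

```python
def caesar_encrypt(text, shift):
    """Shift each letter forward by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isalpha() and ch.isascii():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # leave spaces, digits, and punctuation alone
    return "".join(result)

# Test cases: typical input, wrap-around, mixed case with punctuation, no shift.
assert caesar_encrypt("abc", 1) == "bcd"
assert caesar_encrypt("xyz", 3) == "abc"
assert caesar_encrypt("Hello, World!", 13) == "Uryyb, Jbeyq!"
assert caesar_encrypt("attack at dawn", 0) == "attack at dawn"
# Encrypting and then shifting back should return the original message.
assert caesar_encrypt(caesar_encrypt("Secret", 5), -5) == "Secret"
print("all tests passed")
```

Each test targets a different situation the program must handle. If any assertion fails, either the algorithm or its implementation contains a bug.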
Fortunately, the programmer does not have to personally perform this tedious translation.
Instead, it can be automated in a process called compilation. This basically means that
your program can be translated by another program! During the compilation process,
another program, called the compiler, translates each of your high-level, human-readable
instructions into a corresponding string of 1s and 0s that make sense to the computer’s
processor. The resulting string of binary information constitutes an executable program.
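A loose analogy can be seen in Python itself, using its built-in `compile` function. Note the assumption here: Python translates source text into bytecode for its own virtual machine rather than directly into the processor's machine code, so this is only a sketch of the general idea of one program translating another.

```python
source = "total = 2 + 3"   # a human-readable, high-level instruction

# compile() is Python's built-in translator from source text to a code object.
code_object = compile(source, "<example>", "exec")

# The translated instructions are stored as raw bytes.
print(type(code_object.co_code))   # <class 'bytes'>

# Executing the translated code produces the result the source described.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])          # 5
```

The programmer writes only the first line; the translation into a lower-level form is automated.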
Once a program’s binary contents have been copied into RAM, the computer’s central
processing unit (CPU) begins reading each of the binary instructions bit-by-bit, byte-by-
byte, word-by-word as it executes each instruction in the sequence that the code/algorithm
specifies.
The sum total of all of these discrete operations is what the end user recognizes as a
properly functioning program, whether it is a web browser, a word processor, a game, a
streaming media player, or whatever else it was that the original idea that started this
process was intended to do.
CODING SKILLS:
Pseudocode
Highlights
You will examine the use of pseudocode to quickly and clearly express general
algorithmic ideas.
You will familiarize yourself with the College Board’s standardized pseudocode that will
be used on assessments throughout the course and on the AP Computer Science
Principles exam in May.
Pseudocode
Communication Is Key
You might have noticed that throughout this unit, we’ve been using a particular style and
format for presenting each of our algorithms:
This format was chosen specifically to communicate very precise and unambiguous
instructions while maximizing its readability. It has a very clear and mostly intuitive structure
that has hopefully been easy for you to read, follow, and understand. Unfortunately, it has
some drawbacks.
For example, it can tend to be a bit wordy. We might start to abbreviate our wording or drop
minor words that aren’t essential for getting the idea across, like articles ( a , an , the ).
Consider these two alternatives.
4) ...do operation3
One could argue that the second version is just as clear about its intent despite being more
concise. Of course, anytime you abbreviate some text or leave out some words, you run the
risk of leaving out important information. So the challenge becomes finding the right balance
between conciseness and completeness—that is, being brief, but still saying everything you
need to say.
This is a major issue when a language developer tries to design a new programming
language. As we saw when we discussed high-level and low-level languages, each
language has its own particular set of grammatical rules and structures that are designed to
allow programmers to clearly communicate everything they need to express with
conciseness and precision.
However, when talking about algorithms in general, we’re interested in being able to
describe solutions at an extremely high conceptual level without needing to get into the
specifics of the grammar of a particular programming language. And that’s where
pseudocode comes into play.
"False Code"
We use the term pseudocode to describe any method used to express an algorithm in an
informal, language-agnostic manner. Derived from the Greek word pseudes, or “lying, false,”
pseudocode refers to a way of writing formal instructions in an informal manner that does not
necessarily adhere to the grammatical rules and syntax of any particular language.
Instead, pseudocode aims to mimic the general style of a programming language without
worrying about the exact syntax or structure of the language. It’s basically a shorthand way
of writing out the details of an algorithm.
In fact, while pseudocode does not have any standard, universal form, most individuals or
organizations that need to develop and communicate high-level algorithms will often settle
on their own, mutually agreed-upon set of notational conventions that they will use.
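As a small illustration, the comments below give one possible pseudocode description of finding the largest value in a list, followed by a Python version of the same algorithm. The pseudocode wording and the name `find_largest` are invented for this sketch and do not follow any standard notation.

```python
# Pseudocode (informal, language-agnostic):
#   set largest to first item in list
#   for each remaining item in list
#       if item > largest, set largest to item
#   report largest

def find_largest(items):
    """Python version of the pseudocode above (assumes a non-empty list)."""
    largest = items[0]
    for item in items[1:]:
        if item > largest:
            largest = item
    return largest

print(find_largest([4, 9, 2, 7]))  # 9
```

The pseudocode conveys the same logic without committing to Python's punctuation, indentation rules, or keywords.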
Pseudocode for the AP Exam
Exam Reference Sheet
The following materials are available in the “Reproducibles for Students” section (p. 102) of
the AP Computer Science Principles Course and Exam Description. The “AP Computer
Science Principles Exam Reference Sheet” begins on p. 114.
While students are free to use any format of their choosing when writing their own
algorithms, they are encouraged to use this style guide as a reference.
UNIT TOPIC:
Solvability and Performance
Decidability and Efficiency
You will identify which problems can and cannot always be solved by an algorithm.
You will examine methods of comparing equivalent algorithms for relative efficiency.
You will evaluate the relative efficiency of equivalent algorithms.
You will identify factors that allow solutions to scale efficiently.
You will examine the implications of Moore’s Law on the research and development of
new and existing technologies.
Decidability and Efficiency
Can Computers Solve Every Problem?
It probably doesn’t surprise you to hear the answer is no. However, it may
surprise you to learn that it has been mathematically proven that some
problems cannot be solved by any algorithm. In fact, Alan Turing, one of the
most influential figures in computer science, theorized what would later
become the digital computer as part of his proof.
Scalability
So, how do we know that one algorithmic solution is any better than another? In order to
answer that question, we first need a way of measuring the relative performance that an
algorithm offers. And that measurement comes down to a question of scalability. That is,
how well does the algorithm perform at larger and larger scales?
For example, if we double the number of items in a list, how much harder does that make
the task of finding a particular item in that list? What if we triple the size of the list?
Quadruple it? What if we increase the size of the overall problem by a factor of N? How
much does that increase the amount of work required by the algorithm?
Computer scientists use something called “Big-O Notation” (a mathematical concept we
won’t get into here) to describe how well different algorithms scale. This allows them to
classify different solutions into different categories of relative performance.
Constant: No matter how much a problem grows, the amount of work stays more or
less the same.
Logarithmic: Every doubling of the size of a problem only requires one extra unit of
work.
Linear: As the size of a problem grows, the amount of work required grows at
approximately the same rate.
Quadratic: As the size of a problem grows, the amount of extra work required
increases much more quickly.
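The four categories above can be compared with a rough Python sketch. The function `work_units` is a hypothetical helper, and the formulas are simplified stand-ins for each category (for example, a base-2 logarithm rounded up), not exact measurements of any real algorithm.

```python
import math

def work_units(n):
    """Rough work estimates for a problem of size n in each category."""
    return {
        "constant": 1,
        "logarithmic": math.ceil(math.log2(n)),
        "linear": n,
        "quadratic": n * n,
    }

# Double the problem size twice and watch how each category responds.
for n in (1_000, 2_000, 4_000):
    print(n, work_units(n))
```

Each doubling of n leaves the constant estimate unchanged, adds one unit to the logarithmic estimate, doubles the linear estimate, and quadruples the quadratic estimate.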
Performance Comparisons
Imagine you needed to search through a list of 1,000,000 items.
If the item were toward the front of the list, Sequential Search would find the item with very
few examinations. However, if the item were toward the end of the list, Sequential Search
might need to look through as many as one million items! On average, depending on where
the item might be located in the list, Sequential Search will require examining about 500,000
items. This is referred to as having linear performance because there is a linear relationship
between the size of the problem (i.e., the total number of items you must search through)
and the amount of work required to solve the problem (i.e., how many items you must
actually look at before finding what you are looking for).
Compare that to Binary Search, which does not see the same one-to-one relationship
between the size of the problem and the amount of work. Because of its divide-and-conquer
approach, the amount of work required to find an item grows much more slowly with Binary
Search than with Sequential Search. In fact, with this logarithmic behavior, you can double
the total number of items in the list and it only costs one more unit of additional work. This
means that if the size of the problem grows incredibly large, the amount of extra work that
the algorithm will require will stay relatively small.
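The difference can be observed by counting how many items each approach examines. The sketch below is one straightforward way to write the two searches in Python; the function names and the choice of a list of consecutive numbers are assumptions made for this example.

```python
def sequential_search(items, target):
    """Return the number of items examined before finding target."""
    examined = 0
    for item in items:
        examined += 1
        if item == target:
            break
    return examined

def binary_search(items, target):
    """Return the number of items examined; items must be sorted."""
    low, high = 0, len(items) - 1
    examined = 0
    while low <= high:
        mid = (low + high) // 2
        examined += 1
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1    # target can only be in the upper half
        else:
            high = mid - 1   # target can only be in the lower half
    return examined

items = list(range(1_000_000))   # a sorted list of one million items
target = 999_999                 # worst case for sequential search

print(sequential_search(items, target))  # 1000000
print(binary_search(items, target))      # at most 20
```

Note that Binary Search only works because the list is sorted; Sequential Search makes no such assumption.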
Problems with Scalability
Small and simple problems are often easy to solve because even an inefficient solution likely
doesn’t require much work in the first place. But as problems grow larger, the efficiency of an
algorithm becomes more and more important. So important that it might mean the difference
between the problem even being solvable or not. Improvements in algorithms, hardware,
and software increase the kinds of problems and the size of problems solvable by
programming.
In this case, it is not so much a problem with how efficiently the algorithm scales, but rather
a problem with how quickly the size and complexity of the problem can scale.
Imagine you have a four-digit ( 0 – 9 ) passcode, like you might have on your phone or
home security system. How many possible passcodes exist? Well, if the code consists of
four digits, each of which is one of 10 possible digits ( 0 – 9 ), then there are 10^4, or 10,000,
unique combinations ( 0000 through 9999 ). A brute force approach would need to
make no more than 10,000 attempts to ensure success. Ten thousand is a lot, but trivial with
the help of computational technology.
Notice how quick and easy it is to increase the size of the problem. Even just a meager,
eight-character passcode made up of alphanumeric characters ( 0 – 9 , a – z , A – Z ) has more
than 200 trillion possible combinations, meaning the brute force approach could similarly
require more than 200 trillion attempts before cracking the passcode. Even at only 12
characters long, such a passcode would allow so many possible passcodes that a brute force
attempt taking only 1 ms per attempt could still require several times the age of the universe.
Clearly, a brute force solution does not scale well.
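The arithmetic behind these figures can be checked with a short Python sketch. The helper names below are hypothetical, and the time estimate assumes a fixed 1 ms per attempt and a 365-day year.

```python
def passcode_count(alphabet_size, length):
    """Number of distinct passcodes of the given length."""
    return alphabet_size ** length

SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years_to_try_all(count, seconds_per_attempt=0.001):
    """Worst-case years to try every passcode at a fixed time per attempt."""
    return count * seconds_per_attempt / SECONDS_PER_YEAR

print(passcode_count(10, 4))    # 10000
print(passcode_count(62, 8))    # 218340105584896
print(f"{years_to_try_all(passcode_count(62, 12)):.2e} years")
```

With 62 possible characters (10 digits plus 26 lowercase and 26 uppercase letters), the 12-character case works out to roughly 10^11 years under these assumptions, far longer than the estimated age of the universe.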
Moore's Law
Moore's Law
Back in 1965, before the industry really had any idea how to
measure its rate of progress, much less the importance of
knowing how to predict the pace of future innovations, Gordon
Moore (co-founder of Intel) made an observation that
revolutionized the technology industry and the way we think
about building upon today’s technology in order to invent
tomorrow’s technology.
As the director of research and development at Fairchild Semiconductor, Moore was asked
to speculate on how he imagined the semiconductor components industry might develop
over the next 10 years. This industry was responsible for turning silicon ingots into the
intricately designed, wafer-thin discs that solid-state components and integrated circuits are
made of. Each of these components comprises millions of tiny transistors, resistors, diodes,
and capacitors, all carefully interlaced to form the inner workings of our modern computing
technology.
Moore noted that as the manufacturing tools and processes miniaturized over time, we were
able to pack more and more of these tiny components into the same amount of space.
Specifically, he observed that the number of transistors that could fit on a chip roughly
doubled every one to two years.
While he originally predicted only that this rate of progress would last for the next decade,
incredibly, it has held true for the last 50 years! Due to its long-lasting accuracy, Moore’s
observation that the density of transistors doubles at a predictable rate has more commonly
been dubbed “Moore’s Law.”
“The best way to predict the future is to invent it.” - Alan Kay
Clearly, we can look back and see that the unreasonable problems of yesterday have
become achievable today. Most software developed today simply would not run effectively
on computer hardware developed just 10 years in the past. Using Moore’s Law, we can see
that just 10 years ago, computer hardware was effectively 100 times slower than it is today!
If we can confirm the trends of the past using Moore’s Law, can we also use it to predict the
future? How do computer scientists and engineers actually use Moore’s Law?
Imagine you have an idea for a new form of technology, but after careful design and
analysis, you’ve estimated that today’s technology is an order of magnitude (i.e., a factor of
10) too slow to handle the massive amounts of real-time computations that your invention
will need.
Using Moore’s Law, if we assume a doubling of speed approximately every 18 months, you
can reasonably predict that computers will achieve the tenfold speed increase that you need
in only five years. You can then begin planning the research and development of your
invention over the next five years, knowing that by the time you need the technology to
perform at the levels you require, it will.
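The five-year estimate can be reproduced with a little arithmetic. The sketch below assumes, as the text does, that speed doubles every 18 months (1.5 years); the function names are hypothetical.

```python
import math

def years_until_speedup(factor, doubling_period_years=1.5):
    """Years needed for speed to grow by `factor`, doubling once per period."""
    doublings_needed = math.log2(factor)
    return doublings_needed * doubling_period_years

def speedup_after(years, doubling_period_years=1.5):
    """Speed growth factor after the given number of years."""
    return 2 ** (years / doubling_period_years)

print(round(years_until_speedup(10), 2))  # 4.98
print(round(speedup_after(10)))           # 102
```

A tenfold increase needs a little over three doublings, or just under five years. The second line shows the same assumption run over 10 years, which gives a factor of roughly 100.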
This is exactly how large tech manufacturers operate. At any given point in time, their
engineers are busy working on technologies that are not yet feasible, but that will become a
reality by the time they are ready to introduce them onto the market.
Heuristics
Heuristics: the “kinda, sorta, maybe” of computing
Until now, problem solving with computers has been discussed in terms of algorithms.
Algorithms are guaranteed to give a “correct” answer if the proper, specific steps are
followed using sequence, selection, and repetition. However, what about problems that we
do not know how to solve? Problems that no known algorithm can sufficiently address?
What about problems that can be solved, but would take too much time or resources?
Although there are some problems that cannot be solved efficiently through computation (or
in some cases solved at all), often a method for deriving an approximate solution may be
defined through the use of heuristics. Heuristics are a problem solver’s “Rules of Thumb”;
they serve as general guidelines meant to encourage progress toward a good answer, but
with no guarantee of actually generating “the correct answer”.
Imagine that you have created autonomous robots that are programmed to dig for gold. You
can set a number of these robots down in a location, and they will explore the terrain and
scan for signs of precious metals. In the past, your company has lost many of these robots
to rain. Although the rain itself is not harmful to them, the flooding that often occurs as a
result leaves them completely submerged for long periods of time.
You are tasked with writing an emergency routine to—quite literally—climb hills. Here’s the
catch: your robots will be deployed in many different locations, and it is impossible to know
where all of the hills and highest locations actually are in any situation. All that the robot is
able to sense is when it is raining, and its current position, including its altitude.
We can’t use an algorithm to find the safest spot and park there, because given our
constraints, that would mean visiting and searching every possible location in the area to
locate the highest point, and then traveling back to it. The time it would take to perform this
is intractable; the robots would have their circuits flooded long before they could accomplish
their goal.
Let’s apply a heuristic. The hill climbing heuristic suggests that you will probably reach a
high point (a local maximum)—not necessarily the highest point (the global maximum)—if
you search your immediate vicinity, choose the highest point available, and relocate there.
Repeating this process is likely to get you to a local maximum.
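Here is a minimal Python sketch of the hill climbing heuristic, simplified to a robot moving along a straight line of terrain. The altitude values and the function name `hill_climb` are made up for illustration.

```python
def hill_climb(altitudes, start):
    """Repeatedly step to the highest nearby position until none is higher."""
    position = start
    while True:
        # Candidates: stay put, or move one step left or right (if in bounds).
        candidates = [position] + [
            p for p in (position - 1, position + 1)
            if 0 <= p < len(altitudes)
        ]
        # max() keeps the first candidate on ties, so the robot stays put.
        best = max(candidates, key=lambda p: altitudes[p])
        if best == position:
            return position  # no higher ground nearby: a local maximum
        position = best

# Hypothetical altitude readings along a straight line of terrain.
terrain = [1, 3, 5, 4, 2, 6, 9, 7]

print(hill_climb(terrain, 0))  # 2 (altitude 5, a local maximum)
print(hill_climb(terrain, 4))  # 6 (altitude 9, the global maximum)
```

Starting from position 0, the robot stops on the smaller hill because every nearby step leads downward, even though higher ground exists farther away. That is the trade-off of a heuristic: a good answer quickly, but not necessarily the best one.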
Distributed Computing
Sharing the Workload
Consider a typical anthill. Depending on its size,
it likely consists of millions, if not billions, of tiny
grains of dirt or clay, each one carefully
excavated from underground and individually
hauled up to the surface where it is deposited.
How long would it take to build such a structure?
A single ant moving one grain of dirt at a time
would take ages. But many anthills, even very
large ones, seem to pop up practically
overnight. How does such a monumental job
happen so quickly?
The answer, of course, is that no single ant builds an entire anthill by itself. Each hill is the
product of an entire colony, with each ant moving only a single grain of dirt at a time. But
collectively, the entire colony is capable of moving thousands of grains at a time, vastly
reducing the time needed to accomplish the greater task.
This same process of distributing the workload for a large or complex task among
multiple workers who can each operate in parallel with one another can also be applied to
large and complex computational problems.
Thanks to the development of the Internet and other large-scale networking environments,
the processing power of multiple computers can be harnessed to work together in solving
complex computations that would otherwise be impractical, if not impossible, to solve using
only a single computer.
Many such problems involve the processing of large amounts of raw data or the simulation
of millions, if not billions, of various scenarios. Rather than asking a single computer to
process the entire data set, it is possible to connect to a large array of additional computers
and systematically farm out smaller segments of the larger data set to each computer. Each
computer then only processes a small fragment of the overall problem, but collectively, this
“colony” of networked computers can crunch through the data and achieve a result in a
fraction of the time.
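The idea of farming out segments of a data set can be sketched on a single machine. In the Python example below, threads from the standard library's `concurrent.futures` module stand in for separate networked computers, which is a simplifying assumption, and adding up numbers stands in for real data analysis. The names `process_chunk` and `split` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    """Stand-in for real analysis: add up the numbers in one chunk."""
    return sum(chunk)

def split(data, parts):
    """Divide data into roughly equal consecutive chunks."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

data = list(range(1, 1_001))   # the full data set: 1 through 1000
chunks = split(data, 4)        # farm out four smaller segments

# Each worker handles one chunk; threads stand in for separate computers.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(process_chunk, chunks))

print(partial_results)        # [31375, 93875, 156375, 218875]
print(sum(partial_results))   # 500500
```

Each worker sees only its own fragment of the data, and the partial results are combined at the end to produce the overall answer.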
SETI@Home
In recent years, a number of projects have emerged that employ this process of distributed
computing to tackle problems that would otherwise be too resource-intensive to solve in a
reasonable amount of time. One of the first such projects to gain widespread popularity is
SETI@Home, launched in May 1999 by the University of California at Berkeley.
SETI (Search for Extraterrestrial Intelligence) is a scientific area
whose goal is to detect intelligent life outside Earth. One
approach, known as radio SETI, uses radio telescopes to listen
for narrow-bandwidth radio signals from space. Such signals are
not known to occur naturally, so a detection would provide
evidence of extraterrestrial technology.
The SETI@Home project searches through massive amounts of astronomical data collected
by radio telescopes in an attempt to recognize unique, space-borne radio signals that do not
occur naturally. Signs of such signals would provide intriguing evidence for the possibility of
some form of intelligence beyond Earth.
First conceived in 1995 around the same time that the World Wide Web was emerging and
growing in popularity, the project aimed to harness the processing power of all of these
newly networked home computers around the world to assist in searching through and
analyzing the large volumes of radio telescope data.
Individual users who wish to participate in SETI@Home and donate their idle computer time
can download the project’s BOINC (Berkeley Open Infrastructure for Network Computing)
software and run it on their own computer. This program connects remotely to the
SETI@Home servers that then deliver an ongoing series of processing jobs that your
computer can crunch on in the background (the program can even be set as your
screensaver). As your computer completes each assigned task, it sends the results back to
the home server and receives its next batch of data to process.
SETI@Home is just one of many distributed computing projects that users can participate in.
The BOINC software alone allows users to join over three dozen scientific research projects
that address topics across the full array of scientific disciplines, including physics,
mathematics, chemistry, encryption, evolution, astronomy, medicine, biology, computer
science, etc.
Botnets
Unfortunately, massively distributed computing can be used for malicious purposes as
well as the beneficial uses described above. Such is the case for botnets—large collections
of networked computers that have been infected by a worm, virus, or other form of malicious
software that enables the computer to be remotely controlled or utilized without the owner’s
knowledge.
Much like the way that the SETI@Home servers deliver payloads of data to be processed on
users’ computers, botnets also deliver commands to their networks of “zombie” computers,
instructing them to perform a variety of less altruistic actions like sending massive amounts
of spam or launching coordinated DDoS (Distributed Denial of Service) attacks upon
unsuspecting sites.
In fact, these botnets are one of the primary reasons that spam and DDoS attacks are as
bad as they are. Firstly, the large number of “zombie” computers forming the botnet
contributes to the massive scale of the spam and attacks. Secondly, the distributed nature of
the botnet makes it difficult to track down the originating computers because the traffic does
not come from a single, centralized source.
Other botnets have been known to hijack the processing cycles of unsuspecting victims’
computers to engage in bitcoin mining.
UTeach CS Principles Unit 1: Computational Thinking
Password Generator Project: Rubric Check
Instructions
This activity gives you and a partner group time to provide one another with critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Unit 2: Programming
UNIT 2
Programming
UNIT PROJECT:
Scratch Program
Highlights
You will collaborate in pairs to design, implement, and debug a novel, aesthetically
pleasing, and intuitive program using the Scratch programming environment.
You will identify a specific purpose that your program will serve (e.g., entertainment,
problem solving, education, artistic expression, etc.).
You will integrate interactive and multimedia elements into your program.
You will integrate common programming constructs, such as variables and selection
statements, into your program.
You will test, debug, and correct your program.
You will use appropriate terminology while writing documentation detailing the full use
of your program and its features.
You will explain your design and implementation choices while demonstrating and
sharing your finished programs with your peers.
You will provide a written analysis of at least one other design team’s program,
identifying its strengths and weaknesses and offering suggestions for improvement.
Scratch Programming Project
“Software is a great combination between artistry and
engineering.” – Bill Gates
https://www.youtube.com/embed/SGtecDQGV-g
Through programming, artists create music and visual art, scientists create models of
possible worlds, engineers build new products, medical researchers design and test possible
cures, and businesses create jobs and wealth. Even as a beginner, through programming,
you have the power to design, to build—to create. What will you create?
Assignment
Submission
Your submission will be in the form of an original program that you develop with Scratch,
along with written documentation detailing its use. You may either submit a link to the
completed program or download the program and submit its file. You must also submit your
documentation as a text document. Your Scratch
program must:
When you are finished, you will submit a link to—or the source file of—your Scratch
program. Your program will be graded using the attached rubric. You will then review other
groups’ submissions, and reflect on any differences from your own work.
Learning Goals
Over the course of this module and this project, you will learn to:
Rubric
Content Area: Variables
Performance Quality, from highest to lowest:
1. Program uses a combination of four or more numeric and string variables appropriately,
AND all variables have meaningful names and purposes in the program.
2. Program uses a combination of fewer than four numeric and string variables appropriately,
AND all variables have meaningful names and purposes in the program; OR program uses
four or more numeric and string variables appropriately, but some variables do not have
meaningful names or purposes in the program.
3. Program uses only one kind of variable, either numeric or string, AND not all variables
have meaningful names or purposes in the program.
4. Not enough criteria are met in order to award any credit.
Program contains four or more conditionals to simulate decisions or implement branching,
but not all conditionals are used effectively and correctly with purpose in the program.
Content Area: Loops
Performance Quality, from highest to lowest:
1. Program implements repetition through the combination of four or more repeat and
forever loops, AND all loops are used effectively and correctly with purpose in the program.
2. Program implements repetition through the combination of fewer than four repeat and
forever loops, AND all loops are used effectively and correctly with purpose in the program;
OR program implements repetition through the combination of four or more repeat and
forever loops, but not all loops are used effectively and correctly with purpose in the
program.
3. Program implements repetition through only one kind of loop, either repeat or forever,
AND not all loops are used effectively and correctly with purpose in the program.
4. Not enough criteria are met in order to award any credit.
BIG PICTURE:
The Who, What, and Why of Programming
Highlights
You will examine and discuss the motivations behind a number of high-profile
individuals in the field of programming.
You will discuss the benefits of programming as a tool and a profession.
Who Programs?
Careers that Program
This video, produced by code.org, includes snippets of people in various careers speaking of
the importance of programming. The accompanying page of quotes contains the thoughts of
many, many others who have indicated a great need for programming.
https://www.youtube.com/embed/nKIu9yen5nc
Choose one person from the video or the quotes page that you find interesting in some way
—perhaps what this person said is interesting, or maybe his/her appearance on a list of
people discussing the importance of computer programming is surprising to you.
Why Program?
Why Program?
Learning to program is increasingly important in a variety of fields. In truth, as physical and
mental operations become automated, virtually every field of human endeavor will be reliant
on software and the people who create it.
Throughout history, the economy (and those activities which it drives—such as politics,
social mobility, and innovation) has been rooted in resources. In the agricultural era, these
resources were typically those that could be produced by farming, ranching, mining, fishing,
etc. This was succeeded by the Industrial Age, in which resources such as coal, oil, steel,
and mass labor were key. Many refer to the current time as the Information Age, where
information, represented digitally, is the major driving force of the economy. In this age,
those companies that are able to collect, manipulate, and analyze information (data) are
typically based on the work of (and many times, even founded by) computer
programmers/software engineers.
Beyond this, some have speculated that we are on the cusp of a new economic model, the
Imagination Age. Whereas programming is integral to function in the Information Age, in the
Imagination Age, creativity is the primary resource, and the ability to code will be
foundational.
Computational Artifacts
Computational Thinking Illustrated
What Is a Program?
What Is a Program?
Carla: Defenestrate. Look it up. Anyway, TV is just a way for THE MAN to
tell you what to do, what to buy, and what to think. Why do you think they
call it “television programming"? They are programming you, Carlos; You.
are. a. robot.
Compare Carla’s connotation of the term “television programming” with these denotations of
the word “program” from dictionary.com:
pro·gram
[proh-gram, -gruhm]
noun
Think About It
How does the ambiguity of the word “program” seemingly support Carla’s paranoid theories
of mind control? What do “mind control” and programming a computer have in common?
Structured Processes
The definitions cited in the previous activity have a few common themes. One such theme is
that a program is a story. In fact, writing a computer program is a lot like writing a story. They
both have beginnings and ends, and the interesting part is how they move from the
beginning to the end.
Remember the days of elementary school, where you’d get an assignment like the
following?
Sequence of Events
These activities always deal with a specified process, where one event logically follows from
another. In this particular example, there is some ambiguity regarding where the process
begins. Let’s assume that the process begins with the seed packet image. Given that
constraint, it only seems logical that our watermelon eater must cut the watermelon before
eating it. So, our sequence of events looks like this:
Because programs (like our story) also need well-defined beginnings and ends, a number of
discrete events, and some indication of the ordering of these events, many programmers
use diagrams like the one above (called flowcharts) to plan their programs.
Flowcharts are visual representations of structured processes—encompassing computer
programs, stories and scripts, business models, and more.
You may find that flowcharting is helpful to you in planning out the sequence of events in
your programs. For many text-based programming languages (like those you will see later in this
course), a flowchart provides a nice visual counterpoint to the program code itself. However,
our programs for this module will be written primarily in Scratch, a visual programming
language (VPL), which is structured similarly to a flowchart:
Scratch Flowchart
UTeach Computer Science, http://uteachcs.org © 2016 The University of Texas at Austin
UNIT TOPIC:
Visual Programming
Welcome to Scratch
You will utilize a graphical editor to read, construct, and execute dynamic programs.
Personal Scratch Pages
Creating Your Page
In order to get started with Scratch, the visual programming language that you will use for
the Unit 2 project, you will need to do the following:
When you have posted your link, spend some time exploring the user interface and
experimenting with programming. There are some excellent beginner tutorials under the Help tab at the top of
the page if you’d like to get started right away. Experiment with the different features
available to you now, but don’t worry! We will explore all of the features necessary to create
a great project in detail throughout the remainder of this unit.
The Cat's Meow
Your Turn
Enough jibber jabber! Let’s do something! For your first program, make a quick song.
Navigate to the Sound tab to access the following blocks:
To create a sound script, simply drag any of the play blocks to the Scripts pane. Click on
the block(s) in the Scripts pane to execute them.
Take some time to figure out how to connect and disconnect blocks. Be sure to practice
removing a piece from the middle of a long script and reconnecting the surrounding pieces.
For example, the play sound blocks allow us to control when sounds are played as well
as how many are played.
play sound [meow] and play sound [meow] until done
The difference, of course, is the phrase “until done.” Until done with what? Clicking on them
individually does not provide enough information to distinguish between the two. So, test
them in multiples:
If you execute this small script, how many meows do you hear?
How about the following revision—how many meows do you hear now?
Now, how would you articulate the difference between the two types of play sound
blocks?
Experiment
Describe the behaviors exhibited by the following combinations, and provide your answers in
a text submission:
1) Two play sound [meow] blocks and then one play sound [meow] until
done :
Answer:
2) Two play sound [meow] until done blocks and then one play sound
[meow] :
Answer:
Answer:
Save Early and Often
Save Early and Often
Few things are more frustrating and disheartening than to see all of your hard work vanish in
an instant due to a sudden power outage, random browser crash, or your neighbor
accidentally kicking out the power cord under your desk. These unexpected events are
always quick and seem to occur at the most inopportune moment—and sooner or later they
happen to everybody, even you.
Fortunately, the impact of this inevitable disaster can be reduced by frequently saving your
work. Just like making frequent use of save points in your favorite game, taking a moment to
quickly record your progress can save you a lot of time and effort in trying to recreate it after
a disaster.
In Scratch, you can save your projects by clicking “Save now” in the “File” menu. While you
do not need to save your work after every little change, you should develop the habit of
regularly clicking on “Save now” whenever you’ve made any significant additions to your
program and would like to test it.
To save (or export) a sprite, right-click on the sprite and select “save to local file.”
To load (or import) a sprite, click on the icon with a folder next to New sprite and select the
sprite that you want to add to your project.
UNIT TOPIC:
Visual Programming
Programming with Blocks
Experimenting with Play
Events
The play sound blocks and short script examples from the last page played in order—
beginning with one block and then the next until there were no more blocks to play.
However, we did see that what we actually heard depended on whether we waited until one
sound was done playing before we began the next.
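One way to picture the difference is to track when each sound starts. The sketch below is a simplified Python model, not Scratch's actual sound engine: it assumes a made-up sound length of 1.0 second, assumes that a plain play sound block takes no time before the script moves on, and only records start times. The names `MEOW_SECONDS` and `start_times` are hypothetical.

```python
MEOW_SECONDS = 1.0  # assumed length of the sound, for illustration only

def start_times(script):
    """Return the time at which each sound in the script starts playing.

    Each entry in script is True for "play sound until done" (the script
    waits for the sound to finish) or False for a plain "play sound"
    (the script moves on to the next block immediately).
    """
    clock = 0.0
    starts = []
    for wait_until_done in script:
        starts.append(clock)
        if wait_until_done:
            clock += MEOW_SECONDS  # the script pauses while the sound plays
    return starts

print(start_times([False, False, False]))  # [0.0, 0.0, 0.0]
print(start_times([True, True, True]))     # [0.0, 1.0, 2.0]
```

In this model, sounds that start at the same moment play on top of one another, while sounds that wait for the previous one play back to back.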
The first block (the when green flag clicked block) is an event block. Event blocks
are triggered by the corresponding described behavior. Scratch provides the green flag and
red stop sign icons in the top right corner of the stage to begin and end the execution of
programs. When the green flag is pressed, any (and all) sequences of blocks in your
program beginning with a when green flag clicked block begin executing.
Try it! Connect the play sound blocks from the previous activity to a when green
flag clicked block, then execute your program by clicking the green flag.
Broadcast
Another type of event block is tied to the broadcast block. This block allows us to have
some control over which blocks are executed at a given time by communicating a message.
When a message is broadcast [...], the corresponding when I receive [...]
blocks are executed.
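The relationship between broadcast and when I receive can be sketched in Python as a table of messages and the scripts waiting for them. This is a rough analogy, not how Scratch is implemented: the helper names, the message names, and the dialogue are all invented for illustration, and the scripts here run one after another rather than side by side.

```python
from collections import defaultdict

handlers = defaultdict(list)  # message name -> scripts waiting for it
transcript = []               # everything the sprites "say", in order

def when_i_receive(message):
    """Register the decorated function as a script for this message."""
    def register(script):
        handlers[message].append(script)
        return script
    return register

def broadcast(message):
    """Run every script that is waiting for this message."""
    for script in handlers[message]:
        script()

@when_i_receive("cat greets")
def duck_replies():
    transcript.append("Duck: Quack! Hello, Cat.")

def when_green_flag_clicked():
    transcript.append("Cat: Hello, Duck!")
    broadcast("cat greets")

when_green_flag_clicked()
broadcast("nobody listens")  # a message with no matching script does nothing
print(transcript)
```

Note that the broadcaster never names the Duck directly; it only sends a message, and any script listening for that exact message responds.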
Create the following scripts in Scratch. Note that the Cat and the Duck sprites have
completely separate script panes. Click on each character to see its script pane. This allows
each sprite to have its own unique behaviors defined.
The purple Looks tab contains the say blocks used in the example.
Similarly, the brown Events tab contains the broadcast and when blocks.
When you are done, press the green flag to start the short play.
A note about style: You will notice that we chose to give meaningful names to the
messages that were broadcast so that it would help us keep track of what we were doing
and what messages we were sending. We recommend that you do this in your projects. As
your projects grow in size, the number of unique names will grow accordingly. Choosing
meaningful names will help you find errors more quickly.
Hints
To choose a new sprite from a library of existing sprites, click on the icon below the stage
that looks like a head.
Different Ways to Broadcast
Play Sound and Wait
How many times will you hear the meow sound when you run the script below?
Answer:
Answer:
Answer:
Answer:
Chain of Broadcasts
How many times will you hear the meow sound when you click on the green flag?
Answer:
Mismatched Broadcasts
How many times will you hear the meow sound when you click on the green flag?
Answer:
Loops
Without trying this in Scratch, answer the following question:
What does the following set of scripts for the Cat do when you click on the when I
receive [Turn1] block?
Answer:
UNIT TOPIC:
Visual Programming
Remixing Scratch Projects
Remixing Scratch Projects
Remixing
Scratch is available free of charge and runs in your web browser. Additionally, the Scratch
site contains forums, tutorial videos, and a variety of helpful resources. You should spend
some time exploring the site and these resources. For this activity, you will explore other
users’ Scratch programs and remix one to make it your own.
A Scratch account allows you to access many useful features, including the ability to share,
discuss, and modify other users’ projects in a collaborative manner. Using the Explore tab in
the main menu bar, you can search for and select from a wide variety of projects that other
users have created. Not only can you click “See inside” to get a look at how the project
works and how it was created, but you can also choose to “Remix” the project to create your
own copy of the project and begin modifying it in whatever creative ways you can imagine.
The Scratch developers feel that sharing, reading, and writing code are all important aspects
in the learning process. Therefore, any Scratch program shared on the site is licensed under
the Creative Commons Attribution-ShareAlike license and may be copied by any other user and used
as the basis of a new, derivative work.
In this activity, you will browse projects on the Scratch site, select one that interests you, and
remix it in some way.
3. Find a project that interests you. When selecting a project, try to imagine how you
would like to modify it.
4. Click the button labeled “See inside.”
6. You now have a project copy of your own to freely modify. Any changes made by you
will only affect your copy! Some projects have many, many versions that have been
modified from an original. In order to see how a project has evolved and diversified
over time, click the “Remix Tree” button on the project’s page.
7. In a shared space provided to you by your teacher, post links to both the original
project and your remix, as well as a description of the changes you made.
Explore your classmates’ remixes!
CODING SKILLS:
Choreography Notation
Highlights
Let's Dance!
Motion
In Scratch, sprite behaviors are a form of program output. Up until now, we’ve only “heard”
output in the form of play sound blocks. However, sprites can do much more than sing
(or meow).
The programming blocks located in the Motion tab will affect the sprite’s state—its location
(i.e., where it is) and/or orientation (i.e., which way it’s facing). Two of the most commonly
used of these blocks are the following:
Moving Turning
1. The blocks instruct the sprite to move relative to its current state, that is, where it is
before the block is executed. In other words, you can click on the sprite and drag it somewhere, and
move [10] steps will still function correctly and move the sprite 10 steps from its
present state. The same is true for turn blocks—no matter which direction the sprite
is currently facing, turn [15] degrees will still function correctly.
2. The number of steps or degrees is customizable. The “holes” in the programming
blocks, where a selection can be made, are called parameters. Scratch will allow
negative parameters in these motion blocks.
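The idea that motion blocks act relative to the sprite's current state can be sketched with a small Python class. This is an illustrative model only: the class and method names are hypothetical, the sketch follows Scratch's convention that direction 0 points up and 90 points right, and it does not wrap the direction value into a fixed range the way Scratch does.

```python
import math

class Sprite:
    """A tiny stand-in for a Scratch sprite's motion state."""

    def __init__(self, x=0.0, y=0.0, direction=90.0):
        self.x = x
        self.y = y
        self.direction = direction  # 0 is up, 90 is right

    def move(self, steps):
        """Move relative to the current position, along the current direction."""
        radians = math.radians(self.direction)
        self.x += steps * math.sin(radians)
        self.y += steps * math.cos(radians)

    def turn_clockwise(self, degrees):
        """Turn relative to the current direction."""
        self.direction += degrees

cat = Sprite()
cat.move(10)                # 10 steps from the starting point
cat.x, cat.y = 50.0, 20.0   # "drag" the sprite somewhere else
cat.move(10)                # still 10 steps, measured from the new spot
cat.turn_clockwise(15)      # 15 degrees from whatever direction it faced
print(round(cat.x, 6), round(cat.y, 6), cat.direction)
```

The values passed to `move` and `turn_clockwise` play the role of the parameters in the blocks, and nothing in the code prevents them from being negative.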
Implement it and test it out. Were you right? What effect does turning a negative number of
degrees have on the sprite?
Motion Combinations
Motion blocks can be combined much like Sound blocks. Hypothesize and test what the
following sequence of blocks will do:
Extension
Try to figure out the relationship between the blocks on the left and the buttons on the right.
These buttons can be helpful to get the characters to face each other.
Blocks Buttons
Sound and motion need not be separate. Sequences of blocks that combine the two (and
other types of behaviors) can be created by switching among the tabs and selecting the
blocks you wish to combine. The example dance sequence uses four types of blocks:
The brown Events block handles the green flag event, so the program knows when to
start and what to do once it has started.
The magenta Sound blocks output audio drum beats.
The blue Motion blocks change the sprite’s location and orientation.
The purple Looks blocks change the sprite’s appearance.
Each sprite may have multiple ‘costumes’ associated with it. The default ‘Scratch
the Cat’ sprite has two—you can view them by clicking on the ‘Costumes’ tab,
then you can switch to a particular costume by name or simply choose next
costume to alternate between them.
Instructions
Your job is to remix some starter Scratch code to make it a dance all your own! In the Unit 1
activity, Flow Patterns, you explored the basic components that can construct any
algorithm: sequence, selection, and iteration. The dance that you make will be an example
of an algorithm built using only sequencing.
4. Share your Scratch dance with a classmate. Describe the modifications you made, and
give your dance a name.
Choreography
Dancing Like a Programmer
Choreographers are programmers, too!
1. Informal Footprints
2. A formal notation: Labanotation
Rudolf Laban Labanotation
Instructions
Your classmates will try to replicate your dance based on your notation! Good luck.
Animated Movie
Activity
In this activity, you will make an animated movie using Scratch. The topic of the short film
may be anything appropriate, but be wise about how to allot your time (e.g., don’t focus too
much on any one aspect while neglecting others).
Be creative! Create characters, tell a story, and experiment with moving sprites around the
stage. Program your own movie.
When you have finished, post a link to your Animated Movie in the shared space
provided to you by your teacher.
Rubric
Criteria Points
The program contains two motion blocks. 2 pts
The program contains two different sprites. 2 pts
The program includes four broadcasts between sprites. 4 pts
The program produces two sounds through play blocks. 2 pts
TOTAL 10 pts
UNIT TOPIC:
Program State
User Input
You will write programs that incorporate dynamic, user-driven, keyboard controls and
input.
Variables
You will examine how the dynamic state of an object or program can be stored and
changed using variables.
You will analyze the role of clear, descriptive names for objects, behaviors, variables,
and other identifiers in maintaining the readability of code.
Stored State
The State-of-Being State
State can be such a strange word. In computer-speak, it corresponds roughly to everything
about the computer at a given moment—everything that is in memory, what programs are
executing and what instruction they are currently processing, signals to and from input and
output devices, and so on.
In this course, “state” encapsulates attributes—a description of what the computer is doing
at any given moment.
Input
Storage
Processing (We instructed the computer to make calculations, such as adjusting
location and orientation, or counting beats per minute for synthetic drum beats.)
Output (Our focus has been primarily centered on output—observable behaviors
generated by the computer such as animation, sound, and text.)
To be fair, our programs have used input and storage to some degree. For example, clicking
the green flag in Scratch is a form of input. We interact with this control to signal the
computer to begin the program.
Our use of storage to this point is more abstract. Consider what the computer needs to
“remember” in order to execute the program.
First, it needs to “remember” the program—what are its instructions? Second, there are
attributes about the environment it needs to “remember” in order to function properly. These
attributes are called state. An example of state in our previous programs is the location of
Scratch the Cat. In order for the program to move [10] steps , it needs to “remember”
where Scratch the Cat is actually located. However, these instances of input and storage are
not ones we explicitly control. During our next few activities, we will focus on customized
input and storage.
Position
The sprite occupies a point (x,y) on the stage corresponding to the x- and y-axes below.
How is a sprite’s state related to its position on a coordinate grid? How are position and state
different?
User Input and Interaction
User Input
User input for programs can come in many forms. Most often, a user interacts with a
program by clicking or moving a mouse, or by typing on a keyboard. In this way, a user
can direct a program's behavior while it runs, so the data does not have to be fixed in
advance by the programmer. In the following activity, you will program Scratch the Cat to
react to user input.
In order to do that, we will use the when event blocks located in the “Events” tab.
The event trigger can be changed using the drop down menu. Create 4 separate event
blocks corresponding to each of the arrow keys.
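The four event blocks can be thought of as a lookup table from keys to actions. The Python sketch below is only an analogy for how the scripts behave: the names `KEY_SCRIPTS` and `when_key_pressed` are hypothetical, and the step size of 10 is an assumption chosen for illustration.

```python
# Each arrow key gets its own "when key pressed" script, stored here as a
# change in (x, y). The step size of 10 is an assumption for illustration.
KEY_SCRIPTS = {
    "up arrow": (0, 10),
    "down arrow": (0, -10),
    "left arrow": (-10, 0),
    "right arrow": (10, 0),
}

def when_key_pressed(position, key):
    """Return the sprite's new (x, y) after the given key is pressed."""
    dx, dy = KEY_SCRIPTS.get(key, (0, 0))  # keys without a script do nothing
    x, y = position
    return (x + dx, y + dy)

position = (0, 0)
for key in ["up arrow", "up arrow", "right arrow", "space"]:
    position = when_key_pressed(position, key)
print(position)  # (10, 20)
```

The program does not know ahead of time which keys will be pressed or in what order; it simply responds to each event as it arrives.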
Instructions
Your job is to create a program with a sprite that moves in response to user input, including arrow
keys and at least two other inputs.
Create an original program in Scratch that meets the following minimum requirements:
When you are satisfied with your work, submit a link to your program or the program itself.
Show Me Your State
Show Me Your State
We’ve explicitly dealt with input now. In an example program chunk of the previous activity,
we used the when ↑ key pressed block to detect our input, and then moved the sprite
upward. But what does upward mean?
We know what the output (the observed behavior) of upward is: when the computer
executes its commands, it redraws the screen with the sprite 10 pixels above his previous
location. How does the program know where to draw the updated Scratch and which
direction to face him?
The computer needs to “remember” the sprite’s original location and orientation, its state,
and update them with new ones—by accessing and modifying storage. Examine the
following instructions; locate the attributes (of the Scratch sprite) we are changing:
We are changing the sprite’s direction attribute as indicated by the point in direction
[0] block. However, we are also changing his location with the move [10] steps
block. “10 steps” is not an attribute of a sprite—what attributes do you hypothesize indicate
location?
Scratch uses the familiar x-y coordinate plane to indicate location. This means that each
sprite has three attributes associated with it for location and orientation: direction , x
position , and y position . You can view these attributes and their associated values
by checking the following boxes in the Motion tab:
Try It Out!
Load your Scratch program from User Input and Interaction.
Experiment with location and orientation using the direction , x position , and y
position blocks. Move your sprite around and observe how its state changes over time.
For a more detailed view of a sprite’s state, you can right-click the sprite in the sprite picker,
and select “info.”
Text Input
Text Input
Let’s use both the changeable and placeholder qualities of variables to make a program in
which Scratch interacts with us. We’ve already used text output in Scratch with the say
[...] blocks located in the Looks tab. However, we can also ask for text input.
Many of the input-oriented blocks are located in the Sensing tab.
Why do you think that the Sensing tab contains blocks for input? How do you receive input
from the world around you?
In the following activity, we are going to work with the following blocks:
As you can see, the variable answer is changeable. It has a different value at different
times, depending on its context—when it is being viewed and how it was updated.
What about the placeholder quality of variables? How can we leverage that in our programs?
We can use answer as a placeholder for text input in our program. This way, the program
will use whatever is entered for answer without knowing ahead of time what that is. This is
better illustrated with an example. Create the following program. Remember that blocks are
color-coded according to their tabs.
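As a rough text-based analogy, the ask block and the answer variable can be modeled in Python as follows. This is a hedged sketch rather than Scratch's real behavior in every detail: the `Stage` class, its methods, the second question, and the typed responses are all invented for illustration, and a prepared list stands in for the keyboard.

```python
class Stage:
    """Minimal model of Scratch's ask block and its built-in answer variable."""

    def __init__(self, typed_responses):
        self._typed = iter(typed_responses)  # stands in for the keyboard
        self.questions = []                  # every question asked so far
        self.answer = ""                     # empty until the first ask

    def ask_and_wait(self, question):
        """Ask a question, then store whatever the user types in answer."""
        self.questions.append(question)
        self.answer = next(self._typed)

    def say(self, text):
        """Return the text the sprite would display."""
        return text

stage = Stage(["Carla", "blue"])
stage.ask_and_wait("What's your name?")
greeting = stage.say("Hello, " + stage.answer)
stage.ask_and_wait("What's your favorite color?")
# answer now holds the newest response; the name is gone unless it was
# copied into another variable first.
print(greeting)      # Hello, Carla
print(stage.answer)  # blue
```

The greeting is built without knowing in advance what the user will type, which is the placeholder quality, and the same variable holds a different value after the second question, which is the changeable quality.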
Try It Out!
Load your Scratch program from User Input and Interaction, Show Me Your State, and make
the following edits:
1. Drag the ask [What's your name?] and wait block to the Scripts pane. Click
on it. Scratch will ask your name and wait for you to respond.
2. If you’d like to see your answer, check the box next to answer as is done in the
image at the top of the page. The variable answer contains whatever you type when
Scratch asks you a question using the ask [...] block.
3. Add another ask [...] block immediately after the “What’s your name?” block.
4. Change the question text, and re-execute the program.
5. Now execute the program. What happens and why?
6. Drag the answer block to your script again so that it reads
Does this do what you would expect it to? Why or why not? What would you need to do
to make it work properly?
7. Personalize it in at least three other ways.
8. Provide documentation for your program (describe what it does) as the Instructions. Be
sure to describe how your program is original.
When you are satisfied with your work, submit a link to your program or the program itself.
UNIT TOPIC:
Program State
Variables
You will examine how the dynamic state of an object or program can be stored and
changed using variables.
You will analyze the role of clear, descriptive names for objects, behaviors, variables,
and other identifiers in maintaining the readability of code.
You will integrate randomness into a program through the use of the Random Number
Generator.
Variables
Changeable Placeholders
The attributes we use to describe Scratch the Cat’s location and orientation are called
variables.
Wait a minute! We’ve seen that word before in countless contexts! Let’s recall two forms of
variables:
Variables in Algebra I
Remember Algebra I and its most famous denizen, x? That li’l guy is everywhere from
2x + 3 = 15
to
Its primary purpose is to act as a placeholder for a value that is unknown. The point of
most Algebra I problems is to figure out just exactly what x is a placeholder for—a kind of a
mathematical hide-and-seek game.
This is done by establishing controls and variables. If you want to make sure that any
difference you see in plant growth is due to the type of water you give them, you have to
make sure that they receive equal treatment in every other possible way—the same amount
of sunlight, the same type of soil, etc. These constant factors are controls, whereas the
water is the variable.
In this sense, a variable is a quantity (or quality) that changes. In other words, what it
represents in one scenario may be completely different in another.
Programming Variables
Variables in computer programs are similar to both of the types of variables previously
discussed—they are both placeholders and changeable quantities. Let’s look at examples
for each characteristic:
Placeholder
Scratch the Cat is minding his business when suddenly a voice from above tells him turn
← [15] degrees ! We know from examining his attributes that means subtract 15 degrees
from his direction . Or, perhaps the voice tells him move [10] steps . This one is
more complicated because it involves modifying both x position and y position and
it depends on which direction he is facing. However, to some degree, it doesn’t matter,
because the variables are an abstraction. Essentially, turn ← [15] degrees means
subtract 15 degrees from direction —whatever it is. The variable direction acts as
a placeholder.
Changeable Quantity
Consider direction in the example above. As you rotate Scratch the Cat, the
direction variable changes. Note how this is different from variables in algebra
equations. In 2x + 3 = 15, x is always 6. There is one correct answer for “what is x?”
However, unlike algebra equations, computer programs are dynamic things; they change
over time. Scratch the Cat moves, turns, and meows. All of this is possible because we are
able to change the variables that define him.
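The two characteristics above can be sketched in a text language. This is only a rough Python analogy, not how Scratch works internally; the function name turn_left and the starting value are assumptions made for illustration.

```python
# Hypothetical stand-in for a sprite's built-in direction attribute.
# Assumed starting value: in Scratch, 90 means "pointing right".
direction = 90

def turn_left(degrees):
    """Mimic the turn-left block: subtract from direction, whatever it is."""
    global direction
    direction = direction - degrees

turn_left(15)
after_one_turn = direction    # the placeholder now holds a new value
turn_left(15)
after_two_turns = direction   # and it changes again over time
```

The function never needs to know the current value of direction; it simply subtracts from whatever is stored there.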
Try It Out!
Load the Scratch program from User Input and Interaction (and Show Me Your State), and
edit it as described below.
Check the box next to direction in the Motion tab. Now, an indicator will appear on the
screen with the label direction and a value. Continually click the turn ← [15]
degrees block and track how the direction changes. In the information pane for the
sprite, you can freely rotate it. Notice how direction is updated. Clicking on turn ←
[15] degrees once more will still work as intended, because it merely subtracts 15
degrees from direction —whatever it happens to be.
Names Are Important
Add a second sprite that is controlled by a different set of keys—use W for up, A for left,
S for down, and D for right. At this point, you should have two sprites that can move
independently of one another. Remember that sprites can be imported using the folder
button below.
We want to be able to refer to our sprites with intuitive names. By default, your sprites will
probably be called something like Sprite1 and Sprite2 . Rename them by clicking on
the sprite and typing into the text box, as shown below:
Do the same for both of your sprites, and you should have something that resembles:
Names are important! Naming our sprites Cat and Mouse will make programming with
them much easier. Just as we used answer as a placeholder to refer to what the user
types in, these names that we’ve given our sprite characters can be used as placeholders to
refer to them. Giving them meaningful names makes keeping track of which is which much
easier.
Try It Out!
In your first sprite’s script pane, drag the point towards [...] block into the script
pane. In the drop down, you’ll notice that one of the options is your second sprite (e.g.,
Mouse). If we had kept the name Sprite2 , that is what would have been listed. Select
your second sprite so that the block now resembles:
Click the block. Move your second sprite by clicking and dragging it somewhere else. Click
the block. Move your second sprite. Repeat.
Discussion Questions
1. Describe how the name Mouse is a placeholder for the sprite associated with it.
2. How would the program be different if we named the sprite Mus musculus instead?
3. Imagine the following analogous scenario:
You are asked to face your teacher during this class. Stop and imagine what
would happen if you followed this instruction. Who would you be facing? Which
direction?
Now, imagine that you go to your next class, and you receive the same
instruction. Who would you be facing? Which direction?
Likely, the contexts of the two situations are different, so you will be facing two different directions
and two different people. How is this representative of the variable nature of the name
“teacher”? Also, does “face your teacher” have the same meaning for all students?
Game of Tag
Tag!
Let’s make use of these names we’ve given our sprites to make them interact with each
other. We want one of the characters to react if the two characters touch. In the last step, we
showed an example with Cat and Mouse. We can add the following script to the Mouse sprite to
say “OH NOs!!!” when it touches the Cat.
Note: Once we click the green flag, the script
above will always be running: notice that it always has a yellow outline.
Compare this to the script below that would only run once when you click the green flag. The
functionality above is often called an infinite loop, and can be very helpful when we want
something to continue running forever.
Try It Out!
Load your Scratch program from Text Input and Names Are Important.
Replicate the two commands illustrated in the screenshots above using your own sprites.
Remember that blocks are color coded according to their tabs. Experiment with these
and other blocks to make your program unique.
Randomness
Random Number Generator
We can generate a random number in Scratch between any two numbers (inclusive) using
the pick random block:
This will report an integer 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10—each with 1/10 probability. This type
of block (with the round edges) is a reporter block—it reports a value. We can use this block
inside other blocks that take a value:
In other words, a reporter block can be used in any case that a manually entered value or
variable can. You can even use pick random blocks as the parameters to other pick
random blocks:
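For readers who want a text-based comparison, the same ideas can be sketched in Python. The helper name pick_random is made up for this example, and it only covers whole-number inputs.

```python
import random

def pick_random(low, high):
    """Rough analogue of the pick random block: both ends are inclusive."""
    return random.randint(low, high)

roll = pick_random(1, 10)                     # one of 1, 2, ..., 10
steps = 10 * pick_random(1, 10)               # a reported value used inside another expression
nested = pick_random(1, pick_random(1, 10))   # pick random used as a parameter to pick random
```

Each call reports a value, so it can appear anywhere a typed-in number could.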
Instructions
Your job is to make a sprite move around the screen randomly. Load your Scratch program
from Game of Tag, and adjust it as follows:
1. Make one character move around randomly, but still stay within the confines of the
screen. You can do this with the if on edge, bounce block.
2. Use pick random to change the second sprite’s appearance. You can vary many
different factors in the “Looks” tab, including size, costume, as well as many colorizing
and distorting effects.
3. Personalize your program in at least three ways.
4. Make your program usable, efficient, and effective.
5. Provide documentation (describe what it does) in the Instructions. Be sure to describe
how your program is original.
Custom Variables
Customization
If Scratch were limited to the three variable attributes of x position , y position , and
direction that the interface provides, manipulating sprites on the screen would become
boring very quickly. Anyone who has played a role-playing game (RPG) is familiar with a
variety of attributes and skills typically available to characters. Because Scratch allows the
creation of custom variables, we can create new attributes for our characters.
Your current Game of Tag program dictates how fast each character moves (e.g., 10 steps
per movement). We could instead vary each sprite’s speed according to its “skill level.” In
order to do that, let’s create a new variable called speed :
In the Variables tab, click Make a Variable , and enter speed as the name. We will
want each sprite in our program to have its own speed, so make sure that the button For
this sprite only is selected. Create one speed variable for each character. What
would happen if For all sprites were selected? Experiment!
The Variables tab supplies two blocks to change the value of speed :
Now that you know about speed variables, make use of them for your Game of Tag
program. For each move [10] steps block, replace the numerical value 10 with the
variable speed by dragging it from the Variables tab into the slot.
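As a loose analogy in Python, a variable made "For this sprite only" behaves like a value stored on each individual object. The Sprite class and the move_right method below are hypothetical and exist only for this sketch.

```python
class Sprite:
    """Hypothetical sprite with its own position and its own speed variable."""

    def __init__(self, name, speed):
        self.name = name
        self.x = 0
        self.speed = speed   # like a variable made "For this sprite only"

    def move_right(self):
        # Similar in spirit to "move (speed) steps" while facing right.
        self.x += self.speed

cat = Sprite("Cat", speed=10)
mouse = Sprite("Mouse", speed=4)
cat.move_right()
mouse.move_right()
```

Because the movement uses speed instead of a fixed 10, changing one sprite's speed changes how far that sprite moves without touching the other.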
For example, you should have something resembling the following:
Instructions
Add the following functionality to your Game of Tag program:
1. When the green flag is clicked, have each sprite ask for an initial speed. Set the
speed of each according to the answer given.
2. When the second sprite is caught (e.g., when the cat touches the mouse, and the
mouse says “OH NOs!!!”), increase the second sprite’s speed by 5 .
3. Pretend that the first sprite is a vertical kind of guy—whenever the first sprite moves up
or down, increase his speed by 1 . Whenever he moves right or left, decrease his
speed by 1 .
4. Make your speed variables visible on the screen so you can test your program to
ensure it works correctly.
5. Personalize your program in at least three other ways.
6. Provide documentation for your program (describe what it does) as the Instructions. Be
sure to describe how your program is original.
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
Rubric
Criteria Points
Characters move properly 2 pts
Mouse says “OH NOs!!!” when characters touch 2 pts
Speed is determined by user input 2 pts
Mouse speed is conditioned on collisions with Cat 2 pts
Cat’s speed is conditioned on directional movement 2 pts
Displays speeds on the screen 1 pt
Documentation, usability, and personalization 1 pt
TOTAL 12 pts
Drawing Commands
Experiment with Drawing Commands
Experiment with the pieces shown below from the Motion and Pen tabs. Figure out what
each one does, then use the pieces to draw a square. To make your character smaller so
that it is easier to see where it is drawing, alter the size attribute using blocks located in the
Looks tab. Make sure that you select the option for “rotation style” so that the character
freely rotates (as seen below).
Blocks Stage
UTeach Computer Science—http://uteachcs.org © 2016 The University of Texas at Austin
Reviewing Variables
Change vs. Set
The previous assignment was aimed at getting you comfortable with using variables to store
values, giving you power to do things you couldn’t without them. Variables are one of the
central concepts in computer science and one of the most powerful tools in a programmer’s
toolbox. They will pop up steadily throughout the rest of the course.
These are two blocks that we will be seeing a lot in Scratch programs. Make sure that you
are clear on the differences between them.
You can use these two blocks to accomplish the same thing, but it probably makes more
sense to use one over the other depending on what you are trying to do.
If you are trying to set the value relative to what it already is (such as adding $5 to a
bank balance), you will probably be better off using a change block.
If you are trying to set it to a totally new, unrelated value (such as resetting a score in a
video game), then you will probably want to use a set block.
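The same distinction shows up in text-based languages. Here is a small Python sketch, with made-up variable names and values, of a relative update next to an absolute one.

```python
balance = 20
score = 37

balance += 5   # like "change balance by 5": the new value depends on the old one
score = 0      # like "set score to 0": the old value is irrelevant

# A relative change can also be written as a "set", but it takes more work:
balance_alt = 20
balance_alt = balance_alt + 5
```

Both styles reach the same result for the bank balance, which is why the choice is mostly about which one states your intent more clearly.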
Consider this:
change and set blocks are analogous to turn [...] degrees and point in
direction [...] . How are they similar? Be prepared to discuss your thoughts in class
tomorrow.
UNIT TOPIC:
Selection Statements
“if...else” Statements
Quiz Show
You will examine the use of the Boolean operators “AND,” “OR,” and “NOT” in
constructing complex conditional statements.
Decisions
Decisions, Decisions...
It may not seem like a free-willed thinking agent, but our previous program made a decision
—sort of. It chose between two possible actions, either to increase Mouse’s speed or to not
increase Mouse’s speed. It made this decision many, many times depending on whether Cat
and Mouse were touching.
In the Unit 1 activity, Flow Patterns, you explored the basic components that can construct
any algorithm: sequence, selection, and iteration. This type of “decision,” made by using the
conditional if block, is an example of the algorithmic component known as selection. Much
like the word “if” works in English, the conditional block only executes the blocks inside of it if
something—some condition—occurs.
Contrast these with the instruction blocks we’ve created up to this point, which have been
more like imperative statements:
“Give me $10.”
“Build the stadium.”
“Travel to Emerald City.”
"I went into a clothes store and a lady came up to me and said ‘if
you need anything, I’m Jill.’ I’ve never met anyone with a
conditional identity before.”— Demetri Martin
If Blocks
The most basic type of conditional block is the if block. Sometimes, computer scientists refer
to these types of instructions as branching. The following diagrams provide a visual
explanation:
Consider a basic computer program much like the ones we’ve written up until now. The
computer executes one instruction, then moves on to the next—the one immediately
following it:
Instructions Direction of Flow
The arrow indicates the control flow, or how the computer moves from instruction to
instruction over time. In other words, it starts at the top (the beginning) and moves
downward (chaining instruction after instruction) from there, until it reaches the end. Note
that the if block is contained in the Control tab, which holds the blocks that explicitly
deal with flow control.
The if block, however, represents a choice. If some condition occurs, then the control flow
branches off down a different path of instructions. If the condition does not occur, then the
flow occurs as normal:
In this example, if the sprite is touching [mud] , then its speed slows down by 5.
Regardless, it moves speed number of steps afterward. The condition of touching
[mud] just has an effect on the speed of the movement.
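The mud example can be written as a short Python sketch. The function name and its parameters are assumptions for illustration; the point is which line runs conditionally and which runs every time.

```python
def steps_to_move(speed, touching_mud):
    """Return the adjusted speed, which is also how far the sprite moves."""
    if touching_mud:
        speed = speed - 5   # runs only when the condition is met
    return speed            # runs either way, like "move (speed) steps"
```

For example, steps_to_move(12, True) takes the branch and gives 7, while steps_to_move(12, False) skips it and gives 12.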
Scratch allows for easy visual indication of which blocks are affected by an if block; any
block enclosed in the yellow frame of blocks headed by the if is executed only if the
condition is met. Otherwise, they are simply skipped over.
Binary Conditions
Consider the “condition” in an if block. What kinds of conditions can be used in a
program? Much like other aspects of computer science, it boils down to a bit (a dichotomy, a
yes/no question, on or off)—a condition has two possible outcomes, either it is met or it is
not met. In the Representation module, we will spend a great deal more time examining the
power inherent in dichotomies.
Scratch “shape-codes” these binary values (also called Boolean values) as hexagons.
There are many valid hexagon-shaped blocks available to fill the hexagon-shaped hole in an
if block. Examine the following:
How Many Days...?
Instructions
How many days are there in a given month? We have created a Scratch script that
calculates the number of days in the month corresponding to the number that a user types
in. Discuss with a partner how this script works. There are a lot of new things in this program
that you may not have seen before, so be sure to explore.
Note: This code is NOT written with the best style or efficiency. We hope your reaction to this
is “YUCK!”
We have made this code explicitly complex to challenge your understanding of basic code
and blocks and their combinations. Feel free to use pen and paper to take notes or work out
the code algorithmically.
Switching and Nesting
Switching and Nesting... or ELSE!
Our conditional blocks up to now have been analogous to light
switches—when turned “on,” something occurs, but when turned “off,” that
something does not occur. This makes sense given the nature of our
conditions—they are binary—either on or off.
However, this is not the only kind of binary switch we can model in a program. Consider a
railroad switch as a type of binary, either/or switch.
With this type of switch, rather than choosing to do something or not, you are selecting
between two alternate outcomes. One (and only one) of the outcomes will occur! In the
railroad switch example, the train, upon approaching the junction, will take either the left or
the right track, depending on how the switch is oriented.
Scratch allows for this type of conditional with the if / else block: Check the condition.
Is it true or false?
If it is true, then do what is enclosed in the if portion of the block. This is exactly the same
as the standard if block. If it is false, then do what is enclosed in the else portion instead.
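The railroad switch can be sketched in Python as follows. The function name and the track labels are invented for this example.

```python
def choose_track(switch_is_left):
    """Exactly one of the two outcomes happens, never both and never neither."""
    if switch_is_left:
        return "left track"
    else:
        return "right track"
```

Unlike a plain if, there is no way for the train to pass the junction without taking one of the two tracks.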
“If you can’t fly then run, if you can’t run then walk, if you can’t
walk then crawl, but whatever you do you have to keep moving
forward.” —Martin Luther King, Jr.
The Russian nesting dolls are an iconic example of Russian culture. However, they also
illustrate an important construct in computer programming—nested blocks.
Motivation
Our newly learned if / else block is powerful; it allows a program to follow one of two
paths upon reaching a condition. In other words, it allows the program to make an either/or
decision.
Example:
However, this is not how humans decide most things. Generally, there are more than two
possible outcomes to a decision.
Example:
It would be nice if our programs could choose from among multiple different paths rather
than just two.
There are four possible answers, so the program handles each to produce four different
outputs. However, notice that the if / else blocks are nested, much like nesting dolls,
where one if / else block is contained entirely within another. This is different than
saying, “If it’s water, do this. If it’s soda, do that. If it’s coffee, do x, and if it’s juice, do y.”
Like the nesting dolls, the blocks within an if or else portion of a block are not even
accessed/executed unless the program “uncovers” them by taking the appropriate branch.
So, in our example program, the answer is not even
checked against soda if the answer was already matched to
water. At the point at which the program finds a match for
answer , it is effectively done searching for one.
If [DONE] is ever reached, then the program is effectively done with executing the nested
if / else blocks.
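A Python sketch of the nesting idea is shown here. The replies are made up, and treating every leftover answer as juice is an assumption of this sketch, not something the original script is known to do. The counter records how many comparisons actually ran.

```python
def respond(answer):
    """Return (reply, comparisons) for a drink choice using nested if/else."""
    comparisons = 1
    if answer == "water":
        reply = "Staying hydrated!"
    else:
        comparisons += 1
        if answer == "soda":
            reply = "Fizzy!"
        else:
            comparisons += 1
            if answer == "coffee":
                reply = "Wide awake!"
            else:
                # Fourth outcome: anything left over is treated as juice here.
                reply = "Sweet!"
    return reply, comparisons
```

An answer of water is matched after a single comparison, so the inner checks against soda and coffee are never reached.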
Discussion Questions:
The blocks tying both questions together:
no
yes, water
yes, soda
yes, coffee
yes, juice
UNIT TOPIC:
Selection Statements
Quiz Show
You will examine the use of the Boolean operators “AND,” “OR,” and “NOT” in
constructing complex conditional statements.
Quiz Show
Instructions
In this assignment, you will create a Scratch program that simulates a quiz-style game show.
Your program should execute at least the following actions:
1. The host sprite should “walk” onto the stage, and introduce him/herself.
2. The host then asks a series of three questions (of your choosing). Each question
should have at least one correct answer.
3. A user’s score should be kept. HINT: Create a variable score and increase it by one
any time a correct answer is given.
4. The host should give the score and conclude the “show” when the three questions
have been asked/answered.
5. Make your game unique, usable, and enjoyable.
6. Personalize it in at least three other ways.
7. Provide documentation for your program (describe what it does) as the Instructions. Be
sure to describe how your program is original.
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
Rubric
Criteria Points
Host enters stage (movement) and introduces him/herself. 1 pt
Three questions asked. 3 pts
Score is correctly kept for correct answers. 3 pts
Answers are indicated as RIGHT or WRONG by host. 1 pt
Host concludes show with score. 1 pt
Documentation, usability, and personalization 1 pt
TOTAL 10 pts
UNIT PROJECT:
Scratch Program
Highlights
You will collaborate in pairs to design, implement, and debug a novel, aesthetically
pleasing, and intuitive program using the Scratch programming environment.
You will identify a specific purpose that your program will serve (e.g., entertainment,
problem solving, education, artistic expression, etc.).
You will integrate interactive and multimedia elements into your program.
You will integrate common programming constructs, such as variables and selection
statements into your program.
You will test, debug, and correct your program.
You will use appropriate terminology while writing documentation detailing the full use
of your program and its features.
You will explain your design and implementation choices while demonstrating and
sharing your finished programs with your peers.
You will provide a written analysis of at least one other design team’s program,
identifying its strengths and weaknesses and offering suggestions for improvement.
Scratch Project: Documentation
Correctness—In other words, the program should do what it’s supposed to do. Smaller
components of a program must themselves be correct to form an overall correct
program. Imagine a calculator program that produced output indicating that 2 + 2 =
3 . The usefulness of this calculator would certainly be suspect. We may consider
looking at how the addition procedure is executing. The style of the program can also
affect the determination of program correctness.
Efficiency—Now imagine that the calculator were fixed so that it always produced
correct output, but took a very long time to do it. Entering 2 + 2 = now returns the
proper answer 4 , but only after spending 12 minutes processing its input. The
usefulness of this calculator would certainly be suspect.
Clarity—Imagine that the calculator has been altered to be both correct and efficient.
What other quality could possibly make it unusable? Imagine a calculator in which, to
add 2 and 2 , some key sequence other than 2 + 2 must be entered. Without
knowing the proper key sequence, or more generally how the calculator functions, you
would certainly have doubts about the usefulness of this calculator. Note: The
calculator linked above is actually very useful, once you’ve mastered the input mode.
Try the examples linked at the top of the page. How does it work?
In this assignment, you will concentrate on the third point—clarity. Write up a document
detailing how your application is used. You should concentrate on the following (as
applicable):
TOPIC:
Repetition
Repeat
You will make use of loop constructs to create repetition without the need for
duplicating code.
Repeat Until
You will make use of loop constructs to create repetition without the need for
duplicating code.
You will explore the differences between counter- and conditional-based loops.
You will integrate sequencing, selection, and iteration into a single mini-project artifact.
UNIT TOPIC:
Repetition
Repeat
You will make use of loop constructs to create repetition without the need for
duplicating code.
Repeat
Experiment with Repeat
In Game of Tag, you experimented briefly with the forever block. This “infinite loop”
construct is often useful, but a more common scenario is repeating an action for a specific
number of times (i.e., when the number is less than ∞!). Scratch makes this a simple task
with Repeat blocks. In the Unit 1 activity, Flow Patterns, you explored the basic
components that can construct any algorithm: sequence, selection, and iteration. Repeat is
an example of the iteration component of algorithmic design. We can use a Repeat block to
make drawing shapes a lot easier! You can see below a script to draw a square.
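For comparison, here is a Python sketch of the same repeat idea that tracks where the pen would be instead of drawing. It assumes the square script repeats "move, then turn 90 degrees" four times, and it uses the usual math convention for angles (0 along the x-axis, turning counterclockwise), which differs from Scratch's direction values.

```python
import math

def draw_square(side):
    """Trace a square with a repeat-style loop; returns the corners visited."""
    x, y, heading = 0.0, 0.0, 0.0    # heading in degrees, 0 = along the x-axis
    corners = []
    for _ in range(4):                # repeat (4)
        x += side * math.cos(math.radians(heading))   # move (side) steps
        y += side * math.sin(math.radians(heading))
        heading = (heading + 90) % 360                # turn 90 degrees
        corners.append((round(x), round(y)))
    return corners
```

The loop body is written once but runs four times, and the last corner lands back on the starting point.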
Equilateral Triangle
Pentagon
Hexagon
Octagon
Circle
Five-sided Star
What patterns can you identify between these shapes and the commands?
Repeat After Me
Hey Jude
Take a moment to refer to the flowchart outlining the lyrics of Hey Jude by The Beatles.
A less concise, more precise chart of the lyrics might list each and every
time “Na” is sung. An even more precise representation would include the
timing, pitch, and tempo, like a piece of sheet music might illustrate.
However, even the precise notation that musicians rely on allows for
shortcuts to indicate repetition. The colon symbol below indicates a phrase
that is to be repeated once:
Instructions
Answer the following questions about representing repetition algorithmically with
programming.
1) In order for a repetition to be reproduced unambiguously, what three attributes are
needed to convey the proper instructions?
Answer:
2) How are these three attributes conveyed in the music notation above? How are they
conveyed in the Scratch repeat block?
Answer:
3) Find and record a set of instructions containing a repetition clause (e.g., instructions on a
shampoo bottle, a recipe, etc.). How are these attributes conveyed there? If any are missing,
how would the reader infer them?
Answer:
Tempo
Changing Tempo
Changing the speed of a song can be easier than the programming strategies employed in
The Cat's Meow. A variable called tempo helps tremendously by changing how long one
beat is.
In the Sounds tab, find and select the check box next to tempo so that you can see the
value of the variable tempo at any given moment on the stage. The tempo and change
/ set tempo blocks provide three main ways to increase the tempo of a song.
Try It Out!
Experiment with these blocks to learn more.
When you are satisfied with your work, submit a link to your program or the program itself.
Regular Polygon Generator
Generalization
The power of programming lies in the ability to not only automate tasks, but to generalize
them. Rather than create a different shape generator for each regular polygon with varying
numbers of sides, we can use the abstraction of variables to program a generic regular shape
generator.
Instructions
Combine what you’ve learned about variables and shapes to program an automated shape
generator. At minimum, your program should execute the following actions:
When you are satisfied with your work, submit a link to your program or the program itself.
If you are having difficulty with determining the correct “formula” for the number of degrees
to turn each time, think about the following:
You are working with two variables—what are they, and how might they be relevant?
What numeric value would a square use to fill the blank? A triangle? Use these to think
of a function to generalize them.
UNIT TOPIC:
Repetition
Repeat Until
You will make use of loop constructs to create repetition without the need for
duplicating code.
You will explore the differences between counter- and conditional-based loops.
Repeat Until
There is another Scratch piece that lets us repeat things. It is called repeat until .
Just like repeat , it will do everything inside the C-shaped block a number of times.
Just like if , it is dependent upon a condition. Before it starts the loop each time, it
checks to see if the condition (below, x > 5 ) is true. If this condition is true, then it
will not repeat again.
It can be really helpful to keep track of what the variable x is at
each point to help us understand how this new piece works.
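A Python sketch can produce that kind of record automatically. The loop body here (change x by 2) is a hypothetical example, since the body of the pictured script is not reproduced in the text; only the condition x > 5 comes from the description above.

```python
def trace_repeat_until(start):
    """Record every value x takes in a 'repeat until x > 5' style loop."""
    x = start
    history = [x]
    while not (x > 5):    # the condition is checked before each pass
        x = x + 2         # hypothetical loop body: change x by 2
        history.append(x)
    return history
```

Starting from 0, the recorded values are 0, 2, 4, 6. Starting from 7, the condition is already true, so the body never runs at all.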
Try It Out!
Use a chart like the one above to keep track of what would happen in the complicated
repeat until code below. Try it without Scratch!
Conditional Loops Compared
Scratch provides two loop constructs that are dependent on a condition for execution. They
are similar, yet differ in one crucial aspect—how the condition relates to execution:
Put another way, if you were instructed to “Hop on one foot until the clock chimes,” you
would continue to hop, stopping only when the condition you are looking for (a clock chime)
occurs. However, if you were instructed to “Hop on one foot if the clock chimes, forever,” you
would wait until you heard a clock chime before you hopped. Each time the clock chimed,
you would hop on one foot. Your instruction is to do that forever, so as long as you are
following the instructions, you will continually stop to hop on one foot each time a clock
chimes.
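The hopping analogy can be simulated in Python with a list of clock ticks, where True means the clock chimed on that tick. The function names are invented, and the "forever" version is limited to the ticks supplied so that the sketch can finish.

```python
def hop_until_chime(chimes):
    """'Hop until the clock chimes': hop on every tick before the first chime."""
    hops = 0
    for chimed in chimes:
        if chimed:
            break          # condition met: the loop ends for good
        hops += 1
    return hops

def hop_if_chime_forever(chimes):
    """'Forever, if the clock chimes, hop': hop once on each tick with a chime."""
    hops = 0
    for chimed in chimes:  # stands in for forever, limited to the ticks given
        if chimed:
            hops += 1
    return hops

ticks = [False, False, False, True, False, True]
hops_until = hop_until_chime(ticks)          # hops happen before the first chime
hops_forever = hop_if_chime_forever(ticks)   # hops happen on the chimes themselves
```

In the first function the condition ends the repetition; in the second, the condition triggers the action and the repetition keeps going.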
Draw a “Squiral”
Instructions
In this assignment, you will create a new Scratch program that can generate the
“squiral” (i.e., square+spiral) shown below. Create a new program in Scratch, and open your
program from Regular Polygon Generator to reference. Your program should:
When you are satisfied with your work, submit a link to your program or the program itself.
Blocks Result
UNIT TOPIC:
Repetition
Loops and Variables
You will integrate sequencing, selection, and iteration into a single mini-project artifact.
Loops and Variables Mini-Project
Instructions
To demonstrate your knowledge of loops and variables, you will choose among three
different assignments. Select the one you feel is most appealing to you. Each incorporates
loops and variables, but in different contexts.
Rubric
Creativity Program is creative. 2 pts
Program is minimal, contains only core requirements. 1
TOTAL 10 pts
Option I: Draw a Picture
Example
Create something (anything!) beautiful with the drawing tools in Scratch. For instance, the
following image was created in Scratch:
The image above plays around with set pen color to [...] and set pen size
to [...] . If you want to create more complex images, you probably want to use
broadcast blocks to delegate the drawing of different parts of the image. For example,
we used the following broadcast blocks for the image above:
Tip: Your images will draw more quickly if you use the hide block to hide the character.
Instructions
In this assignment, you will create a Scratch program that draws something beautiful. Your
original program should:
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
Option II: Electronic Keyboard
Features
This option asks you to program a virtual keyboard! The code
to the right gives you a start, but you must add additional
functionality, perhaps by adding scripts for other keys on the
computer keyboard, so that your electronic keyboard can
change:
Volume,
Instrument,
Tempo, and/or
Drum loop/patterns.
Instructions
In this assignment, you will create a virtual keyboard using Scratch. Your original program
should:
1. simulate a virtual electronic keyboard that plays notes with computer keyboard
presses,
2. use loops to add functionality (e.g., for drum patterns, tempo changes),
3. use variables appropriately and as necessary,
4. be usable, efficient, and effective, and
5. include documentation (describe what it does) in the Instructions pane. Be sure to
describe how your program is original.
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
Option III: Countdown
Starter Code
Often, we want to display text or numbers on the screen without using the say block. We
have written a program that displays one number based upon a variable named digit .
You can copy and remix the original program starter code. You should modify this program
to work for five digits! (Initially, however, you may want to start by modifying it just to work
with two digits.)
When we use the switch to costume block, we can specify either the name or the
costume number. In the script below, we use the costume number ( 1 – 9 ) to set the
costume unless the digit is 0 . If the digit is 0 , we cannot just tell it to switch to costume
0 (because there is no costume 0 ), so we have to use the name of the costume [ zero ].
Modulus
The mod block is essential to completing the program efficiently.
The modulus (mod) operator is used to return the remainder of a division operation. It might
remind us how we answered division problems before we learned about decimals (e.g., 10 ÷
3 = 3 remainder 1). Correspondingly, 10 mod 3 = 1. As you complete this assignment,
consider the place value of each digit you need to change. If you count down from 25, you
can isolate the 5 from 25 by performing a mod operation (25 mod 10 = 5). You can then
isolate the 2 from 25 by performing the following operation:
For more explanation on the mod block, select the Help pane on the right side of the screen
in Scratch.
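The digit-isolation idea described above can also be sketched in text form. The following Python sketch is an illustration only, not the starter code: the helper name split_digits and its width parameter are hypothetical, and it assumes that integer division (//) is an acceptable way to discard the digit that mod has already isolated.

```python
def split_digits(number, width=5):
    """Return the decimal digits of number, most significant first.

    Hypothetical helper for illustration; assumes 0 <= number < 10**width.
    """
    digits = []
    for _ in range(width):
        digits.append(number % 10)  # mod 10 isolates the ones digit
        number = number // 10       # integer division discards that digit
    digits.reverse()
    return digits


print(split_digits(25, width=2))  # [2, 5]
print(split_digits(12345))        # [1, 2, 3, 4, 5]
```

Each pass handles one place value, so the same two operations work for any number of digits.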
Instructions
In this assignment, you will program a virtual countdown timer using Scratch. Your original
program should:
1. simulate a virtual countdown timer that can count from any five-digit number down to
zero,
2. remix the starter code provided,
3. use loops for timed countdown functionality,
4. use variables appropriately and as necessary,
5. be personalized in at least three ways,
6. be usable, efficient, and effective, and
7. include documentation (describe what it does) in the Instructions pane. Be sure to
describe how your program is original.
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
UNIT TOPIC:
Procedures
Abstraction
Procedures
You will explain the importance of using procedures to reduce the complexity of writing
and maintaining programs.
You will analyze programs and explain the use of abstraction within the programs.
Procedures in Scratch
Abstraction
Abstraction is a powerful concept in computer science, ranking alongside algorithms in
importance. The ability to apply abstraction in problem-solving is often what makes large
problems feasible. But what is abstraction?
At a basic level, abstraction is the process of “removing detail.” Before examining what this
means in relation to programming, let’s look at a couple of examples of abstraction in other
fields.
In Art...
Abstract art is created to be independent of—or detached from—the real,
physical world. This allows the artist to focus on colors, shapes, and form
without the restriction of adhering to expectations. In this way, “removing
detail” amounts to exploring artistic forms that may apply directly to nothing
in the real world, or that may apply to many arbitrary objects, ideas, or
concepts.
French Window at Collioure, painted by Henri Matisse in 1914, explores the form and color
of an interior window without striving to actually look like a window.
...and Design
Cars are complex machines. They are made up
of many independent parts that are organized
into systems—much like the nervous system or
circulatory systems in the human body.
Designing, building, and maintaining them are
time-consuming tasks requiring a lot of detailed
knowledge. Only specialized engineers and
mechanics typically design and service vehicles.
Designers and engineers have created an interface that allows the driver to operate the car
without needing to know the details. To make the car go, press the accelerator pedal. To
make it stop, press the brake pedal. All of the intricate processes that occur when these
pedals are pressed are hidden from the driver!
UTeach Computer Science: http://uteachcs.org © 2016 The University of Texas at Austin
Procedures
Removing details in an algorithm
Imagine that you are sitting in a chair in your home when someone knocks at the door. What
do you do? How do you react to this event?
“Open the door” leaves out how you get to the door.
Adding “walk to the door” doesn’t explain how to actually walk.
Replacing “walk to the door” with a repeated series of “take a step” doesn’t describe how
steps are taken.
“Lift your foot” leaves out which muscles you use to bend your leg.
...
In fact, at some level, your body adds layers of abstraction. After all, you are not aware of all
of the molecular processes needed to perform the seemingly simple task of contracting a
muscle.
Procedures
To walk is a common occurrence. Wouldn’t it be nice to define walking—with all of the
necessary details—once and then just call it walk? After all, listing all of the actions needed
to perform such a simple task every time you did it would be annoying and time-consuming.
PROCEDURE walk
1. Lift left leg.
2. Move left leg forward.
3. Set left leg down.
4. Lift right leg.
5. Move right leg forward.
6. Set right leg down.
Parameters
The procedure defined in the last section, walk, is problematic. It only consists of two steps.
In reality, we’d like to define walk so that we can take any number of steps. Instead of
defining separate procedures take 1 step, take 2 steps, take 3 steps, etc., we can apply
abstraction again—“removing detail”—to create a procedure take n steps, so that n can be
whatever we want it to be each time we use it.
Notice that the value n can vary, and we can create instructions with it without knowing what
it is. This means that it is, in effect, a variable. It’s a special type of variable called a
parameter. It is only relevant within the procedure in which it is defined.
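As a rough text-based analogue of the procedure and parameter ideas above, here is a minimal Python sketch. The function names walk and take_n_steps mirror the procedure names in the text; treating n as the number of times walk is repeated is an assumption made for this example.

```python
def walk():
    """Return the actions performed by one call of the walk procedure."""
    return [
        "Lift left leg.",
        "Move left leg forward.",
        "Set left leg down.",
        "Lift right leg.",
        "Move right leg forward.",
        "Set right leg down.",
    ]


def take_n_steps(n):
    """Repeat walk n times; n is a parameter whose value is supplied by the caller."""
    actions = []
    for _ in range(n):
        actions.extend(walk())
    return actions


print(len(take_n_steps(3)))  # 18 actions: 3 repetitions of 6
```

The name n exists only inside take_n_steps, which is what it means for a parameter to be relevant only within the procedure that defines it.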
Procedures in Scratch
Scratch provides the means to create your own blocks. These new blocks are the Scratch
equivalent of procedures with parameters. To illustrate how to create blocks in Scratch, let’s
recreate the different walk procedures from the previous section in Scratch.
Example: walk
PROCEDURE walk
1. Lift left leg.
2. Move left leg forward.
3. Set left leg down.
4. Lift right leg.
5. Move right leg forward.
6. Set right leg down.
To create a block with the same functionality as walk in Scratch, first navigate to the More
Blocks tab. Clicking Make a Block allows you to create and name a new block of your own:
Once you have created it, you can define what it does. To do this, create a block script as
you would for any other event:
Using it is as simple as using any other pre-made block. Simply drag your block into a
sprite’s script pane:
The take n steps procedure uses a parameter, so when the block is created, you can
add a number input as an optional component, and call it n :
When you define what it does, you can use the n parameter within the block the same way
you would use any other block:
Using the block with a parameter of 1 duplicates the original walk block:
Using the block with a parameter of 8 causes the sprite to take eight steps:
However, you are not limited to using pre-determined values in your blocks! As with any of
the pre-defined blocks you have used, you may also include variable parameters. The
following code asks the user how many steps to take,
1. Create a procedure that will define a parameter sides , and draw the polygon with
that many sides . Your block should be named DrawRegularPolygon . Example:
2. The procedure’s script should draw the
regular polygon matching the given
parameter sides . In other words,
3. Include an ask block, so that the user may direct which shapes the program draws.
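The geometry behind a DrawRegularPolygon block can be sketched in Python. This is an illustration under stated assumptions, not the Scratch solution: the function name draw_regular_polygon, the side_length parameter, and the idea of returning (move, turn) pairs instead of drawing are all choices made for this example.

```python
def draw_regular_polygon(sides, side_length=50):
    """Return the (move, turn) commands that trace a regular polygon.

    Hypothetical sketch: each pair means "move forward, then turn by this
    many degrees".
    """
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    turn = 360 / sides  # exterior angle, in degrees
    return [(side_length, turn) for _ in range(sides)]


commands = draw_regular_polygon(6)
print(commands[0])                        # (50, 60.0)
print(sum(turn for _, turn in commands))  # 360.0
```

Whatever the value of sides, the turns add up to one full rotation, which is why a single parameter is enough to describe every regular polygon.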
CODING SKILLS:
Rock, Paper, Scissors
Rock, Paper, Scissors
You will integrate sequencing, selection, and iteration into a single artifact.
You will focus on efficiency while writing code.
Rock Paper Scissors
Rock, Paper, Scissors
Mankind has been plagued for millennia by disagreement and indecision. Luckily, we have
recently discovered plans for constructing the ultimate decision-making tool in ancient
manuscripts attributed to scholars from lost Atlantis. It’s... Rock, Paper, Scissors!
For this activity, you will program a Rock, Paper, Scissors game using Scratch.
Make a Scratch script that makes two sprites say either “Rock,” “Paper,” or “Scissors”
as appropriate when the green flag is pressed. You must use a pick random
block to generate both sprites’ choices.
Add a third sprite. After the first two sprites pick their choices and say the
accompanying words, the third sprite should say who wins each round (or whether the
round is a tie). You may find this chart of rock, paper, scissors outcomes useful.
Add score variables (e.g., cat_score and duck_score) that keep track of how
many times each character has won (ties net zero points each). Remember, you can
make these variables visible on the stage by clicking on the check-box next to the
variable name.
The third sprite should end the game when either contestant (e.g., Cat or Duck)
reaches a score of three wins, and announce the winner.
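The game logic described in the steps above can be sketched in Python before it is built from Scratch blocks. This is a minimal sketch, not a reference solution: the names CHOICES, BEATS, round_winner, and play_game are hypothetical, and the contestants are assumed to be called Cat and Duck as in the example.

```python
import random

CHOICES = ["Rock", "Paper", "Scissors"]
# Each key beats the value it maps to.
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}


def round_winner(cat_choice, duck_choice):
    """Return "Cat", "Duck", or "Tie" for one round."""
    if cat_choice == duck_choice:
        return "Tie"
    return "Cat" if BEATS[cat_choice] == duck_choice else "Duck"


def play_game(wins_needed=3, rng=random):
    """Play rounds until one contestant reaches wins_needed; ties score nothing."""
    scores = {"Cat": 0, "Duck": 0}
    while max(scores.values()) < wins_needed:
        winner = round_winner(rng.choice(CHOICES), rng.choice(CHOICES))
        if winner != "Tie":
            scores[winner] += 1
    return scores


print(round_winner("Rock", "Scissors"))  # Cat
print(play_game())                       # final scores vary from run to run
```

Storing the winning pairs in one lookup table keeps the selection logic short, which is one way to approach the efficiency goal of this activity.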
Instructions
In this assignment, you will program a Rock, Paper, Scissors game using Scratch. Your
original program should:
When you are satisfied with your work, submit a link to your program or the program itself.
Your work will be reviewed by a peer, and in turn, you will review one of your peers’ projects.
You should base your evaluation on the assignment rubric.
Rubric
Criteria | Points
Uses random block appropriately to generate plays. | 1 pt
Displays correct/consistent plays for randomly generated numbers. | 3 pts
Resolution of all cases is correct (e.g., Rock beats Scissors). | 3 pts
Scores for each sprite are correctly calculated and displayed. | 2 pts
Game ends as described. | 1 pt
TOTAL | 10 pts
UNIT PROJECT:
Scratch Program
Highlights
You will collaborate in pairs to design, implement, and debug a novel, aesthetically
pleasing, and intuitive program using the Scratch programming environment.
You will identify a specific purpose that your program will serve (e.g., entertainment,
problem solving, education, artistic expression, etc.).
You will integrate interactive and multimedia elements into your program.
You will integrate common programming constructs, such as variables and selection
statements into your program.
You will test, debug, and correct your program.
You will use appropriate terminology while writing documentation detailing the full use
of your program and its features.
You will explain your design and implementation choices while demonstrating and
sharing your finished programs with your peers.
You will provide a written analysis of at least one other design team’s program,
identifying its strengths and weaknesses and offering suggestions for improvement.
Scratch Project: Rubric Check
Instructions
This activity provides you and a partner group time to provide one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Unit 3: Data Representation
UNIT 3
Data Representation
In order to make the most effective use of computational tools and data-driven
applications, you need to be aware of and comfortable with the diversity of
information these programs may use, as well as how that information may be
digitally represented, stored, and manipulated within the computer. This unit
focuses on providing you with an overview of the various levels of abstraction
that are used in the digital representation of discrete data and information.
You will initially focus on the lowest levels of digital representation and storage
by examining different base representations of numbers (including decimal and
binary) and their application to ASCII and Unicode character encoding. You will
also explore the distinctions between analog and digital forms of representation.
Finally, you will examine the characteristics of lists and the types of common
use-cases for these linear, ordered collections, including traversing, searching,
and sorting.
UNIT PROJECT:
Unintend’o Game Controller
Highlights
You will develop a Scratch program that acts as a device driver for a video game
controller interface.
You will map each of six controls (UP, DOWN, LEFT, RIGHT, A, and B) to individual
bits.
You will map each binary pattern of button presses to different game actions (e.g., walk
forward, walk backward, turn left, turn right, jump, duck, whirl, leap, crawl, etc.).
You will write detailed specifications and justifications for each button-to-action
mapping of your design.
You will collaborate with your peers throughout the design and development process to
determine end-user requests for features and to share feedback on design and
implementation strategies.
You will write documentation detailing the use of your program and its features using
appropriate terminology.
Unintend'o Controller Project
“There are 10 types of people in this world: those who understand
binary and those who don’t.” – Unknown
Introduction
Now that you’ve learned some basics about programming using Scratch, it’s time to delve
deeper into the layers of computational abstraction—to understand how computers
represent information. From video game platforms to smartphones and supercomputers, it’s
all bits and binary underneath. Learning how to make bits and binary work for you can be
very useful.
https://www.youtube.com/embed/iMcZ7oZDjRU
Controller Project
It’s 1987 and a low budget startup video game company, Unintend’o (pronounced: UN—
INTEND—OH), is trying to get into the gaming business by making knock-off games of
popular titles. Their first game, entitled Peerless Fabio Twins, involves two pipefitting Italian
twin brothers named Fabio and Lucio who face a variety of dangers in order to rescue the
Countess Lichen. To do so, they must defeat all sorts of animated lichens and molds by
jumping on top of them.
You’ve been hired to create a new controller for the video game company Unintend’o. You
will need to decide the role of each button on the controller so
that you can move the character through a 2D digital space.
Assignment
Develop a controller interface for basic game moves (e.g., walk forward, backward,
jump, duck) and advanced combo moves (e.g., whirl, leap, crawl), which requires that
you:
Map the button’s states (pressed/not pressed) to a binary sequence that travels
down a wire.
Map each binary sequence to one of the aforementioned (and other) actions.
Write detailed technical specifications that describe and evaluate your system.
Submission
Your submission will include both a Scratch program and the
technical specifications of your device driver. You should use
this starter code to begin the project.
A basic overview of the controller design with an explanation of each button’s role in
the game—in effect, a user's manual.
a table outlining all possible button combinations and what action each causes (if
any). Assuming you assign each button to a bit, there are 2⁶, or 64, possible
combinations!
a description of any limitations your controller may have that lie outside of your
table of button-press combinations. For example, perhaps if both the ← and →
buttons are pressed at the same time, Fabio oscillates between the two
directions rather than favoring one over the other or standing completely still.
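To make the button-to-bit mapping concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration: the bit order in BUTTONS, the sample entries in ACTIONS, and the function names encode and action_for are hypothetical and are not taken from the starter code.

```python
BUTTONS = ["UP", "DOWN", "LEFT", "RIGHT", "A", "B"]  # assumed bit order


def encode(pressed):
    """Return a 6-character bitstring: '1' for pressed, '0' for not pressed."""
    return "".join("1" if button in pressed else "0" for button in BUTTONS)


# Hypothetical action table; a full design would cover all 64 bitstrings.
ACTIONS = {
    "000100": "walk forward",
    "001000": "walk backward",
    "000010": "jump",
    "010000": "duck",
    "000110": "leap",
}


def action_for(pressed):
    """Look up the action for a set of pressed buttons via its bitstring."""
    return ACTIONS.get(encode(pressed), "no action")


print(encode({"RIGHT", "A"}))      # 000110
print(action_for({"RIGHT", "A"}))  # leap
print(2 ** len(BUTTONS))           # 64
```

Note that the action is chosen from the bitstring, not directly from the key presses, which is the separation the project asks for.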
Learning Goals
Over the course of this module and this project, you will learn to:
Content Area: Bitstrings
Performance Quality:
Column 1: Fabio performs tasks based on the bitstring and not directly from the key presses or controller buttons, AND the bitstring is created correctly.
Column 2: Fabio performs tasks based on the bitstring and not directly from the key presses or controller buttons, but the bitstring may be created incorrectly.
Column 3: The bitstring may be created correctly, but Fabio responds to the key presses or controller buttons and not the bitstring.
Column 4: Not enough criteria are met in order to award any credit.

Content Area: Loops and Conditionals
Performance Quality:
Column 1: Both loops and conditionals have been added to the program, AND all loops and conditionals are used effectively and correctly with purpose in the program.
Column 2: Loops or conditionals (but not both) have been added to the program AND all loops or conditionals are used effectively and correctly with purpose in the program, OR both loops and conditionals have been added to the program but not all loops or conditionals are used effectively and correctly with purpose in the program.
Column 3: Loops or conditionals (but not both) have been added to the program, AND not all loops or conditionals are used effectively and correctly with purpose in the program.
Column 4: Not enough criteria are met in order to award any credit.

OR: Program uses a list to accurately keep track of some button presses through their associated bitstrings.
Fabio performs one action when the user presses a combination of buttons simultaneously AND the movements appear fluid and are aesthetically pleasing.
UNIT TOPIC:
Binary Encoding of Information
Binary
You will examine how numerical values are represented using different bases,
including decimal and binary.
Base Conversions
You will explore methods of converting values from decimal to binary and binary to
decimal.
You will explore methods of counting in binary.
You will examine the exponential relationship between the number of digits and their
range of representable values.
You will examine how alphanumeric characters and symbols may be represented
using ASCII and Unicode character mappings.
You will analyze the differences in state space between ASCII and Unicode standards.
What Is Binary?
What Is Binary?
“20 Questions” demonstrates the power of dichotomous relationships, in which something
can only be one thing or another (Yes/No). Other examples of dichotomous relationships
might include: a light switch, which can either be flipped on/off (not including dimmer
switches), or handedness (left/right, not including ambidextrous folks), or at the most basic
level, existence (something either exists or it does not).
The purpose of bits is to represent something digitally. They are how information is stored,
accessed, transformed, and used by computers. Everything that we see on a computer is
actually stored as bits. The letters on this screen, the images, the links, everything you see
on this webpage is stored digitally as electrical switches turned off or on (typically
represented as long strings of 1 s and 0 s) that computers can interpret and transform into
symbols we understand, like numbers, letters, images, sounds, and programs.
This is, of course, an oversimplification of binary code and bits. Typically, in modern
computers, the 1 s and 0 s we refer to are the presence or absence of electrical signals,
but they don't have to be! One of the many beauties of computer science is that abstraction
allows us to view many processes and systems computationally—even those not involving a
“computer.”
Learn more about bits and binary code from Blown to Bits:
Twenty Questions
Binary Code in 20Q
Bits can encode everything. How many yes/no questions would it take to identify anything
you can think of (e.g., animal, vegetable, music, artist, etc.)? Today you’ll play and discuss
the game “20 Questions” in order to think about how binary code can help represent just
about anything.
“20 Questions” has been a popular game in the United States for more than a century. Have
you played it with your friends or family before? If you haven’t, here’s how the game works:
How has computing affected 20 questions? Well, now we can play against computers using
artificial intelligence.
1. Navigate to 20Q.net.
2. Play 20 Questions against the computer by choosing the language you’d like to play in.
Follow the online instructions and see how accurately the computer can guess your
chosen subject.
3. After a few games, click “About Us” in the left side bar of the 20Q.net website. Read
the page to learn more about how the program works.
4. Next, play a game of 20 Questions with a neighbor. Choose the questions you ask in
the game strategically.
5. When you are finished, you will be expected to discuss the following questions:
What were some “good” questions? (“good” means “efficient and effective”) Why
were those questions good?
What were some “bad” questions? (“bad” means “inefficient or ineffective”) Why
were those questions bad?
How does the choice of questions affect their utility?
Who is more effective—20Q.net or your neighbor?
State Space
Definition
Technically, the phrase state space refers to the space of potential possibilities. Watch the
following video of a man making handmade noodles in Beijing, China. In this case, state
space refers to the potential number of noodles he makes every time he folds the existing
noodles in half.
https://www.youtube.com/embed/6rfu1ZHiMP8
The state space of number systems grows exponentially with the number of possible digits.
If you have one decimal digit ( _ ), you can encode 10 different numbers (0–9).
If you have two decimal digits ( _ _ ), you can encode 100 different numbers (00–
99).
If you have three decimal digits ( _ _ _ ), you can encode 1,000 different numbers
(000–999).
...
Notice a pattern? Every time you add a decimal digit, the state space multiplies by 10 (e.g.,
10x10=100, 100x10=1,000).
Decimal:
324₁₀ = 300 + 20 + 4 = 324₁₀
Binary:
100111₂ = 32 + 4 + 2 + 1 = 39₁₀
Mathematically, any number can be described by the following notation, where i is the
position of each digit in the number (i.e., i=0 represents the “ones” place, i=1 represents the
“tens” place, i=2 represents the “hundreds” place, etc.):
The base represents how much the state space grows multiplicatively each time a digit is
added.
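In words, the value of a number is the sum, over every position i, of the digit at that position multiplied by the base raised to the power i. The following Python sketch computes that sum directly; the function name positional_value is hypothetical, and the sketch assumes digits 0 through 9 only, so it covers bases up to 10.

```python
def positional_value(digits, base):
    """Sum digit * base**i, where i = 0 is the rightmost position."""
    total = 0
    for i, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** i
    return total


print(positional_value("324", 10))    # 324
print(positional_value("100111", 2))  # 39
```

The same function handles decimal and binary because only the base changes, not the rule.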
If you have one binary digit ( _ ), you can encode two different numbers (0–1).
If you have two binary digits ( _ _ ), you can encode four different numbers (0–3).
If you have three binary digits ( _ _ _ ), you can encode eight different numbers
(0–7).
...
Every time you add a binary digit, the state space multiplies by two (e.g., 2 x 2 = 4, 4 x 2 =
8). The pattern here is the same, though the base differs.
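The growth pattern for both bases can be checked with a few lines of Python. This is a small sketch for illustration; the function name state_space is an assumption.

```python
def state_space(base, digits):
    """Number of distinct values that `digits` digits can encode in `base`."""
    return base ** digits


for digits in range(1, 4):
    print(digits, state_space(10, digits), state_space(2, digits))
# 1 10 2
# 2 100 4
# 3 1000 8
```

Each added digit multiplies the count by the base, so the state space is the base raised to the number of digits.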
objects. With 10, the number grows to 1,024. With 20 questions, the number reaches 2²⁰ or
1,048,576 objects!
Imagine a very constrained space, in which I ask you to identify which direction you are
facing in 1D space (i.e., “left” or “right"). How many yes/no questions must you ask to figure
out the direction? The obvious answer is “2,” but that’s actually more than we need. Instead
of asking “Is it left?” and “Is it right?”, we could just ask “Is it left?” or “Is it right?” The state
space of one binary question contains two possibilities (2¹ = 2), so that’s all we need.
For homework, consider a scenario in which you must identify which cardinal ( N , S , E ,
W ) or intercardinal ( NE , NW , SE , SW ) direction an object is facing. How many “yes” or
“no” questions would it take to identify the direction? Naïvely, it would take eight questions
(e.g., Is it “North?,” Is it “Northeast?,” Is it “East?,” etc.), but because we are using binary
states, we know that it can be done in three questions.
1. Give three dichotomous questions that will correctly identify an object’s cardinal or
intercardinal orientation when answered. Note: There is not just one single correct
answer here!
2. Create a table that maps each answer sequence to its matching direction. This table
should have three columns (for each of the questions) and eight rows (for each of the
directions). Note: Each sequence should be uniquely different than the others. Think
about why this must be true.
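To see why three yes/no questions are enough for eight directions, it can help to list every possible sequence of answers. The Python sketch below only enumerates the sequences; it deliberately does not assign directions to them. The function name answer_sequences and the use of Y and N as answer labels are assumptions for this example.

```python
from itertools import product


def answer_sequences(questions):
    """List every possible sequence of yes (Y) / no (N) answers."""
    return ["".join(answers) for answers in product("YN", repeat=questions)]


sequences = answer_sequences(3)
print(len(sequences))               # 8
print(sequences[0], sequences[-1])  # YYY NNN
```

There are exactly eight distinct sequences, one available for each direction in the table.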
Example Table:
High/Low Guessing
Game
In the High/Low game, you will choose a number between 1 and 1000. The computer will make
guesses such as “I guess that your number is 3” in order to determine your number. You will
then respond that incorrect guesses are “too high” or “too low.” In the end, the computer
always ends up guessing the correct number, but how many guesses are ideal for this task?
How many guesses guarantee a correct guess? The computer claims it can guess correctly
using at most ten guesses each time. Think about how this relates to the state space in this
scenario—the numbers 1–1000.
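One way to test the computer's claim is to simulate a guessing strategy and count the guesses. The sketch below assumes the computer always guesses the midpoint of the remaining range (binary search); the source does not say which strategy the game uses, and the function name guesses_needed is hypothetical.

```python
def guesses_needed(secret, low=1, high=1000):
    """Count the midpoint guesses needed to find secret in low..high."""
    guesses = 0
    while low <= high:
        guess = (low + high) // 2
        guesses += 1
        if guess == secret:
            return guesses
        if guess > secret:  # the player answers "too high"
            high = guess - 1
        else:               # the player answers "too low"
            low = guess + 1
    raise ValueError("secret is outside the range")


# Largest number of guesses over every possible secret number.
worst = max(guesses_needed(n) for n in range(1, 1001))
print(worst)
```

Compare the printed worst case with the computer's claim, and with the number of binary digits needed to cover a state space of 1,000 values.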
Revisit the Binary Search tool from Unit 1. Experiment with the applet to determine if this is
true. Write your answer in 2–3 sentences including an explanation of your reasoning.
Once you have written up your response, try to apply the same reasoning yourself. Guess
the computer’s number in as few questions as possible by playing Funbrain’s Guess the
Number game.
UNIT TOPIC:
Binary Encoding of Information
Base Conversions
You will explore methods of converting values from decimal to binary and binary to
decimal.
You will explore methods of counting in binary.
You will examine the exponential relationship between the number of digits and their
range of representable values.
The Amazing Binari
The Amazing Binari Transmutes Decimal to Binary!
Come one, come all, and witness the magic of the Amazing Binari (rhymes with Atari)! After
years of study at the Citadel, Binari has returned to the West with an amazing power. The
Amazing Binari can convert any decimal number into its binary representation! Wow! Don’t
believe it? Test his abilities here.
How does he do that? No matter what he may claim, it’s not magic.
Your job is to deduce how to convert decimal numbers to their binary representations by
observing the Amazing Binari at work. Give the Amazing Binari a number to transmute, and
watch what he does. Notice any patterns? How would you describe his actions?
Write and submit your interpretation of his actions. In other words, describe how to convert
decimal numbers to their binary representations, using the Amazing Binari as an example.
Binary Finger Counting
Counting in Binary
How useful is binary? If it can encode anything, can it encode numbers? Letters? Of course
it can! Let’s try counting in binary just like you first tried counting in kindergarten...on your
fingers.
How high can you count on one hand? Five? Ha! You can do much better. Using the idea of
dichotomous relationships and binary, Binary Finger Counting allows you to count to
numbers much larger than five using only the five fingers you have on one hand. Consider
each finger to be either “extended” or “not.” Each of your fingers then represents one bit.
Here’s a comic that explains the entire process:
Notice that each finger doesn’t represent a separate number, it represents a dichotomy, or
division into two classes (e.g., “the number is <16 (if not extended), or ≥16 (if extended)").
Each new finger doubles the number of values that can be represented.
https://www.youtube.com/embed/OCYZTg3jahU
Now try it yourself. Use the cartoon or video to help you along, or if you prefer written
instructions, here is a procedure (or algorithm) that explains how you can encode a number
in the range 0-31 on one hand:
1. The pinky represents whether the number is <16 (if not extended), or ≥16 (if
extended).
   So, if it is 16 or higher, extend your pinky, subtract 16 from your number, and
   continue with Step 2.
   If it is lower than 16, just skip ahead to Step 2.
2. The ring finger represents whether the remaining value is <8 (if not extended), or ≥8 (if
extended).
   So, if it is 8 or higher, extend your ring finger, subtract 8 from your number, and
   continue with Step 3.
   If it is lower than 8, just skip ahead to Step 3.
3. The middle finger represents whether the remaining value is <4 (if not extended), or ≥4
(if extended).
   So, if it is 4 or higher, extend your middle finger, subtract 4 from your number,
   and continue with Step 4.
   If it is lower than 4, just skip ahead to Step 4.
4. The index finger represents whether the remaining value is <2 (if not extended), or ≥2
(if extended).
   So, if it is 2 or higher, extend your index finger, subtract 2 from your number, and
   continue with Step 5.
   If it is lower than 2, just skip ahead to Step 5.
5. The thumb represents whether the remaining value is <1 (if not extended), or =1 (if
extended).
   So, if it is 1, extend your thumb and subtract 1 from your number.
6. You are done (you should have 0 as a remaining value).
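The finger-by-finger procedure can be sketched in Python as a loop over the finger values from largest to smallest. This is an illustrative sketch; the names FINGERS and finger_count are hypothetical.

```python
# Each finger and the value it stands for, from pinky (16) down to thumb (1).
FINGERS = [("pinky", 16), ("ring", 8), ("middle", 4), ("index", 2), ("thumb", 1)]


def finger_count(number):
    """Return the fingers to extend to show number (0-31) on one hand."""
    if not 0 <= number <= 31:
        raise ValueError("one hand encodes 0-31")
    extended = []
    remaining = number
    for finger, value in FINGERS:
        if remaining >= value:  # this finger's value fits: extend it
            extended.append(finger)
            remaining -= value  # subtract that same value before moving on
    return extended


print(finger_count(19))  # ['pinky', 'index', 'thumb']
```

For 19, the pinky accounts for 16, the index finger for 2, and the thumb for 1, leaving a remaining value of 0.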
Challenge
How many values could this guy encode?
HINT: Think about what value the sixth finger would represent.
Go back and look at the pattern for the values represented by
your own five fingers, and then apply it to the sixth finger.
UNIT TOPIC:
Binary Encoding of Information
ASCII vs. Unicode
You will examine how alphanumeric characters and symbols may be represented
using ASCII and Unicode character mappings.
You will analyze the differences in state space between ASCII and Unicode standards.
Alphanumeric Representation
Encoding Alphanumerics with the ASCII Table
This ASCII (American Standard Code for Information Interchange) table outlines a common
set of conventions established for converting between binary values and alphanumeric
characters.
The table functions as a mapping from binary values to alphanumeric symbols. This is an
arbitrary mapping that was constructed many years ago. Note that there is no inherent
meaning that dictates that the double-quote character, " , should map to 00100010 (or
34 in decimal notation). Much like your groups are doing for your controller projects, this
table and its mappings were defined by real people.
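The mapping for the double-quote character can be checked in Python, whose built-in ord and chr functions convert between a character and its code number. The helper name to_ascii_bits is hypothetical and is used only for this illustration.

```python
def to_ascii_bits(character):
    """Return the 8-bit binary string for a single ASCII character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")


print(ord('"'), to_ascii_bits('"'))  # 34 00100010
print(chr(int("00100010", 2)))       # "
```

The conversion works in both directions because the table is a fixed, agreed-upon mapping.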
https://www.youtube.com/embed/9NKcgp8KH2k
You can convert alphanumerics into binary (or vice versa) using the ASCII table as the
tutorial explains, or you can use one of the many free transducers found online, like this
Binary to Text Converter. Try it out!
Errors/Noise
Have you considered the fact that g and w are only different by one bit?
Otherwise, as a result of noise, instead of what you expect, you might end up missing a
really delicious piece of cake, or wind up with a blue screen or the spinning wheel of death.
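The one-bit difference between g and w can be verified with a short Python sketch. The function names differing_bits and flip_bit are hypothetical, and flipping a chosen bit is only a stand-in for noise, which in practice strikes at random.

```python
def differing_bits(a, b):
    """Count the bit positions in which two characters' codes differ."""
    return bin(ord(a) ^ ord(b)).count("1")


def flip_bit(character, position):
    """Simulate noise by flipping one bit (position 0 is the rightmost)."""
    return chr(ord(character) ^ (1 << position))


print(format(ord("g"), "08b"))   # 01100111
print(format(ord("w"), "08b"))   # 01110111
print(differing_bits("g", "w"))  # 1
print(flip_bit("g", 4))          # w
```

A single flipped bit is enough to turn one valid character into a different valid character.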
"There’s always the hope that if you sit and watch for long
enough, the beachball will vanish and the thing it interrupted will
return.” (The Eternal Flame, xkcd #961, Randall Munroe)
Digital Scavenger Hunt
Abstraction in Action
Bits are everywhere, and they can represent virtually anything. In fact, at the binary level, the
previous sentence looks like this:
The bits themselves do not mean anything without a rule that tells us how to interpret them.
This is the beauty of abstraction and a big part of the illusion that makes computers work the
way they do.
The task of figuring out what these bits mean is impossible without establishing a method for
decoding them. They may be a message encoded in ASCII, a save file for Peerless Fabio
Twins, or part of an image. They could be any of these and more. Assuming that they are
ASCII, mapping each set of eight is trivial; just map them from one encoding to another.
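The "sets of eight" idea can be sketched in Python (used here only for illustration; the activity itself uses the XLATE tool). The function names text_to_bits and bits_to_text are made up for this example.

```python
def text_to_bits(text):
    """Encode each character as 8 bits, using its ASCII code."""
    return "".join(format(ord(ch), "08b") for ch in text)

def bits_to_text(bits):
    """Read the bitstring 8 bits at a time and map each group to a character."""
    groups = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int(group, 2)) for group in groups)

message_bits = text_to_bits("Hi")
print(message_bits)                # 0100100001101001
print(bits_to_text(message_bits))  # Hi
```

The same bits would produce something entirely different under another decoding rule, which is why the rule has to be agreed on first.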
Use the XLATE tool to read the message encoded in ASCII in the previous bitstring. Simply
copy and paste the binary into the “binary” pane and press the button. XLATE will present
different encodings of the same bits. It should be fairly obvious which is the intended
encoding.
1. Load secret.txt in a web browser to obtain some bits. Before continuing, look at the
bits. What do you think is encoded? How different are the bits in different areas of the
bitstring?
2. Paste these bits (ALL OF THEM) into the binary pane in XLATE and click the DECODE
button. Nothing obvious will be apparent, but remember how differently the bits were
distributed in binary. Scroll around the panes and look for something interesting. You
will find further instructions there.
A brief outline of the process you used to find the message. Include dead ends!
Reading and Writing in ASCII
Passing Notes
How many times have your teachers told you or your
classmates, “Don’t pass notes in class!” Well, now it’s your
turn to do just that, but there are a few caveats:
Procedure
Think About It
UTeach Computer Science, http://uteachcs.org © 2016 The University of Texas at Austin
Unicode vs. ASCII
Unicode vs. ASCII
This chunk of rock is the Rosetta Stone. Historically, it is
important because it allowed the first deciphering of otherwise
strange symbols found in ancient Egyptian ruins. It contained
one piece of narrative text in three different forms—in ancient
Egyptian hieroglyphics, Ancient Demotic, and Ancient Greek.
This allowed historians to translate the hieroglyphics into
Greek, which was already well-understood.
The problem, of course, is that the ASCII table allows a very limited set of symbols—seven
bits’ worth. In order to make room for more symbols, more bits are needed. How many are
needed?
The current solution is to use the newer Unicode standard. Unicode is a binary encoding
system that can represent much more of the world’s text than ASCII can. Unicode allows
computers to represent most of the world languages’ alphabets, not just English.
which means:
Anything encoded with ASCII will work with Unicode; the first 128 symbols are the same.
Extra zeroes are used to pad the beginning of the Unicode equivalent to an ASCII encoding
to account for Unicode’s extra state space (2¹⁶ vs. 2⁷ possibilities).
For example, the letter “u” is represented by both as:
ASCII      Unicode
1110101    0000000001110101
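The padding can be reproduced with a short Python sketch (for illustration only). It starts from the 7-bit pattern shown above and assumes a 16-bit width for the Unicode column, as the text does.

```python
ascii_bits = "1110101"                 # the 7-bit pattern from the table
code_point = int(ascii_bits, 2)        # the number those bits stand for

unicode_bits = format(code_point, "016b")  # same number, padded to 16 bits

print(code_point)        # 117
print(chr(code_point))   # u
print(unicode_bits)      # 0000000001110101

# State space: how many distinct patterns each width allows.
print(2 ** 7)    # 128
print(2 ** 16)   # 65536
```

Only the leading zeroes change; the value, and therefore the character, stays the same.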
Can somebody type in other languages using Unicode and any keyboard? Yes! Even with
the keyboards that are primarily found in the United States, we can use, type, and encode in
other languages using Unicode. Here’s how:
For a web accessible application that generates Unicode character tables, try The Unicode
Range Viewer.
Additionally, you may like to see some of the many examples of world language symbols in
context. Try entering some common English words and phrases into a language translation
site such as Foreignword.com or Google Translate.
An Arabic keyboard. Click here to see (and use) more language-specific keyboards.
CODING SKILLS:
Binary Birthday Cake
Highlights
You will construct a Scratch program that simulates candles on a birthday cake being
lit so as to show the user’s age in binary.
Binary Birthday Cake
Grandpa is Turning 60!
Unfortunately, the supermarket doesn’t have enough candles
to fill his birthday cake! He needs five boxes of 12 to light the
cake the traditional way. The store only has one box—not
enough to even form the numerals “6” and “0” legibly.
= 11010₂
= (1 × 2⁴) + (1 × 2³) + (0 × 2²) + (1 × 2¹) + (0 × 2⁰)
= (1 × 16) + (1 × 8) + (0 × 4) + (1 × 2) + (0 × 1)
= 16 + 8 + 0 + 2 + 0
= 26₁₀
The birthday cake for a 26-year-old should, therefore, have at least five candles to represent
the 5 bits of the binary number 11010₂. If we say a lit candle represents a 1 bit and an unlit
candle represents a 0 bit, our finished cake might look something like this:
Note that the example picture above includes two additional leading unlit candles; because
these are unlit, they are effectively identical to leading zeroes, and as such, do not affect the
value of the number represented.
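The place-value arithmetic and the effect of leading zeroes can be checked with a minimal Python sketch. Python is used only for illustration (your project is in Scratch), and binary_to_decimal is a name made up for this example.

```python
def binary_to_decimal(bits):
    """Add up the place value of every 1 bit, starting from the right."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_decimal("11010"))    # 26
print(binary_to_decimal("0011010"))  # 26 (leading zeroes change nothing)

# Draw the candles: "*" for lit (1) and "." for unlit (0).
candles = "".join("*" if bit == "1" else "." for bit in "0011010")
print(candles)  # ..**.*.
```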
Modulus
The mod block is essential to completing the program efficiently.
The modulus (mod) operator is used to return the remainder of a division operation. It might
remind us how we answered division problems before we learned about decimals (e.g., 10 ÷
3 = 3 remainder 1). Correspondingly, 10 mod 3 = 1. As you complete this assignment,
consider the place value of each digit you need to change. If you count down from 25, you
can isolate the 5 from 25 by performing a mod operation (25 mod 10 = 5). You can then
isolate the 2 from 25 by performing the following operation:
For more explanation on the mod block, visit the Scratch Wiki mod entry.
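In Python (shown only as an illustration), the % operator plays the role of the mod block. The operation used above to isolate the 2 is not reproduced in this text, so the whole-number division below is an assumption about one reasonable way to do it. The to_binary helper is likewise a made-up name showing how the same two operations can peel off binary digits.

```python
age = 25

ones = age % 10    # remainder after dividing by 10
tens = age // 10   # whole-number quotient, discarding the remainder

print(10 % 3)  # 1
print(ones)    # 5
print(tens)    # 2

def to_binary(n):
    """Build the binary digits of n by repeatedly taking n mod 2."""
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits   # the remainder is the next bit
        n = n // 2                     # move on to the next place value
    return digits or "0"

print(to_binary(26))  # 11010
```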
Program Notes
A starter project you may remix is available here.
The program screen should include at least a birthday cake and candles.
The allowable age range for a person to enter is 0–122, as 122 is the oldest confirmed
age of a human, namely Jeanne Calment—so the maximum number of candles
required is seven.
Lastly, the easiest way to have a candle lit/unlit is to have two different “costumes”—
one corresponding to each, and select the one to display based on the binary
representation of the desired age.
(Optional) The number of candles should change depending on the age. So, a one-
year-old would have one candle, a 30-year-old would have five, and an 83-year-old
would have seven, because those ages need 1, 5, and 7 bits, respectively (the next
highest powers of 2 are 2¹, 2⁵, and 2⁷).
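The candle counts in the optional note can be verified with a short Python sketch (illustration only). The function name candles_needed is hypothetical, and showing one unlit candle for age 0 is an assumption.

```python
def candles_needed(age):
    """Smallest number of bits that can represent age (at least one candle)."""
    return max(1, age.bit_length())

for age in (1, 30, 83, 122):
    print(age, candles_needed(age))
# 1 1
# 30 5
# 83 7
# 122 7
```

The last line confirms that seven candles are enough for the full 0–122 range.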
Floating Point Numbers
Not Just Whole Numbers
Of course, digital devices can represent more than just whole
numbers. But how can numbers with decimal points be
represented in binary? The most intuitive way might be to first
treat the number as a whole number, and then always insert a
decimal point at the same point in its decimal equivalent.
Consider the following 32 bit representation:
11011111110001011101111111000101
3754287045
375428.7045
This assumes that the number is always represented to the 1/10,000ths in precision, but this
is arbitrary; we could have chosen any different number of decimal places. However, we
would need to choose this number beforehand, and it would apply to any number we
encoded with this representation. This is called fixed-point representation, because the
decimal point is always in the same place.
Because of this, fixed-point numbers are very limited in the range of values they can
represent. In our example above, we could encode 0–429496.7295. That seems like a fairly
large range, but we couldn’t even encode one million using this 32 bit, four decimal place,
fixed-point representation!
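The fixed-point example above can be reproduced with a minimal Python sketch (illustration only; SCALE and fixed_point_text are names made up here). The decimal point is placed by whole-number arithmetic alone.

```python
SCALE = 10_000  # four decimal places, chosen in advance

def fixed_point_text(raw):
    """Insert the decimal point four places from the right."""
    whole, fraction = divmod(raw, SCALE)
    return f"{whole}.{fraction:04d}"

raw = int("11011111110001011101111111000101", 2)  # read 32 bits as a whole number
print(raw)                           # 3754287045
print(fixed_point_text(raw))         # 375428.7045

largest = 2 ** 32 - 1                # every bit set to 1
print(fixed_point_text(largest))     # 429496.7295
print(largest >= 1_000_000 * SCALE)  # False: one million does not fit
```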
The standard convention for 32 bit floating point numbers—called the IEEE Standard for
Floating-Point Arithmetic (IEEE 754)—splits the bits into groups like this:
The mantissa is the numeric portion of the encoding, the exponent indicates where to place
the decimal point, and the sign denotes if the number is negative or positive. If you have
ever used scientific notation, then you have written a value similar to the way computers
store and process floating point numbers.
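To see the three groups in an actual value, here is a Python sketch using the standard struct module (illustration only; float32_fields is a made-up name). In the 32-bit IEEE 754 format the groups are 1 sign bit, 8 exponent bits, and 23 mantissa bits.

```python
import struct

def float32_fields(value):
    """Return the (sign, exponent, mantissa) bit groups of a 32-bit float."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(as_int, "032b")
    return bits[0], bits[1:9], bits[9:]

sign, exponent, mantissa = float32_fields(-6.25)
print(sign)      # 1  (negative)
print(exponent)  # 10000001
print(mantissa)  # 10010000000000000000000
```

Here -6.25 is stored as -1.1001 (binary) × 2², much like scientific notation: the exponent field holds 2 plus a fixed offset of 127, and the mantissa holds the digits after the leading 1.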
Note that moving the decimal point allows us to encode a much wider range of numbers
than we could with fixed-point representation. Using only 32 bits, we can encode positive
values from 1.17549 × 10⁻³⁸ to 3.40282 × 10³⁸, or
0.0000000000000000000000000000000000000117549 to
340282000000000000000000000000000000000! Note the number of zeroes in each. We
can’t encode all the values between these two numbers, just values to a certain degree of
precision. So, as you can see, a finite representation (a floating point number) is used to
model the infinite mathematical concept of a number.
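Both the range and the limited precision can be observed with a short Python sketch (illustration only; the helper names are made up). It builds the smallest and largest ordinary 32-bit values from their bit patterns, then shows a whole number that cannot be stored exactly.

```python
import struct

def from_bits(pattern):
    """Interpret a 32-bit pattern (given as an integer) as a float."""
    return struct.unpack(">f", struct.pack(">I", pattern))[0]

def to_float32(value):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

smallest = from_bits(0x00800000)  # smallest positive normal value
largest = from_bits(0x7F7FFFFF)   # largest finite value
print(smallest)  # roughly 1.17549e-38
print(largest)   # roughly 3.40282e+38

# 16,777,217 has no exact 32-bit representation, so a neighbor is stored.
print(to_float32(16777217.0))  # 16777216.0
```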
HARDWARE ABSTRACTION:
Logic Gates and Hardware
Highlights
You will identify multiple levels of abstractions that are used when writing programs.
You will examine how logic gates model hardware abstraction through Boolean
functions.
You will explore different types of hardware used in computing.
Logic Gates and Hardware
Logic in Computer Science
Logic is a foundational component of computer programming. As you create programs, there
will be occasions in which you want to check multiple conditions at the same time for an
intended result. If you want to check multiple conditions in a selection statement (if-else),
Boolean/logical operators (such as AND and OR) allow this to happen. In Unit 2: Decisions,
you were first exposed to Boolean operators as well as seeing them implemented in the
subsequent activity, “How Many Days…?” You were also able to see the practical application
of Boolean operators in the Logic Gate Game—each logic gate performs a logical operation.
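As a reminder of how these operators behave, here is a small Python sketch (illustration only) that models an AND gate and an OR gate as functions and prints their truth tables. The function names are made up for this example.

```python
def and_gate(a, b):
    """Output is on only when both inputs are on."""
    return a and b

def or_gate(a, b):
    """Output is on when at least one input is on."""
    return a or b

print("a     b     AND   OR")
for a in (False, True):
    for b in (False, True):
        print(a, b, and_gate(a, b), or_gate(a, b))
```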
Logic Gates
A logic gate is a hardware abstraction that is modeled by a Boolean function. At the simplest
level, a computer can be described as a vast collection of switches that are either on or off
(1s and 0s in binary). Most of these electronic switches are transistors or diodes. Transistors
act as “water faucets” by controlling the flow of electricity to logic gates. Multiple logic gates
form integrated circuits (or microchips). A collection of integrated circuits forms computer
hardware components, such as motherboards, video cards, memory, etc. In this way,
computer hardware is built using multiple levels of abstraction. In Unit 1, you learned about
Moore’s Law and its implications for computing. Each transistor performs a single operation.
Moore’s Law is often summarized as saying that the processing power of computers will
double every two years. It is important to note that
"From 2000 – 2009 there has not really been much of a speed
difference as the speeds range from 1.3 GHz to 2.8 GHz, which
suggests that the speeds have barely doubled within a 10 year
span. This is because we are looking at the speeds and not the
number of transistors; in 2000 the number of transistors in the
CPU numbered 37.5 million, while in 2009 the number went up to
an outstanding 904 million; this is why it is more accurate to
apply the law to transistors than to speed.”— MooresLaw.org
Moore’s Law and hardware abstraction are demonstrated well in this article, which claims that
IBM has been able to squeeze 30 billion transistors into a fingernail-sized chip (WOW!).
Similar to how the human body is made of cells, computers are made of transistors.
Using the table below, explain how the idea of abstraction pervades both biology and
computer science by
Describing whether the structure is a low-level abstraction.
Summarizing how each structure functions individually and as part of the larger system.
Describing the similarities between each side-by-side structure (i.e. how is a cell
similar to a transistor?).
Biological Science      Computational Science
Organ (Heart)           Integrated Circuit (Microchip)
Organism (Human)
Mapping Bits
As you can see, the use of 1s and 0s is just a (software) abstraction of electrical switches
turning on and off (hardware). Keeping in mind the relationship between hardware and
software, take a look at your Unintend’o Project.
Imagine the Unintend’o controller was not on screen, but instead in your hand (in place of
the keyboard).
As you press the “β” button, you are changing that switch from “off” to “on” and producing a
1 for the bitstring.
In your Unintend’o Project, you will ultimately generate a bitstring that will change based on
which input is allowing an electrical signal to pass through (i.e., which button is being
pressed). Like this example, software applications and computer systems are designed,
developed, and analyzed using a combination of levels of hardware, software, and
conceptual abstractions.
UNIT PROJECT:
Unintend’o Game Controller
Highlights
You will develop a Scratch program that acts as a device driver for a video game
controller interface.
You will map each of six controls (UP, DOWN, LEFT, RIGHT, A, and B) to individual
bits.
You will map each binary pattern of button presses to different game actions (e.g., walk
forward, walk backward, turn left, turn right, jump, duck, whirl, leap, crawl, etc.).
You will write detailed specifications and justifications for each button-to-action
mapping of your design.
You will collaborate with your peers throughout the design and development process to
determine end-user requests for features and to share feedback on design and
implementation strategies.
You will write documentation detailing the use of your program and its features using
appropriate terminology.
Unintend’o Project: Binary Mapping
Assignment
Today, your groups will begin work on the Unintend’o Controller Project by mapping each of
the controller button presses to a binary code. These codes will represent the digital
instructions to make the Fabio character move.
For example,
←  →  ↑  ↓  B  A  Result
1  0  0  0  0  0  Fabio walks left
0  1  0  0  0  1  Fabio jumps right
Indicate how each binary sequence is mapped to a character action. This behavior should
be expressed on the right side of the arrow.
A draft of the binary code mapping table is due by the end of the period, but you will also be
able to work on the Scratch program once you have finished your tables. Remember to
include combo moves in your table.
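A mapping table like this one can be thought of as a lookup from bit patterns to actions. The Python sketch below (illustration only; your project is in Scratch) uses the two example rows from the table above. The names ACTIONS and action_for, and the fallback text, are assumptions made for this example.

```python
# Bit order follows the table columns: left, right, up, down, B, A.
ACTIONS = {
    "100000": "Fabio walks left",
    "010001": "Fabio jumps right",
}

def action_for(bits):
    """Look up the action for a six-bit pattern."""
    return ACTIONS.get(bits, "no action defined")

print(action_for("100000"))  # Fabio walks left
print(action_for("010001"))  # Fabio jumps right
print(action_for("111111"))  # no action defined
```

With six buttons there are 2⁶ = 64 possible patterns, so a complete design should say what happens for each one, even if the answer is "nothing."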
The following is an empty table you may use. You may create your own instead if you like.
← → ↑ ↓ B A Result
UNIT TOPIC:
Digital Approximations
Digitization
You will examine the implications of variable-width encodings (e.g., Morse code) vs.
fixed-width encodings (e.g., Baudot code).
You will analyze the differences between discrete (digital) and continuous (analog)
representations of natural phenomena.
Perfect Copies
You will analyze the extent to which digital approximations accurately reflect the reality
that they represent.
You will examine the social implications of the ease with which perfect digital copies
can be made.
Variable vs. Fixed-Width Encodings
In 1836, people wanted to communicate with each other across great distances instantly—
just like we do today—but technologies such as mobile phones and the Internet were still
well over 100 years away.
Telegraphy was the first step toward bridging this communication gap. However, in its
infancy, long-distance communication was limited to sending two states—either an electric
signal or no electric signal. Samuel Morse developed a code comprising dots and dashes
(or dits and dahs) that could be sent electronically via wires that spanned miles and miles
from one city to another. On one end, a telegraph operator would use a key to send a
message and on the other, an operator would hear the Morse code and transcribe it into
letters, numbers, and punctuation. A skilled operator could instantaneously translate the
Morse code into alphanumeric symbols. It was easily the fastest way to communicate with
people across the state, country, or even internationally.
How effective was Morse code at sending messages? Let’s compare it to SMS text
messaging:
https://www.youtube.com/embed/pRuRE-Bwk1U
This video seems a bit dated—they are using flip phones after all. What factors do you think
might change the outcome today?
Morse Code
Morse code is a variable-width encoding. This means that each of the characters
represented by the dashes and dots of Morse may be different lengths. This was done by
design for efficiency. Samuel Morse knew that some characters would be sent much more
often than others, and so the information required to send them should be less. For
example, “E” and “T” each occur often and, as such, are represented respectively by one dot
and one dash. “Z,” on the other hand, occurs infrequently, and so requires four symbols to send
(dash dash dot dot).
However, variable-length codes introduce some
added complexity—how does one know where
one character ends and another begins? In
other words, what differentiates two “E’s” in a
row from one “I”? Morse used time—the delay
between characters—to delimit characters. But,
in a way, this sacrifices robustness for efficiency.
The sender’s perception of time may be different
than the receiver’s—particularly at high speeds.
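The ambiguity is easy to demonstrate with a small Python sketch (illustration only). The table holds just the four letters mentioned in this section, and a space stands in for the timing gap between characters.

```python
MORSE = {"E": ".", "T": "-", "I": "..", "Z": "--.."}

def encode(text, gap=""):
    """Encode text in Morse, placing gap between characters."""
    return gap.join(MORSE[ch] for ch in text)

print(encode("EE"))           # ..
print(encode("I"))            # ..   (identical: the receiver cannot tell)
print(encode("EE", gap=" "))  # . .  (the gap removes the ambiguity)
```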
Baudot Code
Émile Baudot, a French telegrapher, sought to minimize this ambiguity by creating a fixed-
width code. Every character sent was 5 bits long. There is no confusion over where one
character ends and another begins because they are all the same length.
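With a fixed width, the receiver only has to count. The Python sketch below (illustration only) uses a made-up 5-bit table, not the real Baudot assignments, to show that no delimiter is needed.

```python
# A made-up 5-bit code table for illustration (NOT the real Baudot code).
CODE = {"00001": "A", "00010": "B", "00011": "C"}
WIDTH = 5

def decode(bits):
    """Split the bitstring every WIDTH bits and look up each chunk."""
    chunks = [bits[i:i + WIDTH] for i in range(0, len(bits), WIDTH)]
    return "".join(CODE[chunk] for chunk in chunks)

print(decode("000110000100010"))  # CAB
```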
This is effectively similar to ASCII and Unicode. When the computer was young and the
standards committee was designing a character set, they selected Baudot’s method rather
than Morse’s. They were selecting for robustness over efficiency.
Like so many choices made in the history of computer science, there is no single correct
answer. Each choice is a selection among trade-offs—efficiency, robustness, correctness,
ease of use, etc.
UNIT TOPIC:
Digital Approximations
Analog vs. Digital Data
You will analyze the differences between discrete (digital) and continuous (analog)
representations of natural phenomena.
Approximating Physical Media
Approximating Physical Media with Coordinate Grids
In the next unit, Unit 4: Digital Media Processing, you will investigate different ways that bits
may encode different types of media. Additionally, you will learn to use a new, more powerful
programming language called Processing to perform image processing. As a teaser, this
activity includes a pre-written application in Processing that allows the user to create a point-
by-point approximation of an image.
Activity Instructions
Next, download this activity’s Processing source file:
1. Take a second to browse through the source file, and look for any keywords that you
may recognize from Scratch.
3. Click Upload Picture , and choose an image file to load into the program window.
Next you will “digitize” this image by redrawing it along a grid.
4. Select Toggle Grid , and with the grid on, click carefully on grid intersections in
order to place points. Place points around your image in order to represent it as best
you can. To remove an unwanted point, click it again.
5. The Toggle Line function will cause the next series of points placed to be directly
connected with lines. Note: Clicking dots will toggle them on and off (as a sort of
“UNDO” feature). This means that there can only be one line leaving and entering each
dot.
7. Toggle both the grid and the image so that they are not shown. Show the result to
some of your peers. Can they discern the original image from your representation?
8. Try it with a new grid size. Adjust the granularity of the grid with the +|- buttons.
Attempt the drawing/guessing activity once again with the new grid. How does this
change the effectiveness?
The short tutorial below demonstrates how to run and work with the source file in the
Processing application. Note that this program has a limited set of features. It may not
perform every task you would like it to in order to best create your representation. You will
have to experiment with the program to create your representations.
https://www.youtube.com/embed/P37_1h_kJbY
Digital vs. Analog
Think About It
UNIT TOPIC:
Digital Approximations
Perfect Copies
You will analyze the extent to which digital approximations accurately reflect the reality
that they represent.
You will examine the social implications of the ease with which perfect digital copies
can be made.
Digital Copies
Experiment
To learn why digital copies are easy to copy, let’s try an experiment:
Wait until your teacher instructs you to begin, then copy them as best you can. Your
neighbor will judge your work, and as a class, you will discuss it afterward.
10
17
14
4
3
12
17
12
6
4
10
17
Perfect Imperfection
Common misconception: Digital copies are “perfect.”
Quiz
To see how well you understand the perfect imperfection of digital copies, answer the
following questions as best you can.
Answer:
Answer:
BIG PICTURE:
Legality of Reselling Digital Music
Highlights
You will examine and discuss the legality of reselling “used” digital music.
Reselling Digital Music
Should It Be Illegal to Resell “Used” Digital Music?
Think about purchasing an item. When you are done with it, what are your options for
disposing of it? You can throw it away, recycle it, give it away, store it, or sell it—depending
on what it actually is. What if the item is a piece of recorded music? As of March 2013, it
actually depends on how it is distributed. Today you will debate the legality of reselling used
digital music.
1. First, your teacher will assign you to either the for or against team.
3. The Digital Millennium Copyright Act (DMCA) has been a benefit and a
challenge in making copyrighted digital material widely available. Read this brief
summary provided by the American Library Association, “DMCA: The Digital
Millennium Copyright Act,” and apply it to your team’s argument.
4. Then, write a reflection about the decision considering your team’s argument. Your
teacher will ask your team to get together and allow you to brainstorm your arguments
before the debate.
5. Last, your team will debate according to the debate protocol of your teacher’s
choosing. Your teacher will be the judge and decide which team puts forth the better
argument.
Make sure you address the issues, not the other team!
UNIT PROJECT:
Unintend’o Game Controller
Highlights
You will develop a Scratch program that acts as a device driver for a video game
controller interface.
You will map each of six controls (UP, DOWN, LEFT, RIGHT, A, and B) to individual
bits.
You will map each binary pattern of button presses to different game actions (e.g., walk
forward, walk backward, turn left, turn right, jump, duck, whirl, leap, crawl, etc.).
You will write detailed specifications and justifications for each button-to-action
mapping of your design.
You will collaborate with your peers throughout the design and development process to
determine end-user requests for features and to share feedback on design and
implementation strategies.
You will write documentation detailing the use of your program and its features using
appropriate terminology.
Unintend’o Project: Programming
Assignment
Your job today is to work on the actual Scratch program you will create for the Unintend’o
Controller project. You may take a variety of approaches to make the program function and
to make the controller/game fun to use and play! Remember, the Scratch program should:
Map each key press to a binary sequence. Note that interaction between the keys
(e.g., ← , → , ↑ , ↓ , A , and B ) and the program is restricted to only this. In other
words, the key presses should only be used to build the binary sequence.
Cause Fabio to act according to the binary sequence you have created. Note that
Fabio does not directly reference the controller or keyboard events, but rather
acts according to the value of the binary sequence.
For your reference, here is an example solution. Note that the example solution contains no
combo moves, but simply captures the basic button presses.
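The separation between the two requirements can be sketched in Python (illustration only; this is not the example solution, and every name below is hypothetical). One function turns key presses into a bitstring, and a second function chooses an action by looking only at that bitstring.

```python
BUTTONS = ["left", "right", "up", "down", "B", "A"]  # one bit per button

def build_sequence(pressed):
    """Turn the set of pressed buttons into a six-character bitstring."""
    return "".join("1" if name in pressed else "0" for name in BUTTONS)

def act(sequence):
    """Choose an action from the bitstring alone, never from the keys."""
    if sequence == "100000":
        return "walk left"
    if sequence == "010001":
        return "jump right"
    return "stand still"

print(build_sequence({"right", "A"}))       # 010001
print(act(build_sequence({"right", "A"})))  # jump right
```

Because act never sees the keys, the same driver would work with any input device that can produce the six bits.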
UNIT TOPIC:
Lists
Making a List
You will examine the use of lists as ordered data structures that may contain multiple
values.
You will investigate the use of index values to represent the position of an item in a list.
You will analyze the implications of accessing an index position beyond the bounds of
a list.
Processing a List
You will investigate common operations for processing elements of a list, including
searching for an element, removing an element, swapping the positions of two
elements, or sorting an entire list into ascending or descending order.
Sorting a List
Lists in Action
Making a List
Make a List
We are going to start working with lists! Lists are a type of data structure, a particular way
of storing data. Many objects can be easily stored in a list, and so lists have become a very
useful and ubiquitous data structure in computer science. For example, if we are developing
a game, and we want to keep track of the people that have played our game, we might keep
their names in a list.
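For comparison with a text-based language, here is how such a list might look in Python (illustration only; the player names are invented). One difference to keep in mind: Python numbers positions starting from 0, while Scratch lists start at 1.

```python
players = []               # start with an empty list
players.append("Ada")      # add names to the end, one at a time
players.append("Grace")
players.append("Alan")

print(len(players))  # 3
print(players[0])    # Ada  (position 0 is the first item in Python)
print(players)       # ['Ada', 'Grace', 'Alan']
```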
As with anything new, the first thing to do is explore! Make a new list named players by
clicking on the button Make a List , as shown below.
Below the name of your new list, you will see a group of blocks that operate on lists. Use the
players list to experiment and determine what these blocks do before going on to the
next step. Remember to check the box next to players to view its contents on the
stage.
Weird Cases in Lists
Instructions
This quiz is not graded. By this point, we do not assume you know the answers to all of
these questions, and we encourage you to make a best guess for each question. We also
want to reinforce that the answers to these questions will be important later on, but that you
can figure them out whenever you need to!
The questions in this quiz are essentially “what does this do?” questions. When dealing with
these questions, you must realize that people designed every programming language. So,
people weighed pros and cons and made a decision about what the language should do in
each situation (its specifications). You might disagree with them about what should happen,
but it is important to realize that you can always test code to figure out what it does.
The following questions test boundary cases, where what the code should do is not intuitive.
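The same "test it and see" approach works in any language. As an illustration, the Python sketch below probes two boundary cases. Python's designers made their own choices, so Scratch may well behave differently; try the Scratch versions yourself before answering.

```python
players = ["Ada", "Grace", "Alan", "Linus", "Margaret"]  # invented names

try:
    print(players[5])   # valid Python positions here are 0 through 4
    outcome = "no error"
except IndexError:
    outcome = "IndexError"

print(outcome)             # IndexError
print("Thing" == "thing")  # False: Python compares text case-sensitively
```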
1) Assuming that you only have five elements in your players list, what would happen if you
ran the script below?
Answer:
2) Are the words in a list case-sensitive? In other words, is “Thing” different from “thing”?
A. Yes
B. No
Answer:
3) What happens when you run the script below with a list that has no elements?
Answer:
4) What gets added to players when you run the following script?
Answer:
Reading a List
Read a List of Names
The following script is intended to keep track of the names of the players of a game, but it
has an error! Find and fix the error. Submit your solution.
The program is available here, as a Scratch project. Once you are done with it, you may find
it useful in your other projects.
UNIT TOPIC:
Lists
Processing a List
You will investigate common operations for processing elements of a list, including
searching for an element, removing an element, swapping the positions of two
elements, or sorting an entire list into ascending or descending order.
Processing a List
Index
We want to have the character read all of the names from our players list. This is a very
common type of problem when dealing with lists: we want to do the same thing for each item
in the list. To tackle this problem, we are going to use a variable called index to keep track
of the position of the element in the list we are processing.
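The index pattern looks much the same in a text-based language. Here is a minimal Python sketch (illustration only, with invented names). Python positions start at 0, whereas a Scratch index would start at 1.

```python
players = ["Ada", "Grace", "Alan"]

index = 0                          # position of the item being processed
spoken = []
while index < len(players):
    print(players[index])          # "say" the current name
    spoken.append(players[index])
    index = index + 1              # move on to the next position
```

The loop stops once index has moved past the last position, so every item is visited exactly once.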
Once you get the script above to work, try to make a script that will read all of the player
names and then say them all together, with the appropriate commas, spaces, and the word
“and,” as shown below. Remember to save your work!
List Output
Hint: Before you make the final result with the serial commas and the “and,” try to make a
script that will just join together all the names into one variable, and then have it say
that variable. The following block is a good starting point:
Index Variables
Practice with Index Variables
As we have seen in the previous section, we generally use an index variable to process the
elements of a list. Below, we have reproduced the script that makes the character say the
names of the players in a list.
Remove from a List
Remove
The script below is supposed to remove all of the occurrences of “ABC” from the players
list.
Find a list for which the script below will not remove all occurrences of ABC (such lists DO
exist), and explain your findings:
Sentences as Lists
Processing a Sentence
The pattern of using an index variable to process a bunch of values is pretty common in
programming, and is not specific to lists alone! The block on the left below is the one that we
used to say all of the names in our player list. The block on the right can be used to say all of
the letters in a sentence like “Go Longhorns!”
Players Sentence
There are striking similarities between the two scripts, and so we should be able to perform
the same tasks on sentences as those we did on lists. Write blocks that make the character
say the following:
Note: Often, when writing a script, if we have already written a similar script for another
purpose or data structure, we can either modify the script for the new functionality, or we can
create a more general script that can be used for both purposes.
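The "more general script" idea can be sketched in Python (illustration only; say_each is a made-up name). Because both lists and sentences can be indexed by position, one function handles either.

```python
def say_each(items):
    """Visit every position in turn; works for lists and strings alike."""
    said = []
    index = 0
    while index < len(items):
        said.append(items[index])
        index += 1
    return said

print(say_each(["Ada", "Grace"]))  # ['Ada', 'Grace']
print(say_each("Go!"))             # ['G', 'o', '!']
```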
UNIT TOPIC:
Lists
Sorting a List
Swaps
Reordering Lists
Sometimes, we care about the order of a list (e.g., in a dictionary) and sometimes it doesn’t
matter (e.g., in a shopping list). A very common task for lists is reordering them, either to
impose an order that did not exist before (such as sorting a list—alphabetizing a list of
names) or changing the ordering (such as reversing a list—taking the alphabetized list of
names and making it a reverse-alphabetized list).
At the heart of the matter, reordering a list involves swapping the positions of two items
repeatedly:
In the case of a reversal, this might entail taking the first item and swapping it with the
last, then the second item with the second to last, etc.
In the case of sorting a list of names, it might involve taking what should be first
(alphabetically) and swapping it with what actually is first, then continuing down the
line.
NOTE: Sorting is a very well-studied topic in computer science. The sorting algorithm
described above is a “selection sort.” Note the various sorting algorithms described here.
“Selection sort” is not the most efficient way to sort a list, but it is the way humans typically
manually alphabetize things. How might you apply one of the more efficient algorithms
described to physically alphabetizing file folders, books, etc.?
X Y
15 32
1. Copy X to Y.
2. Copy Y to X.
Seems simple enough until you step through it. After Step 1, our table now looks like this:
X Y
15 15
Step 2 now depends on the “old” value of Y, which was overwritten in Step 1. We need to “save”
that value somewhere before Step 1 in order to use it later. We cannot save it in X, because
then we’ll have the same problem in reverse. We need a third variable:
X Y SwapSpace
15 32
Using this third variable, write out the steps that would constitute an algorithm for
swapping two values.
Reorder
Reordering
As the previous assignment indicated, reordering a list is a very common occurrence within
a computer program. Consider some of these common tasks:
All of these tasks are essentially reordering a list of items so that it is displayed in a different
way. We’ve output lists in different orders before now, but we have not actually altered a list’s
contents so that they are stored in a different order. Examine this (possibly daunting)
program:
This program reorders a list, but how? It’s not actually as complex as it looks at first glance.
Some of the parts of this program are directly copied from programs we have dealt with
previously. For example,
Code Explanation
The highlighted section is just the framework for processing a list: going through each
item in the list one after the other. For reference, see the previous “Process a List”
activity. Compare the program there, which simply instructs Scratch to say each item in the
list, to the highlighted blocks here.
in the unsorted part of the list of words and repeatedly checks if any/all of the items are
alphabetically before the rest. By the time it reaches the end, the first word alphabetically
is denoted by indexOfBestSoFar. It will then be swapped with the first item, and so on.
This comparison block is the portion of our program that determines that we want the list
to be sorted in alphabetical order. If we were to desire a different ordering, such as by
length of [word], only this block would differ in our resulting program!
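To make the "only the comparison changes" idea concrete, here is a minimal selection sort sketch in plain Java. It is an illustration under stated assumptions, not a transcription of the Scratch Reorder! project: the class name, the `comesBefore` parameter, and the sample words are hypothetical, while `indexOfBestSoFar` and `swapSpace` borrow names used in this unit.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ReorderDemo {
    // Selection sort: repeatedly find the item that should come first in the
    // unsorted part of the array and swap it into place.
    static void selectionSort(String[] words, Comparator<String> comesBefore) {
        for (int start = 0; start < words.length - 1; start++) {
            int indexOfBestSoFar = start;
            for (int i = start + 1; i < words.length; i++) {
                // The comparison is the only part that decides the ordering.
                if (comesBefore.compare(words[i], words[indexOfBestSoFar]) < 0) {
                    indexOfBestSoFar = i;
                }
            }
            // Swap using a third variable so that no value is overwritten.
            String swapSpace = words[start];
            words[start] = words[indexOfBestSoFar];
            words[indexOfBestSoFar] = swapSpace;
        }
    }

    public static void main(String[] args) {
        String[] alphabetical = {"pear", "fig", "banana"};
        selectionSort(alphabetical, Comparator.naturalOrder());
        System.out.println(Arrays.toString(alphabetical));

        String[] byLength = {"ccc", "a", "bb"};
        selectionSort(byLength, Comparator.comparingInt(String::length));
        System.out.println(Arrays.toString(byLength));
    }
}
```

Both calls run exactly the same loop and swap code; only the comparator passed in differs, which mirrors replacing the single comparison block in the Scratch program.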
Experiment
Access the Reorder! project on the Scratch website and play around with it. Work with it to
sort items in numerical order, reverse alphabetical order, and ordered by length of
[word] .
Describe each of the changes you made while experimenting (INCLUDING THE FAILED
ATTEMPTS!), and indicate why they work the way they do.
UNIT TOPIC:
Lists
Lists in Action
MultiSet to Set
Now that you have learned the basic operations to add, access, and remove items in a list,
let’s integrate them into a basic procedure.
A procedure is a group of blocks that have been combined to perform a specific task. In
Scratch, this is done with the “Make a Block” button in the “More Blocks” pane.
To demonstrate, let’s create a block that resets two lists to a default configuration.
First, we will click the “Make a Block” button, and name the block something descriptive, like
“reset lists”:
Next, we will associate some code with it. Any time the newly created reset lists block
is used, all of the code under the define reset lists block will be executed:
Finally, we may use the new reset lists block anywhere in our code. In this example,
the reset lists block is called anytime the sprite is clicked:
Your Task
Your task is to create a block that creates a set from a multiset. A multiset is a collection of
items that may contain duplicates; a set does not contain duplicate entries. Order does not
matter in either collection.
Example:
Given the multiset { a, d, c, a, c, b }, you are to create a block that will populate
the corresponding set { a, d, c, b }. Note that because order does not matter,
{ b, a, d, c } is also a valid solution.
The Scratch starter code provided here includes two lists entitled multiset and set. You will
create a block to perform the conversion as part of the DeDupe sprite, an arrow that begins
the “de-duplication” conversion process when clicked.
Note: A completely correct solution will work for any list of items, not just the sample one
provided.
After you have a working solution, submit your code and a description of how you tested it.
Note that you should try to test it on more lists than just the one provided!
Extensions
For an additional challenge, try the following:
Create a block that incorporates the selection sort code in Reorder so that the items in
Set are always in alphabetical order.
Create a block that converts MultiSet into a Set rather than simply creating a new Set.
To do this, you will need to selectively remove items from MultiSet.
UNIT PROJECT:
Unintend’o Game Controller
Highlights
You will develop a Scratch program that acts as a device driver for a video game
controller interface.
You will map each of six controls (UP, DOWN, LEFT, RIGHT, A, and B) to individual
bits.
You will map each binary pattern of button presses to different game actions (e.g., walk
forward, walk backward, turn left, turn right, jump, duck, whirl, leap, crawl, etc.).
You will write detailed specifications and justifications for each button-to-action
mapping of your design.
You will collaborate with your peers throughout the design and development process to
determine end-user requests for features and to share feedback on design and
implementation strategies.
You will write documentation detailing the use of your program and its features using
appropriate terminology.
Unintend’o Project: Rubric Check
Instructions
This activity provides you and a partner group time to provide one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
Unintend’o Project: Gallery Walk
Instructions
To present your controller to the class, you will participate in a gallery walk. An art gallery
is an art museum (or any collection of rooms) that houses an artist’s work for exhibition
or sale (Wikipedia). In this case, the gallery will be composed of your
Unintend’o projects. All students who have completed the project will display their Scratch
programs and technical specifications. Then, you will rotate around the classroom,
experimenting with the controls as outlined in the technical specifications. At the end of the
gallery walk, you will vote for your favorite controller artifacts:
Whose controller schemes are the most efficient? ...the most effective? ...the most
expandable? ...the most clever?
Which special combo moves are the most novel?
Which controller would make for the best game?
UNIT 4
Digital Media Processing
Building upon your earlier visual programming experiences with Scratch, this
unit guides you through the transition to programming in a high-level,
procedural language through a brief introduction to Processing. By familiarizing
yourselves with a text-based environment that more closely reflects the actual
programming tools used in industry, such as Java, C++, or Python, you will be
better equipped for continuing your studies in computer science beyond the
scope of this course.
UTeach CS Principles Unit 4: Digital Media Processing
UNIT PROJECT:
Image Filter
Highlights
You will utilize pair programming to design and implement a program for filtering digital
images.
Using the Processing programming language, you will develop code to systematically
transform an image by mathematically manipulating its bits, pixel by pixel.
You will write documentation detailing the use of your program and its features using
appropriate terminology.
You will explain your design and implementation choices by demonstrating and sharing
your finished programs with your peers.
Image Filter Project
“You don’t take a photograph, you make it.” – Ansel Adams
https://www.youtube.com/embed/p5QQNmkSE5Y
All digital media consists of bits and their abstractions. Manipulating bits can make
abstractions more useful, usable, or beautiful. Much like wizards in a fantasy setting,
programmers are able to change the very core representations of reality.
Computer scientists manipulate bits to achieve a wide variety of outcomes. Image editors
can transform many characteristics of images with ease. Productivity software can perform
complex mathematical functions automatically. Video game environments can be rendered
dynamically, and more.
With computational thinking and the proper programming skills, anyone can manipulate bits.
Assignment
Submission
Your submission will be an original program coded with the Processing language. You must
submit the “sketchbook” .pde file, along with documentation describing its functionality.
Download, execute, and build upon this starter code:
When you are finished, you will submit the source code of your Processing program, which
will be graded using the attached rubric. You will then review one other group’s submission,
and reflect upon any differences from your own work.
Learning Goals
Over the course of this module and this project, you will learn to:
Rubric
Content Area | Performance Quality (columns listed left to right)

Loops and Conditionals
Column 1: Both loops and conditionals have been added to the program, AND all loops and
conditionals are used effectively and correctly with purpose in the program.
Column 2: Loops or conditionals (but not both) have been added to the program, AND all loops
or conditionals are used effectively and correctly with purpose in the program; OR both loops
and conditionals have been added to the program, but not all loops or conditionals are used
effectively and correctly with purpose in the program.
Column 3: Loops or conditionals (but not both) have been added to the program, AND not all
loops or conditionals are used effectively and correctly with purpose in the program.
Column 4: Not enough criteria are met in order to award any credit.

Filter 3
Column 1: includes a third image filter that alters the image by reordering pixels (not using
a built-in Processing filter function), AND the filter uses Processing constructs
appropriately and effectively.
Column 2: includes a third image filter that alters the image by reordering pixels (not using
a built-in Processing filter function), AND the filter uses Processing constructs
appropriately.
Column 3: includes a third image filter that alters the image by reordering pixels (not using
a built-in Processing filter function), but Processing constructs are not used appropriately.
Column 4: Not enough criteria are met in order to award any credit.
UNIT TOPIC:
Procedural Programming
Introduction to Processing
Drawing
You will write programs that make use of parameterized methods to invoke specific
behaviors.
You will understand the importance of using proper punctuation and syntax when
coding in a text-based programming language.
You will write code using common programming constructs like conditional if() for
selection and while() loops for iteration.
Mouse Interaction
You will use event handlers to animate on-screen effects and respond to mouse input.
Keyboard Interaction
You will use event handlers to animate on-screen effects and respond to keyboard
input.
UNIT TOPIC:
Procedural Programming
Introduction to Processing
Writing Code
Introduction to Processing
Before we write any Processing programs, let’s take some time to read one. The following is
a short example of a Processing sketch that draws a picture using Processing’s built-in
drawing tools. Read it and try to determine roughly what it does before executing it.
As you complete the following tasks, note that each of the commands has a real-world
action that it approximates.
Type the sketch into Processing as-is, and execute it. Does it look anything like you
imagined by examining the code?
Tinker with the parameters for each of the following:
ellipse
fill
strokeWeight
line
curve
size
Assignment
Record
Scratch vs. Processing
Comparison: Scratch vs. Processing
There are many similarities between the two languages; in fact, these similarities occur
across nearly all of the programming languages in use today! Nearly all programming
languages are equivalent in terms of being able to express any algorithm.
Scratch Processing
To show how familiar Processing should already be, here are a few examples
of how Scratch and Processing compare with one another:
Each has buttons that start and stop the execution of a program.
Scratch Processing
Scratch Processing
println("Hello!");
Scratch Processing
if ( SOMETHING )
{
DO THIS...
}
else
{
DO THAT...
}
Scratch Processing
index = 0;
Scratch Processing
index = index + 1;
Each offers a way to repeat an action. Note that Scratch uses a “repeat until...”
metaphor, while Processing uses a “repeat while...” metaphor. This difference means
that the Boolean conditions that determine whether to repeat another iteration of the
loop are opposite one another:
Scratch Processing
while (x >= y)
{
DO THIS...
}
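The "opposite conditions" point can be checked with a small experiment. The sketch below is plain Java and assumes, purely for illustration, that the Scratch block reads "repeat until x < y" with a body that decreases x by 1. The class and method names are hypothetical.

```java
public class LoopDemo {
    // Processing/Java style: keep looping WHILE the condition holds.
    static int countWithWhile(int x, int y) {
        int iterations = 0;
        while (x >= y) {
            x = x - 1;
            iterations = iterations + 1;
        }
        return iterations;
    }

    // Scratch style "repeat until (x < y)", written as a while loop by
    // negating the "until" condition: while NOT (x < y).
    static int countWithRepeatUntil(int x, int y) {
        int iterations = 0;
        while (!(x < y)) {
            x = x - 1;
            iterations = iterations + 1;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(countWithWhile(5, 3));
        System.out.println(countWithRepeatUntil(5, 3));
    }
}
```

Because "not (x < y)" means the same thing as "x >= y", both methods always perform the same number of iterations.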
Scratch Constructs Revisited
Scratch Code Blocks
Recall that Scratch organizes types of blocks according to
their usage. Each category is given its own pane and block
color. For example, those blocks that affect the flow of
program execution (such as branching paths and sequences
of events) are a gold color, as seen to the right. If you know
the type of tool you require, you can easily find it through the
Scratch interface. Unfortunately, text-based languages like
Processing are more difficult to navigate. In order to locate the
proper tool for a job, you must first learn the basic constructs
available to you.
Anna has her tools organized in such a way that they are easy
to find, easy to reuse, and easy to select. She has a toolbox
with drawers specified for each type of tool: fasteners,
diagnostics, electrical, etc. This is similar to the way Scratch
organizes its available blocks. On the upside, blocks are easy
to locate and select. On the downside, every available tool
must fit neatly into a drawer according to its purpose.
As a programmer, it is important to have knowledge of the basic tools when using a text-
based language such as Processing. It is much more difficult to “jump in” and start trying to
construct a program than it is with Scratch, where all of the tools are laid out before you in
an easily comprehensible manner.
As such, and this is very important, you should spend time familiarizing yourself with
the basic commands available in Processing. One approach that may help you is to
use the Scratch interface to explore types of commands and then seek out their
equivalents in Processing. Additionally, reading other, pre-written programs and
deconstructing how they work is one of the most useful approaches you can take to
be successful. In fact, this approach will carry you through any number of languages
and any level of expertise.
Punctuation
Punctuation Matters? Punctuation Matters!
Punctuation is important. The English language is full of
ambiguities that are often confusing when translating natural
speech into text. A reader does not have the luxury of hearing
the intonation, pauses, and speech fillers that usually
accompany speech. Punctuation is a form of annotation that
signals some of these missing cues to a reader.
A Short Exercise
Match the following phrases with the appropriate pictures:
Let’s eat
children!
English clearly requires punctuation to function properly. Otherwise, figuring out the meaning
of a sentence falls to the reader. But computers are not readers, and computer programs are
designed to be as unambiguous as possible, so why does punctuation matter in Processing
code?
You have probably heard the phrase, “Computers are dumb.” This isn’t exactly true, as you
are learning throughout this course, but the statement, “Computers lack intuition” is a true
statement. In other words, any behavior a computer exhibits is either programmed explicitly
or follows some set of explicitly programmed rules. Whereas a person can go back and
figure out the meaning of a sentence that is clearly incorrect, a computer cannot. It does not
have the experience and ability to guess correctly when there are ambiguities. Beyond that,
we would not want it to, because it might guess wrongly, and our programs would not
work as we intended.
Punctuation Symbols { ; }
The most important punctuation marks you will master in Processing are the braces { } that
enclose a block of statements and the semicolon ; that terminates each statement. You will
make many errors learning how to place these properly and remembering to do so. A
common complaint is that they don’t seem necessary. However, computers, as mentioned
above, are not able to just “get what you mean.”
Braces { } : These enclose any group of statements that need to be treated as a unit. The
most common way you will use them is to replicate the way Scratch blocks “enveloped”
other blocks. For example, the if statement in Scratch has a slot that contains the
statements that are to be executed if, and only if, the condition is true. In
Processing, braces play the same role:
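As a minimal sketch of that role, the plain Java below (Processing uses the same brace and semicolon rules) contrasts an if statement with braces against one without them. The class name, method names, and messages are hypothetical and chosen only for illustration.

```java
public class BraceDemo {
    // With braces: both statements inside { } run only when the test is true.
    static String describeWithBraces(int score) {
        String message = "Keep trying";
        if (score >= 10)
        {
            message = "You win";
            message = message + "!";
        }
        return message;
    }

    // Without braces: only the first statement after the if belongs to it.
    // The second statement runs every time, regardless of the indentation.
    static String describeWithoutBraces(int score) {
        String message = "Keep trying";
        if (score >= 10)
            message = "You win";
        message = message + "!";
        return message;
    }

    public static void main(String[] args) {
        System.out.println(describeWithBraces(3));
        System.out.println(describeWithoutBraces(3));
    }
}
```

For a score of 3, the first method leaves the message untouched, while the second still appends the exclamation mark, because nothing groups that statement with the if.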
Semicolon ; : Each block in Scratch was just a distinct unit. With Processing, you must
piece together your own statements using keywords, variables, and punctuation. The
semicolon ends the statement. There are two major hurdles to learning its usage.
1. Not all statements end with a semicolon (e.g., the if statement referenced above). For the
most part, though, statements do require it, and you will develop an intuition for the
exceptions to the rule.
2. You would think that the Processing environment would be able to tell you where to
place your missing semicolons as part of the error message you receive when you’ve
left one out. Processing does attempt to provide a likely location, but it is not foolproof.
In fact, this is a fairly difficult problem. In order to illustrate with an English equivalent,
read the following “garden path sentences.” Some of them are difficult to unravel, but
they are all grammatically correct!
Garden Path Sentences
The horse raced past the barn fell.
The florist sent the flowers was pleased.
Time flies like an arrow; fruit flies like a banana.
The complex houses married and single soldiers and their families.
The man whistling tunes pianos.
The old man the boat.
I convinced her children are noisy.
The tomcat curled up on the cushion seemed friendly.
The man returned to his house was happy.
Mary gave the child the dog bit a bandaid.
The government plans to raise taxes were defeated.
The girl told the story cried.
UNIT TOPIC:
Procedural Programming
Drawing
You will write programs that make use of parameterized methods to invoke specific
behaviors.
You will understand the importance of using proper punctuation and syntax when
coding in a text-based programming language.
You will write code using common programming constructs like conditional if() for
selection and while() loops for iteration.
Draw Shapes
Consult the Documentation
Knowing what Processing’s built-in functions do and how to
use them to build your application is important, but do you
need to memorize all of the various components of each? Do
you need to remember exactly what each of the parameters in
the statement rect(30, 20, 55, 55, 3, 6, 12, 18); means?
PROCESSING DOCUMENTATION
Note that the documentation is organized by types of functions. In other words, to find out
how to draw a particular shape, look under the “Shapes” heading.
Instructions
Use the Processing Documentation to create one five-line program that does each of the
following. Note that Processing orients its axes differently than those typically
associated with Cartesian coordinates. See the figure below for reference.
1. Set the size of the display window to a height of 300 and a width of 500.
2. Draw an ellipse with a width of 20 and a height of 30. It should be centered at the x-y
coordinates (100, 40).
3. Create a color variable c set to burnt orange. The color values for burnt orange are
(RED = 204), (GREEN = 85), and (BLUE = 0).
4. Draw a quadrilateral that matches the following figure:
5. Fill the quadrilateral with the color you defined as c .
Draw a Figure
Creating the Look
Previously, you coded and modified an image of a happy face.
Now, you will combine shapes in order to create a more
complex original figure of your choosing (e.g., the robots to the
right). Unlike your picture from Writing Code, this figure will not
focus only upon a static outcome (i.e., what it looks like), but
instead includes other considerations, such as how the figure
will behave or how we might interact with it. What is the
distinction? In terms of Scratch, the distinction between a background image and a sprite
would be similar: there is much more associated with a sprite than simply its static
appearance.
For this assignment, however, we will focus solely on the appearance of your original figure.
While designing the figure, though, think about possible future behaviors as well.
Instructions
Using vector shape-drawing commands in Processing, code a sketchbook that draws a
figure. Examples of good figures might include a robot, person, cat, kangaroo, automobile,
etc.
Submission
You must submit the .pde of the Processing sketchbook that draws your original figure.
Tips
Read an example piece of code containing a drawing. Outline the steps in the
example. Think about how you can apply these skills to your own drawing.
As always, examine the Processing Reference Page. Focus on 2D Shapes,
particularly the ellipse() , line() , point() , rect() , and triangle()
functions to create a figure.
Remember that the window size is stored in width and height variables. If you’d
like to reference the exact center of the window, use something akin to
point(width/2, height/2) .
The println() function can be a lifesaver when troubleshooting your code. Use it
to output diagnostic messages as needed. Reference the Debugging with
println() Guide.
Functions Variables
ellipse() width
line() height
point()
rect()
triangle()
println()
Debugging with println()
Finding Bugs
In programming, any unwanted or unintended property of a program or piece of hardware,
especially one that causes it to malfunction, is referred to as a bug. Engineers had used
the term for faults well before computers existed, but it was popularized in 1947, when
operators of the Harvard Mark II computer, on which Grace Hopper worked, traced a
malfunction to an actual bug (specifically, a moth) caught in one of the relays. The moth
was taped into the log book with the note “First actual case of bug being found.” The name
stuck and ever since, programmers have been plagued (no pun intended) with bugs.
Finding software bugs in your programs is hard. It may be the most difficult part of
programming. Of course, the simplest way to prevent bugs is to not have any in the first
place! Planning the logic and anticipated input/output of your code before actually typing it
into the interpreter is important. This has spawned the humorous adage:
Of course, programs are written by people, and people make mistakes. Even the best
programmers are not immune to buggy code. Computer scientists have developed a
multitude of methods for finding errors in programs:
Printing Variables
Consider the following code that exchanges the x and y values of a point:
size(100, 100);
strokeWeight(5);
stroke(0);
int x = 5;
int y = 95;
point(x, y);   // draw the original point
x = y;         // attempt to exchange x and y
y = x;
point(x, y);   // draw the "exchanged" point
Clearly the variables x and y are not what we expect them to be. So, what are they?
Adding println(x); after each point in the program where we make a change will tell us
just that (similarly for y ):
size(100, 100);
strokeWeight(5);
stroke(0);
int x = 5;
int y = 95;
point(x, y);
x = y;
println(x); // for debugging
y = x;
println(y); // for debugging
point(x, y);
This prints the following to the output window — not quite the values that we expected.
Where have we seen this error before?
95
95
Printing Checkpoints
Another type of error occurs when your program crashes and you cannot easily determine
where. To figure out where the problem is, you can add println() statements to your
code at various points:
Imagine that when the above program is run, it prints the following before crashing:
Clearly the problem lies in the area ...just a little bit more stuff... . The
program execution never made it through these statements to print You reached point
C.
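The checkpoint technique can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than the program referred to above: the class name, the `checkpoint` helper, and the deliberate out-of-bounds bug are all hypothetical, and `System.out.println` stands in for Processing's `println()`.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointDemo {
    // Remembers which checkpoints were reached, in addition to printing them.
    static List<String> reached = new ArrayList<>();

    static void checkpoint(String label) {
        reached.add(label);
        System.out.println("You reached point " + label + ".");
    }

    static int run() {
        int[] values = {1, 2, 3};
        checkpoint("A");
        int total = values[0] + values[1];
        checkpoint("B");
        total = total + values[3];   // deliberate bug: valid positions are 0 to 2
        checkpoint("C");             // never reached, because the line above crashes
        return total;
    }

    public static void main(String[] args) {
        try {
            run();
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Crashed after the last checkpoint printed above.");
        }
    }
}
```

Since points A and B are printed but point C is not, the fault must lie in the statements between checkpoint B and checkpoint C.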
UNIT TOPIC:
Procedural Programming
Mouse Interaction
You will use event handlers to animate on-screen effects and respond to mouse input.
Movement
The Illusion of Movement
As alluded to in the previous assignment, we are going to add some behaviors to our
figures. Specifically, we will add movement to our figures so that they can respond to user
input.
Before animating our figures from Draw a Figure, let’s experiment with a simple example to
explore how Processing supports this effect.
Note that the example uses ellipse() to draw a circle; older versions of Processing have no circle() function. Of course, a circle is just a special form of an
ellipse, much in the same way that a square is a special form of a rectangle. This line of
code means “draw an ellipse, with the center 50 pixels over from the left and 50 pixels down
from the top, with a width and height of 80 pixels.”
All of your Processing applications to this point have followed this format. The interpreter
starts at the top of the source and executes each function or command in sequence as it
moves down the list of instructions.
However, most basic programs in Processing follow a slightly more complex model, and this
provides the framework for dynamic programs. In order to animate objects on the screen, we
need to redraw the screen quickly while re-orienting the objects as they move (much as in
film—as discussed earlier).
Processing provides this framework through the setup() and draw() functions.
Up to now, you have used Processing functions to draw ellipses and rectangles or change
colors. Processing includes these pre-written functions, and you, as the programmer,
determine when and where they are used.
In this assignment, you will do something completely different. In fact, you and Processing
will switch roles in a way. Processing will execute the setup() and draw() functions
automatically to start the program and constantly redraw the screen. You will define what
each of these functions actually does when Processing uses them.
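This role reversal can be imitated in plain Java. The sketch below is a simplified stand-in and not Processing's actual implementation: the `runFrames` method, the counters, and the class name are hypothetical. It only shows the calling pattern, in which the framework calls `setup()` once and then calls `draw()` over and over.

```java
public class FrameLoopDemo {
    static int setupCalls = 0;
    static int drawCalls = 0;

    // The programmer supplies the bodies of these two functions...
    static void setup() {
        setupCalls = setupCalls + 1;   // one-time preparation, such as size()
    }

    static void draw() {
        drawCalls = drawCalls + 1;     // work that is repeated for every frame
    }

    // ...and the framework decides when they are called.
    static void runFrames(int frames) {
        setup();
        int frame = 0;
        while (frame < frames) {
            draw();
            frame = frame + 1;
        }
    }

    public static void main(String[] args) {
        runFrames(60);   // roughly one second at the default frame rate
        System.out.println(setupCalls + " setup call, " + drawCalls + " draw calls");
    }
}
```

Anything placed in `setup()` therefore happens once, while anything placed in `draw()` happens again on every frame.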
Exercise
Type in, and execute, the following program:
void setup()
{
  size(480, 240);
}
void draw()
{
  if (mousePressed)
  {
    fill(0);
  }
  else
  {
    fill(255);
  }
  ellipse(mouseX, mouseY, 80, 80);
}
Answer:
The draw() function dictates how to redraw the screen. Processing defaults to redrawing
the screen 60 times per second. This can be altered using the frameRate() function.
Answer:
3) Why should we insert the frameRate() command into setup() rather than
draw() ?
Answer:
Swap the order of the parameters so that the ellipse() command reads ellipse(80,
80, mouseX, mouseY) .
Answer:
Insert the background(200); instruction at the very beginning in your draw() function
block to have Processing redraw the background each time it refreshes the screen.
Answer:
Additional Tutorials
The previous activity is further detailed in the official Processing
tutorials. If you’d like more information about the Processing
environment than outlined above, visit Processing Tutorial: Getting
Started.
Animate Your Figure
Bringing Your Figure to Life
Using your figure from Draw a Figure, add the following functionality:
Submission
Submit the .pde of your Processing sketchbook.
Functions Variables
setup() mousePressed
draw() mouseX
background() mouseY
pmouseX
pmouseY
UNIT TOPIC:
Procedural Programming
Keyboard Interaction
You will use event handlers to animate on-screen effects and respond to keyboard
input.
Keyboard Input
Adding More Controls
Processing is not restricted to the mouse for input. In this activity, you will
extend your figure from Animate Your Figure to incorporate keyboard button presses as well.
In order to do so, you will use a variety of pre-built variables for keyboard input similar to the
mousePressed variable.
As usual, you should read and reference the Processing Documentation in order to
understand the use of these constructs. However, here is a brief outline of how the keyboard
input system in Processing works.
Keyboard Input
If a key is pressed, the built-in variable keyPressed is set to true. This means that
a simple test, such as if (keyPressed), will allow the execution of a block of code if (and
only if) a key is currently pressed. Another variable, named key, can be used to check which
key has been pressed. In other words, if (key == 'A') will check the pressed key against the
capital letter A.
The key variable may indicate that the pressed key is not an ASCII-encoded character (for
example, an arrow key). In this case, if (key == CODED) will evaluate to true. Should this
occur, the value of the key is stored in the variable keyCode. For example, the following code
checks to see if a key has been pressed, then if it is a non-ASCII-encoded key, and finally if
it is the “up arrow” key:
if (keyPressed)
{
  if (key == CODED)
  {
    if (keyCode == UP)
    {
      // up arrow key stuff happens
    }
    else
    {
      // other stuff happens
    }
  }
}
Instructions
Modify your sketchbook from Animate Your Figure as follows:
1. Add the ability to respond to key press commands.
2. Include the use of at least five key strokes, including a mix of alphanumeric characters
and the arrow keys.
Much like the mouse button press, each of these key presses should alter the appearance or
behavior of the figure in some unique way. Remember to reference the Processing
Documentation for examples.
Submission
Submit the .pde of your Processing sketchbook.
Loops
Forever Listening for Keystrokes
The construct to the right can be used to enable a sprite to say
“Hello!” anytime the space bar is pressed.
Processing has the same types of control statements as Scratch that allow for looping. In
particular, we will examine the while loop. However, Processing also has an implicit
looping structure. You may recall that the draw() function is executed every time the
screen is redrawn—defaulting to 60 times per second. This means that any code in the
draw() function—including key press checks—will be executed continually as long as the
program is running! For this reason, it is a good idea to reserve a section in draw()
precisely for this purpose.
As usual, you should read and reference the Processing Documentation in order to
understand the use of these constructs.
1. Initialization
2. Condition
3. Action
4. Update
To see a while loop in context, let’s imagine we are drawing the buttons on a snowman’s
torso. Each of the buttons is essentially the same, so this is a perfect task for a loop.
The red arrows in the above image show the relative positions for each of the four parts of
the while loop:
Initialization: Here, we are keeping track of the number of buttons we have drawn. As
we have not drawn any before the loop begins, this variable is set to zero.
Condition: We are going to draw five buttons. So, as long as we have not (i.e., while)
drawn five buttons, we will keep executing the loop.
Action: This is where we draw a button. Notice that we do not draw five completely
identical buttons, because that would mean that we draw them on top of one another.
Instead, we vary the y-position, so that the first button begins at y-coordinate 210, and
each remaining button is drawn 20 pixels below the previous.
Update: Lastly, it is important that we indicate that we have drawn a new button each
time by updating the value of numberOfButtonsDrawn . What would happen if we
did not?
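As a minimal sketch of these four parts, the loop can be written in plain Java (the language Processing is built on) so that it runs outside a sketch. The class and method names are assumptions made for illustration; only numberOfButtonsDrawn , the starting y-coordinate of 210, and the 20-pixel spacing come from the description above. In a Processing sketch, the line that records each position would instead be a drawing call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes where each snowman button would be drawn.
class SnowmanButtons {
    static List<Integer> buttonYPositions() {
        List<Integer> ys = new ArrayList<>();
        int numberOfButtonsDrawn = 0;                  // 1. Initialization
        while (numberOfButtonsDrawn < 5) {             // 2. Condition
            // 3. Action: in Processing, this is where the button would be drawn.
            ys.add(210 + 20 * numberOfButtonsDrawn);
            numberOfButtonsDrawn = numberOfButtonsDrawn + 1;  // 4. Update
        }
        return ys;
    }

    public static void main(String[] args) {
        System.out.println(buttonYPositions());
    }
}
```

Each pass through the loop uses the current count to shift the y-position, which is why the buttons do not land on top of one another.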
Instructions
Modify your sketchbook from Keyboard Input as follows:
1. Add a while loop that adds various repeated components to your figure.
2. The loop should execute a minimum of three times.
UNIT TOPIC:
Image Manipulation
RGB Color
You will examine the structure of raster images as compositions of individual pixels.
You will explore various methods of representing color, including RGB, CMYK, and
HSV.
You will explore the various colors that can be produced by the combination of different
ratios of red, green, and blue light.
Raster Images
You will modify the color channels of pixels in an image to produce a variety of effects.
You will design algorithms for modifying the pixels in an image in prescribed ways to
create custom image filters.
Encoding Schemes
You will explore the difference between lossy and lossless encoding schemes of
several common image file formats.
Calculating Colors
Red, Green, and Blue
Bits are just bits—1s and 0s. Without instructions for encoding or abstracting the bits, they
have no meaning. Think about this: What would happen if an image encoded in RGB format
were read as BGR (blue-green-red) format instead? Assuming that all other aspects are the
same, blue and red in the image would be reversed, resulting in some alternate color
schemes for the image. Although this is a trivial example, it is important to realize that these
standards for encoding give meaning to the sea of bits floating around your computer and
the Internet.
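The RGB versus BGR thought experiment can be sketched in a few lines of plain Java. This is only an illustration: the class name, the method names, and the sample byte values are assumptions, not part of the Color Calculator tool.

```java
import java.util.Arrays;

// Hypothetical illustration: the same three stored bytes read under two channel orders.
class ChannelOrder {
    // Interprets the bytes in the order red, green, blue.
    static int[] readAsRgb(int[] bytes) {
        return new int[] { bytes[0], bytes[1], bytes[2] };
    }

    // Interprets the same bytes in the order blue, green, red,
    // then reports the result as {red, green, blue}.
    static int[] readAsBgr(int[] bytes) {
        return new int[] { bytes[2], bytes[1], bytes[0] };
    }

    public static void main(String[] args) {
        int[] stored = { 255, 128, 0 };
        System.out.println("Read as RGB: " + Arrays.toString(readAsRgb(stored)));
        System.out.println("Read as BGR: " + Arrays.toString(readAsBgr(stored)));
    }
}
```

The stored bytes never change; only the rule for interpreting them does, and the red and blue amounts trade places as a result.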
Instructions
In this experiment, you will manipulate colors by altering their binary representations with
this Color Calculator tool.
Exercise
1) Enter the following values as RGB values into the Color Calculator. What color is
generated?
A. Purple
B. Green
C. Pink
D. Brown
Answer:
2) Notice that the largest value is the Green sequence. How does that affect the color?
A. It makes the color green.
B. It makes green the most dominant tint to the color.
C. Red and blue dominate the green.
D. It makes it less green; higher numbers are less intense.
Answer:
3) Zero out the Red and Blue sequences. How does that affect the color? Select two
answers.
Answer:
4) Copy the Green value to the Blue value. Leave the Red at zero. What color best
describes the result?
A. Yellow
B. Pink
C. Orange
D. Teal
Answer:
5) Last, copy the Green value to Red. They should all be identical now. Note the effect that
has on the color. Why do you think that happens?
Answer:
Hexadecimal
Lingua Franca
We have discussed two commonly used bases for encoding numeric information—base 2
(binary) and base 10 (decimal). Each of these serves as a “native encoding”—binary for
digital devices and decimal for humans.
Binary is convenient notation for computers because digital circuits are essentially a
collection of on/off switches. Remember that the underlying concept of binary notation
is the representation of information as a series of dichotomies. The dichotomies may
be represented in a number of ways—including on/off or 0/1.
Decimal is convenient notation for modern humans. Most languages use decimal as
the basis for numeral representation. Many have proposed that base 10 is used
because people learn to count using their 10 fingers. There may be truth to this. The
Yuki tribe in California counted with the spaces between their fingers, and because of
this, they used base 8 to represent numbers!
Computers use binary by necessity because it best complements the underlying hardware.
Why don’t humans use binary? Perhaps it is because the length of binary numerals can
grow quite large quickly. The three digits of 999 in decimal are equivalent to ten digits in
binary: 1111100111 . Beyond this, the use of only two symbols may make numerals hard to
read. Which pair is more distinct: 999 and 723 , or 1111100111 and 1011010011 ?
Because of this, computer scientists use a third base to serve as a lingua franca—a
language that is adopted as a common language between speakers whose native
languages are different.
Binary Abbreviated
Base 16—hexadecimal—is commonly used as this lingua franca. Why base 16?
Binary†   Decimal   Hexadecimal
0000      0         0
0001      1         1
0010      2         2
0011      3         3
0100      4         4
0101      5         5
0110      6         6
0111      7         7
1000      8         8
1001      9         9
1010      10        A
1011      11        B
1100      12        C
1101      13        D
1110      14        E
1111      15        F
†Note that the numerals are padded to the left with zeroes so that the binary column has the
same number of digits. This does not affect the values. Why?
Using the 16 symbols of hexadecimal, you can represent every possible four-bit pattern with a
single digit! Note that the symbols A–F are used to extend the numerals representing each digit
beyond those of decimal: A is equivalent to 10, and F is equivalent to 15. Hexadecimal
serves as an abbreviated form of binary!
Hexadecimal is common when representing RGB colors. RGB uses 24 bits—eight for the
red channel, eight for the green channel, and eight for the blue channel. So, the following
color:
1. Starting at the rightmost digit of the binary number, isolate four digits.
2. Replace the four digits with their equivalent hexadecimal numeral.
3. Repeat steps 1 and 2, moving leftward†.
†If the final sequence of bits numbers fewer than four, insert 0s to the left until the sequence is
four bits long.
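The conversion steps above can be sketched in plain Java as follows. The class and method names are hypothetical, and the sketch assumes its input contains only the characters 0 and 1.

```java
// Hypothetical helper illustrating the group-by-four conversion steps.
class BinaryToHex {
    static String toHex(String bits) {
        // Pad on the left with 0s until the length is a multiple of four.
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) {
            padded.insert(0, '0');
        }
        String digits = "0123456789ABCDEF";
        StringBuilder hex = new StringBuilder();
        // Work from the rightmost group of four bits toward the left.
        for (int end = padded.length(); end > 0; end -= 4) {
            int value = Integer.parseInt(padded.substring(end - 4, end), 2);
            hex.insert(0, digits.charAt(value));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("0111"));
        System.out.println(toHex("101111"));
    }
}
```

Because each group of four bits maps to exactly one hexadecimal digit, the groups can be converted independently of one another.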
Exercises
Convert the following from binary to hexadecimal:
1) 11001010₂ =
Answer:
2) 1010101001₂ =
Answer:
3) 10101110001111010111010₂ =
Answer:
4) 111111011011010000101011₂ =
Answer:
5) 101010111100110111101111₂ =
Answer:
6) 14₁₆ =
Answer:
7) C4₁₆ =
Answer:
8) 387₁₆ =
Answer:
9) FFFF₁₆ =
Answer:
10) B7F5CA₁₆ =
Answer:
Answer:
UTeach Computer Science - http://uteachcs.org © 2016 The University of Texas at Austin
Color
Color Channels
As previously indicated, computer screens use a combination of red, green, and blue to
produce all of the displayable colors. Similarly, in Processing, manipulating colors is merely
a matter of specifying the amounts of red, green, and blue an object’s color contains.
To illustrate how scaling these color amounts (otherwise known as “channels") works, we
shall start with a simpler example—grayscale.
Technically, a black and white image comprises two colors, black and white. These could be
represented via 1 bit (e.g., 0 for black, 1 for white). However, in Processing (and life), there
is more to the spectrum than black or white. Processing allocates 8 bits for specifying
shades of gray. This means that there are 2^8, or 256, different shades of gray possible.
The boundaries for black and white in the spectrum are at opposite ends, just as they were
in the 1 bit case. This means that the instruction stroke(0); would set the pen to
produce black lines and the instruction stroke(255); would set the pen to produce white
lines.
To demonstrate how the shades of gray vary between these two values, execute the
following program:
If you would like to see a slower gradation of grayscale, you can insert a
frameRate(10); line in your setup() function.
https://www.youtube.com/embed/x6D8PAGelN8
To illustrate how each of the red, green, and blue color channels works in the same way as
grayscale, alter the fill(blackwhite); line to read fill(blackwhite, 0, 0); .
How does this change the behavior of your program? Try fill(0, blackwhite, 0);
and fill(0, 0, blackwhite); . What do they produce?
Color Variables
Processing allows you to store colors as variables as well. Once you store a color, you can
access its red, green, and blue components separately.
To illustrate this, execute the following program that “averages” two colors:
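The idea of averaging two colors channel by channel can be sketched in plain Java. This is a hedged illustration rather than the course's own listing: it assumes colors packed as 0xRRGGBB integers, and the names ColorAverage , color , red , green , blue , and average are hypothetical stand-ins modeled on Processing's color() , red() , green() , and blue() functions.

```java
// Hypothetical stand-ins for Processing's color functions, using packed 0xRRGGBB integers.
class ColorAverage {
    static int color(int r, int g, int b) { return (r << 16) | (g << 8) | b; }
    static int red(int c)   { return (c >> 16) & 0xFF; }
    static int green(int c) { return (c >> 8) & 0xFF; }
    static int blue(int c)  { return c & 0xFF; }

    // Averages each channel separately, then builds a new color from the results.
    static int average(int c1, int c2) {
        int r = (red(c1) + red(c2)) / 2;
        int g = (green(c1) + green(c2)) / 2;
        int b = (blue(c1) + blue(c2)) / 2;
        return color(r, g, b);
    }

    public static void main(String[] args) {
        int mixed = average(color(255, 0, 0), color(0, 0, 255));
        System.out.println(red(mixed) + ", " + green(mixed) + ", " + blue(mixed));
    }
}
```

Integer division rounds each channel down, so the average of 255 and 0 is 127 rather than 127.5.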
Instructions
“l” lightens them (makes the color channels have higher values—more intensity).
3. Finally, note that you can alter the colors passed as parameters to the stroke() ,
fill() , and background() functions. Use each of these at least once in your
coloring scheme.
Submission
Submit the .pde of your Processing program.
Functions
color()
stroke()
fill()
background()
UNIT TOPIC:
Image Manipulation
Raster Images
You will modify the color channels of pixels in an image to produce a variety of effects.
Raster Images
Raster Image Processing
Download the code below, and execute it. Examine the script, focusing on each of the
component parts.
1. Each of the three color channels is extracted and stored in variables named r , g ,
and b .
2. A new color value is created using these very same r , g , and b values.
3. The old color value for the pixel is replaced by the newly created color.
The resulting image appears exactly the same as the original image. But what if the new
color value created in Step 2 were to use altered values for r , g , and/or b ? How might
that affect the appearance of the resulting image?
1. Use one of the colors ( r , g , or b ) to define all three of the color channels in the
new color.
2. Keeping one of the values the same, set the other two to zero.
3. Swap the colors with one another. In other words, you might use r for the blue
channel, b for the green channel, etc.
5. Now, try multiplying one of the color values by 2.
6. As mentioned previously, Processing uses 8 bits for each color channel, giving a
usable range of 0–255 for the separate red, green, and blue channels. What do you
think would happen if a value of 0 in the old image were mapped to a value of 255 in
the new, and vice versa? Try it. Replace each of the color channels r , g , and b with
its complement (i.e., (255 - r) , (255 - g) , and (255 - b) ). What
happens? Why?
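The extract, rebuild, and replace pattern described in this section can be sketched in plain Java, here using the complement step as the alteration. The sketch assumes pixels packed as 0xRRGGBB integers in an array; the class name and the invert method are hypothetical, and the loop variable is named location only to mirror the hint given in the text.

```java
// Hypothetical sketch of a per-pixel filter over packed 0xRRGGBB values.
class InvertFilter {
    static int[] invert(int[] pixels) {
        int[] result = new int[pixels.length];
        int location = 0;
        while (location < pixels.length) {
            // 1. Extract the three color channels.
            int r = (pixels[location] >> 16) & 0xFF;
            int g = (pixels[location] >> 8) & 0xFF;
            int b = pixels[location] & 0xFF;
            // 2. Create a new color value, here from the complements.
            int newColor = ((255 - r) << 16) | ((255 - g) << 8) | (255 - b);
            // 3. Store the new color for this pixel.
            result[location] = newColor;
            location = location + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] inverted = invert(new int[] { 0x000000, 0xFFFFFF, 0x102030 });
        for (int c : inverted) {
            System.out.println(String.format("%06X", c));
        }
    }
}
```

Changing only the line that builds newColor is enough to try the other experiments in the list, such as zeroing two channels or swapping them.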
Pixel Order
Manipulating each pixel achieves interesting effects, but we can also work with the entire
array of pixels. Try these two simple examples:
1. Instead of updating the color each time a new pixel is loaded, update the color only
once every 20 times a pixel is loaded. HINT: Change the r , g , and b variables only
when location is a multiple of 20.
Example:
if (location % 20 == 0)
{
  // update the variables
}
2. The current loop begins at 0 and goes all the way through the array of pixels in
order. The last pixel location is pixels.length - 1 . Replace the line storing the
pixel color at location with:
Does the result surprise you? If it does, review the Swaps assignment from Unit 3: Data
Representation.
Eliminating Digital Noise
Noisy Images
Each of the following is a Parlante image puzzle† — an image that has been obscured with
digital noise. You must recover the original image by attenuating this noise as outlined
below.
Instructions
To access each image, right-click and copy the image address to paste into the provided
loadImage() function.
Submission
Type a description of each image (from the Iron, Copper, and West puzzles) into the
provided text box for credit.
Iron Puzzle
The iron-puzzle.png image is a puzzle. It contains an
image of something famous; however, the image has been
distorted. The famous object is in the red values; however, the
red values have all been divided by 10, so they are too small
by a factor of 10. The blue and green values are all just
meaningless random values ("noise") added to obscure the
real image. You must undo these distortions to reveal the real
image.
First, set all the blue and green values to 0 to get them out of the way. Look at the result... if
you look very carefully, you may see the real image, although it is very, very dark (way down
towards 0). Then multiply each red value by 10, scaling it back up to approximately its
proper value. What is the famous object?
Answer:
Copper Puzzle
The copper-puzzle.png image is a puzzle. It shows something famous, but the image
has been distorted. While the true image is in the blue and green values, the blue and
green values have all been divided by 20, so the values are very small. The red values are all
just random numbers, noise added on top to obscure things. Undo these distortions to reveal
the true image.
First, set the red values to 0 to get them out of the way. You may be
able to see the image very faintly at this point, but it is very
dark. Then multiply the blue and green values by 20 to get
them back approximately to their proper values. What is the
famous object?
Answer:
West Puzzle
The west-puzzle.png image is a puzzle. It shows
something famous; however, the image has been distorted.
Use if-logic along with other pixel techniques to recover the
true image. The true image is exclusively in the blue values,
so set all red and green values to 0. The hidden image is
encoded using only the blue values that are less than 16 (that
is, 0 through 15). If a blue value is less than 16, multiply it by
16 to scale it up to its proper value. Alternatively, if a blue value in the encoded image is 16
or more, it is random garbage and should be ignored (interpreted as 0). This should yield the
recovered image, but all in the blue channel. As a final fix, the image should be in the red
channel to look more correct, so change your code to move the values from the blue
channel to the red channel.
Answer:
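The per-pixel rules for the three puzzles can be sketched in plain Java as follows. This is an illustration under assumptions: pixels are packed as 0xRRGGBB integers, every name in the class is hypothetical, and the clamp to 255 is an added safeguard rather than something the puzzle descriptions require. Loading the image and looping over its pixels are left to your Processing sketch.

```java
// Hypothetical per-pixel recovery rules for the three puzzles.
class PuzzleFilters {
    static int pack(int r, int g, int b) { return (r << 16) | (g << 8) | b; }
    static int red(int c)   { return (c >> 16) & 0xFF; }
    static int green(int c) { return (c >> 8) & 0xFF; }
    static int blue(int c)  { return c & 0xFF; }
    static int clamp(int v) { return Math.min(255, v); }

    // Iron: discard green and blue noise, scale red back up by 10.
    static int ironPixel(int c) {
        return pack(clamp(red(c) * 10), 0, 0);
    }

    // Copper: discard red noise, scale green and blue back up by 20.
    static int copperPixel(int c) {
        return pack(0, clamp(green(c) * 20), clamp(blue(c) * 20));
    }

    // West: keep only blue values below 16, scale by 16, and move them to red.
    static int westPixel(int c) {
        int b = blue(c);
        int recovered = (b < 16) ? b * 16 : 0;
        return pack(recovered, 0, 0);
    }

    public static void main(String[] args) {
        System.out.println(String.format("%06X", ironPixel(pack(12, 200, 37))));
        System.out.println(String.format("%06X", copperPixel(pack(99, 5, 10))));
        System.out.println(String.format("%06X", westPixel(pack(90, 14, 7))));
    }
}
```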
UNIT TOPIC:
Image Manipulation
Raster Image Manipulation
You will design algorithms for modifying the pixels in an image in prescribed ways to
create custom image filters.
Raster Image Manipulation
Assignment
Time to get your hands dirty and begin manipulating bits!
Instructions
Submission
You must submit the following items:
UNIT TOPIC:
Image Manipulation
Encoding Schemes
You will explore the difference between lossy and lossless encoding schemes of
several common image file formats.
Digital Image File Extensions
Lossy vs. Lossless Formats
Your task is to complete a table that organizes the following information about various digital
image file types. Helpful resources to begin with are Image Formats: What’s the Difference
Between JPG, GIF, PNG? and Digital Image File Types Explained.
Format   Lossy or Lossless?   Advantages   Disadvantages   Ideal for...
JPG
GIF
SVG
PNG
BMP
TIFF
Encoding Schemes
Abstraction in Imagery
Have you ever heard the colloquialism, “There’s more than one way to skin a cat"? Taken
literally, the sentence is somewhat morbid, but figuratively, it means that there is often more
than one way to accomplish a task, whether that be skinning a cat, coding a program, or
representing an image digitally.
Look at the image below, and consider how you would describe it and how a computer
would “describe” (i.e., encode) it:
repeated patterns
names of objects
color (RGB, wavelength, intensity)
size (length, width, height)
shape
location (coordinates)
brightness (greyscale, darkness, power allocation)
texture
These are all characteristics that we notice visually, and they are therefore factors that we
need to consider when we represent images digitally, so that the computer can accurately
interpret the abstraction and display the image as we intend.
Each of these factors determines which encoding schemes are best suited to represent an
image. For example, let’s consider a black & white diagram of an arrow, like this one:
In this case, the color is not as important as the shape, location, and direction of the arrow,
which provide meaning to the diagram. A vector encoding would be much more suitable for
this image than a raster representation, because shape, location, and direction information
can be expressed more concisely and require less data.
Certain encoding schemes are better than others for representing shape, location, and
direction. On the other hand, other encoding schemes are more effective for encoding an
array of colors than they are for encoding shape, location, or direction.
When you choose the encoding scheme for a digital image representation, consider the
most important aspects of the image and determine what type of image formats would work
best for you to realize them.
Just as text can be represented with binary by conventional mappings, so can media
(images, video, audio, etc.). Unlike the fairly standardized encoding of text, images are
represented in many different ways.
https://www.youtube.com/embed/p7Mm-hOTOiM
Image Types
Imagine that an image file is opened, and the first few bits
read:
011101111100110100011110
If the image file is a raster image, these bits could represent the color of a single pixel:
Red Green Blue
01110111 11001101 00011110
If the image file is a vector image, these bits represent instructions for drawing shapes. This
is much like the Processing programs that contain instructions to draw on the screen:
rect(300,200,150,50);
Interpreted under a vector encoding, the same bits might instead produce a rectangle:
Here, the same bits that determined color before are abstracted to draw a
rectangle with a 119 pixel width, 205 pixel length, and a line thickness of 30
pixels.
Note how simple it would be to double the thickness of the line. Simply
multiplying 00011110 by 2 produces 00111100 . If this were a raster
format, each of the corresponding pixels of the new larger rectangle would need redefining.
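The two readings of the same 24 bits can be sketched in plain Java. The decoding into three 8-bit numbers follows the text; the class name and method name are hypothetical, and the "vector" labels simply echo the example above rather than any real file format.

```java
// Hypothetical sketch: one bit string, two interpretations.
class BitInterpretation {
    // Splits a 24-bit string into three 8-bit unsigned values.
    static int[] splitIntoBytes(String bits) {
        int[] values = new int[3];
        for (int i = 0; i < 3; i++) {
            values[i] = Integer.parseInt(bits.substring(8 * i, 8 * i + 8), 2);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] v = splitIntoBytes("011101111100110100011110");
        System.out.println("Raster reading: red=" + v[0] + " green=" + v[1] + " blue=" + v[2]);
        System.out.println("Vector reading: width=" + v[0] + " length=" + v[1] + " thickness=" + v[2]);
        // Doubling the line thickness changes a single number in the vector reading.
        System.out.println("Doubled thickness: " + Integer.toBinaryString(v[2] * 2));
    }
}
```

The numbers are identical in both readings; only the labels attached to them, and therefore what gets drawn, differ.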
Abstraction rears its beautiful head once again. Bits mean nothing
inherently; their meaning depends upon the instructions to decode
them. You could conceivably view ASCII text as a JPG. It would
probably be ugly noise – or perhaps exhibit a distinct pattern?
Try the reverse—open a JPG in Notepad or TextEdit.
Open a file in “hexdump” to see its hexadecimal
representation (BASE16).
If you are using a Unix-based operating system (e.g., Linux or
Mac OS), try using the command “xxd -b” on a file to see its
contents in binary bitstrings of 1s and 0s.
Different file formats (abstractions) lend themselves well to specific purposes. For example:
Picture Logic
An Encoding Scheme Puzzle
File types and other metadata indicate how bits should be interpreted to recreate an image.
An encoding scheme describes the manner in which a file organizes its constituent bits.
There are a variety of encoding schemes that computer scientists utilize for different
purposes. Watch this video, which explains image representation in terms of a run-length
encoding scheme:
https://www.youtube.com/embed/uaV2RuAJTjQ
"For example, consider a screen containing plain black text on a solid white background.
There will be many long runs of white pixels in the blank space, and many short runs of
black pixels within the text. Let us take a hypothetical single line, with B representing a black
pixel and w representing white:
wwwwwBwwwwwBBBwwwwwwwwwwBwwwwwwwwwwww
If we apply the run-length encoding scheme to the above hypothetical line, we may interpret
this as the following:
5w 1B 5w 3B 10w 1B 12w
The run-length code represents the original 37 characters in only 16 (22 if you include
spaces for clarity).
Of course, the actual format used for the storage of images is generally binary rather than
ASCII characters like this, but the principle remains the same. Even binary data files can be
compressed with this method; file format specifications often dictate repeated bytes in files
as padding space.” (Run Length Encoding, Princeton University)
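The run-length scheme in the quoted passage can be sketched in plain Java as follows. The class and method names are hypothetical, and the output omits the spaces that the passage adds for clarity.

```java
// Hypothetical sketch of run-length encoding for a line of pixel symbols.
class RunLength {
    static String encode(String line) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < line.length()) {
            char symbol = line.charAt(i);
            int run = 1;
            // Count how many times this symbol repeats consecutively.
            while (i + run < line.length() && line.charAt(i + run) == symbol) {
                run++;
            }
            out.append(run).append(symbol);
            i += run;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("wwwwwBwwwwwBBBwwwwwwwwwwBwwwwwwwwwwww"));
    }
}
```

Long runs shrink dramatically, while a line that alternates symbols on every pixel would actually grow, since each single symbol gains a count in front of it.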
How are the encodings the same or different from run-length encodings?
Why wouldn’t a game using just run-length encodings be challenging?
Why do you think fax machines use RLE?
UNIT TOPIC:
Image Manipulation
Manipulating Digital Images
You will design algorithms for modifying the pixels in an image in prescribed ways to
create custom image filters.
Filters
Warhol Grids
From 1962–1964, Andy Warhol made 30 silkscreen printings of Marilyn
Monroe, addressing themes like death and celebrity (Tate Museum of
Modern Art, London). These colorful screenprints are emblematic of the
Pop Art movement that Warhol is commonly associated with. Each
screenprint was created from a simple black and white publicity photo of
Marilyn used for the 1953 film Niagara. Warhol recalls the process:
These transformations were analog, and occurred in physical space. Today, we can
programmatically alter images in similar ways. Visit this WebExhibit about Andy Warhol’s
Marilyn Prints and experiment with programmatically altering the publicity photo to create
your own Marilyns.
Assignment
Although you can programmatically alter pixels in any way that
you wish, some operations are common enough that
Processing has pre-built functions that use the graphics
hardware in your computer to quickly carry out these image
processing tasks. Consult the Processing documentation for
filter() to see examples of each of these common built-
in tasks.
effects to create a 2 × 2 Warhol Grid.
Example artifact:
Submission
Submit the Processing sketch you develop, as well as a reference image.
Functions
filter()
text()
save()
UNIT TOPIC:
Audio Manipulation
Digital Audio
You will analyze the differences between analog and digital sound.
You will explore the roles that sampling rate and bit depth play in determining the
quality of digitized sound.
Audio Processing
Audio Compression
You will explore the methods and effects of compression algorithms in reducing the
amount of data needed to represent an audio sample.
Digitizing Audio
The Nature of Sound
Natural sound is simply the movement of molecules through a medium like air. Nature is
continuous, whereas digitization is discrete. When the continuous sound wave is digitized, it
is broken into discrete parts. Digital sounds are merely numeric (binary) instructions for
replicating the sound production, because all music enters your ears in analog, soundwave
form.
Audio Sampling
How can we convert an analog audio wave to digitized numeric values? We measure the
sound wave at fixed intervals of time, a process called “sampling.” In essence, audio sampling
involves measuring the audio wave so frequently that the samples are taken almost continuously.
Watch the video to learn more about the digitization process and audio sampling:
https://www.youtube.com/embed/Ss2DYPBrBhg
Sampling rates and bits per sample are very important to audio digitization in the following
ways:
1. The higher the sampling rate (the number of samples taken per second from the continuous
sound wave), the better the quality of the digitization.
2. The greater the bit depth used to encode each sample, the better the quality of the
digitization.
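The roles of sampling rate and bit depth can be sketched in plain Java. This is a simplified model under stated assumptions: the "analog" wave is a pure sine tone, one second of it is sampled, and each sample is rounded to the nearest of 2^bitDepth evenly spaced levels. The class name, method name, and parameters are all hypothetical.

```java
// Hypothetical sketch: sampling one second of a sine wave and quantizing each sample.
class Sampler {
    static int[] sampleSine(double frequencyHz, int sampleRate, int bitDepth) {
        int levels = 1 << bitDepth;              // number of discrete values available
        int[] samples = new int[sampleRate];     // one second of audio
        for (int n = 0; n < sampleRate; n++) {
            double t = (double) n / sampleRate;  // time of this sample, in seconds
            double amplitude = Math.sin(2 * Math.PI * frequencyHz * t);  // between -1 and 1
            // Map the range -1..1 onto 0..levels-1 and round to the nearest level.
            samples[n] = (int) Math.round((amplitude + 1) / 2 * (levels - 1));
        }
        return samples;
    }

    public static void main(String[] args) {
        int[] coarse = sampleSine(1.0, 8, 2);
        int[] fine = sampleSine(1.0, 8, 8);
        System.out.println(java.util.Arrays.toString(coarse));
        System.out.println(java.util.Arrays.toString(fine));
    }
}
```

Raising sampleRate adds more measurements per second, and raising bitDepth lets each measurement land closer to the true height of the wave.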
Audio Generation
Theremin Demonstration
Electronics have been used to create music for nearly a century. One of the first electronic
instruments is the theremin, an instrument that is played without ever actually touching it.
https://www.youtube.com/embed/MJACNHHuGp0
Because the theremin works by using hand positions to directly determine the frequency and
amplitude of a sound wave, it is relatively easy to simulate one in software. The code below
uses the mouse pointer in lieu of the thereminist’s hands.
All future projects that require the Sound library will now be able to use
this library.
1. X maps to frequency.
2. Y maps to amplitude.
3. You can click at a particular point to fix the harmonic.
After you have thoroughly annoyed everyone in the vicinity (that is, made beautiful music), be
prepared to discuss the following before moving on to your assignment.
What are the parameters required to generate a sound? Note that these are the
qualities/quantities that, when changed, produce a corresponding change in output.
Are these parameters sufficient to generate any type of sound? What other parameters
might be needed to generate notes that sound like a guitar or a flute?
How do these parameters approximate physical reality? How is this similar to the
parameters used to generate visual artifacts, like color, position, or brightness?
Assignment
Your assignment is to program a virtual piano in Processing—il Processiano! Download the
starter code for il Processiano.
Extend the code to generate audio with key presses. The starter code generates and
displays notes when the corresponding keys are pressed ( c , d , e ), but your program
must meet the following specifications:
1. Extend the code to work with the entire range c , d , e , f , g , a , and b . The
following table will help you map notes to exact frequencies.
2. Extend the code to work across two octaves. When you press the SHIFT and C
keys (AKA capital “C”), the Processiano should play a C note one octave higher.
HINT: Physics makes this easy. Each note is twice the frequency of the same note in the
previous octave. In other words, the C one octave above middle C (261.626 Hz) is 2 ×
261.626, or roughly 523.252 Hz.
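The octave hint can be sketched in plain Java. Only the middle C value of 261.626 Hz and the doubling rule come from the text. The semitonesAbove method is an added assumption: it uses standard equal-temperament tuning, in which each semitone multiplies the frequency by the twelfth root of 2, and the frequency table supplied with the assignment should take precedence over it. All names are hypothetical.

```java
// Hypothetical sketch of note frequency arithmetic.
class NoteFrequencies {
    static final double MIDDLE_C = 261.626;  // Hz, from the text

    // One octave up doubles the frequency.
    static double octaveUp(double frequency) {
        return 2 * frequency;
    }

    // Assumption: equal temperament, where each semitone multiplies by 2^(1/12).
    static double semitonesAbove(double base, int semitones) {
        return base * Math.pow(2, semitones / 12.0);
    }

    public static void main(String[] args) {
        System.out.println("C one octave up: " + octaveUp(MIDDLE_C));
        // Semitone offsets of the natural notes above C: D=2, E=4, F=5, G=7, A=9, B=11.
        System.out.println("A above middle C: " + semitonesAbove(MIDDLE_C, 9));
    }
}
```

Twelve semitones make one octave, so semitonesAbove(base, 12) agrees with octaveUp(base) .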
Submission
Submit the Processiano.pde file of your Processiano sketchbook.
Extension
Processiano uses only natural (♮) notes—i.e., no sharps (♯) or flats (♭). Extend the program
to allow the user to designate a sharp or flat by using the CTRL or ALT keys along with
the designated note key. Note that not all notes have corresponding sharps and flats! This is
why some pairs of white keys on a piano are separated by black keys and some are not
(e.g., there is no such note as E♯ or F♭).
UNIT TOPIC:
Audio Manipulation
Audio Processing
Post-Processing Audio
Auto-Tune
If you’ve listened to any pop music in the past few years, you’ve most likely heard a number
of songs that use the popular post-processing effect called Auto-Tune. Here is a
demonstration of how it works:
https://www.youtube.com/embed/9OUgXFZ_WeY
Auto-Tune takes an existing audio file and transforms the bits to make the pitch “perfect.”
This is possible because digital audio is represented with bits, which can be easily
manipulated mathematically using algorithms.
The alteration of a sound after it has been recorded is called post-processing. Most audio
effects (e.g., echo, reverb, phase shifting, backmasking) are added after the recording.
Digital representations have alleviated many challenges relating to
post-processing of audio, which has drastically changed how we view the creation of music.
Pretend that we digitized the sound of a hand clap, represented as the following digital
sound wave.
How would we represent the sound when it echoes?
Since an echo is merely a sound that has bounced off a surface and is re-heard, the
sound repeats. Therefore, to simulate an echo, the digitized sound wave should
repeat:
Note that this would result in the exact same sound repeated, but we
could overlap the sounds to make the echo effect more pronounced.
Let’s think about this in another way, using the following “values” that represent a sound
wave:
3-1-2-4-4-5
How might an echo look in terms of the values encoded by the digital representation?
Other post-processing effects, such as noise removal and auto-tuning, are more complex. If you
take all the values in a digitally encoded audio file and reverse their order, the result is
similar to backmasking. The variety of post-processing strategies is too great to list here and
is ever-evolving, but the great thing is you don’t need fancy production software or tools to
post-process audio when you have computational thinking and programming skills.
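Two of the effects discussed above, echo and reversal, can be sketched in plain Java on the example values 3-1-2-4-4-5. This is one possible reading, not a definitive answer to the question posed earlier: the class and method names are hypothetical, and the delay and decay parameters of the overlapping echo are assumptions introduced for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of simple post-processing on an array of sample values.
class EchoEffects {
    // Plays the sound, then plays it again: the values simply repeat.
    static double[] repeat(double[] wave) {
        double[] out = Arrays.copyOf(wave, wave.length * 2);
        System.arraycopy(wave, 0, out, wave.length, wave.length);
        return out;
    }

    // Adds a quieter copy that starts `delay` samples after the original begins.
    static double[] overlap(double[] wave, int delay, double decay) {
        double[] out = Arrays.copyOf(wave, wave.length + delay);
        for (int i = 0; i < wave.length; i++) {
            out[i + delay] += wave[i] * decay;
        }
        return out;
    }

    // Reverses the order of the values, similar to backmasking.
    static double[] reverse(double[] wave) {
        double[] out = new double[wave.length];
        for (int i = 0; i < wave.length; i++) {
            out[i] = wave[wave.length - 1 - i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] wave = { 3, 1, 2, 4, 4, 5 };
        System.out.println(Arrays.toString(repeat(wave)));
        System.out.println(Arrays.toString(overlap(wave, 2, 0.5)));
        System.out.println(Arrays.toString(reverse(wave)));
    }
}
```

Each effect is just arithmetic or reordering on the stored numbers, which is what makes digital post-processing so accessible.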
Audio Processing
Digital Audio
Just as digital photography captures and quantizes light and
color, digital audio recording may capture sound volume and
frequencies. Once these characteristics are represented as
values and stored numerically, they may be manipulated. Just
as mathematical operations can alter the colors or brightness
in an image, they may alter the frequencies and amplitudes of
stored sound waves.
However, the techniques used in audio processing are more difficult than those in image
processing, because timing is an important factor. You will spend some time doing some
simple manipulation of audio files in Processing.
Download the audio processing starter code and three sample audio files (compressed into
a single .zip ) and execute the sketchbook.
All future projects that require the Minim library will now be able to use
this library.
Note: This is a different audio library than the one previously used
for Audio Generation.
Examine the code to see how it functions. In particular, note the following code segment:
  while (j < newSamp.length)
  {
    newSamp[j] = samp[j];
    j = j + 1;
  }
  oldSamp = samp.clone();
  arrayCopy(newSamp, samp);
}
This is where the majority of the work takes place. The process() function reads into
memory an array of the samples and processes each of them one at a time.
In the default code, no post-processing is done. Each output sample is simply assigned the
unaltered input sample: newSamp[j] = samp[j]; Changing this one line can produce a
variety of effects.
Instructions
Perform each of the following tasks by changing the right-hand side of that assignment, and
note how the audio is altered as a result.
samp[j] × 2
samp[j] ÷ 2
samp[j] − itself
samp[j] + 1.0
samp[j] + random(0.1)
random(1.0)
Note that you are operating on each sample. This is akin to operating on each pixel in an
image. You can also operate on the order of samples as well.
Note that some of the effects will be more pronounced on different types of audio, so
you should try each of the tasks above on each of the three audio files.
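To experiment with these operations away from the audio files, the per-sample loop can be sketched in a few lines. This is a Python stand-in for the Processing loop, not the starter code itself: the helper name process, the sample values, and the assumption that samples lie between -1.0 and 1.0 are all made up for illustration, and random.uniform stands in for Processing's random().

```python
import random


def process(samples, effect):
    """Apply effect to every sample and return the new list.

    Mirrors the loop in the starter code: newSamp[j] = effect(samp[j]).
    """
    new_samples = []
    j = 0
    while j < len(samples):
        new_samples.append(effect(samples[j]))
        j = j + 1
    return new_samples


# Made-up samples, assumed here to lie between -1.0 and 1.0.
samp = [0.0, 0.5, -0.5, 0.25]

unchanged = process(samp, lambda s: s)        # the default: a straight copy
times_two = process(samp, lambda s: s * 2)    # samp[j] × 2
halved = process(samp, lambda s: s / 2)       # samp[j] ÷ 2
minus_self = process(samp, lambda s: s - s)   # samp[j] − itself
jittered = process(samp, lambda s: s + random.uniform(0, 0.1))  # samp[j] + random(0.1)

print(times_two, halved, minus_self)
```

Printing the lists shows what happens to the numbers; what those numbers sound like is still best judged by running each change in the sketch itself.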
Submission
Submit a reflection on the assignment. Include each of the following items:
A description of what each of the nine bulleted tasks does,
an explanation of how each task works, and
an explanation of how mathematics affects the wave.
UNIT TOPIC:
Audio Manipulation
Audio Compression
You will explore the methods and effects of compression algorithms in reducing the
amount of data needed to represent an audio sample.
X Marks the Spot
Buried Treasure
As you approach the docks, you see Captain Jack clutching a tattered piece
of parchment.
“Ahoy thar! I did bury our treasure but a wee bit yonder,” he
says, pointing over his shoulder with a crooked thumb.
Cracking a toothless grin, he adds, “Bein’ first mate, you do serve
as me backup in case me directions get lost.”
He hands you the parchment. Carefully straightening it, you see a list of letters crossing the
page three times over, nearly covering it.
Carefully, his brow furrowed in concentration, Jack begins ripping the lower left corner of his
page, separating a small piece of parchment from the rest.
Can you compress all of this information
into a smaller form?
Instructions
Produce a compressed version of the directions so that a stranger could follow them with
minimal instructions and find the location of the buried treasure. Your minimal instructions
should be confined to a brief three or four sentence description—one that could be given
completely orally and easily remembered.
Create a copy of this spreadsheet to track and compare your solutions. Submit your work
when finished.
Compression Algorithms
The Importance of Efficiency
Mathematicians and computer scientists do not like to waste their time repeating
themselves. They want to save time, space, and complexity wherever possible. In fact, many
of the most common mathematical notations can be considered compression algorithms.
For example, consider repeated multiplication, which is rife with redundancy:
Efficiency Expression
Original 5 × 5 × 5 × 5 × 5 × 5 × 5
Exponential notation was created to alleviate this degree of redundancy and improve
legibility:
Efficiency Expression
Original 5 × 5 × 5 × 5 × 5 × 5 × 5
Compressed 5⁷
Multiplication plays the same role for repeated addition:
Efficiency Expression
Original 5 + 5 + 5 + 5 + 5 + 5 + 5
Compressed 5 × 7
Watch the video below to learn more about how compression works in terms of
mathematical operations, which as you know, are critical to our ability to manipulate digital
media:
https://www.youtube.com/embed/cbBFnlwK98M
Let’s take a look at a concrete example of how compression works in computer science and
how it may affect you.
https://www.youtube.com/embed/MVZNTV-cAMg
3. The fact that algorithms are ordered processes implies reversibility, so that the
compressed text can be decompressed (not guaranteed—only for lossless).
A good measure for comparing the effectiveness of compression algorithms is the ratio
of length from original to compressed (perhaps denoted as a percentage). Note: a
better compression ratio over one string of text does NOT guarantee that one
algorithm is more effective than another. Some forms of compression may be tuned to
a specific type of data (like text, music, images, etc.).
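The point about compression ratios can be tried out with a small sketch. Run-length encoding is used here only as a convenient example of a compression algorithm; it is not necessarily one of the algorithms covered in the videos, and the function names and the two strings of compass letters are made up for illustration. The encoding shown assumes the text itself contains no digits.

```python
def run_length_encode(text):
    """Replace each run of a repeated character with a count and the character.

    For example, "NNNE" becomes "3N1E".
    """
    if not text:
        return ""
    pieces = []
    run_char = text[0]
    run_length = 1
    for ch in text[1:]:
        if ch == run_char:
            run_length += 1
        else:
            pieces.append(f"{run_length}{run_char}")
            run_char = ch
            run_length = 1
    pieces.append(f"{run_length}{run_char}")
    return "".join(pieces)


def compression_ratio(original, compressed):
    """Length of the original divided by the length of the compressed form."""
    return len(original) / len(compressed)


repetitive = "NNNNNNNNEEEEWWWW"   # long runs of the same letter
varied = "NEWSNEWS"               # no repeated neighbours at all

for text in (repetitive, varied):
    packed = run_length_encode(text)
    print(text, "->", packed, "ratio:", compression_ratio(text, packed))
```

The same algorithm shrinks the first string and lengthens the second, which is why a single test string says little about how effective an algorithm is in general.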
Watch the video below to see how two compression algorithms vary and to identify the
advantages and disadvantages of these compression algorithm strategies.
https://www.youtube.com/embed/xyKA4arxQ5I
Since no single compression algorithm is best, it’s important that one can analyze and
evaluate compression algorithms to identify the optimal strategy for given data in a given
context.
For example, JPG images and MP3 files are already compressed media files. There are firm
limits to how much information can be compressed without losing too much, and we use the
terms lossless and lossy to describe different compression algorithms that adhere to these
limits and those that don’t.
Lossless means “without loss,” just like hopeless means “without hope.” Lossless
compression means that compression has occurred with zero loss of information. On the
other hand, lossy compression indicates that there has been some data lost through
compression. Different lossy compression algorithms can result in different amounts of lost
data. Let’s consider an example of lossy compression.
The most common lossy compression occurs when you “rip” an audio CD and convert the
tracks to MP3s. The images below represent the digital soundwave that exists before and
after various stages of compression from raw audio to MP3. As you scroll down and the
audio becomes more compressed, what differences can you notice?
Perhaps it’s easier to see quality loss in lossy compression with compressed images:
Though these examples are indicative of lossy compression, not all compression is lossy. As
mentioned previously, lossless compression involves no data loss. With lossless
compression, you can convert back to the original at any time (although doing so may be
time-intensive). For example, FLAC audio loses no quality relative to the original CD audio.
Lossy compression algorithms often achieve greater reductions in file size (that is, better
compression ratios) and are designed to be “good enough” approximations. A good example
might be the use of a thumbnail by a website as a link to a larger image. The compressed
thumbnail version reduces the amount of data necessary to load the page, but if users would
like to see the original image, they can follow a link to that version.
However, lossy compression can never be “undone,” because the lossy compression
algorithms remove information that isn’t necessary for representation, but can never be
reconstructed once lost. In other words, you cannot go from a lossy-compressed image back
to the original image.
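The difference between the two kinds of compression can be sketched with a short list of numbers. This is a simplified illustration, not how MP3, JPG, or FLAC actually work: the lossy step simply snaps each value to a coarse grid, the lossless step stores runs of equal values, and every name and number below is made up for the example.

```python
def lossy_compress(samples, step=0.25):
    """Snap each sample to the nearest multiple of step, discarding detail."""
    return [round(s / step) * step for s in samples]


def lossless_compress(samples):
    """Store [value, run length] pairs; no information is discarded."""
    runs = []
    for s in samples:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return runs


def lossless_decompress(runs):
    """Rebuild the exact original list from [value, run length] pairs."""
    samples = []
    for value, count in runs:
        samples.extend([value] * count)
    return samples


original = [0.1, 0.1, 0.1, 0.4, 0.4, 0.9]

restored = lossless_decompress(lossless_compress(original))
approximate = lossy_compress(original)

print(restored == original)   # the lossless round trip returns every value
print(approximate)            # the snapped values no longer match the originals
```

Nothing in the snapped list records how far each value was moved, so there is no function that could turn approximate back into original.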
BIG PICTURE:
Ethics of Digital Manipulation
Highlights
You will explore the positive and negative consequences of digitally altering images.
You will discuss the ethics of digitally manipulating images, especially in the context of
journalism.
You will discuss the issues related to intellectual property.
You will explore the limitations and rights associated with a number of common
licenses, including Creative Commons.
Original or Manipulated
The Photoshop Phenomenon
In today’s world, we consume media of all sorts
on a daily basis. Multimedia surrounds us, and
often influences our thinking, decision making,
and actions. However, the digital nature of these
artifacts makes it possible for them to be easily
manipulated, edited, “Photoshopped,” remixed,
remashed, etc., so it can be difficult to distinguish between what is “real” and what is “fake.”
Take this challenge to see how well you can discern between the “real” and the “fake.” You
will be shown a photo chosen from a set of manipulated photo pairs (in other words, one
original and one modified version exist for each photo), and you must decide whether the
photo has been digitally manipulated.
After you’re done, your teacher will present each pair of original and manipulated photos,
and you will discuss possible motivations for manipulating each of the photos. Note that
each of these photos comes from actual instances in the news media.
Answer:
Answer:
Answer:
Answer:
Answer:
6) Is this image the original or was it manipulated? Explain your reasoning.
Answer:
Original or Altered (Answers)
Before and After Comparisons
1) When TIME magazine put a mugshot of OJ Simpson on its cover in June 1994 following
the brutal double slaying of his ex-wife and her friend, the magazine was widely criticized for
manipulating the color tones and shadows to make Simpson appear more ominous and to
incite racial sentiments. TIME magazine defended its choice as “artistic interpretation.” In
contrast, Newsweek magazine published an undoctored version of the same mugshot on its
cover that same week, making TIME’s manipulation of the image that much more
apparent.
Original Manipulated
2) This is one of at least 79 altered photos that veteran news photographer Allan Detrich of
the Toledo Blade was found to have modified prior to their publication in the paper. In this
particular photo, an image of a basketball was digitally inserted into the photograph,
presumably to balance the composition and add excitement to the play being captured by
the shot. Despite his reputation as a Pulitzer Prize finalist, the criticism he received for
editing the photos beyond the newspaper’s accepted standards led to Detrich’s resignation
and a public apology from the paper.
Original Manipulated
3) In 2008, the Liberty Times published a photo of a visit from a Taiwanese delegation to the
Vatican with Pope Benedict. The photo that ran in the paper, however, had been significantly
modified to remove the publisher of a rival publication from the center of the image. The
paper and the reporter who edited the photo came under criticism that their actions “violated
journalistic ethics.”
Original Manipulated
4) A pair of photos taken in 2003 near Basra, Iraq, were combined to make a single, more
dramatic image. The two photos were from a series of images shot within moments of each
other by photographer Brian Walski that depicted scenes of British soldiers and Iraqi civilians
in the war-ravaged region. Before transmitting the day’s photos back to his newspaper in
Los Angeles, Walski chose two of the photos—neither of which was exceptional on its own
—and composited the best parts of each photo into a single image. In one photo, an armed
soldier was caught in a striking pose with his arm outstretched while an Iraqi man holding a
child was somewhat visible in the background, looking away. In the second photo, the man
with the child was more prominent and looking at the soldier, but the soldier’s posture was
less dramatic. By compositing the two photos from the same actual event, Walski
manufactured a moment in time that never actually existed. Upon the discovery of his
selective altering of events, the LA Times fired Walski, stating that, “What Brian did is totally
unacceptable and he violated our trust with our readers.”
Original Manipulated
5) In 2005, USA Today briefly ran a photo of Secretary of State Condoleezza Rice in which
her eyes had been unnaturally brightened, resulting in a “menacing, demon-eyed stare.”
After the manipulated photo came to light, USA Today pulled the photo from its website and
offered the following explanation:
Original Manipulated
of what was acceptable. Christensen has disputed the judges’ decision, claiming that the
alterations he made to the color and saturation levels (i.e., “burning,” “dodging,” and “color
correction”) fall within the letter of the contest rules, which include the following restriction:
The judges disagreed, saying that Christensen’s photographs “went too far” and that, “The
colors almost look like they have been sprayed onto the pictures.”
Original Manipulated
Ethics of Digital Manipulation
Debate
In this activity, your class will hold a debate in
which one side argues that “digital
manipulation in the media” is essentially
unethical while the other side argues that it is
ethical or irrelevant.
2. Watch the video “Everything is a Remix: Part One” (7:18) to learn more about re-
mixing and re-mashing.
3. Prepare a team argument for your assigned side in the debate. You will have five
minutes to prepare. You should reference the following:
the video,
the Original or Manipulated exercise,
evidence from public record or personal experience, and
concepts covered in the course.
4. Debate the ethics of digital manipulation in the media. Your team will debate according
to the debate protocol of your teacher’s choosing. Your teacher will be the judge and
decide which team puts forth the better argument.
Make sure you address the issues, not the other team!
https://player.vimeo.com/video/14912890
Creative Commons
Intellectual Property Rights
Now that the Internet is media rich (with digital images, audio,
and video on almost every page), concerns about ownership
of these digital properties have increased.
For example:
For the most part, there are laws in place that protect intellectual property, including digital
intellectual property. The most familiar set of guidelines used to understand the licensing of
digital intellectual property is that of Creative Commons.
Watch the video below that outlines the most important concepts behind Creative Commons,
and consider the appropriateness of each license in different contexts:
https://www.youtube.com/embed/AeTlXtEOplA
Image Filter Project: Rubric Check
Instructions
This activity provides you and a partner group time to provide one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Unit 5: Big Data
UNIT 5
Big Data
UNIT PROJECT:
TEDxKinda
Highlights
You will collaborate in groups to analyze public data sets and extract insightful
information and new knowledge using a number of big data analysis techniques and
tools.
You will evaluate and justify the appropriateness of your chosen data set(s).
You will construct informative and aesthetically pleasing data visualizations.
You will write a script and prepare speaker notes for a formal presentation of your
findings.
You will cite all online and print sources used in your research and presentation
preparation.
You will deliver a TED-style presentation discussing your data analysis and findings
using appropriate terminology.
TEDxKinda Project
“The goal is to turn data into information, and information into
insight.” – Carly Fiorina, former chief executive of HP
https://www.youtube.com/embed/I6APyXdHBcg
Big data is all around us. What can it do for you? us? the world? Tara’s use of big data to win
a political campaign is no longer a novel strategy:
So what was the Obama camp able to do with this big data trove? One
senior official from the campaign told TIME that the group “ran the
election 66,000 times every night” based on the day’s data and allocated
resources based on likely outcomes. In addition, the demographic
information they collected and scored against other factors allowed them
to find more targeted ways to buy television advertising to reach their
“microtargeted” voters.
Every day, big data analysis leads to innovative ideas, applications, and knowledge. Many of
these innovations and discoveries are shared via dynamic TED talk presentations. TED is a
nonprofit organization devoted to “Ideas Worth Spreading.” Many of TED’s best presenters
utilize big data analysis techniques, too. For example, here is a presentation called “Your
Phone Company is Watching,” by Malte Spitz.
https://www.youtube.com/embed/Gv7Y0W0xmYQ
Assignment
Each group in your class will choose its own meaningful topic,
with the following caveat:
Data sets will not be available on every topic, so you may first want to identify potential data
sets before you narrow down your ideas to a particular topic or theme. There is a wide
variety of public data sets and tools that you may choose from in order to complete your
presentation, but you will have to be creative, flexible, focused, and diligent in order to
synthesize all the components into a composed, powerful presentation. Feel free to use and
edit the Big Data Sets and Tools for Big Data Analysis resources.
include at least two of the following data analysis strategies:
cluster analysis,
classification,
linear regression,
association rule mining, and/or
outlier, anomaly, and change detection.
utilize automated summarization,
provide insightful analysis and evaluation of the topic and related big data sets, and
be professional and engaging to the audience.
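Of the analysis strategies listed above, linear regression is the easiest to show in a few lines. The sketch below is a minimal least-squares fit in plain Python, assuming a small made-up data set of hours studied and quiz scores; a real project would more likely rely on a spreadsheet or one of the course's analysis tools, and the function name linear_regression is hypothetical.

```python
def linear_regression(xs, ys):
    """Least-squares fit of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: hours studied versus quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 80, 87]

slope, intercept = linear_regression(hours, scores)
print("slope:", slope, "intercept:", intercept)
print("predicted score for 6 hours:", slope * 6 + intercept)
```

The fitted line summarizes the trend in the data and lets you estimate values that were never measured, which is the kind of insight the presentation is meant to communicate.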
For extra credit, you can try to apply appropriate crowdsourcing strategies and achieve
useful results!
For classes that want to go the extra mile, create your own TED-Ed club at your school, or
you can invite the outside world to your TEDxKinda talk and organize an official
TEDxYouth@{insert your city’s name}, like TEDxYouth@Austin. This requires a little extra
work, but changing the world is never easy!
Submission
Your submission will be in the form of a presentation, including speaker notes. The
presentation you submit must:
utilize the data analysis strategies as described above and include information from
these analyses explicitly,
provide insightful analysis and evaluation of the topic and related big data sets,
contain at least three informative and aesthetically pleasing data visualizations,
use key terminology from the glossary appropriately and as necessary, and
include effective speaker notes for asynchronous presentations (and evaluation).
When you are finished, you must submit a copy of your presentation (e.g., PowerPoint,
Prezi, etc.) including speaker notes. If this includes multiple files, zip the files together into
one file. Be sure to appropriately name the file you upload as your submission.
Learning Goals
Over the course of this module and this project, you will learn to:
Rubric
[Earlier rubric rows are not reproduced here; the last of them is weighted 20%.]
Criterion: Data Analysis, Method 2
Description: Performs a second strategy from the following list (association rule mining;
classification; regression analysis; cluster analysis; anomaly, outlier, and change detection)
on a large data set accurately and appropriately.
Requirements: Students must use a different one of the big data analysis techniques learned
throughout the course. The technique must be applied accurately and appropriately. The
technique must apply to the topic or theme.
Points: 2 pts / 0 pts
Weighted: 20%
Criterion: Data Analysis, Automated Summarization
Description: Performs automated summarization strategies on a large data set accurately
and appropriately.
Requirements: The technique must be applied accurately and appropriately. The automated
summarization strategy used must apply to the topic or theme.
Points: 1 pt / 0 pts
Weighted: 10%
Criterion: Insight
Description: Provides insightful analysis of the data set and makes clear connections to the
TEDxKinda presentation topic.
Requirements: Students clearly understand the data collected. Students draw conclusions
relating to the topic and share them concisely throughout the presentation. Students are able
to answer simple questions about the topic or theme.
Points: 2 pts / 0 pts
Weighted: 20%
UNIT TOPIC:
Data Science
Introduction to Big Data
You will relate the impact of computing to ubiquitous and large-scale data processing.
You will explore the ways that patterns within large data sets can be used in a
predictive manner.
You will discuss the risks and benefits of drawing conclusions from patterns found in
large data sets.
You will identify the characteristics that differentiate usable data from unusable data.
You will identify the characteristics that differentiate useful data from useless data.
Data Visualization
You will combine visuals, content knowledge, and interaction to create a dynamic
infographic that clearly communicates discrete information about a data set.
UNIT TOPIC:
Data Science
Introduction to Big Data
You will relate the impact of computing to ubiquitous and large-scale data processing.
You will explore the ways that patterns within large data sets can be used in a
predictive manner.
You will discuss the risks and benefits of drawing conclusions from patterns found in
large data sets.
What Is Big Data?
The Origins of Big Data
There is a mind-boggling amount of data floating around our society. Physicists at CERN
have been pondering how to store and share their increasingly massive data for decades—
stimulating globalization of the Internet along the way, while “solving” their big data problem.
Tim Smith plots CERN’s involvement with big data from 50 years ago to today.
https://www.youtube.com/embed/j-0cUmUyb-Y
Through clever and sophisticated analysis of large volumes of finely detailed information,
data scientists are able to tease out hidden patterns and reveal connections that might not
be so obvious at first glance.
For example, many businesses collect, analyze, and use big data in an attempt to better
understand their customers and improve the service they provide to those customers.
However, this doesn’t always go quite according to plan.
individual privacy.
Exploring US Employment Data
US Employment Data
Time to get your hands dirty exploring some
free, or “open,” data sets using Google’s Public
Data Explorer. The Google Public Data Explorer
has more than 100 different data sets that you
can visualize in different ways, but for this
assignment we will focus on two that are
reputable and address the same topic—
employment in the US.
The two data sets you will explore for this assignment are:
Instructions
First, follow both links to their related data visualizations. Each page contains parameters on
the left-hand sidebar that populate the chart with data. The tool will prompt you to choose
some of these parameters. Choose a few parameters and examine the resulting charts.
What about these visualizations interests, surprises, or informs you?
After you have familiarized yourself with both data sets and visualizations, perform a more
focused analysis of each in order to discover knowledge that you can share with others.
1. Create a visualization by purposefully choosing parameters that help you examine one
aspect of unemployment and/or inequality (e.g., educational attainment, ethnicity, sex,
age, or location).
2. Write one paragraph that describes the data visualization you created, explaining what
the visualization shows, why this interests you, and why it should be meaningful for
others to understand.
3. Share a link to your visualization, a screenshot of your visualization, and the
paragraph you wrote with the class, using a shared space provided to you by
your teacher.
4. Explore your classmates’ posts, comparing them to your own. Think about any
changes you might make, and provide feedback to your classmates.
UNIT TOPIC:
Data Science
Usability and Usefulness of Data
You will identify the characteristics that differentiate usable data from unusable data.
You will identify the characteristics that differentiate useful data from useless data.
Usability vs. Usefulness
Usable and/or Useful Data?
Imagine that a cop uses a radar gun to catch
speeding cars, but that the radar gun is
completely unpredictable. This faulty radar gun
does one of the following, with no predictability:
The unpredictability of this imaginary radar gun would make the data it produces useless.
This brings us to the crux of this activity: not all data is useful or usable.
Usable Data
What makes data usable?
Data is usable if you can use it, regardless of whether that data is informative or useful.
Another aspect of usability is whether the means to process or analyze the data currently
exist. Consider the case of the SETI Institute, which has been collecting data from radio
telescopes for years and years, but has only been able to analyze roughly 2% of it (despite
the vast amounts of computation leveraged to do so). This data may not be considered
usable in its current form, because it’s simply too big and messy.
What’s interesting about usable data is that in many cases, usable data need not be
completely predictable or accurate as long as it falls within a range of possibility. An example
of this type of data would be GPS data for phones. With the latest smartphones, GPS can be
accurate to within about 10 meters (at its best), which isn’t perfect, but is definitely usable...and
useful.
Useful Data
What makes data useful?
Data is useful if somebody would want to use it, in essence making it valuable for some
purpose or another (not necessarily financially). The usefulness of data is also directly
related to what somebody can do with it.
Imagine a light sensor that gives the exact wavelength of light (color) emanating from one
square foot of the ocean.
This sensor provides an exact, real-time reading, but who cares? Can you make predictions,
describe some process, or solve a problem with this data, or is it simply a bit of miscellany
that’s not of value to anyone? Not surprisingly, the usefulness of data is open to
interpretation and context.
Imagine someone has a whole array of these sensors spread across the entire Pacific
Ocean. In this case, the amount of color in a grid square can be compared with the data
collected from other sensors. Is this useful? That may depend on whom you ask. To a
criminal lawyer, car mechanic, or accountant, it may be useless, but to a marine biologist, it
may be extremely valuable and useful.
Imagine again that one of these sensors has collected data regularly for 30 years, and over
that time, the color has shifted from blue to green in certain geographic areas.
What might the reason be for the ocean in this place to obtain a greenish tint? What
inference might be postulated with this data? The mere existence of these questions lends
credence to the likelihood that this data could be useful.
Instructions
Think about the following scenarios and how usable or useful the data described can be.
Challenge yourself to view the questions from different perspectives, and be prepared to
discuss afterward!
Survey
1) Define usable in your own words.
Answer:
Answer:
3) How usable would the receipts from every McDonald’s transaction in a day be?
A. Extremely usable
B. Somewhat usable
C. Not usable at all
Answer:
A. Extremely usable
B. Somewhat usable
C. Not usable at all
Answer:
A. Extremely useful
B. Somewhat useful
C. Not useful at all
Answer:
6) How useful would every grade that you’ve ever received be?
A. Extremely useful!
B. Somewhat useful
C. Not useful at all
Answer:
7) How useful would the results from hooking a robot up to a polygraph machine be?
A. Extremely useful!
B. Somewhat useful
C. Not useful at all
Answer:
Answer:
Answer:
UNIT PROJECT:
TEDxKinda
Highlights
You will collaborate in groups to analyze public data sets and extract insightful
information and new knowledge using a number of big data analysis techniques and
tools.
You will evaluate and justify the appropriateness of your chosen data set(s).
You will construct informative and aesthetically pleasing data visualizations.
You will write a script and prepare speaker notes for a formal presentation of your
findings.
You will cite all online and print sources used in your research and presentation
preparation.
You will deliver a TED-style presentation discussing your data analysis and findings
using appropriate terminology.
TEDxKinda Project: Topics
There is a variety of public data sets and tools that you will choose from in
order to complete your presentation, but you will have to be creative,
flexible, focused, and diligent to synthesize all the components into a
composed, powerful presentation. Data sets will not be available on every
topic, so you may first want to identify potential data sets before you
narrow down your ideas to a topic or theme.
The sooner you can identify an appropriate topic with a relevant data set, the sooner you
may begin and the more purposeful your exploration can be. Use these Big Data Sets to
help jumpstart brainstorming, then continue the search on your own.
Instructions
Your TEDxKinda group must post a one-to-two-sentence paragraph to a shared space
provided to you by your teacher, with the following requirements:
Of course, you may change topics or data sets at a later time if necessary, but this post
describes where you will begin. After you post, read about the other topics groups will be
presenting, and explore their data sets. Provide suggestions, share ideas, or ask questions
of other groups.
TEDxKinda Project: Big Data Sets
Big Business
Big data has become big business around the world. Companies that have their own big
data sets, like Google or Facebook, use those data sets for business purposes and often go
to great lengths to keep them private. Big data sets are not always readily available
because of this, and because of privacy issues involved with sharing data. Most of the data
sets that are openly available are from the government or nonprofit organizations. Some of
these are BIG and some are smaller.
This is a list of some of the most accessible and usable big data sets and collections. You
may use any of these for your TEDxKinda research, or compile your own subsets of data by
using a searchable database. Be sure to clarify the appropriateness of the data sets you find
with your teacher:
Google Public Data Explorer: 130 data sets from Bureau of Labor Statistics, U.S.
Census Bureau, etc.
data.gov: an online repository of data sets from the U.S. government
Many counties have searchable property databases, such as the Travis County
Appraisal District
Some data sets defy categorization, such as the Texas Death Row Executions
data set
Google’s Ngram Data: data on Google’s catalog of millions of books, including raw
data sets
Google Trends: detailed search history information, including CSV downloads
NOAA National Climatic Data Center
Knoema: “free to use public and open data platform for users with interests in
statistics and data analysis, visual storytelling and making infographics”
Geocommons: “all about open data analysis and maps”
Stat Silk: “interactive maps of open data”
Better World Flux: “a beautiful interactive visualization of information on what really
matters in life”
Gapminder: “unveiling the beauty of statistics for a better world view”
DataLab: comprehensive database from the National Center for Education Statistics
Measure of America: “easy-to-use yet methodically sound tools for understanding the
well-being and opportunity in America”
TEDxKinda Project: Tools
Visual Analysis
Regression Analysis
Summarization
Maps
Crowdsourcing
UNIT TOPIC:
Data Aggregation
Collection
You will explore the purposes of various processing tasks, including collection,
knowledge extraction, and data storage.
You will identify multiple techniques for data collection, both on and off of the Internet.
Extraction
Storage
You will explore the basic features and functionality of modern relational databases.
You will debate the implications of large-scale data storage and data persistence on
privacy and utility, including the costs associated with each.
UNIT TOPIC:
Data Aggregation
Collection
You will explore the purposes of various processing tasks, including collection,
knowledge extraction, and data storage.
You will identify multiple techniques for data collection, both on and off of the Internet.
Big Data Collection
How and Why Is Data Collected
Computing and big data are seemingly everywhere in our digital world, but most of the time
we are oblivious to how our data is being collected, and for what purpose it is being used.
Instructions
Your task is to identify three or more ways that
big data is being collected on a regular basis,
including one data collection method for each of
the following categories:
You may use the Internet or any other resources available to you to identify and analyze
these data collection strategies.
Submission
Submit a text document with your description and analysis of the three big data collection
strategies you identified.
Creating Structure from Unstructured
Data Sets
From Unstructured to Structured
Think about the following scenario and what data are involved:
Is it data? Answer
Amount of gas pumped Yes / No
Type of gas pumped Yes / No
Price per gallon of gas pumped Yes / No
Date Yes / No
Time of day Yes / No
License plate number Yes / No
Make/model of car Yes / No
Color of car Yes / No
That was a trick question. The correct answer to all of these questions is “Yes.” All of these
are examples of data.
Data are pieces of information that are observable and/or measurable. In this scenario, even
the color of a car is a characteristic that can be observed, measured, and analyzed—
regardless of whether it seems useful. However, the raw data (camera footage) is
unstructured. By creating structured data sets, we can make data more usable and useful.
Unstructured data contain everything collected in “raw” form, but connections and
relationships among strands of data are harder to trace and much slower to process
than they are in structured data sets. On the other hand, structured data are easy to access
and organize, but may lack the big picture and details that unstructured data may possess.
Let’s consider the following analogy of turning logs into lumber and lumber into a barn:
These logs are like raw, unstructured data. In their present state they are not very usable
or useful. However, the trees themselves would be even more raw with less structure. By
cutting them down, we lost some wood in the form of branches, roots, and leaves.
To make these logs more useful and usable, we apply some structure by converting them into
lumber. This lumber is much more usable and useful. For example, one could use it to build a
table, a doghouse, a soapbox derby car, or barn. However, by turning the logs into lumber, we
lost some wood in the form of wood chips, tree bark, and sawdust.
To make the lumber more useful and usable, one could build a barn structure. To do this, one
has to saw the lumber into different, more useful and usable shapes. However, this process
cannot be reversed, so again, we lose some wood chips and sawdust.
Every time we apply structure to an unstructured data set, it becomes more difficult
(sometimes impossible) to gain back some of the unstructured data. Think about it this way:
By turning trees into logs, we lose some “data” (branches, roots, and leaves).
By turning logs into lumber, we lose some more “data” (wood chips, bark, and
sawdust).
By turning lumber into a barn, we lose even more “data” (wood chips and sawdust).
However, by adding structure, we also make the data more usable and useful. Hence, one
has to weigh the usability and usefulness of structured data against the data loss that occurs
from structuring unstructured data sets.
briefly:
Unstructured data:
Let’s examine the data contained in a Walmart receipt, in order to outline the differences
between processing unstructured data and structured data.
With unstructured data processing, receipt data might be stored in the “natural” format in
which it arrives (i.e., as a massive set of receipts). Whether these are electronic versions or
scanned copies of paper receipts, the only structure assigned to them would be whatever is
implicit in the collection (e.g., transactions might be naturally grouped together because they
occurred at the same time or were generated by the same individual). Since this data is to
remain unstructured, no further ordering would be imparted through “post-processing.”
Structured data processing imposes some organization on the data contained in the
receipted transactions (e.g., all transactions could be tracked for each customer, across all
trips to any Walmart store). This would facilitate accessibility, search, and reasoning about a
particular patron’s spending habits. Organizing all of the data by date and by store, for
example, may also allow the productivity of stores to be assessed and compared easily.
As previously mentioned, though structured data sets allow for easier organization and
analysis of certain phenomena, they may make it more difficult to discover unknown
relationships because of the structure itself. Remember: when analyzing data sets, one must
always ask:
What data or relationships may be lost by imposing structure on the unstructured data set?
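To make the receipt discussion concrete, here is a minimal Python sketch that imposes structure on a few raw receipt lines and then totals spending per customer. The receipt lines, their layout (date, store, customer, item, price), and the function names are all made up for illustration; real receipts would be far messier.

```python
from collections import defaultdict

# Made-up receipt lines kept in "raw" form: one free-text line per purchase.
# The layout (date, store, customer, item, price) is an assumption for this sketch.
RAW_RECEIPTS = [
    "2016-03-01 store#12 customer:ann milk 2.50",
    "2016-03-01 store#12 customer:bob bread 3.00",
    "2016-03-02 store#40 customer:ann eggs 4.25",
]

def to_record(line):
    """Impose structure: split one raw line into named fields."""
    date, store, customer, item, price = line.split()
    return {
        "date": date,
        "store": store.split("#")[1],
        "customer": customer.split(":")[1],
        "item": item,
        "price": float(price),
    }

def spending_by_customer(records):
    """Total spending per customer, across all stores and dates."""
    totals = defaultdict(float)
    for record in records:
        totals[record["customer"]] += record["price"]
    return dict(totals)

records = [to_record(line) for line in RAW_RECEIPTS]
print(spending_by_customer(records))
```

Once the lines are records, questions about a single customer or a single store become easy to answer. Anything that did not fit one of the five chosen fields would have been discarded along the way, which is the tradeoff the question above asks about.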
Screen Scraping
Sometimes we want to analyze data that is already formatted for human use. This data
might be a text (e.g., a book), an image (like the one below), a complete website (like this
one you’re reading), a formatted table/graph, etc. One method for extracting information
from these data is called screen scraping.
Screen scraping (or data scraping) is the conversion of data formatted for human use to a
format more easily used by automated computer processes. In particular, the conversion is
usually oriented toward taking output produced for humans (such as display on a screen)
and producing input for a computer to process (such as a file or the clipboard).
Taking a screenshot converts what is seen on the computer display (typically, the entire
desktop) to a machine-readable image format, such as in the example below. The file
BtB-screenshot.png can now be processed like any other image file (e.g., cropped, rotated,
etc.), as can be seen in the resulting BtB-description.png. In some use cases, an
image file isn’t appropriate for the task and we want to extract the on-screen information as
plaintext. We could then use Optical Character Recognition (OCR) software to convert the
visual contents of BtB-description.png into another format, such as ASCII text. Now,
we can manipulate the words and letters that we captured from the screen just like any other
bit of text (e.g., copy/paste, translate, etc.).
Filename               Result
BtB-screenshot.png     (image not shown)
BtB-description.png    (image not shown)
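The same idea, taking content laid out for human readers and turning it into something a program can work with, also applies to webpages. The sketch below is one possible illustration using Python's standard html.parser module; the HTML snippet, the CellScraper class, and the two-column layout are assumptions made up for this example.

```python
from html.parser import HTMLParser

class CellScraper(HTMLParser):
    """Collect the text of every table cell (<td>) in a page."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")   # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data  # text belongs to the current cell

# A made-up page fragment formatted for people to read in a browser.
page = (
    "<table>"
    "<tr><td>Austin</td><td>78712</td></tr>"
    "<tr><td>Dallas</td><td>75201</td></tr>"
    "</table>"
)

scraper = CellScraper()
scraper.feed(page)

# Regroup the flat list of cells into rows of two columns each.
rows = [scraper.cells[i:i + 2] for i in range(0, len(scraper.cells), 2)]
print(rows)
```

The result is a plain list of rows that other code can sort, search, or store, with the visual formatting stripped away.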
This video is an advertisement for a screen scraping software, but it provides a good
overview of how screen scraping can be useful:
https://www.youtube.com/embed/s0erYF8MsyM
Digitizing Business Cards
As digital as our world has become, there is still
a tremendous amount of paperwork in it.
Schools and workplaces are attempting to
conserve resources by going paperless. Luckily,
we can use computer science to help with the
transition.
1. Use of paper is reduced. This is both good for the environment and allows for more
efficient use of space.
2. The information is now digitally represented. This means that it is no longer tied to a
physical, static piece of paper but can be manipulated computationally. Now, instead of
hunting for John Smith’s business card, you can search for it electronically. It also
means it can be copied without the need for additional physical resources.
Many products exist on the market for converting business cards into electronic data. For
instance, a Google search for the term business card scanner yields many hits that offer
solutions. The process is similar to screen scraping and is another real-life example of
Creating Structure From Unstructured Data Sets.
The final point is the one on which we’ll concentrate. Consider these two business cards:
Second, we need to extract the information from each card and fill the contact records
appropriately. Fill the following “contact record tables” with the information from the cards
above:
All of these factors dictate how the record should be filled—but there is more to it. Discuss
the following questions as a class:
Instructions
Your job is to develop algorithms for identifying five different attributes that are found on
business cards. The following attributes are a few examples:
First name
Last name
Company
Job title
Phone #
Fax #
Email address
Website
Business name
Street address
Street name
Unit/Apt. #
City
State
Postal code
Country
Your rules should make it clear how to automate the collection of information, but they do not
have to be written in “code.” Here is an example (you may not use this example as your
own!):
To identify an email address: query all text on the business card for the
following format:
“ _______@_______.___ “
each underscore ("______") represents alphanumeric text
no spaces or line breaks are allowed for this query
This is only one strategy for identifying a specific string of text. Other strategies may be more
useful for other items, so be creative and experiment. Then, test your rules by searching the
Internet for alternate business card designs. Will your rules work for all the cards you find?
Make adjustments as necessary. This will help you develop effective algorithms for finding
useful and usable data.
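For readers curious how a rule like the email example could be automated, here is a small Python sketch that expresses it as a regular expression. The pattern follows the rule literally (alphanumeric text, "@", alphanumeric text, ".", alphanumeric text, no spaces); the function name and the sample card text are hypothetical.

```python
import re

# Pattern mirroring the example rule: alphanumeric text, "@", alphanumeric
# text, ".", alphanumeric text, with no spaces or line breaks in the match.
EMAIL_RULE = re.compile(r"[A-Za-z0-9]+@[A-Za-z0-9]+\.[A-Za-z0-9]+")

def find_emails(card_text):
    """Return every substring of card_text that fits the rule."""
    return EMAIL_RULE.findall(card_text)

# Hypothetical business card text, made up for this illustration.
sample_card = """John Smith
Senior Widget Designer
Acme Widgets
Phone 512 555 0100
jsmith@acmewidgets.com"""

print(find_emails(sample_card))
print(find_emails("Contact: john.smith@mail.example.org"))
```

The second call shows why testing against alternate card designs matters: because the rule allows only alphanumeric text around the "@", an address containing extra dots is only partially captured.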
Submission
Submit a text document with your five algorithms. Be prepared to share your algorithms with
the class and demonstrate how they apply to multiple business card designs.
UNIT TOPIC:
Data Aggregation
Extraction
The Internet’s Data Structure
...or Lack Thereof
The World Wide Web is full of unstructured data. There are some light restrictions on how
the web is organized (e.g., by nation or by top-level domain—.edu, .gov, etc.), but the web
itself is largely unstructured. For example, there are no restrictions on how domain names
are semantically organized. Websites about “ferrets” do not necessarily have any
distinguishing characteristics in their domain names, such as including the term “ferrets”
(e.g., “ferrets.com") or, better yet, a taxonomy like
“mustela.mustelidae.carnivora.mammalia.chordata.animalia.”
Historically, there have been some attempts to overlay structure onto the Web. One of the
most prolific is the Open Directory Project (aka DMoz), which contains structured lists of
links to individual pages on the web. These directories form the structure and the links
represent the data. Watch the following video that demonstrates the difference between
structured and unstructured searches for the key term “twins":
https://www.youtube.com/embed/mlN7O1YyuMs
Notice that finding information is extremely easy using key terms in Google. Type twins ,
hit enter, and BAM! Results. However, there is a lot of noise in the results. Minnesota Twins
and Twins the movie are among the top hits (because Google apparently knows that the
searcher loves baseball and Arnold Schwarzenegger movies).
On the other hand, searching via the directories of DMoz led us down the wrong path
initially. In essence, you need to know the organizing structure that they created in order to
really make it useful. Note that Google actually provides some structure when you use its
autocomplete feature, which could be considered a dynamic structuring schema.
Instructions
Try this type of experiment for yourself. Conduct both a key term search with Google and a
directory search with DMoz on the same topic of your choosing. Compare and contrast the
results.
Prepare a stopwatch and time how long it takes you to find quality results for your key
term doing the following:
Submission
Submit the best link you found with each of your Google and DMoz searches, along with the
time it took you to find each link. Write one paragraph that compares and contrasts the
results and evaluates the effectiveness of each process. Be sure to use the terms
unstructured and structured to describe the search processes, addressing both the quality
and quantity of your results.
Spiderbots
Spiders
We use Google (and other search engines) to find things on
the Web, but how does Google know where all of these pages
are located and which ones are relevant to your queries?
Watch the video below to see how programs called spiders create Google’s index of the web
and affect your search results:
https://www.youtube.com/embed/BNHR6IQJGZs
Assignment
Now you will observe a spider in action! Download this Processing sketchbook for a toy
spiderbot.
This is not a very robust program, as it will only work with webpages that use simple HTML.
But, it will allow you to see how a spider traverses the web, and it’s very small. Unlike a
search engine spider indexing all of the pages it encounters for later retrieval, SpiderBot just
counts—pages visited, number of words seen, etc.
All of the spidering work is done in the function named processURL . Have a look at the
code, noting what a spiderbot does:
1. visit webpages,
2. gather all of the links on each page visited, and
3. add them to its list of pages to visit in the future.
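The three steps above can be sketched in a few lines of Python. This is not the SpiderBot source, only an illustration of the same worklist idea; to keep it runnable without a network connection, it crawls a tiny made-up "web" stored in a dictionary, and the names TOY_WEB and crawl are hypothetical.

```python
from collections import deque

# A made-up, in-memory "web": each page name maps to the links found on it.
TOY_WEB = {
    "home": ["about", "news"],
    "about": ["home", "team"],
    "news": ["home", "about", "archive"],
    "team": [],
    "archive": ["news"],
}

def crawl(start, web):
    """Visit pages in worklist order, never visiting the same page twice."""
    queued = deque([start])   # worklist of pages still to visit
    seen = {start}            # pages already queued or visited
    visited = []              # pages in the order they were processed
    while queued:
        page = queued.popleft()
        visited.append(page)                # 1. visit the page
        for link in web.get(page, []):      # 2. gather the links on it
            if link not in seen:            # 3. queue links not seen before
                seen.add(link)
                queued.append(link)
    return visited

print(crawl("home", TOY_WEB))
```

The crawl starts with a single page in the worklist and stops when the worklist is empty, so every page reachable from the start is visited exactly once.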
When you execute SpiderBot.pde , you are presented with a small interface that details
indexing statistics as the bot works, and a few controls to tailor the output. The following
images explain each of the statistics reported by the SpiderBot program.
1 : These are the dynamically generated statistics of the indexing process. Pages Indexed
indicates how many pages you’ve processed. Once they are processed, the spider will not
visit them again, even if another link to the page is found later. Pages Queued shows how
many pages have been added to the worklist. This begins with just one page, and all other
pages added are simply URLs found while spidering.
2 : Processing Speed allows the user to slow down or speed up the process. Note that you
can see the progress of your program in the Processing console.
3 : START begins the spidering process; STOP stops the spidering process and presents
the final report.
4 : These controls fine-tune the final report. Top # of Words selects the number of highest
frequency terms to report. Min. Word Length limits the terms reported to a certain length. In
other words, setting the minimum length to two will cause all one-letter words to be skipped
in the report. Stop Words? is a checkbox that will cause commonly used words to be
skipped in the report. There is a file in the SpiderBot document folder called stopwords.txt
that contains this list. You may add to or remove from it if you like.
5 : The spider begins at this URL. You may alter the beginning URL in the Processing
source by changing the value of START_HERE at the top of the program. However,
remember that many sites, particularly those with more advanced features, will not work with
this simple program.
The final report is generated in the Processing console. You may click and drag the slider
between the source code window and the console window to alter the height of the console.
6 : This details a summary of the dynamic statistics generated during the web crawl.
7 : A table of the highest frequency terms is generated when either STOP is pressed or the
Pages Queued falls to zero. This report is configurable using the controls outlined in figure
above.
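A report of this kind can be approximated with a short Python sketch. The function below is an assumption about how such a report might be computed, not SpiderBot's actual code; its parameters top_n, min_length, and stop_words are hypothetical names that loosely mirror the Top # of Words, Min. Word Length, and Stop Words? controls.

```python
import re
from collections import Counter

def top_terms(text, top_n=3, min_length=2, stop_words=()):
    """Return the top_n most frequent words that pass both filters."""
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if len(w) >= min_length and w not in stop_words]
    return Counter(kept).most_common(top_n)

# Made-up sample text standing in for the words seen during a crawl.
sample = ("the spider visits the page and the spider reads the page "
          "and the spider counts words")

print(top_terms(sample, top_n=1))
print(top_terms(sample, top_n=2, stop_words={"the", "and"}))
```

Without a stop word list, the most frequent term is a common word that says little about the page; with the list applied, the terms that describe the content rise to the top.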
Submission
You must submit a report (≥1 page) outlining the following:
An introduction to the spidering process. Include a synopsis of how a spider gets its
worklist of pages to index.
A comparison of SpiderBot on two websites with similar subjects. Note that Wikipedia
pages work well, because they are formatted fairly simply and include a lot of outgoing
links. For example, https://en.wikipedia.org/wiki/Android_(operating_system) and
https://en.wikipedia.org/wiki/IOS.
Include tables of high frequency terms, with different options selected (such as
Min. Word Length and Stop Words).
Include a summary for pages, lines, words, and characters indexed for each. To
facilitate comparison, run each spider routine for the same amount of time at the
same speed.
A brief analysis of the comparison. Consider the following questions:
Is it easy to tell which lists of high frequency indexed terms belong to which site?
How might adjusting the stop word list help you distinguish?
As the process progresses, note the URLs that are being crawled by the spider.
Do they have the same domain name as the starting URL? How does this affect
the report?
Robot or Not?
Seeing how it’s so easy to write a simple program to autonomously crawl the Web and
access even the deepest corners of the Internet without any human supervision, how much
Web traffic is actual humans and how much is the work of these bots?
Fetching Flutter-bys
Butterflies
Astraptes fulgerator is the subject of a
groundbreaking and controversial DNA
barcoding study that discovered at least three
subspecies within the butterfly species.
Because the butterflies are logged in a relational database, looking at certain subsets of
populations is a simple task. If we want to find all of the female butterflies, we can apply a
filter so that only butterflies who meet that criterion are displayed:
sex = female
Or, if we’d like to view all of the butterflies tracked during or before 2003, we can create the
filter:
year ≤ 2003
Finally, we can apply both of the criteria, viewing only butterflies that are both female and
tracked during or before 2003. This screenshot of the Google Fusion Table shows the
results:
Google Fusion Tables provides a nice graphical interface for this. The Structured Query
Language (SQL) statement to pull this same data would be:
Performing an SQL query on a relational database is much easier than sifting through all of
the data by hand, isn’t it?
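As a rough sketch of what a query combining both filters could look like, the example below uses Python's built-in sqlite3 module with a small made-up table. The table name butterflies, the column names, and the rows are assumptions for illustration and are not the actual Fusion Table schema.

```python
import sqlite3

# Build a tiny in-memory database with made-up butterfly records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE butterflies (id INTEGER, sex TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO butterflies VALUES (?, ?, ?)",
    [
        (1, "female", 2001),
        (2, "male", 2002),
        (3, "female", 2005),
        (4, "female", 2003),
    ],
)

# Apply both criteria at once: female AND tracked during or before 2003.
rows = conn.execute(
    "SELECT id, sex, year FROM butterflies "
    "WHERE sex = 'female' AND year <= 2003 "
    "ORDER BY id"
).fetchall()
print(rows)
conn.close()
```

The WHERE clause plays the same role as the two filters in the graphical interface: only rows that satisfy both conditions are returned.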
Instructions
Working individually or in pairs, filter the butterflies in Google Fusion Table in order to find
the following:
Herbivore species: Astraptes SENNOV AND host plant species: Senna papillosa.
Host plant family: Sapindaceae AND wingspan (mm): 59.
Primary eco: rain forest AND elevation: <= 300 AND sex: male.
With any remaining time, explore the data set with new search queries.
Submit
Submit a document that includes the three screenshots and the search queries translated
into English.
UTeach Computer Science (http://uteachcs.org) © 2016 The University of Texas at Austin
UNIT TOPIC:
Data Aggregation
Storage
You will explore the basic features and functionality of modern relational databases.
You will debate the implications of large-scale data storage and data persistence on
privacy and utility, including the costs associated with each.
Indexing Julius Caesar
Murder of Caesar
"Friends, Romans, countrymen, lend me your ears!” is perhaps the most famous line from
Mark Antony’s speech in Shakespeare’s Julius Caesar. To set the scene: Brutus
and a group of other Roman senators have just murdered Julius Caesar in the middle of the
Senate. Brutus then explains this egregious act to the gathering of Romans outside, and
now Julius Caesar’s friend Mark Antony (who did not participate in the murder) responds to
what Brutus has said:
https://www.youtube.com/embed/RJHDZc45xow
In this assignment, you will be practicing the process of indexing using text from Mark
Antony’s famous speech from Act III, Scene II in William Shakespeare’s Julius Caesar. In
pairs, you and your partner will build separate indexes for the document using Processing,
looking for specific terms and phrases, and using them to speed up searches of the text.
First, download the following Processing program that you will use to read and mark text in
the famous speech.
The starter code contains the speech and a stopwatch. Each time you click on a word, it will
be highlighted in yellow. In pairs, examine the entire text and click to highlight each of the
matching terms you find. A list of terms for each partner is found below. You will use each
term, one at a time, and record your elapsed times.
Instructions
You will first search the text using Algorithm A and share the length of time that it takes you
to search for each of the terms in your assignment submission. You will then repeat the
search using Algorithm B, which provides a means of indexing the text in the file.
Before running the Processing program, edit the first two lines of code to initialize the search
term and algorithm as follows:
Note that the value of word should be initialized to whichever search term you are looking
for and algorithm should be initialized to either A or B , depending on which of the
following algorithms you are testing.
Search Algorithm A
Now, you will perform the same task, only you are allowed to use an index that highlights
each line in which your search term appears. Search for each of your assigned search terms
using Algorithm B. Don’t forget to modify the word and algorithm values in the
Processing program.
Search Algorithm B
Compare your results with your partner and discuss the differences between searching for a
keyword with and without an index that identifies the lines where the keyword can be found.
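The difference between the two approaches can also be seen in code. The Python sketch below is only an illustration, separate from the Processing starter program: scan_search checks every word of every line on each search, while build_index does that work once and records the line numbers for each word, so later lookups go straight to the right lines. The function names are hypothetical, and only the opening lines of the speech are included.

```python
import re
from collections import defaultdict

# The opening lines of Mark Antony's speech (Act III, Scene II).
SPEECH_LINES = [
    "Friends, Romans, countrymen, lend me your ears;",
    "I come to bury Caesar, not to praise him.",
    "The evil that men do lives after them;",
    "The good is oft interred with their bones;",
]

def words_in(line):
    """Lowercase a line and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", line.lower())

def scan_search(lines, term):
    """Without an index: examine every word of every line."""
    return [n for n, line in enumerate(lines, start=1) if term in words_in(line)]

def build_index(lines):
    """Map each word to the list of line numbers where it appears."""
    index = defaultdict(list)
    for n, line in enumerate(lines, start=1):
        for word in set(words_in(line)):
            index[word].append(n)
    return index

def index_search(index, term):
    """With an index: a single lookup returns the matching line numbers."""
    return index.get(term, [])

index = build_index(SPEECH_LINES)
print(scan_search(SPEECH_LINES, "the"))
print(index_search(index, "the"))
```

Both searches report the same line numbers. The index costs extra work up front, but that cost is paid once and then shared by every search that follows.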
Assignment Submission
When you complete both indexing algorithms, submit a document containing the following
information:
Data Persistence
What Happens Online, Stays Online
What happens to information that you delete online? Does it simply vanish never to be seen
again? Probably not. What happens online, stays online. Computer scientists refer to this
as the persistence of digital data.
As discussed in the Representation module, digital information is easy to copy. See a digital
image you like on a website? Right click on it, choose “Save image as” and now you have
possession of a digital copy of that image that was posted online. This strategy applies to
everything posted on the web from images to tweets, and even webpages.
Read the following text from Facebook’s Help Pages about deactivating or deleting personal
pages and think about the following two questions:
Which option is more likely to retain your personal data?
Does either option guarantee permanent deletion of everything you’ve created/posted?
The language of “deactivation” states that (all) information is saved “just in case you want to
come back to Facebook at some point.” This indicates that Facebook retains all of your
information and simply makes some of it publicly inaccessible. Data persistence at work, and for
good! Maybe you’re just taking a vacation from Facebook and plan on coming back.
Awesome job, Facebook!
Digital data is immune to generation loss, as multiple identical digital copies of photos
may exist at any given time. These digital copies can be copied further and be
maintained by many entities.
A service like Flickr typically maintains redundant backups of all its data. The data
themselves (photos, video, etc.) can exist through these means even after the actual
camera and/or storage device that originally captured them are destroyed.
This means that we gain utility from posting online. By posting to Facebook or Flickr, not only
are we able to share with our friends and the online communities, we are protecting this
digital data from accidental deletion.
These benefits don’t even begin to approach the social, scientific, and economic benefits
gleaned from data persistence and from users providing data to entities. You will discuss
these types of benefits more during the Privacy vs. Utility debate.
Your Filter Bubble
The Filter Bubble
Most Internet users are living in a bubble right now...a filter bubble. Eli Pariser coined
the term filter bubble, and he’s given a fantastic TED talk explaining the concept
and how the Internet is hiding important information from you by using your data to
make your experience more personal. Watch and learn:
https://www.youtube.com/embed/B8ofWFx525s
1. Complete a Google search for a term like religion , Obama , Israel , etc.
2. Compare your search results with your neighbors’ results.
3. Compare your search results with a search using the “bubble-free” DuckDuckGo
search engine.
Break Free
You can escape your bubble if you try. Visit these three
websites to learn more about how to become freer on the
Internet. These resources may inform your thoughts on
privacy and data collection:
1. Learn more about how the filter bubble works from Don’t
Bubble Us.
2. DuckDuckGo is a search engine that does NOT bubble you.
3. Read “Are we stuck in filter bubbles? Here are five potential paths out” by Jonathan
Stray.
Privacy vs. Utility
Classroom Debate
Do you feel like you’re being watched all the
time? In a way, you are. Before the digitization
of everything, data was stored physically. For
example:
Life has become increasingly digital, and so has our data. Your personal data can be used to
make life easier and better for you and others. The digitization of your personal data also
means that your personal data is now easier to reproduce, share, and access—for both
good and ill.
What are the implications of large-scale data storage on privacy and utility? Let’s explore
some examples to help you think about this question.
For example, you may sign up for a promotion at a local clothing store,
Patriotic Hawk Clothier, giving them your email and phone number. Patriotic
Hawk is offering a free gift card to anyone who signs up for their email list.
You are torn because you know the store is going to bombard you with spam
email, but you really want the gift card.
A) The monetary motivation is incentive to justify linking purchase history to the card
holder/patron.
For example, many grocery stores, department stores, etc., only provide discounted prices
to those who have signed up for their “Rewards” program. In these cases, you trade your
data (purchasing history/spending habits) for reduced prices. In this case, your data has a
clear, monetary value to both you and the store. You must decide whether the value is great
enough for you to allow the collection of your data.
Again, there is a clear financial incentive to use personal data in these cases, though less
quantifiable since not all deals end up being purchased.
These are two simple examples of why data collection can provide users with utility, but your
digital data takes many other forms besides purchasing history (see Blown to Bits, Chapter
2). Here are some examples where the utility gained is not monetary:
Although a wide variety of data may impact your privacy and give you utility, the nature of
digital data and the Internet means that your online information is perhaps the most easily
accessible.
Browsing Histories
Internet browsers maintain a search/page history—again, this is your personal data (about
where you have been on the Internet and what you have searched for). Why do browsers
and search engines do this?
Your teacher will share the exact debate protocol with you before the debate begins. Good
luck, and remember to remain open-minded and to listen to what the other side says! Will
they be able to change your mind, or will you be able to change theirs?
BIG PICTURE:
Data Breaches
Highlights
You will examine the security risks and responsibilities assumed by companies that
collect and store sensitive personal data.
You will examine the causes and impact of data breaches involving sensitive personal
data.
Data Breaches
Is Your Personal Data at Risk?
As our digital world grows, more and more personal data is being collected and stored by
schools, businesses, government agencies, research groups, online services, etc. While the
efficiency of the Internet provides an excellent backbone for amassing and accessing this
wealth of information, it also comes with an unfortunate drawback. Specifically, the
potentially high value of that data and the ease of access that the Internet provides also
serve to attract and motivate thieves, hackers, and other malicious parties that would like to
gain access to this rich supply of personal data.
It has unfortunately become quite common to hear news stories about the latest hack or
breach of security that has compromised millions of users’ personal data. Read about one
such breach, which occurred at Target stores in 2013.
My Data Rules
Control Your Data
You’ve learned about how your personal data is collected, extracted, stored, and analyzed,
which has major implications for your privacy. Responsible digital citizens must be aware of
data collection and make purposeful decisions weighing the potential costs and benefits
associated with personal data use. In this assignment, you will make decisions about how
you will control your data:
Will you provide data as often as possible to gain utility anywhere and everywhere?
Will you never give your data to anybody to try to maintain privacy?
Can you find a happy balance between privacy and utility? These are decisions that
previous generations never had to consider, but you do.
This pledge is purposeful, and should act as a guide for your future decisions regarding data
use and storage. Pledge to always be knowledgeable and careful with your information.
Rubric
Criteria (Points)
Privacy and Utility
Summarizes and discusses the positive and negative implications of personal data storage on your privacy and utility (2 pts)
Describes more than three potential costs and benefits of your personal data being part of large-scale data sets (3 pts)
Data Collection
Describes at least three data collection strategies used to collect personal data (3 pts)
Lists at least 10 examples of personal data that can be collected by others (5 pts)
Rules for Data Usage
Explicitly describes rules that you will follow for allowing your data to be collected, stored, and analyzed (5 pts)
Makes meaningful connections between privacy, utility, and your rules (3 pts)
Key terminology
Uses key terminology appropriately and as necessary, including: data, data persistence, utility, and privacy (2 pts)
TOTAL (23 pts)
UNIT TOPIC:
Data Analysis
Statistical Analysis
You will analyze the tradeoff of utility and confidence in descriptive, predictive, and
prescriptive data analysis.
You will investigate traditional statistical hypothesis testing and exploratory data
analysis.
Data Mining
You will investigate the use of data mining in the discovery of patterns in large data
sets.
You will apply association rule mining to discover knowledge in data sets.
You will articulate the effects of association rule mining on business and education.
Clustering
You will visually perform cluster analysis, modeling the dynamics of groups.
Anomaly Detection
You will visually perform anomaly/outlier/change detection and discuss the impact on
potential inferences drawn from the data.
Regression
You will synthesize a prediction through linear regression over known data.
You will apply a classification protocol to a dataset and compare results with
predefined categories.
You will evaluate the effects of automated summarization on the utility and validity of
inference.
Statistical Analysis
Survey Says!
Not all statistics or data analyses are meaningful. Punctuating a claim with a statistic does
not make it meaningful (or correct). For instance, watch the following video, which
demonstrates a little statistical absurdity from the movie Anchorman.
https://www.youtube.com/embed/pjvQFtlNQ-M
Descriptive analytics provide information about collected data via statistics that you
are probably familiar with—mean, median, mode, range, etc. They tend to “describe”
circumstances, but don’t offer conjectures about unknowns.
e.g., What percentage of computer science graduates are paid
salaries of $100,000 or more within five years of graduating?
Predictive analytics may provide information about future (or merely unobserved or
unknown) events based on previously collected and analyzed data.
e.g., How likely is it that I will be able to find a high-paying job if I choose to major
in computer science vs. biology?
Prescriptive analytics may provide information to maximize the chances of a future
event occurring, based on comparing the predictive analyses of multiple options.
e.g., Which major should I choose in order to maximize my chances of making
the highest starting salary after graduation?
Each of these three types of analytics has advantages and disadvantages that experts must
weigh in order to use it properly. Each serves a different purpose, but all three allow
powerful inferences to be drawn from data.
Let’s take a look at each type of analysis, its purpose, and how it is applied to common
computing tasks (e.g., a Google search).
Descriptive Analytics
Function:
easiest to derive “hard facts” from
Example: the percentage of graduates employed within six months of graduating
Application:
The spidering, caching, and indexing of the web is all centered on descriptive analytics
In essence, this creates an easily accessible description of the web
Predictive Analytics
Function:
uses descriptive analysis’ “hard facts” to extrapolate (make inferences) about where unknown data may lie
Example: Given that 90 of the 100 CS graduates were employed within six months in 2011, it is __ % likely that 108 of the 120 CS graduates in 2012 will be employed within six months.
predictions are not “hard facts”; they may be wrong!
Application:
the retrieval and ranking of pages in response to a search query
creates a representation of the search terms and compares it against the descriptive (indexed) model of the web
in essence, predicts what pages will be relevant to whom
Prescriptive Analytics
Function:
compiles predictive hypotheses and recommends a plan of action to maximize the likelihood of something happening
Example: which major should I choose to maximize the chance that I’ll have a job within six months of graduation?
confidence is the lowest here, because all of the prediction errors associated with previous analyses are compounded.
Application:
autocomplete is a form of prescriptive analysis
recommendations for further searches are based on potential next steps for users
based on the ranking of previous/potential queries that are similar to the current search query
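The relationship between the three types of analytics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the computer science figures come from the example above, the biology figures are invented, and the function names are hypothetical.

```python
# Hypothetical records: major -> (graduates, employed within six months).
# The CS figures come from the example above; the biology figures are invented.
history = {
    "computer science": (100, 90),
    "biology": (100, 75),
}

def employment_rate(graduates, employed):
    """Descriptive: a hard fact computed directly from collected data."""
    return employed / graduates

def expected_employed(rate, class_size):
    """Predictive: extrapolate an observed rate to a class not yet observed."""
    return rate * class_size

def recommend_major(records):
    """Prescriptive: compare the predictions and recommend the best option."""
    return max(records, key=lambda major: employment_rate(*records[major]))

cs_rate = employment_rate(*history["computer science"])
print(f"Descriptive: {cs_rate:.0%} of 2011 CS graduates were employed")
print(f"Predictive: about {expected_employed(cs_rate, 120):.0f} of 120 expected in 2012")
print(f"Prescriptive: choose {recommend_major(history)}")
```

Each step builds on the one before it, which is why any error in the descriptive data carries through to the recommendation.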
More Is Better
Generally, more data leads to greater confidence. Each of these analyses is based on building
models from data, and the better a model fits the data, the greater its power (and thus its
utility). This is why big data can be so powerful.
Google’s searches are often effective because Google’s data set is huge. It has an enormous
amount of data from which to conduct descriptive, predictive, and prescriptive analyses, and
it uses those analyses to improve user experiences.
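One common way to quantify "more data leads to greater confidence" is the standard error of an estimated proportion, sqrt(p(1 - p) / n), which shrinks as the sample size n grows. The sketch below assumes independent random samples; the 90% rate is borrowed from the earlier employment example, and the sample sizes are hypothetical.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1 - p) / n)

rate = 0.90  # employment rate from the earlier example
for n in (100, 10_000, 1_000_000):  # hypothetical sample sizes
    margin = 1.96 * standard_error(rate, n)  # approximate 95% margin of error
    print(f"n = {n:>9,}: {rate:.0%} plus or minus {margin:.2%}")
```

Because n sits under a square root, a sample 100 times larger makes the margin of error about 10 times smaller.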
Justin Who?
Exploratory Data Analysis
Martindale High School has recently renovated and reopened its South Wing. Claudia is a
new student assigned a locker in this new wing. When she excitedly opens her “new” locker,
she discovers that the renovation crew seems to have neglected the actual insides of the
lockers: a fresh coat of paint on the outside, but the same old rusty, dented interior.
Even the previous occupant’s end-of-year pile of trash is left intact. She begins to clean out
the musty mess inside. “Nasty! I think this used to be an apple.” Not wanting to touch it, she
coaxes it into a wastebasket with a ruler. Underneath, she finds a folded piece of paper.
“What’s this?”
Carefully unfolding it, she finds a note left by the mysterious previous locker tenant:
“Hmm…who’s Justin?”
You could formulate a hypothesis based on your intuition, such as “Of course, it’s Justin
Jones in homeroom—or—Justin Bieber,” and call it a day. But wait a minute! Given the text
of the note, it’s obvious we have a public figure on our hands—someone who’s been in the
news. This means we have data—and lots of it. Now we can test our hypothesis. We can
search for Justin in Google and see what comes up.
Of course, we can’t prove that the note is about him, but we can be fairly confident that it is,
right?
Maybe we can add keywords such as “album” or “hate” or we can restrict our query to
search news only. It seems he is the most likely “Justin” with an “album” in the news…
What do you think? Is it the Biebs, or somebody else? Conduct some exploratory data
analysis and test your hypothesis:
To best support your work, you should take notes on the features you use and results you
find. Screenshots of key findings may be useful.
Google Trends is rather intuitive and easy to use. It can be very interesting, especially if you
learn to set all the parameters, which are highlighted in the image below:
Explore the tools and use as much of the functionality as you can. You will be able to tell
how search terms change over time, what terms are used together, from where people
search, and even what news events correlate with trends over time.
Be Prepared to Discuss
UNIT TOPIC:
Data Analysis
Data Mining
You will investigate the use of data mining in the discovery of patterns in large data
sets.
You will apply association rule mining to discover knowledge in data sets.
Data Mining
Traditional ore mining begins with an exploration (prospecting)
of a resource pool (stone), and proceeds to determining if
usable resources exist (ore) and to what degree. Prospectors
basically have an idea of what they are looking for, and they
run small tests to see if they are correct. Sometimes they
strike gold, other times they strike out. Like these physical
mines that bring us everything from coal to diamonds, we have a new type of mining: data
mining.
Data mining is the discovery of patterns in large data sets. Like ore mining, data
mining begins with an exploration (analysis) of a resource pool (data), and proceeds to
determine whether usable resources exist (correlations) and to what degree (how strong
they are). Not all data miners “strike it rich.” Like ore mining, data mining can result in the
observation of no useful patterns. However, like ore mining, sometimes data mining leads to
a bonanza of useful information.
In data mining, the emphasis is on the discovery of new knowledge. Data miners want to find
new patterns that were previously unobserved. They use statistical analysis of big data to
discover what the human eye can’t see, just like an ore miner might use a pick, dynamite, or
lab test to uncover ore that was not visible to the naked eye before. This is a form of
exploratory data analysis rather than statistical hypothesis testing.
purposes. This is sometimes referred to as market basket analysis.
Recommender systems—Users who like Movie X tend to also like Movie Y.
Clustering—is the task of discovering groups and structures in the data that are in
some way or another “similar,” without using known structures in the data.
Dynamically grouped movie categories: “Romantic Comedies in Paris starring
former professional football players.”
Classification—is the task of generalizing known structure to apply to new data. For
example, an e-mail program might attempt to classify an e-mail as “legitimate” or as
“spam.”
Movie X is a romantic comedy.
Regression—Attempts to find a function that models the data with the fewest errors.
Type X users typically increase their movie consumption rate by four movies per
year.
Summarization—providing a more compact representation of the data set, including
visualization and report generation.
What type of movie does User X typically like? (i.e., sum up user X’s preferences
in Y words)
These strategies all have different purposes, are sometimes more effective on certain data
sets and less on others, and oftentimes work best in conjunction with one another. Therefore,
there is no one “best” way to perform data mining. Data miners use multiple strategies to
uncover patterns and discover new knowledge.
How much power lies in data mining? Read the following article to see “How Target Figured
Out A Teen Girl Was Pregnant Before Her Father Did.”
Association Rule Mining
Companies Know What You Buy
French toast is one of America’s favorite breakfast foods. It’s
delicious and can be easily prepared at home using a variety
of techniques and toppings. Even though it can be prepared a
number of ways, almost all French toast recipes call for at
least three things:
1. bread
2. milk
3. eggs
If you’re going to make French toast, you’re going to need bread, you’re going to need milk,
and you’re going to need eggs. What does French toast have to do with big data?
For example:
{X, Y} ⇒ {Z}
This rule can be read as, “If the antecedents (X and Y) appear then it is likely that the
consequent (Z) will also appear.”
For example:
If most people who buy milk, bread, and eggs also buy maple syrup, then association
rule mining might turn up the following rule:
{milk, bread, eggs} ⇒ {syrup}
Walmart can now target store patrons who purchase milk, bread, and eggs to gently suggest
that they might like to also buy syrup. The computerized storefront (or physical storefront
with a layout determined by computational data mining) does not know that these patrons
may be making French toast; it has merely developed association rules to guide product
placement. The process of association rule mining is basically “How Target Figured Out a
Teen Girl was Pregnant.”
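Two standard measures are commonly used to judge a rule such as {milk, bread, eggs} ⇒ {syrup}: support (how often all of the items appear together) and confidence (how often the consequent appears when the antecedent does). The following is a minimal sketch; the shopping baskets are invented for illustration.

```python
# Hypothetical shopping baskets; each set is one customer's purchase.
baskets = [
    {"milk", "bread", "eggs", "syrup"},
    {"milk", "bread", "eggs", "syrup", "butter"},
    {"milk", "bread", "eggs"},
    {"milk", "cereal"},
    {"bread", "jam"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

antecedent = {"milk", "bread", "eggs"}
consequent = {"syrup"}
print("support:", support(antecedent | consequent, baskets))
print("confidence:", confidence(antecedent, consequent, baskets))
```

A rule is usually kept only if both numbers clear thresholds chosen by the analyst; those thresholds are a judgment call and are not specified in this unit.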
Instructions
Your group has been hired by Data Market, a corporation seeking to open a new chain of
stores in your region. Their goal is to provide customers with optimal arrangements of store
products, in an attempt to minimize the time and effort required to shop.
UNIT TOPIC:
Data Analysis
Clustering
You will visually perform cluster analysis, modeling the dynamics of groups.
TEDxKinda Project: Identifying Clusters
Instructions
Your challenge is to cluster data in different ways. Using Google’s Public Data Explorer,
explore a few data sets, and attempt to cluster data within the sets. If you have chosen data
sets other than those in Google’s collection, you will have to use one of the other Tools for
Big Data Analysis, which may be more challenging, but also much more fruitful for your
research.
First, take a screenshot of a data visualization that demonstrates clusters of patterns, and
then draw lines on the screenshot that bisect the plane and create clusters. You should
cluster your data in two or more different ways, so use two different datasets if one dataset
cannot be clustered in two ways. Choose datasets that are related to your TEDxKinda topic,
so that you can apply your work to your end presentation.
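Drawing a line that divides the plane can also be expressed computationally: every point falls on one side of the line or the other, and each side becomes a cluster. The sketch below assumes a straight dividing line of the form y = slope * x + intercept; the data points and function names are hypothetical.

```python
# Hypothetical (x, y) data points read off a visualization.
points = [(1.0, 6.5), (1.5, 7.0), (2.0, 6.0), (6.0, 2.0), (6.5, 1.5), (7.0, 2.5)]

def side_of_line(point, slope, intercept):
    """Return 'above' or 'below' relative to the line y = slope * x + intercept."""
    x, y = point
    return "above" if y > slope * x + intercept else "below"

def cluster_by_line(data, slope, intercept):
    """Split the points into two clusters, one on each side of the line."""
    clusters = {"above": [], "below": []}
    for point in data:
        clusters[side_of_line(point, slope, intercept)].append(point)
    return clusters

# The line y = x is one possible divider; a different line gives a different clustering.
print(cluster_by_line(points, slope=1.0, intercept=0.0))
```

Changing the slope or intercept is the code equivalent of drawing a different line on the screenshot.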
Submission
Submit a document (e.g., .doc or .pdf) that includes the following items:
UNIT TOPIC:
Data Analysis
Anomaly Detection
You will visually perform anomaly/outlier/change detection and discuss the impact on
potential inferences drawn from the data.
Outliers
Outliers and Credit Card Fraud Detection
Credit card fraud has become more rampant as data become
digitally stored, but at the same time, credit card fraud
prevention has also become more powerful—primarily
because of effective outlier detection using Big Data
processing techniques.
Tara is a Texas high school student who just opened her first bank and credit card accounts
one year ago, after receiving her first paycheck from Taco Cabana. Last week, Tara went
online and purchased a Willie Nelson concert ticket at a ridiculously low price, $5! Wow!
Almost too good to be true—and it was. The next day, a representative from Tara’s bank
called to inform her that her credit card was used to purchase gasoline, burgers, and a dirt
bike—in Minnesota. Terrified Tara told the teller that the charges were bogus! The bank and
credit card company then began to deal with the credit card fraud. Tara was not happy.
How did the bank/credit card company know that Tara did not make those purchases?
Outlier detection. In this fictitious scenario, Tara had never spent more than $100 on her
credit card for any single transaction, and no transaction had occurred outside the state of
Texas. Both of these patterns may have signaled to the company that the Minnesota charges
were anomalies (outliers).
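The two signals in Tara's story can be written as simple rules. This is a minimal sketch, not how any real bank works: the transaction history, the dirt bike price, and the function name are all invented for illustration.

```python
# Hypothetical transaction history for a card that has only been used in Texas.
history = [
    {"amount": 12.50, "state": "TX"},
    {"amount": 48.00, "state": "TX"},
    {"amount": 95.99, "state": "TX"},
]

def outlier_reasons(transaction, past):
    """Return the reasons a transaction looks unlike anything in the past history."""
    reasons = []
    largest = max(t["amount"] for t in past)
    seen_states = {t["state"] for t in past}
    if transaction["amount"] > largest:
        reasons.append("amount exceeds any previous purchase")
    if transaction["state"] not in seen_states:
        reasons.append("purchase made in a new state")
    return reasons

dirt_bike = {"amount": 1500.00, "state": "MN"}  # invented amount
print(outlier_reasons(dirt_bike, history))
```

A transaction with an empty list of reasons looks ordinary; one with several reasons is a stronger candidate for a phone call from the bank.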
Navigate to the Fertility Rate vs. Life Expectancy visualization in Google Public Data
Explorer to investigate an anomaly. There are many unusual trends in the data visualization;
however, let’s focus on one—the case of East Timor (a.k.a., Timor-Leste, identified in the
figure below).
Life Expectancy: How long individuals in this country live on average. This is indicated
along the x-axis.
Fertility Rate: The average number of children born to each woman in this country.
This is indicated along the y-axis.
Nation: This is the label applied to each bubble. Click on the bubble to view it.
Region: The color of each bubble indicates the part of the world in which the country
is located.
Population: The size of each bubble is proportional to the population of the country
represented. This is calculated yearly.
Note that actual values for these data can be obtained by hovering over the data point you
wish to inspect. The x- and y-axis values appear along their respective axes, and the
population appears in the legend.
East Timor is obviously an outlier in the static image above. It is clearly set apart from the
rest of the clustered data. As the graphic indicates, life expectancy was a staggeringly low
32.9 years, though fertility rates were average in comparison to the rest of the world!
However, as can be seen in the dynamic visualization, the true nature of the decline in life
expectancy is drastic, taking place over the 1970s. What happened in the 1970s in East
Timor? The identification of East Timor as an anomaly might indicate that some variable
majorly affected its life expectancy during this time.
That variable, it turns out, was a civil war that broke out between East Timorese political
parties, leading to an invasion and eventual occupation by Indonesia. The occupation was
marked by extreme violence and brutality and resulted in more than 100,000 deaths, with
approximately 20% of those resulting from killings and 80% from hunger and illness. This
excessively high number of conflict-related deaths skewed the mortality rate in the
region during and immediately after the occupation, as can be seen in the
visualization.
Although the outlier in the static graph for 1977 establishes East Timor as a troubled area,
the narrative expressed through the visualization not only gives context, but also indicates
that the formation of the outlier was itself an anomaly.
Outliers tend to be meaningful and tell stories about unique circumstances. Identifying
outliers and investigating them can make descriptive statistics more accurate and also tell a
story on their own.
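The effect of a single outlier on descriptive statistics is easy to demonstrate. In the sketch below, only the 32.9 figure comes from the text; the other life expectancies are invented for illustration.

```python
import statistics

# Hypothetical life expectancies (years) for a group of countries.
# Only the 32.9 figure comes from the text; the rest are invented.
typical = [61.0, 63.5, 65.2, 66.8, 68.1, 70.4]
with_outlier = typical + [32.9]

for label, values in (("without outlier", typical), ("with outlier", with_outlier)):
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{label}: mean = {mean:.1f}, median = {median:.1f}")
```

The mean is pulled down by several years while the median barely moves, which is one reason analysts check for outliers before reporting an average.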
TEDxKinda Project: Identifying Outliers
Instructions
Your challenge is to identify anomalies in data sets using Google’s Public Data Explorer.
Explore a few data sets and attempt to identify anomalies within them. If you choose data
sets other than those in Google’s collection, you will have to use one of the other Tools for
Big Data Analysis, which may be more challenging.
These anomalies may distort descriptive data or may tell stories about unique situations.
Alter all variables involved in the data sets to see if any single variable contains outliers.
When you identify an outlier, take a screenshot of the complete data visualization that
illustrates the outlier(s). Be sure to examine data sets that are related to your TEDxKinda
topic, so that you can apply your work to your ultimate presentation.
Submission
Submit a document (e.g., .doc or .pdf) that includes the following items:
UTeach Computer Science: http://uteachcs.org © 2016 The University of Texas at Austin
UNIT TOPIC:
Data Analysis
Regression
You will synthesize a prediction through linear regression over known data.
TEDxKinda Project: Making Predictions
Instructions
Your challenge is to make predictions with regression, using one of the following tools:
Explore a few data sets, and generate regressions with them. These regressions may allow
you to make predictions or identify patterns of change over time. When you perform an
effective and informative regression, take a screenshot of the data visualization. Be sure to
examine data sets that are related to your TEDxKinda topic, so that you can apply your work
to your ultimate presentation.
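For readers who want to see what a regression tool does behind the scenes, the following sketch fits a straight line by ordinary least squares and uses it to make a prediction. The data values are hypothetical, and fit_line is an illustrative helper, not part of any tool named in this unit.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical yearly measurements (for example, movies watched per year).
years = [2010, 2011, 2012, 2013, 2014]
values = [20, 24, 28, 32, 36]

slope, intercept = fit_line(years, values)
prediction = slope * 2016 + intercept
print(f"slope = {slope:.2f} per year, predicted value for 2016 = {prediction:.1f}")
```

Predicting beyond the range of the known data (here, 2016) assumes the trend continues unchanged, so such predictions should be treated with caution.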
Submission
Submit a document (e.g., .doc or .pdf) that includes the following items:
UNIT TOPIC:
Data Analysis
Classification and Summarization
You will apply a classification protocol to a dataset and compare results with pre-
defined categories.
You will evaluate the effects of automated summarization on the utility and validity of
inference.
Classify Me
Classify Me: I’ve Got Personality!
We’ve all got personality. Some of us are
outgoing and flamboyant, while others of us are
reserved and introspective. Is it possible to
classify a person’s personality traits using a
simple assessment (test)?
Favorite world
Do you prefer to focus on the outer world or on your own inner world? This is
called Extraversion (E) or Introversion (I).
Information
Do you prefer to focus on the basic information you take in, or do you prefer to
interpret and add meaning? This is called Sensing (S) or Intuition (N).
Decisions
When making decisions, do you prefer to first look at logic and consistency or
first look at the people and special circumstances? This is called Thinking (T) or
Feeling (F).
Structure
In dealing with the outside world, do you prefer to get things decided or do you
prefer to stay open to new information and options? This is called Judging (J) or
Perceiving (P).
When you decide your preferences for each of the categories, you can put the letters
together and compile your personality type into one of the following 16 personality types:
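The "put the letters together" step is a simple classification rule: four two-way choices yield 2 × 2 × 2 × 2 = 16 possible labels. The sketch below shows this with hypothetical names (DICHOTOMIES, classify); it is not the scoring method of any actual personality instrument.

```python
from itertools import product

# Each pair lists the two possible letters for one preference category.
DICHOTOMIES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]

# Four two-way choices give 2 * 2 * 2 * 2 = 16 possible types.
ALL_TYPES = ["".join(letters) for letters in product(*DICHOTOMIES)]

def classify(prefers_first):
    """Build a type from four booleans; True picks the first letter of each pair."""
    return "".join(pair[0] if first else pair[1]
                   for pair, first in zip(DICHOTOMIES, prefers_first))

print(len(ALL_TYPES), ALL_TYPES)
print(classify([False, False, True, True]))  # a hypothetical set of answers
```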
Learn more about each personality type from the Myers-Briggs Personality Types—Basics
webpage.
1. Navigate to this Jung personality test (similar to the original Myers-Briggs instrument,
but free!), and complete it.
2. Write a reflection about the results:
   a. What do you think about the process?
   b. Do the results accurately predict your personality?
3. Post your reflection to a shared space provided to you by your teacher.
4. Read some of your classmates’ posts, and respond.
TEDxKinda Project: Automated Summarization
Word Clouds
Have you seen any word clouds floating around the Internet?
For example, the Wordle image to the right is a word cloud
generated using the text of all the webpages included in the
Big Data module.
"...a toy for generating ’word clouds’ from text that you provide.
The clouds give greater prominence to words that appear more
frequently in the source text. You can tweak your clouds with
different fonts, layouts, and color schemes. The images you
create with Wordle are yours to use however you like. You can
print them out, or save them to the Wordle gallery to share with
your friends.”— Wordle
A Wordle is an attempt to automatically summarize a large amount of data visually. It relies
on word counts to identify what might be important ideas; however, there are a variety of
strategies that one can use to automate summarization, and each strategy has
advantages and disadvantages related to its utility and its validity.
Automated Summarization
Humans are always looking for shortcuts, timesavers, and ways to make their lives easier.
This includes attempts to read and comprehend long texts. Perhaps you have used Cliff’s
Notes instead of reading a novel for an English literature class. Still, some person (maybe
Cliff?) had to read the original book and create the summary.
For example:
A student’s grade or score on a test is meant to summarize the student’s
understanding of the course material or of the individual concepts measured on the test.
A credit score is a numeric value associated with a consumer’s credit risk. Missed
payments, excessive credit checks, and amount of outstanding debt all contribute to a
final numeric FICO score in the range of 300 to 850.
Automated summaries like these can be both usable and useful—except that summarization
of data comes at a cost. Automated summarization is lossy. Summarization attempts to
reduce complexity by removing redundant or otherwise less significant details. Effectively,
this is a form of dimension reduction.
However, these details cannot be recovered given just the summary. The process maps the
complexity of large sets of data to a simpler, smaller data pool. To illustrate, it is impossible
to determine which test items were missed on an exam by merely knowing the student’s
total score, or to understand the finer points of this Big Data unit by examining the Wordle
above.
Three common techniques for automated text summarization are outlined below: highest
word frequency, TF*IDF, and topic sentence concatenation.
The highest word frequency method
This is the most intuitive of the algorithms, and the one that is used by Wordle:
Parse the text and keep a count of all the words read separately.
Sort the list so that the most frequent words are first.
Remove words from a stop word list.
Sample “stop word” list
Use the X (such as 10–20) most frequent words as a summary.
Think about the following: What is the point of the stop word list? How could this
summarization method be exploited by spammers?
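The steps above can be sketched in Python as follows. The stop word list here is a tiny illustrative sample, and the sketch filters stop words before counting rather than after sorting, which gives the same result.

```python
import re
from collections import Counter

# A tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def top_words(text, x=10):
    """Return the x most frequent words in text that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(x)]

sample = "The data miner mines the data and the data tells a story of the data."
print(top_words(sample, x=3))
```

Without the stop word filter, "the" would top the list even though it says nothing about the subject of the text.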
The TF*IDF method
The TF*IDF method extends the previous one by adding an assumption that only uncommon
words are useful in a summary:
The words that appear in all documents are not useful in differentiating among them, so you
begin by finding the most common words and eliminate them. This is like using a stop word
list, only you make the list dynamically as you go rather than depending on a pre-determined
master list.
Calculate the TF (term frequency) for each word. Basically, this means taking each of
the word counts you generated in the previous method and dividing them by the total
number of words in the text:
Calculate the IDF (inverse document frequency) for each word. This means figuring
out how many documents out of all the ones you are processing contain the word. As
an example, you might leverage Google to get these counts. Search the term using
“quotation marks.” Google will return the number of documents it retrieved containing
exactly that term. Now we just divide the total number of English documents Google has
indexed by the number of documents it indexed with that term in order to find the IDF.
Of course, we don’t know the total number of documents that Google has
indexed, so let’s cheat. Assume that every document contains the word
“the.” Use the document count of the word “the” as the total document count. A
Google search returned 10,690,000,000 documents when this page was created
(March 29, 2014).
Sort the list so that the highest TF*IDF words are first.
Use the top X (such as 10–20) highest ranked TF*IDF words as a summary.
Think about the following: How might this compare with the highest word frequency method?
What’s the major difference?
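The TF*IDF steps can be sketched as below, following the simplified definition used in this section (IDF as a plain ratio of document counts; many formulations take the logarithm of that ratio instead). The total document count is the figure quoted above; the per-term document counts are invented for illustration.

```python
from collections import Counter

TOTAL_DOCUMENTS = 10_690_000_000  # document count for "the", as quoted in the text

# Hypothetical document counts for each term; real values would come from a search engine.
DOCUMENT_COUNTS = {
    "the": 10_690_000_000,
    "data": 2_000_000_000,
    "mining": 150_000_000,
    "bonanza": 20_000_000,
}

def tf_idf_ranking(words):
    """Rank words by term frequency times inverse document frequency."""
    counts = Counter(words)
    total = len(words)
    scores = {}
    for word, count in counts.items():
        tf = count / total                             # share of this text
        idf = TOTAL_DOCUMENTS / DOCUMENT_COUNTS[word]  # rarity across documents
        scores[word] = tf * idf
    return sorted(scores, key=scores.get, reverse=True)

text = "the data mining the data the bonanza the data the".split()
print(tf_idf_ranking(text))
```

In this sample, "the" is the most frequent word yet ranks last, because its IDF is 1; the rare word "bonanza" ranks first even though it appears only once.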
Topic sentence concatenation
This method assumes that the main idea of each paragraph is its first sentence, so the topic
sentences of the paragraphs together can encapsulate the meaning of the whole document. For
this method, beginning at the top of the document:
1. Scan the first sentence and add it to your summary.
2. Skip remaining text until you reach either the next paragraph or the end of the
document.
3. If you reach the next paragraph, repeat steps 1 and 2.
4. If you reach the end, your summary is complete.
Think about the following: How is this method different from the others? How does it fare if
you limit the word count to X as you did with the other methods?
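The four steps above can be sketched in Python. This version assumes that paragraphs are separated by blank lines and that a sentence ends at the first period, exclamation point, or question mark followed by whitespace; abbreviations such as "e.g." would fool it.

```python
import re

def topic_sentence_summary(document):
    """Join the first sentence of each paragraph into a summary."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    summary = []
    for paragraph in paragraphs:
        # Treat text up to the first ., !, or ? as the topic sentence.
        match = re.match(r".*?[.!?](?=\s|$)", paragraph, flags=re.DOTALL)
        summary.append(match.group(0) if match else paragraph)
    return " ".join(summary)

sample = (
    "Data mining finds patterns. It uses statistics on large data sets.\n\n"
    "Outliers tell stories. They can also distort averages."
)
print(topic_sentence_summary(sample))
```

Unlike the two word-based methods, the output is made of complete sentences, so its length depends on the number of paragraphs rather than on a chosen X.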
Your challenge is to apply one of the automated summarization strategies (i.e., word cloud,
Highest Word Frequency, TF*IDF, or Topic Sentence Concatenation) to a text (or combine
multiple texts for even more data), considering which summarization strategy might be the
most useful. You may use automated summarization Tools for Big Data Analysis, which may
be more challenging, but also much more fruitful for your research. Be sure to examine texts
that are related to your TEDxKinda topic, so that you can apply your work to your end
presentation.
Submission
Submit a document (e.g., .doc or .pdf) that includes the following items:
UNIT TOPIC:
Data Visualization
Data Visualization
You will combine visuals, content knowledge, and interaction to create a dynamic
infographic that clearly communicates discrete information about a data set.
Interactive Infographics
Big Data Reflection
In this course—and no doubt in life—you have encountered
many infographics, like the one to the right. They have
become a part of our culture, our digital zeitgeist. What exactly
is an infographic, though?
Instructions
Your job is to program an interactive infographic about your TEDxKinda presentation.
An interactive infographic allows users to dynamically manipulate the data in order to
engage users and illustrate major concepts. You may program the interactive infographic
with Processing or Scratch, but there are many other tools specifically designed for creating
interactive infographics.
The design of your infographic will likely not represent all of the work you have put in for the
TEDxKinda Project, so select the most relevant points, focusing on quality over quantity.
The following tools and resources may help you obtain the visuals and content of a static
infographic. Once you have designed a static infographic prototype, make it interactive so
that users can manipulate the data or format of the infographic.
Excel/Numbers/Google Docs
Wordle/Tagxedo
Many Eyes
Word/Pages
Powerpoint/Keynote
Photoshop
ComicLife
Prezi
Visualize (iOS app)
Easely
Piktochart
Infogr.am
Explore these 10 Wonderful Examples of Interactive Infographics for inspiration. Think about
how you may apply the same strategies as they have, but with your topic, your knowledge,
and your data.
BIG PICTURE:
Wisdom of the Crowd
Highlights
You will apply the technique of crowdsourcing to a novel data collection problem.
ReCAPTCHA
Humans Are Computers, Too!
The image you see to the right is an example of
reCAPTCHA (revision of Completely Automated
Public Turing test to tell Computers and
Humans Apart).
The inventor of reCAPTCHA, a computer scientist named Luis von Ahn, noted that the
computer’s difficulty deciphering text is also a roadblock to progress in a different arena—the
scanning, recording, and cataloging of pre-digital era books. He decided to crowdsource
some human computers to help out. Look at this example to see how he applied
reCAPTCHA to the scanning of books:
By simply altering the origin of the problematic data (fuzzy text) from computer-generated
images to alphanumeric text typed by human beings, the reCAPTCHA problem becomes
doubly useful. Not only are you demonstrating your humanity to the system, you are also
utilizing your human computing skills to help digitize the world’s books for generations to
come. Good for you!
Most of the time, digital computers work for humans. Sometimes, however, humans work for
digital computers, helping with tasks that digital computers are not currently capable of
performing. A technique known as human-based computation leverages the ability of
humans to process certain types of information more efficiently than machines by
outsourcing those tasks to humans. Together, humans and machines are able to solve complex
tasks faster and more economically than either could on their own.
Some tasks that humans can perform remarkably well through specialized brain/neural
functions are not well-suited to digital computers (yet!). For example, humans are much
better at face recognition and language translation (though state-of-the-art algorithms are
becoming more competitive all the time). On the other hand, computers exceed the
capabilities of humans in rote data collection, organization, and in some cases,
interpretation. As such, human-based computation leverages the abilities of both humans
and computers to complete complex tasks. The analysis of tasks performed well or quickly
by humans is accelerated or automated by computers, and vice versa.
A great example of human computation is the use of reCAPTCHA to digitize physical texts,
but it’s not the only one. For instance, Luis von Ahn has also developed Duolingo, a digital
tool that translates the World Wide Web into different languages by turning human
translation into a language-learning opportunity for those who participate. Watch the
following TEDx talk, in which von Ahn explains how human computation is
leveraged with both reCAPTCHA and Duolingo:
https://www.youtube.com/embed/cQl6jUjFjp4
Not all crowdsourcing efforts are entirely “noble.” Many simply want to leverage human
computing on a large scale to accomplish tasks that would be difficult for a small number
of disconnected people to complete. Listed below are a few of the most popular
crowdsourcing efforts online today. How might they inspire your own crowdsourcing efforts?
Crowdsourcing
Collecting Data
Crowdsourcing techniques do not always have to be digital. Before the Internet, it may have
been more difficult to collect data from a large number of people, but it was not impossible.
In fact, Francis Galton used crowdsourcing in 1906 to accurately estimate the weight of an ox.
While attending a livestock fair, Galton observed a contest in which participants
attempted to guess the weight of a particular ox that was on display. Out of the nearly 800
guesses made, nobody accurately estimated the exact weight of 1,198 pounds. Some
guesses were too low, while others were too high. Galton surveyed the range of guesses
and noted that out of the nearly 800 guesses, the mean (average) prediction was 1,197
pounds! While no single individual was able to make an accurate guess, the crowd, as a
whole, was surprisingly accurate.
This “wisdom of the crowd” phenomenon is not restricted to the weight of oxen, however. It
can be seen whenever a sufficiently large group of individuals is
asked to estimate an unknown quantity, and it is the basis for the use of crowdsourcing.
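The averaging step behind the “wisdom of the crowd” can be sketched in a few lines of Python. This is only an illustration: the list of guesses is made up for the example (it is not Galton's data), and the function names crowd_estimate and percent_error are hypothetical.

```python
from statistics import mean, median


def crowd_estimate(guesses):
    """Return the mean and median of a list of numeric guesses."""
    if not guesses:
        raise ValueError("need at least one guess")
    return mean(guesses), median(guesses)


def percent_error(estimate, actual):
    """Return how far an estimate is from the actual value, as a percentage."""
    return abs(estimate - actual) / actual * 100


# Hypothetical guesses, invented for illustration (not Galton's real data).
guesses = [950, 1100, 1250, 1300, 1180, 1210, 1400, 1020, 1150, 1340]
actual = 1198  # the ox's weight in pounds, as reported in the text

crowd_mean, crowd_median = crowd_estimate(guesses)
print("Mean guess:", crowd_mean)
print("Median guess:", crowd_median)
print("Error of the mean (%):", round(percent_error(crowd_mean, actual), 2))

# Compare the crowd with its single best member.
best_individual = min(guesses, key=lambda g: abs(g - actual))
print("Closest individual guess:", best_individual)
```

In this made-up sample, the crowd's mean lands closer to the actual weight than any single guess does, which is the pattern Galton noticed. With real data the result depends on how many people guess and how independent their guesses are.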
We can crowdsource an estimated quantity using Google Forms and Spreadsheets to help
gather a large number of responses. Make your best estimates and submit them to the
collection. Then, examine the crowdsourced collection of estimates, and see how close
the average estimate is to the actual amount.
Guess the number of jelly beans
UNIT TOPIC:
Models and Simulations
Models and Simulations
Models and Simulations
Models
Models allow us to create a physical or virtual representation of an object. Above, you can
see that the concept for the Etihad Towers in Abu Dhabi began as a computer-generated
model using a CAD (Computer Aided Design) program such as Autodesk AutoCAD
Architecture or Graphisoft ArchiCAD. As the architect refines the model, he or she can test
strength, stress, and other important factors before beginning construction. Models can also
be physical representations of real-world objects such as a scaled version of the Rotherhithe
Bridge shown.
Simulations
There exists a branch of computer science that exclusively focuses on the design and
implementation of computer-generated models to test situations that may be too dangerous,
costly, or otherwise inconvenient to test in the real world. Simulations use models to test a
hypothesis about a situation. For example, what would happen if an object (like a bird) flew
into a jet plane midflight? Thankfully, engine manufacturers do not have to spend time and
resources throwing random objects (like birds) into a jet engine to see the effects on the
engine. Models and simulations facilitate the formulation and refinement of hypotheses
related to the objects or phenomena under consideration. Simulations allow hypotheses to
be tested by mimicking real-world events without the cost or danger of building and testing
the phenomena. Then, hypotheses can be refined by examining the insights that models and
simulations provide.
Models (and simulations using those models) are useful tools in a decision-making process
and can save time and money. In Unit 1, Computational Thinking, you used a simulation in
the activity about Heuristics called Hill Climb. That Processing program is a simulation in which
a robot looks for the highest relative point in a randomly generated environment and
relocates there. If we really did have these gold mining robots, it would be much easier to
run a simulation to test our algorithm than to wait to see how many robots survived the next
flood.
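To make the idea concrete, here is a minimal Python sketch of a hill-climbing simulation. It is not the Processing program from Unit 1. It assumes a simplified one-dimensional terrain stored as a list of heights, and the names hill_climb and success_rate are hypothetical.

```python
import random


def hill_climb(terrain, start):
    """Move to the higher neighbor until no neighbor is higher.

    Returns the index where the robot stops (a local high point).
    """
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1)
                     if 0 <= i < len(terrain)]
        if not neighbors:
            return position
        best = max(neighbors, key=lambda i: terrain[i])
        if terrain[best] <= terrain[position]:
            return position  # nothing higher nearby, so stay put
        position = best


def success_rate(trials, size, seed=0):
    """Run many simulated robots and report how often one reaches
    the highest point of its randomly generated terrain."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        terrain = [rng.randint(0, 100) for _ in range(size)]
        start = rng.randrange(size)
        stop = hill_climb(terrain, start)
        if terrain[stop] == max(terrain):
            hits += 1
    return hits / trials


example = [1, 3, 5, 4, 2, 6, 9, 7]
print(hill_climb(example, 0))  # stops on the smaller hill at index 2
print(hill_climb(example, 4))  # reaches the tallest hill at index 6
print(success_rate(trials=1000, size=20))
```

Running a thousand virtual robots takes a fraction of a second, which is the kind of rapid, repeated testing that would be impractical with real robots. The first example run also shows the known weakness of this heuristic: the robot can get stuck on a small hill that is not the highest point.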
How are video games saving the government money? Read the following article to see how
“Better Simulation Could Save the Military Millions.”
Simulations can facilitate extensive and rapid testing of models. This allows companies and
other organizations to iterate, or repeatedly adjust their models based on previous test runs,
by simply running a program instead of spending the time, money, and effort of doing
the thing in real life. For example, jet engine companies, like Rolls-Royce or General
Electric, can heave virtual birds into virtual engines without ruining a $35 million product or
removing the “no animals were harmed” stickers from planes to iteratively design the best
possible fowl-free engine.
The time required for simulations is impacted by the level of detail and quality of the models
and the software and hardware used for the simulation. Most of you would probably agree that a
program that models a jet engine exploding takes far more time and detail to create and run
than a simple simulation created in Scratch. Models may use different
abstractions or levels of abstraction depending on the objects or phenomena being modeled.
On The Horizon
Computer scientists use information about conditions to create and study a wide range of
complex systems. Analysis from the results of simulations is usually able to generate new
knowledge and new hypotheses related to the phenomena being modeled. According to the
National Institute of Biomedical Imaging and Bioengineering (NIBIB), computational
modeling has benefited society in many ways. Like weather forecasting and earthquake
predictions, the following are just some of the examples related to computational modeling
that the NIBIB is busy researching:
With continued advancement in computational models, one day you could be flying to work!
UNIT PROJECT:
TEDxKinda
Highlights
You will collaborate in groups to analyze public data sets and extract insightful
information and new knowledge using a number of big data analysis techniques and
tools.
You will evaluate and justify the appropriateness of your chosen data set(s).
You will construct informative and aesthetically pleasing data visualizations.
You will write a script and prepare speaker notes for a formal presentation of your
findings.
You will cite all online and print sources used in your research and presentation
preparation.
You will deliver a TED-style presentation discussing your data analysis and findings
using appropriate terminology.
TEDxKinda Project: Rubric Check
Feedback
This activity provides you and a partner group time to provide one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Unit 6: Innovative Technologies
UNIT 6
Innovative Technologies
You will begin by exploring many of the key roles that technology plays in your
life, including social networking, online communication, search, commerce, and
news and examining the ways these ever-evolving technologies have impacted
individuals and societies in recent years. With so many of these technologies
relying on the Internet to connect users and data across varied and remote
locations, you will then “take a peek under the hood” to examine the systems
and protocols that make up the global infrastructure of the Internet. Finally, you
will turn your attention to the past, present, and future of computing to begin
imagining the technology that might exist in your future and the role that you
might play in bringing it about.
UNIT PROJECT:
Future Technology
Highlights
You will collaborate in pairs to envision and design a future innovation in technology.
You will discuss and identify a specific purpose that your innovation will serve (e.g.,
entertainment, problem solving, education, artistic expression, etc.) and its key
features.
You will evaluate the potential benefits and risks of your innovation.
You will identify existing technological resources that your innovation may utilize.
You will identify technological challenges that must be overcome before your
innovation can be fully realized.
You will develop a mock-up of your innovation that demonstrates its use and
functionality.
You will write a detailed product description and deliver an elevator pitch to the class
detailing the features of your innovation and its potential impact on society using
appropriate terminology.
You will provide written feedback to your peers on the potential of each collaborative
team’s design.
Future Technology Project
“If I had asked people what they wanted, they would have said
faster horses.” – Henry Ford
https://www.youtube.com/embed/AlmASHISmTM
Whether evolutionary or revolutionary, the impact that new technology has on our lives can
be quite profound. Not only do these technological advances bring us new tools that help to
make us more efficient or productive, but they oftentimes can completely change our daily
routines and radically alter the ways we interact with the world around us.
What is the most revolutionary technological innovation in the last five years?
Make a list of at least 10 technological advances from the last five years. For each item on
your list, decide whether it is evolutionary or revolutionary and consider how it has
influenced people’s lives in the years since its introduction. Be prepared to share and discuss
your findings with the class.
Technological Advances
Sometimes, innovation comes about solely from creative thought and imagination. However,
more often than not, innovative ideas also rely on essential advances in technology. Without
these achievements, even the most creative idea might not be feasible. Here, scientists and
engineers push technology further by broadening scientific knowledge and inventing new
tools and processes that others may use in bringing their ideas to reality.
For example, Facebook and other social networking sites would not be possible were it not
for a host of other underlying technologies that make those sites possible, including the
World Wide Web, TCP/IP networking protocols, web servers, web browsers, mobile phones,
digital cameras, and of course, computers just to name a few. Until all of those technologies
had been developed, what we know of as Facebook could never have existed.
Imagine your typical cell phone and all of the things that it can do. Make a list of at least 10
underlying technologies that had to exist before that phone could have been built. Then
identify at least five things that you do all the time that you would not be able to do if cell
phones did not exist. Be prepared to discuss your lists with the class.
Make a list of at least 10 forms of digital technology that directly impact your own life. As you
make your list, try to identify examples that you think nobody else in the class will think to list
themselves, but that at least one other person in the class will agree impacts them in the
same way that you have described. Afterward, each student in the class will have the
chance to name one example from his or her list to see if anybody else in the class shares
that same relationship with that form of technology.
Assignment
To begin your collaboration on this project, you and your partner should work together to
identify recent technological trends and use those trends to imagine what might very well
come next.
For starters, consider how technology has already advanced during your own life so far.
What products, services, or technologies do you now rely on every day that did not exist
when you were born? How have those technologies changed over time? Has the change
occurred practically overnight or has it evolved gradually? How will these technologies
continue to change in the near future? What do they allow you to do today that you could not
do in the past? What can you still not do today that you hope to maybe do in the future?
These are the types of questions that futurists ask themselves when they ponder the
possibilities that the future likely holds. As a budding young futurist yourself, your task is to
use your imagination and your own, personal aspirations and desires to envision one or
more of these possibilities and identify ways in which you might be able to someday change
the world.
For this assignment, you will need to perform a series of tasks in the process of designing
and documenting your own future innovation and its potential impact on society:
elevator pitch
At the end of this unit, you and your partner will deliver a 2–3 minute “elevator pitch” in which
you will demonstrate your idea to the rest of the class and attempt to inspire them to get behind
your vision of the future. After each presentation, you will also provide written feedback to
your peers on the potential of the innovations they presented.
Submission
Your submission will be in the form of a written report and an “elevator pitch” presentation that
you will give to the class that introduces your vision of the future.
Purpose: Describe a problem that your innovation seeks to address and what purpose
it will ultimately serve (e.g., entertainment, problem solving, education, artistic
expression).
Description: Provide a detailed description of your idea, what it would look like, and
how it would work or be used.
Features: Provide a list of key features that your innovation will possess.
Benefits: Identify multiple ways that future individuals and/or communities will benefit
from your technological innovation.
Risks: Identify the potential hazards that your innovation might introduce in society
(e.g., security, privacy, social disparity, physical harm).
Technological Resources: Identify existing technological resources that your
innovation might utilize or rely upon.
Technological Challenges: Identify any technological challenges or limitations that
must be overcome before your innovation can become a reality.
For your “elevator pitch,” you should prepare the following items:
Rubric

Weighted: 20%

Features
Provide a comprehensive list of key features that the innovation will possess.
Provides a comprehensive list of features.
Provides some description as to the purpose of these features.
Weighted: 10%

Weighted: 20%

Technological Resources
Identify a comprehensive list of existing technological resources that your innovation might utilize or rely upon.
Provides a list of technological resources that the innovation relies upon.
Lists reasons their innovation relies upon these resources.
1 pt / 0 pts
Weighted: 10%

Presentation Quality
The presentation is easily heard, engaging, and incorporates appropriate visualizations.
Presenters speak loudly and clearly.
Presenters are engaging and/or entertaining.
Visualizations appropriately support the presentation.
2 pts / 0 pts
Weighted: 20%

Presentation Length
Presentation must span 2–3 minutes.
Presentation is between 2–3 minutes long.
1 pt / 0 pts
Weighted: 10%

TOTAL 10 pts
UTeach Computer Science, http://uteachcs.org © 2016 The University of Texas at Austin
UNIT TOPIC:
Everyday Computing
Social Networking and Communication
You will explore the ways that innovations in digital technology can impact the lives of
individuals and communities.
You will analyze the role that digital technology plays in your everyday life.
You will analyze the role that digital technology plays in your social communications
and interactions.
You will explore the impact that instant access to global search, news, and information
has had on individuals and communities.
Cloud Computing
You will investigate the socioeconomic causes and effects related to the digital divide.
Social Networking
Yesterday’s Technology
While it is becoming increasingly difficult to comprehend, there
was actually a time in which most of the modern-day
conveniences we take for granted did not yet exist. Things like
Facebook, Twitter, Instagram, and Snapchat, for instance, are
all relatively new phenomena that have rapidly taken over our
culture and forever changed the ways that we connect and
interact with one another. But each of these and many other
similar technologies now form such an integral component of our digital lifestyles that it is
hard to imagine how we might have ever been able to function without these everyday
resources.
The truth is, this has always been true about all forms of technology throughout the history of
human civilization. All revolutionary forms of technology change people’s behavior in ways
that make the past seem crude and arcane. Modern digital technology is no different. And
even today’s cutting-edge innovations will one day feel slow and laborious when something
even better inevitably comes along.
Social Structures
One of the most empowering features of electronic social media has been its ability to create
a sense of community, especially in places or situations where such a community could not
have otherwise existed. With the global scope and widespread reach of social media,
previous obstacles like geography, age, or socioeconomics that isolated groups of people
from one another are no longer barriers. Individuals with shared interests, but whose paths
would never have crossed in the “real world,” are able to come together, communicate, and
interact in the virtual world of an online social network.
This increased ability to find someone who shares the same interests as you allows
marginalized individuals to experience a unique sense of belonging. Likewise, the
centralized nature of social networking environments enables new and unique special
interest groups and facilitates coordination of group projects and collaboration in a way that
was previously difficult, if not impossible.
“I would expect that next year, people will share twice as much
information as they share this year, and next year, they will be
sharing twice as much as they did the year before.”—Mark
Zuckerberg, November 2008
The rise in popularity of social networking in recent years has radically altered people’s
perceptions of privacy and their willingness to “share” what previously had always been seen
as personal and/or private information. In 2008, shortly after Facebook opened up its service
to the public at large, Mark Zuckerberg made headlines with his bold assumption that people
would be increasingly willing to share anything and everything about themselves, regardless
of privacy issues. At the time, most critics scoffed at Zuckerberg’s naiveté and arrogance to
make such an assumption. However, through the growth and popularity of services like
Facebook and Twitter, users, and time, have proven Zuckerberg to be surprisingly correct.
This raises the question of whether people of the past were actually private for privacy’s
sake or if their reluctance to publicly share personal information had more to do with the
simple lack of any effective way to do it. After all, before text messaging, tweeting, or blogs
existed, there really was no effective way for individuals to reach out to “the world” and
express themselves. It was the invention of a set of new, digital technologies that suddenly
enabled this ability to freely and broadly share oneself openly with those who might listen.
Think about that for a second. A 24-year-old college dropout with a unique and controversial
vision of the future created a website and forever changed an entire society’s attitudes
toward personal privacy and public sharing—all through computational technology.
In fact, a recent study by the Pew Research Center shows that an increasing number of
Americans are now getting their news from social sites like Facebook and Twitter. This
marks a radical shift in the way information is disseminated throughout the public and,
perhaps more importantly, who controls that flow of information. Traditionally, as the only
common source of news, the journalistic community has
served to filter and shape the news as it sees fit. However,
with the free and open access for anybody to share breaking
events, a much more diverse set of voices has emerged.
Assignment
When you go home, find a parent, grandparent, or other adult from an earlier generation with
whom you can sit down for a few minutes to discuss the different forms of social interaction
and networking that they grew up with.
You should ask questions that will allow you to compare and contrast their experiences with
your own. Some ideas for things you might discuss include the following:
Write a short summary of the things you learn from your conversations that includes the
following items:
Identify five aspects of social interaction that have fundamentally changed since they
were your age.
Identify five aspects of social interaction that are more or less the same as when they
were your age.
Describe the most surprising thing that you learned about social interaction in the past
and explain why it was so surprising.
Identify one way that you think social interaction might change between now and the
next generation (i.e., in 20–30 years).
Models of Sharing
It’s All About Who You Know
Whether you are a developer of a social network or merely one of the countless users of
such a network, it is important to recognize that not all social structures are the same. There
are many different types of relationships that individuals might have with one another and
there are many different ways that they might want to either limit or encourage interaction
between each other.
For example, there are many different ways of “knowing” somebody. You might know about
George Washington or Thomas Jefferson from history class, but do you know them
personally? Likely not, as they both died roughly two centuries ago. Similarly, you might know
your school counselor, nurse, or principal, but do you know any of them outside of school?
Better still, do any of them know you? And if so, to what degree? The school nurse might
know about your peanut allergy, but does she know what you did on your last family vacation
or the name of your favorite author or musical artist? What about your best friends? How
well do they know you? How well do you know them? Do you share everything? Or do you
only share some things? Likely, you choose to share a limited set of details with certain
individuals or groups while sharing a different subset of your life with other individuals or
groups.
Today’s most popular online social networks and self-publishing platforms exhibit this
diversity of “knowing” in their very design. Each service is designed to model a very specific
form of relationship that you or anybody else might have in your everyday, offline lives.
When it comes to designing or using a social network, it is critical to understand how the
service might handle the issue of privacy. On one end of the spectrum, a service might
choose to make all information publicly available to anybody at any time. At the other end of
the privacy spectrum, all information is securely locked down and made available only to the
owner or creator of the information. The very nature of a “social” medium that involves the
interaction and exchange of ideas between multiple people would suggest that the latter
extreme is inappropriate for such a service. However, the former extreme in which
everything is publicly accessible might be too public. Most cases call for a solution that falls
somewhere in between the two extremes. That is, users usually want to be able to restrict
what they share to their intended audience on more of a case-by-case basis.
Public
In its original conception, the World Wide Web was seen as a one-to-many
publishing platform. Website authors would publish their content to a
publicly hosted web server, where anybody with a web browser could access it and retrieve
a copy of the published materials. By default, the content hosted at a website was
considered completely public (i.e., available to anybody at any time).
Private(ish)
On the other hand, the email (or “electronic mail”) protocol was designed to facilitate private
communication. That is, email is shared communication that is intended only for the very
specific recipient(s) to which it is sent. Ironically, while access to an email message is
usually limited to the recipient, the Simple Mail Transfer Protocol (SMTP), which defines how
email transmission is handled, does not by itself ensure privacy. In fact, as an email
message is transmitted from node to node throughout the public network of routers and
gateways across the Internet, the contents of the messages are exposed in plaintext. In real-
world terms, email is private in the way that a postcard sent through the US Postal Service is
private—in the end, only the addressee receives the card, but any mail handler who looks at
the back of the card can see and read its contents.
Private
A better example of truly private communication (to the extent that that is not an oxymoron)
might be Skype or Google Hangouts. Like email, these forms of communication allow one
individual to specify their intended contact(s) and limit the conversation to just those parties.
Unlike email, however, these video chat services employ encryption so that the digital
information that makes up the audio and video signal is encoded before it is transmitted
across the Internet and then decoded only after it reaches its destination. This means that
the two people on either end of the conversation can see and hear everything being said,
but for any eavesdropping router along the way between them, the signal will be
unintelligible. In this case, the use of encryption is the key factor that enables true privacy
with video chat systems. However, this does not prevent either party on either end of the
conversation from recording the chat session and later sharing it with others—a privacy
vulnerability that even something like Snapchat cannot solve.
famous, by definition, it means that millions of people suddenly know who they are. But the
reverse cannot be said. Becoming famous does not ensure that the celebrity knows each
and every one of their millions of adoring fans. Almost all of those millions of relationships
are one-sided. These are examples of asymmetric relationships.
In enabling users to interact with one another, online social networks encourage and
regulate the types of activities their users can engage in based on the nature of the type of
relationship the network is designed to serve. A symmetric system is built upon the idea that
interactions will involve two-way exchanges, often of a more personal nature and in which
both parties are more or less peers. In contrast, an asymmetric system assumes a largely
one-way flow of information that often involves an impersonal delivery of information.
Which model an online service chooses, whether symmetric or asymmetric, will dictate the
primary characteristics of the network, the engagements it offers, and the users it aims to
serve. And of course, many services that start out as one type might, over time, adopt
features of the other type, resulting in a hybrid system that is sometimes symmetric and
other times asymmetric.
Symmetric
Facebook was built to be a symmetric network. The very idea of “friending” involves mutual
actions by both parties in the friendship. One person must first choose to “Add Friend” and
then the second person must accept the friend request. Both parties must take action to
create the relationship in the Facebook system and both parties then have equal access to
the features that such a relationship enables within the Facebook ecosystem. In every way,
the relationship is mutual, two-way, and symmetric. Of course, over time, Facebook has
added features that introduce more asymmetric behaviors, such as the ability to selectively
restrict which friends can see/interact with a post. Nevertheless, the service is still primarily
true to its symmetrically designed origins.
Asymmetric
Twitter, on the other hand, was built to be an asymmetric network. Unlike Facebook’s friend-
based approach where all users are on par with one another, Twitter uses a publisher-
subscriber model. One user “tweets” (publishes) a message that is then automatically sent
to each of their “followers” (subscribers). For public accounts, these relationships are one-
way, requiring action by only one party to create the relationship. Namely, the subscriber
chooses to “Follow” the posts from another user. Unlike Facebook, the publishing user does
not need to “accept” or explicitly permit the relationship to be established. And like
Facebook, Twitter has also evolved into a hybrid system that incorporates some symmetric
aspects through the use of direct messages, protected accounts, and per-user blocking.
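The difference between the two relationship models can be sketched as a pair of small data structures. The class and method names below are hypothetical and say nothing about how Facebook or Twitter is actually built. The sketch only shows that a symmetric link needs action from both users, while an asymmetric link needs action from one.

```python
class SymmetricNetwork:
    """Friend-style model: a link exists only after both users act."""

    def __init__(self):
        self.pending = set()   # (sender, receiver) requests awaiting a reply
        self.friends = set()   # unordered pairs stored as frozensets

    def add_friend(self, sender, receiver):
        """Step 1: one user sends a request."""
        self.pending.add((sender, receiver))

    def accept(self, receiver, sender):
        """Step 2: the other user accepts, creating a mutual link."""
        if (sender, receiver) in self.pending:
            self.pending.remove((sender, receiver))
            self.friends.add(frozenset((sender, receiver)))

    def are_friends(self, a, b):
        return frozenset((a, b)) in self.friends


class AsymmetricNetwork:
    """Publisher-subscriber model: following needs no approval."""

    def __init__(self):
        self.following = {}    # follower -> set of publishers

    def follow(self, follower, publisher):
        self.following.setdefault(follower, set()).add(publisher)

    def follows(self, follower, publisher):
        return publisher in self.following.get(follower, set())


friends = SymmetricNetwork()
friends.add_friend("alice", "bob")
print(friends.are_friends("alice", "bob"))  # False: bob has not accepted
friends.accept("bob", "alice")
print(friends.are_friends("bob", "alice"))  # True: the link works both ways

feed = AsymmetricNetwork()
feed.follow("fan", "celebrity")
print(feed.follows("fan", "celebrity"))     # True
print(feed.follows("celebrity", "fan"))     # False: the link is one-way
```

Storing a friendship as an unordered pair and a follow as a directed entry is one simple design choice among many. A hybrid service would need both kinds of record.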
Encryption
Encryption methods have traditionally been described as either symmetric or asymmetric.
When we talked about the Caesar Cipher in Unit 1, Alice and Bob both had the same
information to decrypt a ciphertext (encrypted data). Since both parties use the same key to
encrypt and decrypt data, we would consider this symmetric encryption. If Alice and Bob
used public key encryption to encode and decode data, they would be utilizing asymmetric
encryption. Public key encryption uses a pair of keys: a public key (made
available publicly) and a private key (known only by the owner). Data encrypted with the
public key can be decrypted only with the matching private key. Certificate authorities (CAs)
issue digital certificates that validate the ownership of encryption keys used in secured
communications and are based on a trust model.
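The contrast between one shared key and a public/private key pair can be sketched in Python. The Caesar cipher part uses the same key to encrypt and decrypt. The second part is a toy RSA-style example with deliberately tiny primes chosen for illustration, so it is not secure and is not how real software manages keys. The function names caesar and rsa_apply are hypothetical, and the pow(e, -1, phi) call assumes Python 3.8 or later.

```python
def caesar(text, key):
    """Shift each ASCII letter by `key` positions (symmetric: the same
    key, negated, reverses the shift)."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


# Symmetric: Alice and Bob share the single key 3.
shared_key = 3
secret = caesar("HELLO", shared_key)
print(secret)                       # KHOOR
print(caesar(secret, -shared_key))  # HELLO

# Asymmetric (toy RSA with tiny primes, for illustration only).
p, q = 61, 53
n = p * q                   # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent (modular inverse of e)
public_key = (e, n)         # anyone may know this
private_key = (d, n)        # only the owner knows this


def rsa_apply(number, key):
    """Raise `number` to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65
ciphertext = rsa_apply(message, public_key)     # anyone can encrypt
recovered = rsa_apply(ciphertext, private_key)  # only the owner can decrypt
print(recovered)                                # 65
```

In the symmetric case, anyone who learns the shared key can read the message. In the asymmetric case, publishing the public key does not reveal the private key, at least when the primes are large enough that factoring n is impractical.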
Together, the ARPANET and Internet provide excellent examples of the differences between
closed and open systems. Closed platforms are often referred to as being “proprietary”
because they are owned, managed, and operated by a single owner or proprietor. While that
owner might choose to allow others to access and use their platform, they still maintain final
control over the system, how it is used, and how it evolves. Open platforms, however, do not
have a single owner calling the shots. Instead, they might have a centralized committee of
interested partners to coordinate and manage the development and growth of the platform.
There are advantages and disadvantages to both approaches, but the ultimate choice of
whether a platform should be open or closed comes down to the goals that drive the
development of the system. In the case of ARPANET, the US military had ample government
funding to build and operate the network as well as the need for a secure and reliable means
of maintaining communication in the event of a nuclear attack. As such, it was their system
that they built with their own funds and for their own needs, so they kept it to themselves
(i.e., a closed network). By the 1980s, the technological and economic advantages of
making a global electronic network available to the commercial market merited opening up
the network to more civilian uses by businesses and individuals.
Open
Open source and licensing of software and web content raise legal and ethical
concerns. Open source code is code that is publicly available for anyone. Unlike Apple’s
super secret source code for Mac OS X, open source software allows programmers to view,
reuse, and remix the source code for individual use. Though this source code is generally
publicly available, there are still rules that govern its use (or reuse). As with any material
that you have not created, you must abide by the terms and conditions of use. Typically,
these terms can be found alongside the source code or in comments within
the program.
Email and web publishing are both examples of open platforms that have been created
around open standards. Any developer can create an email application that conforms to the
various standards for handling email (SMTP, POP, IMAP). The program can then send email
to and receive email from any other email user anywhere on the Internet no matter which
email client (program) they might be using. Without these open standards, there would be no
guarantee that any message that you write and send would be compatible with the software
being used by your intended recipient. But with these standards, all emails created by any
standards-compliant email client are guaranteed to be fully compatible with all other such
clients, thus enabling the global communications system that we have come to rely on. Open
standards also allow us to ensure cryptography is secure for Internet encryption. Nataraj
Nagaratnam, IBM Distinguished Engineer, explains how these open standards reinforce
security.
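To make the idea of a shared message standard concrete, the following minimal Python sketch uses the standard library's email package to write a message in the common Internet message format and then read it back, much as a different email program might. The addresses, subject, and function names are made up for illustration, and no mail is actually sent.

```python
from email import message_from_string
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> str:
    """Return the standard text form of an email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg.as_string()


def read_message(raw: str) -> dict:
    """Parse raw message text, as any standards-aware client could."""
    parsed = message_from_string(raw)
    return {
        "from": parsed["From"],
        "to": parsed["To"],
        "subject": parsed["Subject"],
        "body": parsed.get_payload().strip(),
    }


# Hypothetical example addresses and text.
raw_text = build_message(
    "alice@example.com", "bob@example.org", "Lunch?", "Are you free at noon?"
)
print(raw_text)                 # headers, a blank line, then the body
print(read_message(raw_text))
```

Because the writer and the reader agree only on the format of the text, either side could be replaced by a completely different program without the other noticing.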
Closed
While services like Facebook and Twitter might be publicly available and free to use, they
are still closed, proprietary systems. Each company controls the data that they store and
strictly regulates how that data can be accessed and modified by its users. For example,
users cannot easily transport their tweets, status messages, comments, chat histories, or
friends lists to other competing services. Similarly, the ability to integrate these services into
other, third-party apps or sites is strictly limited to what Facebook or Twitter choose to allow.
Having this level of control over their platform gives each company the ability to more
reliably build, develop, and monetize their platform, but it comes at the expense of user
choice and compatibility.
Assignment
As a class, make a list of all of the different services and methods that people can use to
communicate and/or interact with one another online. Then, working in pairs, decide whether
you would classify each of the services as either public or private, open or closed, and
symmetric or asymmetric. Be prepared to explain and defend your choices.
Example: Twitter is public, asymmetric, and closed.
UTeach CS Principles Unit 6: Innovative Technologies
UNIT TOPIC:
Everyday Computing
Search, Wikis, Commerce, and News
You will explore the impact that instant access to global search, news, and information
has had on individuals and communities.
Essential Services
The New Normal
As computational technology has grown over the last several decades, it has come to play
an increasingly greater role in our modern lifestyle. What was once limited to a select few
(e.g., wealthy corporations, electrical engineers, scientific researchers, etc.) is now a
commonplace commodity that nearly everybody uses and relies upon to one degree or
another.
Today’s technology is ubiquitous. From the smart phone in your pocket to the cash register
at your neighborhood store to the traffic signals at a nearby intersection, the products of
computer science are everywhere. Everything around us collects, stores, and/or manipulates
digital data to record information, compute results, and make decisions about how things
should work. We cannot escape the influence of digital computing in our everyday lives.
And that has had a profound impact on the ways that societies function and has altered the
ways that individuals behave and make decisions. The influence of computational
technology is so pervasive that people increasingly take it for granted, especially younger
individuals for whom this technology has always existed.
One of the goals of this course has been to open your eyes to many of the computational
influences that surround you and show you how to harness the capabilities of these tools
and resources.
Utility or Luxury?
As computing becomes more pervasive and individuals integrate it into their daily lives, we
grow increasingly dependent upon this technological resource. More and more of our
infrastructure, whether it is banking, shopping, medical records, communication, or basic
utilities like electricity and water, is being built around and optimized to rely upon
computational technologies.
One of the issues this raises is the question of what will happen if this infrastructure
collapses or is taken away for some natural, political, or economic reason. How will we
function if the Internet “goes out”? What happens to those who cannot afford to pay for
network access? These are just a few of the serious questions that arise as we integrate this
technology into our lives.
At some point, a technology becomes so essential to the normal functioning of a society that
it elevates from a mere convenience or personal luxury to an essential public utility like water
and electricity. In recent years, many policymakers have begun to make the argument that
access to the Internet is rapidly becoming such an essential utility and that it is something all
individuals should be entitled to. In fact, many would argue that we are already there and
that the law is simply lagging behind.
Later in this unit, when we discuss a few high-profile social and political issues like the digital
divide or net neutrality, we will look more closely at a number of the questions that society
will need to address as the advances of digital technologies continue their forward march of
progress.
Search
Search
In 1998, Larry Page and Sergey Brin launched Google, an online search engine that was
driven by the PageRank algorithm (named after Larry Page) that the two had developed
while students at Stanford University. While Google was not the first search engine on the
Web (at least a dozen popular search engines were developed in the half decade before
Google), it was the first to use the relative connectedness of webpages to rank and prioritize
search results, an approach that quickly built Google into one of the leading sources of
online search. Google’s success has been further cemented by the verbification of its name.
Today, in popular language, to look up answers online is to “Google it.”
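The idea of ranking pages by their connectedness can be sketched in a few lines of Python. This is a simplified, textbook-style version of the PageRank idea and not Google's actual implementation; the tiny four-page "web," the damping value of 0.85, and the number of iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from who links to whom.

    links maps each page name to the list of pages it links to.
    Every linked page must also appear as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal importance

    for _ in range(iterations):
        # Every page keeps a small baseline share of importance.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links shares its rank with everyone.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini web: three pages link to "home", so it should rank highest.
web = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home", "news"],
    "blog": ["home"],
}

for page, score in sorted(pagerank(web).items(), key=lambda item: -item[1]):
    print(f"{page:5s} {score:.3f}")
```

A page ranks highly here when many pages link to it, and even more so when those linking pages rank highly themselves.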
The very fact that people routinely turn to search engines like Google, Bing, DuckDuckGo, or
even Siri whenever they have a question about something is a testament to the power of
search and the value it adds to our lives. Before the Web, information was not as readily
available to the mass public. Much of the knowledge and information that society possessed
was either private, undocumented, or locked up in books buried in a local library. Immediate
access to diverse ideas and resources was not available to the degree that it is today. And
search engines, like the Dewey Decimal card catalog system of libraries, provide an efficient
interface for indexing and finding obscure and relevant bits of information.
The easy access to any and all types of information has profoundly altered individuals’
behaviors, especially when it comes to learning. If there is anything a person might want to
learn about or any skills that they might want to develop, the Web has made the tools and
resources needed to acquire that knowledge readily available to anyone who is interested.
But while online search has increased the amount that we can know, it has also reduced the
amount that we need to know. No longer is there a need to learn and remember infrequent
details. If there is anything important that you need to know later, you can “just Google it.”
Wikis
Wikis
While online search has made it easier to catalog and index the wealth of knowledge to be
found across the entirety of the Web, wikis have consolidated this vast volume of information
into well-organized online references built around user-based communities.
Created in 1994 by Ward Cunningham, a wiki is a platform in which multiple users are able
to collectively contribute to a shared knowledge base. The crowdsourced nature of wikis
gives individual users the ability to shape and inform the content in authentic ways that
traditional information sources historically have not. Rather than presenting its information
through the filter or with the bias of a centralized editorial control, wikis rely on peer-based
writing, fact checking, editing, and moderation.
Many people find it uncomfortable that any random user can edit the content in a wiki
unchecked, assuming that such lack of oversight reduces the reliability of the site. However,
most wikis develop strong communities of dedicated volunteers to moderate the content on
their sites and help to keep vandalism and other disruptive behavior in check. As Ward
Cunningham himself has said, “Wikis work best in environments where you’re comfortable
delegating control to the users of the system,” although he has also stated that, “With wiki,
you have to trust people more than you have any reason to trust them. In 1995, it was a
safer environment, don’t know if I could have launched wiki today.”
While Wikipedia is perhaps the most well-known and visited wiki, it is by no means the only
one. Across the Web, thousands of wikis have been created that specialize in a broad range
of special interests serving smaller, often underrepresented populations, giving each one a
global voice that they would not have otherwise had.
Commerce
Commerce
"Why leave the comfort of your own home when you can get
something custom made to your exact size for less? I believe this
is the future.” —Patrick Curtis
Another phenomenon that has emerged from the growth of the Internet is e-commerce, or
electronic commerce. With its global reach, the World Wide Web is a powerful tool for those
who have something to sell. No longer is a retailer’s customer base limited to the local area
within driving distance of a storefront. Now, the entire global online population can be
potential customers.
Online Storefronts
Without the need for the physical presence of a so-called “brick and mortar” store, online
retailers are often able to offer more efficient services and better pricing. Resources that
would traditionally be spent on an elaborate showroom and hired salespeople can instead
be redirected to an automated online storefront, improved products, and discounts on
products and/or shipping costs.
Many consumers have embraced the convenient and hassle-free opportunity to shop online,
just as previous generations did with mail-order catalogs. Only now, the online “catalog”
offers a much richer and more informative shopping experience through the use of
multimedia, user reviews, and personalized customer recommendations based on previous
buying behavior.
Independent Sellers
Commerce is another area in which the democratizing effects of the Web can reveal
themselves. With online auction sites like eBay or e-commerce sites for artisans like Etsy,
individuals now have access to the same global market as the large, corporate retailers for
selling their wares. This has sparked a boom in both the supply of and demand for custom-
made and limited-production runs of unique new products and services that were either not
available or not feasible before the advent of an e-commerce market.
Crowdfunding
For those independent sellers who envision creating a sustained business or a complete
new product or service, crowdfunding uses online access to customers as a means of
funding their project. In the past, clever entrepreneurs who had a brilliant idea for a new
product might never have been able to bring that product to market due to lack of start-up
funds (i.e., it takes money to make money). Sites like Kickstarter and Indiegogo were
created to give these innovators a way of reaching out to potential customers and
recruiting them as backers who might help make their idea a reality by providing initial
“investment” funding.
For example, brick and mortar retailers using credit card scanners and electronic cash
registers are vulnerable to attacks through these network-connected devices. These
automated systems are desirable targets for attackers simply because their very function is
to store and process large volumes of valuable information, such as credit card numbers and
customer profile data. In recent years, a number of major retailers have accidentally
exposed millions of customers’ financial data when their systems were attacked.
A major focus of research in the computing industry centers on encryption techniques and
standards for protecting sensitive data and on the design of more robust systems and
procedures for making attacks more difficult to perform. Technologies such as Google Pay,
Apple Pay, and other mobile payment systems are some of the most recent attempts to
strengthen the security of electronic financial transactions.
UNIT TOPIC:
Everyday Computing
Cloud Computing
Cloud Computing
Everything Old is New Again
The term “cloud computing” is a relatively recent buzzword for a new type of computing.
Thanks to the Internet, Wi-Fi, and other modern networking technologies, individual users
can “offload” much of their computational and data storage efforts onto remotely hosted
servers and online services. This has helped to make smaller, more portable computing
devices like laptops, tablets, and phones more practical because the device itself does not
need to do all of the work. Instead, the heavy lifting can be handed off to “the cloud.”
Believe it or not, this is not a new idea in computing. In fact, decades earlier, long before the
availability of public or private Internet access, many computers were large enough to fill entire
rooms. While the actual devices that people used might have looked like today’s desktop
buildings. While the actual devices that people used might have looked like today’s desktop
computers, they were actually just simple interfaces, so-called “dumb terminals” or “thin
clients,” that connected remotely to the actual computer or server located elsewhere, much
like our phones and tablets connect remotely to web services.
Of course, yesterday’s “dumb terminals” have since given way to today’s “smart phones,”
which have far more computing power in their own right. But the increasing use of and
reliance upon “the cloud” for remote computing show that recent and incredible advances in
technology have not strayed far from their predecessors.
Client-Server Model
As with the “dumb terminals” of the past, today’s cloud computing is built upon the
“client-server model.” That is, the networking process can be described according to the
interactions of two remotely located computers (or more accurately, programs running on
those computers)—namely, the “client” and the “server.”
The client stands at one end of the communication process and typically represents the
end-user. At the other end of the process is the server, a centralized computer that all
individual end users connect to. In other words, the server serves the needs of its clients
(i.e., users).
The analogy is that of a waiter at a restaurant. As the waiter (i.e., server) waits on the
customers (i.e., clients), each individual customer makes requests about which particular
food items or beverages they want. The waiter then goes back into the kitchen while the
cooks (i.e., processor, or CPU) gather the raw ingredients (i.e., data) from storage (i.e.,
database) and prepare the meal. When it is ready, the waiter returns to the customer and
serves the meal as they requested.
browser is the client. The website you visit is essentially data hosted (stored and served) on
a remote web server.
This has freed us from our desks and allowed people to create entirely new behaviors and
use cases that were impossible or prohibitively expensive before cloud
computing.
In the early days of PCs, a person’s data was trapped in a large box sitting on their desk at
home or in their office. Data that was on their computer at home could not be easily taken to
work. Data that they used at work could not be brought home. Any data that needed to be
transported from one location to another had to be copied onto physical disks that were slow
and had relatively little capacity.
With cloud computing, users can now access their data from anywhere—home, school,
work, airports, coffee shops, etc. And since wireless transmissions free us of the need for
physical media, we can now use phones, tablets, and other devices that lack physical ports
to access this cloud data as well.
Of course, one of the trade-offs of storing your data remotely is the additional costs of
transmitting your data to and from the cloud—both in terms of time and bandwidth.
Off-Site Storage
Another benefit of cloud-based systems is their use as off-site storage for your data. Anyone
who has ever used a computer either has or will experience the loss of data at some point.
Accidents happen. Hardware fails. Your data is important, but there is no way to ensure that
it will remain 100% safe, intact, and uncorrupted. While it is good practice to maintain regular
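backups of your files, photos, and other digital information, a good rule of thumb is the “3-2-1
Rule”:
Keep at least 3 copies of your data.
Store those copies on at least 2 different types of storage media.
Keep at least 1 copy off-site.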
The first two items deal with redundancy and maintaining multiple copies of your files, but
the last item avoids the issue of creating a “single point of failure.” It does no good to have
multiple backups of your files if they are all kept in the same location. A single disaster, like a
fire, flood, or earthquake that destroys your house will destroy your files and your backups at
the same time.
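As a rough illustration, a backup plan can be checked against the rule in code. The sketch below assumes the common statement of the 3-2-1 Rule (at least three copies, on at least two kinds of storage media, with at least one copy kept somewhere other than the primary location). The class, function, and example names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    label: str
    medium: str    # e.g., "internal disk", "external drive", "cloud"
    location: str  # e.g., "home", "office", "remote data center"


def satisfies_3_2_1(copies, primary_location):
    """Check a list of copies against the assumed 3-2-1 Rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_off_site = any(copy.location != primary_location for copy in copies)
    return enough_copies and enough_media and has_off_site


# Hypothetical plan in which everything is kept at home.
all_at_home = [
    BackupCopy("laptop", "internal disk", "home"),
    BackupCopy("USB drive", "external drive", "home"),
    BackupCopy("second USB drive", "external drive", "home"),
]

# The same plan with one copy moved to a cloud service.
with_cloud = all_at_home[:2] + [
    BackupCopy("cloud backup", "cloud", "remote data center")
]

print(satisfies_3_2_1(all_at_home, "home"))  # False: no off-site copy
print(satisfies_3_2_1(with_cloud, "home"))   # True
```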
Cloud storage helps to solve that problem by allowing users to keep a separate copy of their
data safely stored at a remote location.
Ownership of Cloud Data
Who Owns Your Data
Despite the advantages of storing one’s data in the cloud, the use of online services and
remote storage systems also comes with a number of legal risks. Specifically, the issues of
“ownership” and “access” come into play whenever any party (i.e., the cloud service) acts as
a custodian for the property (i.e., the data) of another party (i.e., the user). Unfortunately,
cloud computing is still a relatively new phenomenon and the legal distinctions about the
ownership of personal information and data are not always clear.
For example, who owns your Facebook profile? You? Or Facebook? Clearly, Facebook
hosts your profile and stores the raw data from all of your posts, comments, photos, likes,
chat histories, and friend connections on their servers, but does that mean that Facebook
owns the information making up all of that data? If asked, most users would expect that the
answer is “no,” Facebook does not own their personal data. Fortunately, Facebook
agrees. According to its terms of service, individual users “own all of the content
and information” that they post on Facebook. But those terms of service also include a
number of statements in which, by using the service, you grant Facebook permission to do a
number of things with your data that may or may not be in your best interest.
Selected excerpts from the Facebook terms of service include the following (emphasis
added):
The terms of service also clarifies, in detail, several more generalized terms that
By “information” we mean facts and other information about you,
including actions taken by users and non-users who interact with
Facebook.
Most other services have similar terms of service statements that specify the rights,
limitations, and obligations of both users and the service itself with regard to the user’s data.
Each agreement is different and is tailored to the interests of the service that wrote it, which
are not always the same as the interests of the users. As a user, it is important to
understand what rights and privileges you might be giving up when you agree to use a
particular service.
Assignment
When you sign up for a new service or product, you are often presented with a Terms of
Service agreement to read and acknowledge. It is almost always a very long document with
a lot of seemingly obscure legalese, so most people skip past it and quickly check the box
saying that they have read and agree to the terms when they have not actually read
anything. You have probably done it yourself. It is only human to do so. But what exactly are
you agreeing to when you do this? What rights might you be giving up? Even if you never
read any of these agreements, it is important to understand that the terms that are carefully
detailed in each of these agreements directly affect you, your privacy, and your rights to your
own content that you create, store, and/or use with the service. More importantly, these
agreements usually specify the limits that the service can be held to with regard to what it
can do with your data, whether it is on your behalf or in your best interests or not.
In this exercise, you are to select one of the following online cloud services and actually read
its Terms of Service agreement to see what such a document includes and then answer the
questions below.
Answer the following questions about the Terms of Service that you read:
UNIT TOPIC:
Everyday Computing
The Digital Divide
You will investigate the socioeconomic causes and effects related to the digital divide.
The Digital Divide
What Is It?
Following a 1995 study by the Markle Foundation, Lloyd Morrisett, president of the
foundation, described a “digital divide” between the haves and have-nots when it comes to
access to information. Essentially, the study found that the same racial and cultural barriers
that impact societies offline also have a similar effect on their access to online resources. As
a result, some have likened this phenomenon to the racial and socioeconomic disparities of
previous generations, calling it the Civil Rights issue of the new millennium.
The term is now used to describe the gap that exists between those who have sufficient
access to information and communication technologies and those who do not. The reasons
for this gap are numerous and complex, but the effects of the unequal access to
computational services within society can be felt in a variety of areas, including education,
healthcare, employment, social connectedness, and political awareness and participation.
There are many different factors that specifically contribute to the existence of the digital
divide. In general though, the problem can often be described in terms of obstacles that
inhibit individuals from fully realizing the potential of information and communication
technologies:
Simply connecting online requires certain financial and physical expenses, such as a
computer, networking infrastructure (e.g., copper cables, fiber optics, wireless access points,
routers), and Internet and network connectivity services. For many, these costs are
insurmountable and make online engagement an impossible fantasy. In poorer communities
or nations, the benefits promised by Internet connectivity must compete with more pressing
needs, such as food, clothing, and shelter. For people in these communities, the Internet
often does not rise to the level of becoming a priority. As a result, they are excluded and
disconnected from the larger community of the digital world.
In addition, a lack of technological literacy also prevents many people from connecting
online. Whether it is that they do not know how to use technology or that they simply do not
understand or recognize the benefits of technology, many people avoid or are otherwise
prevented from accessing the online world. As a result, these people, too, are isolated from
the digital world, whether by choice or by circumstance.
Many of today’s basic utilities and social resources are moving exclusively online or at least
have an online component. In journalism, inexpensive and readily accessible print
publications are rapidly being discontinued in favor of their online counterparts. Those
without connectivity are thus losing their access to news and civic discourse, impeding their
ability to remain well-informed citizens. Similarly, healthcare and many government-related
services now expect individuals to create and manage their accounts via online portals.
Again, for those without connectivity, this becomes yet another obstacle between them and
the services they need and are entitled to.
In addition, the problem is not just about what these disconnected communities lack. It is also
about what the rest of society as a whole, including both the haves and have-nots, loses
through their lack of participation. For every individual who is excluded from the modern
digital ecosystem, one more voice is silenced. One more voice that is unable to contribute
ideas and solutions to problems. One more voice whose perspective is lost. These
individuals bring much-needed value to a community and their lack of participation in the
digital world does a great disservice to all of society.
Also, as the disparity between the informational haves and have-nots grows, these two
populations become even more divided and polarized, leading to civil inequality and
increased social tensions that further alienate communities from one another.
Assignment
For this task, you will be given two topics on which you are to prepare a short presentation.
For the first task, you must work alone and you may not use any technological devices
beyond a pen/pencil and paper. For the second task, you may collaborate in small groups
and the group may use any computational device(s) that you have access to (e.g., computer,
smart phone, tablet, calculator, presentation software, web browser, access to the Internet,
etc.).
In addition to your two short presentations, you should each individually write a brief
reflection on your experiences completing the two tasks. Be sure to compare and contrast
the experiences of working collaboratively with technology versus working alone and without
technology.
UNIT TOPIC:
The Internet
Network Infrastructure
You will examine the overall design and architecture of the Internet.
You will explore the role of servers, routers, gateways, and clients.
You will examine the domain name system and its role in network routing.
Communication Protocols
You will examine a number of standard network protocols, including IP, TCP, UDP,
SMTP, HTTP, and FTP.
You will investigate the series of components and events that are involved in the
transmission of an email or SMS text over the network.
You will analyze the impact of hyperlinked documents on how individuals find, acquire,
and learn new information.
You will analyze the legal, social, and commercial impact that the World Wide Web has
had on society.
Network Infrastructure
Network Architecture
In its simplest form, the Internet is just a large collection of interconnected devices. Some of
the devices are connected directly to one another while others are only connected indirectly
through a series of intermediate devices. Altogether, these devices make up the network.
They consist of all varieties of electronic, computational hardware, including desktop
computers, laptops, tablets, phones, printers, cameras, routers, Wi-Fi access points, etc.
One of the strengths of the Internet’s design is the inherent redundancy that such a complex
and multiply connected network offers. In most cases, there is more than one pathway
through which a transmission can be sent in order to reach its destination. This allows the
network to not only be fast and efficient by finding an optimal route, but also robust
enough to continue functioning even if part of the network fails and a pathway is cut off.
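The value of redundant pathways can be sketched with a small model. The Python example below represents a made-up five-device network as a dictionary of connections, finds a route with the fewest hops using a breadth-first search, and then looks again after one link has failed. Real Internet routing is far more elaborate; the device names and helper functions here are assumptions for illustration only.

```python
from collections import deque


def shortest_path(network, start, goal):
    """Return a route with the fewest hops, or None if no route exists."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in network.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def without_link(network, a, b):
    """Return a copy of the network with the link between a and b cut."""
    return {
        node: [n for n in neighbors if {node, n} != {a, b}]
        for node, neighbors in network.items()
    }


# Hypothetical network: A reaches D either through B or through C and E.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

print(shortest_path(network, "A", "D"))       # ['A', 'B', 'D']
damaged = without_link(network, "B", "D")
print(shortest_path(damaged, "A", "D"))       # ['A', 'C', 'E', 'D']
```

When the short route is cut, the message still arrives by the longer one. Only when every pathway is cut does the search come back empty.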
Client-Server Model
The basic operation of any networked communication system centers on the transmission of
information between two parties. Traditionally, these parties are referred to as the “client”
and the “server.”
Typically, the client initiates the communication by sending a request to the server—usually
a fully automated program running on a remotely located computer—which then processes
the request and sends the appropriate response back to the client. Examples of client
software that users might be familiar with include web browsers, e-mail applications, and
chat programs.
When a user clicks on a link in a browser, a URL (i.e., the address of a web page) request is
sent out into the larger network, where it is then routed to the location of the particular
computer that is running a web server for the requested URL. The server then generates the
content for the page (formatted in HTML) and transmits it back through the network to the
user’s computer. The web browser (i.e., the client) then interprets the HTML information that
it receives and uses that to render the text and images onto the user’s screen.
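The request and response cycle described above can be imitated on a single computer. In this minimal Python sketch, a "server" thread waits for a request on one end of a connected pair of sockets, and the "client" sends a page path on the other end and reads the reply. The one-line request format and the page contents are invented for the example; a real browser and web server speak HTTP, which carries much more information.

```python
import socket
import threading

# Hypothetical content that the toy server knows how to serve.
PAGES = {"/index.html": "<h1>Welcome</h1>"}


def serve_one_request(conn):
    """Server side: read one request, send one response, then hang up."""
    with conn:
        path = conn.recv(1024).decode("utf-8").strip()
        body = PAGES.get(path, "<h1>Not Found</h1>")
        conn.sendall(body.encode("utf-8"))


def fetch(path):
    """Client side: send a request and collect the full response."""
    client_end, server_end = socket.socketpair()
    worker = threading.Thread(target=serve_one_request, args=(server_end,))
    worker.start()
    with client_end:
        client_end.sendall((path + "\n").encode("utf-8"))
        chunks = []
        while True:
            data = client_end.recv(1024)
            if not data:  # the server closed its end, so the reply is complete
                break
            chunks.append(data)
    worker.join()
    return b"".join(chunks).decode("utf-8")


print(fetch("/index.html"))
print(fetch("/missing.html"))
```

Note that the client always speaks first and the server only responds, which is the "pull" pattern that most web browsing follows.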
So how fast can this virtual exchange happen? And what factors limit this speed? While the
transfer of information over the Internet can be very fast (up to 750 Mbps in
some areas), two factors to consider are latency and bandwidth. The bandwidth of a
system is a measure of bit rate: the amount of data (measured in bits) that can be sent in
a fixed amount of time. The latency of a system is the time elapsed between the
transmission and the receipt of a request. If you can imagine the Internet as a pipe that
information travels through, bandwidth is the size of the pipe. You can see in the picture
below the relationship between latency and bandwidth, as well as a small amount of bandwidth
(top) compared to a larger bandwidth (bottom).
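A simple way to see how the two quantities combine is to estimate the time a transfer takes as the latency plus the time needed to push all of the bits through the pipe. The Python sketch below uses that simplified model; the file sizes, bandwidths, and the 50-millisecond latency are made-up values, and real transfers involve additional overhead.

```python
def transfer_time_seconds(size_bytes, bandwidth_bps, latency_s):
    """Estimate transfer time as latency plus bits divided by bit rate."""
    bits = size_bytes * 8
    return latency_s + bits / bandwidth_bps


MBPS = 1_000_000  # bits per second in one megabit per second

# A hypothetical 5 MB photo over a wide pipe and a narrow pipe.
photo = 5_000_000
print(f"{transfer_time_seconds(photo, 100 * MBPS, 0.05):.3f} s at 100 Mbps")
print(f"{transfer_time_seconds(photo, 10 * MBPS, 0.05):.3f} s at 10 Mbps")

# A hypothetical 1 KB request: the latency dominates the total time.
request = 1_000
print(f"{transfer_time_seconds(request, 100 * MBPS, 0.05):.5f} s at 100 Mbps")
```

For large files, more bandwidth makes the biggest difference. For small requests, the latency accounts for nearly all of the waiting.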
In some use-cases, the server might initiate the communication in what is referred to as a
“server-side push” (in contrast to the “client-side pull” described above). In a “push” situation,
a centralized server pushes information out to one or more clients without the end-user
explicitly requesting the information.
A chat program is an example of an application that makes use of both “client-side pull” and
“server-side push” transmissions. In a situation where two users are messaging with one
another, they are each operating an application that is functioning as a client and are
remotely communicating with a central server located somewhere on the larger network. The
server effectively sits in between these two users, relaying their messages from one to the
other. When one user sends a message, the contents of the message as well as delivery
instructions are sent to the server, which is listening for such an incoming transmission. The
server then interprets the delivery instructions to identify which user the message is intended
for. The server then initiates a transmission in which the message is “pushed” to the other
user’s client, which renders the message on their device as an incoming message from the
first user.
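The relay described above can be modeled without any real networking. In the following Python sketch, each client hands its message and delivery instructions to a central server object, which looks up the intended recipient and pushes the message to that client. All class, method, and user names are hypothetical, and a real chat service would do this over network connections.

```python
class ChatServer:
    """Central relay that sits between all connected clients."""

    def __init__(self):
        self.clients = {}

    def connect(self, client):
        self.clients[client.username] = client

    def receive(self, sender, recipient, text):
        """Handle an incoming message and push it to the recipient."""
        target = self.clients.get(recipient)
        if target is None:
            return False  # nobody by that name is connected
        target.on_push(sender, text)  # server-side push
        return True


class ChatClient:
    def __init__(self, username, server):
        self.username = username
        self.server = server
        self.inbox = []
        server.connect(self)

    def send(self, recipient, text):
        """Client-initiated transmission to the server."""
        return self.server.receive(self.username, recipient, text)

    def on_push(self, sender, text):
        """Called by the server; the client did not ask for this message."""
        self.inbox.append((sender, text))


server = ChatServer()
ana = ChatClient("ana", server)
ben = ChatClient("ben", server)

ana.send("ben", "Are you there?")
ben.send("ana", "Yes!")
print(ben.inbox)  # [('ana', 'Are you there?')]
print(ana.inbox)  # [('ben', 'Yes!')]
```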
UTeach Computer Science, http://uteachcs.org, © 2016 The University of Texas at Austin
Communication Protocols
The Importance of Protocols
protocol
Every time you perform one of these actions as a user, you set off a chain reaction of
complex data processing and information exchange within dozens, if not hundreds, of
separate computational devices across the breadth of the Internet. In these examples, the
software running on your computer, phone, tablet, or other computing device responds to
your clicks and taps by initiating a multi-stage network transmission in which your data hops
from node to node as it is handed off from one router or hub to the next until it finally reaches
an appropriate server on the other end. In each of these hops, where your data is
transmitted between two nodes, the exchange of information is made possible through the
use of standard, agreed-upon communication protocols that each node follows precisely.
These protocols, or sets of rules for the proper handling and formatting of information, are
designed to establish a common interface through which different hardware/software
components can interact, regardless of their own internal design or manufacturer.
In the case of communication protocols, abstraction allows for a clear set of standards for
how information should be exchanged to be explicitly described while avoiding the
complexity and minutiae of how a manufacturer might actually implement that standard in
their hardware or software. By describing the standards broadly, abstraction also enables
future innovation by not limiting or restricting the kinds of hardware that can implement the
standards. As long as it conforms to the expected protocols, any type of compatible
component can be developed to participate in the exchange of digital information on the
Internet.
Standard Protocols
Throughout the history of the Internet, as technologies have evolved, a number of standards
have been developed to ensure the efficiency and robustness of these new capabilities.
Traditionally, the task of developing these communication protocols falls on the shoulders of
independent researchers, who develop the protocols, and ad hoc standards committees,
often made up of experts and organizations who have a vested interest in the proper
functioning of the Internet as a whole. Together these committees agree upon and oversee
the application and further refinement of existing protocol standards.
Designing and developing a protocol is no easy task. Any agreed-upon standard needs to 1)
enable reliable and efficient transmission of data at large and small scales, 2) provide an
unambiguous set of rules, and 3) be flexible enough to adapt to and accommodate
future technological innovations. Hierarchy (which will be discussed in greater detail in
Domain Name System) and redundancy (provided by established protocols) help systems,
like the Internet, scale.
Several of the most commonly used protocols include those for Internet data transmission
(IP, TCP, UDP), e-mail (SMTP), webpages (HTTP, HTTPS), and files (FTP). Using the notion
of abstraction, the Internet protocol suite, more commonly known as TCP/IP, organizes each
of these into a stack of distinct layers based on the hierarchical relationships of their use.
These include the link, Internet, transport, and application layers.
But communication protocols are not just limited to the Internet as a whole. A number of
independent, privately owned platforms also offer protocols for publicly integrating third
parties into their networks through the use of APIs (application programming interfaces). APIs
and libraries simplify complex programming tasks by providing developers the building blocks
necessary to interface with an existing environment or other software components.
Programming documentation (discussed in Unit 4: Draw Shapes) for APIs and libraries is an
important resource that helps a programmer understand what each component does and how to
use it.
For example, when Twitter first launched in 2006 as a private messaging service, it was built
around the SMS text messaging standards (whose protocol limits messages to 160
characters). Twitter’s protocol limited its own messages to 140 characters, using the
remaining 20 characters to convey meta information about the message (e.g., the user id of the
sender). However, Twitter’s initial growth benefited from its offering of a public API
that allowed countless third-party developers to create their own innovative applications. By
following the protocols laid out in the API, these applications could tap into the Twitter
backbone to send and retrieve tweets. In this way, open standards (like the Twitter API) fuel
the growth of the Internet.
Ethernet specifies the use of physical, short-range connections commonly used in local
area networks (LANs), such as those in homes or offices, over coaxial cable, twisted pair, or
fiber optic connections. You will often see these types of connections between a cable
modem and a wireless access point or router or between a desktop computer and a wall
outlet (which ultimately connects elsewhere to a modem).
802.11 is the standard wireless protocol for Wi-Fi communications seen in most laptops,
phones, tablets, and other networked devices that are not physically connected to a network.
The TCP protocol ensures that all packets of a data stream are transmitted and received
exactly as originally sent. It specifies methods of performing error-checking analysis of the
received packets to ensure that no error or loss of information was introduced along the way.
If a packet is lost or found to be damaged, the TCP protocol will identify the error and
request that the packet be resent. In this way, routing on the Internet is fault tolerant and
redundant. TCP is most suited for applications that prioritize accuracy and completeness of
transmission over speed, such as e-mail (SMTP), webpages (HTTP, HTTPS), and file
transfer (FTP).
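The idea of checking a packet for damage and asking for it to be resent can be sketched in a few lines. This is a simplified illustration, not how TCP is actually implemented: it uses the CRC-32 function from Python's zlib module as a stand-in for TCP's own checksum, and the function names make_packet, is_intact, and receive are hypothetical.

```python
import zlib


def make_packet(payload: bytes) -> dict:
    # The sender stores a checksum computed from the payload.
    return {"payload": payload, "checksum": zlib.crc32(payload)}


def is_intact(packet: dict) -> bool:
    # The receiver recomputes the checksum and compares it to the stored one.
    return zlib.crc32(packet["payload"]) == packet["checksum"]


def receive(packet: dict, resend) -> bytes:
    """Accept an intact packet, or keep asking the sender for a fresh copy."""
    while not is_intact(packet):
        packet = resend()
    return packet["payload"]


original = make_packet(b"Hello, Internet!")

# Simulate damage in transit: one character of the payload has changed.
damaged = {"payload": b"Hellp, Internet!", "checksum": original["checksum"]}

data = receive(damaged, resend=lambda: original)
print(is_intact(damaged), data)
```

Because the damaged payload no longer matches its checksum, the receiver rejects it and requests another copy, so the application only ever sees the data exactly as it was sent.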
The UDP protocol is better suited for applications in which speed is more important than
accuracy or completeness of information, such as online gaming or video or audio
streaming.
Originally developed as part of the Transmission Control Program in 1974 and then later
separated out as its own standard, the Internet Protocol (IP) specifies how individual packets
of information are packaged and labeled for delivery, much like how the postal service
specifies how envelopes and packages should be sized, addressed, and stamped.
SMTP specifies how electronic mail should be formatted in order to be sent and
routed to the intended recipient. Most notably, the standard specifies how header information
may be prepended to the beginning of a message by each node that handles the message
along the way. Users who receive an e-mail can view the full headers of the message
to see which server it originated from and which servers it passed through on its way to
their inbox.
HTTP and HTTPS are the primary distribution protocols used by the World Wide Web for
delivering web content. HTTP was designed to transmit hypertext documents that have been
formatted according to another standard, HTML (Hypertext Markup Language). The “S” at the
end of “HTTPS” stands for “Secure”; HTTPS adds protocols and procedures for
encrypting information prior to transmission and decrypting it upon receipt.
Other common protocols in the application layer include the File Transfer Protocol (FTP)
for sending files, Domain Name System (DNS) for looking up IP addresses of domains,
Dynamic Host Configuration Protocol (DHCP) for initializing an Internet connection with
an ISP, and Post Office Protocol (POP) and Internet Message Access Protocol (IMAP)
for storing/retrieving e-mail messages.
Internet Protocol
A Protocol for Packet Network Intercommunication
When Vint Cerf and Bob Kahn first proposed the Transmission Control Program in 1974, it
established the use of a technique, known as packet switching, as the underlying method of
transferring data between nodes across the Internet.
With packet switching (or packet routing), all data is subdivided into small, suitably sized
blocks that are then transmitted independently from one another, potentially taking different
routes to reach the data’s intended destination. Rather than sending an entire message of
some arbitrary and varying length all at once and hoping that it reaches its destination intact,
larger messages are broken up into smaller, fixed-size packets.
Each of these packets contains two components: a header and a payload. The header
contains the IP addresses of the source node (e.g., the sending client) and the destination
(e.g., the receiving server) and any other information needed to deliver the packet. The
payload is the actual data that is being sent.
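To make the header and payload concrete, here is a minimal sketch that breaks a message into packets. It is an illustration under stated assumptions rather than the real IP packet format: the function name make_packets, the 8-byte payload size, the sequence field, and the source address 192.0.2.10 (an address reserved for documentation) are all chosen for this example.

```python
def make_packets(message: bytes, source: str, destination: str, size: int = 8):
    """Split a message into packets, each with a header and a payload."""
    packets = []
    for sequence, start in enumerate(range(0, len(message), size)):
        header = {
            "source": source,            # IP address of the sending node
            "destination": destination,  # IP address of the receiving node
            "sequence": sequence,        # position of this piece in the message
        }
        payload = message[start:start + size]  # the actual data being sent
        packets.append({"header": header, "payload": payload})
    return packets


message = b"Hello from the client!"
packets = make_packets(message, "192.0.2.10", "74.125.224.72")
for packet in packets:
    print(packet["header"]["sequence"], packet["payload"])
```

Every packet carries the same source and destination addresses, so each one can be delivered on its own, without any knowledge of the other packets.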
In a real-world example, the process of sending digital information across a network is a lot
like mailing a document across the country via postal mail. The sender seals the document
(the payload) in an envelope or box (the header) upon which is written the return address
(source IP address) and the recipient’s address (destination IP address). Delivery of the
package involves handing it off through a series of many postal carriers and routing systems
(nodes) as they transport the package to its destination.
By sending data as individualized packets of information, the Internet is able to operate more
efficiently and is better able to handle unforeseen errors in transmission.
Imagine sending a very large document consisting of thousands of pages. If sent all at once
as a single document, any single error, delay, or misdelivery along the way will require the
entire document to be resent in full. However, if each page of the document was individually
sent, any misdelivery will require only that particular page to be resent. In addition, as the
amount of data traffic varies over time, different routes might become faster than others. By
sending the larger document as many smaller pieces, each piece can be routed along the
most optimal path at that moment. Together these advantages of packet switching serve to
improve the speed, efficiency, and reliability of the communication network.
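The advantage described above, where only the lost piece needs to be resent, can be sketched as follows. This is a simplified simulation under stated assumptions: the function names split, missing_pieces, and reassemble are hypothetical, and the loss of piece number 2 is hard-coded for the sake of the example.

```python
import random


def split(message: str, size: int) -> dict:
    """Number each piece of the message so it can be put back in order."""
    return {n: message[i:i + size] for n, i in enumerate(range(0, len(message), size))}


def missing_pieces(received: dict, total: int) -> list:
    return [n for n in range(total) if n not in received]


def reassemble(received: dict, total: int) -> str:
    return "".join(received[n] for n in range(total))


original = "Packets may arrive out of order."
sent = split(original, 5)
total = len(sent)

# Simulate delivery: the arrival order is shuffled and piece 2 is lost.
arrival_order = [n for n in sent if n != 2]
random.shuffle(arrival_order)
received = {n: sent[n] for n in arrival_order}

# Only the pieces that never arrived need to be sent again.
lost = missing_pieces(received, total)
for n in lost:
    received[n] = sent[n]

print(lost, reassemble(received, total))
```

The numbering lets the receiver rebuild the message correctly no matter what order the pieces arrive in, and it also reveals exactly which pieces are missing.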
IP Addresses
The relative location of each node on the network is specified by the node’s IP address, a
unique numerical identifier, that allows other nodes to know which nodes they are directly
connected to and to determine a route for sending data to its intended destination. In the
case of the Internet, the people who designed the network devised a system for assigning
unique 32-bit numbers to each device on the network.
For example, as of 2016, the address for the google.com search engine is
74.125.224.72 . What this means is that when you “Google something,” your browser
(i.e., your client) sends a search request through the Internet’s network of nodes, addressed
to a server located in some far-off location whose IP address just happens to be
74.125.224.72 .
As devices are added to the network, each is assigned an IP address that translates roughly
to its geographic location on the network. More precisely, the Internet Assigned Numbers
Authority (IANA) oversees the distribution of these addresses through a number of regional
registries. Each registry then assigns blocks of addresses to different entities, like your local
Internet service provider (ISP), which then assigns an available address from that block to
each computer or device that connects through its service.
As a result, while each IP address can provide a general sense of the geographic location of
the organization that a node is associated with (e.g., your local ISP), the actual address
does not specifically locate the actual node itself. For example, two computers with very
similar IP addresses from the same block of numbers might actually be physically located
across town from one another and have nothing in common other than they both use the
same ISP for Internet connectivity.
Each of these four numbers is referred to as an octet because they each represent eight bits
(or binary digits) of information. In UNIT 3: Data Representation, you looked at binary
numbers and how they correspond to the decimal (base 10) number system. You also
investigated why eight bits of data can only represent values between 0 and 255 . Suffice
to say, four octets of eight bits each adds up to 32 bits of total information that can be used
to identify each node on a network.
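The relationship between the four octets and the full 32-bit address can be checked with a short sketch. The function names to_bits and to_dotted are hypothetical and chosen only for this illustration.

```python
def to_bits(address: str) -> str:
    """Convert a dotted address such as 74.125.224.72 into 32 binary digits."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("expected four octets, each between 0 and 255")
    # Each octet becomes exactly eight bits, padded with leading zeros.
    return "".join(format(octet, "08b") for octet in octets)


def to_dotted(bits: str) -> str:
    """Convert 32 binary digits back into four decimal octets."""
    return ".".join(str(int(bits[i:i + 8], 2)) for i in range(0, 32, 8))


bits = to_bits("74.125.224.72")
print(bits, len(bits))
print(to_dotted(bits))
```

Four groups of eight bits always produce a string of exactly 32 binary digits, which is why each octet is limited to values between 0 and 255.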
Just like there is a limit to the range of possible values that can be represented by eight bits
of data (i.e., 0–255), there is a similar limit to what can be represented by 32 bits of
information (i.e., 0–4,294,967,295). That means that the Internet, as originally
designed, can only provide about 4.3 billion unique, 32-bit addresses, which limits the
Internet to a maximum capacity of roughly 4.3 billion nodes. While that is a lot of devices, that
number is still finite and can easily run out. In fact, it already has in some areas. APNIC (the
Asia Pacific Network Information Centre), the registry organization for the Asia Pacific
region, exhausted its pool of regional addresses in 2011. Then, in the fall of 2015, the
American Registry for Internet Numbers (ARIN), the registry organization that assigns
addresses for the United States, Canada, the Caribbean, and North Atlantic islands issued
the last of its unassigned addresses.
IPv6 (“Internet Protocol, version 6”)
While most Internet traffic today uses IPv4 as its addressing standard (and will likely
continue to do so for some time to come), IPv4 is gradually being replaced by IPv6 (“version 6”)
due to this issue of limited address capacity. IPv6 solves this problem by using 128-bit
addresses, four times the length of IPv4 addresses. With 128 bits of data packed into each
address, this new standard is capable of sustaining more than 3.4 × 10^38 individual nodes.
That is 340 billion billion billion billion!
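These address counts follow directly from the number of bits: n bits can represent 2^n different values, numbered from 0 to 2^n - 1. A short sketch of the arithmetic (the variable names are chosen only for this illustration):

```python
octet_values = 2 ** 8        # values that fit in one eight-bit octet
ipv4_addresses = 2 ** 32     # 32-bit addresses
ipv6_addresses = 2 ** 128    # 128-bit addresses

print(f"One octet: {octet_values} values (0 to {octet_values - 1})")
print(f"IPv4: {ipv4_addresses:,} addresses (0 to {ipv4_addresses - 1:,})")
print(f"IPv6: about {ipv6_addresses:.1e} addresses")

# An IPv6 address is 4 times as long, but the number of addresses
# grows by a factor of 2**96, not by a factor of 4.
print(f"Length ratio: {128 // 32}")
print(f"Count ratio:  2**{128 - 32} = {ipv6_addresses // ipv4_addresses:,}")
```

Each additional bit doubles the number of possible addresses, which is why quadrupling the address length produces such an enormous increase in capacity.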
But is that enough? Well, to put that number in perspective, if every grain of sand on all of
Earth’s beaches were assigned its own unique IP address, the IPv6 standard would still
have enough addresses left over for 60 quadrillion more Earthlike planets, each with its own
IP-enabled beach sand. In other words, while that number, too, is finite, it
is reasonable to say that yes, it most likely is enough.
Domain Name System
The Internet’s Directory Assistance
Every website is essentially a collection of content stored on a web server. Every web server
is just a program running on an Internet-connected computer. Every Internet-connected
computer is a networked node. And every networked node is assigned an IP address that
identifies where it can be found on the network. So, it is reasonable to conclude that every
website can be identified by its 32-bit IP address.
But how often have you entered an IP address into your browser? How often do you
see any IP address in the address bar as you browse the Web? For most people, the
answer to these questions is “rarely, if ever.” Instead of IP addresses, most users rely on the
use of domain names to reference an online site or service. And the Domain Name System
is the hidden service quietly working in the background that makes it all possible.
Domain Names
The fact is that a 32-bit number, while perfectly ideal for the digital and electronic
components of the routers and other computational devices that support the Internet, is not
very human-friendly. Which of the following is more intuitive for humans?
01000010110111001001111001000100
66.220.158.68
facebook.com
The first is an actual 32-bit address. The second is that same address expressed in a more
human-friendly format of four octets. The last version, however, is the most recognizable of
the three because it is descriptive and tells us everything that we, as humans, would want to
know—namely that it refers to the social networking site, Facebook.
Numbers are great ways of cataloging and indexing things. People have Social Security
Numbers, the products we buy have Universal Product Codes (UPC barcodes), books have
International Standard Book Numbers (ISBN), credit cards and bank accounts are
numbered, driver licenses have numbers, etc. But these numbers are meant for automated
systems to store and process computationally. We prefer to call each other by our first and
last names, products by their brands, and books by their titles. So it comes as no surprise
that we would prefer to reference our favorite websites by descriptive names rather than an
obscure and seemingly arbitrary sequence of numbers.
DNS Lookup
When you type a URL into an address bar, it is usually in a form that includes a domain
name. For example:
https://www.facebook.com/
But how does this translate into an IP address that your computer and all of the intermediate
routers and gateways can use to locate the actual Facebook server that hosts the page you are
trying to load? The answer is a massive lookup table that acts like a large directory,
dictionary, or phonebook that allows you to use a value that you do know to look up a value
you do not know.
When you try to load a webpage, your web browser (e.g., the client) isolates the domain
name specified in the URL you typed (e.g., “facebook.com”) and sends that name to a
special server known as a domain name server (also called a DNS server or nameserver). That
is, the browser sends a
request to a pre-configured IP address that corresponds to the location of a nearby
nameserver (usually belonging to your ISP, although Google actually operates a couple of
handy nameservers at 8.8.8.8 and 8.8.4.4 ).
The nameserver, which is a computer that stores an updated list of every registered domain
name and the IP address of the server that hosts that domain, looks up the address of the
requested domain name and sends that information back to your web browser.
Your web browser then sends the URL of your original page request to the IP address that it
received from the nameserver.
Without a centralized nameserver at a known IP address, your browser would have no way
of knowing which IP address it should contact to fulfill your page request.
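The lookup steps above can be sketched with a small table standing in for the nameserver. This is only a simulation under stated assumptions: the table holds just the two example addresses quoted in this unit (which may no longer be current), the names NAMESERVER_TABLE, domain_from_url, and lookup are hypothetical, and dropping the www. prefix is a simplification of how real DNS treats host names.

```python
from urllib.parse import urlparse

# A tiny stand-in for the nameserver's table of domains and IP addresses.
NAMESERVER_TABLE = {
    "facebook.com": "66.220.158.68",
    "google.com": "74.125.224.72",
}


def domain_from_url(url: str) -> str:
    """Isolate the domain name from a URL typed into the address bar."""
    host = urlparse(url).hostname or ""
    # Simplification: treat "www.facebook.com" and "facebook.com" as the same name.
    return host[4:] if host.startswith("www.") else host


def lookup(domain: str) -> str:
    """Use a value you know (the name) to find one you do not (the address)."""
    try:
        return NAMESERVER_TABLE[domain]
    except KeyError:
        raise LookupError(f"no record for {domain}") from None


domain = domain_from_url("https://www.facebook.com/")
address = lookup(domain)
print(domain, "->", address)
```

Once the browser has the address, it can send its page request directly to that server. A real nameserver performs the same kind of name-to-address lookup over the network, on a vastly larger scale.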
DNS Hierarchy
The domain name syntax is hierarchical. A hierarchy is an arrangement of elements in a
ranking of inclusiveness or superiority. As you can see below, the beginning is the root
(signified by the period/dot). This is the top of the DNS hierarchical tree.
The root is divided into familiar domains (.com, .org, .edu). From there, you can see we can
travel into several subdomains. Novell offers an example furthering this understanding:
The hierarchy that DNS utilizes allows for this system to scale to solve larger problems!
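One way to see this hierarchy is to read a domain name from right to left, starting at the root. The sketch below is only an illustration; the function name hierarchy is hypothetical.

```python
def hierarchy(domain: str) -> list:
    """List each level of a domain name, from the root down to the full name."""
    labels = domain.rstrip(".").split(".")
    path = ["."]  # the root of the DNS tree
    suffix = ""
    for label in reversed(labels):
        suffix = f"{label}.{suffix}" if suffix else f"{label}."
        path.append(suffix)
    return path


levels = hierarchy("www.facebook.com")
for depth, name in enumerate(levels):
    print("  " * depth + name)
```

Each level only needs to know about the level directly beneath it, which is part of what allows the system to grow without any single list becoming unmanageable.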
DNS was not completely secure because it did not include security
protections for the information that it contains (like host names and IP
addresses).
UNIT TOPIC:
The Internet
World Wide Web
You will analyze the impact of hyperlinked documents on how individuals find, acquire,
and learn new information.
You will analyze the legal, social, and commercial impact that the World Wide Web has
had on society.
World Wide Web
WWW
How many times have you seen www. at the start of a URL? It is so ubiquitous that many
web browsers and web sites will insert it into the URL even if you do not type it. But www.
is a special part of a domain’s address indicating that it is a server hosting content designed
to meet the standards of the World Wide Web. And almost every online service you likely
use is a part of the World Wide Web.
In fact, the World Wide Web is one of those things that most of us use on a regular basis
without ever thinking about how it works or what problems it was originally created to solve.
But, since its inception in the early 1990s, the Web has proven to be one of the most
revolutionary and empowering inventions in history.
Not to be confused with the broader concept of the Internet, the World Wide Web, itself, is a
content-oriented ecosystem that has been built atop the globally networked infrastructure of
the Internet. It was designed primarily to provide an open platform that could give users
from all over the world a standard and accessible means of communicating and sharing
information online.
Berners-Lee proposed that a standardized set of protocols and tools be developed that
might help to ease the integration of these disparate computing systems and to facilitate
improved communications between them. In short, he wanted to employ the ideas of
abstraction to design a more generalized means of sharing information across the Internet
that was independent of any particular hardware or software that a user might be using. It is
also important to keep in mind that abstractions can be combined. Lower-level abstractions can be
blended to make higher-level abstractions, such as short message services (SMS) or email
messages, images, audio files, and videos.
As a result of his efforts, Berners-Lee created the set of fundamental tools and technologies
that make up what we now know as the World Wide Web*.
Web Applications:
Web Technologies:
On August 6, 1991, Berners-Lee brought the world’s first web site online. It ran on a NeXT
cube computer located in his lab at CERN, which prominently displayed a sticker on its front
that read, “This machine is a server. DO NOT POWER IT DOWN!!”
*Interestingly enough, “World Wide Web” was not the only name that Berners-Lee
considered when choosing a name for his creation. He almost named it one of his other
ideas: Information Mesh, Mine of Information, or Information Mine. Consider how the Web
(or the Mine) might look today with URLs like moi.google.com or moi.facebook.com
instead of our familiar www. prefix.
Hyperlinks
One of the key features that Berners-Lee incorporated into his invention is the use of
hyperlinks to connect documents with one another in a non-linear way. While the pages of a
book are arranged linearly in sequence (e.g., page 1, page 2, page 3, etc.), there is no such
sequencing of documents in the World Wide Web. Instead, like the multiply connected
computers of the Internet, the Web consists of a collection of massively interconnected
pages of content.
Each web page is effectively a single, text-based document that has been “marked up” with
embedded formatting instructions known as HTML (Hypertext Markup Language) tags. Each
of these electronic documents is stored on a computer running a web server. The location
of the file within the computer’s file system corresponds to the document’s URL (i.e., the
address of the web page).
A hyperlink is a clickable bit of text, image, or other on-screen element within an HTML
document that a user can select to request another, related document. Each link is designed
to enable the user to selectively seek out, or browse, from one document to the next, following
whatever sequence they choose. This non-linear approach to organizing and connecting
information has created an unlimited number of new ways that people can find, learn, and
consume information.
The above example would produce the following hyperlinked text within a web page:
You can search for something, tweet a comment, or like a friend’s post at
these popular sites.
Here, you can see that “search,” “tweet,” and “like” have each been formatted to act as
hyperlinks (linking to Google, Twitter, and Facebook, respectively). Each hyperlink is
denoted with the use of an anchor ( <a>...</a> ) tag that frames the text being linked
(e.g., “search,” “tweet,” and “like"). Each anchor tag includes the URL of the other page or
site that the hyperlink is referencing (e.g. href="..." ).
When a user clicks on any of these links, the web browser sends a request to the
corresponding web server for the specified page (as referenced in the href attribute).
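To see how a program can find the anchor tags in a page, here is a minimal sketch using the HTMLParser class from Python's standard library. The HTML string and the three URLs in it are assumptions written for this illustration, not a copy of the original example, and the class name LinkCollector is hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the text and destination of every hyperlink in a page."""

    def __init__(self):
        super().__init__()
        self.links = []    # (link text, URL) pairs, in page order
        self._href = None  # URL of the anchor tag currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)  # text framed by the anchor tag

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


page = (
    'You can <a href="https://www.google.com/">search</a> for something, '
    '<a href="https://twitter.com/">tweet</a> a comment, or '
    '<a href="https://www.facebook.com/">like</a> a friend\'s post.'
)

collector = LinkCollector()
collector.feed(page)
for text, url in collector.links:
    print(text, "->", url)
```

A web browser does something similar when it displays a page: it reads each anchor tag, shows the framed text as a clickable link, and remembers the URL to request if the user selects it.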
1. Using your preferred search engine (Google, Bing, DuckDuckGo, etc.), conduct a
search for your own name.
2. Record the URL of the first link that your search returns.
3. Visit that URL and count and record the total number of different links that you can find
on that page.
4. Also record the URLs of up to three more of the hyperlinks on that page.
5. Continue repeating this process counting and recording hyperlinks for each URL you
record for at least two more levels.
Using your findings, estimate the total number of different pages that could be reached if you
were to start at the URL found from your original “vanity search” (i.e., searching for your own
name) and followed a series of five clicks. What about 10 clicks? 20 clicks?
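One rough way to make this estimate is to assume that every page contains about the same number of links, so that each click multiplies the number of reachable pages by that amount. The sketch below rests on that assumption; the function names and the figure of 30 links per page are hypothetical, and the result is an upper bound because many links on real pages lead to pages you have already counted.

```python
def pages_at_depth(links_per_page: int, clicks: int) -> int:
    """Pages reachable in exactly this many clicks, if no page repeats."""
    return links_per_page ** clicks


def pages_within(links_per_page: int, clicks: int) -> int:
    """Pages reachable in this many clicks or fewer, including the start page."""
    return sum(links_per_page ** k for k in range(clicks + 1))


for clicks in (5, 10, 20):
    print(clicks, "clicks:", f"{pages_at_depth(30, clicks):,}")
```

Try replacing 30 with the average number of links you counted in your own search. Because the estimate grows exponentially, even a small change in links per page makes an enormous difference after 10 or 20 clicks.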
BIG PICTURE:
Net Neutrality
Highlights
You will discuss and explore the arguments on both sides of the net neutrality debate.
Net Neutrality
"A Series of Tubes"
"The Internet is not something that you just dump something on.
It’s not a big truck. It is a series of tubes.” —Sen. Ted Stevens
For many Americans, the term “net neutrality” first came to their attention through the
widespread ridicule in the popular media of remarks made on the
Senate floor on June 28, 2006. While addressing a Senate commerce committee discussing
pending amendments to a telecommunications bill, Alaskan senator Ted Stevens secured
his Internet fame with a rambling, 11-minute speech, highlighted by an unfortunately chosen
“series of tubes” analogy.
https://www.youtube.com/embed/f99PcP0aFNE
While Senator Stevens’ remarks (and the numerous Internet memes that they inspired)
helped to raise public awareness of the so-called net neutrality debate, they did little to
clarify the issue or inform the public about the actual stakes involved.
In simple terms, net neutrality concerns itself with ensuring equal and unrestricted access to
all legal content that is available throughout the Internet. In principle, it argues that Internet
service providers (ISPs) and other gateway services that connect individual users to the
wealth of content and services available across the network should not discriminate against
or favor certain content over other content.
Common Carriage
Ultimately, the net neutrality issue comes down to the question of how government
legislation should or should not regulate the access to and delivery of Internet services. Is
the Internet a public utility, like telephone, water, or electricity, where the good of the
community entitles all individuals to unrestricted access to these essential services? Or is
the Internet more like a commodity or specialized luxury that consumers merely choose to
purchase or not?
Historically, the Federal Communications Commission (FCC) has considered the Internet to be
an “information service.” That is, it
treats networked information as a product that can be optionally purchased or subscribed to,
rather than pure communications between two parties (i.e., more like a magazine or
newspaper than a phone call or letter). As such, the FCC has, until recently, taken a rather
hands-off view of the Internet since it was not
strictly regarded as a communication medium.
However, many proponents of net neutrality argue that the ways that people actually use
and rely upon access to the Internet are much more analogous to other “telecommunications
services” like telephone service. That is, the Internet is merely a conduit through which users
connect to a remote service to send and receive communications in the same way that a
phone line is a conduit, managed by a phone company, for communication between two
callers.
In order to ensure that such party-to-party communications are unrestricted, unfiltered, and
uncensored, legislation specifically regulates how telecommunications services operate and
imposes limits on what they can and cannot do with the data and information passing
through their channels. These services are designated as common carriers—a classification
for a person or company that transports goods or people for another party and that is
responsible for any possible loss of the goods during transport.
For example, telephone companies have been granted common carrier status. This means
that while they are obligated to provide connection services between telephone users (i.e.,
carry communications), they are restricted from interfering with, altering, filtering, or accessing
the content of those calls. Similarly, this common carrier status also protects telephone
companies from any liability for the communications they transmit (e.g., a telephone call
between two individuals planning a crime does not make the telephone company a co-
conspirator or accomplice to the crime). As long as a common carrier treats all
communications equally, with the same hands-off policy, it is not responsible for the
misdoings of its users.
“telecommunications,” the FCC is able to apply many of the same precedents to the Internet
as it does to other, more well-established telecommunications services.
https://www.youtube.com/embed/uKcjQPVwfDk
During his presidency, Barack Obama served as a strong proponent of the net neutrality
principles. In his recommendations to the FCC, Obama advocated for the creation of new
rules and policies designed to ensure that no single company offering access to Internet
services can ever act as a gatekeeper that limits, filters, or censors what users can do, say,
or see online.
The president’s recommendations reflected the expectations that the majority of users
already hold concerning their Internet access and usage, specifically addressing the
following assumptions:
No blocking: All legally available content should be accessible by any user. ISPs
should not be allowed to selectively restrict access to or censor content requested by
its users.
No throttling: ISPs should not intentionally slow down or degrade transmissions to
certain sites or online services. All content should be treated equally.
Increased transparency: FCC rules should apply to all stages of Internet transmission
and not be limited only to ISPs and their connection to individual users.
No paid prioritization: In order to promote the growth of Internet-related businesses
of all sizes, all sites and online services should have equal access to the Internet and
should compete on a level playing field. No service should be handicapped simply
because it cannot afford to pay for better service.
"Imagine if the phone company could mess with your calls every
time you tried to order pizza from Domino’s, because Pizza Hut is
paying them to route their calls first. The phone company isn’t
allowed to do that, and, for a while, the FCC said broadband
providers couldn’t either.”—ACLU
For example, Christopher Yoo, a legal scholar who specializes in communication and
computer and information sciences, offers a number of reasons why net neutrality, as
proposed, is misguided and potentially counterproductive.
One of Yoo’s arguments centers on the issue of data discrimination—the notion that an ISP
might treat different types of data or data from different sources differently, potentially
favoring one type of data while restricting or filtering another type.
UNIT TOPIC:
Interconnectedness in Computing
Internet of Things
You will explain how computing innovations affect communication, interaction, and
cognition.
You will examine how sensor networks function.
You will examine the societal benefits and threats of smart devices.
Internet of Things
Are they taking over?
You have probably heard a conversation about machines rising up and controlling
humankind, probably from an older relative or paranoid acquaintance. Glorified by the
entertainment industry through movies (I, Robot, The Matrix, and Transformers) and books
(Cinder: Book One in the Lunar Chronicles, Robopocalypse, and Do Androids Dream of
Electric Sheep?), the fear of robots “taking over the world” has been a concern for many
individuals. With the advent of many of our smart technologies, like sensor networks and
Global Positioning Systems (GPS), convenience has triumphed over necessity in many of
our lives. Our knowledge of how these smart technologies work and of the security measures
that surround them may be our saving grace if/when the machines decide to turn against
their makers (us).
Internet of Things
The Internet of Things is the interconnection via the Internet of computing devices
embedded in everyday objects, enabling them to send and receive data. All of the devices
that are connected using the protocols of the Internet make up the Internet of Things. We
humans have found these devices useful for tasks such as communication, navigation, and
health care. For example, GPS and related technologies have changed how humans travel,
navigate, and find information related to geolocation. Before GPS, navigation would be
planned in advance—spontaneous changes would be difficult to make.
So, will these smart devices become too intelligent through their interconnectedness and
take over the world? Your sibling has the same question...
Instructions
Recently, your younger sibling watched I, Robot and is now scared to go to bed because
they think the “robots are going to attack in the night.” Compose a short (between one and
two minutes) presentation on the medium of your choice (Microsoft PowerPoint, Google
Slides, etc.) to help your younger sibling feel brave about going to bed. If you cannot
convince them, your parents said that your younger sibling would have to sleep in your room
(and they probably snore).
visit the Purdue Online Writing Lab.
3. Submit your presentation to your teacher through the specified method. Make sure to
explain these topics in a way that a younger sibling would understand.
Ethics of Autonomous Technology
Are Machines Ethical?
Does the question of whether machines are ethical even make
sense? Can a machine have ethics? From phones to cars to
home-automated devices, as our technology becomes
increasingly “smart” and capable of managing many of our
daily tasks for us, these questions begin to become more
relevant.
Humans are clearly capable of ethical behavior. As such, each of our actions is performed
within the context of what we consider safe, fair, just, and ethical. But as new forms of
technology take over these tasks, do they (or even can they) exhibit the same level of ethical
awareness that we demonstrate ourselves?
You witness a runaway trolley (i.e., a streetcar) barreling down the street.
In its path, you notice five people stuck on the tracks and unable to move
out of the way before they are struck by the trolley.
Fortunately, you have time to pull a lever that will redirect the trolley onto
a parallel track where there is only one person in its path.
Most people tend to agree that pulling the lever is the better choice since it lessens the loss
of life from five victims to only one. However, an alternate scenario that also results in only
one loss of life is not quite so clear-cut. Instead of a lever and a parallel track, consider the
following variation:
This time, however, you notice a very fat man standing next to the tracks.
You quickly conclude that shoving the fat man into the path of the
speeding trolley will derail and stop the trolley while killing the fat man, but
save the five people farther down the track.
How is this scenario different from the first? And why do so many people say that they could
not sacrifice the fat man while they were perfectly willing to sacrifice the sole person
standing on the parallel track? It is quite a dilemma and shows just how difficult it can be to
know what the right solution is in various scenarios.
In these hypothetical thought experiments, it may be easy to dismiss the ethical challenges
because the situations are not real and there are no real victims, only imaginary victims. But
with the advent of autonomous vehicles, the potential victims are very real and the
programmers who design the algorithms that dictate how a car chooses between different
options must deal with these very real issues. The code that they write might literally mean
the difference between life and death.
https://www.youtube.com/embed/ixIoDYVfKA0
However, with autonomous vehicles, this question becomes a bit more nebulous. If
self-driving car A causes an accident, who is the driver of record? The car itself? The owner of
the car that caused the accident? The passenger(s) of the car who instructed the car to drive
them through that intersection? The carmaker that designed and sold the car as a safe
vehicle? The team of programmers whose code made the decision that led to the accident?
Or is the responsibility shared between some number of these parties? And if so, which
parties and in which proportions?
These are not easy questions to answer and the laws regarding autonomous vehicles (or
other autonomous devices) have traditionally not kept up with the pace of new technologies.
Inevitably, the necessary laws will be written and many of these questions will be resolved,
as they have with other technological advances of the past. But for now, things are anything
but clear.
But this issue does introduce a fascinating complication of any technological advancement
that impacts the full spectrum of individuals involved in the application of new technology. At
the coding and design end of the spectrum, programmers and engineers need to be well-
versed in ethics and the law, especially as they attempt to model such behavior in the
decision-making algorithms of their devices. And on the opposite end of the spectrum,
lawmakers and legal authorities need to have a solid understanding of technology and its
capabilities and limitations in order to write sound and enforceable legislation.
Exercise
As legislators draft new laws and automakers develop new autonomous vehicle
technologies, both groups often rely upon a consensus of expert opinions with regard to the
ethics and responsibilities of these new innovations. They use these agreed-upon sets of
policies and standards to shape and guide their development of new technologies and the
laws that apply to them.
Initially, as an entire class, you will discuss the issues related to autonomous vehicles and
brainstorm ideas and the goals that you wish your policies to achieve. Next, you will divide
into small groups, with each group focusing on the development of one or two policy ideas in
detail. Finally, the groups will share their results with the full class, who will then discuss,
revise, and select the various policy ideas that will make up the standards for the class’
overall recommended autonomous vehicle policy.
As you develop and discuss your new policies, you should make sure to generalize your
definitions such that they can apply to a broad variety of scenarios. It would be helpful to
consider examples of similar scenarios to which your policy should apply and those to which
it should not apply. Your policies should include the following elements:
UNIT PROJECT:
Future Technology
Highlights
You will collaborate in pairs to envision and design a future innovation in technology.
You will discuss and identify a specific purpose that your innovation will serve (e.g.,
entertainment, problem solving, education, artistic expression, etc.) and its key
features.
You will evaluate the potential benefits and risks of your innovation.
You will identify existing technological resources that your innovation may utilize.
You will identify technological challenges that must be overcome before your
innovation can be fully realized.
You will develop a mock-up of your innovation that demonstrates its use and
functionality.
You will write a detailed product description and deliver an elevator pitch to the class
detailing the features of your innovation and its potential impact on society using
appropriate terminology.
You will provide written feedback to your peers on the potential of each collaborative
team’s design.
Future Technology Project: Rubric Check
Feedback
This activity provides you and a partner group time to provide one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Performance Tasks
AP CSP Course Assessments
Exam Overview
The following information is available in the full AP Computer Science Principles Course and
Exam Description. See page 105 for a detailed overview of the exam and performance task
requirements.
View PDF
By this point in the course, all of the projects, exercises, and classroom
discussions from the previous six units will have provided students with
extensive, hands-on experience with the exploration, use, and creation of
computational artifacts in a variety of contexts. In this unit, students will draw
upon those collective skills to demonstrate mastery of essential course
concepts by completing the Explore and Create Performance Tasks that make
up the AP through-course assessment.
The Explore Performance Task
Guidelines
Performance Task Instructions
Refer to the following assignment description and submission requirements as you complete
the Performance Task: Explore—Impact of Computing Innovations. These documents
can be found in the AP Computer Science Principles Course and Exam Description, starting
on page 108.
View PDF
The Explore Performance Task
Chief Reader Report
Refer to the following document to help understand previous and common mistakes as you
complete the Performance Task: Explore—Impact of Computing Innovations. This
resource (formerly the Student Performance Q&A) features commentary from the Chief
Reader on how students performed on the FRQs, typical student errors, and the specific
concepts and content students struggled with the most in 2017.
View PDF
The Explore Performance Task
Explore PT—Scoring Guidelines
The scoring guidelines for the Explore Performance Task are shown below:
View PDF
Sample Response A
Sample Response B
Sample Response C
Sample Response D
Sample Response E
The Explore Performance Task
Explore PT – Sample Response A
ARTIFACT
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Explore
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Explore Performance Task
Explore PT – Sample Response B
ARTIFACT
View PDF
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Explore
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Explore Performance Task
Explore PT – Sample Response C
ARTIFACT
View PDF
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Explore
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Explore Performance Task
Explore PT – Sample Response D
ARTIFACT
View PDF
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Explore
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Explore Performance Task
Explore PT – Sample Response E
ARTIFACT
View PDF
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Explore
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Guidelines
Performance Task Instructions
Refer to the following assignment description and submission requirements as you complete
the Performance Task: Create—Applications from Ideas. These documents can be found
in the AP Computer Science Principles Course and Exam Description, starting on page 113.
View PDF
The Create Performance Task
Chief Reader Report
Refer to the following document to help understand previous and common mistakes as you
complete the Performance Task: Create—Applications from Ideas. This resource
(formerly the Student Performance Q&A) features commentary from the Chief Reader on
how students performed on the FRQs, typical student errors, and the specific concepts and
content students struggled with the most in 2017.
View PDF
The Create Performance Task
AP Computer Science Principles Create Definitions
Refer to the following document to help understand definitions and concepts as you
complete the Performance Task: Create—Applications from Ideas. This document was
created to help you understand some of the expectations that may not be clearly stated or
defined in the Create Performance Task.
View PDF
The Create Performance Task
Create PT—Scoring Guidelines
The scoring guidelines for the Create Performance Task are shown below:
View PDF
Sample Response A
Sample Response B
Sample Response C
Sample Response D
Sample Response E
Sample Response F
Sample Response G
Sample Response H
Sample Response I
Sample Response J
The Create Performance Task
Create PT – Sample Response A
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response B
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response C
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response D
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response E
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response F
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response G
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response H
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response I
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
The Create Performance Task
Create PT – Sample Response J
VIDEO OF PROGRAM
Video link
WRITTEN SUBMISSION
View PDF
All of the information to help students prepare for the successful completion of the Create
Performance Task has been provided by College Board on the AP Computer Science
Principles Exam Page.
UTeach CS Principles Unit A1: Artificial Intelligence: Turing Test
UNIT A1
Artificial Intelligence: Turing Test
Turing Test Project
“A computer would deserve to be called intelligent if it could
deceive a human into believing that it was human.” – Alan Turing
https://www.youtube.com/embed/DccwpcoWY0w
Introduction
As computation becomes more efficient, software is rapidly replacing human workers in the
labor market. Robots now work assembly lines, the Internet has replaced the postal service
as the primary means for distributing written communication, and electronic kiosks have
replaced information desks and even, in some cases, waiters. These are examples of
physical labor—but what about mental labor? Can computer software ever rival human
intelligence? Will computers replace doctors, educators, or even artists?
Assignment
Your job is to write a protocol for conducting a Turing Test that can distinguish artificial
intelligence from human intelligence using only text-based chat.
Submission
Submit a write-up of your Turing Test protocol experiment. The protocol should include the
following:
an introduction that states the purpose of the test and its underlying principles,
at least 10 detailed directives for someone to complete in order to identify artificial
intelligence through text-based chat (these steps should be so detailed that any two
people performing the Turing Test would do so in exactly the same way),
quotations used as necessary to indicate exact text that must be used (e.g., questions
to ask),
example responses for each of the questions, describing in detail how text responses
should be analyzed (e.g., differences in expected and actual output for a given input),
and
an analysis of the strengths and weaknesses of the protocol.
Learning Goals
You will be able to:
Criteria Points
TURING TEST PROTOCOL
Describes clear and concise steps for the administration of your Turing Test experimental design. 3 pts
Explains how the final Turing Test design assures that the experiment is effective and generalizable. 3 pts
Uses key terminology from the Artificial Intelligence glossary appropriately and effectively. 2 pts
TURING TEST RESULTS
Provides background information for the analysis of the chatterbots’ AI. 3 pts
Provides a detailed, sequential account describing the administration of your Turing Test to two different chatterbots. 2 pts
TURING TEST EVALUATION
Describes the two chatterbots, analyzing their strengths and weaknesses, and pattern recognition and manipulation abilities. 4 pts
Includes specific examples as evidence. 2 pts
Commits few, if any, grammatical, organizational, or spelling errors. 1 pt
TURING TEST SYNTHESIS
Your Scratch program:
Responds generally to user input. 4 pts
Responds specifically to selective phrases. 2 pts
Utilizes previous responses (memory). 2 pts
Includes useful and clear documentation. 2 pts
TOTAL 30 pts
What Is a Chatterbot?
Q: What is a chatterbot?
For example, in Wargames (1983), “a young man finds a back door into a military central
computer in which reality is confused with game-playing, possibly starting World War III.”
(IMDB)
https://www.youtube.com/embed/D-9l5jSDL50
Recently, freakishly realistic telemarketing robots are denying they’re robots. Follow the link,
read the article and listen to the sound recordings. Would you be convinced by the “real”
telemarketer? Read the update to the story (as linked from within the article). How does this
change things?
Assignment
1. Play with and analyze the strategies used by these three different chatterbots for a few
minutes each:
1. Eliza
2. Cleverbot
3. Program-O
2. Take notes on the strengths/weaknesses of each.
3. After you have tested and recorded your notes for each of the chatterbots, brainstorm
a response to the question, “What are some common strategies that might ‘break’ the
system and expose the AI as not human?”
Black-Box Testing Chatterbots
Black-Box Testing
When using a software application, it is rare for the user to read—or even have access to—
the source code. The only real observable components of the system are what goes in (i.e.,
input, such as clicking, mouse movement, data entry) and what comes out (i.e., output, such
as video, printout, gameplay, or text on the screen). The software itself is no different than a
featureless “black box.” The user may have no idea how the software works, but she
definitely can discern what it does.
Well, errors are often caught when a system is used in a way not originally intended, or
when specific boundary cases occur. In order to find unknown errors, it is often useful to
abstract away the inner functionality of a system (i.e., how it works) and concentrate on its
input and output (i.e., what it does). Comparing expected and actual results can spotlight
which component of the system is at fault.
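The idea of comparing expected and actual results can be sketched in a few lines of Python. This is only an illustration: mystery_bot is a hypothetical stand-in for a chatterbot whose source code you cannot see, and the test cases and expected replies are made up for the example. The tester never looks inside the function; it only feeds in inputs and checks the outputs.

```python
# Black-box test sketch: only inputs and outputs are examined, never the source.

def mystery_bot(message: str) -> str:
    """Hypothetical stand-in for a chatterbot whose code we cannot see."""
    text = message.lower()
    if "don't" in text:
        return "Why don't you?"
    if "hello" in text:
        return "Hi there!"
    return "Tell me more."


def black_box_test(system, cases):
    """Run each input through the system and collect every mismatch."""
    failures = []
    for given, expected in cases:
        actual = system(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


# Each case pairs an input with the reply a reasonable tester would expect.
cases = [
    ("Hello", "Hi there!"),                     # typical input
    ("I don't like rain", "Why don't you?"),    # typical input
    ("I do not like rain", "Why don't you?"),   # boundary case: same meaning, new wording
    ("", "Tell me more."),                      # boundary case: empty input
]

failures = black_box_test(mystery_bot, cases)
for given, expected, actual in failures:
    print(f"input={given!r} expected={expected!r} actual={actual!r}")
```

In this sketch, the only mismatch is the "I do not like rain" case, which hints that the stand-in bot matches the single word "don't" rather than understanding negation in general.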
Activity
In this jigsaw activity, you will experiment with different chatterbots in order to better
understand their behavior patterns.
1. Work with team members from other groups to perform a black-box test on one of the
chatterbots below. To black-box test the chatterbot, ask it a list of questions, and record
and compare its responses to possible reasonable answers. Using these responses,
form a theory of how the chatterbot works.
2. Share what you learned about that particular chatterbot with your project group.
3. Outline a general method for discovering patterns in a chatterbot’s automated
reasoning.
Jigsaw instructions
Your teacher will assign each member in your
group a number. You will sit with students from
other groups assigned the same number (e.g.,
students assigned #1 sit together). Each
numbered group will analyze the chatterbot that
corresponds with its number:
1. Eliza
2. Cleverbot
3. Program-O
For example:
Hypothesis
I hypothesize that Dr. Romulon is not easily able to parse the negation
when phrased as “do YOU not.” It seems to pick up the pattern of
“don’t,” which is a single word. Perhaps it would understand “do not”
if it were collocated (i.e., occurring in the same location) in place of “don’t,”
even though that is awkward sounding to a human speaker.
Test
’Bot: There are many things still mysterious to me. I am just beginning.
’Bot: There are many things still mysterious to me. I am just beginning.
Conclusion
The hypothesis is borne out by this pattern. Of course, this is just another
anecdote, and it may be that other cases of “do YOU not” are hand-coded
to work properly.
1. Share what you learned about each chatterbot with your team members.
2. Outline a method for discovering patterns in any chatterbot’s automated
reasoning.
Include specific steps.
Test your method. Does the method generate patterns in all of the chatterbots’
responses? If not, which ones and why?
Revise your methods of how the chatterbots function to accommodate any new
discoveries, if necessary.
Be prepared to discuss your work with the class.
1. “What were some question/answer pairs that gave you clues about how the
chatterbots worked?”
2. “Which chatterbot was the most effective? Why?”
3. “How might you expose the chatterbot as non-human? Would this strategy always
work?”
4. “What could you do differently in creating a chatterbot, so that it might seem more
human?”
Unspecified Input
Common Misconception
This is false. An easy way to test this is to substitute out one word for another and compare
responses. To illustrate, consider the following trivial example:
"Dave.” "Frank.”
Consider L’il Johnny McPixel™
Protestors have been picketing in Washington, D.C. for the rights of their McPs.
The McP is not a central program shared by all users; each user has its own instance.
The McP centrally incorporates data from all users and pushes out updates to each
device incorporating new changes.
The McP is designed to perform personal assistant tasks (e.g., make appointments,
set reminders, perform searches, answer questions). The McP is also outfitted to be
conversational, learning by imitation of human interactions worldwide.
Each individual McP contains both a centralized knowledge base and an individual
knowledge base. It can learn aspects of the local environment (e.g., its users’ favorite
colors, pet peeves, habits).
Different McP instances develop different mannerisms as reinforced by their users.
This includes use of slang, general tone, and demeanor.
The McP is not designed as a general, all-purpose AI, but rather to perform certain
tasks. However, some users have explained that the imitative, conversational nature of
their McPs includes human-like discussion of virtually any topic. Through imitation,
McPs have learned to indicate personal preferences, including stances on political
platforms/candidates.
McPs do not smell, taste, or feel. They are limited to acquiring information through
sound and sight from external sensors, as well as Wi-Fi information exchange.
Any McP can be reverted to a prior version, whereupon it loses any learned
functionality that is no longer supported.
The Humanity of L’il Johnny McPixel™?
Your teacher will share the exact debate protocol with you before the debate begins.
Remember to remain open-minded and to listen to the opposing view and respond
appropriately.
Turing Test Project: Draft
In the classic 1982 science fiction movie Blade Runner, Deckard is a police officer who
serves as a type of AI “judge.” In the future, blade runners test for and identify potentially
homicidal human “replicants” (androids). Watch this clip to get a better understanding of
what exactly a blade runner like Deckard does in this envisioning of the future:
https://www.youtube.com/watch?v=Yk3yQ5ktvWw
In essence, the officers in Blade Runner perform Turing Tests, but have access to
physiological responses for analysis (like the “Voight-Kampff Test” in the movie). Of course,
Turing Tests today rely only on text-based chat, as computers have no physiology.
Example: After completing Example 1, follow with “Why would you like to meet
him?”
Assignment
Your job is to write a draft protocol for a Turing Test that can distinguish human from artificial
intelligence using only text-based chat. The protocol should include the following:
1. An introduction that states the purpose of the test and its underlying principles.
2. At least 10 detailed directives for someone to complete in order to identify artificial
intelligence only through text-based chat.
1. These steps should be so detailed that any two people performing the Turing
Test would do so in exactly the same way.
2. Use quotations as necessary to indicate exact text that must be used (e.g.,
questions to ask).
3. Describe in detail how text responses should be analyzed (e.g., differences in
expected and actual output for a given input).
3. An analysis of the strengths and weaknesses of the protocol.
Creating Intelligence
Turning Analysis into Synthesis
Now that you have spent some time analyzing chatterbots, let’s direct what you have
learned toward synthesizing a new chatterbot.
Cleverbot requires sophisticated machine learning techniques and vast amounts of data to
achieve its realism. However, simple manipulation of words and phrases can go a long way,
as seen with Eliza. It is likely that the chatterbot you create will be more like Eliza than
Cleverbot.
Each of these will get you started, but your final product will likely reflect the personality and
tone that you create. Will your chatterbot be sassy? polite? aloof? The choice is yours!
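To see how far simple manipulation of words and phrases can go, here is a minimal Python sketch in the spirit of Eliza. It is not Eliza's actual script and it is not the Scratch program you will build; the rules, the reflection table, and the default reply are all assumptions chosen for illustration. The sketch looks for a known phrase, swaps first-person words for second-person words, and echoes the rest of the sentence back as a question.

```python
import re

# Hypothetical rules: each pairs a pattern with a reply template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first- and second-person words so the echoed phrase reads naturally.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

# Used when no rule matches (responds generally to any input).
DEFAULT_REPLY = "Please go on."


def reflect(phrase: str) -> str:
    """Lowercase the phrase and replace each word found in REFLECTIONS."""
    words = phrase.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(message: str) -> str:
    """Return the reply for the first matching rule, else the default."""
    cleaned = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


print(respond("I feel sad about my grades."))
print(respond("Nice weather today"))
```

The first call matches the "I feel" rule and echoes the rest of the sentence with "my" changed to "your"; the second matches nothing and falls back to the default reply. Changing the templates and the default reply is one way to give a chatterbot a sassy, polite, or aloof tone.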
Turing Test Project: Experiment
What happens when one chatterbot converses with another chatterbot? Do they recognize
each other as artificial intelligences?
https://www.youtube.com/embed/zWuw7tWbrlw
As you can see, the female chatterbot immediately recognized that the male conversant was
a chatterbot. Perhaps the female recognized certain responses, or the answers from the
male were too mechanical-sounding, or perhaps the male was inhumanly polite.
Assignment
Now it’s your turn! Your job is to apply your own Turing Test protocol to the three chatterbots
from the Black-box Testing Chatterbots activity:
1. Eliza
2. Cleverbot
3. Program-O
First, download the Turing Test—Practice spreadsheet where you may record the results of
your Turing Test experiments.
There are five individual worksheets in the spreadsheet. Each of these sheets describes a
suggested strategy for you to try with each chatterbot. Experiment with different questions
within each category and see how the chatterbot responds. Try to ask as many questions as
possible, and ask each chatterbot the same questions to gauge how they respond
differently. Think about the following questions in your analysis:
Does the order in which questions are asked make a difference?
Which questions are more effective for your purpose?
Are there useful questioning strategies you have discovered that do not fit in the five
pre-defined categories?
Class discussion
Experimenting with Visual Identification
In pairs, you will experiment with a process used in AI to define data by features. Features
are used by AI algorithms to classify data through similarities and differences with other data
in a dataset.
Activity
1. Pair up in groups of two. One person is designated Student A and the other Student B.
2. Student A will choose an image of a single animal and define one attribute of the
image (e.g., pink nose).
3. Student B will find an image online that matches the attribute Student A chose.
4. Determine if they are the same type of object (not necessarily the same image,
however).
5. Student A will continue to add attributes until the objects match. Students will keep
track of how many attributes were necessary in order to make the objects match.
6. After the objects match, switch roles and repeat the experiment. As you continue the
experiment, improve upon selecting attributes that are the most useful.
Your first attribute for distinction may be size. Although this is a starting point for these
objects, what if, instead of an airliner, it’s a paper airplane?
Now the size distinction is irrelevant. Next, you choose the attribute movable wings.
Object X is this and Object Y is that because they differ in these ways (features a, b,
c,...).
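The attribute game above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three objects and their features are invented for the example, and `narrow_down` is a hypothetical helper, not part of any AI library. It adds one feature at a time and reports how many were needed before a single object remains.

```python
# Hypothetical objects described by hand-picked features (illustrative only).
OBJECTS = {
    "airliner": {"has_wings": True, "movable_wings": False, "large": True},
    "paper airplane": {"has_wings": True, "movable_wings": False, "large": False},
    "eagle": {"has_wings": True, "movable_wings": True, "large": False},
}


def narrow_down(target, feature_order):
    """Add the target's features one at a time until only the target matches.

    Returns the list of features that had to be named.
    """
    candidates = set(OBJECTS)
    used = []
    for feature in feature_order:
        if candidates == {target}:
            break  # nothing left to distinguish
        value = OBJECTS[target][feature]
        # Keep only the objects that agree with the target on this feature.
        candidates = {name for name in candidates if OBJECTS[name][feature] == value}
        used.append(feature)
    return used


if __name__ == "__main__":
    print(narrow_down("eagle", ["has_wings", "large", "movable_wings"]))
    print(narrow_down("eagle", ["movable_wings", "has_wings", "large"]))
```

Notice that the order matters: a feature shared by every object (such as having wings) does not narrow anything down, while a well-chosen feature can identify the object in a single step.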
Class Discussion
Multi-modal Artificial Intelligence
Although each sensory system (e.g., visual, auditory, olfactory, tactile, gustatory) is generally
thought of as distinct, they each contribute toward cognitive and linguistic function. Many
times, the knowledge acquired through these different systems is integrated. Consider the
following concept:
CAT
貓
gato
Click for audio.
Each of the media above is linked to the same concept, though they engage different
modalities. Additionally, some people may link different subsets of these media to the
concept. For example,
a Chinese speaker who only speaks Chinese may not recognize cat or gato,
an illiterate person may not recognize any of the words, and
a deaf person may not incorporate the audio clip into his/her concept.
In order for the robot to carry out the command, it must use different modes of acquiring data
and must understand what you are saying by using the following:
the words mean when put together.
Pragmatics: Ideally, the interface should infer intention—that this is more than a
descriptive sentence, but a command directed to the robot.
Assuming the linguistic system is sufficient and the robot understands the command, the
robot must also physically carry out the command. This requires integration of sensory and
motor functions:
Visual recognition: Identify the object to be interacted with; in this case, the cat.
Note that a connection is established here between the word cat and a visual of a cat. But
what if the object looks like a cat but isn’t? For example, the Corgi below is not a cat, but
looks similar to one:
How might the system disambiguate dog from cat when they are visually similar? What
other sensory systems might contribute?
The robot will also need to execute motor skills in order to physically pet the cat. This
includes:
While this task may seem simple, remember that a robot has to know important information
in order to perform this task. Consider all of the additional nuances involved in the simple
statement “Pet the cat.” How long does the robot pet the cat? How much pressure should it
use? Where does it pet the cat? What if the cat moves when the robot approaches? It is a
far more complex task than the three-word command implies.
Common Misconception
IF ’walk’ THEN execute step...
This is not quite true, and the distinction between the statements
The AI sub-field of machine learning is concerned with systems that learn connections
among data from data. The programming required is not oriented toward manipulating
known data as much as implementing strategies to learn from patterns in the data. For
example, rather than program a robot to recognize objects with pointy ears, whiskers, and
vertically slit pupils as cats, a machine learning system would learn these features (and/or
possibly others) as common features of cats by training over a large set of images labeled
cat. Researchers are also working on unsupervised machine learning techniques, in which
explicitly labeled training samples are unnecessary.
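To make the idea of learning from labeled examples concrete, here is a minimal sketch. It is not how a real image-recognition system works: the training data is made up, the features are already given as words rather than extracted from images, and the frequency-counting approach is an assumption chosen for simplicity. The point is only that the program is never told which features define a cat; it tallies them from the labeled samples.

```python
from collections import Counter, defaultdict

# Tiny, made-up training set: each example is (set of observed features, label).
TRAINING = [
    ({"pointy ears", "whiskers", "slit pupils"}, "cat"),
    ({"pointy ears", "whiskers", "slit pupils", "long tail"}, "cat"),
    ({"whiskers", "slit pupils", "long tail"}, "cat"),
    ({"floppy ears", "round pupils", "long tail"}, "dog"),
    ({"pointy ears", "round pupils", "short legs"}, "dog"),
]


def train(examples):
    """Count how often each feature appears with each label."""
    counts = defaultdict(Counter)
    totals = Counter()
    for features, label in examples:
        totals[label] += 1
        counts[label].update(features)
    return counts, totals


def classify(features, counts, totals):
    """Pick the label whose training examples most often showed these features."""
    def score(label):
        return sum(counts[label][f] / totals[label] for f in features)
    return max(totals, key=score)


if __name__ == "__main__":
    counts, totals = train(TRAINING)
    print(classify({"whiskers", "slit pupils"}, counts, totals))
    print(classify({"pointy ears", "round pupils"}, counts, totals))
```

In this toy data, pointy ears alone do not decide the answer, because both labels include examples with pointy ears. The combination of features is what tips the score.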
Dasher
Role-play—Adaptive Technologies
Natural Language Processing (NLP) and Machine Learning (ML) can have a profound
impact on how we interact with computers. We may not have computers that converse with
us just yet, but the strides toward realizing this have enabled many advances in HCI
(Human-Computer Interaction).
Dasher
Dasher is an interface designed in the UK that has helped many in similar situations of
impairment.
Explore the Dasher site, and if possible, download and execute an instance of Dasher.
Experiment with the interface, and record your answers to the following questions:
1. Some of the squares corresponding to letters are much larger than others. Also,
sometimes one letter will be large, and other times, it will be relatively small. Why?
What significance does the size of each letter’s area have?
2. If you move randomly (but not wildly), actual words and phrases tend to pop up rather
than plain gibberish. Why do you think that is?
3. How might this experience be different if the application were trained on a different
language? Why?
4. What effect on performance do you anticipate if the application continually retrained on
your input?
5. Spell some common words and phrases like, “Hello how are you today” and some
uncommon words and phrases like, “my uncle used to love me but she died.” How do
these experiences differ?
6. How is Dasher similar to autocorrect?
Dasher +
In 2016, an article entitled A Brain–Computer Interface for the Dasher Alternative Text
Entry System described an extension that lets Dasher work through a brain-computer
interface (BCI). The article describes how “a new software tool was developed to allow
subjects to type words onto a computer screen via Dasher using their thoughts.”
Ambiguity
Why do computer systems have a difficult time with ambiguity?
Computers are not yet capable of storing and processing the vast amounts of contextual
data required for resolving ambiguities well. Consider the following scenario overheard by
both a human and a speech recognition system:
A speech recognition system is likely to choose pear, whereas a human (with the proper
context) would most likely choose pair. Why do you think the computer would pick pear?
Also, what could we do to the program to correct this incorrect word choice?
Ambiguity Rocks
What’s the big deal with language anyway? Why is it so hard
for computers to process it? After all, once we learn a
language as a child, we don’t really think too much about how
it works (except maybe in English class).
A B C
Probably, you pictured something like B, maybe like A, and almost assuredly not C (unless
you are a sadistic duck-hater). However, a computer program would need to be trained to
prefer one of these over the others in any given context.
Humans are able to resolve ambiguities and missing information almost without conscious
thought through intuition. Computers, however, must have explicitly defined
instructions for processing data. Even if the computer were to choose randomly among
possible choices, this random selection routine would have to be specified in the program’s
code.
Computers can mimic this intuition by statistically analyzing which of the possible choices
is most likely given the context. So, autocorrect might notice that you are typing Should we
incl... and autofill the rest, ude. Other words may be relevant (e.g., incline), but include is
more likely to occur in that particular phrase. If instead the text read Head up the incl...,
autocorrect might make a different choice.
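A rough sketch of this kind of context-based guessing is shown below. It assumes a tiny, invented history of phrases and looks only at the single word before the one being typed; real autocorrect systems use far more data and context. The names `HISTORY`, `build_model`, and `complete` are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict

# Made-up sample of previously seen phrases (illustrative only).
HISTORY = [
    "should we include dessert",
    "should we include everyone",
    "we include tax",
    "head up the incline",
    "walk up the incline slowly",
    "the incline is steep",
]


def build_model(phrases):
    """Count how often each word follows each preceding word."""
    model = defaultdict(Counter)
    for phrase in phrases:
        words = phrase.split()
        for previous, current in zip(words, words[1:]):
            model[previous][current] += 1
    return model


def complete(previous_word, prefix, model):
    """Return the most frequent word seen after previous_word that starts with prefix."""
    candidates = {w: n for w, n in model[previous_word].items() if w.startswith(prefix)}
    if not candidates:
        return None  # no statistical evidence for any completion
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = build_model(HISTORY)
    print(complete("we", "incl", model))
    print(complete("the", "incl", model))
```

The same four typed letters lead to different suggestions because the counts attached to the preceding word differ. Nothing in the program knows what either word means.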
Of course, the calculated most likely choice (based on context and prior history)
isn’t always the best choice:
Eh, humans make mistakes, too. Ambiguity is a difficult problem.
https://www.youtube.com/embed/MA1hD3XRlh0
Not really. On a surface level, what Siri (integrated with GPS positioning) does is much like
the Star Trek computer. However, Siri’s functionality is limited in scope—to the phone and
the functions it provides. It cannot play chess with you, or understand and process
compound sentences, like “Message my mom that I got home safely AND set my alarm for 3
AM.”
Artificial Intelligence: Turing Test Project: Rubric Check
Instructions
This activity gives you and a partner group time to offer one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.
UTeach CS Principles Unit 6: Innovative Technologies
UNIT 6
Innovative Technologies
You will begin by exploring many of the key roles that technology plays in your
life, including social networking, online communication, search, commerce, and
news, and examining the ways these ever-evolving technologies have impacted
individuals and societies in recent years. With so many of these technologies
relying on the Internet to connect users and data across varied and remote
locations, you will then “take a peek under the hood” to examine the systems
and protocols that make up the global infrastructure of the Internet. Finally, you
will turn your attention to the past, present, and future of computing to begin
imagining the technology that might exist in your future and the role that you
might play in bringing it about.
UNIT PROJECT:
Prototyping the Future
Highlights
You will collaborate in pairs to envision and design a future innovation in technology.
You will discuss and identify a specific purpose that your innovation will serve (e.g.,
entertainment, problem solving, education, artistic expression, etc.) and its key
features.
You will evaluate the potential benefits and risks of your innovation.
You will identify existing technological resources that your innovation may utilize.
You will identify technological challenges that must be overcome before your
innovation can be fully realized.
You will develop a mock-up of your innovation that demonstrates its use and
functionality.
You will write a detailed product description and deliver an elevator pitch to the class
detailing the features of your innovation and its potential impact on society using
appropriate terminology.
You will provide written feedback to your peers on the potential of each collaborative
team’s design.
Prototyping the Future Project
“If I had asked people what they wanted, they would have said
faster horses.” – Henry Ford
https://www.youtube.com/embed/AlmASHISmTM
Whether evolutionary or revolutionary, the impact that new technology has on our lives can
be quite profound. Not only do these technological advances bring us new tools that help to
make us more efficient or productive, but they oftentimes can completely change our daily
routines and radically alter the ways we interact with the world around us.
What is the most revolutionary technological innovation in the last five years?
Make a list of at least 10 technological advances from the last five years. For each item on
your list, decide whether it is evolutionary or revolutionary and consider how it has
influenced people’s lives in the years since its introduction. Be prepared to share and discuss
your findings with the class.
Technological Advances
Sometimes, innovation comes about solely from creative thought and imagination. However,
more often than not, innovative ideas also rely on essential advances in technology. Without
these achievements, even the most creative idea might not be feasible. Here, scientists and
engineers push technology further by broadening scientific knowledge and inventing new
tools and processes that others may use in bringing their ideas to reality.
For example, Facebook and other social networking sites would not be possible were it not
for a host of underlying technologies, including the
World Wide Web, TCP/IP networking protocols, web servers, web browsers, mobile phones,
digital cameras, and, of course, computers, just to name a few. Until all of those technologies
had been developed, what we know as Facebook could never have existed.
Imagine your typical cell phone and all of the things that it can do. Make a list of at least 10
underlying technologies that had to exist before that phone could have been built. Then
identify at least five things that you do all the time that you would not be able to do if cell
phones did not exist. Be prepared to discuss your lists with the class.
Make a list of at least 10 forms of digital technology that directly impact your own life. As you
make your list, try to identify examples that you think nobody else in the class will think to list
themselves, but that at least one other person in the class will agree impacts them in the
same way that you have described. Afterward, each student in the class will have the
chance to name one example from his or her list to see if anybody else in the class shares
that same relationship with that form of technology.
Assignment
To begin your collaboration on this project, you and your partner should work together to
identify recent technological trends and use those trends to imagine what might very well
come next.
For starters, consider how technology has already advanced during your own life so far.
What products, services, or technologies do you now rely on every day that did not exist
when you were born? How have those technologies changed over time? Has the change
occurred practically overnight or has it evolved gradually? How will these technologies
continue to change in the near future? What do they allow you to do today that you could not
do in the past? What can you still not do today that you hope to maybe do in the future?
These are the types of questions that futurists ask themselves when they ponder the
possibilities that the future likely holds. As a budding young futurist yourself, your task is to
use your imagination and your own, personal aspirations and desires to envision one or
more of these possibilities and identify ways in which you might be able to someday change
the world.
For this assignment, you will need to perform a series of tasks in the process of designing
and documenting your own future innovation and its potential impact on society:
elevator pitch
At the end of this unit, you and your partner will deliver a 2–3 minute “elevator pitch” in which
you will demonstrate your idea to the rest of the class and attempt to inspire them to get behind
your vision of the future. After each presentation, you will also provide written feedback to
your peers on the potential of the innovations they presented.
Submission
Your submission will be in the form of a written report and an “elevator pitch” presentation that
you will give to the class that introduces your vision of the future.
Purpose: Describe a problem that your computing innovation seeks to address and
what purpose it will ultimately serve (e.g., entertainment, problem solving, education,
artistic expression). Note: Your innovation must be a computing innovation (i.e., it
includes the use of a computer or program code as an integral part of its functionality).
Description: Provide a detailed description of your idea, what it would look like, and
how it would work or be used.
Features: Provide a list of key features that your innovation will possess.
Benefits: Identify multiple ways that future individuals and/or communities will benefit
from your computing innovation in relation to their society, culture, or economy.
Risks: Identify the potential harmful effects that your computing innovation might
introduce in society, culture, or economy (e.g., security, privacy, social disparity,
physical harm).
Data Concerns: Explain how the computing innovation consumes,
transforms, or produces data.
Technological Challenges: Identify any technological challenges or limitations that
must be overcome before your innovation can become a reality (e.g., security, privacy,
data storage).
For your “elevator pitch,” you should prepare the following items:
Rubric
Content Area: Written Report: Purpose
Performance Quality (columns, left to right):
1. The computing innovation uses computing or program code as an integral part of its functionality, AND the report states a fact about the correctly identified computing innovation’s intended purpose.
2. The computing innovation uses computing or program code as an integral part of its functionality, OR the report states a fact about the correctly identified computing innovation’s intended purpose.
3. The innovation described in the report is not a computing innovation.
4. Not enough criteria are met in order to award any credit.
Content Area: Written Report: Data Concerns and Tech Challenges
Performance Quality (columns, left to right):
1. AND Report explains how that data is consumed, produced, or transformed.
2. OR Report explains how that data is consumed, produced, or transformed.
3. OR Report explains how that data is consumed, produced, or transformed.
Content Area: Partner Participation
Performance Quality (columns, left to right):
1. AND Both team members are able to answer questions about the topic as a whole, not just their part of it.
2. OR Both team members are able to answer questions about the topic as a whole, not just their part of it.
UNIT TOPIC:
The Implications of Computing
Social Networking and Communication
You will explore the ways that innovations in digital technology can impact the lives of
individuals and communities.
You will analyze the role that digital technology plays in your everyday life.
You will analyze the role that digital technology plays in your social communications
and interactions.
You will explore the impact that instant access to global search, news, and information
has had on individuals and communities.
Cloud Computing
You will investigate the socioeconomic causes and effects related to the digital divide.
Social Networking
Yesterday’s Technology
While it is becoming increasingly difficult to comprehend, there
was actually a time in which most of the modern-day
conveniences we take for granted did not yet exist. Things like
Facebook, Twitter, Instagram, and Snapchat, for instance, are
all relatively new phenomena that have rapidly taken over our
culture and forever changed the ways that we connect and
interact with one another. But each of these and many other
similar technologies now form such an integral component of our digital lifestyles that it is
hard to imagine how we might have ever been able to function without these everyday
resources.
The truth is, this has always been true about all forms of technology throughout the history of
human civilization. All revolutionary forms of technology change people’s behavior in ways
that make the past seem crude and arcane. Modern digital technology is no different. And
even today’s cutting-edge innovations will one day feel slow and laborious when something
even better inevitably comes along.
Social Structures
One of the most empowering features of electronic social media has been its ability to create
a sense of community, especially in places or situations where such a community could not
have otherwise existed. With the global scope and widespread reach of social media,
previous obstacles like geography, age, or socioeconomics that isolated groups of people
from one another are no longer barriers. Individuals with shared interests, but whose paths
would never have crossed in the “real world,” are able to come together, communicate, and
interact in the virtual world of an online social network.
This increased ability to find someone who shares the same interests as you allows
marginalized individuals to experience a unique sense of belonging. Likewise, the
centralized nature of social networking environments enables new and unique special
interest groups and facilitates coordination of group projects and collaboration in a way that
was previously difficult, if not impossible.
“I would expect that next year, people will share twice as much
information as they share this year, and next year, they will be
sharing twice as much as they did the year before.”—Mark
Zuckerberg, November 2008
The rise in popularity of social networking in recent years has radically altered people’s
perceptions of privacy and their willingness to “share” what previously had always been seen
as personal and/or private information. In 2008, shortly after Facebook opened up its service
to the public at large, Mark Zuckerberg made headlines with his bold assumption that people
would be increasingly willing to share anything and everything about themselves, regardless
of privacy issues. At the time, most critics scoffed at Zuckerberg’s naiveté and arrogance to
make such an assumption. However, through the growth and popularity of services like
Facebook and Twitter, users and time have proven Zuckerberg to be surprisingly correct.
This raises the question of whether people of the past were actually private for privacy’s
sake or if their reluctance to publicly share personal information had more to do with the
simple lack of any effective way to do it. After all, before text messaging, tweeting, or blogs
existed, there really was no effective way for individuals to reach out to “the world” and
express themselves. It was the invention of a set of new, digital technologies that suddenly
enabled this ability to freely and broadly share oneself openly with those who might listen.
Think about that for a second. A 24-year-old college dropout with a unique and controversial
vision of the future created a website and forever changed an entire society’s attitudes
toward personal privacy and public sharing—all through computational technology.
In fact, a recent study by the Pew Research Center shows that an increasing number of
Americans are now getting their news from social sites like Facebook and Twitter. This
marks a radical shift in the way information is disseminated throughout the public and,
perhaps more importantly, who controls that flow of information. Traditionally, as the only
common source of news, the journalistic community has
served to filter and shape the news as it sees fit. However,
with the free and open access for anybody to share breaking
events, a much more diverse set of voices has emerged.
Assignment
When you go home, find a parent, grandparent, or other adult from an earlier generation with
whom you can sit down for a few minutes to discuss the different forms of social interaction
and networking that they grew up with.
You should ask questions that will allow you to compare and contrast their experiences with
your own. Some ideas for things you might discuss include the following:
Write a short summary of the things you learn from your conversations that includes the
following items:
Identify five aspects of social interaction that have fundamentally changed since they
were your age.
Identify five aspects of social interaction that are more or less the same as when they
were your age.
Describe the most surprising thing that you learned about social interaction in the past
and explain why it was so surprising.
Identify one way that you think social interaction might change between now and the
next generation (i.e., in 20–30 years).
UTeach Computer Science—http://uteachcs.org © 2016 The University of Texas at Austin
Search
Search
In 1998, Larry Page and Sergey Brin launched Google, an online search engine that was
driven by the PageRank algorithm (named after Larry Page) that the two had developed
while students at Stanford University. While Google was not the first search engine on the
Web (at least a dozen popular search engines were developed in the half decade before
Google), it was the first to use the relative connectedness of webpages to rank and prioritize
search results, an approach that quickly built Google into one of the leading sources of
online search. Google’s success has been further cemented by the verbification of its name.
Today, in popular language, to look up answers online is to “Google it.”
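The idea of ranking pages by their connectedness can be illustrated with a simplified version of the PageRank calculation. The sketch below is a classroom approximation, not Google's actual implementation: the four-page link graph is invented, the damping value of 0.85 is the commonly cited default, and the function simply repeats the same update a fixed number of times. Each page passes a share of its own score to the pages it links to, so a page that many others link to ends up with a higher score.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Invented link graph: every page that is linked to must also appear as a key.
LINKS = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home", "news"],
    "blog": ["home"],
}

if __name__ == "__main__":
    for page, score in sorted(pagerank(LINKS).items(), key=lambda item: -item[1]):
        print(f"{page}: {score:.3f}")
```

In this example, the page that every other page links to comes out on top, and the page that nothing links to comes out last, even though no one told the program which page was most important.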
The very fact that people routinely turn to search engines like Google, Bing, DuckDuckGo, or
even Siri whenever they have a question about something is a testament to the power of
search and the value it adds to our lives. Before the Web, information was not as readily
available to the mass public. Much of the knowledge and information that society possessed
was either private, undocumented, or locked up in books buried in a local library. Immediate
access to diverse ideas and resources was not available to the degree that it is today. And
search engines, like the Dewey Decimal card catalog system of libraries, provide an efficient
interface for indexing and finding obscure and relevant bits of information.
The easy access to any and all types of information has profoundly altered individuals’
behaviors, especially when it comes to learning. If there is anything a person might want to
learn about or any skills that they might want to develop, the Web has made the tools and
resources needed to acquire that knowledge readily available to anyone who is interested.
But while online search has increased the amount that we can know, it has also reduced the
amount that we need to know. No longer is there a need to learn and remember infrequent
details. If there is anything important that you need to know later, you can “just Google it.”
Wikis and Commerce
Wikis
While online search has made it easier to catalog and index the wealth of knowledge to be
found across the entirety of the Web, wikis have consolidated this vast volume of information
into well-organized online references built around user-based communities.
Created in 1994 by Ward Cunningham, a wiki is a platform in which multiple users are able
to collectively contribute to a shared knowledge base. The crowdsourced nature of wikis
gives individual users the ability to shape and inform the content in authentic ways that
traditional information sources historically have not. Rather than presenting their information
through the filter or with the bias of centralized editorial control, wikis rely on peer-based
writing, fact checking, editing, and moderation.
Many people find it uncomfortable that any random user can edit the content in a wiki
unchecked, assuming that such lack of oversight reduces the reliability of the site. However,
most wikis develop strong communities of dedicated volunteers to moderate the content on
their sites and help to keep vandalism and other disruptive behavior in check. As Ward
Cunningham himself has said, “Wikis work best in environments where you’re comfortable
delegating control to the users of the system,” although he has also stated that, “With wiki,
you have to trust people more than you have any reason to trust them. In 1995, it was a
safer environment, don’t know if I could have launched wiki today.”
While Wikipedia is perhaps the most well-known and visited wiki, it is by no means the only
one. Across the Web, thousands of wikis have been created that specialize in a broad range
of special interests serving smaller, often underrepresented populations, giving each one a
global voice that they would not have otherwise had.
Commerce
“Why leave the comfort of your own home when you can get
something custom made to your exact size for less? I believe this
is the future.” —Patrick Curtis
Another phenomenon that has emerged from the growth of the Internet is e-commerce, or
electronic commerce. With its global reach, the World Wide Web is a powerful tool for those
who have something to sell. No longer is a retailer’s customer base limited to the local area
within driving distance of a storefront. Now, the entire global online population can be
potential customers.
New Business Models
By moving their business online, retailers have discovered a number of new and innovative
ways to serve their customers.
Online Storefronts
Without the need for the physical presence of a so-called “brick and mortar” store, online
retailers are often able to offer more efficient services and better pricing. Resources that
would traditionally be spent on an elaborate showroom and hired salespeople can instead
be redirected to an automated online storefront, improved products, and discounts on
products and/or shipping costs.
Many consumers have embraced the convenient and hassle-free opportunity to shop online,
just as previous generations did with mail-order catalogs. Only now, the online “catalog”
offers a much richer and more informative shopping experience through the use of
multimedia, user reviews, and personalized customer recommendations based on previous
buying behavior.
Independent Sellers
Commerce is another area in which the democratizing effects of the Web can reveal
themselves. With online auction sites like eBay or e-commerce sites for artisans like Etsy,
individuals now have access to the same global market as the large, corporate retailers for
selling their wares. This has sparked a boom in both the supply of and demand for custom-
made and limited-production runs of unique new products and services that were either not
available or not feasible before the advent of an e-commerce market.
Crowdfunding
For those independent sellers who envision creating a sustained business or a complete
new product or service, crowdfunding uses online access to customers as a means of
funding their project. In the past, clever entrepreneurs who had a brilliant idea for a new
product might never have been able to bring that product to market due to lack of start-up
funds (i.e., it takes money to make money). Sites like Kickstarter and Indiegogo were
created to give these innovators a way of reaching out to potential customers and
recruiting them as backers who might help make their idea a reality by providing initial
“investment” funding.
A major focus of research in the computing industry centers on encryption techniques and
standards for protecting sensitive data and on the design of more robust systems and
procedures for making attacks more difficult to perform. Technologies such as Google Pay,
Apple Pay, and other mobile payment systems are some of the most recent attempts to
strengthen the security of electronic financial transactions.
Cloud Computing
Everything Old is New Again
The term “cloud computing” is a relatively recent buzzword for a new type of computing.
Thanks to the Internet, Wi-Fi, and other modern networking technologies, individual users
can “offload” much of their computational and data storage efforts onto remotely hosted
servers and online services. This has helped to make smaller, more portable computing
devices like laptops, tablets, and phones more practical because the device itself does not
need to do all of the work. Instead, the heavy lifting can be handed off to “the cloud.”
Believe it or not, this is not a new idea in computing. In fact, decades earlier, long before the
availability of public or private Internet access, most computers were the size of entire
buildings. While the actual devices that people used might have looked like today’s desktop
computers, they were actually just simple interfaces, so-called “dumb terminals” or “thin
clients,” that connected remotely to the actual computer or server located elsewhere, much
like our phones and tablets connect remotely to web services.
Of course, yesterday’s “dumb terminals” have since given way to today’s “smart phones,”
which have far more computing power in their own right. But the increasing use of and
reliance upon “the cloud” for remote computing show that recent and incredible advances in
technology have not strayed far from their predecessors.
Client-Server Model
As with the “dumb terminals” of the past, today’s cloud computing is built upon the
“client-server model.” That is, the networking process can be described according to the
interactions of two remotely located computers (or more accurately, programs running on
those computers)—namely, the “client” and the “server.”
The client stands at one end of the communication process and typically represents the
end-user. At the other end of the process is the server, a centralized computer that all
individual end users connect to. In other words, the server serves the needs of its clients
(i.e., users).
The analogy is that of a waiter at a restaurant. As the waiter (i.e., server) waits on the
customers (i.e., clients), each individual customer makes requests about which particular
food items or beverages they want. The waiter then goes back into the kitchen while the
cooks (i.e., processor, or CPU) gather the raw ingredients (i.e., data) from storage (i.e.,
database) and prepare the meal. When it is ready, the waiter returns to the customer and
serves the meal as requested.
When you browse the Web, your browser is the client. The website you visit is essentially
data hosted (stored and served) on a remote web server.
In the early days of PCs, a person’s data was trapped in a large box sitting on their desk at
home or in their office. Data that was on their computer at home could not be easily taken to
work. Data that they used at work could not be brought home. Any data that needed to be
transported from one location to another had to be copied onto physical disks that were slow
and had relatively little capacity.
With cloud computing, users can now access their data from anywhere—home, school,
work, airports, coffee shops, etc. And since wireless transmissions free us of the need for
physical media, we can now use phones, tablets, and other devices that lack physical ports
to access this cloud data as well.
Of course, one of the trade-offs of storing your data remotely is the additional cost of
transmitting your data to and from the cloud—both in terms of time and bandwidth.
Off-Site Storage
Another benefit of cloud-based systems is their use as off-site storage for your data. Anyone
who has ever used a computer either has experienced or will experience the loss of data at some point.
Accidents happen. Hardware fails. Your data is important, but there is no way to ensure that
it will remain 100% safe, intact, and uncorrupted. While it is good practice to maintain regular
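backups of your files, photos, and other digital information, a good rule of thumb is the
“3-2-1 Rule”:
3) Have at least three copies of your data.
2) Store the copies on at least two different types of storage media.
1) Keep at least one copy off-site, in a different physical location.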
The first two items deal with redundancy and maintaining multiple copies of your files, but
the last item avoids the issue of creating a “single point of failure.” It does no good to have
multiple backups of your files if they are all kept in the same location. A single disaster, like a
fire, flood, or earthquake that destroys your house will destroy your files and your backups at
the same time.
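The reasoning behind the rule can be checked mechanically. The Python sketch below is only an illustration: it assumes the commonly stated form of the rule (at least three copies, on at least two different types of storage media, with at least one copy off-site), and the BackupCopy record and satisfies_3_2_1 function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    """One copy of the data (hypothetical record used only for this sketch)."""
    label: str      # e.g., "laptop"
    medium: str     # kind of storage, e.g., "internal disk" or "cloud"
    off_site: bool  # True if kept at a different physical location


def satisfies_3_2_1(copies):
    """Return True if the copies meet the assumed form of the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_off_site = any(c.off_site for c in copies)
    return enough_copies and enough_media and has_off_site


home_only = [
    BackupCopy("laptop", "internal disk", False),
    BackupCopy("USB drive", "external drive", False),
    BackupCopy("second USB drive", "external drive", False),
]
with_cloud = home_only[:2] + [BackupCopy("cloud backup", "cloud", True)]

print(satisfies_3_2_1(home_only))   # False: every copy is in the same house
print(satisfies_3_2_1(with_cloud))  # True
```

The first plan fails even though it has three copies, because a single disaster at home would destroy all of them.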
Cloud storage helps to solve that problem by allowing users to keep a separate copy of their
data safely stored at a remote location.
Ownership of Cloud Data
Who Owns Your Data
Despite the advantages of storing one’s data in the cloud, the use of online services and
remote storage systems also comes with a number of legal risks. Specifically, the issues of
“ownership” and “access” come into play whenever any party (i.e., the cloud service) acts as
a custodian for the property (i.e., the data) of another party (i.e., the user). Unfortunately,
cloud computing is still a relatively new phenomenon and the legal distinctions about the
ownership of personal information and data are not always so clear.
For example, who owns your Facebook profile? You? Or Facebook? Clearly, Facebook
hosts your profile and stores the raw data from all of your posts, comments, photos, likes,
chat histories, and friend connections on their servers, but does that mean that Facebook
owns the information making up all of that data? If asked, most users would expect that the
answer is “no,” Facebook does not own their personal data. Fortunately, Facebook
agrees. According to their terms of service, individual users “own all of the content
and information” that they post on Facebook. But those terms of service also include a
number of statements in which, by using the service, you grant Facebook permission to do a
number of things with your data that may or may not be in your best interest.
Selected excerpts from the Facebook terms of service include the following (emphasis
added):
The terms of service also clarify, in detail, several more generalized terms that
By “information” we mean facts and other information about you,
including actions taken by users and non-users who interact with
Facebook.
Most other services have similar terms of service statements that specify the rights,
limitations, and obligations of both users and the service itself with regard to the user’s data.
Each agreement is different and is tailored to the interests of the service that wrote it, which
are not always the same as the interests of the users. As a user, it is important to
understand what rights and privileges you might be giving up when you agree to use a
particular service.
Assignment
When you sign up for a new service or product, you are often presented with a Terms of
Service agreement to read and acknowledge. It is almost always a very long document with
a lot of seemingly obscure legalese, so most people skip past it and quickly check the box
saying that they have read and agree to the terms when they have not actually read
anything. You have probably done it yourself. It is only human to do so. But what exactly are
you agreeing to when you do this? What rights might you be giving up? Even if you never
read any of these agreements, it is important to understand that the terms that are carefully
detailed in each of these agreements directly affect you, your privacy, and your rights to your
own content that you create, store, and/or use with the service. More importantly, these
agreements usually specify the limits that the service can be held to with regard to what it
can do with your data, whether it is on your behalf or in your best interests or not.
In this exercise, you are to select one of the following online cloud services and actually read
its Terms of Service agreement to see what such a document includes and then answer the
questions below.
Answer the following questions about the Terms of Service that you read:
The Digital Divide
What Is It?
Following a 1995 study by the Markle
Foundation, Lloyd Morrisett, president of the
foundation, described a “digital divide” between
the haves and have-nots when it comes to
access to information. Essentially, the study
found that the same racial and cultural barriers
that impact societies offline also have a similar
effect on their access to online resources. As a
result, some have likened this phenomenon to the racial and socioeconomic disparities of
previous generations, calling it the Civil Rights issue of the new millennium.
The term is now used to describe the gap that exists between those who have sufficient
access to information and communication technologies and those who do not. The reasons
for this gap are numerous and complex, but the effects of the unequal access to
computational services within society can be felt in a variety of areas, including education,
healthcare, employment, social connectedness, and political awareness and participation.
There are many different factors that specifically contribute to the existence of the digital
divide. In general though, the problem can often be described in terms of obstacles that
inhibit individuals from fully realizing the potential of information and communication
technologies:
Simply connecting online requires certain financial and physical expenses, such as a
computer, networking infrastructure (e.g., copper cables, fiber optics, wireless access points,
routers), and Internet and network connectivity services. For many, these costs are
insurmountable and make online engagement an impossible fantasy. In poorer communities
or nations, the benefits promised by Internet connectivity must compete with more pressing
needs, such as food, clothing, and shelter. For people in these communities, the Internet
often does not rise to the level of becoming a priority. As a result, they are excluded and
disconnected from the larger community of the digital world.
In addition, a lack of technological literacy also prevents many people from connecting
online. Whether it is that they do not know how to use technology or that they simply do not
understand or recognize the benefits of technology, many people avoid or are otherwise
prevented from accessing the online world. As a result, these people, too, are isolated from
the digital world, whether by choice or by circumstance.
Is the “digital divide” much ado about nothing? Does everybody really need to be
connected? Are they truly disadvantaged if they do not have access to information and
communication technologies? The answer to all of these questions depends on the degree
to which access to online services is becoming essential.
Many of today’s basic utilities and social resources are moving exclusively online or at least
have an online component. In journalism, inexpensive and readily accessible print
publications are rapidly being discontinued in favor of their online counterparts. Those
without connectivity are thus losing their access to news and civic discourse, impeding their
ability to remain well-informed citizens. Similarly, healthcare and many government-related
services now expect individuals to create and manage their accounts via online portals.
Again, for those without connectivity, this becomes yet another obstacle between them and
the services they need and are entitled to.
In addition, the problem is not just about what these disconnected communities lack. It is also
about what the rest of society as a whole, including both the haves and have-nots, loses
through their lack of participation. For every individual who is excluded from the modern
digital ecosystem, one more voice is silenced. One more voice that is unable to contribute
ideas and solutions to problems. One more voice whose perspective is lost. These
individuals bring much-needed value to a community and their lack of participation in the
digital world does a great disservice to all of society.
Also, as the disparity between the informational haves and have-nots grows, these two
populations become even more divided and polarized, leading to civil inequality and
increased social tensions that further alienate communities from one another.
Assignment
For this assignment, you will be given two topics and asked to prepare a short presentation on each.
For the first task, you must work alone and you may not use any technological devices
beyond a pen/pencil and paper. For the second task, you may collaborate in small groups
and the group may use any computational device(s) that you have access to (e.g., computer,
smart phone, tablet, calculator, presentation software, web browser, access to the Internet,
etc.).
In addition to your two short presentations, you should each individually write a brief
reflection on your experiences completing the two tasks. Be sure to compare and contrast
the experiences of working collaboratively with technology versus working alone and without
technology.
UNIT TOPIC:
The Internet
Network Infrastructure
You will examine the overall design and architecture of the Internet.
You will explore the role of servers, routers, gateways, and clients.
You will examine the domain name system and its role in network routing.
Communication Protocols
You will examine a number of standard network protocols, including IP, TCP, UDP,
SMTP, HTTP, and FTP.
You will investigate the series of components and events that are involved in the
transmission of an email or SMS text over the network.
Network Infrastructure
Network Architecture
In its simplest form, the Internet is just a large
collection of interconnected devices. Some of
the devices are connected directly to one
another while others are only connected
indirectly through a series of intermediate
devices. Altogether, these devices make up the
network. They consist of all varieties of
electronic, computational hardware, including
desktop computers, laptops, tablets, phones,
printers, cameras, routers, Wi-Fi access points,
etc.
One of the strengths of the Internet’s design is the inherent redundancy that such a complex
and multiply connected network offers. In most cases, there is more than one pathway
through which a transmission can be sent in order to reach its destination. This allows the
network to not only be fast and efficient by finding the best available route, but also robust
enough to continue functioning even if part of the network fails and a pathway is cut off.
Client-Server Model
The basic operation of any networked communication system centers on the transmission of
information between two parties. Traditionally, these parties are referred to as the “client”
and the “server.”
Typically, the client initiates the communication by sending a request to the server—usually
a fully automated program running on a remotely located computer—which then processes
the request and sends the appropriate response back to the client. Examples of client
software that users might be familiar with include web browsers, e-mail applications, and
chat programs.
When a user clicks on a link in a browser, a URL (i.e., the address of a web page) request is
sent out into the larger network, where it is then routed to the location of the particular
computer that is running a web server for the requested URL. The server then generates the
content for the page (formatted in HTML) and transmits it back through the network to the
user’s computer. The web browser (i.e., the client) then interprets the HTML information that
it receives and uses that to render the text and images onto the user’s screen.
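The request-and-response cycle described above can be imitated in a few lines of Python. This is a simplified sketch, not a real web server: the network trip is replaced by an ordinary function call, and the PAGES table, server_handle, and client_request are hypothetical names made up for the illustration.

```python
# Hypothetical "site" stored on the server: URL path -> HTML content.
PAGES = {
    "/index.html": "<h1>Welcome</h1>",
    "/about.html": "<h1>About us</h1>",
}


def server_handle(request_path):
    """Server side: look up the requested page and build a response."""
    if request_path in PAGES:
        return {"status": 200, "body": PAGES[request_path]}
    return {"status": 404, "body": "<h1>Not Found</h1>"}


def client_request(path):
    """Client side: send a request, then 'render' whatever comes back."""
    response = server_handle(path)  # stands in for the trip across the network
    return f"[{response['status']}] {response['body']}"


print(client_request("/index.html"))    # [200] <h1>Welcome</h1>
print(client_request("/missing.html"))  # [404] <h1>Not Found</h1>
```

Note that the client always speaks first here. The server does nothing until a request arrives.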
So how fast can this virtual exchange happen? And what factors limit this speed? While the
transfer of information over the Internet can technically be very fast (up to 750 Mbps in
some areas), two aspects to consider are latency and bandwidth. The bandwidth of a
system is a measure of bit rate: the amount of data (measured in bits) that can be sent in
a fixed amount of time. The latency of a system is the time that elapses between the
transmission of a request and its receipt. If you imagine the Internet as a pipe that
information travels through, bandwidth is the size of the pipe and latency is the time it
takes information to travel the length of the pipe. You can see in the picture
below the relationship between latency and bandwidth, as well as a small amount of bandwidth
(top) compared to larger bandwidth (bottom).
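To see how the two factors combine, the rough Python estimate below adds a fixed waiting time (latency) to the time needed to push all of the bits through the connection (size divided by bandwidth). It is a simplified model under stated assumptions: real transfers involve extra overhead, and the function name, the 50 ms latency, and the file sizes are invented for the example.

```python
def transfer_time_seconds(size_bytes, bandwidth_mbps, latency_ms):
    """Estimate one-way delivery time: waiting time plus sending time."""
    size_bits = size_bytes * 8
    sending_time = size_bits / (bandwidth_mbps * 1_000_000)
    return latency_ms / 1000 + sending_time


small = 1_000        # a 1 KB request
large = 100_000_000  # a 100 MB video

for mbps in (10, 750):
    print(mbps, "Mbps:",
          round(transfer_time_seconds(small, mbps, 50), 4), "s for 1 KB,",
          round(transfer_time_seconds(large, mbps, 50), 2), "s for 100 MB")
```

In this model the small request is dominated by latency, so extra bandwidth barely helps, while the large file is dominated by bandwidth.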
In some use-cases, the server might initiate the communication in what is referred to as a
“server-side push” (in contrast to the “client-side pull” described above). In a “push” situation,
a centralized server pushes information out to one or more clients without the end-user
explicitly requesting the information.
A chat program is an example of an application that makes use of both “client-side pull” and
“server-side push” transmissions. In a situation where two users are messaging with one
another, they are each operating an application that is functioning as a client and are
remotely communicating with a central server located somewhere on the larger network. The
server effectively sits in between these two users, relaying their messages from one to the
other. When one user sends a message, the contents of the message as well as delivery
instructions are sent to the server, which is listening for such an incoming transmission. The
server then interprets the delivery instructions to identify which user the message is intended
for. The server then initiates a transmission in which the message is “pushed” to the other
user’s client, which renders the message on their device as an incoming message from the
first user.
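The relay described above can be sketched in Python as follows. This is a toy model, not how any particular chat service is built: method calls stand in for network transmissions, and the ChatServer and ChatClient classes are hypothetical.

```python
class ChatServer:
    """Central relay: clients connect, and messages are pushed to recipients."""

    def __init__(self):
        self.clients = {}  # user name -> connected client

    def connect(self, client):
        self.clients[client.name] = client

    def receive(self, sender, recipient, text):
        # Interpret the delivery instructions, then push to the recipient.
        target = self.clients.get(recipient)
        if target is None:
            return False  # recipient is not connected
        target.on_push(sender, text)
        return True


class ChatClient:
    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.inbox = []  # messages "rendered" on this device
        server.connect(self)

    def send(self, recipient, text):
        # Client-initiated: message plus delivery instructions go to the server.
        return self.server.receive(self.name, recipient, text)

    def on_push(self, sender, text):
        # Server-initiated: this client did not ask for the message.
        self.inbox.append(f"{sender}: {text}")


server = ChatServer()
alice = ChatClient("alice", server)
bob = ChatClient("bob", server)
alice.send("bob", "Are you there?")
bob.send("alice", "Yes!")
print(bob.inbox)    # ['alice: Are you there?']
print(alice.inbox)  # ['bob: Yes!']
```

Neither client ever contacts the other directly. Every message passes through the server, which decides where to push it.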
The overall structure of a communications network, through which all digital traffic flows,
consists of a number of nodes. Each node is a machine or device that acts as either a
communication endpoint (e.g., a client computer or a remote server), a connection point that
links together two other nodes (e.g., a gateway, bridge, or modem), or a redistribution point
that links together multiple nodes and other networks (e.g., a router, hub, or switch).
Communication Protocols
The Importance of Protocols
Every time you perform one of these actions as a user, you set off a chain reaction of
complex data processing and information exchange within dozens, if not hundreds, of
separate computational devices across the breadth of the Internet. In these examples, the
software running on your computer, phone, tablet, or other computing device responds to
your clicks and taps by initiating a multi-stage network transmission in which your data hops
from node to node as it is handed off from one router or hub to the next until it finally reaches
an appropriate server on the other end. In each of these hops, where your data is
transmitted between two nodes, the exchange of information is made possible through the
use of standard, agreed-upon communication protocols that each node follows precisely.
These protocols, or sets of rules for the proper handling and formatting of information, are
designed to establish a common interface through which different hardware/software
components can interact, regardless of their own internal design or manufacturer.
In the case of communication protocols, abstraction allows a clear set of standards for
how information should be exchanged to be described explicitly, while avoiding the
complexity and minutiae of how a manufacturer might actually implement that standard in
their hardware or software. By describing the standards broadly, abstraction also enables
future innovation by not limiting or restricting the kinds of hardware that can implement the
standards. As long as it conforms to the expected protocols, any type of compatible
component can be developed to participate in the exchange of digital information on the
Internet.
Standard Protocols
Throughout the history of the Internet, as technologies have evolved, a number of standards
have been developed to ensure the efficiency and robustness of these new capabilities.
Traditionally, the task of developing these communication protocols falls on the shoulders of
independent researchers, who develop the protocols, and ad hoc standards committees,
often made up of experts and organizations who have a vested interest in the proper
functioning of the Internet as a whole. Together these committees agree upon and oversee
the application and further refinement of existing protocol standards.
Designing and developing a protocol is no easy task. Any agreed-upon standard needs to 1)
enable reliable and efficient transmission of data at large and small scales, 2) provide an
unambiguous set of rules, and 3) be flexible enough to adapt to and accommodate
future technological innovations. Hierarchy (which will be discussed in greater detail in
Domain Name System) and redundancy (provided by established protocols) help systems,
like the Internet, scale.
Several of the most commonly used protocols include those for Internet data transmission
(IP, TCP, UDP), e-mail (SMTP), webpages (HTTP, HTTPS), and files (FTP). Using the notion
of abstraction, the Internet protocol suite, more commonly known as TCP/IP, organizes each
of these into a stack of distinct layers based on the hierarchical relationships of their use.
These include the link, Internet, transport, and application layers.
But communication protocols are not just limited to the Internet as a whole. A number of
independent, privately owned platforms also offer protocols for publicly integrating third
parties into their networks through the use of APIs (application programming interfaces). APIs and
libraries simplify complex programming tasks by providing developers the building blocks
necessary to interface with an existing environment or other software components.
Documentation for APIs and libraries (discussed in Unit 4: Draw Shapes) is essential for
a programmer to understand how to use these components correctly.
For example, when Twitter first launched in 2006 as a private messaging service, it was built
around the SMS text messaging standards (whose protocol limits messages to 160
characters). Twitter’s protocol limited its own messages to 140 characters, using the
remaining 20 characters to convey meta information about the message (e.g., user id of the
sender, etc.). However, Twitter’s initial growth benefited from their offering of a public API
that allowed countless third-party developers to create their own innovative applications. By
following the protocols laid out in the API, these applications could tap into the Twitter
backbone to send and retrieve tweets. In this way, open standards (like the Twitter API) fuel
the growth of the Internet.
Ethernet specifies the use of physical, short-range connections commonly used in local
area networks (LANs), such as those in homes or offices, using coaxial cable, twisted pair, or
fiber optic connections. You will often see these types of connections between a cable
modem and a wireless access point or router or between a desktop computer and a wall
outlet (which ultimately connects elsewhere to a modem).
802.11 is the standard wireless protocol for Wi-Fi communications seen in most laptops,
phones, tablets, and other networked devices that are not physically connected to a network.
The TCP protocol ensures that all packets of a data stream are transmitted and received
exactly as originally sent. It specifies methods of performing error-checking analysis of the
received packets to ensure that no error or loss of information was introduced along the way.
If a packet is lost or found to be damaged, the TCP protocol will identify the error and
request that the packet be resent. In this way, routing on the Internet is fault tolerant and
redundant. TCP is most suited for applications that prioritize accuracy and completeness of
transmission over speed, such as e-mail (SMTP), webpages (HTTP, HTTPS), and file
transfer (FTP).
The UDP protocol is better suited for applications in which speed is more important than
accuracy or completeness of information, such as online gaming or video or audio
streaming.
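The error-checking idea behind TCP can be illustrated with a short Python sketch. It is a deliberate simplification: real TCP uses its own 16-bit checksum together with sequence numbers and acknowledgments, whereas this example uses the CRC-32 function from Python's zlib module as a stand-in, and the packet layout and function names are hypothetical.

```python
import zlib


def make_packet(seq, data):
    """Attach a sequence number and a checksum to a chunk of bytes."""
    return {"seq": seq, "data": data, "checksum": zlib.crc32(data)}


def is_intact(packet):
    """Receiver-side check: does the data still match its checksum?"""
    return zlib.crc32(packet["data"]) == packet["checksum"]


def receive_reliably(packets, resend):
    """Accept intact packets; ask the sender to resend damaged ones."""
    received = {}
    for packet in packets:
        while not is_intact(packet):
            packet = resend(packet["seq"])
        received[packet["seq"]] = packet["data"]
    return b"".join(received[seq] for seq in sorted(received))


chunks = [b"Hello, ", b"wor", b"ld!"]
originals = [make_packet(i, c) for i, c in enumerate(chunks)]

# Simulate damage in transit: packet 1 arrives with altered data.
arrived = [dict(p) for p in originals]
arrived[1]["data"] = b"w0r"

message = receive_reliably(arrived, resend=lambda seq: originals[seq])
print(message)  # b'Hello, world!'
```

A UDP-style receiver would skip the check and the resend loop. That saves time, but the damaged chunk would be passed along as it arrived.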
Originally developed as part of the Transmission Control Program in 1974 and then later
separated out as its own standard, the Internet Protocol (IP) specifies how individual packets
of information are packaged and labeled for delivery, much like how the postal service
specifies how envelopes and packages should be sized, addressed, and stamped.
The Simple Mail Transfer Protocol (SMTP) specifies how electronic mail should be formatted
in order to be sent and routed to the intended recipient. Most notably, the standard specifies
how header information may be prepended to the beginning of a message by each node that
handles the message along the way. Users who receive an e-mail can view the full headers of
the message to see which server it originated from and which servers it passed through on
its way to their inbox.
HTTP (Hypertext Transfer Protocol) and HTTPS are the primary distribution protocols used by
the World Wide Web for delivering web content. HTTP was designed to transmit hypertext
documents that have been formatted according to another standard, HTML (Hypertext Markup
Language). The “S” at the
end of “HTTPS” indicates “Secure” and provides additional protocols and procedures for
encrypting information prior to transmission and decrypting it upon receipt.
Other common protocols in the application layer include the File Transfer Protocol (FTP)
for sending files, Domain Name System (DNS) for looking up IP addresses of domains,
Dynamic Host Configuration Protocol (DHCP) for automatically assigning an IP address to a
device when it joins a network, and Post Office Protocol (POP) and Internet Message Access
Protocol (IMAP)
for storing/retrieving e-mail messages.
Internet Protocol
A Protocol for Packet Network Intercommunication
When Vint Cerf and Bob Kahn first proposed the Transmission Control Program in 1974, they
established the use of a technique known as packet switching as the underlying method of
transferring data between nodes across the Internet.
With packet switching (or packet routing), all data is subdivided into small, suitably sized
blocks that are then transmitted independently from one another, potentially taking different
routes to reach the data’s intended destination. Rather than sending an entire message of
some arbitrary and varying length all at once and hoping that it reaches its destination intact,
larger messages are broken up into smaller, fixed-size packets.
Each of these packets contains two components: a header and a payload. The header
contains the IP addresses of the source node (e.g., the sending client) and the destination
(e.g., the receiving server) and any other information needed to deliver the packet. The
payload is the actual data that is being sent.
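The Python sketch below splits a message into packets and reassembles them after they arrive out of order. It is illustrative only: the header fields seq and total are assumptions standing in for the "other information needed to deliver the packet" (actual IP headers are laid out differently), the eight-character payload size is arbitrary, and the addresses come from ranges reserved for documentation.

```python
import random


def to_packets(message, source, destination, size=8):
    """Split a message into packets, each with a header and a payload."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {
            "header": {"src": source, "dst": destination,
                       "seq": n, "total": len(chunks)},
            "payload": chunk,
        }
        for n, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Put packets back in order using the sequence number in each header."""
    ordered = sorted(packets, key=lambda p: p["header"]["seq"])
    return "".join(p["payload"] for p in ordered)


packets = to_packets("Packets may arrive out of order.",
                     source="192.0.2.10", destination="198.51.100.7")
random.shuffle(packets)     # simulate packets taking different routes
print(len(packets))         # 4
print(reassemble(packets))  # Packets may arrive out of order.
```

Because every packet carries its own addressing and ordering information, the receiver can rebuild the message no matter which route each packet took.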
In a real-world example, the process of sending digital information across a network is a lot
like mailing a document across the country via postal mail. The sender seals the document
(the payload) in an envelope or box (the header) upon which is written the return address
(source IP address) and the recipient’s address (destination IP address). Delivery of the
package involves handing it off through a series of many postal carriers and routing systems
(nodes) as they transport the package to its destination.
By sending data as individualized packets of information, the Internet is able to operate more
efficiently and is better able to handle unforeseen errors in transmission.
Imagine sending a very large document consisting of thousands of pages. If sent all at once
as a single document, any single error, delay, or misdelivery along the way will require the
entire document to be resent in full. However, if each page of the document were sent
individually, any misdelivery will require only that particular page to be resent. In addition, as the
amount of data traffic varies over time, different routes might become faster than others. By
sending the larger document as many smaller pieces, each piece can be routed along the
best available path at that moment. Together these advantages of packet switching serve to
improve the speed, efficiency, and reliability of the communication network.
IP Addresses
The relative location of each node on the network is specified by the node’s IP address, a
unique numerical identifier that allows other nodes to know which nodes they are directly
connected to and to determine a route for sending data to its intended destination. In the
case of the Internet, the people who designed the network devised a system for assigning
unique 32-bit numbers to each device on the network.
For example, as of 2016, the address for the google.com search engine is
74.125.224.72. What this means is that when you “Google something,” your browser
(i.e., your client) sends a search request through the Internet’s network of nodes, addressed
to a server located in some far-off location whose IP address just happens to be
74.125.224.72.
As devices are added to the network, each is assigned an IP address that translates roughly
to its geographic location on the network. More precisely, the Internet Assigned Numbers
Authority (IANA) oversees the distribution of these addresses through a number of regional
registries. Each registry then assigns blocks of addresses to different entities, like your local
Internet service provider (ISP), which then assigns an available address from that block to
each computer or device that connects through its service.
As a result, while each IP address can provide a general sense of the geographic location of
the organization that a node is associated with (e.g., your local ISP), the actual address
does not specifically locate the actual node itself. For example, two computers with very
similar IP addresses from the same block of numbers might actually be physically located
across town from one another and have nothing in common other than they both use the
same ISP for Internet connectivity.
Each of these four numbers is referred to as an octet because they each represent eight bits
(or binary digits) of information. In UNIT 3: Data Representation, you looked at binary
numbers and how they correspond to the decimal (base 10) number system. You also
investigated why eight bits of data can only represent values between 0 and 255. Suffice
to say, four octets of eight bits each adds up to 32 bits of total information that can be used
to identify each node on a network.
Just like there is a limit to the range of possible values that can be represented by eight bits
of data (i.e., 0 – 255), there is a similar limit to what can be represented by 32 bits of
information (i.e., 0 – 4,294,967,295). That means that the Internet, as originally
designed, can only generate about 4.3 billion unique, 32-bit addresses, which limits the Internet to
a maximum capacity of no more than about 4.3 billion nodes. While that is a lot of devices, that
number is still finite and can easily run out. In fact, it already has in some areas. APNIC (the
Asia Pacific Network Information Centre), the registry organization for the Asia Pacific
region, exhausted its pool of regional addresses in 2011. Then, in the fall of 2015, the
American Registry for Internet Numbers (ARIN), the registry organization that assigns
addresses for the United States, Canada, the Caribbean, and North Atlantic islands, issued
the last of its unassigned addresses.
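The figures behind this limit follow directly from powers of two, as the short Python calculation below shows. The constant names are chosen only for this illustration.

```python
BITS_PER_OCTET = 8
OCTETS_PER_ADDRESS = 4

values_per_octet = 2 ** BITS_PER_OCTET              # 256 values: 0 through 255
address_bits = BITS_PER_OCTET * OCTETS_PER_ADDRESS  # 32 bits
total_addresses = 2 ** address_bits                 # number of distinct addresses

print(values_per_octet - 1)        # 255, the largest value of one octet
print(f"{total_addresses:,}")      # 4,294,967,296
print(f"{total_addresses - 1:,}")  # 4,294,967,295, the largest 32-bit value
```

There are 4,294,967,296 distinct 32-bit values in all, which is where the figure of roughly 4.3 billion addresses comes from.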
But is that enough? Well, to put that number in perspective, if every grain of sand on all of
Earth’s beaches were assigned its own unique IP address, the IPv6 standard would still
allow for 60 quadrillion more similar Earthlike planets each with their own IP-enabled beach
sand to have their own, unique addresses. In other words, while that number, too, is finite, it
is reasonable to say that yes, it most likely is enough.
Domain Name System
The Internet’s Directory Assistance
Every website is essentially a collection of content stored on a web server. Every web server
is just a program running on an Internet-connected computer. Every Internet-connected
computer is a networked node. And every networked node is assigned an IP address that
identifies where it can be found on the network. So, it is reasonable to conclude that every
website can be identified by its 32-bit IP address.
But how often have you ever entered an IP address into your browser? How often do you
see any IP address in the address bar as you browse the Web? For most people, the
answer to these questions is “rarely, if ever.” Instead of IP addresses, most users rely on the
use of domain names to reference an online site or service. And the Domain Name System
is the hidden service quietly working in the background that makes it all possible.
Domain Names
The fact is that a 32-bit number, while perfectly ideal for the digital and electronic
components of the routers and other computational devices that support the Internet, is not
very human-friendly. Which of the following is more intuitive for humans?
01000010110111001001111001000100
66.220.158.68
facebook.com
The first is an actual 32-bit address. The second is that same address expressed in a more
human-friendly format of four octets. The last version, however, is the most recognizable of
the three because it is descriptive and tells us everything that we, as humans, would want to
know—namely that it refers to the social networking site, Facebook.
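The relationship between the first two forms can be sketched in Python. The helper name bits_to_dotted_quad is an assumption made for this example; the ipaddress module from the standard library is used only as a cross-check.

```python
import ipaddress


def bits_to_dotted_quad(bits: str) -> str:
    """Split a 32-character bit string into four octets and write each in decimal."""
    octets = [bits[i:i + 8] for i in range(0, 32, 8)]
    return ".".join(str(int(octet, 2)) for octet in octets)


raw_bits = "01000010110111001001111001000100"
print(bits_to_dotted_quad(raw_bits))  # 66.220.158.68

# Cross-check: the standard library reaches the same answer from the integer value.
print(ipaddress.IPv4Address(int(raw_bits, 2)))
```

Each group of eight bits becomes one decimal number between 0 and 255, which is why the human-friendly format always has four parts.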
Numbers are great ways of cataloging and indexing things. People have Social Security
Numbers, the products we buy have Universal Product Codes (UPC barcodes), books have
International Standard Book Numbers (ISBN), credit cards and bank accounts are
numbered, driver licenses have numbers, etc. But these numbers are meant for automated
systems to store and process computationally. We prefer to call each other by our first and
last names, products by their brands, and books by their titles. So it comes as no surprise
that we would prefer to reference our favorite websites by descriptive names rather than an
obscure and seemingly arbitrary sequence of numbers.
DNS Lookup
When you type a URL into an address bar, it is usually in a form that includes a domain
name. For example:
https://www.facebook.com/
But how does this translate into an IP address that your computer and all of the intermediate
routers and gateways can use to locate the actual Facebook server that hosts the page you are
trying to load? The answer is a massive lookup table that acts like a large directory,
dictionary, or phonebook that allows you to use a value that you do know to look up a value
you do not know.
When you try to load a webpage, your web browser (i.e., the client) isolates the domain
name specified in the URL you typed (e.g., “facebook.com”) and sends that name to a
special server known as a Domain Name System (DNS) server, or nameserver. That is, the browser sends a
request to a pre-configured IP address that corresponds to the location of a nearby
nameserver (usually belonging to your ISP, although Google actually operates a couple of
handy public nameservers at 8.8.8.8 and 8.8.4.4).
The nameserver, which is a computer that stores an updated list of every registered domain
name and the IP address of the server that hosts that domain, looks up the address of the
requested domain name and sends that information back to your web browser.
Your web browser then sends the URL of your original page request to the IP address that it
received from the nameserver.
Without a centralized nameserver at a known IP address, your browser would have no way
of knowing which IP address it should contact to fulfill your page request.
DNS Hierarchy
The domain name syntax is hierarchical. A hierarchy is an arrangement of elements in a
ranking of inclusiveness or superiority. The beginning of the hierarchy is the root
(signified by the period/dot). This is the top of the DNS hierarchical tree.
The root is divided into familiar top-level domains (.com, .org, .edu). From there, we can
travel into several subdomains. Novell offers an example furthering this understanding:
The hierarchy that DNS utilizes allows this system to scale to solve larger problems!
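One way to see the hierarchy is to read a domain name from right to left. The short sketch below (the function name hierarchy_path is hypothetical) lists the levels from the root down to the most specific name.

```python
def hierarchy_path(domain: str) -> list[str]:
    """Return the levels of a domain name, starting at the root (the dot)."""
    labels = [label for label in domain.strip(".").split(".") if label]
    return ["."] + list(reversed(labels))


# Root first, then the top-level domain, then each subdomain in turn.
print(hierarchy_path("www.facebook.com"))  # ['.', 'com', 'facebook', 'www']
```

Because each level only needs to know about the level directly beneath it, no single list has to hold every name on the Internet.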
DNS was not originally designed to be completely secure: it did not include
protections for the information that it contains (like host names and IP
addresses).
TOPIC:
Cryptography
Highlights
You will discuss the benefits and risks of open versus closed platforms.
Public Key Encryption
Modern cryptographic techniques move well beyond the simple modular arithmetic ciphers
of the Ancient Greeks and Romans. Although their functionality and security are more
robust, basic mathematical ideas still underpin their algorithms.
Aside from Gandalf in the Lord of the Rings, people do not use passwords to open doors.
Instead, we Muggles use keys. Encryption algorithms use keys as well: mathematical
keys. In Unit 1: Encryption, you discovered the use of the Caesar cipher. In this type of
encryption, the number of positions by which the characters are shifted can be considered the key. In
that case, there were only 25 useful keys (shifts of 1 through 25), and so if someone knew the algorithm, they
could break the code quickly and easily. The security was in hiding how the encryption was
performed.
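To see how little protection a small key space offers, the sketch below tries every possible shift on an intercepted message. The function names caesar_shift and brute_force are assumptions made for this example, and only the letters A through Z are shifted.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)


def brute_force(ciphertext: str) -> dict[int, str]:
    """Undo every possible shift (1 through 25) and return all candidates."""
    return {shift: caesar_shift(ciphertext, -shift) for shift in range(1, 26)}


secret = caesar_shift("HELLO", 3)
print(secret)  # KHOOR

# An attacker who knows the algorithm only has 25 candidates to read through.
for shift, candidate in brute_force(secret).items():
    print(shift, candidate)
```

A person (or a program) can scan 25 candidates in moments, which is why the cipher is only safe while the method itself stays hidden.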
Modern systems do not rely on this assumption. A quick search of the Web reveals many
different algorithms for encryption. In fact, many businesses actually advertise which
algorithm they use to indicate the strength of their security protocols. Instead of the classical
system's reliance on ignorance of the encoding process, modern encryption schemes rely
on the complexity of the keys used to secure information.
Early key systems relied on the fact that both the sender and the recipient had the same
key. One could use the key to encrypt the message, and the other could use it “in reverse” to
decrypt it. These symmetric key systems have a major flaw: how are the keys themselves
securely exchanged? Are there ways that the sender and the recipient can have separate
keys (only known to themselves), and exchange information securely?
Restricted Knowledge
Notice that each of these encryption methods relies on restricted knowledge.
With the Caesar cipher, anyone who knows the algorithm and the offset can decrypt the
message. However, we have seen that even if you know just the algorithm, it is not hard to
decrypt without also knowing the offset—there are only 25 possibilities. With the Vigenère
cipher, anyone who knows the algorithm and the key can decrypt the message. It is much
more difficult to decrypt the message without the key than before because patterns in the
text are less obvious, and there are many more possibilities than 25.
Scenario 1:
Alice sends Bob a locked box with her message inside. Although it gets passed through
many hands before reaching Bob (e.g., the courier system), it is locked and so Bob receives
it securely. However, to unlock it, he needs the key. How does Alice send Bob the key in a
secure way?
Scenario 2:
Alice sends Bob a locked box with her message inside. When it reaches Bob, he puts his own
lock on it and sends it back. Then Alice removes her lock, and sends the box back to Bob.
When Bob receives the box, it is locked only with the lock that he put on it. He unlocks it and
retrieves Alice’s message—or does he?
Scenario 3:
Bob has invented a special lock. It is special because it costs nothing to duplicate and send,
and it is virtually impossible to analyze the lock and create a key. The key and the original
lock must be created at the same time. He sends out his locks to anyone who wants to send
him a message. Alice locks her box with one of Bob’s special locks and sends it to him.
When he receives it, he unlocks it with his special unique key and reads the message.
The third scenario is actually how modern secure message passing happens. It is the idea
behind public key encryption, which is used by Secure Sockets Layer (SSL) and its successor,
Transport Layer Security (TLS), and is typically indicated by a padlock icon in the browser’s
address bar. SSL relies on an asymmetric key system. Certificate authorities (CAs) issue
digital certificates that validate the ownership of the encryption keys used in secured
communications and are based on a trust model.
Imagine a system in which a person could send anybody (or everybody) copies of a lock
to which he alone held the key. Beyond that, there is no practical way that someone could
figure out how to reproduce the key to match the lock, no matter how they examined the lock
itself.
Crypto-keys can be designed in much the same way mathematically by the use of a one-way
function. A one-way function is a mathematical function that is relatively easy to apply,
but difficult to invert (i.e., undo). Common public key cryptosystems rely on the fact that
multiplying two large prime numbers together is easy to do, but once the product is obtained, it is difficult
to retrieve the original factors without already knowing one of them.
Multiply the encoded plaintext and lock n to encrypt the message:
104101108108111 × 93954233 = 9780739766747650083863
This is a simplified version of RSA encryption methodology. For more information about RSA
encryption visit https://brilliant.org/wiki/rsa-encryption/.
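The simplified multiplication above can be reproduced in Python. This sketch assumes the plaintext number was formed by writing the ASCII code of each letter of "hello" side by side (104, 101, 108, 108, 111); the names encode_ascii and LOCK_N are hypothetical. It illustrates only the simplified version shown here, not real RSA, which uses modular exponentiation with very large primes.

```python
def encode_ascii(message: str) -> int:
    """Write each character's ASCII code side by side to form one big number."""
    return int("".join(str(ord(ch)) for ch in message))


LOCK_N = 93954233  # the "lock" n from the example above

plaintext_number = encode_ascii("hello")
print(plaintext_number)  # 104101108108111

ciphertext = plaintext_number * LOCK_N
print(ciphertext)  # 9780739766747650083863

# In this simplified version, anyone who knows n can undo the multiplication.
recovered = ciphertext // LOCK_N
print(recovered == plaintext_number)  # True
```

The last step shows why this is only an illustration: a real public key system must be hard to reverse even for someone who holds the public "lock."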
Open vs. Closed Standards
The Internet as we know it today got its start in the late 1960s as a US Department of
Defense project known as ARPANET (Advanced Research Projects Agency Network). For
the next 20 years, ARPANET was privately operated by the US military and its use was
limited exclusively to government and military applications. Only in the 1980s did the
network open up to civilian uses and transition to the free and open Internet that we know
today.
There are advantages and disadvantages to both approaches, but the ultimate choice of
whether a platform should be open or closed comes down to the goals that drive the
development of the system. In the case of ARPANET, the US military had ample government
funding to build and operate the network as well as the need for a secure and reliable means
of maintaining communication in the event of a nuclear attack. As such, it was their system
that they built with their own funds and for their own needs, so they kept it to themselves
(i.e., a closed network). By the 1980s, the technological and economic advantages of
making a global electronic network available to the commercial market merited opening up
the network to more civilian uses by business and individuals.
Open
Open source and licensing of software and web content raise legal and ethical
concerns. Open source code is code that is publicly available for anyone. Unlike Apple’s
super secret source code for Mac OS X, open source software allows programmers to view,
reuse, and remix the source code for individual use. Though this source code is generally
publicly available, there are still rules that govern its use (or reuse). As with any material
that you have not created, you must abide by the terms and conditions of use. Typically
these terms can be found close to the source code or provided in commented form within
the program.
Email and web publishing are both examples of open platforms that have been created
around open standards. Any developer can create an email application that conforms to the
various standards for handling email (SMTP, POP, IMAP). The program can then send email
to and receive email from any other email user anywhere on the Internet no matter which
email client (program) they might be using. Without these open standards, there would be no
guarantee that any message that you write and send would be compatible with the software
being used by your intended recipient. But with these standards, all emails created by any
standards-compliant email client are guaranteed to be fully compatible with all other such
clients, thus enabling the global communications system that we have come to rely on. Open
standards also allow us to ensure cryptography is secure for Internet encryption. Nataraj
Nagaratnam, IBM Distinguished Engineer, explains how these open standards reinforce
security.
Closed
While services like Facebook and Twitter might be publicly available and free to use, they
are still closed, proprietary systems. Each company controls the data that they store and
strictly regulates how that data can be accessed and modified by its users. For example,
users cannot easily transport their tweets, status messages, comments, chat histories, or
friends lists to other competing services. Similarly, the ability to integrate these services into
other, third-party apps or sites is strictly limited to what Facebook or Twitter choose to allow.
Having this level of control over their platform gives each company the ability to more
reliably build, develop, and monetize their platform, but it comes at the expense of user
choice and compatibility.
Steganography
Whereas cryptography alters a message so that it cannot be read if it is discovered, the
security of steganography lies in obscuring the message so that it cannot be found.
Used during the Cold War, steganography was employed as counter-propaganda. In 1968,
crew members of the USS Pueblo intelligence ship held as prisoners by North Korea
communicated in sign language during staged photo opportunities, informing the United
States they were not defectors, but rather were being held captive by the North Koreans. In
other photos presented to the U.S., crew members gave "the finger" to the unsuspecting
North Koreans, in an attempt to discredit photos that showed them smiling and comfortable.
The North Koreans were told in passing by the American captives that the middle finger was
the “Hawaiian Good Luck sign.”
The steganography analog in this example is the use of sign language to convey messages
to the U.S. recipients of the propaganda photographs.
Digital Steganography
Modern steganographic techniques rely on the viewers’ lack of knowledge about digital
representation of information rather than ignorance of sign language or other symbols.
Because images, audio, text, and video are all encoded with bits, steganographers can
utilize this commonality to send one form of information hidden within another.
Example: Consider a raster image that encodes each pixel by its RGB color value. If using
24 bits, there are 2^24 (16,777,216) colors usable for encoding. Imagine a message encoded as text
by using the least significant bit to spell out words in ASCII as follows:
1. Take 3 pixels encoded with the same color. Here we are using the one defined above:
2. Alter the least significant bit of each color value to encode the hidden message. Here, we
will alter the values of the first seven blocks of bits to encode the letter X in ASCII
(binary value: 1011000).
3. The image can now be used to send the hidden message. As long as the recipient
knows it is there and how to retrieve it, the message is apparent.
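The three steps above can be sketched in Python. The starting color (120, 200, 80) is a hypothetical stand-in chosen for this example, and the function names embed_char and extract_char are assumptions; each pixel is treated as three 8-bit values (red, green, blue), so three pixels provide nine values, of which the first seven carry the message.

```python
def embed_char(pixels: list[tuple[int, int, int]], char: str) -> list[tuple[int, int, int]]:
    """Hide one 7-bit ASCII character in the least significant bits of the pixels."""
    bits = format(ord(char), "07b")            # e.g. 'X' -> '1011000'
    flat = [value for px in pixels for value in px]
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & ~1) | int(bit)    # clear the last bit, then set it
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def extract_char(pixels: list[tuple[int, int, int]]) -> str:
    """Read the least significant bit of the first seven values back into a character."""
    flat = [value for px in pixels for value in px]
    bits = "".join(str(value & 1) for value in flat[:7])
    return chr(int(bits, 2))


original = [(120, 200, 80)] * 3                # step 1: three pixels of one color
altered = embed_char(original, "X")            # step 2: alter the last bits
print(altered)  # [(121, 200, 81), (121, 200, 80), (120, 200, 80)]
print(extract_char(altered))                   # step 3: the recipient recovers 'X'
```

No color value moves by more than 1 out of 256 levels, which is why the change is so hard to see.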
Will any unintended recipients be able to tell that a message is hidden within the image?
Given that the color space is 24 bits, and that there are 16,777,216 colors to choose from,
altering the color by a value of 1 will not produce a discernible change in the image:
Discussion Questions
How well did it work? Were there any issues?
Can you tell the image has hidden information just by looking at it?
Assume you are using a 320 x 200 pixel image. The image has 24 bits of color
encoding per pixel. Approximately how many bits does the image require?
Assume the message you wrote contained 25 characters (including spaces). If each
ASCII character requires seven bits to represent it, how many bits does the text
message require?
Repeat Step 3 using your own image and text measurements. What’s the bit ratio of
hidden message to the containing image? What effect might this have on detecting
steganographic messages?
Journal Reflection
Read the following article, Bin Laden: Steganography Master?, and a response, Terrorists
and Steganography, from a world-renowned security expert, Bruce Schneier. Note that the
articles were written seven months before the September 11th attacks on the World Trade
Center. Reflect on the articles and respond to the following questions in your journals:
BIG PICTURE:
Net Neutrality
Highlights
You will discuss and explore the issues on both sides of the net neutrality issue.
Net Neutrality
"A Series of Tubes"
"The Internet is not something that you just dump something on.
It’s not a big truck. It is a series of tubes.” —Sen. Ted Stevens
For many Americans, the term “net neutrality” first rose to their level of awareness as a
result of the widespread ridicule that the popular media heaped on remarks made in the
Senate on June 28, 2006. While addressing a Senate Commerce Committee session discussing
pending amendments to a telecommunications bill, Alaskan senator Ted Stevens secured
his Internet fame with a rambling, 11-minute speech, highlighted by an unfortunately chosen
“series of tubes” analogy.
https://www.youtube.com/embed/f99PcP0aFNE
While Senator Stevens’ remarks (and the numerous Internet memes that they inspired)
helped to raise public awareness of the so-called net neutrality debate, they did little to
clarify the issue or inform the public about the actual stakes involved.
In simple terms, net neutrality concerns itself with ensuring equal and unrestricted access to
all legal content that is available throughout the Internet. In principle, it argues that Internet
service providers (ISPs) and other gateway services that connect individual users to the
wealth of content and services available across the network should not discriminate against
or favor certain content over other content.
Common Carriage
Ultimately, the net neutrality issue comes down to the question of how government
legislation should or should not regulate the access to and delivery of Internet services. Is
the Internet a public utility, like telephone, water, or electricity, where the good of the
community entitles all individuals to unrestricted access to these essential services? Or is
the Internet more like a commodity or specialized luxury that consumers merely choose to
purchase or not?
Historically, the Federal Communications Commission (FCC) has considered the Internet to be
an “information service.” That is, it
treats networked information as a product that can be optionally purchased or subscribed to,
rather than pure communications between two parties (i.e., more like a magazine or
newspaper than a phone call or letter). As such, the FCC
has, until recently, taken a rather hands-off view of the Internet since it was not
strictly regarded as a communication medium.
However, many proponents of net neutrality argue that the ways that people actually use
and rely upon access to the Internet are much more analogous to other “telecommunications
services” like telephone service. That is, the Internet is merely a conduit through which users
connect to a remote service to send and receive communications in the same way that a
phone line is a conduit, managed by a phone company, for communication between two
callers.
In order to ensure that such party-to-party communications are unrestricted, unfiltered, and
uncensored, legislation specifically regulates how telecommunications services operate and
imposes limits on what they can and cannot do with the data and information passing
through their channels. These services are designated as common carriers—a classification
for a person or company that transports goods or people for another party and that is
responsible for any possible loss of the goods during transport.
For example, telephone companies have been granted common carrier status. This means
that while they are obligated to provide connection services between telephone users (i.e.,
carry communications), they are restricted from interfering with, altering, filtering, or accessing
the content of those calls. Similarly, this common carrier status also protects telephone
companies from any liability for the communications they transmit (e.g., a telephone call
between two individuals planning a crime does not make the telephone company a co-
conspirator or accomplice to the crime). As long as a common carrier treats all
communications equally, with the same hands-off policy, it is not responsible for the
misdoings of its users.
“telecommunications,” the FCC is able to apply many of the same precedents to the Internet
as it does to other, more well-established telecommunications services.
https://www.youtube.com/embed/uKcjQPVwfDk
During his presidency, Barack Obama served as a strong proponent of the net neutrality
principles. In his recommendations to the FCC, Obama advocated for the creation of new
rules and policies designed to ensure that no single company offering access to Internet
services can ever act as a gatekeeper that limits, filters, or censors what users can do, say,
or see online.
The president’s recommendations reflected the expectations that the majority of users
already hold concerning their Internet access and usage, specifically addressing the
following assumptions:
No blocking: All legally available content should be accessible by any user. ISPs
should not be allowed to selectively restrict access to or censor content requested by
its users.
No throttling: ISPs should not intentionally slow down or degrade transmissions to
certain sites or online services. All content should be treated equally.
Increased transparency: FCC rules should apply to all stages of Internet transmission
and not be limited only to ISPs and their connection to individual users.
No paid prioritization: In order to promote the growth of Internet-related businesses
of all sizes, all sites and online services should have equal access to the Internet and
should compete on a level playing field. No service should be handicapped simply
because it cannot afford to pay for better service.
"Imagine if the phone company could mess with your calls every
time you tried to order pizza from Domino’s, because Pizza Hut is
paying them to route their calls first. The phone company isn’t
allowed to do that, and, for a while, the FCC said broadband
providers couldn’t either.”—ACLU
For example, Christopher Yoo, a legal scholar who specializes in communication and
computer and information sciences, offers a number of reasons why net neutrality, as
proposed, is misguided and potentially counterproductive.
One of Yoo’s arguments centers on the issue of data discrimination—the notion that an ISP
might treat different types of data or data from different sources differently, potentially
favoring one type of data while restricting or filtering another type.
UTeach Computer Science (http://uteachcs.org) © 2016 The University of Texas at Austin
UNIT TOPIC:
Interconnectedness in Computing
World Wide Web
You will analyze the impact of hyperlinked documents on how individuals find, acquire,
and learn new information.
You will analyze the legal, social, and commercial impact that the World Wide Web has
had on society.
Internet of Things
You will explain how computing innovations affect communication, interaction, and
cognition.
You will examine how sensor networks function.
You will examine the societal benefits and threats of smart devices.
World Wide Web
WWW
How many times have you seen www. at the start of a URL? It is so ubiquitous that many
web browsers and web sites will insert it into the URL even if you do not type it. But www.
is a special part of a domain’s address indicating that it is a server hosting content designed
to meet the standards of the World Wide Web. And almost every online service you likely
use is a part of the World Wide Web.
In fact, the World Wide Web is one of those things that most of us use on a regular basis
without ever thinking about how it works or what problems it was originally created to solve.
But, since its inception in the early 1990s, the Web has proven to be one of the most
revolutionary and empowering inventions in history.
Not to be confused with the broader concept of the Internet, the World Wide Web, itself, is a
content-oriented ecosystem that has been built atop the globally networked infrastructure of
the Internet. It was designed primarily as an open platform that could give users
from all over the world a standard and accessible means of communicating and sharing
information online.
Berners-Lee proposed that a standardized set of protocols and tools be developed that
might help to ease the integration of these disparate computing systems and to facilitate
improved communications between them. In short, he wanted to employ the ideas of
abstraction to design a more generalized means of sharing information across the Internet
that was independent of any particular hardware or software that a user might be using. It is
also important to keep in mind that abstractions can be combined. Lower-level abstractions can be
blended to make higher-level abstractions, such as short message services (SMS) or email
messages, images, audio files, and videos.
As a result of his efforts, Berners-Lee created the set of fundamental tools and technologies
that make up what we now more familiarly know as the World Wide Web*.
Web Applications:
Web Technologies:
On August 6, 1991, Berners-Lee brought the world’s first web site online. It ran on a NeXT
cube computer located in his lab at CERN, which prominently displayed a sticker on its
front that read, “This machine is a server. DO NOT POWER IT DOWN!!”
*Interestingly enough, “World Wide Web” was not the only name that Berners-Lee
considered when choosing a name for his creation. He almost named it one of his other
ideas: Information Mesh, Mine of Information, or Information Mine. Consider how the Web
might look today with URLs like moi.google.com or moi.facebook.com
instead of our familiar www. prefix.
Hyperlinks
One of the key features that Berners-Lee incorporated into his invention is the use of
hyperlinks to connect documents with one another in a non-linear way. While the pages of a
book are arranged linearly in sequence (e.g., page 1, page 2, page 3, etc.), there is no such
sequencing of documents in the World Wide Web. Instead, like the multiply connected
computers of the Internet, the Web consists of a collection of massively interconnected
pages of content.
Each web page is effectively a single, text-based document that has been “marked up” with
embedded formatting instructions known as HTML (Hypertext Markup Language) tags. Each
of these electronic documents is stored on a computer running a web server. The location
of the file within the computer’s file system corresponds to the document’s URL (i.e., the
address of the web page).
A hyperlink is a clickable bit of text, image, or other on-screen element within an HTML
document that a user can select to request another, related document. Each link is designed
to enable the user to selectively seek out, or browse, from one document to the next, following
whatever sequence they choose. This non-linear approach to organizing and connecting
information has created an unlimited number of new ways that people can find, learn, and
consume information.
The above example would produce the following hyperlinked text within a web page:
You can search for something, tweet a comment, or like a friend’s post at
these popular sites.
Here, you can see that “search,” “tweet,” and “like” have each been formatted to act as
hyperlinks (linking to Google, Twitter, and Facebook, respectively). Each hyperlink is
denoted with the use of an anchor ( <a>...</a> ) tag that frames the text being linked
(e.g., “search,” “tweet,” and “like"). Each anchor tag includes the URL of the other page or
site that the hyperlink is referencing (e.g. href="..." ).
When a user clicks on any of these links, the web browser sends a request to the
corresponding web server for the specified page (as referenced in the href attribute).
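The way a browser finds the destination of each link can be sketched with Python's built-in HTML parser. The markup in SAMPLE_HTML is a reconstruction written for this example, and the three URLs in it are assumptions; the class name LinkCollector is also hypothetical.

```python
from html.parser import HTMLParser

# Reconstructed sample markup; the exact URLs are assumptions for illustration.
SAMPLE_HTML = (
    'You can <a href="https://www.google.com/">search</a> for something, '
    '<a href="https://twitter.com/">tweet</a> a comment, or '
    '<a href="https://www.facebook.com/">like</a> a friend\'s post at '
    "these popular sites."
)


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from every anchor tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


collector = LinkCollector()
collector.feed(SAMPLE_HTML)
for text, href in collector.links:
    print(text, "->", href)
```

Each pair printed is the visible text of a link and the address the browser would request if that link were clicked.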
1. Using your preferred search engine (Google, Bing, DuckDuckGo, etc.), conduct a
search for your own name.
2. Record the URL of the first link that your search returns.
3. Visit that URL and count and record the total number of different links that you can find
on that page.
4. Also record the URLs of up to three more of the hyperlinks on that page.
5. Continue repeating this process counting and recording hyperlinks for each URL you
record for at least two more levels.
Using your findings, estimate the total number of different pages that could be reached if you
were to start at the URL found from your original “vanity search” (i.e., searching for your own
name) and followed a series of five clicks. What about 10 clicks? 20 clicks?
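One rough way to make this estimate is to assume that every page has about the same number of links and that no two links lead to the same page. Both assumptions are simplifications, so the result is an upper bound rather than an exact count. The function name estimate_reachable and the value of 10 links per page are hypothetical; substitute the average you recorded.

```python
def estimate_reachable(links_per_page: int, clicks: int) -> int:
    """Upper bound on pages reachable after `clicks` clicks.

    Assumes every page has `links_per_page` links and no page is repeated.
    """
    return links_per_page ** clicks


LINKS_PER_PAGE = 10  # hypothetical average; use your own recorded count

for clicks in (5, 10, 20):
    print(clicks, "clicks:", estimate_reachable(LINKS_PER_PAGE, clicks))
```

Notice how quickly the estimate grows: each additional click multiplies the total by the number of links per page.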
Internet of Things
Are they taking over?
You have probably heard a conversation about machines rising up and controlling
humankind, probably from an older relative or paranoid acquaintance. Glorified by the
entertainment industry through movies (I, Robot, The Matrix, and Transformers) and books
(Cinder: Book One in the Lunar Chronicles, Robopocalypse, and Do Androids Dream of
Electric Sheep?), the fear of robots “taking over the world” has been a concern for many
individuals. With the advent of many of our smart technologies, like sensor networks and
Global Positioning Systems (GPS), convenience has triumphed over necessity in many of
our lives. Our knowledge of how these smart technologies work and the security measures
that surround them may be our saving grace if/when the machines decide to turn against
their makers (us).
Internet of Things
The Internet of Things is the interconnection via the Internet of computing devices
embedded in everyday objects, enabling them to send and receive data. All of the devices
that are connected using the protocols of the Internet make up the Internet of Things. We
humans have found these devices useful for tasks such as communication, navigation, and
health care. For example, GPS and related technologies have changed how humans travel,
navigate, and find information related to geolocation. Before GPS, navigation would be
planned in advance—spontaneous changes would be difficult to make.
So, will these smart devices become too intelligent through their interconnectedness and
take over the world? Your sibling has the same question...
Instructions
Recently, your younger sibling watched I, Robot and is now scared to go to bed because
they think the “robots are going to attack in the night.” Compose a short (between one and
two minutes) presentation on the medium of your choice (Microsoft PowerPoint, Google
Slides, etc.) to help your younger sibling feel brave about going to bed. If you cannot
convince them, your parents said that your younger sibling would have to sleep in your room
(and they probably snore).
visit the Purdue Online Writing Lab.
3. Submit your presentation to your teacher through the specified method. Make sure to
explain these topics in a way that a younger sibling would understand.
Ethics of Autonomous Technology
Are Machines Ethical?
Does the question of whether machines are ethical even make sense? Can a machine have
ethics? From phones to cars to home-automated devices, as our technology becomes
increasingly “smart” and capable of managing many of our daily tasks for us, these
questions begin to become more relevant.
Humans are clearly capable of ethical behavior. As such, each of our actions is performed
within the context of what we consider safe, fair, just, and ethical. But as new forms of
technology take over these tasks, do they (or even can they) exhibit the same level of ethical
awareness that we demonstrate ourselves?
You witness a runaway trolley (i.e., a streetcar) barreling down the street.
In its path, you notice five people stuck on the tracks and unable to move
out of the way before they are struck by the trolley.
Fortunately, you have time to pull a lever that will redirect the trolley onto
a parallel track where there is only one person in its path.
Most people tend to agree that pulling the lever is the better choice since it lessens the loss
of life from five victims to only one. However, an alternate scenario that also results in only
one loss of life is not quite so clear-cut. Instead of a lever and a parallel track, consider the
following variation:
This time, however, you notice a very fat man standing next to the tracks.
You quickly conclude that shoving the fat man into the path of the
speeding trolley will derail and stop the trolley while killing the fat man, but
save the five people farther down the track.
How is this scenario different from the first? And why do so many people say that they could
not sacrifice the fat man while they were perfectly willing to sacrifice the sole person
standing on the parallel track? It is quite a dilemma and shows just how difficult it can be to
know what the right solution is in various scenarios.
In these hypothetical thought experiments, it may be easy to dismiss the ethical challenges
because the situations are not real and there are no real victims, only imaginary victims. But
with the advent of autonomous vehicles, the potential victims are very real and the
programmers who design the algorithms that dictate how a car chooses between different
options must deal with these very real issues. The code that they write might literally mean
the difference between life and death.
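To see why this is hard to capture in code, consider a minimal sketch in Python. It is
not taken from any real vehicle's software: the names Option and choose_least_harm are
invented for this illustration, and it assumes (unrealistically) that the number of
deaths for each option is known exactly in advance.

```python
# Hypothetical sketch: a purely "count the lives" rule that picks whichever
# option is expected to harm the fewest people. All names are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str             # what the decision-maker would do
    expected_deaths: int  # assumed to be known exactly, which is unrealistic


def choose_least_harm(options):
    """Return the option with the fewest expected deaths."""
    return min(options, key=lambda option: option.expected_deaths)


# Scenario 1: the lever and the parallel track.
lever_case = [Option("do nothing", 5), Option("pull the lever", 1)]

# Scenario 2: the man standing next to the tracks.
push_case = [Option("do nothing", 5), Option("push the man", 1)]

print(choose_least_harm(lever_case).name)  # pull the lever
print(choose_least_harm(push_case).name)   # push the man
```

This rule treats both scenarios identically because it only counts lives. The
difference that many people feel between pulling the lever and pushing the man is not
represented anywhere in the data, so the algorithm cannot take it into account.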
https://www.youtube.com/embed/ixIoDYVfKA0
However, with autonomous vehicles, the question of who is responsible becomes a bit more
nebulous. If self-driving car A causes an accident, who is the driver of record? The car
itself? The owner of
the car that caused the accident? The passenger(s) of the car who instructed the car to drive
them through that intersection? The carmaker that designed and sold the car as a safe
vehicle? The team of programmers whose code made the decision that led to the accident?
Or is the responsibility shared between some number of these parties? And if so, which
parties and in which proportions?
These are not easy questions to answer and the laws regarding autonomous vehicles (or
other autonomous devices) have traditionally not kept up with the pace of new technologies.
Inevitably, the necessary laws will be written and many of these questions will be resolved,
as they have been with other technological advances of the past. But for now, things are anything
but clear.
This issue does, however, highlight a fascinating complication of any technological
advancement: it affects the full spectrum of individuals involved in the application of new technology. At
the coding and design end of the spectrum, programmers and engineers need to be well-
versed in ethics and the law, especially as they attempt to model such behavior in the
decision-making algorithms of their devices. And on the opposite end of the spectrum,
lawmakers and legal authorities need to have a solid understanding of technology and its
capabilities and limitations in order to write sound and enforceable legislation.
Exercise
As legislators draft new laws and automakers develop new autonomous vehicle
technologies, both groups often rely upon a consensus of expert opinions with regard to the
ethics and responsibilities of these new innovations. They use these agreed-upon sets of
policies and standards to shape and guide their development of new technologies and the
laws that apply to them.
Initially, as an entire class, you will discuss the issues related to autonomous vehicles and
brainstorm ideas and the goals that you wish your policies to achieve. Next, you will divide
into small groups, with each group focusing on the development of one or two policy ideas in
detail. Finally, the groups will share their results with the full class, who will then discuss,
revise, and select the various policy ideas that will make up the standards for the class’
overall recommended autonomous vehicle policy.
As you develop and discuss your new policies, you should make sure to generalize your
definitions such that they can apply to a broad variety of scenarios. It would be helpful to
consider examples of similar scenarios to which your policy should apply and those to which
it should not apply. Your policies should include the following elements:
UNIT PROJECT:
Prototyping the Future
Highlights
You will collaborate in pairs to envision and design a future innovation in technology.
You will discuss and identify a specific purpose that your innovation will serve (e.g.,
entertainment, problem solving, education, artistic expression, etc.) and its key
features.
You will evaluate the potential benefits and risks of your innovation.
You will identify existing technological resources that your innovation may utilize.
You will identify technological challenges that must be overcome before your
innovation can be fully realized.
You will develop a mock-up of your innovation that demonstrates its use and
functionality.
You will write a detailed product description and deliver an elevator pitch to the class
detailing the features of your innovation and its potential impact on society using
appropriate terminology.
You will provide written feedback to your peers on the potential of each collaborative
team’s design.
Future Technology Project: Rubric Check
Feedback
This activity gives you and a partner group time to offer one another critical feedback
about the progress of your projects.
Listen to what your partner group says! You do not need to heed all of their advice, but
consider it wisely. The whole idea is to have a critical third party provide ideas to make your
project better before you submit the final version for a grade.