Rust Handouts v1.0
Part - I
- by Anjum Nazir
How to Install Rust
$ curl --proto '=https' --tlsv1.3 https://sh.rustup.rs -sSf | sh
Alternatively, download the installer script first and then run it:
$ curl --proto '=https' --tlsv1.3 https://sh.rustup.rs -sSf -o rust_setup.sh
$ sh rust_setup.sh
Cargo:
Cargo is Rust’s (1) build system and (2) package manager.
Cargo performs two tasks:
o Builds your code.
o Downloads and builds the libraries and dependencies your code needs.
$ cargo --version
$ cargo new hello_cargo
$ cd hello_cargo
Cargo has generated two files and one directory for us:
a Cargo.toml file and a src directory with a main.rs file inside.
It has also initialized a new Git repository along with a .gitignore file.
TOML File
TOML stands for Tom’s Obvious, Minimal Language. Like JSON, it is generally used to store
configuration data, but it is designed to be simple and human-readable. It is based on (i)
sections, also called tables, and (ii) key-value pairs that store the information.
The Cargo.toml file stores the project’s metadata, such as:
1. Project’s name,
2. Its version, and
3. Its dependencies.
$ cat Cargo.toml
[package]
name = "hello_cargo"
version = "0.1.0"
edition = "2021"
[dependencies]
The first line, [package], is a section heading that indicates that the following statements are
configuring a package.
In Rust, packages are referred to as crates.
### Building and Running a Cargo Project ###
$ cargo build
$ cargo run
$ cargo check
$ cargo build --release
block3 /root/rust/hello_world # cargo run
Compiling hello_world v0.1.0 (/root/rust/hello_world)
Finished dev [unoptimized + debuginfo] target(s) in 1.74s
Running `target/debug/hello_world`
Hello, world!
block3 /root/rust # tree hello_world
hello_world
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    └── debug
        ├── build
        ├── deps
        │   ├── hello_world-41309fa5631cf679
        │   └── hello_world-41309fa5631cf679.d
        ├── examples
        ├── hello_world
        ├── hello_world.d
        └── incremental
            └── hello_world-piieziagd64g
                ├── s-gf11eev05t-b8gy7z-3dgcwcgxcvw94
                │   ├── 17shh8yjp31t9tyr.o
                │   ├── 1b6ydeni2b79s1xb.o
                │   ├── 1oobr0kbynjc5z0d.o
                │   ├── 2d12tio9maj7d8ty.o
                │   ├── 2oj6d9jlshlxum4j.o
                │   ├── 3j7hh9aw5aybsehd.o
                │   ├── 47gw2ue69elo3h9v.o
                │   ├── 4c2488d0zcqpcvb4.o
                │   ├── dep-graph.bin
                │   ├── query-cache.bin
                │   └── work-products.bin
                └── s-gf11eev05t-b8gy7z.lock
9 directories, 20 files
Cargo.lock is a file that specifies the exact version numbers of all the dependencies so that
future builds are reliably built the same way until Cargo.toml is modified.
One of the first things you are likely to notice is that strings in Rust can
include a wide range of characters. Strings are guaranteed to be encoded as UTF-8.
Macros can be thought of as fancy functions for now. These offer the ability to avoid
boilerplate code.
In the case of println!, there is a lot of type detection going on under the hood so that
arbitrary data types can be printed to the screen.
Rust also brings contemporary developer tools to the systems programming world:
Cargo, the included dependency / package manager and build tool, that makes adding,
compiling, and managing dependencies painless and in a consistent way across the Rust
ecosystem.
Rustfmt (the code formatter) ensures a consistent coding style across developers.
The Rust Language Server (since superseded by rust-analyzer) powers Integrated
Development Environment (IDE) integration for code completion and inline error messages.
Getting Started
1. Installing Rust on Linux, macOS, and Windows
2. Writing a program that prints Hello, world!
3. Using cargo, Rust’s package manager and build system
If you want to stick to a standard style across Rust projects, you can use an automatic formatter
tool called rustfmt to format your code in a particular style. The Rust team has included this
tool with the standard Rust distribution, like rustc, so it should already be installed on your
computer! Check the online documentation for more details.
fn main() {
    println!("Hello, world!");
}
3. Just compiling with rustc is fine for simple programs, but as your project grows, you’ll
want to manage all the options and make it easy to share your code. Next, we’ll introduce
you to the Cargo tool, which will help you write real-world Rust programs.
Cargo!
Filename: Cargo.toml
1. This file is in the TOML (Tom’s Obvious, Minimal Language) format, which is Cargo’s
configuration format.
2. In Rust, packages of code are referred to as crates.
3. If you started a project that doesn’t use Cargo, as we did with the “Hello, world!”
project, you can convert it to a project that does use Cargo. Move the project code into
the src directory and create an appropriate Cargo.toml file.
What is Rust?
Rust’s distinguishing feature as a programming language is its ability to prevent invalid data
access at compile time.
Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees
thread safety.
Dangling pointers:
Dangling pointers are pointers in programming that point to a memory location that has
already been freed or deallocated. In other words, the pointer refers to a memory address that
is no longer valid or accessible, but the pointer itself still exists and holds that now-invalid
memory reference.
How Dangling Pointers Occur:
1. Memory Deallocation:
o A dangling pointer can occur when memory is freed or deallocated (e.g., using
free() in C/C++ or dropping a resource in Rust), but a pointer to that memory still
exists and hasn’t been updated to reflect that the memory is no longer valid.
o Example in C:
int *ptr = (int *)malloc(sizeof(int)); // Allocate memory
*ptr = 42;                             // Use the memory
free(ptr);                             // Deallocate memory
*ptr = 50; // Dangling pointer! Memory is freed, but ptr still points to it
2. Out-of-Scope Pointers:
A dangling pointer can also occur when a local variable goes out of scope, but a pointer
or reference to that variable is still retained elsewhere in the program.
Example in C:
int* danglingPointer() {
    int x = 10;
    return &x; // Returning pointer to local variable
}
Summary:
A dangling pointer occurs when a pointer still references memory that has been deallocated or
is otherwise invalid, leading to potential crashes, data corruption, or security issues. Proper
memory management is crucial to avoid such problems.
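For contrast, the second C example has no direct Rust counterpart that compiles: a function that returns a reference to its own local variable is rejected by the compiler. The sketch below shows two common alternatives, returning the value itself or moving it to the heap with Box. The function names are hypothetical and chosen only for illustration.

```rust
// Hypothetical counterparts to the C function `danglingPointer`.

// Rejected at compile time: `x` is dropped when the function returns,
// so a reference to it would dangle.
//
// fn dangling() -> &i32 {
//     let x = 10;
//     &x
// }

// Returning the value hands it to the caller (an i32 is simply copied).
fn owned_value() -> i32 {
    let x = 10;
    x
}

// Box moves the value to the heap; the caller becomes its owner.
fn boxed_value() -> Box<i32> {
    let x = 10;
    Box::new(x)
}

fn main() {
    let a = owned_value();
    let b = boxed_value();
    println!("a = {}, b = {}", a, *b);
}
```

In both cases the caller owns the data it receives, so there is no pointer left behind that could outlive the value.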
ch1/ch1-cereals/src/main.rs
#[derive(Debug)] // <1>
enum Cereal { // <2>
    Barley, Millet, Rice,
    Rye, Spelt, Wheat,
}
fn main() {
    let mut grains: Vec<Cereal> = vec![]; // <3>
    grains.push(Cereal::Rye); // <4>
    drop(grains); // <5>
    println!("{:?}", grains); // <6> compile-time error: grains was moved into drop
}
[Code Explanation]
#[derive(Debug)] // <1>
This attribute #[derive(Debug)] is used to automatically implement the Debug trait for the
Cereal enum.
The Debug trait allows the values of the enum to be printed using the {:?} format
specifier in println!.
enum Cereal { // <2>
Barley, Millet, Rice,
Rye, Spelt, Wheat,
}
Here, an enum called Cereal is defined with six variants: Barley, Millet, Rice, Rye, Spelt, and
Wheat.
Enums in Rust are used to represent a type that can hold one of several defined values (in this
case, different types of cereals).
fn main() {
let mut grains: Vec<Cereal> = vec![]; // <3>
An empty, mutable vector named grains is created to hold Cereal values.
grains.push(Cereal::Rye); // <4>
The push method adds Cereal::Rye (one of the enum variants) to the grains vector.
drop(grains); // <5>
The drop function is called to explicitly drop the grains vector. drop takes ownership of its
argument, so the vector is moved into the function and its memory is released before the
variable would otherwise go out of scope. After this line, grains is no longer valid.
println!("{:?}", grains); // <6>
This line attempts to print the grains vector using the {:?} format specifier, which relies on the
Debug trait that was automatically derived for the Cereal enum in line <1>. However, this line
results in a compile-time error because grains was moved into drop on line <5>, making it
invalid to use here.
In Rust, when you use println! to print a value, you often need to specify how the value should
be formatted. The {:?} format specifier is used for "debug formatting," which is a way to output
a value in a developer-friendly format (useful for debugging purposes). To use {:?}, the type of
the value being printed must implement the Debug trait.
1. The Debug Trait:
The Debug trait in Rust allows types (such as enums, structs, or other custom types) to be
printed in a way that is suitable for debugging. By default, custom types like enums (e.g.,
Cereal) don’t implement the Debug trait, so you can't directly print them using {:?}.
However, by using the #[derive(Debug)] attribute, Rust automatically generates the necessary
implementation of the Debug trait for your type. This means that your type can now be printed
with the {:?} format specifier.
#[derive(Debug)] // <1>
enum Cereal { // <2>
Barley, Millet, Rice, Rye, Spelt, Wheat,
}
The println! call attempts to print the grains vector, which is a Vec<Cereal>. Since Vec<T>
implements the Debug trait (when T also implements Debug), Rust can print the entire vector
in a readable format.
Thanks to the #[derive(Debug)] for Cereal, each variant of the Cereal enum (e.g., Rye) can be
printed using {:?}. For instance, if grains contains Cereal::Rye, the output would look like:
[Rye]
If the Debug trait had not been derived for Cereal, you would encounter a compilation
error saying that Cereal does not implement Debug.
Summary
{:?} is a format specifier used to print a value in a developer-friendly, debug format.
To print a type with {:?}, that type must implement the Debug trait.
#[derive(Debug)] automatically provides the Debug trait for a type, allowing it to be
printed using {:?}.
In this case, the Cereal enum has #[derive(Debug)], allowing the grains vector (which
holds Cereal values) to be printed in a readable format.
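As a minimal sketch of the points above, the following variant prints the vector before it is dropped, so it compiles and runs. The helper function describe and the shortened enum are assumptions made for this illustration; they are not part of the original listing.

```rust
#[derive(Debug)]
enum Cereal {
    Rye,
    Wheat,
}

// Hypothetical helper: formats the values with {:?}.
// This works only because Cereal derives Debug.
fn describe(grains: &[Cereal]) -> String {
    format!("{:?}", grains)
}

fn main() {
    let mut grains: Vec<Cereal> = vec![];
    grains.push(Cereal::Rye);
    grains.push(Cereal::Wheat);
    println!("{}", describe(&grains)); // printed while `grains` is still alive
    drop(grains); // nothing may use `grains` after this point
}
```

Removing the #[derive(Debug)] line makes the format! call fail to compile, which is the error described above.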
Listing 1.4 Example of Rust preventing a race condition
ch1/ch1-race/src/main.rs file.
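use std::thread;
fn main() {
    let mut data = 100;
    thread::spawn(|| { data = 500; });  // both threads try to write to data,
    thread::spawn(|| { data = 1000; }); // so the compiler rejects the program
    println!("{}", data);
}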
Closure
In Rust, a closure is an anonymous function that can capture variables from its surrounding
environment.
Closures are similar to functions, but with some important distinctions, such as their ability
to capture and use variables from the scope in which they are defined.
They are often used for short-lived tasks like passing as arguments to higher-order
functions, performing operations in a thread, or working with iterators.
let x = 5;
let closure = |y| x + y; // The closure captures `x` by borrowing.
println!("{}", closure(3)); // Output: 8
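The uses listed above can be sketched in one small program. It is only an illustration: the function apply and the variable names are hypothetical. It shows a closure passed to a higher-order function, a closure that takes ownership with move, and a closure used with an iterator.

```rust
// Hypothetical higher-order function: accepts any closure mapping i32 -> i32.
fn apply<F: Fn(i32) -> i32>(f: F, value: i32) -> i32 {
    f(value)
}

fn main() {
    let x = 5;

    // Captures `x` by reference; `x` stays usable afterwards.
    let add_x = |y| x + y;
    println!("{}", apply(add_x, 3)); // 8

    // `move` transfers ownership of `names` into the closure.
    let names = vec!["a", "b"];
    let count = move || names.len();
    println!("{}", count()); // 2

    // Closures are commonly passed to iterator adapters such as map.
    let doubled: Vec<i32> = vec![1, 2, 3].into_iter().map(|n| n * 2).collect();
    println!("{:?}", doubled); // [2, 4, 6]
}
```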
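use std::thread;
fn main() {
    let x = 10;
    // move transfers x into the new thread (closure body assumed for illustration)
    let handle = thread::spawn(move || {
        println!("x = {}", x);
    });
    handle.join().unwrap();
}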
handle.join().unwrap(); in Rust is used to wait for a spawned thread to finish and handle
any errors:
handle.join():
o Blocks the main thread until the spawned thread completes.
o It returns a Result, which is Ok if the thread finished successfully, or Err if the
thread panicked.
.unwrap():
o Extracts the Ok value from the Result. If the thread panicked (i.e., the Result is
Err), unwrap() will cause the program to panic and terminate.
In short, this line waits for the thread to finish and ensures that any thread failure causes a panic
in the main program.
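When a panic in the worker thread should not bring down the main program, the Result returned by join() can be matched instead of unwrapped. The following is a minimal sketch; the function run and its error message are assumptions made for this example.

```rust
use std::thread;

// Hypothetical helper: runs a closure on a new thread and
// reports whether that thread panicked.
fn run(should_panic: bool) -> Result<i32, String> {
    let handle = thread::spawn(move || {
        if should_panic {
            panic!("worker failed");
        }
        42
    });

    // join() blocks until the thread finishes. Matching on the Result
    // lets the caller decide what to do instead of panicking via unwrap().
    match handle.join() {
        Ok(value) => Ok(value),
        Err(_) => Err(String::from("thread panicked")),
    }
}

fn main() {
    println!("{:?}", run(false));
    println!("{:?}", run(true));
}
```

The second call still prints the panic message of the worker thread to standard error, but the main thread keeps running and receives an Err value.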
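use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
    let data = Arc::new(Mutex::new(100));
    let data1 = Arc::clone(&data);
    let data2 = Arc::clone(&data);
    let handle1 = thread::spawn(move || {
        let mut data = data1.lock().unwrap();
        *data = 500;
    });
    // Second writer; the value 1000 is assumed for illustration
    let handle2 = thread::spawn(move || {
        let mut data = data2.lock().unwrap();
        *data = 1000;
    });
    handle1.join().unwrap();
    handle2.join().unwrap();
    println!("{}", *data.lock().unwrap());
}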
This Rust code demonstrates how Rust handles an attempted "buffer overflow": the out-of-bounds
access is caught at run time as an index out of bounds error. Let’s break it down:
The indexing expression fruit[4] attempts to access the element at index 4 of the fruit vector.
However, the vector has only 3 elements (indices 0, 1, and 2), so index 4 does not exist. This
causes a runtime error called an "index out of bounds" error.
In Rust, when you try to access an invalid index in a vector with the [] operator, the program
panics and terminates instead of reading invalid memory, which prevents undefined behavior.
FIX
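fn main() {
    let fruit = vec!['🥝', '🍌', '🍇'];
    // get() returns an Option instead of panicking on a bad index
    if let Some(buffer_overflow) = fruit.get(4) {
        assert_eq!(*buffer_overflow, '🍉');
    }
}
-----------------------------------------------------------------------------------------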
if let Some(buffer_overflow):
o This is a shorthand pattern for matching the Option returned by
get().
o if let checks whether the result of fruit.get(4) is Some (i.e.,
the index exists and contains a value).
o If get(4) returns Some(value), the variable buffer_overflow is
assigned to value, and the block of code inside the if let
executes.
o If get(4) returns None (index out of bounds), the code inside the
block does not run, and the program continues after the if let.
*buffer_overflow:
o The buffer_overflow variable contains a reference to the element
in the vector because fruit.get(4) returns an Option<&T>, where &T
is a reference to the value.
o The * operator is used to dereference buffer_overflow, retrieving
the actual value stored at that index (in this case, a character,
such as '🍉').
o So, *buffer_overflow gives the value of the element, which is then
compared to '🍉'.
Some
Some is a variant of the Option enum and represents the presence of a value.
It is used to wrap a value that might or might not exist, alongside None, which represents
the absence of a value.
Option and its variants (Some and None) help handle situations where a value may or may
not be available without panicking.
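The two variants of Option can also be handled explicitly with match, or replaced by a default value. The sketch below assumes a hypothetical helper named describe; it is not part of the original listing.

```rust
// Hypothetical helper: looks up an index without risking a panic.
fn describe(fruit: &[char], index: usize) -> String {
    match fruit.get(index) {
        Some(item) => format!("index {} holds {}", index, item),
        None => format!("index {} is out of bounds", index),
    }
}

fn main() {
    let fruit = vec!['🥝', '🍌', '🍇'];
    println!("{}", describe(&fruit, 1));
    println!("{}", describe(&fruit, 4));

    // unwrap_or supplies a fallback when the Option is None.
    let fallback = fruit.get(4).copied().unwrap_or('?');
    println!("{}", fallback);
}
```

Unlike fruit[4], none of these calls can panic, because the missing element is represented as None and must be dealt with by the caller.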
Listing 1.6 Attempting to modify an iterator while iterating over it
ch1/ch1-letters/src/main.rs.
fn main() {
    let mut letters = vec![ // <1>
        "a", "b", "c"
    ];
    for letter in letters {
        letters.push(letter.clone()); // <2> rejected by the compiler
    }
}
Listing 1.6 fails to compile because Rust does not allow the letters variable to be modified
within the iteration block.
Pushing Inside the Loop:
The code tries to push letter.clone() back into the letters vector. The .clone() method
creates a copy of letter, which the code attempts to push into the vector.
However, this causes a compile-time error, not a runtime panic: the for loop holds letters
(by move or by borrow) for the whole iteration, so the vector cannot be modified (mutated)
inside the loop body. Rust rejects this for safety reasons, because a push may reallocate
the vector and invalidate the iterator.
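fn main() {
    let mut letters = vec!["a", "b", "c"];
    // Fix the bound up front: letters.len() grows with every push,
    // so re-checking it on each iteration would never terminate.
    let original_len = letters.len();
    let mut i = 0;
    while i < original_len {
        println!("{}", letters[i]);
        let letter = letters[i]; // &str is Copy, so no clone is needed
        letters.push(letter); // Safely push a copy of the current letter
        i += 1; // Increment index
    }
}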
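fn main() {
    let mut letters = vec!["a", "b", "c"];
    // Fix the bound up front: letters.len() grows with every push,
    // so re-checking it on each iteration would never terminate.
    let original_len = letters.len();
    let mut i = 0;
    while i < original_len {
        println!("i: {}, letter: {}", i, letters[i]);
        let letter = letters[i]; // &str is Copy, so no clone is needed
        letters.push(letter); // Safely push a copy of the current letter
        i += 1; // Increment index
    }
}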
Goal of Rust: Control
Listing 1.7 Multiple ways to create integer values
use std::rc::Rc;
use std::sync::{Arc, Mutex};
fn main() {
    let a = 10;
    let b = Box::new(20);
    let c = Rc::new(Box::new(30));
    let d = Arc::new(Mutex::new(40));
    println!("a: {:?}, b: {:?}, c: {:?}, d: {:?}", a, b, c, d);
}
Let's go through the Rust code step by step, explaining the concepts used such as Box, Rc, Arc,
and Mutex, and how they work together.
Code Breakdown:
use std::rc::Rc;
use std::sync::{Arc, Mutex};
Atomic Operation
An atomic operation is a low-level, indivisible operation that is performed in a single
step without the possibility of interference from other operations or threads.
It is guaranteed to be executed fully or not at all, ensuring that no other thread can
observe the operation in a partially completed state.
In multi-threaded programs, atomic operations are crucial for preventing race
conditions when multiple threads access or modify shared data.
Atomic operations are often used in synchronization primitives like mutexes and atomic
reference counting (e.g., Arc in Rust).
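A minimal sketch of an atomic operation in practice, using AtomicUsize from the standard library. The function name count_atomically and the thread counts are assumptions made for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper: spawns `threads` workers that each increment
// a shared counter `per_thread` times.
fn count_atomically(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let mut handles = Vec::new();

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // fetch_add is a single indivisible read-modify-write.
                counter.fetch_add(1, Ordering::SeqCst);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", count_atomically(4, 1000)); // 4000
}
```

Because every increment is atomic, no update is lost even though four threads modify the counter at the same time and no Mutex is involved.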
Variables Explanation:
1. let a = 10;:
o a is a simple integer variable. It's stored on the stack, as primitive types like i32
are typically stack-allocated in Rust.
2. let b = Box::new(20);:
o Box is a smart pointer that allocates data on the heap instead of the stack.
o Box::new(20) creates a new box that stores the integer 20 on the heap. The
value of b is the Box pointer, while the actual data (20) is stored on the heap.
o Purpose of Box: Used to heap-allocate values, especially when working with
types that must be stored on the heap (e.g., recursive data structures).
3. let c = Rc::new(Box::new(30));:
o Rc stands for reference counting, and it enables multiple ownership of the same
data. Here, c is a reference-counted smart pointer to a Box that holds the integer
30.
o Rc::new(Box::new(30)) creates a reference-counted pointer (Rc) that manages a
heap-allocated Box containing the value 30.
o Purpose of Rc: Useful when multiple parts of a program need to share ownership
of the same data, but only within a single thread (as Rc is not thread-safe).
4. let d = Arc::new(Mutex::new(40));:
o Arc is a thread-safe version of Rc. It stands for atomic reference counting,
meaning it can safely be used in multi-threaded environments. It uses atomic
operations to track the reference count, ensuring correctness across multiple
threads.
o Mutex provides a way to safely mutate data by allowing only one thread to
access the data at a time.
o Arc::new(Mutex::new(40)) creates an atomic reference-counted pointer (Arc) to
a Mutex that protects access to a heap-allocated integer (40).
o Purpose of Arc and Mutex: Used when multiple threads need to share
ownership of data (Arc), and they may need to mutate the data in a controlled
(synchronized) manner (Mutex).
Printing the Values:
println!("a: {:?}, b: {:?}, c: {:?}, d: {:?}", a, b, c, d);
The println! macro prints the values of a, b, c, and d using the {:?} format specifier, which is
used for debug printing.
a is printed as the simple value 10.
b is a Box that contains the value 20. Box forwards Debug to the value it owns, so it
prints as 20.
c is an Rc that wraps a Box. Both forward Debug to the inner value, so it prints as 30;
the reference count is not shown (it can be read with Rc::strong_count).
d is an Arc that contains a Mutex. Arc is transparent as well, so the output shows only
the Mutex and the value 40 inside it.
Example Output:
a: 10, b: 20, c: 30, d: Mutex { data: 40, poisoned: false, .. }
(Older toolchains print the last item as Mutex { data: 40 }.)
Summary:
a: A simple integer value, stored on the stack.
b: A Box smart pointer, heap-allocating the integer 20.
c: An Rc pointer with a reference to a Box, allowing shared ownership of the Box
(containing 30) within a single thread.
d: An Arc pointer wrapping a Mutex, allowing multiple threads to share and mutate a
protected integer (40) in a thread-safe manner.
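The shared-ownership behaviour of Rc and Arc described above can be observed directly. The following sketch is an illustration only: the function add_in_threads and the number of threads are assumptions, not part of Listing 1.7.

```rust
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: shares one integer between `n` threads;
// each thread adds 1 while holding the lock.
fn add_in_threads(n: usize) -> i32 {
    let d = Arc::new(Mutex::new(40));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let d = Arc::clone(&d);
            thread::spawn(move || {
                *d.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *d.lock().unwrap();
    total
}

fn main() {
    let c = Rc::new(Box::new(30));
    let c2 = Rc::clone(&c); // a second owner of the same allocation
    println!("strong count: {}", Rc::strong_count(&c)); // 2
    drop(c2);
    println!("strong count: {}", Rc::strong_count(&c)); // 1

    println!("d after 3 threads: {}", add_in_threads(3)); // 43
}
```

Rc::clone and Arc::clone copy only the pointer and increase the reference count; the value itself is never duplicated.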