Rust Programming Tutorial PDF
Rust Programming
Language Tutorial
(Basics)
Written by:
Apriorit Inc.
Author:
Alexey Lozovsky,
Software Designer in System Programming Team
https://www.apriorit.com
info@apriorit.com
Introduction
This Rust Programming Language Tutorial and feature overview was prepared by system
programming professionals from the Apriorit team. The tutorial covers the main features of
the Rust programming language in depth, provides examples of their use, and offers a brief
comparison with C++ in terms of complexity and capabilities.
Rust is a relatively new systems programming language, but it has already gained a lot of
loyal fans in the development community. Created as a low-level language, Rust has
managed to achieve goals that are usually associated with high-level languages.
This tutorial is divided into sections, with each section covering one of the main features of
the Rust language:
● zero-cost abstractions
● move semantics
● guaranteed memory safety
● threads without data races
● trait-based generics
● pattern matching
● type inference
● minimal runtime
● efficient C bindings
In addition, we have added a detailed chart comparing the feature set of Rust to that of C++. As a
leading language for low-level software development, C++ serves as a great reference point
for illustrating the advantages and disadvantages of Rust.
This tutorial will be useful for anyone who is just starting their journey with Rust, as well as for
those who want to gain a more in-depth perspective on Rust features.
Table of Contents
Introduction
Summary of Features
Trait-Based Generics
Traits Define Type Interfaces
Traits Implement Polymorphism
Traits May be Implemented Automatically
Pattern Matching
Type Inference
Minimal Runtime
Efficient C Bindings
Calling C from Rust
The Libc Crate and Unsafe Blocks
Beyond Primitive Types
Calling Rust from C
The majority of safety checks and memory management decisions are
performed by the Rust compiler so the program’s runtime performance
isn’t slowed down by them. This makes Rust a great choice for use cases
where more secure languages like Java aren’t good:
Rust can be used for web applications as well as for backend operations
due to the many libraries that are available through the Cargo package
registry.
Summary of Features
Before describing the features of Rust, we’d like to mention some issues
that the language successfully manages.
Table of content
Issue | Rust’s Solution
Use-after-free, double-free bugs, dangling pointers | Smart pointers and references avoid these issues by design
Legacy design of utility types heavily used by the standard library | Built-in, composable, structured types: tuples, structures, enumerations
Embedded and bare-metal programming place high restrictions on runtime environment | Minimal runtime size (which can be reduced even further); absence of built-in garbage collector, thread scheduler, or virtual machine
Now let’s look more closely at the features provided by the Rust
programming language and see how they’re useful for developing system
software.
Zero-Cost Abstractions
Zero-cost (or zero-overhead) abstractions are one of the most important
features explored by C++. Bjarne Stroustrup, the creator of C++,
describes them as follows:
“What you don’t use, you don’t pay for.” And further: “What you do
use, you couldn’t hand code any better.”
Abstraction is a great tool used by Rust developers to deal with complex
code. Generally, abstraction comes with runtime costs because
abstracted code is less efficient than specific code. However, with clever
language design and compiler optimizations, some abstractions can be
made to have effectively zero runtime cost. The usual sources of these
optimizations are static polymorphism (templates) and aggressive inlining,
both of which Rust embraces fully.
Iterators are an example of commonly used (and thus heavily optimized)
abstractions: they decouple algorithms for sequences of values from
the concrete containers holding those values. Rust iterators provide many
built-in combinators for manipulating data sequences, enabling concise
expressions of a programmer’s intent. Consider the following code:
// And here is what we can see if we print out the resulting vector:
println!("{:?}", numbers); // ===> [6, 7, 8, 10]
Combinators use high-level concepts such as closures and lambda
functions that have significant costs if compiled naively. However, due to
optimizations powered by LLVM, this code compiles as efficiently as the
explicit hand-coded version shown here:
use std::cmp::min;
    if n > 5 {
        numbers.push(n);
    }
    if numbers.len() == 4 {
        break;
    }
}
While this version is more explicit in what it does, the code using
combinators is easier to understand and maintain. Switching the type of
container where values are collected requires changes in only one line with
combinators versus three in the expanded version. Adding new conditions
and transformations is also less error-prone.
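To make the comparison concrete, the following self-contained sketch puts a combinator chain next to an explicit loop and checks that they agree. The input slices data1 and data2, the element-wise product, and the function names are assumptions chosen so that the result matches the [6, 7, 8, 10] output shown earlier; the tutorial's own listing may differ.

```rust
use std::cmp::min;

// Combinator version: zip, multiply, filter, and take the first four.
fn with_combinators(data1: &[i32], data2: &[i32]) -> Vec<i32> {
    data1
        .iter()
        .zip(data2.iter())   // pairs of elements from both slices
        .map(|(a, b)| a * b) // product of each pair
        .filter(|n| *n > 5)  // keep products greater than 5
        .take(4)             // stop after four results
        .collect()           // iteration actually happens here
}

// Hand-written version of the same computation.
fn with_loop(data1: &[i32], data2: &[i32]) -> Vec<i32> {
    let mut numbers = Vec::new();
    for i in 0..min(data1.len(), data2.len()) {
        let n = data1[i] * data2[i];
        if n > 5 {
            numbers.push(n);
        }
        if numbers.len() == 4 {
            break;
        }
    }
    numbers
}

fn main() {
    let data1 = [3, 1, 4, 1, 5, 9, 2, 6];
    let data2 = [2, 7, 1, 8, 2, 8, 1, 8];
    let numbers = with_combinators(&data1, &data2);
    println!("{:?}", numbers); // ===> [6, 7, 8, 10]
    assert_eq!(numbers, with_loop(&data1, &data2));
}
```

Changing the target container in the combinator version only touches the return type, while the loop version also needs a different constructor and insertion call.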
Iterators are Rust’s example of the “couldn’t hand code better” part. Smart
pointers are an example of the “don’t pay for what you don’t use”
approach in Rust.
The C++ standard library has a shared_ptr template class that’s used to
express shared ownership of an object. Internally, it uses reference
counting to keep track of an object’s lifetime. An object is destroyed when
its last shared_ptr is destroyed and the count drops to zero.
Note that objects may be shared between threads, so we need to avoid
data races in reference count updates. One thread must not destroy an
object while it’s still in use by another thread. And two threads must not
concurrently destroy the same object. Thread safety can be ensured by
using atomic operations to update the reference counter.
However, some objects (e.g. tree nodes) may need shared ownership but
may not need to be shared between threads. Atomic operations are
unnecessary overhead in this case. It may be possible to implement some
non_atomic_shared_ptr class, but accidentally sharing it between threads
(for example, as part of some larger data structure) can lead to
hard-to-track bugs. Therefore, the designers of the Standard Template
Library chose not to provide a single-threaded option.
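Rust's standard library takes the other route and offers both flavors: Rc<T> uses plain (non-atomic) reference counting, while Arc<T> uses atomic counting. As a rough sketch (the function names here are illustrative only), the compiler rejects any attempt to send an Rc to another thread, so the cheaper pointer cannot be shared by accident:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Single-threaded sharing: Rc updates its counter without atomics.
fn rc_count() -> usize {
    let node = Rc::new(vec![1, 2, 3]);
    let alias = Rc::clone(&node);
    // thread::spawn(move || alias.len()); // would not compile: Rc is not Send
    let count = Rc::strong_count(&node);
    drop(alias);
    count
}

// Cross-thread sharing: Arc pays for atomic counter updates.
fn arc_sum() -> i32 {
    let shared = Arc::new(vec![1, 2, 3]);
    let for_worker = Arc::clone(&shared);
    let worker = thread::spawn(move || for_worker.iter().sum::<i32>());
    worker.join().unwrap() + shared.len() as i32
}

fn main() {
    println!("Rc owners: {}", rc_count());
    println!("Arc result: {}", arc_sum());
}
```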
Move Semantics
programs by avoiding unnecessary copying of temporary values, enabling
safe storage of non-copyable objects like mutexes in containers, and
more.
Rust recognizes the success of move semantics and embraces them by
default. That is, all values are in fact moved when they’re assigned to a
different variable:
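A minimal sketch of this behavior, using a hypothetical Foo type that is not part of the tutorial's own code:

```rust
// Hypothetical non-copyable type used only for illustration.
#[derive(Debug, PartialEq)]
struct Foo {
    id: u32,
}

fn move_it() -> Foo {
    let foo = Foo { id: 7 };
    let bar = foo; // the value is moved: `bar` is now the owner
    // println!("{:?}", foo); // would not compile: use of moved value `foo`
    bar
}

fn main() {
    println!("{:?}", move_it());
}
```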
The punchline here is that after the move, you generally can’t use the
previous location of the value (foo in our case) because no value remains
there. But C++ doesn’t make this an error. Instead, it declares foo to have
an unspecified value (defined by the move constructor). In some cases,
you can still safely use the variable (like with primitive types). In other
cases, you shouldn’t (like with mutexes).
Some compilers may issue a diagnostic warning if you do something
wrong. But the standard doesn’t require C++ compilers to do so, as
use-after-move may be perfectly safe. Or it may not be and might instead
lead to undefined behavior. It’s the programmer’s responsibility to know
when use-after-move breaks and to avoid writing programs that break.
On the other hand, Rust has a more advanced type system and it’s a
compilation error to use a value after it has been moved, no matter how
complex the control flow or data structure:
error[E0382]: use of moved value: `foo`
  --> src/main.rs:13:1
   |
11 | let bar = foo;
   |     --- value moved here
12 |
13 | foo.some_method();
   | ^^^ value used here after move
   |
In fact, the Rust type system allows programmers to safely encode more
use cases than they can with C++. Consider converting between various
value representations. Let’s say you have a string in UTF-8 and you want
to convert it to a corresponding vector of bytes for further processing. You
don’t need the original string afterwards. In C++, the only safe option is to
copy the whole string using the vector copy constructor:
However, Rust allows you to move the internal buffer of the string into a
new vector, making the conversion efficient and disallowing use of the
original string afterwards:
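In Rust this conversion is available as String::into_bytes from the standard library. A short sketch (the to_bytes helper is just an illustrative name):

```rust
// Convert a String into its bytes without copying the buffer.
fn to_bytes(text: String) -> Vec<u8> {
    // `into_bytes` consumes the String and hands its buffer to the Vec.
    text.into_bytes()
}

fn main() {
    let string = String::from("Hello");
    let bytes = to_bytes(string);
    // println!("{}", string); // would not compile: `string` was moved
    println!("{:?}", bytes);
}
```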
are an example of such a copyable type, and any user-defined type can also
be marked as copyable with the #[derive(Clone, Copy)] attribute (Copy
requires Clone to be implemented as well).
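As a small sketch of a user-defined copyable type (Point is a made-up example type):

```rust
// A small user-defined type marked as copyable.
// Copy requires Clone, so both traits are derived.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_and_keep() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = a; // `a` is copied bit by bit, not moved
    (a, b)     // so both variables are still usable here
}

fn main() {
    let (a, b) = copy_and_keep();
    println!("{:?} {:?}", a, b);
}
```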
● segmentation faults
● use-after-free and double-free bugs
● dangling pointers
● null dereferences
● unsafe concurrent modification
● buffer overflows
Ownership
1 fn f() {
2     let v = Foo::new(); // ----+ v's lifetime
3                         //     |
4     /* some code */     //     |
5 }                       // <---+
In this case, the object Foo is owned by the variable v and will die at line 5,
when function f() returns.
Ownership can be transferred by moving the object (which is performed by
default when the variable is assigned or used):
1 fn f() {
2     let v = Foo::new();  // ----+ v's lifetime
3     {                    //     |
4         let u = v;       // <---X---+ u's lifetime
5                          //         |
6         do_something(u); // <-------X
7     }                    //
8 }                        //
Initially, the variable v would be alive for lines 2 through 7, but its lifetime
ends at line 4 where v is assigned to u. At that point we can’t use v
anymore (or a compiler error will occur). But the object Foo isn’t dead yet;
it merely has a new owner u that is alive for lines 4 through 6. However, at
line 6 the ownership of Foo is transferred to the function do_something().
That function will destroy Foo as soon as it returns.
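The transfers described above can be observed at runtime with a destructor. In this sketch, Foo is a hypothetical type whose Drop implementation writes to a log, which shows that the object is destroyed inside do_something() and not at the end of the caller's scope:

```rust
use std::cell::RefCell;

// Hypothetical type that records when it is destroyed.
struct Foo<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Foo<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Foo dropped");
    }
}

// Takes ownership of its argument; the value is dropped when it returns.
fn do_something(foo: Foo) {
    foo.log.borrow_mut().push("do_something called");
}

fn trace() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let v = Foo { log: &log };
        let u = v;       // ownership moves from v to u
        do_something(u); // ownership moves into the function
        log.borrow_mut().push("back in caller");
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", trace());
}
```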
Borrowing
1 fn f() {
2     let v = Foo::new();     // ---+ v's lifetime
3                             //    |
4     do_something(&v);       // :--|----.
5                             //    |     } v's borrowed
6     do_something_else(&v);  // :--|----'
7 }                           // <--+
C++ can handle simple cases like this just as well. But what if we want to
return a reference? What lifetime should the reference have? Obviously,
not longer than the object it refers to. However, since lifetimes aren’t part
of C++ reference types, the following code is syntactically correct for C++
and will compile just fine:
fn some_call(v: &Foo) -> &Foo { // ------------------+ expected
    let w = Foo::new();         // ---+ w's lifetime | lifetime
                                //    |              | of the
    return &w;                  // <--+              | returned
}                               //                   | value
                                // <-----------------+
The returned reference is expected to have the same lifetime as the
argument v, which is expected to live longer than the function call.
However, the variable w lives only for the duration of some_call(), so
references to it can’t live longer than that. The borrow checker detects this
conflict and complains with a compilation error instead of letting the issue
go unnoticed.
The compiler is able to detect this error because it tracks lifetimes
explicitly and thus knows exactly how long values must live for the
references to still be valid and safe to use. It’s also worth noting that you
don’t have to explicitly spell out all lifetimes for all references. In many
cases, like in the example above, the compiler is able to automatically infer
lifetimes, freeing the programmer from the burden of manual specification.
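A brief sketch of both situations, with illustrative function names: in the first function the compiler infers the lifetime of the result, and in the second it has to be written out because the result may borrow from either argument.

```rust
// Lifetimes are inferred here: the result borrows from `text`.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// With two reference arguments the compiler cannot guess which one the
// result borrows from, so the lifetime 'a is spelled out.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let sentence = String::from("borrow checker");
    println!("{}", first_word(&sentence));
    println!("{}", longest("move", "borrow"));
}
```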
Foo c;
Foo *a = &c;
const Foo *b = &c;
Here, pointers a and b are aliases of the Foo object owned by c.
Modifications performed via a will be visible when b is dereferenced.
Usually, aliasing doesn’t cause errors, but there are some cases where it
might.
Consider the memcpy() function. It can be, and is, used for copying data, but
it’s known to be unsafe and can cause memory corruption when applied to
overlapping regions:
char array[5] = { 1, 2, 3, 4, 5 };
char *a = &array[0];       // destination: must be writable
const char *b = &array[2]; // source: overlaps the destination
memcpy(a, b, 3);
In the sample above, the first three elements are now undefined because
their values depend on the order in which memcpy() performs the copying:
The ultimate issue here is that the program contains two aliasing
references to the same object (the array), one of which is non-constant. If
such programs were syntactically incorrect then memcpy() (and any other
function with pointer arguments as well) would always be safe to use.
The second rule relates to ownership, which was discussed in the previous
section. The first rule is the real novelty of Rust.
It’s obviously safe to have multiple aliasing pointers to the same object if
none of them can be used to modify the object (i.e. they are constant
references). If there are two mutable references, however, then
modifications can conflict with each other. Also, if there is a
const-reference A and a mutable reference B, then presumably the
constant object as seen via A can in fact change if modifications are made
via B. But it’s perfectly safe if only one mutable reference to the object is
allowed to exist in the program. The Rust borrow checker enforces these
rules during compilation, effectively making each reference act as a
read-write lock for the object.
This won’t compile in Rust, and will throw the following error:
This error signifies the restrictions imposed by the borrow checker.
Multiple immutable references are fine. One mutable reference is fine.
Different references to different objects are fine. However, you can’t
simultaneously have a mutable and an immutable reference to the same
object because this is possibly unsafe and can lead to memory corruption.
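These rules can be sketched in a few lines. The vector contents and variable names below are arbitrary; the commented-out line shows the kind of code the borrow checker rejects.

```rust
fn borrow_rules() -> Vec<i32> {
    let mut data = vec![1, 2, 3, 4, 5];

    // Any number of immutable references may coexist.
    let first = &data[0];
    let last = &data[4];
    let sum = *first + *last;

    // A single mutable reference is fine once the immutable ones are unused.
    let tail = &mut data[2..];
    tail[0] = sum;
    // let peek = &data; tail[1] = peek[0]; // would not compile: `data` is
    //                                      // still mutably borrowed by `tail`

    // Disjoint parts of the same vector can be borrowed mutably together.
    let (left, right) = data.split_at_mut(2);
    left[0] += right[0];
    data
}

fn main() {
    println!("{:?}", borrow_rules());
}
```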
Not only does this restriction prevent possible human errors, but it in fact
enables the compiler to perform some optimizations that are normally not
possible in the presence of aliasing. The compiler is then free to use
registers more aggressively, avoiding redundant memory access and
leading to increased performance.
Another common issue related to pointers and references is null pointer
dereferencing. Tony Hoare calls the invention of the null pointer value his
billion-dollar mistake, and an increasing number of languages are including
mechanisms to prevent it (for example, Nullable types in Java and C#,
the std::optional type since C++17).
Rust uses the same approach as C++ for references: they always point to
an existing object. There are no null references and hence no possible
issues with them. However, smart pointers aren’t references and may not
point to objects; there are also cases when you might like to pass either a
reference to an object or no reference at all.
Some(value) – to declare some value
None is functionally equivalent to a null pointer (and in fact has the same
representation), while Some carries a value (for example, a reference to
some object).
match option_value {
    Some(value) => {
        // use the contained value
    }
    None => {
        // handle absence of value
    }
}
Every use of Option acts as a clear marker that an object may be absent
there, and it requires handling both cases. Furthermore, the
Option type has many utility methods that make it more convenient to use:
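A few of these methods are sketched below. The lookup table inside find_age() is a made-up example; filter, map, and unwrap_or_else are methods of the standard Option type.

```rust
// Look up a user's age; the user table is a made-up example.
fn find_age(name: &str) -> Option<u32> {
    match name {
        "alice" => Some(30),
        "bob" => Some(25),
        _ => None,
    }
}

fn describe(name: &str) -> String {
    find_age(name)
        .filter(|age| *age >= 18)                    // drop values failing a test
        .map(|age| format!("{} is {}", name, age))   // transform a present value
        .unwrap_or_else(|| String::from("unknown"))  // fallback for None
}

fn main() {
    println!("{}", describe("alice"));
    println!("{}", describe("carol"));
    // `if let` handles only the Some case when None needs no action.
    if let Some(age) = find_age("bob") {
        println!("bob is {}", age);
    }
}
```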
No Uninitialized Variables
Another possible issue with so-called plain old types in C++ is usage of
uninitialized variables. Rust requires variables to be initialized before they
are used. Most commonly, this is done when variables are declared:
But it’s also possible to first declare a variable and then initialize it later:
let mut x;
// x is left truly uninitialized here; the assembly
// will not contain any actual memory assignment
loop {
    if something_happens() {
        x = 1;
        println!("{}", x); // This is okay because x is initialized now
    }
    if some_condition() {
        x = 2;
        break;
    }
    if another_condition() {
        x = 3;
        break;
    }
}
Whatever you do, keep in mind that with Rust if you forget to initialize a
variable and then accidentally use a garbage value, you’ll get a
compilation error. All structure fields must be initialized at construction
time as well:
If a field is added to the structure at some point later, all existing
constructors will generate compilation errors that must be fixed.
Threads without Data Races
During early development of Rust, it was discovered that the borrow
checker (responsible for general memory safety) is also capable of
preventing data races between threads. In order for a data race to occur,
three conditions must simultaneously hold:
The borrow checker ensures that at any point in time a memory location
has either any number of read-only references or exactly one writable
reference, thus preventing the first and second conditions from occurring
together.
However, while references are the most common way to read and modify
memory, Rust has other options for this, so the borrow checker isn’t a
silver bullet. Furthermore, risks to thread safety aren’t limited to data
races. Rust has no special magic to completely prevent general race
conditions, so deadlocks and poor usage of synchronization primitives are
still possible and certain knowledge is still required to avoid them.
The Rust compiler prevents the most common issues with concurrency
that are allowed by less safe programming languages like C or C++, but it
doesn’t require garbage collection or some background, under-the-hood
threads and synchronization to achieve this. The standard library includes
many tools for safe usage of multiple concurrency paradigms:
● message passing
● shared state
● lock-free
● purely functional
Channels are used to transfer messages between threads. Ownership of
messages sent over a channel is transferred to the receiver thread, so you
can send pointers between threads without fear of a possible race
condition occurring later. Rust’s channels enforce thread isolation.
use std::thread;
use std::sync::mpsc::channel;
// First create a channel consisting of two parts: the sending half (tx)
// and the receiving half (rx). The sending half can be duplicated and
// shared among many threads.
let (tx, rx) = channel();
// Then spawn ten worker threads. Each one gets its own clone of the
// sending half and sends a single number through it.
for i in 0..10 {
    let tx = tx.clone();
    thread::spawn(move || {
        tx.send(i).unwrap();
    });
}
// Now in the main thread we’ll receive the expected ten numbers from
// our worker threads. The numbers may arrive in any arbitrary order,
// but all of them will be read safely.
for _ in 0..10 {
    let j = rx.recv().unwrap();
    assert!(0 <= j && j < 10);
}
An important thing here is that messages are actually moved into the
channel, so it’s not possible to use them after sending. Only the receiver
thread can re-acquire ownership over the message later.
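The move is easiest to see with a non-copyable message such as a vector. In this sketch (send_and_receive is an illustrative name), the sender loses access to the vector as soon as it is sent:

```rust
use std::sync::mpsc::channel;
use std::thread;

fn send_and_receive() -> Vec<u8> {
    let (tx, rx) = channel();
    let worker = thread::spawn(move || {
        let message = vec![1, 2, 3];
        tx.send(message).unwrap(); // `message` is moved into the channel
        // println!("{:?}", message); // would not compile: value moved
    });
    let received = rx.recv().unwrap(); // the receiver now owns the vector
    worker.join().unwrap();
    received
}

fn main() {
    println!("{:?}", send_and_receive());
}
```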
Another more traditional way to deal with concurrency is to use a passive
shared state for communication between threads. However, this approach
requires proper synchronization, which is notoriously hard to do correctly:
it’s very easy to forget to acquire a lock or to mutate the wrong data while
holding the correct lock. Many languages even go as far as removing
support for this low-level concurrency style altogether in favor of
higher-level approaches (like channels). Rust’s approach is dictated by the
following thoughts:
So Rust provides tools for using shared state concurrency in a safe but
direct way.
Threads in Rust are naturally isolated from each other via ownership,
which can be transferred only at safe points like during thread creation or
via safe interfaces like channels. A thread can mutate data only when it
has a mutable reference to the data. In single-threaded programs, safe
usage of references is enforced statically by the borrow checker.
Multi-threaded programs must use locks to provide the same mutual
exclusion guarantees dynamically. Rust’s ownership and borrowing
system allows locks to have an API that’s impossible to use unsafely. The
principle “lock data, not code” is enforced in Rust.
impl<T> ThreadSafeStack<T> {
    fn new() -> ThreadSafeStack<T> {
        ThreadSafeStack {
            // The vector we created is moved into the mutex,
            // so we cannot access it directly anymore
            elements: Mutex::new(Vec::new()),
        }
    }
}
impl<T> ThreadSafeStack<T> {
    // Note that the push() method takes a non-mutable reference to
    // "self," allowing multiple references to exist in different threads:
    fn push(&self, value: T) {
        // lock() returns an error only if another thread panicked while
        // holding the lock, so unwrap() is acceptable in this example.
        let mut elements = self.elements.lock().unwrap();
        // After acquiring the lock, the mutex will remain locked until
        // this method returns, preventing any race conditions.
        // You can safely access the underlying data and perform any
        // actions you need to with it:
        elements.push(value);
    }
}
impl<T> ThreadSafeStack<T> {
    // Peek into the stack, returning a reference to the top element.
    // If the stack is empty then return None.
    fn peek(&self) -> Option<&T> {
        let elements = self.elements.lock();
        // Now we can access the stack data.
However, the Rust compiler won’t allow this. It sees that the reference
you’re trying to return will be alive longer than the time for which the lock
is held, and will tell you exactly what is wrong with the code:
Indeed, if this were allowed, the reference you got may suddenly be
invalidated when some other thread popped the top element off the stack.
A possible race condition has been swiftly averted at compilation time,
saving you an hour or two of careful debugging of a weird behavior that
occurs only on Tuesdays.
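One way to satisfy the compiler, sketched here under the assumption that elements are cheap to clone, is to return a copy of the top element so that nothing borrowed from the mutex outlives the lock. The T: Clone bound, the len() helper, and fill_from_threads() are additions for this sketch and are not part of the tutorial's own code.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct ThreadSafeStack<T> {
    elements: Mutex<Vec<T>>,
}

impl<T: Clone> ThreadSafeStack<T> {
    fn new() -> Self {
        ThreadSafeStack { elements: Mutex::new(Vec::new()) }
    }

    fn push(&self, value: T) {
        // lock() fails only if another thread panicked while holding the lock.
        self.elements.lock().unwrap().push(value);
    }

    // Returns a copy of the top element, so nothing outlives the lock.
    fn peek(&self) -> Option<T> {
        self.elements.lock().unwrap().last().cloned()
    }

    fn len(&self) -> usize {
        self.elements.lock().unwrap().len()
    }
}

// Push one value from each of four threads and report the final size.
fn fill_from_threads() -> usize {
    let stack = Arc::new(ThreadSafeStack::new());
    let workers: Vec<_> = (0..4)
        .map(|i| {
            let stack = Arc::clone(&stack);
            thread::spawn(move || stack.push(i))
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    stack.len()
}

fn main() {
    println!("{} elements pushed", fill_from_threads());
    let stack = ThreadSafeStack::new();
    stack.push("top");
    println!("{:?}", stack.peek());
}
```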
Trait-Based Generics
Generics are a way of generalizing types and functionalities to broader
cases. They’re extremely useful for reducing code duplication in many
ways, but can call for rather involved syntax. Namely, generics require
great care in specifying over which types they are actually considered
valid. The simplest and most common use of generics is for type
parameters.
Rust’s generics are similar to C++ templates in both syntax and usage. For
example, the generic Rust data structure implementing a dynamically
sized array is called Vec<T>. It’s a vector specialized at compile time to
contain instances of any type T. Here’s how it’s defined in the standard
library:
A generic function max() that takes two arguments of any type T can be
declared like this:
fn max<T>(a: T, b: T) -> T { /* … */ }
However, this definition is incorrect because we can’t in fact apply max()
to any type. The maximum function is defined only for values which have
some defined ordering between them (the one used by comparison
operators like < and >=). This is where Rust generics are different from
C++ templates.
This function will compile only when used with types that actually define
operator<(), and will cause a compilation error otherwise:
test.cpp: In instantiation of ‘T max(T, T) [with T = Person]’:
test.cpp:19:46: required from here
test.cpp:11:14: error: no match for ‘operator<’ (operand types are
‘Person’ and ‘Person’)
return a < b ? b : a;
~~^~~
This error may be sufficient in simple cases like max() where type
requirements and the source of the error are obvious, but with more
complex functions using more than one required method, you can be
quickly overwhelmed by large amounts of seemingly unrelated errors
about obscure missing methods.
Traits are Rust’s way of defining interfaces. They describe what methods
must be implemented by a type in order to satisfy the trait. For example,
the Ord trait requires a single method cmp() that compares this value to
another and returns the ordering between them:
Traits may be derived from other traits. The Ord trait is derived from the
traits PartialOrd (specifying partial ordering) and Eq (specifying
equivalence relation). Thus, Ord may be implemented only for types that
implement both the PartialOrd and Eq traits. Methods of parent traits are
inherited by child trait implementations, so for example any Ord type can
be compared with the == operator, which comes from PartialEq, the parent
trait of Eq.
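As a sketch of how these pieces fit together, the generic max() below states its requirement with a T: Ord bound, and a hypothetical Person type (ordered by age only, an assumption made for this example) implements the whole trait chain by hand:

```rust
use std::cmp::Ordering;

// Generic maximum: the bound T: Ord states the requirement up front.
fn max<T: Ord>(a: T, b: T) -> T {
    if a < b { b } else { a }
}

// Hypothetical type ordered by age only.
#[derive(Debug)]
struct Person {
    name: String,
    age: u32,
}

impl PartialEq for Person {
    fn eq(&self, other: &Self) -> bool {
        self.age == other.age
    }
}
impl Eq for Person {}

impl PartialOrd for Person {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Person {
    fn cmp(&self, other: &Self) -> Ordering {
        self.age.cmp(&other.age)
    }
}

fn main() {
    let ann = Person { name: String::from("Ann"), age: 41 };
    let bob = Person { name: String::from("Bob"), age: 29 };
    println!("{}", max(3, 8));
    println!("{}", max(ann, bob).name);
}
```

Calling max() with a type that lacks an Ord implementation is rejected at the call site with a single error naming the missing trait.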
Aside from methods, traits can contain only associated types and
constants. Like in Java or C#, traits define only interface requirements,
not the data layout of concrete implementations of the interface.
But sometimes, you want to have generic code that acts differently based
on real runtime value types. Rust implements dynamic (runtime)
polymorphism via so-called trait objects.
trait ProductB {
    fn do_bar(&mut self);
}
Note that the factory produces Boxes with products. A Box is Rust’s
equivalent of std::unique_ptr in C++. It denotes an object allocated on the
heap and can contain trait objects that are actually pointers to the
concrete implementation of a trait.
Note how trait implementations are separated from declared structures.
This is why Rust has traits, not interfaces. A trait can be implemented for a
type not only by the module that declares the type but from anywhere else
in the program. This opens up many possibilities for extending the
behavior of library types if the provided interfaces don’t meet your needs.
// The `dyn` keyword marks trait objects, whose methods are dispatched
// at runtime.
fn make_and_use_some_stuff(factory: &dyn ProductFactory) {
    let mut a: Box<dyn ProductA> = factory.make_product_a();
    let mut b: Box<dyn ProductB> = factory.make_product_b();
    a.do_foo();
    b.do_bar();
}
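For a complete picture, here is a self-contained sketch of such a factory. The Plain* types are hypothetical, and the product methods return a String (an assumption made only so that the result can be observed); the trait and function names follow the fragments above.

```rust
trait ProductA {
    fn do_foo(&mut self) -> String;
}
trait ProductB {
    fn do_bar(&mut self) -> String;
}
trait ProductFactory {
    fn make_product_a(&self) -> Box<dyn ProductA>;
    fn make_product_b(&self) -> Box<dyn ProductB>;
}

// One hypothetical family of concrete products.
struct PlainA;
struct PlainB;
struct PlainFactory;

impl ProductA for PlainA {
    fn do_foo(&mut self) -> String { String::from("plain foo") }
}
impl ProductB for PlainB {
    fn do_bar(&mut self) -> String { String::from("plain bar") }
}
impl ProductFactory for PlainFactory {
    fn make_product_a(&self) -> Box<dyn ProductA> { Box::new(PlainA) }
    fn make_product_b(&self) -> Box<dyn ProductB> { Box::new(PlainB) }
}

// The caller sees only trait objects; calls are dispatched at runtime.
fn make_and_use_some_stuff(factory: &dyn ProductFactory) -> Vec<String> {
    let mut a = factory.make_product_a();
    let mut b = factory.make_product_b();
    vec![a.do_foo(), b.do_bar()]
}

fn main() {
    println!("{:?}", make_and_use_some_stuff(&PlainFactory));
}
```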
Some traits may be automatically derived and implemented by the
compiler for user-defined data structures. For example, the PartialEq trait
that defines the == comparison operator can be automatically
implemented for any data structure provided that all its fields implement
PartialEq too:
#[derive(PartialEq)]
struct Person {
    name: String,
    age: u32,
}
Pattern Matching
Similar to C++, Rust has enumeration types:
enum Month {
    January, February, March, April, May, June, July,
    August, September, October, November, December,
}
match month {
    Month::December | Month::January | Month::February
        => println!("It’s winter!"),
    Month::March | Month::April | Month::May
        => println!("It’s spring!"),
    Month::June | Month::July | Month::August
        => println!("It’s summer!"),
    Month::September | Month::October | Month::November
        => println!("It’s autumn!"),
}
However, match has more features than a simple switch. The most crucial
difference is that matching must be exhaustive: the match clause must
handle all possible values of the expression being matched. This
eliminates a typical error in which switch statements break when an
enumeration is extended later with new values. Of course, there’s also a
default catch-all option that matches any value:
match number {
    0..=9 => println!("small number"),
    10..=100 if number % 2 == 0 => {
        println!("big even number");
    }
    _ => println!("some other number"),
}
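Wrapped in a function, the same match becomes an expression whose value can be returned directly. The classify() name and the u32 argument type are assumptions for this sketch:

```rust
fn classify(number: u32) -> &'static str {
    match number {
        0..=9 => "small number",
        10..=100 if number % 2 == 0 => "big even number", // guard on a range
        _ => "some other number", // catch-all keeps the match exhaustive
    }
}

fn main() {
    for n in [7, 42, 43, 1000] {
        println!("{}: {}", n, classify(n));
    }
}
```

Removing the catch-all arm makes the compiler report the values that are not covered.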
Another important feature of Rust enumerations is that they can carry
values, implementing discriminated unions safely.
enum Color {
    Red, Green, Blue,
    RGB(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}
Pattern matching can be used to match against possible options and
extract values stored in a union:
match some_color {
    Color::Red => println!("Pure red"),
    Color::Green => println!("Pure green"),
    Color::Blue => println!("Pure blue"),
    Color::RGB(red, green, blue) => {
        println!("Red: {}, green: {}, blue: {}", red, green, blue);
    }
    Color::CMYK(cyan, magenta, yellow, black) => {
        println!("Cyan: {}, magenta: {}, yellow: {}, black: {}",
                 cyan, magenta, yellow, black);
    }
}
Unlike C and C++ unions, Rust makes it impossible to choose an incorrect
branch when unpacking a union.
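Because each arm can only see the fields of its own variant, a conversion function over such an enumeration is straightforward to write. The sketch below reuses the shape of the Color type; the to_rgb() function and the assumption that CMYK components are percentages (0 to 100) are illustrative only.

```rust
// Same shape as the Color enumeration above.
enum Color {
    Red,
    Green,
    Blue,
    RGB(u32, u32, u32),
    CMYK(u32, u32, u32, u32),
}

// Convert any Color to an (r, g, b) triple in the 0..=255 range.
// The CMYK formula assumes components are percentages (0..=100).
fn to_rgb(color: &Color) -> (u32, u32, u32) {
    match *color {
        Color::Red => (255, 0, 0),
        Color::Green => (0, 255, 0),
        Color::Blue => (0, 0, 255),
        Color::RGB(r, g, b) => (r, g, b),
        Color::CMYK(c, m, y, k) => {
            let scale = |x: u32| 255 * (100 - x) * (100 - k) / 10_000;
            (scale(c), scale(m), scale(y))
        }
    }
}

fn main() {
    println!("{:?}", to_rgb(&Color::Green));
    println!("{:?}", to_rgb(&Color::CMYK(0, 100, 0, 0)));
}
```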
Type Inference
Rust uses a static type system, which means that types of variables,
function arguments, structure fields, and so on must be known at compile
time; the compiler will check that correct types are used everywhere.
However, Rust also uses type inference, which allows the compiler to
automatically deduce types based on how variables are used.
This is very convenient because you no longer need to explicitly state
types, which in some cases may be cumbersome (or impossible) to write.
The auto keyword in C++ serves the same purpose:
However, Rust also considers future uses of a variable to deduce its type –
not only the initializer – allowing programmers to write code like this:
let v = 10;  // v’s type is some integer (based on the constant),
             // but the exact type (i32, u8, etc.) is not yet known
vec.push(v); // after this line, the compiler knows that T == v’s type
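A runnable sketch of the same idea, in which the function's return type is what finally fixes the element type (build() is an illustrative name):

```rust
fn build() -> Vec<u8> {
    let mut vec = Vec::new(); // Vec<T>, with T not yet known
    let v = 10;               // some integer type, not yet fixed
    vec.push(v);              // now T is the same type as v
    vec                       // the return type pins both T and v to u8
}

fn main() {
    let bytes = build();
    println!("{:?}, element size: {}", bytes, std::mem::size_of_val(&bytes[0]));
}
```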
Minimal Runtime
Runtime is the language support library that’s embedded into every
and garbage collection. The size and complexity of the runtime contribute
significantly to start-up and runtime overhead. For example, the JVM
requires a non-negligible amount of time to load classes, warm up the JIT
compiler, collect garbage, and so on.
● Rust Embedded (https://github.com/rust-embedded)
● rust.ko (https://github.com/tsgates/rust.ko)
● Windows KMD (https://github.com/pravic/winapi-kmd-rs)
● Redox OS (https://www.redox-os.org/)
Furthermore, the absence of a complex runtime simplifies embedding Rust
modules into programs written in other languages. For example, you can
easily write JNI code for Java or extensions for dynamic languages like
Python, Ruby, or Lua.
Efficient C Bindings
There’s more than one programming language in the world, so it’s not
surprising that you might want to use libraries written in languages other
than Rust. Conventionally, libraries provide a C API because C is a
ubiquitous language, the common denominator of programming
languages. Rust is able to easily communicate with C APIs, without any
overhead, and use its ownership system to provide significantly stronger
safety guarantees for them.
/**
 * Sum some numbers.
 *
 * @param numbers [in] pointer to the numbers to be summed
 *                     must not be NULL and must point to at least
 *                     `count` elements
 * @param count   [in] number of numbers to be summed
 *
 * @returns sum of the provided numbers.
 */
int sum_numbers(const int *numbers, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += numbers[i];
    }
    return sum;
}
Note that some parts of this function API are described formally by the
argument types, but some things are only specified in the documentation.
For example, only the documentation tells us that we can’t pass NULL for
the numbers argument and that there must be at least count numbers
available. And only the common sense of a C programmer tells us that the
function won’t call free() for the numbers array.
extern {
    fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
        -> libc::c_int;
}
fn main() {
    let array = [1, 2, 3, 4, 5];
    let sum = unsafe { sum_numbers(array.as_ptr(), array.len()) };
    println!("Sum: {}", sum); // ===> prints "Sum: 15"
}
There’s no hidden boxing and unboxing, re-allocating of the array,
obligatory safety checks, or other things. We see exactly the same
machine code that a C compiler would have generated for the same library
function call.
However, there are some details in the above code that require further
explanation – first of all, the libc crate. This is a wrapper library that
provides types and functions of the C standard library to Rust. Here you
can find all the usual types, constants, and functions:
Not only can you use “normal” C libraries via the Rust Foreign Function
Interface – you can also readily use the system API via the libc crate.
will allow you to pass a NULL pointer as the first argument and this will
cause undefined behavior (just as it would in C). The function call isn’t
safe, so it must be wrapped in an unsafe block which effectively says
“Compiler, you have my word that this function call is safe. I have verified
that the arguments are okay, that the function won’t compromise Rust
safety guarantees, and that it won’t cause undefined behavior.”
Just as in C, the programmer is ultimately responsible for guaranteeing
that the program doesn’t cause undefined behavior. The difference here is
that with C you must manually do this at all times, in all parts of the code,
for every library you use. On the other hand, in Rust you must manually
verify safety only inside unsafe blocks. All other Rust code (outside unsafe
blocks) is automatically safe, as routinely verified by the Rust compiler.
Herein lies the power of Rust: you can provide safe wrappers for unsafe
code and thus avoid tedious, manual safety verifications in the consumer
code. For example, the sum_numbers function can be wrapped like this:
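A minimal sketch of such a wrapper might look like the following. The wrapper name sum and the Rust stand-in for the C function are assumptions made so that the example compiles on its own; in real code sum_numbers would be declared in an extern block and linked from the C library.

```rust
use std::os::raw::c_int;

// Stand-in for the C library function (an assumption for this sketch).
// It has the C ABI and the same contract: `numbers` must point to at
// least `count` readable values.
unsafe extern "C" fn sum_numbers(numbers: *const c_int, count: usize) -> c_int {
    let mut total: c_int = 0;
    for i in 0..count {
        total += unsafe { *numbers.add(i) };
    }
    total
}

/// Safe wrapper: a slice is never NULL and always knows its own length,
/// so both requirements of the native function hold by construction.
pub fn sum(numbers: &[c_int]) -> c_int {
    unsafe { sum_numbers(numbers.as_ptr(), numbers.len()) }
}
```

With this in place, a call such as sum(&[1, 2, 3, 4, 5]) needs no unsafe block at the call site; the single unsafe block lives inside the wrapper.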
Now the external function has a safe interface. It can be readily used by
idiomatic Rust code without unsafe blocks. Callers of the function don’t
need to be aware of the actual safety requirements of its native C
implementation. And it’s still as fast as the original!
Beyond Primitive Types
Aside from primitive types like libc::c_int and pointers, Rust can use other
C types as well.
Rust structs can be made compatible with C structs via a #[repr]
annotation:
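As an illustration, the sketch below declares a struct with #[repr(C)] so that its field order and padding follow the C rules. The struct Sample, the C declaration it mirrors, and the stand-in function sample_value are hypothetical names chosen for this example only.

```rust
/// Mirrors a hypothetical C declaration:
///     struct Sample { uint8_t tag; uint32_t value; };
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Sample {
    pub tag: u8,
    pub value: u32,
}

// Stand-in for a C function taking `const struct Sample *` (an assumption;
// a real one would be declared in an extern block).
unsafe extern "C" fn sample_value(sample: *const Sample) -> u32 {
    unsafe { (*sample).value }
}

/// Safe wrapper: a reference is always non-NULL and properly aligned.
pub fn read_value(sample: &Sample) -> u32 {
    unsafe { sample_value(sample) }
}

/// Size and alignment of the struct as (size, alignment) in bytes.
/// With C layout, three padding bytes follow `tag` on common platforms.
pub fn layout() -> (usize, usize) {
    (std::mem::size_of::<Sample>(), std::mem::align_of::<Sample>())
}
```

On mainstream platforms, where u32 is 4-byte aligned, layout() reports a size of 8 and an alignment of 4, which is what a C compiler would choose for the mirrored declaration.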
As in C, unions in Rust are untagged. That is, they don’t store the runtime
type of the value inside them. The programmer is responsible for
accessing union fields correctly. The compiler can’t check this
automatically, so Rust unions require an explicit unsafe block when
reading their fields. (In current Rust, writing to a union field whose
type is Copy does not require an unsafe block.)
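A small sketch of a C-compatible union follows. The union IntOrBytes and the helper int_to_bytes are hypothetical names; the example only shows that the stored value carries no tag and that reading a field is the step that needs an unsafe block.

```rust
/// Hypothetical counterpart of the C declaration:
///     union IntOrBytes { uint32_t integer; uint8_t bytes[4]; };
#[repr(C)]
pub union IntOrBytes {
    pub integer: u32,
    pub bytes: [u8; 4],
}

/// Stores a u32 and reads the same memory back as four bytes.
pub fn int_to_bytes(value: u32) -> [u8; 4] {
    let storage = IntOrBytes { integer: value };
    // The compiler cannot know which field was stored last, so the read
    // must be marked unsafe. It is sound here because every bit pattern
    // is a valid [u8; 4] and both fields have the same size.
    unsafe { storage.bytes }
}
```

The bytes come back in the native byte order of the machine, so the result matches value.to_ne_bytes().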
However, you can’t use advanced features of Rust enum types when
calling C code. For instance, you generally can’t pass Option<T> or
Result<T, E> values directly to C.
Rust functions can be converted into C function pointers given that the
argument types are actually compatible and the C ABI is used:
use std::ptr;

// thread_body must have the C callback signature that pthread_create()
// expects: extern "C" fn(*mut libc::c_void) -> *mut libc::c_void.
fn launch_native_thread() {
    let name = "Ferris";
    // We're going to launch a native thread via pthread_create() from libc.
    // This is an external function, so calling it is unsafe in Rust (think
    // about exception boundaries, for example).
    unsafe {
        let mut thread = 0;
        libc::pthread_create(
            &mut thread, // out-argument for pthread_t
            ptr::null(), // in-argument of pthread_attr_t
            thread_body, // thread body (as a C callback)
            // thread argument (requires a cast)
            &name as *const &str as *mut libc::c_void,
        );
        libc::pthread_join(thread, ptr::null_mut());
    }
}
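The callback mechanism itself can be tried without any external library. In the sketch below, call_once is a Rust stand-in for a C routine that accepts a function pointer and a context pointer; the names Callback, call_once, add_ten, and run_callback are all assumptions made for this illustration.

```rust
use std::os::raw::c_void;

/// The C-compatible callback type: a plain function pointer with the C ABI.
type Callback = extern "C" fn(*mut c_void) -> i32;

/// Stand-in for a C library routine that invokes the callback once with
/// the context pointer it was given.
extern "C" fn call_once(callback: Callback, context: *mut c_void) -> i32 {
    callback(context)
}

/// A Rust function declared with the C ABI. Its name can be passed
/// wherever a `Callback` is expected.
extern "C" fn add_ten(context: *mut c_void) -> i32 {
    // The caller promises that `context` points to a valid i32.
    let value = unsafe { *(context as *const i32) };
    value + 10
}

/// Passes a Rust function and a pointer to local data through the
/// C-style interface and returns the callback's result.
pub fn run_callback(mut input: i32) -> i32 {
    let context = &mut input as *mut i32 as *mut c_void;
    call_once(add_ten, context)
}
```

A closure that captures variables would not work here, because it is not a plain function pointer; state has to travel through the context argument, as the thread argument does in the pthread_create() call.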
Native Rust functions and types can be made available to C code just as
easily as you can call C from Rust. Let’s reverse the example with the
sum_numbers() function and implement it in Rust instead:
use std::slice::from_raw_parts;

#[no_mangle]
pub extern "C" fn sum_numbers(numbers: *const libc::c_int, count: libc::size_t)
    -> libc::c_int
{
    // Convert the C pointer-to-array into a native Rust slice of an array.
    // This is not safe per se because the "numbers" pointer may be NULL
    // and the "count" value may not match the actual array length.
    //
    // As with C, we'll require the caller of this function to ensure
    // that these safety requirements are observed and will not check
    // them explicitly here.
    let rust_slice = unsafe { from_raw_parts(numbers, count) };
    // Rust iterators already have a handy method for summing their
    // elements. Let's use it here.
    rust_slice.iter().sum()
}
And that’s it. The #[no_mangle] attribute prevents symbol mangling (so
that the function is exported with the exact name “sum_numbers”). The
extern directive specifies that the function should have the C ABI instead
of the native Rust ABI. With this, any C program can link to a library written
in Rust and can easily use our function:
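#include <stdio.h>

/* Implemented in the Rust library. */
int sum_numbers(const int *numbers, size_t count);

int main()
{
    int numbers[] = { 1, 2, 3, 4, 5 };
    int sum = sum_numbers(numbers, 5);
    printf("Sum is %d\n", sum);
}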
In order to explain why Rust is a safer and faster language than C++, we
decided to create a Rust vs C++ comparison chart that clearly shows the
differences between these two languages.
For better comparison, we’ve chosen features that reveal the key
similarities and differences between these two languages.
● Zero-cost abstraction
● Move semantics
Issue: Move constructors may leave objects in invalid and unspecified
states and cause use-after-move errors.
C++: Move constructors are suggested to leave the source object in a valid
state (yet the object shouldn’t be used in correct programs).
Rust: A built-in static analyzer disallows use of objects after they have
been moved.
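The Rust side of this comparison can be sketched in a few lines. The function names consume and move_demo are hypothetical; the commented-out line marks the use that the compiler refuses to accept.

```rust
/// Takes ownership of the vector; the caller gives it up.
pub fn consume(values: Vec<i32>) -> usize {
    values.len()
}

pub fn move_demo() -> usize {
    let data = vec![1, 2, 3];
    let count = consume(data); // `data` is moved into the function here
    // data.len(); // rejected at compile time: `data` was moved above
    count
}
```

Uncommenting the marked line turns the use-after-move into a compile error rather than a runtime bug.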
● Smart pointers vs. null pointers
C++: Manual code review can spot use of raw pointers where smart pointers
would suffice.
Rust: Raw pointers can only be dereferenced inside unsafe blocks, which
tools can find automatically.
● Internal buffer
Issue: Buffer overflow errors.
C++: Explicitly coded wrapper classes enforce range checks. Debugging
builds of the STL can perform range checks in standard containers.
Rust: All slice types enforce runtime range checks. Range checks are
avoided by most common idioms (e.g. range-based for iterators).
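A brief sketch of both points on the Rust side: indexed access is checked at run time, while iteration needs no per-element index check. The function names checked_get and sum_all are made up for this example.

```rust
/// Checked access: an out-of-range index yields None instead of reading
/// past the end of the buffer. (Plain `data[index]` would panic instead.)
pub fn checked_get(data: &[i32], index: usize) -> Option<i32> {
    data.get(index).copied()
}

/// Iterator-based traversal: no index is involved, so there is nothing
/// to range-check for each element.
pub fn sum_all(data: &[i32]) -> i32 {
    data.iter().sum()
}
```

For a three-element slice, checked_get with index 3 returns None, whereas the equivalent unchecked read in C or C++ would be undefined behavior.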
● Data races
Issue: Data races (unsafe concurrent modification of data).
C++: Good programming discipline, knowledge, and careful review are
required to avoid concurrency errors.
Rust: The built-in borrow checker and Rust reference model detect and
prohibit possible data races at compile time.
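To make the Rust side concrete, here is a sketch of several threads updating one counter. The function name parallel_count is an assumption; Arc and Mutex are the standard library types for shared ownership and mutual exclusion. Handing the threads a plain mutable reference to the counter instead would not compile.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared counter
/// `increments` times, then returns the final value.
pub fn parallel_count(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own handle to the same counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The value is reachable only through the lock.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Because every access goes through the mutex, no increment is lost: four threads with 1000 increments each always produce 4000.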
● Object initialization
● Adding new traits
● Standard library
Issue: Legacy design of utility types heavily used by standard library.
C++: Structured types like std::pair, std::tuple and std::variant can
replace ad-hoc structures. These types have inconvenient interfaces
(though C++17 improves this).
Rust: Built-in composable structured types: tuples, structures,
enumerations. Pattern matching allows convenient use of structured
types like tuples and enumerations.

Issue: Forgetting to handle all possible branches in switch statements.
C++: Code review and external static code analyzers can spot switch
statements that don’t cover all possible branches.
Rust: The compiler checks that match expressions explicitly handle all
possible values for an expression.
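The exhaustiveness check can be illustrated with a small enumeration. The type Signal and the function action are hypothetical names used only for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Signal {
    Red,
    Yellow,
    Green,
}

/// Every variant must be covered. Deleting any arm below, or adding a
/// new variant to `Signal` without a matching arm, is a compile error.
pub fn action(signal: Signal) -> &'static str {
    match signal {
        Signal::Red => "stop",
        Signal::Yellow => "slow down",
        Signal::Green => "go",
    }
}
```

No wildcard arm is used here on purpose, so the compiler keeps flagging this match whenever the enumeration grows.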
● Typing of variables
● Runtime environment
Issue: Embedded and bare-metal programming have high restrictions on the
runtime environment.
C++: The C++ runtime is already fairly minimal, as it directly compiles to
machine code and doesn’t use garbage collection. C++ programs can be
built without the standard library, with disabled exceptions and dynamic
type information, etc.
Rust: The Rust runtime is already fairly minimal, as it directly compiles to
machine code and doesn’t use garbage collection. Rust programs can be
built without the standard library, with disabled range checks, etc.
● Using libraries written in other languages
Rust avoids possible data races, informs about undefined behavior, and
allows null raw pointers inside unsafe blocks. The Rust language also has
other distinctive features that allow programmers to achieve better safety
and performance of their software.
This Rust Programming Language Tutorial is based on the experience of the Apriorit team,
which uses Rust for software development along with other programming languages.
This Tutorial is intended for information purposes only. Any trademarks and brands are
property of their respective owners and used for identification purposes only.
Apriorit’s main specialties are cybersecurity and data management projects, where system
programming, driver and kernel level development, research and reversing matter. The
company has an independent web platform development department focusing on building
cloud platforms for business.
Find us on Clutch.co
Apriorit Inc.
Headquarters:
8 The Green, Suite #7106, Dover, DE, 19901, US
E-mail: [email protected]
www.apriorit.com
Phone: 202-780-9339