Bachelor of Science in Software Engineering

05 2023

Performance evaluation for choosing between Rust and C++

Patrik Karlsson

Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden


This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial
fulfilment of the requirements for the degree of Bachelor of Science in Software Engineering. The
thesis is equivalent to 10 weeks of full time studies.

The authors declare that they are the sole authors of this thesis and that they have not used
any sources other than those listed in the bibliography and identified as references. They further
declare that they have not submitted this thesis at any other institution to obtain a degree.

Contact Information:
Author(s):
Patrik Karlsson
E-mail: [email protected]

University advisor:
Associate professor Emil Alégroth
Department of Software Engineering

Faculty of Computing
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57
Abstract

Developers face numerous challenges in their careers, including the critical decision
of choosing the most suitable programming language to tackle these challenges. Each
programming language presents its unique set of advantages and disadvantages, mak-
ing the decision-making process complex. This study focuses on one such decision –
the selection between Rust and C++ which are both systems programming languages
with significant emphasis on performance.
Rust, an emerging and increasingly popular language, offers a compelling alter-
native to the more established C++. To aid practitioners in making an informed
decision, this study explores the performance differences between Rust and C++
through three distinct experiments: matrix multiplication, merge sort, and file I/O
operations.
The experiments reveal that C++ demonstrates significantly faster performance
in matrix multiplication. Conversely, Rust showcases superior performance in merge
sort, with both languages performing similarly overall. The findings pertaining to file
operations were mixed, with C++ exhibiting shorter execution times for file reading,
while Rust displayed an advantage in writing larger file sizes.
By shedding light on these performance disparities, this study aims to assist
developers in their decision-making process when selecting between Rust and C++.

Keywords: C++, Rust, evaluation


Acknowledgments

I would like to thank my advisor Emil for his fast and accurate guidance when writing
the study.

Contents

Abstract

Acknowledgments

1 Introduction
1.1 Background
1.1.1 Rust
1.1.2 C++
1.1.3 Similarities and differences
1.1.4 The importance of finding new alternatives
1.1.5 Goal of study
1.2 Scope
1.2.1 Aim
1.2.2 Objective
1.3 Outline

2 Related Work

3 Method
3.1 Research questions
3.2 Research methodology
3.2.1 Matrix multiplication experiment
3.2.2 Merge sort experiment
3.2.3 Read and writing experiment
3.3 Validity and reliability of your approach
3.3.1 Measurement
3.3.2 Further discussion of validity and replication
3.3.3 Motivation and relevancy of experiments

4 Results and Analysis
4.1 Matrix multiplication result
4.2 Merge sort results
4.3 Reading and writing
4.4 Table data
4.5 Matrix Multiplication regression model
4.6 Merge sort regression model
4.7 Read regression model
4.8 Write regression model

5 Discussion
5.1 Matrix Multiplication
5.2 Merge sort
5.3 Reading and Writing
5.4 Ethical, Societal and Sustainability concerns

6 Conclusions and Future Work
6.1 Conclusion
6.2 Future work

References

Chapter 1
Introduction

1.1 Background
In software engineering, developers are faced with a multitude of decisions, and one
of the earliest and most impactful choices is selecting the programming language
for development. There are many choices and all have different capabilities such
as performance, safety, and more. One of the newer alternatives is the low-level
programming language Rust which has only grown more and more popular since its
release.

1.1.1 Rust
Rust is a low-level programming language that focuses on safety, speed, and con-
currency according to Mozilla [18]. To achieve this, Rust is statically typed and is
considered memory safe despite lacking a garbage collector. Since Rust can ensure
memory safety without relying on a garbage collector, it avoids the performance
overhead typically associated with garbage collection. This enables Rust to gain a
performance advantage over languages that rely on garbage collection for memory
management.

1.1.1.1 Rust memory safety


Balasubramanian et al. [1] explain that Rust maintains memory safety without relying
on garbage collection by enforcing a robust ownership model. Because ownership is
checked at compile time, the only runtime overhead that Rust's memory safety requires
is array bounds checking. The ownership model, as described by Balasubramanian et al.,
revolves around the concept that each value in Rust has a single owner. When this
owner goes out of scope, meaning it is no longer accessible, Rust automatically
releases the value. Additionally, Rust allows values to be borrowed from their owners,
enabling efficient and controlled sharing of data.
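
The single-owner rule can be illustrated with a minimal sketch. The type `Tracked` and
the function `ownership_demo` are hypothetical names introduced only for this
illustration; they do not come from the cited studies. The sketch counts how often a
value is released, to show that moving a value does not release it and that leaving
the owner's scope does.

```rust
use std::cell::Cell;

/// Hypothetical helper type: counts how many times a value has been released.
struct Tracked<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// Returns the release count observed inside the scope and after the scope.
fn ownership_demo() -> (u32, u32) {
    let released = Cell::new(0);
    let inside;
    {
        let first = Tracked { released: &released };
        let _second = first; // ownership moves; `first` is no longer usable
        // let _ = &first;   // would not compile: use of a moved value
        inside = released.get(); // moving a value does not release it
    } // the single owner `_second` goes out of scope here and is released
    (inside, released.get())
}
```

Under these assumptions, `ownership_demo()` returns `(0, 1)`: nothing is released by
the move, and exactly one release happens when the owner leaves its scope.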
Chua [5] further describes borrowing as a concept that allows for the controlled and
safe sharing of data without transferring ownership. Borrowing can be done in two
ways: through one mutable reference or through multiple immutable references. This
concept is similar to smart pointers in C++.
When a mutable reference is borrowed, it grants temporary and exclusive access
to the data, allowing modifications to be made. However, only one mutable reference
can exist for a given value within a particular scope. With this rule, Rust can ensure
at compile time that data is never modified while another reference is reading or
modifying it.

On the other hand, multiple immutable references can be borrowed simultaneously.
These references allow read-only access to the data and can coexist without
conflicting with each other. This borrowing process is further described in the Rust
documentation [18].
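
The two kinds of borrowing can be sketched as follows. The function names `total`,
`double_all` and `borrowing_demo` are assumptions made for this illustration only.

```rust
/// Reads through an immutable (shared) borrow.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Modifies through a mutable (exclusive) borrow.
fn double_all(values: &mut [i32]) {
    for value in values.iter_mut() {
        *value *= 2;
    }
}

/// Returns (sum before, element count, sum after doubling).
fn borrowing_demo() -> (i32, usize, i32) {
    let mut data: Vec<i32> = vec![1, 2, 3];

    // Several immutable borrows may coexist.
    let first = &data;
    let second = &data;
    let before = total(first);
    let count = second.len();

    // `first` and `second` are not used after this point, so an exclusive
    // mutable borrow is accepted by the compiler.
    double_all(&mut data);
    // let again = total(first); // using `first` here would make the call above fail to compile

    (before, count, total(&data))
}
```

The commented-out line marks the case the compiler rejects: an immutable reference
that is still in use while the same data is borrowed mutably.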
In conclusion, the Rust ownership model offers two important advantages. Firstly,
it provides compile-time guarantees that prevent segmentation faults by ensuring that
accessing freed memory is impossible. Secondly, the ownership model manages memory
by automatically freeing each value when its owner goes out of scope, thereby reducing
the risk of memory leaks. With garbage collectors, both of these have to be guaranteed
at runtime instead.

1.1.1.2 Rust history


The Rust programming language was initially developed as a personal project by
Graydon Hoare, an employee at Mozilla, starting in 2006. Recognizing the potential
of Rust, Mozilla later took over the project and provided resources for further
development and support. This support from Mozilla helped accelerate the progress of
Rust and contributed to its growth and maturity. After several years of development
and refinement, Rust reached a significant milestone with its first stable release,
Rust 1.0, in 2015. The release marked the availability of Rust as a stable and usable
programming language for a broader audience. This and more about the history of Rust
is described in the video presentation by the Rust developer Klabnik [12].
Rust’s popularity has grown significantly over the years, and it has garnered
widespread appreciation within the programming community. In the 2022 Stack
Overflow survey [24], which included the participation of over 70,000 developers, Rust
was voted the most loved programming language for the seventh consecutive year.

1.1.2 C++
While Rust has gained significant popularity and offers unique capabilities, it is
important to acknowledge that there are indeed older and more well-known alterna-
tives available that may possess similar capabilities in certain areas. For example,
one such alternative is C++. C++ is a mature and widely adopted programming
language that has been in existence for several decades. It provides low-level func-
tionality, supports object-oriented programming, and offers a high degree of control
over system resources and memory management. These features make it suitable for
developing high-performance applications, similar to Rust.

1.1.2.1 C++ history


C++ was developed around 1980 as an object-oriented extension of the C programming
language. It was developed by Bjarne Stroustrup, who also wrote a paper describing
C++’s history [2]. As a result of its long existence, C++ has had more time
to mature and gain widespread adoption. This extended lifespan has allowed developers
to become more accustomed to C++ than Rust and has led to the creation of
a vast ecosystem of libraries, frameworks, and tools supporting the language [26,27].

1.1.3 Similarities and differences


One of the first and most important distinctions between Rust and C++ lies in
their approach to memory safety. Rust’s emphasis on memory safety sets it apart
from C++. Rust’s ownership and borrowing system, combined with its strong type
system, provides robust memory safety guarantees, helping to prevent issues such
as null pointer dereferences and data races. In contrast, C++ does not inherently
provide memory safety features, leaving it more prone to certain classes of bugs and
vulnerabilities such as segmentation faults and memory leaks.
C++ offers low-level functions and features that provide programmers with a
high degree of control over system resources, as described by Stroustrup, the
creator of C++ [3]. Rust also offers low-level functionality and a comparable degree
of control. However, to achieve this, developers have to resort to the unsafe
keyword. When developers use the unsafe keyword, Rust’s memory safety can no longer
be guaranteed, as outlined by Evans et al. [7] and the Rust documentation [18].
Consequently, the risk of faults such as segmentation faults and memory leaks
increases.
Furthermore, it is worth mentioning that even when adhering to safe Rust prac-
tices, complete immunity to memory leaks is not assured, as mentioned in the official
Rust documentation [18].
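
Both points can be sketched in a few lines. The names `read_raw`, `Node` and
`leak_with_cycle` are hypothetical and exist only for this illustration. The first
function dereferences a raw pointer, which the compiler only accepts inside an unsafe
block. The second function uses only safe Rust and still leaks memory, by building a
reference-counted value that points to itself.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Reads a value through a raw pointer; the compiler cannot check this dereference.
fn read_raw(value: &i32) -> i32 {
    let ptr = value as *const i32;
    // SAFETY: `ptr` was just created from a valid reference.
    unsafe { *ptr }
}

/// Hypothetical node type that can hold a strong reference to another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a node that points to itself, drops the only handle,
/// and reports whether the allocation is still alive afterwards.
fn leak_with_cycle() -> bool {
    let node = Rc::new(Node { next: RefCell::new(None) });
    let observer: Weak<Node> = Rc::downgrade(&node);
    *node.next.borrow_mut() = Some(Rc::clone(&node)); // reference cycle
    drop(node); // the strong count falls from 2 to 1 and never reaches 0
    let still_alive = observer.upgrade().is_some();
    still_alive
}
```

In this sketch `leak_with_cycle()` returns `true`, because the cycle keeps the node
alive after the last handle outside the cycle has been dropped.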

1.1.4 The importance of finding new alternatives


Software development is undoubtedly a costly endeavour, prompting the need for
a continuous search for potentially superior alternatives in order to optimize the
process. The pursuit of new and improved approaches in software development is
crucial to enhance efficiency, productivity, and overall quality.
According to the Rust main page, the three main focuses of Rust are
its performance, reliability, and productivity [19]. All three of these aspects are
very important for software development, and if these claims hold true Rust would be
a useful addition to any developer’s toolbox.
Research targeting Rust is however rare, and research targeting the performance
of Rust is even more uncommon. This is probably partly because research that
directly targets programming languages is generally limited in comparison to studies
on other areas of computer science or software engineering. Rust is also a relatively
new language which has not had time to attract as much research interest as other
languages have.
Some studies target aspects of Rust other than the one this study focuses on. One
example is the study by Wang et al., who research a new solution that utilises the
memory safety of Rust [28].
Another is the study by Borgsmüller, who evaluates the usage of Rust in embedded
systems as well as the productivity of developing in Rust [4].

1.1.5 Goal of study


The primary focus of this study will be to compare the performance aspects of
Rust with C++. The rationale behind this emphasis lies in the understanding that
performance holds critical significance when considering the suitability of Rust for a
given use case. If the specific use case does not require or prioritize performance,
it is often more appropriate to explore higher-level languages as potential
alternatives.
As mentioned, this study will focus on the most important requirement of a
high-performance language, which is performance. It is however not the only
consideration: if an extremely performant language makes it impossible to develop
applications in a reasonable time frame, it will not be useful for development.
Rust faces competition in the high-performance use case, with C++ being one of
its notable contenders. C++ holds a significant advantage in terms of its extensive
industry usage and long history, which has resulted in wider adoption and a mature
ecosystem. Despite these differences, there are several similarities between Rust
and C++ that make them appealing for system-level programming. Both languages
offer object-oriented programming paradigms and are compiled languages, enabling
efficient execution.
This study, however, will specifically focus on evaluating the execution time dif-
ferences between Rust and C++. By comparing the performance of these two lan-
guages, the study aims to contribute to the understanding of how they fare in terms
of run time efficiency, aiding developers in making informed decisions when selecting
the most appropriate language for their specific performance-sensitive applications.
The cost of software development necessitates a diligent exploration of technolo-
gies that can reduce costs and enhance product quality. It is imperative to thor-
oughly investigate the advantages and disadvantages of potential technological solu-
tions when making choices in this regard. However, evaluating these aspects for a
specific use case can present challenges.
This study therefore aims to give practitioners guidance so that they can make the
right choice for their use case when developing applications with high-performance
requirements.

1.2 Scope
1.2.1 Aim
The aim of this study is to provide information for practitioners about the perfor-
mance differences between C++ and Rust so they can make a more informed decision
when choosing between developing in either Rust or C++.

1.2.2 Objective
The objective of this study is to conduct a performance comparison between Rust
and C++ across various tasks. These tasks involve implementing and evaluating
algorithms that have been used in previous research to compare different programming
languages. Examples of prior work that utilizes these algorithms include the study
by Wong, who used matrix multiplication to assess the performance of C++ [29].
Another example is given by Zhang et al., who used smaller programs such as sorting
algorithms to measure the overhead of Rust [30]. Yet another example is the study by
Sharma, where the performance of Java and C++ was compared on sorting numbers and
on writing and reading files [22].
In this study, two of the chosen applications are matrix multiplication and sorting
numbers in a list. The chosen algorithms, including matrix multiplication, sorting,
and file read/write operations, are well-suited to demonstrate the efficiency of CPU
and memory utilization in both Rust and C++. These applications provide insights
into how the languages handle computational tasks and disk operations, allowing for
a comprehensive performance evaluation.
Zhang et al. justify the comparison of programming languages with microprograms,
which are small applications and algorithms [30]. This approach ensures a fair and
unbiased performance comparison. Microprograms are considered fair because they focus
on the fundamental constructs and operations of the languages. This ensures that any
differences are due to the language and not to implementation differences such as the
use of different algorithms.
In this study, the focus will be placed solely on the metric of execution time, which
will be consistently measured across all experiments. The decision to use execution
time as the primary metric is because this study aims to provide a standardized
measure that allows for straightforward comparisons and assessments across different
experiments. This approach ensures that the results can be easily compared and
combined, enabling practitioners to gauge the relative performance of C++ and
Rust in various scenarios.

1.3 Outline
This study will serve as a decision support tool, providing valuable information to
practitioners to make more informed decisions regarding their choice of programming
language between Rust and C++. The findings and insights gained from this study
will be presented in the Results and Analysis chapter. This presentation will be
through description, tables and images.
The Discussion chapter discusses the findings from the Results and Analysis chapter.
With this discussion, practitioners will be better equipped to select the most
suitable programming language that aligns with their specific use case requirements
and goals.
This information will be obtained through experiments that target large, general
areas such as scientific computing and machine learning.
The experiments in this study will involve collecting data from matrix
multiplication, merge sort, and reading/writing implementations developed in both
C++ and Rust. These implementations will be executed using multiple data sets to
ensure a comprehensive evaluation. The results will then be analysed to provide
practitioners with valuable information for making informed decisions regarding the
choice of programming language, specifically focusing on performance.
Chapter 2
Related Work

Few studies have researched Rust, and studies targeting Rust performance are even
rarer. There is, however, some previous work in this area, as well as work that makes
similar comparisons between other languages.
One notable study that shares similarities with this research is the work conducted
by Franzén and Östling [9]. In a manner similar to this study, they employed matrix
multiplication as a benchmark to compare the performance differences between C++
and Rust. In Franzén and Östling’s study, the matrix sizes chosen for evaluation
are similar to this study’s approach. They utilized matrix sizes of 32x32, 128x128,
512x512, and 2048x2048 to compare the performance of C++ and Rust. A significant
distinction between this study and Franzén and Östling’s research lies in their focus
on GPU comparison, whereas the present study does not emphasize this aspect. In
Franzén and Östling’s study, the main objective was to compare the performance of
C++ and Rust specifically in the context of GPU computing. In contrast, this study
aims to compare the overall performance of Rust and C++ across various tasks,
including matrix multiplication, sorting algorithms, and file read/write operations.
Another noteworthy study in the realm of Rust performance is documented in a
paper by Wang et al [16]. Their study specifically examines the overhead of Rust
compared to the C programming language. In the study conducted by Wang et al.,
a wide range of algorithms, particularly focusing on various sorting algorithms, were
investigated to evaluate the performance of Rust. Additionally, they incorporated
applications from the 22.05 computer benchmarks, which they referred to as game
benchmarks [10]. The study conducted by Wang et al. and the present study differ
in several aspects. Wang et al.’s study primarily focuses on calculating the overhead
of the Rust programming language relative to the C programming language, whereas
this study compares the execution speed of applications. Another significant
distinction is that the study by Wang et al. includes neither specific tests targeting
reading and writing files nor a dedicated algorithm test for matrix multiplication.
Wang et al. reported that Rust exhibited an average overhead of 1.77x compared to C.
A third comparison study was conducted by Medin, who compared Rust compiled to
WebAssembly (Wasm) with the C programming language [17]. Medin’s study utilized
matrix multiplication, insertion sort, and an additional test as benchmarks for the
performance evaluation. The study evaluated performance from a WebAssembly
perspective and compared Rust only against the C programming language. Regarding the
matrix multiplication tests, Medin’s study utilized a smaller data set size of
100x100. This data set size may not provide as comprehensive insights into the
performance characteristics as the larger data set sizes used in this study, such as
250x250, 375x375, 500x500, 750x750, 1000x1000, 1250x1250, 1500x1500, 1750x1750, and
2000x2000. Since Medin’s study focused on WebAssembly, it is possible that larger
sizes were not of interest. Medin’s results indicated that Rust compiled to
WebAssembly was faster than the C programming language in the matrix multiplication
and sorting algorithms. However, for the additional test performed in the study, Rust
and C demonstrated similar performance.
A fourth study is the one by Sharma, which focused on comparing the performance of
the Java and C++ programming languages [22]. Similar to this study, Sharma’s research
included tasks involving reading and writing files as well as sorting numbers to
evaluate the execution speeds. The first major difference between this study and the
study by Sharma is the choice of programming languages.
Another notable difference between Sharma’s study and this study is the
choice of algorithms for performance comparison. While this study includes matrix
multiplication and merge sort as part of the performance evaluation, Sharma’s study
did not utilize matrix multiplication and used a different sorting algorithm instead
of merge sort.
The data sets used in Sharma’s study were smaller compared to the data sets in
this study. Sharma’s study focused on number data sets with sizes of 1000, 10,000,
and 100,000 for reading and writing files as well as sorting algorithms. In contrast,
this study employed larger data sets for both reading and writing files and sorting
algorithms.
For reading and writing files, this study used data set sizes of 10,000, 100,000,
1,000,000, 5,000,000, 10,000,000, 50,000,000, 100,000,000, and 1,000,000,000 num-
bers. These larger data sets provide a more extensive evaluation of the languages’
performance in file operations, especially when dealing with larger volumes of data.
Similarly, for sorting algorithms, this study employed data set sizes of 100,000,
500,000, 1,000,000, 5,000,000, 10,000,000, 50,000,000, 100,000,000, and 200,000,000
numbers. By using larger and more varied data sets, this study enables a more
detailed analysis of the languages’ performance in sorting tasks, particularly when
handling significantly larger input sizes.
The inclusion of larger data sets in this study allows for a more comprehensive
assessment of the performance characteristics of Rust and C++ when working with
larger data volumes. This provides practitioners with valuable insights into how
these languages perform in scenarios involving reading and writing files as well as
sorting larger data sets, assisting them in making informed decisions when selecting a
programming language for their specific application requirements. With these values,
models will be created for each language and the models can be used to predict values
outside the test.
Chapter 3
Method

3.1 Research questions


RQ1: How does the execution time of Rust compare to C++?
RQ1.1: How do the C++ and Rust programming languages compare with respect
to execution time when multiplying matrices of varying sizes?
RQ1.2: How do the C++ and Rust programming languages compare with respect
to execution time when sorting lists of numbers of varying sizes?
RQ1.3: How do the C++ and Rust programming languages compare with respect
to execution time when reading and writing files of varying sizes?
The aim of this study is to help practitioners make a decision between Rust and
C++ when it comes to performance. It is perceived that the answer to RQ1 will
provide such supporting data, whilst the sub-questions will provide more details. In
particular, these questions will provide insights in regard to specific use cases for
different categories of algorithms.
RQ1.1 will answer which of the programming languages is the best choice for
applications that require large amounts of calculations, represented in the study by
matrix multiplication of large matrices. The answer is also relevant for use cases
that make use of matrix multiplication, such as machine learning and scientific
computing.
RQ1.2 will answer which language is the best choice for applications that need a
lot of sorting or similar data processing functionality. A broad group that often
needs sorting is scientific computing applications.
Finally, RQ1.3 will give insights into which programming language is the best at
reading from and writing to files. This data will be useful for applications with a
lot of file I/O, but also for similar functionality where disk throughput matters.

3.2 Research methodology


The research methodology employed in this study is experimentation. Experimenta-
tion was chosen as the research approach because it provides a high level of control
over the data sets and the experimental environment. The experimentation in this
study involves manipulating variables such as the size and content of the data set
files, allowing for flexibility and customization. By modifying these variables, the
researchers can explore different scenarios and assess the performance of C++ and
Rust across a range of conditions. Additionally, conducting multiple experiments
enables them to derive more generalized results and draw robust conclusions.

3.2.1 Matrix multiplication experiment


The initial experiment focuses on comparing the performance of C++ and Rust
implementations of naive matrix multiplication. The goal is to evaluate whether there
is a significant difference in execution time between the two programming languages
when performing matrix multiplication. The alternate hypothesis is that there is a
significant execution time difference between C++ and Rust when multiplying matrices.
This experiment will answer research question RQ1.1 by measuring the execution time
of matrix multiplication with varying sizes of matrices in both C++ and Rust.
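
For reference, naive matrix multiplication can be sketched as the triple loop below.
This is a minimal illustration and not the implementation used in the experiments:
the function name `multiply`, the flat row-major storage and the `i64` element type
are all assumptions.

```rust
/// Multiplies two square n-by-n matrices stored row-major in flat slices.
/// Storage layout and element type are assumptions of this sketch.
fn multiply(a: &[i64], b: &[i64], n: usize) -> Vec<i64> {
    assert_eq!(a.len(), n * n);
    assert_eq!(b.len(), n * n);
    let mut c = vec![0i64; n * n];
    for i in 0..n {
        for j in 0..n {
            // Dot product of row i of `a` and column j of `b`.
            let mut sum = 0i64;
            for k in 0..n {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    c
}
```

The three nested loops perform n³ multiplications, which is why execution time grows
quickly with the matrix size.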

3.2.2 Merge sort experiment


The second experiment will involve comparing the execution speed of C++ and
Rust implementations of the merge sort algorithm. The alternate hypothesis is that
there is a significant execution time difference between C++ and Rust when sorting a
list of numbers with merge sort. This experiment will answer research question
RQ1.2 by measuring the execution time of merge sort on lists of numbers of varying
sizes in both C++ and Rust.
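
A top-down merge sort can be sketched as follows. The function name `merge_sort`, the
`i64` element type and the use of a temporary buffer for each merge are assumptions of
this sketch; the implementations used in the experiments may differ in these details.

```rust
/// Sorts the slice in ascending order with a top-down merge sort.
fn merge_sort(values: &mut [i64]) {
    let len = values.len();
    if len < 2 {
        return; // zero or one element is already sorted
    }
    let mid = len / 2;
    merge_sort(&mut values[..mid]);
    merge_sort(&mut values[mid..]);

    // Merge the two sorted halves into a temporary buffer.
    let mut merged = Vec::with_capacity(len);
    let (mut i, mut j) = (0, mid);
    while i < mid && j < len {
        if values[i] <= values[j] {
            merged.push(values[i]);
            i += 1;
        } else {
            merged.push(values[j]);
            j += 1;
        }
    }
    // Copy whatever remains in either half, then write the result back.
    merged.extend_from_slice(&values[i..mid]);
    merged.extend_from_slice(&values[j..]);
    values.copy_from_slice(&merged);
}
```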

3.2.3 Read and writing experiment


The third experiment will involve comparing the execution speed of C++ and Rust
implementations of both reading and writing data. The alternate hypothesis is that
there is a significant execution time difference between C++ and Rust when reading
and writing files. This experiment will answer research question RQ1.3 by measuring
both reading and writing with varying file sizes in both C++ and Rust.
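
The two file operations can be sketched with the Rust standard library as shown
below. The function names `write_digits` and `read_all`, the buffered writer and the
digit pattern are assumptions of this sketch; the experiments themselves used
pre-generated files of numbers.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Write};
use std::path::Path;

/// Writes `count` digits to `path` through a buffered writer.
fn write_digits(path: &Path, count: usize) -> io::Result<()> {
    let mut out = BufWriter::new(File::create(path)?);
    for i in 0..count {
        // Cycle through the digits 0-9 as placeholder content.
        out.write_all(&[b'0' + (i % 10) as u8])?;
    }
    out.flush()
}

/// Reads the whole file into memory and returns its contents.
fn read_all(path: &Path) -> io::Result<String> {
    let mut contents = String::new();
    File::open(path)?.read_to_string(&mut contents)?;
    Ok(contents)
}
```

In an experiment of this kind, the timer would be started immediately before each
call and stopped immediately after it.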

3.3 Validity and reliability of your approach


3.3.1 Measurement
The execution times for the experiments were recorded using different libraries spe-
cific to each programming language. For C++, the Chrono library, which is a C++
library for high-resolution time measurements, was utilized to collect the execution
time data.
On the other hand, Rust’s standard library provides a time module that was
employed to capture the measurements for the Rust implementations.
To ensure accuracy and consistency, the measurements were collected solely dur-
ing the execution of the specific targets of interest within the applications. The setup
and initialization processes for these applications were not included in the recorded
execution times. This approach allows for a more focused analysis of the actual com-
putations and operations performed by the applications, enabling a more accurate
comparison of the execution times between C++ and Rust.
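
The idea of timing only the operation of interest can be sketched with
`std::time::Instant` from the Rust standard library. The helper names `measure` and
`timed_sum` are hypothetical, and the summation is only a placeholder workload.

```rust
use std::time::{Duration, Instant};

/// Runs `work` once and returns its result together with the elapsed wall-clock time.
fn measure<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = work();
    (result, start.elapsed())
}

/// Placeholder workload: sums the numbers 1..=n and reports how long the sum took.
fn timed_sum(n: u64) -> (u64, Duration) {
    // Setup: building the input happens before the clock starts.
    let input: Vec<u64> = (1..=n).collect();
    // Only the operation of interest is timed.
    measure(|| input.iter().sum::<u64>())
}
```

Because the input vector is built before `measure` is called, its allocation and
initialization do not contribute to the reported duration.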

3.3.2 Further discussion of validity and replication


The algorithms chosen for testing in this study were carefully studied and selected
to ensure a fair and meaningful comparison between C++ and Rust.
To ensure consistent and replicable working conditions, a virtual machine was
created using VirtualBox. The virtual machine was configured with 8GB of RAM,
4 CPU processors, and 60GB of disk space. The operating system used in the
virtual machine was Ubuntu (64-bit). The operating system on the host machine
was Windows 11.
All applications used for the experiments, and the methods used to generate data for
them, are available in the git repository https://github.com/Varg01/examRust.
The researcher validated that the implementations are fair comparisons. Because they
are all well-known, simple algorithms, the certainty that the comparisons are fair
is high.
To provide consistent and controlled input data for the experiments, the values
were generated by a randomization function and written to files before the
experiments. For the sorting and matrix multiplication experiments, the data was
stored in files with the numbers separated by commas, which is the format required
by those applications. The reading and writing experiments also used files filled
with numbers as input, but without commas. The numbers in the sorting and matrix
data sets ranged from 1 to 10,000. By randomizing the values over this range, the
experiments aimed to cover a wide spectrum of scenarios and provide a diverse set
of input data. The generated data was stored in files and reused in all
experiments, so the data was not randomized again for every new run. Both C++ and
Rust also used the same files, so that the generated data introduced no unfairness.
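
The data format described above can be sketched as follows. This is only an illustration: the study's actual randomization function is not shown in the text, so a small linear congruential generator (the hypothetical Lcg type) stands in for it, and the function names are assumptions.

```rust
/// Minimal linear congruential generator; a stand-in for whatever
/// randomization function the study used (that function is not shown).
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in the inclusive range low..=high.
    fn next_in_range(&mut self, low: u64, high: u64) -> u64 {
        // Constants from Knuth's MMIX generator; wrapping is intended.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        low + (self.0 >> 33) % (high - low + 1)
    }
}

/// Produces `count` comma-separated numbers between 1 and 10,000.
fn generate_csv(count: usize, seed: u64) -> String {
    let mut rng = Lcg(seed);
    let values: Vec<String> = (0..count)
        .map(|_| rng.next_in_range(1, 10_000).to_string())
        .collect();
    values.join(",")
}

/// Parses comma-separated integers; expects at least one number.
fn parse_csv(text: &str) -> Vec<i64> {
    text.split(',')
        .map(|field| field.trim().parse().expect("not an integer"))
        .collect()
}

fn main() {
    // A fixed seed gives the same data on every run, mirroring the idea of
    // generating the files once and reusing them for both languages.
    let csv = generate_csv(10, 42);
    let numbers = parse_csv(&csv);
    println!("{} -> {} numbers", csv, numbers.len());
}
```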
A difficulty with using execution time as the metric is the stochastic nature of
executing applications, which makes it harder to ensure that the results are
representative. To reduce this problem and make the results reliable and
statistically significant, each test was executed 100 times within a test session.
This repetition helped minimize the impact of random variations or outliers in the
measurements. Furthermore, separate sessions were used to calculate the average
execution time across all tests. After each session, the results were logged,
which also worked as a rest period, and the virtual machine was then restarted
before a new test session was executed. In total, there were three test sessions,
resulting in 300 runs for each size of every application. This number of runs
provided a robust data set for analysis and comparison despite the stochastic
nature of executing algorithms.
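
The repetition scheme can be sketched as below: three sessions of 100 timed runs each, with a mean per session and an overall mean. The function names and the summation workload are assumptions for illustration, and the sessions here simply run back to back, whereas in the study the virtual machine was restarted between them.

```rust
use std::time::Instant;

const SESSIONS: usize = 3;
const RUNS_PER_SESSION: usize = 100;

/// Arithmetic mean of a non-empty slice of samples.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Times `work` `runs` times and returns each run's duration in ms.
fn run_session(runs: usize, mut work: impl FnMut()) -> Vec<f64> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed().as_secs_f64() * 1e3
        })
        .collect()
}

fn main() {
    // Setup (not timed): the input is prepared once and reused.
    let data: Vec<u64> = (1..=1_000_000).collect();

    let mut session_means = Vec::with_capacity(SESSIONS);
    for session in 1..=SESSIONS {
        let times = run_session(RUNS_PER_SESSION, || {
            // black_box keeps the optimizer from discarding the work.
            std::hint::black_box(data.iter().sum::<u64>());
        });
        let session_mean = mean(&times);
        println!("session {}: mean {:.3} ms over {} runs",
                 session, session_mean, times.len());
        session_means.push(session_mean);
    }
    println!("overall mean: {:.3} ms", mean(&session_means));
}
```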

3.3.3 Motivation and relevancy of experiments


All of these conclusions will be useful for applications that involve merge sort, reading
and writing to files, and matrix multiplications. There are also further generalised
implications of the result of the three experiments.

3.3.3.1 Matrix multiplication


In line with the study by Li et al. [14], matrix multiplication was considered a
fundamental operation that serves as a building block for various numerical
applications. Therefore, including matrix multiplication in the performance
evaluation was crucial for achieving a broader generalization of the results. One
area in particular that uses matrix multiplication is machine learning, as
described in the study by Fawzi et al. [8], where AI was used to improve the
performance of matrix multiplication. They further described the importance of
optimizing fundamental algorithms such as matrix multiplication, since any
improvement can have a large impact on the performance of an extensive group of
computations. The study also brought up the importance of matrix multiplication
for scientific computing and neural networks.

3.3.3.2 Merge sort


Sorting algorithms, particularly merge sort, play a vital role in various computing ap-
plications, as highlighted in the study conducted by Lobo and Kuwelkar [15]. Many
applications require data to be sorted in a specific order, making sorting algorithms
indispensable. Merge sort, known for its efficiency and stability, is particularly valu-
able for handling large data sets.
One prominent example of the significance of sorting algorithms is in database
systems. Databases heavily rely on sorting algorithms to efficiently retrieve and
manipulate data. Sorting allows for faster searching, indexing, and data retrieval
operations, ultimately enhancing the overall performance of database applications.
Another area where sorting is vital is scientific computing, which includes
applications such as transaction processing, linguistics, combinatorial
optimization, genomics, molecular dynamics, weather prediction, and astrophysics.
This means that performance results for merge sort will be relevant for a wide
range of applications.

3.3.3.3 Reading and writing


Reading and writing operations involving files often introduce heavier disk access
compared to other computational tasks. The efficiency of disk operations can signif-
icantly impact the overall performance of applications that heavily rely on file I/O.
Therefore, conducting experiments to measure the performance of reading and writ-
ing files in different programming languages, such as Rust and C++, can provide
valuable insights.
In this study, specific implementations were developed for the matrix multiplica-
tion and reading and writing applications to cater to the experimental requirements.
However, for the merge sort application, existing code from a collection of algorithms
was utilized [25].
Chapter 4
Results and Analysis

In general, the C++ applications in the study exhibited lower execution times com-
pared to the Rust applications. Although both languages demonstrated similar per-
formance when handling small data sizes, C++ outperformed Rust in a majority of
the experiments.
In the study by Franzén and Östling [9], the results of the matrix multiplication
experiment also showed, as in this study, that C++ had a faster execution speed
than Rust. The results of this study, however, suggest an even larger advantage
for C++ in matrix multiplication than Franzén and Östling’s study showed. Several
differences likely contributed to this, such as the experiments in this study
being run on the CPU, as well as environmental differences and different data sets.
In the study by Medin [17], the results were reversed and Rust had a faster
execution speed. That study, however, took a web perspective, with Rust and C
compiled to WebAssembly, which is a very different use case from this study, and
the different environment likely affected the result. The study by Medin also only
used size 100, and it is unclear whether the results would be the same at larger
sizes.
The results of this study are also similar to the results in the study by Zhang et
al. [30], where the merge sort result for Rust was also close to the C result,
although C had better overhead. The study by Zhang et al. focused on calculating
overhead and worked with C rather than C++, but since the two languages are very
similar, the results still make clear that with merge sort Rust has an extremely
similar performance to C/C++. The average overhead for Rust compared to C was
1.77. The results of this study did not show as big an advantage in performance as
the results of Zhang et al. This may be partly because of differences in what was
being tested, such as language and metrics, but also because the algorithms may
have been more beneficial for the execution time of C.
Generally, previous work showed that C++ had a faster execution speed than Rust,
and the results of this study are similar. C++ was faster in most of the tests,
but Rust was faster in some cases, such as merge sort, although the results there
were close. Writing was also faster in Rust at large sizes.

4.1 Matrix multiplication result


During the matrix multiplication experiments, it was observed that C++ consistently
outperformed Rust across all tested data sizes. The largest size examined in the study
was 3000, where the C++ version exhibited an approximate execution time of 107
seconds, around 9% faster than the equivalent Rust version.
Although Rust avoids many of the run-time checks that garbage-collected languages
need, the performance disparities between the two languages can be attributed, at
least in part, to the run-time checks that Rust does perform. These checks, such
as bounds checks, are inherent to Rust’s design and can impact its performance.
The study by Zhang et al. [30] supports this notion by discussing how run-time
checks can influence the efficiency of Rust applications. In this experiment,
bounds checks can affect performance significantly, because the arrays have to be
bounds-checked for every multiplication.
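
To make the bounds-check discussion concrete, the sketch below contrasts indexed access with iterator-based access in Rust. It is an illustration under assumed function names, not the study's code, and it makes no claim about how much the checks cost in the measured programs, since the optimizer can sometimes remove them.

```rust
/// Dot product using indexing. Each `a[i]` and `b[i]` is bounds-checked
/// unless the optimizer can prove that the index is in range.
fn dot_indexed(a: &[i64], b: &[i64], n: usize) -> i64 {
    let mut sum = 0;
    for i in 0..n {
        sum += a[i] * b[i];
    }
    sum
}

/// Dot product using iterators. No per-element index check is needed,
/// because `zip` stops at the end of the shorter slice.
fn dot_iter(a: &[i64], b: &[i64]) -> i64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1, 2, 3];
    let b = [4, 5, 6];
    println!("indexed:  {}", dot_indexed(&a, &b, 3));
    println!("iterator: {}", dot_iter(&a, &b));

    // Checked access returns None instead of reading out of bounds;
    // plain `a[3]` would panic at run time rather than read past the end.
    println!("a.get(3) = {:?}", a.get(3));
}
```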
Another important aspect is understanding what takes time in this matrix
multiplication. Each step of the computation needs data from both input arrays,
and this data can be fetched much more quickly if there is good cache locality.
The study by Lam et al. [13] explained how caching could improve the performance
of matrix multiplication. Optimizations performed by the languages or the
compilers may have affected the cache locality and therefore the results. After
the data has been fetched, whether from cache, memory or disk, the computation can
be done and its result written to the result matrix.
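
The role of cache locality can be illustrated with two loop orders for the same multiplication. The sketch below assumes square matrices stored row-major in flat slices, which may differ from the layout used in the study; both functions compute the same result and differ only in the order in which memory is visited.

```rust
/// Multiplies two n x n matrices stored row-major in flat slices.
/// i-j-k order: the inner loop walks down a column of `b` (stride n),
/// which touches a new cache line on almost every step for large n.
fn matmul_ijk(a: &[i64], b: &[i64], n: usize) -> Vec<i64> {
    let mut c = vec![0i64; n * n];
    for i in 0..n {
        for j in 0..n {
            let mut sum = 0;
            for k in 0..n {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    c
}

/// i-k-j order: the inner loop walks along a row of `b` and a row of `c`
/// (stride 1), so consecutive accesses fall in the same cache lines.
fn matmul_ikj(a: &[i64], b: &[i64], n: usize) -> Vec<i64> {
    let mut c = vec![0i64; n * n];
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    c
}

fn main() {
    let a = [1, 2, 3, 4];
    let b = [5, 6, 7, 8];
    println!("{:?}", matmul_ijk(&a, &b, 2));
    println!("{:?}", matmul_ikj(&a, &b, 2));
}
```

Which order a given implementation or compiler ends up with is one of the factors that can change measured times without changing the algorithm.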
In the Rust implementation of the matrix multiplication code, there is also a
potential issue related to the use of different integer types, namely i32 and i64.
Rust does not convert implicitly between integer types, so mixing them leads to a
type mismatch. Specifically, when the values accessed from the first and second
matrix slices are combined, Rust expects them to have the same integer type,
either i32 or i64. Using a value of the wrong type without an explicit cast
results in a compilation error in Rust.
To avoid the type difference problem in the Rust version of the matrix
multiplication code, the decision was made to use i64 consistently throughout the
code instead of mixing i32 and i64 types. With a single integer type, there is no
need to cast or convert between types when performing the multiplication. This
might affect performance because potentially more memory was needed to store the
matrices.
In the C++ version of the code, there is no explicit need for type casts when per-
forming arithmetic operations involving different integer types. Unlike Rust, C++
automatically promotes smaller integer types to larger ones when necessary. This
automatic type promotion ensures that the arithmetic operations are performed cor-
rectly without requiring explicit type casts.
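
The difference in how the two languages treat mixed integer types can be sketched from the Rust side as follows. The function names are hypothetical and the example is not taken from the study's code; it only shows the explicit conversions that Rust requires where C++ would convert implicitly.

```rust
use std::convert::TryFrom;

/// Multiplies an i32 by an i64. Writing `a * b` directly would not
/// compile, because Rust has no implicit integer promotion.
fn multiply_mixed(a: i32, b: i64) -> i64 {
    i64::from(a) * b // lossless widening, made explicit
}

/// Slices are indexed with `usize`, so an i32 index must be converted.
/// A negative or out-of-range index yields None instead of a panic.
fn element_at(values: &[i64], index: i32) -> Option<i64> {
    let index = usize::try_from(index).ok()?;
    values.get(index).copied()
}

fn main() {
    println!("{}", multiply_mixed(3, 4_000_000_000));
    println!("{:?}", element_at(&[10, 20, 30], 1));
    println!("{:?}", element_at(&[10, 20, 30], -1));
}
```

Using a single integer type throughout, as the study did with i64, removes the need for such conversions in the arithmetic.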
As depicted in Figure 4.1, the absolute difference in execution time between the
C++ and Rust implementations of matrix multiplication increases as the matrices
grow larger, and the C++ version outperforms the Rust version at every size.
However, it is worth noting that the results from a matrix size of 2000 onwards
suggest that the execution speeds become more similar in relative terms as the
sizes grow. This observation could indicate that the relative performance gap
between the two languages narrows as the matrix sizes increase.

4.2 Merge sort results


In the experiment comparing the merge sort algorithm between Rust and C++,
the execution speeds were very close. Rust consistently outperformed C++ in all
sizes, but the performance difference was relatively small. The largest difference was
observed at the smallest size, where Rust was approximately 5% faster than C++,
corresponding to a difference of about half a millisecond.
At the largest size tested, which was 2 ∗ 10^8, Rust was approximately 520
milliseconds faster than C++, representing a difference of around 2% in terms of
execution time.
The minimal performance differences observed between the Rust and C++ im-
plementations in the merge sort algorithm experiment make it challenging to draw
definitive conclusions.
As described in the study by Lobo and Kuwelkar [15], merge sort involves
partitioning large arrays into smaller sub-arrays, which are then sorted and
merged back together. In the C++ code, both new and delete were used to allocate
and deallocate memory for each sub-array. The study by Detlefs et al. [6]
describes the performance cost of memory management in C/C++ applications, which
may have had a major effect on the results, because new and delete are not used in
Rust. Both implementations need to reference the arrays and access them through
these references, and the languages differ in how this is done. In Rust, array
accesses are bounds-checked at run time by default, so no access outside the
bounds of the array is possible, and this affected performance as well. After the
numbers are fetched from memory, they are compared and the order of the array is
changed, which is another access through the reference.
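
The steps described above can be sketched in Rust as a top-down merge sort. This is not the implementation used in the study, which took existing code from a collection of algorithms; it is a minimal version written for illustration, in which the temporary sub-arrays are Vec values that are freed automatically when they go out of scope.

```rust
/// Top-down merge sort: split, sort each half recursively, then merge.
fn merge_sort(values: &mut [i64]) {
    let len = values.len();
    if len < 2 {
        return; // zero or one element is already sorted
    }
    let mid = len / 2;
    merge_sort(&mut values[..mid]);
    merge_sort(&mut values[mid..]);
    merge(values, mid);
}

/// Merges the sorted halves values[..mid] and values[mid..] in place.
fn merge(values: &mut [i64], mid: usize) {
    // Temporary copies of both halves; dropped at the end of the function.
    let left = values[..mid].to_vec();
    let right = values[mid..].to_vec();
    let (mut i, mut j, mut k) = (0, 0, 0);
    while i < left.len() && j < right.len() {
        // `<=` keeps equal elements in their original order (stability).
        if left[i] <= right[j] {
            values[k] = left[i];
            i += 1;
        } else {
            values[k] = right[j];
            j += 1;
        }
        k += 1;
    }
    // Copy whatever remains in either half.
    while i < left.len() {
        values[k] = left[i];
        i += 1;
        k += 1;
    }
    while j < right.len() {
        values[k] = right[j];
        j += 1;
        k += 1;
    }
}

fn main() {
    let mut data = vec![5, 1, 4, 2, 3, 2];
    merge_sort(&mut data);
    println!("{:?}", data);
}
```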
The execution time of both the C++ and Rust solutions exhibited approximately
linear growth over the tested sizes, which can be attributed, in part, to the O3
optimization. Both implementations demonstrated a similar growth rate, indicating
comparable performance characteristics. However, Rust’s execution time growth rate
was slightly lower than that of C++, suggesting a potential performance advantage
for Rust.

4.3 Reading and writing


In the reading part of the I/O speed experiment, C++ generally outperformed Rust.
The Rust reading application exhibited a slight advantage only at the smallest data
size, with a difference of a few microseconds. However, as the data size increased
to 10^9, the performance gap became more significant. At this largest size, the C++
version was approximately 172.93 ms faster, which is about half the execution time
of Rust. It is worth noting that the execution time for reading grew linearly but
at a lower rate compared to the other experiments.
In the writing part of the I/O speed experiment, Rust exhibited faster performance
at the two largest data sizes but slower performance at most of the smaller sizes.
Specifically, at the largest size of 10^9 numbers, the Rust version showed a time
difference of approximately 1.9 seconds, making it about 32% faster than the C++
version. However, for data sizes ranging from 10^5 to just before 10^8, C++
demonstrated better performance. The execution time for writing grew linearly, but
it was evident that the Rust version had a slower rate of growth. Notably, Rust was
faster at the smallest data size, but C++ performed better at all larger sizes
except the two largest.
In the case of the reading and writing experiments, it is difficult to pinpoint
specific reasons for the performance differences. Various factors could contribute
to the variations observed. For example, in Rust, the use of specific methods such
as reading the file content directly into a string or using the write_all function
for writing could have influenced the results. However, the exact factors
responsible for the differences are not clearly identifiable.
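
For reference, the two Rust standard library calls mentioned above can be used as in the following sketch. The file name and helper functions are assumptions for the example, and the study's code may use these calls differently, for instance with buffering.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes the whole string to `path`, replacing any existing file.
fn write_numbers(path: &Path, contents: &str) -> io::Result<()> {
    let mut file = File::create(path)?;
    // write_all keeps writing until every byte has been handed to the OS.
    file.write_all(contents.as_bytes())?;
    file.flush()
}

/// Reads the entire file into a String in one call.
fn read_numbers(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    // Hypothetical file name in the system's temporary directory.
    let path = std::env::temp_dir().join("rw_sketch_numbers.txt");
    write_numbers(&path, "1234567890")?;
    let text = read_numbers(&path)?;
    println!("read {} bytes", text.len());
    fs::remove_file(&path)
}
```

Reading a whole file into a String also validates it as UTF-8, which is one example of work that an equivalent C++ read into a raw buffer would not necessarily perform.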

4.4 Table data

Matrix Multiplication
Matrix size     C++ (ms)      Rust (ms)
250                 6.52           9.62
375                26.22          33.44
500                61.70          79.61
750               241.23         312.56
1000              588.57         789.46
1250             1220.42        2285.54
1500             2274.75        4365.26
1750             5147.99        9880.91
2000            14030.35       21019.84
2500            44515.48       53386.83
3000           107293.55      118253.86

Table 4.1: Matrix Multiplication execution times

Merge sort
Number count    C++ (ms)      Rust (ms)
10^5                9.74           9.22
5 ∗ 10^5           50.36          49.80
10^6               98.54          97.78
5 ∗ 10^6          504.12         502.33
10^7             1038.8         1030.17
5 ∗ 10^7         5430.97        5380.23
10^8            10885.61       10822.97
2 ∗ 10^8        22334.26       21815.12

Table 4.2: Merge sort execution times



Read
Number count    C++ (ms)      Rust (ms)
10^4                0.01           0.01
10^5                0.02           0.03
10^6                0.12           0.31
5 ∗ 10^6            1.02           1.74
10^7                2.84           2.92
5 ∗ 10^7            8.54          21.6
10^8               17.11          41.56
10^9              172.36         345.29

Table 4.3: Read execution times

Write
Number count    C++ (ms)      Rust (ms)
10^4                0.38           0.28
10^5                0.40           0.88
10^6                1.04           8.16
5 ∗ 10^6            7.39          20.79
10^7               20.88          83.18
5 ∗ 10^7           71.26         305.57
10^8              764.94         592.73
10^9             7799.02        5881.87

Table 4.4: Write execution times

4.5 Matrix Multiplication regression model


Figure 4.1 represents the exponential regression model for both C++ and Rust dis-
played for Matrix multiplication. It’s important to note that these models were
constructed without incorporating the data point from size 3000. Instead, this value
was utilized to calculate the approximation error for both models.
The green dot is the predicted value for C++ when the size of the matrix reaches
3000 and the purple dot is the predicted value for when Rust reaches a size of
3000. The approximation error for C++ when running with a size of 3000 was
approximately 50 seconds, which indicates a significant deviation. This corresponds
to the predicted time being around 30% faster than the actual result for size 3000 in
the case of C++.
For Rust, the approximation error was approximately 3.2 seconds in the executed
run for the size of 3000. This corresponds to a deviation of around 21% in the
predicted result, indicating that the predicted time was faster than the actual result
for running Rust with size 3000.
For both C++ and Rust, dashed lines initially follow the predicted model until
the value of 2500. After that point, the predictions deviate significantly from the
model, indicating a noticeable difference between the predicted and actual execution
times for both languages.
This deviation suggests that the execution times of both C++ and Rust are affected
by further factors, such as compiler behaviour, and that both languages have an
unpredictable execution time at larger sizes of matrix multiplication.
According to the prediction, Rust should have had a faster execution time than
C++ at size 3000. This did not happen, but the percentage difference decreased
compared to smaller sizes. If this trend continues, Rust will eventually perform
faster, but the size at which this could happen is not clear from the results of
this study.

Figure 4.1: Matrix multiplication model

4.6 Merge sort regression model


The linear regression models for merge sort in both C++ and Rust are displayed
in Figure 4.2. These models were created without using the results from the size of
2 ∗ 10^8, which was instead used as a reference value for the models.
The green dot represents the predicted value at 2 ∗ 10^8 for C++ in the diagram.
When the actual size was run for C++, the measured execution time was 22334.36
ms, resulting in an approximation error of approximately 0.5 seconds. In other words,
the predicted time was around 2% faster than the actual execution time.
The purple dot represents the predicted value at 2 ∗ 10^8 for Rust in the diagram.
When the actual size was tested for Rust, the measured execution time was 21815.12
ms, resulting in an approximation error of approximately 0.2 seconds. This means
that Rust’s prediction was around 0.7% faster than the resulting execution time.
Both versions, as shown in the diagrams, overlap each other, and no meaningful
difference can be observed between them.
The red dashed line in the diagram represents the connection between all the
experiment results for merge sort developed in C++. The close overlap with the
rest of the lines indicates that the approximation error for C++ was very small.
Similarly, the blue dashed line suggests a small approximation error for Rust. The
close alignment of these lines with the actual data points reinforces the accuracy of
the models and the small discrepancies between the predicted and actual values.
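
The procedure of fitting a model without the largest size and then using that size to measure the approximation error can be sketched as follows. The sketch assumes an ordinary least-squares fit, which the text does not state explicitly, and uses the C++ merge sort values from Table 4.2; the function name fit_line is an assumption.

```rust
/// Ordinary least-squares fit of y = intercept + slope * x.
/// Returns (intercept, slope).
fn fit_line(points: &[(f64, f64)]) -> (f64, f64) {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;
    let sxy: f64 = points
        .iter()
        .map(|p| (p.0 - mean_x) * (p.1 - mean_y))
        .sum();
    let sxx: f64 = points.iter().map(|p| (p.0 - mean_x).powi(2)).sum();
    let slope = sxy / sxx;
    (mean_y - slope * mean_x, slope)
}

fn main() {
    // C++ merge sort results from Table 4.2 as (number count, ms),
    // leaving out the largest size, 2 * 10^8.
    let cpp = [
        (1e5, 9.74),
        (5e5, 50.36),
        (1e6, 98.54),
        (5e6, 504.12),
        (1e7, 1038.8),
        (5e7, 5430.97),
        (1e8, 10885.61),
    ];
    // The held-out measurement, as listed in Table 4.2.
    let held_out = (2e8, 22334.26);

    let (intercept, slope) = fit_line(&cpp);
    let predicted = intercept + slope * held_out.0;
    let error = held_out.1 - predicted;

    println!("model: y = {:.2} + {:.10} x", intercept, slope);
    println!("predicted {:.2} ms, measured {:.2} ms", predicted, held_out.1);
    println!("approximation error {:.2} ms", error);
}
```

Holding out the largest size tests how well the model extrapolates, which is a stricter check than measuring how well it fits the points it was built from.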

Figure 4.2: Merge sort model

4.7 Read regression model


Figure 4.3 shows the linear regression models for the execution time of reading
in both C++ and Rust. The model creation excluded the value for 10^9 and instead
used it to calculate an approximation error for both models.
The green dot represents the predicted value at 10^9 for C++ in terms of execution
time. When the code was actually tested, the execution time measured was 172.36
ms. This resulted in an approximation error of 3.12 ms, indicating that the predicted
result was approximately 1.8% faster than the actual execution time.
The purple dot represents the predicted value at 10^9 for Rust in terms of execution
time. When the code was tested, the measured execution time was 345.29 ms. This
resulted in an approximation error of 75.51 ms, indicating that the predicted result
was approximately 22% slower than the actual execution time for that size. However,
it’s important to note that the error of 75 ms can be attributed to the natural variance
in the execution time when running these applications.
The red dashed line represents the direct connection of all the experiment val-
ues without using the regression model. Both the C++ and Rust lines overlapped
significantly, indicating a low approximation error.
However, the approximation line for Rust displayed a different pattern, with
the lines diverging into separate paths after 10^8. This observation aligns with the
previous calculation for Rust, where the execution time for larger sizes deviated from
the predicted model.

Figure 4.3: Read model



4.8 Write regression model


Figure 4.4 illustrates the linear regression models for the execution time of C++ and
Rust in the writing speed experiment. The models were created using data from
the data set sizes, excluding the value of 10^9, which was utilized to calculate the
approximation error for both models.
The green dot represents the predicted value at 10^9 for C++. The actual result
from the experiment was 7788.02 ms, resulting in an approximation error of 947.6 ms.
The predicted value was approximately 12% faster than the execution time recorded
in the experiment. The discrepancy between the prediction and the actual result
could be attributed to the inherent stochastic nature of algorithm execution times.
The purple dot represents the predicted value at 10^9 for Rust. The actual
execution time for this size was measured as 5881.87 ms, resulting in an
approximation error of 51.14 ms. The prediction was approximately 0.9% slower than
the measured execution time. This indicates a very small difference between the
predicted value and the actual result.
The approximation line for C++ deviates from the actual results after the value
of 10^8, indicating a larger approximation error in the predictions. This deviation
suggests that the model’s accuracy decreases for larger sizes in the case of C++.
In contrast, for Rust, there is a close overlap between the approximation line
and the actual results, indicating that the predictions closely match the experiment
results. This suggests a higher level of accuracy in the prediction model for Rust.

Figure 4.4: Write model


Chapter 5
Discussion

5.1 Matrix Multiplication


The answer to research question RQ1.1 is that C++ matrix multiplication exhibited
faster execution times across all tested sizes. If the sizes in use are larger than
those in this study, however, Rust may be the faster option, as the trend in the
results suggests. Table 4.1 provides a detailed breakdown of the execution times
for reference. Figure 4.1 gives an overview of the results and of the equations
used to predict values. The exponential equation representing the execution time
in C++ is y = 87.38 * exp(0.0025 * x) - 498.02. Similarly, for Rust, the equation
to calculate execution time is y = 351.37 * exp(0.0020 * x) - 1330.89. However, it
is important to note that the execution times of applications can vary between
runs, and these are the results for this experiment.
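
The two exponential models given above can be evaluated directly, as in the sketch below, where x is the matrix size and the result is in milliseconds. The coefficients are used exactly as printed, so they are rounded, and because the exponent amplifies small rounding differences the computed values are indicative only. The function names are assumptions.

```rust
/// Exponential model for C++ as printed in the text (rounded coefficients).
fn cpp_model_ms(size: f64) -> f64 {
    87.38 * (0.0025 * size).exp() - 498.02
}

/// Exponential model for Rust as printed in the text (rounded coefficients).
fn rust_model_ms(size: f64) -> f64 {
    351.37 * (0.0020 * size).exp() - 1330.89
}

fn main() {
    // Evaluate both models at the sizes discussed in the text.
    for size in [2000.0, 2500.0, 3000.0] {
        println!(
            "size {:>6}: C++ model {:>12.2} ms, Rust model {:>12.2} ms",
            size,
            cpp_model_ms(size),
            rust_model_ms(size)
        );
    }
}
```

Because the C++ model has the larger exponent, its predicted time eventually overtakes the Rust model's, which is the basis for the expectation that Rust could become faster at larger sizes.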
The obtained result provides valuable information for practitioners making
decisions regarding applications that heavily rely on matrix multiplication, such
as machine learning. In these use cases, where performance is crucial, C++ emerges
as the preferable choice compared to Rust at all sizes tested in this study. By
selecting C++ for machine learning applications, practitioners can leverage its
superior performance in matrix multiplication, ultimately enhancing the efficiency
and speed of their algorithms. A large part of the potential improvement for
machine learning comes from neural networks, which, as described by Khaled et al.
[11], are reliant on matrix multiplication. With faster matrix multiplication, AI
models can be trained faster and also used faster.
Additionally, the significance of matrix multiplication extends to scientific com-
puting routines, which have broad applications across various domains. Scientific
computing, as described in the book by Shiflet and Shiflet [23], encompasses a wide
range of applications and techniques that involve the use of computers to solve scien-
tific problems. It combines various disciplines such as computer simulation, scientific
visualization, mathematical modelling, and more to facilitate research and analysis
in scientific domains. One notable aspect of scientific computing is its ability to
transform practices in diverse fields. By leveraging computational power, researchers
can perform complex simulations, analyze large datasets, and gain insights that may
not be easily achievable through traditional experimental methods alone. This opens
up new avenues for scientific discovery and understanding. The book mentions sev-
eral examples where computer models and simulations play a crucial role in scientific
research. One such example is in HIV research, where computational models help
in understanding the behaviour of the virus, its interaction with the immune
system, and the effectiveness of potential treatment strategies. Another example
is in weather prediction, where sophisticated numerical models and simulations are
employed to forecast weather patterns and improve our understanding of atmospheric
processes. As further mentioned by Shiflet and Shiflet, scientific computing
applications often have extremely high performance requirements in order to
process all the information collected. If processing takes too long, researchers
are not able to work with the data efficiently. Faster performance also enables
researchers to run more complex problems that would not be possible otherwise.
This study showed that C++ was faster for matrix multiplication, which means that
if a scientific computing use case relies on matrix multiplication, C++ will save
more time and enable more complex problems than Rust.
Matrix multiplication is a fundamental algorithm with wide-ranging applications
in various fields, including machine learning, scientific computing, computer
graphics, and more. In the study by Li et al. [14], matrix multiplication was
considered to be one of the most fundamental algorithms, and the efficiency of
matrix multiplication determines the computational complexity of almost all
numerical algorithms. As demonstrated by the performance results in this study,
C++ consistently exhibited faster execution times compared to Rust for matrix
multiplication.
As shown in Figure 4.1, Rust overall had a more predictable and stable performance
for matrix multiplication. Although C++ was faster, the greater stability of Rust
might sway the choice in Rust’s favour when the stability of execution speed is
important.

5.2 Merge sort


The answer to research question RQ1.2 is that Rust was faster at all sizes.
However, the largest difference between the languages’ execution times in the
merge sort experiment was a 5% faster execution speed for Rust at the smallest
size, so the performance differences are not large enough to make clear claims
about potential performance benefits. Table 4.2 and the linear regression models
for merge sort offer detailed insights into the execution times observed during
the merge sort experiment. These models allow for estimating the execution time
based on the input size, providing a valuable tool for analyzing the performance
trends of the merge sort algorithm. Figure 4.2 gives an overview of the results of
the merge sort experiments and of the equations used to predict the values for
larger sizes in both languages. The linear model for the execution time of merge
sort in C++ is y = -20.85 + 0.0001090280x. The linear model for the execution time
of merge sort in Rust is y = -22.56 + 0.0001083460x. It is worth clarifying that,
as is the nature of executing applications, the execution time can vary between
runs, which is especially important when the differences are very small.
The study by Lobo and Kuwelkar [15] highlights the significance of sorting data
in scientific computing, particularly when dealing with large datasets. Sorted data
enables faster processing in many cases but also for some scientific computing use
cases it’s necessary to have sorted data to even make it possible. Merge sort, known
for its efficient performance in handling large data sets, holds particular relevance
in this context. The study’s results indicate comparable performance between C++
and Rust in executing merge sort, with a slight advantage for Rust, which can
provide insights for decision-making.
This means that for scientific computing, the language used to make sure that the
data is sorted does not matter much from a performance perspective, and other
considerations can take priority.
Since the study does not show a definitive advantage for either C++ or Rust in
terms of merge sort execution, considerations outside of performance, such as ease
of development, language features, community support, and compatibility with
existing codebases, have a more prominent role in the decision for this use case.
Figure 4.2 also shows that both C++ and Rust have predictable and stable ex-
ecution speeds for all sizes tested which suggests that for use in applications which
are in need of stable execution time, there is no clear preference either.

5.3 Reading and Writing


Answering research question RQ1.3 requires more careful consideration due to the
added complexity of both reading and writing. In the case of reading files, C++
demonstrated a clear advantage in terms of faster execution. However, it is im-
portant to note that both languages performed efficiently, with none of the reading
applications taking more than a second.
The observed performance advantage of C++ in reading speed may not be noticeable
until extremely large file sizes. Based on Figure 4.3, for the reading speed of
C++, the file size would need to reach approximately 5.9 GB for the execution time
to exceed 1 second. Similarly, for Rust, a file size of around 2.4 GB would be
required to reach the same threshold.
As the file size increases to the magnitude of 10^10, the execution time becomes
more relevant. In this scenario, C++ is estimated to take around 1.7 seconds, while
Rust is projected to take approximately 4.2 seconds. Therefore, the execution time
becomes more significant as the file sizes grow.
However, it is important to note that both languages demonstrated efficient ex-
ecution times, with none of the reading applications exceeding one second. As a
result, the performance benefits offered by C++ may only become noticeable when
dealing with exceptionally large file sizes.
Another aspect Figure 4.3 makes clear is that reading in C++ has a very
predictable and stable execution speed, which is very useful for use cases that
require such stability. For Rust, the performance does not seem to be as
predictable, given the relatively large approximation error, although a difference
of 75 ms is not an extremely long time.
As for writing, C++ initially exhibited faster writing times, but a shift
occurred between the data sizes of 5 × 10^7 and 10^8, where Rust demonstrated
lower execution times. Surprisingly, this is later than the predicted crossover
value of 43 310 000 based on Figure 4.4, which suggests that the C++ runs
at 5 × 10^7 had unusually fast execution times. According to the prediction, Rust
should already have surpassed C++ at this size.
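A predicted crossover size of this kind can be derived from two fitted trend lines. The sketch below assumes a linear model, time = intercept + slope × n, for each language. The `LinearModel` type, the `crossover` function, and all coefficients are hypothetical: the thesis does not list its fitted coefficients here, and the values are chosen only so that the example lands on the predicted value of 43 310 000.

```rust
/// A linear cost model: time = intercept + slope * n (seconds, n = data size).
#[derive(Clone, Copy)]
struct LinearModel {
    intercept: f64,
    slope: f64,
}

/// Data size at which both models predict the same time.
/// Returns None if the lines are parallel or cross at a negative size.
fn crossover(a: LinearModel, b: LinearModel) -> Option<f64> {
    let slope_diff = a.slope - b.slope;
    if slope_diff == 0.0 {
        return None; // parallel lines never cross
    }
    // Solve a.intercept + a.slope * n = b.intercept + b.slope * n for n.
    let n = (b.intercept - a.intercept) / slope_diff;
    if n >= 0.0 {
        Some(n)
    } else {
        None
    }
}

fn main() {
    // HYPOTHETICAL coefficients, not values measured in the thesis.
    let cpp = LinearModel { intercept: 0.0, slope: 2.0e-8 };
    let rust = LinearModel { intercept: 0.4331, slope: 1.0e-8 };

    match crossover(cpp, rust) {
        Some(n) => println!("Rust becomes faster above about {:.0} elements", n),
        None => println!("The models never cross for non-negative sizes"),
    }
}
```

With a lower fixed cost but a steeper slope for C++, the model predicts that C++ wins below the crossover size and Rust wins above it, which is the pattern described in the text.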
Figure 4.4 also makes clear that writing in Rust has a very predictable and
stable execution time, so Rust suits use cases that need predictable write
performance. For C++, the write results are unstable and
unpredictable, which suggests that it is not the ideal choice when a
predictable execution time is required.
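One simple way to quantify how stable a set of timings is would be to report the mean together with the range between the slowest and fastest run. The sketch below is an illustration only: the function `mean_and_range` and the sample values are hypothetical, and the thesis may have assessed stability differently.

```rust
/// Mean and range (max - min) of timing samples in milliseconds.
/// Returns None for an empty slice.
fn mean_and_range(samples_ms: &[f64]) -> Option<(f64, f64)> {
    if samples_ms.is_empty() {
        return None;
    }
    let sum: f64 = samples_ms.iter().sum();
    let mean = sum / samples_ms.len() as f64;
    let min = samples_ms.iter().copied().fold(f64::INFINITY, f64::min);
    let max = samples_ms.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some((mean, max - min))
}

fn main() {
    // HYPOTHETICAL timings for repeated runs at one data size.
    let stable_runs = [100.0, 102.0, 101.0, 99.0];
    let unstable_runs = [100.0, 160.0, 85.0, 140.0];

    for (label, runs) in [("stable", &stable_runs), ("unstable", &unstable_runs)] {
        if let Some((mean, range)) = mean_and_range(runs) {
            println!("{label}: mean {mean:.1} ms, range {range:.1} ms");
        }
    }
}
```

A small range relative to the mean indicates predictable execution time; a large one indicates the kind of instability described above.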
Considering applications that involve both reading and writing large data blocks,
Rust is likely to be more efficient, because writing to disk takes much longer
than reading for large datasets and Rust writes faster at these sizes.
These results suggest that for applications where reading and writing to disk is a
bottleneck, C++ is the faster choice for smaller sizes. From sizes of around
10^8, Rust becomes faster at writing, which is the more time-consuming disk
operation.

5.4 Ethical, Societal and Sustainability concerns


When significant changes occur, such as transitioning to a different programming
language within an infrastructure, it can pose challenges for practitioners in terms
of adapting to this new language and make it harder for developers to work with
this technology. Therefore, it is crucial to ensure that developers are not left behind
and are given opportunities to keep up with these advancements. However, it is
equally important to approach such substantial decisions with caution, taking into
consideration the impact they may have on developers’ ability to work efficiently
within different projects.
In addition to the previous consideration, there are further concerns regarding the
reliability of applications when adopting Rust as a programming language. According
to a study conducted by Pinho et al. [21], Rust’s emphasis on memory safety can
enhance the safety and reliability of critical systems. Reliability holds exceptional
significance for use cases that involve safety-critical operations, and any means to
improve it should be prioritized. However, it is essential to acknowledge that when
developers are confronted with new conditions or technologies, there is always a risk
of potential quality degradation in applications. Therefore, it becomes even more
crucial to focus on maintaining high quality and investing effort in ensuring that
developers adapt effectively to new technologies such as Rust.
An increasingly significant concern that demands attention is energy usage.
Pereira et al. [20] examine the energy consumption of different programming
languages and emphasize the growing importance of energy efficiency.
While the relationship between energy consumption and execution time in
software solutions is not strictly one-to-one, the findings of Pereira et al.
suggest that execution time results can provide some indication of a language’s
energy efficiency.
This study’s results therefore also give some idea of which language would be the
most energy-efficient, although a study focused entirely on energy usage would be
needed to confirm this.
Using the most energy-efficient language has a large impact when a program
is used at scale, and it becomes more important as society’s energy
needs increase.
Chapter 6
Conclusions and Future Work

6.1 Conclusion
In conclusion, C++ had faster execution speeds than Rust in more of the tests,
but the results also suggest that Rust has a performance advantage in certain
areas.
For matrix multiplication, the findings indicate that C++ exhibited
faster execution speed than Rust. As sizes grew beyond 2000*2000, however,
Rust’s execution time became more similar to that of C++, and if this trend
continues Rust will eventually have the faster execution speed. Both implementations
showed steep growth in execution time as the size of the matrices increased.
Execution time nevertheless grew more slowly than expected at the larger sizes, which
suggests that both languages applied some optimizations that limited the
performance decrease.
This suggests that the performance difference between C++ and Rust will con-
tinue to widen as the size of the matrices expands. Therefore, for applications involv-
ing matrix multiplication, C++ is preferable in terms of execution speed, particularly
for larger matrix sizes.
For merge sort, the study results indicate that Rust exhibited faster
execution speed than C++ in all tested scenarios. However,
the performance differences between the two languages were not significant, with
the largest observed difference being only 5%. Consequently, it is challenging to
draw a definitive conclusion regarding the superiority of either language for merge
sort. While Rust demonstrated slightly better performance across all tested sizes,
the marginal advantage may not be substantial enough to make definitive statements
about the preferred language for merge sort.
In terms of reading files, C++ demonstrated better performance across all tested
sizes, except for the first size where both languages had similar execution times. It is
important to note that both applications exhibited fast reading speeds, with none of
the execution times exceeding one second, even for the largest sizes. When it comes
to writing, C++ outperformed Rust for the majority of the tested data set sizes.
However, for the largest two data set sizes, Rust exhibited faster execution times,
indicating an advantage for writing larger sizes in Rust.
Considering an application that involves both reading and writing large blocks
of data, the execution time of writing becomes the dominant factor. Therefore, the
choice of language would depend on whether the data size exceeds the threshold
where Rust becomes faster in writing. In such cases, Rust would be the preferred
language due to its better performance.


Overall, C++ demonstrated faster performance than Rust in a majority of the
tests conducted, but Rust also had an advantage in some areas. However, it is crucial
to weigh the specific use case and all relevant factors before making a
decision. While performance is an important consideration, it should be balanced
with other factors such as development speed, the expertise of the development team,
language features, community support, and compatibility with existing code bases.
Taking all of these into account leads to a better choice of language for the use
case, and in turn to a better software product.

6.2 Future work


Interesting future work to strengthen the performance guidance for choosing between
C++ and Rust would be studies comparing their performance under parallelism,
which is another point of discussion between these languages. Such studies are
especially interesting because parallelism is becoming increasingly important for
improving performance. Parallelism was outside the scope of this study, which
focused on sequential execution.
To enhance decision support between C++ and Rust, more studies examining
various aspects of the languages, such as productivity and reliability, would provide
valuable insights. While this study focused on performance comparisons, exploring
additional dimensions can provide a more comprehensive understanding of the two
languages.
Additionally, there is still a significant research gap when it comes to comparing
Rust with other programming languages, such as Java. Conducting further compar-
ative studies that encompass a broader range of languages would greatly enhance the
decision-making process for practitioners. By expanding the scope of comparisons,
practitioners can gain a more comprehensive understanding of the unique strengths
and weaknesses of Rust, Java, and other languages and make the best choice for a
use case.
References

[1] A. Balasubramanian, M. S. Baranowski, A. Burtsev, A. Panda, Z. Rakamarić,
and L. Ryzhyk, “System programming in rust: Beyond safety,” in Proceedings of
the 16th Workshop on Hot Topics in Operating Systems, ser. HotOS ’17. New
York, NY, USA: Association for Computing Machinery, 2017, p. 156–161.
[2] B. Stroustrup, A History of C++: 1979–1991. New York, NY, USA: Association
for Computing Machinery, 1996, p. 699–769.
[3] ——, “An overview of the c++ programming language,” The Handbook of Object
Technology, vol. 1, no. 1, p. 23, 01 1998.
[4] N. Borgsmüller, “The rust programming language for embedded software
development,” Ingolstadt, pp. 63, lxvii, 2021. [Online]. Available: https:
//opus4.kobv.de/opus4-haw/frontdoor/index/index/docId/786
[5] Y. W. Chua, “Appreciating rust’s memory safety guarantees,”
https://medium.com/singapore-gds/appreciating-rust-memory-safety-438301fee097,
2017, (accessed: 11-may-2023).
[6] D. Detlefs, A. Dosser, and B. Zorn, “Memory allocation costs in large c and c++
programs,” Software: Practice and Experience, vol. 24, no. 6, pp. 527–542, 1994.
[7] A. N. Evans, B. Campbell, and M. L. Soffa, “Is rust used safely by software
developers?” in Proceedings of the ACM/IEEE 42nd International Conference
on Software Engineering. Cornell: ACM, jun 2020.
[8] A. Fawzi, M. Balog, A. Huang, T. Hubert, B. Romera-Paredes, M. Barekatain,
A. Novikov, F. J. R. Ruiz, J. Schrittwieser, G. Swirszcz, D. Silver, D. Hass-
abis, and P. Kohli, “Discovering faster matrix multiplication algorithms with
reinforcement learning,” Nature, vol. 610, no. 1, p. 53, 2022.
[9] V. Franzén and C. Östling, “Evaluation of rust for gpgpu high-performance
computing,”
https://odr.chalmers.se/server/api/core/bitstreams/5d380bc5-7cfc-4e21-b429-c52dcba3ccfb/content,
p. 42, 2022.
[10] I. Gouy, “The computer language 22.05 benchmarks game,” https://
benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html,
2023, (accessed: 11-may-2023).
[11] A. Khaled, A. F. Atiya, and A. H. Abdel-Gawad, “Applying fast matrix multipli-
cation to neural networks,” in Proceedings of the 35th Annual ACM Symposium
on Applied Computing, ser. SAC ’20. New York, NY, USA: Association for
Computing Machinery, 2020, p. 1034–1037.

[12] S. Klabnik, “The history of rust,” 2016. [Online]. Available:
https://dl.acm.org/doi/10.1145/2959689.2960081
[13] M. D. Lam, E. E. Rothberg, and M. E. Wolf, “The cache performance and
optimizations of blocked algorithms,” in Proceedings of the Fourth International
Conference on Architectural Support for Programming Languages and Operating
Systems, ser. ASPLOS IV. New York, NY, USA: Association for Computing
Machinery, 1991, p. 63–74.
[14] Y. Li, S.-L. Hu, J. Wang, and Z.-H. Huang, “An introduction to the computa-
tional complexity of matrix multiplication,” Journal of the Operations Research
Society of China, vol. 8, no. 1, p. 24, 2020.
[15] J. Lobo and S. Kuwelkar, “Performance analysis of merge sort algorithms,” in
2020 International Conference on Electronics and Sustainable Communication
Systems (ICESC). IEEE, 2020, pp. 110–115.
[16] J. N. Matthews, W. Hu, M. Hapuarachchi, T. Deshane, D. Dimatos, G. Hamil-
ton, M. McCabe, and J. Owens, “Quantifying the performance isolation proper-
ties of virtualization systems,” in Proceedings of the 2007 Workshop on Experi-
mental Computer Science, ser. ExpCS ’07. New York, NY, USA: Association
for Computing Machinery, 2007, p. 6–es.
[17] M. Medin, “Performance comparison between c and rust compiled
to webassembly,”
http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1577357&dswid=6381,
p. 11, 2021.
[18] Mozilla, “Rust lang documentation,” https://doc.rust-lang.org/rust-by-example/,
2023, (accessed: 02-March-2023).
[19] ——, “Rust lang homepage,” https://www.rust-lang.org/, 2023,
(accessed: 02-March-2023).
[20] R. Pereira, M. Couto, F. Ribeiro, R. Rua, J. Cunha, J. P. Fernandes, and
J. Saraiva, “Energy efficiency across programming languages: How do energy, time,
and memory relate?” in Proceedings of the 10th ACM SIGPLAN International
Conference on Software Language Engineering, ser. SLE 2017. New York, NY,
USA: Association for Computing Machinery, 2017, p. 256–267.
[21] A. Pinho, L. Couto, and J. Oliveira, “Towards rust for critical systems,” in 2019
IEEE International Symposium on Software Reliability Engineering Workshops
(ISSREW), 2019, pp. 19–24.
[22] S. Sharma, “Performance comparison of java and c++ when sorting integers and
writing/reading files.” https://www.diva-portal.org/smash/record.jsf?dswid=
7960&pid=diva2%3A1333653&c=1&searchType=UNDERGRADUATE&
language=sv&query=&af=%5B%5D&aq=%5B%5B%7B%22freeText%22%
3A%22Performance+comparison+of+Java+and+C%2B%2B+when+sorting+
integers+and+writing%2Freading+files%22%7D%5D%5D&aq2=%5B%5B%
5D%5D&aqe=%5B%5D&noOfRows=50&sortOrder=author_sort_asc&
sortOrder2=title_sort_asc&onlyFullText=false&sf=all, p. 25, 2019.
[23] A. B. Shiflet and G. W. Shiflet, Introduction to Computational Science: Modeling
and Simulation for the Sciences, 2nd ed. Princeton University Press, 2014.
[24] Stack Overflow, “Stack overflow 2022 survey,”
https://survey.stackoverflow.co/2022/#technology, 2022, (accessed: 03-Feb-2023).
[25] The Algorithms, “The algorithms,” https://the-algorithms.com/, 2023,
(accessed: 27-april-2023).
[26] B. Thompson, “C++ introduction guru,” https://www.guru99.com/cpp-tutorial.html,
2023, (accessed: 02-March-2023).
[27] W3 Schools, “C++ introduction,” https://www.w3schools.com/cpp/cpp_intro.asp,
2023, (accessed: 02-March-2023).
[28] H. Wang, P. Wang, Y. Ding, M. Sun, Y. Jing, R. Duan, L. Li, Y. Zhang,
T. Wei, and Z. Lin, “Towards memory safe enclave programming with rust-
sgx,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and
Communications Security, ser. CCS ’19. New York, NY, USA: Association for
Computing Machinery, 2019, p. 2333–2350.
[29] K. D. Wong, “Matrix multiplication performance in c++,” http:
//www.kerrywong.com/2009/03/07/matrix-multiplication-performance-in-c/,
2009, (accessed: 04-March-2023).
[30] Y. Zhang, Y. Zhang, G. Portokalidis, and J. Xu, “Towards understanding the
runtime performance of rust,” in Proceedings of the 37th IEEE/ACM Interna-
tional Conference on Automated Software Engineering, ser. ASE ’22. New York,
NY, USA: Association for Computing Machinery, 2023.