C# Multithreaded and Parallel Programming Sample Chapter
C# Multithreaded and
Parallel Programming
Welcome to C# Multithreaded and Parallel Programming. This book will take you
through all of the ways to perform multithreaded and concurrent programming using the
C# programming language and the .NET Framework. We will start with a description of
what concurrent and parallel programming is, why it is important, and when you should
implement it. We will then go through the different classes provided by the .NET
Framework and the different design patterns commonly used when developing
multithreaded applications.
Most modern machines have dual-core processors. This means that the present-day
computer has the ability to multitask. Using multiple cores means your applications can
process data faster and be more responsive to users. However, to fully exploit this in your
applications, you need to write multithreaded code.
This will take us on a journey from the BackgroundWorker component and the Thread
class, through the Task Parallel Library, to the async and await keywords. We will also
explore common design patterns such as Pipelining and producer-consumer, as well as
the IAsyncResult interface.
Using the concurrent and parallel classes provided by .NET allows you to easily write
powerful multithreaded applications. In the latest version of .NET, Microsoft has added
the Task Parallel Library and the async and await keywords to make concurrent
programming much easier than working with threads directly.
We will cover all aspects of developing multithreaded applications using the latest
version of .NET in this book.
Chapter 6, Task-based Parallelism, enables us to examine the Task Parallel Library and
task parallelism. A task is an asynchronous set of operations that can be run concurrently
with other tasks. We will examine designing an application as a series of tasks that can
be performed in parallel. With the help of examples, we will demonstrate how to create,
manage, and coordinate tasks. We will further examine additional topics with the Task
Parallel Library and task parallelism. We will learn how to perform exception handling
when running multiple tasks, how to schedule tasks under certain conditions, and how to
cancel running tasks before they complete when needed.
Chapter 7, Data Parallelism, explores the concept of data parallelism. We will see how
to perform the same operations on elements of a collection concurrently using the Task
Parallel Library. The Parallel class has the For and ForEach loops, and we will show
examples of each to demonstrate how they handle concurrent data processing. We will
convert our image processing application from heavyweight to lightweight concurrency
using the Task Parallel Library instead of the Thread class.
Chapter 8, Debugging Multithreaded Applications with Visual Studio, teaches us how
to take full advantage of Visual Studio 2012 to debug our multithreaded applications.
We will demonstrate using the Threads view, and the Tasks, Parallel Stacks, and Parallel
Watch windows. We will finish with debugging our image processing application.
Chapter 9, Pipeline and Producer-consumer Design Patterns, helps us explore two of
the most popular parallel patterns for development: Pipelining and producer-consumer.
In Pipelining, we will see how to accomplish a parallel task where a simple parallel
loop will not work due to data dependencies. The producer-consumer pattern allows a
producer, which is generating results, to run along with the consumer so that the
consumer can consume the results concurrently. We will expand our image processing
application to implement these two patterns in combination.
Chapter 10, Parallel LINQ (PLINQ), details the benefits and functionality provided
by Parallel LINQ (PLINQ). We will see how PLINQ speeds up traditional LINQ queries by
separating the data source into sections and executing the query on each section. We will
also discuss what kind of queries to use PLINQ for because not all queries will run faster
using PLINQ.
Chapter 11, The Asynchronous Programming Model, explains the Asynchronous
Programming Model (APM), which is a design pattern that is based on classes
implementing the IAsyncResult interface. We will see how to begin and end
asynchronous operations and use delegates to call methods asynchronously.
This chapter will also cover the new async and await keywords and how to use
them to implement an asynchronous design in your custom classes.
Data Parallelism
Concurrently performing a task or a set of operations on a collection of data is referred
to as data parallelism. For example, if we have a list of files in a folder and we want to
rename them all, we can create a for loop that goes through the collection and performs a
rename command during each iteration. We can also iterate through a collection datatype
such as a List or DataView using a foreach statement. The Task Parallel Library
(TPL) provides parallel counterparts of these loops in the System.Threading.Tasks
namespace.
The TPL provides the Parallel class to make it easy to perform concurrent
operations on a dataset or data collection using the different overloads of the
Parallel.For and Parallel.ForEach methods.
In this chapter, we will learn how to process items of a data source in parallel using
the Parallel.For and Parallel.ForEach methods. We will also examine the
ParallelLoopState class, which allows us to examine the results of a concurrent
loop and perform actions with the results. Finally, we will learn how to cancel a
concurrent loop before it has completed. In this chapter, we will cover:
•  Parallel data processing with the Parallel.For and Parallel.ForEach methods
•  Canceling a parallel loop with the ParallelLoopState class
•  Handling exceptions in parallel loops
•  Using thread-local variables in parallel loops
In each example, the method or lambda expression takes a single parameter that is
the iteration value. If you need more control over the execution of the concurrent
loop, there are overload methods that take a ParallelLoopState parameter that
is internally generated by .NET. We will talk about this later in the chapter, but it
allows us to do things such as canceling a parallel loop or performing an action for
each iteration of the loop once it is completed.
The full list of the overloads of the Parallel.For method is available in the MSDN
reference at http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.for(v=vs.110).aspx.
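Since the overload list is easiest to absorb through code, here is a minimal, self-contained sketch (the class name and data are invented for illustration) showing three of the commonly used forms: the basic overload, the overload whose body receives a ParallelLoopState, and the overload that accepts a ParallelOptions object:

using System;
using System.Threading.Tasks;

class ParallelForOverloads
{
    static void Main()
    {
        int[] data = new int[20];

        // Basic overload: the Action<int> body receives only the iteration index.
        Parallel.For(0, data.Length, i => data[i] = i * i);

        // Overload with ParallelLoopState: the body can call Stop or Break.
        Parallel.For(0, data.Length, (i, state) =>
        {
            if (data[i] > 100) state.Break();
        });

        // Overload with ParallelOptions: limit the degree of parallelism.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
        Parallel.For(0, data.Length, options, i => Console.WriteLine(data[i]));
    }
}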
How to do it
For this example, we will create a WPF application that allows the user to enter
numbers in 10 textboxes and click on a button. Once the button is clicked, the
application will concurrently take each number and multiply it successively by the
numbers 1 through 10. The result of each calculation will be placed back in the textbox
it came from. Perform the following steps to do so:
1. Open Visual Studio and create a WPF application named ParallelMath1.
2. In the MainWindow.xaml design view, change the page title to ParallelMath:
Title="ParallelMath" Height="350" Width="525">
7. Now, add a class-level int array field named numbers to the MainWindow class
so that both the button's event handler and the CalculateNumbers method can
access it:
private int[] numbers = new int[10];
Then, put the following code inside the btnCalculate_Click event handler:
numbers[0] = Convert.ToInt32(tb1.Text);
numbers[1] = Convert.ToInt32(tb2.Text);
numbers[2] = Convert.ToInt32(tb3.Text);
numbers[3] = Convert.ToInt32(tb4.Text);
numbers[4] = Convert.ToInt32(tb5.Text);
numbers[5] = Convert.ToInt32(tb6.Text);
numbers[6] = Convert.ToInt32(tb7.Text);
numbers[7] = Convert.ToInt32(tb8.Text);
numbers[8] = Convert.ToInt32(tb9.Text);
numbers[9] = Convert.ToInt32(tb10.Text);
Parallel.For(0, 10, CalculateNumbers);
tb1.Text = numbers[0].ToString();
tb2.Text = numbers[1].ToString();
tb3.Text = numbers[2].ToString();
tb4.Text = numbers[3].ToString();
tb5.Text = numbers[4].ToString();
tb6.Text = numbers[5].ToString();
tb7.Text = numbers[6].ToString();
tb8.Text = numbers[7].ToString();
tb9.Text = numbers[8].ToString();
tb10.Text = numbers[9].ToString();
8. Next, add the CalculateNumbers method that each iteration of the parallel loop
will call:
private void CalculateNumbers(int i)
{
    int j = numbers[i];
    for (int k = 1; k <= 10; k++)
    {
        j *= k;
    }
    numbers[i] = j;
}
That should be all. Now, let's run our application and see what happens. Remember
that we have not put in any error handling. The application expects a number, and only
a number, in each textbox when the Calculate button is clicked. If a textbox does not
contain a valid number, Convert.ToInt32 will throw a FormatException.
You should see results similar to this before you click on the button:
Now, let's enter 10 numbers into our textboxes so that the application looks
something like the following:
Now, click on the Calculate button and you should see the following results very
quickly since we are doing these calculations concurrently:
How it works
In the preceding exercise, we entered 10 numbers into 10 textboxes and then clicked
on Calculate. The program took each number, multiplied it successively by the numbers
1 through 10, and placed the result back in the textbox it came from.
This was all done concurrently; each textbox was processed in parallel. This may
have been on 10 separate threads or fewer, depending on the hardware the program
runs on. Unlike when we use threads directly, with the Parallel class and the TPL,
.NET manages the threadpool and decides how many threads to use for the concurrent
operation based on the processing cores available on the machine.
Let's look at how the concurrent loop is executed. It is the single command
Parallel.For(0, 10, CalculateNumbers);. This command queues 10 iterations
to the threadpool, and each one executes the CalculateNumbers method
with an integer parameter.
Now, let's look at the Parallel.ForEach command.
There are two parameters in the most basic version of this method: a data collection
and an Action delegate that performs a task on an item of the data collection.
The Action delegate takes a single parameter, which is an item in the collection.
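As a quick illustration of this basic form, here is a minimal, self-contained sketch (the word list is invented for the example) that processes every item of a collection concurrently:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class ForEachBasics
{
    static void Main()
    {
        var words = new List<string> { "alpha", "beta", "gamma", "delta" };

        // Basic form: the data collection and an Action delegate that
        // receives one item from the collection on each concurrent iteration.
        Parallel.ForEach(words, word =>
        {
            Console.WriteLine("{0} has {1} characters", word, word.Length);
        });
    }
}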
All of the different overloads of the ForEach method are listed in the MSDN reference
at http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.foreach(v=vs.110).aspx.
As that list shows, there are many different overloads of this method. They allow us to
use a ParallelLoopState object or a thread-safe local variable.
We will focus on the simple form of just performing concurrent processing on a data
collection. To further reiterate this point, let's revisit a project we worked on earlier in
the book. In Chapter 4, Advanced Thread Processing, we wrote an application that took
a JPG image, divided it into separate bitmaps, and then performed parallel functions
on each bitmap to find old stars. It then reassembled the individual bitmaps back
into a single image.
We will rewrite this application using data parallelism and the TPL instead of using
threads directly. This will demonstrate how the TPL can simplify multithreaded
code development.
No longer do we have to manage threads (start them, wait on them to complete,
or track them), and no longer do we have to check how many processing cores our
machine has in order to maximize performance without starting more threads than
those cores can use.
All we have to do is separate our large image into a collection of smaller bitmaps
and use a Parallel.ForEach concurrent loop to process each bitmap. That's it.
Let's get started.
How to do it
We will take our original OldStarsFinder Windows Forms application and change
it. To do this, let's perform the following steps:
1. First, let's open our OldStarsFinder application in Visual Studio.
2. Let's add a new using statement so we can access the Parallel library:
using System.Threading.Tasks;
3. At the beginning of the class definition, remove all of the old variable
declarations and replace them with just these:
// Number of bitmaps to break the original image into and add to
// the list of bitmaps.
private int Count = 0;
// List of bitmaps to run the Parallel.ForEach loop on.
private List<Bitmap> BitmapList;
// The original bitmap loaded from the image file.
private Bitmap OriginalBitmap;
// Old stars count, updated under a lock to protect thread safety.
private String prsOldStarsCount = "0";
9. Then add a label control with the text, Number of bitmaps to divide
into for processing:.
10. Also, add a textbox control and set its Name property to tbTasks. This will be
used to allow you to designate the number of sections you want the bitmap
divided into.
11. Finally, we remove the butFindOldStarsBatch button because we do not
need it in this application.
That should be all you need to do to run this application using data parallelism with
the Task Parallel Library.
Let's compile and run our application. You should get something like this:
Now, enter the number of bitmaps to divide the image into and click on the Old Star
Finder button. The application will now look like this:
What just happened? We entered 8 as the number of bitmaps to divide the image into.
The application splits the JPG image into 8 equal-sized bitmaps and adds them to a
List collection. Then it concurrently processes each bitmap looking for old stars.
Finally, it reassembles the bitmaps into one image and redisplays it.
Let's take a closer look at what just happened.
How it works
If you compare the two versions of the program, you will see that the second version
is much simpler with less code. If we examine the butOldStarsFinder_Click event
handler method, we will see most of the work. First, we divide our image up into a
List collection of smaller bitmaps based on the number we entered. Here is the code
that does this:
// Break up the bitmap into a list of bitmaps.
// EachBitmapHeight, HeightToAdd, StartRow, and loBitmap are
// declared and initialized before this loop (not shown here).
for (int i = 0; i < Count; i++)
{
    if (EachBitmapHeight > HeightToAdd)
    {
        // The height of the last bitmap may be less than
        // the height of the other bitmaps.
        EachBitmapHeight = HeightToAdd;
    }
    loBitmap = CropBitmap(OriginalBitmap, new Rectangle(0,
        StartRow, OriginalBitmap.Width, EachBitmapHeight));
    HeightToAdd -= EachBitmapHeight;
    StartRow += EachBitmapHeight;
    BitmapList.Add(loBitmap);
}
Next, we take our list collection, BitmapList, and use it in a parallel ForEach
command in this line of code:
Parallel.ForEach(BitmapList, item => ThreadOldStarsFinder(item));
Finally, when this loop has completed, we display the image with the old stars with
this method:
ShowBitmapWithOldStars();
That is it. We no longer have to find out how many cores the processor has and
create that many threads. No matter how many items there are in our collection,
.NET maximizes the threads in the threadpool to achieve optimal performance. It
will create threads if needed or reuse existing threads if possible. This saves on the
overhead of starting more threads than can be effectively used by the number of
cores in the machine.
You can now see why writing multithreaded code using TPL is called lightweight
concurrency. This version of the Old Stars Finder is definitely "lighter" on the
code and logic than the previous version written directly with threads or
heavyweight concurrency.
Canceling a parallel loop
If we call Break on a parallel loop, the iterations that are currently executing are
completed and then the loop stops; if we call Stop, all currently running iterations are
stopped as soon as possible, without being run to completion. In either case, no tasks
are scheduled on the threadpool for the iterations of the parallel loop that have not
yet started.
To perform a break or a stop of a parallel loop, we need to use the ParallelLoopState
object. This means that we have to use one of the overloads of the Parallel.For or
Parallel.ForEach method that takes a ParallelLoopState parameter.
What is the ParallelLoopState class? It is a special class that cannot be
instantiated in your user code; instances are created and supplied by the
TPL and .NET. The object provides your parallel loop with a mechanism to
interact with the other iterations in the loop.
The Break and Stop methods are the members you will use most often, along with
the IsStopped and IsExceptional properties. These properties allow you to check
whether any iteration of the loop has called Stop or has thrown an exception.
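The exercise that follows demonstrates Stop; as a point of comparison, here is a minimal standalone sketch (not part of the ParallelMath1 project) that uses Break instead, which still lets every iteration below the breaking index run to completion:

int[] values = new int[20];
Parallel.For(0, values.Length, (i, state) =>
{
    // Stop scheduling iterations beyond this index, but let all
    // lower-indexed iterations finish.
    if (i > 7)
    {
        state.Break();
        return;
    }
    values[i] = i * i;
});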
Now, we will take our ParallelMath1 example and change it to stop the loop after
seven iterations. The condition is arbitrary and chosen purely for demonstration, but
in a real application there are many conditions under which you will want to break
out of or stop a parallel loop.
How to do it
We just need to make a few adjustments to our previous program. Let's start by
opening the ParallelMath1 WPF application in Visual Studio and making the
following changes:
1. Create a new method called CalculateNumbers2 and place the following
code into it:
private void CalculateNumbers2(int i, ParallelLoopState pls)
{
    int j = numbers[i];
    if (i < 7)
    {
        for (int k = 1; k <= 10; k++)
        {
            j *= k;
        }
        numbers[i] = j;
    }
    else
    {
        pls.Stop();
        return;
    }
}
2. In the btnCalculate_Click event handler, change the Parallel.For call so that
it uses this new method:
Parallel.For(0, 10, CalculateNumbers2);
That's it. Now, let's run our application and put numbers in each of the boxes so that
it looks like the following screenshot:
Now, click on the Calculate button and your results should look like the
following screenshot:
What do you see? Yes, after seven iterations of the parallel loop, the loop is stopped
and the last three iterations are not finished. Let's examine why.
How it works
By adding the ParallelLoopState parameter to the method called by the parallel
For method, we actually change the overload of the Parallel.For method that is called.
We are now calling this overload:
Parallel.For(int fromInclusive, int toExclusive, Action<int, ParallelLoopState> body)
You will notice that we do not create this ParallelLoopState variable and pass
it into the CalculateNumbers2 method. It is done by .NET and we can just use it.
Pretty handy!
Now, in our Action delegate, CalculateNumbers2, we call the Stop method of this
object using the following command:
pls.Stop();
Once this method is called, no new iterations of the loop are started, and the loop
finishes with only the iterations that have already been completed.
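If you want to know afterwards whether the loop ran all of its iterations, you can capture the ParallelLoopResult value that Parallel.For returns. Here is a minimal sketch of that check (the message text is just for illustration):

// ParallelLoopResult reports whether the loop ran all of its iterations.
ParallelLoopResult result = Parallel.For(0, 10, CalculateNumbers2);
if (!result.IsCompleted)
{
    // Stop or Break ended the loop early. After a Break (not used here),
    // result.LowestBreakIteration would hold the index where Break was called.
    MessageBox.Show("The parallel loop was stopped before all iterations ran.");
}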
This is not a very practical example: why execute the loop for 10 iterations and
just stop after seven? Why not execute the parallel loop for seven iterations in the
first place? This is just for demonstration purposes. In your applications, you will
find many conditions under which you will want to exit a parallel loop before
completing all iterations, just like with a normal for loop.
Handling exceptions in parallel loops
When an exception is thrown inside one iteration of a parallel loop, it does not have to
stop the other iterations; instead, we can collect the exceptions as they occur and throw
them together as an AggregateException once the loop has finished.
How to do it
To start, let's open the ParallelMath1 project in Visual Studio and make the
following changes:
1. We will be using a ConcurrentQueue to collect the exceptions, so add this
using statement:
using System.Collections.Concurrent;
2. Next, add a class-level ConcurrentQueue to hold the exceptions thrown by the
loop iterations:
ConcurrentQueue<Exception> exceptions = new ConcurrentQueue<Exception>();
3. Next, we will add a new method to be used as the Action delegate of our
parallel loop command. We will call this method CalculateNumbers3. Add
the following code to this method:
private void CalculateNumbers3(int i, ParallelLoopState pls)
{
    int j = numbers[i];
    try
    {
        for (int k = 1; k <= 10; k++)
        {
            j *= k;
            if (j > 5000000)
                throw new ArgumentException(String.Format(
                    "The value of text box {0} is {1}.", i, j));
        }
    }
    catch (Exception e)
    {
        exceptions.Enqueue(e);
    }
    numbers[i] = j;
}
4. Then, let's alter our btnCalculate_Click event handler. Change the code
between the population of the numbers array and the population of the
textboxes to include the following lines of code:
try
{
    Parallel.For(0, 10, CalculateNumbers3);
    if (exceptions.Count > 0)
        throw new AggregateException(exceptions);
}
catch (AggregateException ae)
{
    // This is where you can choose which exceptions to handle.
    foreach (var ex in ae.InnerExceptions)
    {
        if (ex is ArgumentException)
        {
            tbMessages.Text += ex.Message;
            tbMessages.Text += "\r\n";
        }
        else
            throw ex;
    }
}
Build and run your application. Now, enter numbers in each of the boxes.
You should have a screen that looks like the following screenshot:
Now, click on the Calculate button and you should see results that look like the
following screenshot:
As you can see from the output, every box that has a total that goes over 5 million
has a line printed in our Messages textblock. In this example, any number that goes
over 5 million throws an exception. Once all of the iterations of the parallel loop
have completed, we process these exceptions and print their messages to the
tbMessages textblock.
How it works
The first thing we changed was adding a ConcurrentQueue object to hold all of our
exceptions. This is done with this command:
ConcurrentQueue<Exception> exceptions = new ConcurrentQueue<Exception>();
Next, in our Action delegate for the parallel For command, which in this version
executes CalculateNumbers3, we check for results greater than 5 million and throw
an exception using the following command:
if (j > 5000000) throw new ArgumentException(String.Format("The value of text box {0} is {1}.", i, j));
We then catch this exception within the delegate and add it to our concurrent queue
of exception objects using these statements:
catch(Exception e)
{
exceptions.Enqueue(e);
}
Finally, after all of the iterations of the parallel loop have completed, we throw an
AggregateException if our concurrent queue contains any exceptions, and we catch
and process it with the following code:
catch (AggregateException ae)
{
    // This is where you can choose which exceptions to handle.
    foreach (var ex in ae.InnerExceptions)
    {
        if (ex is ArgumentException)
        {
            tbMessages.Text += ex.Message;
            tbMessages.Text += "\r\n";
        }
        else
            throw ex;
    }
}
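An alternative to looping over InnerExceptions yourself is the AggregateException.Handle method, which invokes a predicate for each inner exception and rethrows, wrapped in a new AggregateException, any exception the predicate does not mark as handled. A minimal sketch of the same filtering logic looks like this:

catch (AggregateException ae)
{
    // Handle only ArgumentExceptions; anything else is rethrown by Handle.
    ae.Handle(ex =>
    {
        if (ex is ArgumentException)
        {
            tbMessages.Text += ex.Message + "\r\n";
            return true;   // handled
        }
        return false;      // not handled, will be rethrown
    });
}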
Using thread-local variables in parallel loops
Some overloads of the Parallel.For and Parallel.ForEach methods accept three
functions that together give each thread in the loop its own local copy of a variable.
The first function will initialize the thread-local variable. The second function is the
Action delegate that the loop performs for each item. The third function is the Action
delegate that gets executed when a thread's iterations have completed; it receives that
thread's local copy of the variable and can then process the results, which usually
means combining them into a final result.
Let's examine one of the ForEach overloads:
ForEach<TSource, TLocal>(IEnumerable<TSource>, Func<TLocal>, Func<TSource, ParallelLoopState, TLocal, TLocal>, Action<TLocal>)
Let's dissect this for a minute. We will take each piece of the method definition
and explain its role:
•  IEnumerable<TSource>: This is the data collection that the loop iterates over.
•  Func<TLocal>: This function initializes the thread-local variable for each thread.
•  Func<TSource, ParallelLoopState, TLocal, TLocal>: This is the body of the loop.
   It receives an item from the collection, the loop state, and the current thread-local
   value, and returns the updated thread-local value.
•  Action<TLocal>: This final delegate is executed once for each thread-local variable
   when its iterations have completed, and is usually used to combine the results
   of the loop.
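To make the signature concrete, here is a minimal, self-contained sketch (the word list and totals are invented for the example) that uses this overload to sum the lengths of all the strings in a collection with a thread-local subtotal:

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class ForEachThreadLocal
{
    static void Main()
    {
        var words = new List<string> { "red", "green", "blue", "orange", "violet" };
        long totalLength = 0;

        Parallel.ForEach(
            words,                          // the data collection (IEnumerable<TSource>)
            () => 0L,                       // Func<TLocal>: initialize the thread-local subtotal
            (word, loopState, subtotal) =>  // loop body: update the thread-local subtotal
            {
                subtotal += word.Length;
                return subtotal;
            },
            subtotal => Interlocked.Add(ref totalLength, subtotal)  // combine the subtotals
        );

        Console.WriteLine("Total characters: {0}", totalLength);
    }
}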
How to do it
To use a thread-local variable to sum up our textboxes once we have performed
our parallel loop on them, let's open our ParallelMath1 project and make a
few changes:
1. In the MainWindow.xaml file in the designer view, let's add a label control
and set the Content property to Sum:.
2. Now, let's add a textbox control beside it and set the Name property to tbSum
and make the Text property empty.
4. Also, add a class variable below our ConcurrentQueue declaration for a sum
variable we will call total:
long total = 0;
7. Finally, right after the Parallel.For statement, add the following statement so
that we can see the total on the user interface:
tbSum.Text = total.ToString();
That is all the changes we need to make so that we can use our thread-local variable
with the Parallel.For loop to calculate the sum of our textboxes.
Once these changes have been made, build and run the application. Enter numbers in
the textboxes and you should have a screen that looks like the following screenshot:
Now, click on the Calculate button and see what happens. The results should look
like the following screenshot:
As you can see from the example, we now have a sum of all of the boxes once
the parallel loop has processed them. We are able to do this without having to
lock the class-level total variable on every iteration that wants to update it.
Instead, the summing is done once per thread-local value at the end of the
parallel loop.
Now, let's examine what just happened.
How it works
Just like in the previous versions of this project, we take the numbers in the 10
textboxes, multiply each of them successively by the numbers 1 through 10, and put
the result back in its textbox. This time, however, we also take the new results in the
10 textboxes and sum them, and the final total is displayed in the tbSum textbox.
The only real difference in this version is the Parallel.For command. Let's take
a deeper look at it:
Parallel.For<long>(0, 10,
    () => 0,
    (i, loop, subtotal) =>
    {
        int j = numbers[i];
        for (int k = 1; k <= 10; k++)
        {
            j *= k;
        }
        numbers[i] = j;
        subtotal += j;
        return subtotal;
    },
    (finalResult) => Interlocked.Add(ref total, finalResult)
);
The body of the loop is the fourth parameter. It is a delegate, implemented here as a
lambda expression, that receives the iteration index, the loop state, and the thread-local
subtotal, and returns the updated subtotal:
(i, loop, subtotal) =>
{
    int j = numbers[i];
    for (int k = 1; k <= 10; k++)
    {
        j *= k;
    }
    numbers[i] = j;
    subtotal += j;
    return subtotal;
}
Now, let's back up; the first two parameters are the starting and ending indices of
our iteration, 0 and 10. The third parameter is the delegate that initializes the
thread-local variable. It is implemented with a lambda expression:
() => 0
Then, our final parameter to the Parallel.For method is the Action delegate that is
executed once for each thread-local variable after its iterations have completed:
(finalResult) => Interlocked.Add(ref total, finalResult)
We chose to use lambda expressions for the three delegates in this example, instead
of named or anonymous methods, because it makes it easier to see what is going on
and what is being passed where. However, we can use named methods to
achieve the same results.
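For comparison, here is a sketch of what the same call might look like with named methods instead of lambdas (the method names are our own invention, not from the original project):

Parallel.For<long>(0, 10, InitializeSubtotal, MultiplyAndAccumulate, AddToGrandTotal);

// Initializes the thread-local subtotal for each thread.
private long InitializeSubtotal()
{
    return 0;
}

// The loop body: performs the calculation and updates the thread-local subtotal.
private long MultiplyAndAccumulate(int i, ParallelLoopState loop, long subtotal)
{
    int j = numbers[i];
    for (int k = 1; k <= 10; k++)
    {
        j *= k;
    }
    numbers[i] = j;
    return subtotal + j;
}

// Combines each thread-local subtotal into the class-level total.
private void AddToGrandTotal(long finalResult)
{
    Interlocked.Add(ref total, finalResult);
}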
Summary
In this chapter, we covered all aspects of imperative data parallelism. In Chapter 10,
Parallel LINQ (PLINQ), we will cover declarative data parallelism with a discussion
of PLINQ. Data parallelism using the TPL in .NET really comes down to performing
parallel loops using the Parallel.For and Parallel.ForEach methods. These
parallel loops allow us to iterate through a set or collection of data and perform the
same function on each member of the set concurrently.
We learned how to perform a parallel loop on a set of data using Parallel.For
and a collection of data using Parallel.ForEach. We then saw how to stop or
break from a loop when a particular condition was reached; for this we used the
ParallelLoopState object that .NET can generate.
Next, we explored error handling with parallel loops and the AggregateException
object. We learned how to process all of the exceptions that might occur during the
different iterations of the loop without affecting the other iterations.
In the last section, we saw how to use thread-local variables in our loops to have a
thread-safe local copy of a variable and then use the results from all of these local
copies at the end of the loop processing.
In the next chapter, we will take some time and explore the Visual Studio Debugger
and the features it provides for debugging a parallel application that has multiple
threads running at once.