
Fortran Compiler Tutorial Coarray

This tutorial provides an overview of using Coarray Fortran for parallel programming, specifically demonstrating how to calculate the value of Pi using a Monte Carlo method. It covers building and running both sequential and coarray versions of the application, detailing the necessary modifications for parallel execution. The tutorial is designed for users of the Intel Fortran Compiler and is estimated to take 10-15 minutes to complete.

Tutorial: Using Coarray Fortran


Contents
Chapter 1: Tutorial: Coarray Fortran Overview
Introduction to Using Coarray Fortran
Calculating the Value of Pi using a Monte Carlo Method
Sequential Program
Building and Running the Sequential Version
Modifying the Program to Use Coarrays
Building and Running the Coarray Version
Summary of Coarray Fortran
Notices and Disclaimers

Tutorial: Coarray Fortran Overview

NOTE The coarrays feature is only available for ifort. It is not available for ifx.

The coarrays feature of Fortran 2008 provides a Single Program Multiple Data (SPMD) approach to
parallelism, which is integrated into the Fortran language for ease of programming. An application using
coarrays runs the same program, called an image, in parallel, where coarray variables are shared across the
images in a model called Partitioned Global Address Space (PGAS).
This tutorial shows you how to build and run a serial application, and then convert it to run in parallel using
coarrays.

NOTE 32-bit coarrays are deprecated and will be removed in a future release.

About This Tutorial: This tutorial demonstrates writing, building, and running a Fortran application using
coarrays.

Estimated Duration: 10-15 minutes.

Learning Objectives: Building and running a Fortran coarray application.

Introduction to Using Coarray Fortran


The Intel® Fortran Compiler supports parallel programming using coarrays as defined in the Fortran 2008
standard. Because coarrays are part of the Fortran language, they offer one method of using Fortran as a
robust and efficient parallel programming language.
This tutorial demonstrates writing, building, and running a Fortran application using coarrays.
This tutorial is available for Windows* only.

Calculating the Value of Pi using a Monte Carlo Method


We will be using a program that calculates the value of the mathematical constant π (Pi) using a Monte Carlo
method, so named because it uses random numbers. Imagine a square piece of paper two units across (the
actual unit doesn’t matter). On this paper draw a circle whose diameter is two units (radius one unit).


The area of a circle is πr², so with a radius (r) of 1 the circle's area is π. The square's area is 4 (2x2),
so the ratio of the circle's area to the square's area is π/4. But how do we get the area of the circle? We
can pick random points on the paper and count the points that fall within the circle. To make this easier,
we use the center of the square as the origin and generate random values for the X and Y coordinates between
zero and 1, so we are looking at one quarter of the circle and of the square. If X²+Y² (the square of the
distance between the point and the origin) is less than or equal to 1, the point is within the circle. Count
the number of random points that are within the circle and divide that by the total number of points; the
result approximates π/4. For a more detailed explanation of this technique, see
http://www.mathcs.emory.edu/~cheung/Courses/170/Syllabus/07/compute-pi.html
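The estimate can be written out as a short derivation. The symbols N (total number of random points) and N_in (number of points that land inside the quarter circle) are introduced here for illustration and do not appear in the sample source.

```latex
\frac{\text{area of quarter circle}}{\text{area of unit square}}
  = \frac{\pi \cdot 1^2 / 4}{1 \cdot 1} = \frac{\pi}{4},
\qquad
\frac{N_{\text{in}}}{N} \approx \frac{\pi}{4}
\;\Longrightarrow\;
\pi \approx \frac{4\,N_{\text{in}}}{N}
```

As a made-up illustration, if 785,398 of 1,000,000 points fell inside the quarter circle, the estimate would be 4 x 785398 / 1000000 = 3.141592.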

Sequential Program
Open the sample file mcpi_sequential.f90 (the sample file is located in the src folder for command-line
builds). The named constants for integer and real kinds are declared using SELECTED_INT_KIND and
SELECTED_REAL_KIND; the integer kind is chosen so that it can hold large integers. num_trials is declared as
the number of trials. It is set to 600,000,000 in this example. The variable total counts the number of points
that are found within the circle.
The Fortran standard intrinsic RANDOM_NUMBER is used to generate the points for testing. The standard does
not say whether the random sequence is different for each run of the program, so the program calls
RANDOM_SEED with no arguments. In that case, Intel® Fortran uses the time-of-day clock to initialize the
random number generator.
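The declarations described above might look roughly like the following sketch. The kind names K_BIGINT, K_SINGLE and K_DOUBLE and the variables num_trials, total, x and y are taken from the tutorial's code fragments; the arguments passed to SELECTED_INT_KIND and SELECTED_REAL_KIND, and the choice to make num_trials a named constant, are assumptions, since the sample file is not reproduced here.

```fortran
program kinds_sketch
  implicit none
  ! Kind names follow the tutorial; the requested ranges and precisions
  ! below are assumed values, not copied from mcpi_sequential.f90.
  integer, parameter :: K_BIGINT = selected_int_kind(15)   ! at least 15 decimal digits
  integer, parameter :: K_SINGLE = selected_real_kind(6)   ! at least 6 digits of precision
  integer, parameter :: K_DOUBLE = selected_real_kind(15)  ! at least 15 digits of precision

  ! Assumption: num_trials is shown here as a named constant
  integer(K_BIGINT), parameter :: num_trials = 600000000_K_BIGINT
  integer(K_BIGINT) :: total
  real(K_SINGLE) :: x, y

  total = 0_K_BIGINT

  ! With no arguments, the compiler's runtime chooses the seed
  call random_seed()

  ! A single trial: pick a point and test whether it is inside the circle
  call random_number(x)
  call random_number(y)
  if ((x*x)+(y*y) <= 1.0_K_SINGLE) total = total + 1_K_BIGINT

  print '(A,I0,A,I0)', "Trials planned: ", num_trials, &
        ", points inside after one trial: ", total
end program kinds_sketch
```

Requesting kinds by range and precision, rather than hard-coding kind numbers, keeps the declarations portable across compilers.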
The main body of the program is this loop:
! Run the trials. Get a random X and Y and see if the position
! is within a circle of radius 1. If it is, add one to the subtotal
do bigi=1_K_BIGINT,num_trials
    call RANDOM_NUMBER(x)
    call RANDOM_NUMBER(y)
    if ((x*x)+(y*y) <= 1.0_K_SINGLE) total = total + 1_K_BIGINT
end do
At the end of the trials, divide the total by the number of trials and then multiply by four:
! total/num_trials is an approximation of pi/4
print *, "Computed value of pi is",&
    REAL(4_K_BIGINT*total,K_DOUBLE)/REAL(num_trials,K_DOUBLE)

NOTE The REAL intrinsic is used to convert the integers to double precision before dividing.

The program includes code to show the elapsed time for the application.


Building and Running the Sequential Version


In Microsoft Visual Studio*, select Build > Build Solution. This builds the program. Run it by selecting
Debug > Start Without Debugging. The console window contains the following program output:
Computing pi using 600000000 trials sequentially
Computed value of pi is 3.1415794, Relative Error: .422E-05
Elapsed time is 19.5 seconds
When the program is run again, a different computed value is given due to the different random number
sequence:
Computed value of pi is 3.1415488, Relative Error: .139E-04
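The relative error in the first run can be checked by hand. The formula below matches the expression ABS((computed_pi-actual_pi)/actual_pi) used in the program's print statement; the symbols are shorthand introduced here, and the result is approximate because the printed value of pi is rounded.

```latex
\text{relative error}
  = \left| \frac{\pi_{\text{computed}} - \pi}{\pi} \right|
  \approx \frac{\left| 3.1415794 - 3.1415927 \right|}{3.1415927}
  \approx \frac{1.33 \times 10^{-5}}{3.1415927}
  \approx 4.22 \times 10^{-6}
```

This agrees with the displayed value of .422E-05.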

Modifying the Program to Use Coarrays


Coarrays are used to split the trials across multiple copies of the program, which are called images. Each
image has its own local variables, plus its own portion of any coarrays, which are shared variables. A coarray
can be a scalar or an array. A coarray can be thought of as having extra dimensions, referred to as
codimensions. To declare a coarray, either add the CODIMENSION attribute, or specify the cobounds alongside
the variable name. The cobounds are always enclosed in square brackets. Some examples:
real, dimension(100), codimension[*] :: A
integer :: B[3,*]
When specifying cobounds in a declaration, the last cobound must be an asterisk. This indicates that it
depends on the number of images in the application. According to the Fortran standard, you can have up to
15 cobounds (a corank of 15), but the sum of the number of cobounds and array bounds must not exceed
31. As with array bounds, it is possible to have a lower cobound that is not 1, though this is not common.
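The declaration rules above can be tried in a small stand-alone program. This is only a sketch: A and B repeat the tutorial's examples, while the coarray C and the program name are hypothetical additions that illustrate a lower cobound other than 1. It needs a compiler with coarray support enabled.

```fortran
program cobounds_sketch
  implicit none
  real, dimension(100), codimension[*] :: A   ! array coarray with one codimension
  integer :: B[3,*]                           ! scalar coarray with two codimensions
  integer :: C[0:*]                           ! hypothetical: lower cobound 0 instead of 1

  ! Without a coindex, each assignment touches only this image's local part
  A = 0.0
  B = this_image()
  C = this_image()

  ! THIS_IMAGE(coarray) returns this image's cosubscripts for that coarray
  print '(A,I0,A,2(1X,I0),A,I0)', "Image ", this_image(), &
        " has B cosubscripts", this_image(B), &
        " and C cosubscript ", this_image(C)
end program cobounds_sketch
```

On image 1, the cosubscripts of B are 1 and 1, while the cosubscript of C is 0 because its lower cobound is 0.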
Since the work is being split across the images, a coarray is needed to keep track of each image's subtotal of
points within the circle. At the end, the subtotals are added to create a grand total, which is divided by the
number of trials as it is in the sequential version. Reuse the variable total, but make it a coarray: delete
the existing declaration of total and insert the following into the declaration section of the program:
! Declare scalar coarray that will exist on each image
integer(K_BIGINT) :: total[*] ! Per-image subtotal
The important aspect of coarrays is that there is a local part that resides on an individual image, but you can
access the part on other images. To read the value of total on image 3, use the syntax total[3]. To reference
the local copy, the coindex in brackets is omitted. For best performance, minimize touching the storage of
other images.
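As a minimal sketch of the difference between a local reference and a coindexed reference, the program below gives each image a different value and lets image 1 read the value held by the last image. The stored values, the variable last and the program name are made up for illustration; the sync all statement makes sure every image has stored its value before image 1 reads it.

```fortran
program coindex_sketch
  implicit none
  integer :: total[*]   ! one copy of total exists on every image
  integer :: last

  ! No coindex: each image assigns to its own local copy
  total = 10 * this_image()

  ! Make sure every image has stored its value before any remote read
  sync all

  if (this_image() == 1) then
    last = num_images()
    print '(A,I0)', "Local total on image 1: ", total
    ! Coindex in square brackets: read the copy that lives on another image
    print '(A,I0,A,I0)', "total on image ", last, ": ", total[last]
  end if
end program coindex_sketch
```

Only image 1 performs a remote read here, which keeps accesses to the storage of other images to a minimum.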
In a coarray application, each image has its own set of I/O units. The standard input is preconnected only on
image 1. The standard output is preconnected on all images. The standard encourages the implementations
to merge output, but the order is unpredictable. Intel® Fortran supports this merging.
It is typical to have image 1 do any setup and terminal I/O. Change the initial display to show how many
images are doing the work, and verify that the number of trials is evenly divisible by the number of images
(by default, this is the number of cores times threads-per-core). Image 1 does all the timing.
Open the file mcpi_sequential.f90 and save it as mcpi_coarray.f90.
Replace:
print '(A,I0,A)', "Computing pi using ",num_trials," trials sequentially"
! Start timing
call SYSTEM_CLOCK(clock_start)
With:
! Image 1 initialization
if (THIS_IMAGE() == 1) then
    ! Make sure that num_trials is divisible by the number of images
    if (MOD(num_trials,INT(NUM_IMAGES(),K_BIGINT)) /= 0_K_BIGINT) &
        error stop "num_trials not evenly divisible by number of images!"
    print '(A,I0,A,I0,A)', "Computing pi using ",num_trials," trials across ",NUM_IMAGES()," images"
    call SYSTEM_CLOCK(clock_start)
end if
This code does the following:
1. Tests the image index using the intrinsic function THIS_IMAGE. When it is called without arguments, it
returns the index of the invoking image, so the code executes only on image 1.
2. Ensures that the number of trials is evenly divisible by the number of images. The intrinsic function
NUM_IMAGES returns the number of images. The error stop statement is similar to stop except that it forces
all images in a coarray application to exit.
3. Prints the number of trials and the number of images.
4. Starts the timing.
Images other than 1 skip this code and proceed to what comes next. In more complex applications you might
want other images to wait until the initialization is done. When that is desired, insert a sync all statement.
The execution does not continue until all images have reached that statement.
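A sketch of that waiting pattern, assuming a hypothetical coarray named setting that image 1 prepares for everyone else (neither the name nor the value 42 comes from the sample program):

```fortran
program sync_sketch
  implicit none
  integer :: setting[*]   ! hypothetical value prepared by image 1

  ! Only image 1 performs the setup
  if (this_image() == 1) then
    setting = 42
  end if

  ! No image continues past this point until all images have reached it,
  ! so image 1 has finished the setup before anyone reads setting[1]
  sync all

  print '(A,I0,A,I0)', "Image ", this_image(), " reads setting = ", setting[1]
end program sync_sketch
```

Without the sync all statement, another image could read setting[1] before image 1 has assigned it.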
The initialization of total does not need to be changed. This is done on each image's local version.
The main compute loop needs to be changed to split the work. Replace:
do bigi=1_K_BIGINT,num_trials
With:
do bigi=1_K_BIGINT,num_trials/int(NUM_IMAGES(),K_BIGINT)
After the DO loop, insert:

! Wait for everyone
sync all
Sum the image-specific totals, compute, and display the result. Again, this is done only on image 1. Replace:
! total/num_trials is an approximation of pi/4
computed_pi = 4.0_K_DOUBLE*(REAL(total,K_DOUBLE)/REAL(num_trials,K_DOUBLE))
print '(A,G0.8,A,G0.3)', "Computed value of pi is ", computed_pi, &
    ", Relative Error: ",ABS((computed_pi-actual_pi)/actual_pi)
! Show elapsed time
call SYSTEM_CLOCK(clock_end,clock_rate)
print '(A,G0.3,A)', "Elapsed time is ", &
    REAL(clock_end-clock_start)/REAL(clock_rate)," seconds"
With:
! Image 1 end processing
if (this_image() == 1) then
    ! Sum all of the images' subtotals
    do i=2,num_images()
        total = total + total[i]
    end do
    ! total/num_trials is an approximation of pi/4
    computed_pi = 4.0_K_DOUBLE*(REAL(total,K_DOUBLE)/REAL(num_trials,K_DOUBLE))
    print '(A,G0.8,A,G0.3)', "Computed value of pi is ", computed_pi, &
        ", Relative Error: ",ABS((computed_pi-actual_pi)/actual_pi)
    ! Show elapsed time
    call SYSTEM_CLOCK(clock_end,clock_rate)
    print '(A,G0.3,A)', "Elapsed time is ", &
        REAL(clock_end-clock_start)/REAL(clock_rate)," seconds"
end if
The new code does the following:
1. Executes only on image 1.
2. Adds the values from the other images to total. The reference to total without a coindex already holds the
count from image 1. Note the [i] coindex.
3. Computes and displays the result and the elapsed time in the same way as the sequential version.
After this, all of the images exit.

Building and Running the Coarray Version


The Intel® Fortran Compiler requires that the coarray features are enabled by specifying the /Qcoarray
option. In Microsoft Visual Studio*, set the project property Fortran > Language > Enable Coarrays to
For Shared Memory and then click OK.
Use Build > Build Solution to build the application, then Debug > Start Without Debugging to run it. On
a four-core, eight-thread processor you should see:
Computing pi using 600000000 trials across 8 images
Computed value of pi is 3.1416575, Relative Error: .206E-04
Elapsed time is 4.21 seconds
The program can be run with fewer images. Set the project property Fortran > Language > Coarray
Images to 4. (The command line option for this is /Qcoarray-num-images:4.) Build and run the program.
You should see:
Computing pi using 600000000 trials across 4 images
Computed value of pi is 3.1415352, Relative Error: .183E-04
Elapsed time is 5.53 seconds
The elapsed time goes up because the same number of trials is now divided among only four images, running on
the four physical cores, rather than eight.
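The example timings can be compared with the sequential run by computing a speedup. The notation S = T_sequential / T_parallel is a common convention introduced here, not part of the tutorial; the times are the ones shown in the sample outputs and will differ on other machines.

```latex
S_{8\ \text{images}} = \frac{19.5\ \text{s}}{4.21\ \text{s}} \approx 4.6,
\qquad
S_{4\ \text{images}} = \frac{19.5\ \text{s}}{5.53\ \text{s}} \approx 3.5
```

In these sample runs, doubling the number of images from four to eight on a four-core processor shortens the run, but by much less than a factor of two.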

NOTE You can control the number of images through the environment variable:
FOR_COARRAY_NUM_IMAGES.

Summary of Coarray Fortran


You have completed the Using Coarray Fortran tutorial. The source file mcpi_coarray_final.f90 contains a
completed version of the coarray application. This source includes a modification to ensure that the
random number sequence is different on each image: initializations that are based on the time of day may
yield the same seed on more than one image if the clock has the same value when each image reads it. See the
source for more detail.
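One possible way to give each image its own seed is sketched below. It is not the code in mcpi_coarray_final.f90, which is not reproduced here; the mixing constants 37 and 7919 and all variable names are arbitrary choices for illustration. The idea is to combine the clock value with the image index, so that images that read the same clock value still pass different seeds to RANDOM_SEED.

```fortran
program seed_sketch
  implicit none
  integer :: n, i, clock
  integer, allocatable :: seed(:)
  real :: x

  ! Ask how many integers the generator's seed contains
  call random_seed(size=n)
  allocate(seed(n))

  call system_clock(clock)

  ! Mix the image index into every seed element; the constants are
  ! arbitrary and kept small to avoid integer overflow
  seed = [(mod(clock, 100000) + 37*i + 7919*this_image(), i = 1, n)]
  call random_seed(put=seed)

  call random_number(x)
  print '(A,I0,A,F8.6)', "Image ", this_image(), " first random number: ", x
end program seed_sketch
```

Different seeds make identical sequences on two images unlikely, though how a seed maps to a sequence depends on the compiler's generator.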

Notices and Disclaimers


Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure.
Your costs and results may vary.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its
subsidiaries. Other names and brands may be claimed as the property of others.
Copies of documents which have an order number and are referenced in this document, or other Intel
literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm


Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.
Notice revision #20201201

Intel, the Intel logo, Intel Atom, Intel Core, Intel Xeon, Intel Xeon Phi, Pentium, and VTune are trademarks of
Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Portions Copyright © 2001, Hewlett-Packard Development Company, L.P.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation
in the United States and/or other countries.
© Intel Corporation.
This software and the related documents are Intel copyrighted materials, and your use of them is governed
by the express license under which they were provided to you (License). Unless the License provides
otherwise, you may not use, modify, copy, publish, distribute, disclose or transmit this software or the
related documents without Intel's prior written permission.
This software and the related documents are provided as is, with no express or implied warranties, other
than those that are expressly stated in the License.
