Quick-Reference Guide To Optimization With Intel® Compilers Version 12
/O1 (Windows*), -O1 (Linux*, Mac OS* X)
/O3 (Windows*), -O3 (Linux*, Mac OS* X)
Parallel Performance
Options that use OpenMP* or auto-parallelization are available for both Intel and non-Intel microprocessors, but these options may result in additional optimizations on Intel microprocessors that do not occur on non-Intel microprocessors.

/Qopenmp (Windows*), -openmp (Linux*, Mac OS* X)
    Causes multi-threaded code to be generated when OpenMP directives are present. May require an increased stack size.

/Qparallel (Windows*), -parallel (Linux*, Mac OS* X)
    The auto-parallelizer detects simply structured loops that may be safely executed in parallel, including loops implied by Intel Cilk Plus array notation, and automatically generates multi-threaded code for these loops.

/Qpar-report[:n] (Windows*), -par-report[n] (Linux*, Mac OS* X)
    Controls the auto-parallelizer's diagnostic level. n specifies the level of detail, from 0 (no report) to 3 (maximum detail). Default is 0.

/Qpar-threshold[:n] (Windows*), -par-threshold[n] (Linux*, Mac OS* X)
    Sets a threshold for the auto-parallelization of loops based on the likelihood of a performance benefit. n ranges from 0 to 100; the default is 100.
    0: Parallelize loops regardless of computation work volume.
    100: Parallelize loops only if a performance benefit is highly likely.
    Must be used in conjunction with /Qparallel (-parallel).

/Qguide[:n] (Windows*), -guide[=n] (Linux*, Mac OS* X)
    Guided Auto-Parallelization. Causes the compiler to suggest ways to help loops to vectorize or auto-parallelize, without producing any objects or executables. Auto-parallelization advice is given only if the option -parallel (Linux or Mac OS X) or /Qparallel (Windows) is also specified. n is an optional value from 1 to 4 specifying increasing levels of guidance to be provided, level 4 being the most advanced and aggressive. If n is omitted, the default is 4.

/Qopt-matmul[-] (Windows*), -[no-]opt-matmul (Linux*, Mac OS* X)
    Enables [disables] a compiler-generated Matrix Multiply (matmul) library call by identifying matrix multiplication loop nests, if any, and replacing them with a matmul library call for improved performance. This option is enabled by default if options /O3 (-O3) and /Qparallel (-parallel) are specified. This option has no effect unless option /O2 (-O2) or higher is set.

/Qcilk-serialize (Windows*), -cilk-serialize (Linux*, Mac OS* X)
    Causes serialization of code containing Intel Cilk Plus language extensions. This means that the compiler will run such code as a serial C/C++ program. This option forces inclusion of a special header file (cilk_stubs.h) that includes preprocessor macros that make the Intel Cilk Plus keywords invisible to the compiler. This serialization and all Intel Cilk Plus keywords are fully described in the "Using Intel Cilk Plus" section of the user and reference guide.

/Qcoarray:shared (Windows*), -coarray=shared (Linux*, Mac OS* X)
    Enables coarrays from the Fortran 2008 standard on shared memory systems (Fortran only). See the compiler reference guide for more options and detail. This option is available for both Intel and non-Intel microprocessors but it may result in more optimizations for Intel microprocessors than for non-Intel microprocessors.
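As a minimal sketch of the kind of code these options act on, consider the two loops below. The function names `sum_squares` and `scale_add` are illustrative assumptions, not taken from the guide. The first loop carries an OpenMP directive, which takes effect only when /Qopenmp (-openmp) is specified. The second is a simply structured loop with independent iterations, the sort of loop the auto-parallelizer may handle under /Qparallel (-parallel); whether it is actually parallelized depends on the compiler's analysis and on the threshold setting.

```c
#include <stddef.h>

/* Sum of squares using an OpenMP reduction.
   Without /Qopenmp (-openmp) the pragma is ignored and the loop
   runs serially. */
double sum_squares(const double *x, size_t n)
{
    double total = 0.0;
    long i; /* signed index, accepted by all OpenMP versions */

#pragma omp parallel for reduction(+:total)
    for (i = 0; i < (long)n; i++) {
        total += x[i] * x[i];
    }
    return total;
}

/* A simply structured loop: fixed trip count, no function calls,
   and each iteration writes a distinct element. This is a candidate
   for auto-parallelization, provided the compiler can establish
   that out does not overlap a or b. */
void scale_add(double *out, const double *a, const double *b,
               double s, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = a[i] + s * b[i];
    }
}
```

Because a parallel reduction may add the terms in a different order, floating-point results can differ slightly from the serial sum for general inputs.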
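/QxHOST (Windows*), -xhost (Linux*, Mac OS* X)
/Qaxtarget (Windows*), -axtarget (Linux*, Mac OS* X)
/Qimf-precision:name (Windows*), -fimf-precision=name (Linux*, Mac OS* X)
/Qimf-arch-consistency:true (Windows*), -fimf-arch-consistency=true (Linux*, Mac OS* X)
/Qprec-div[-] (Windows*), -[no-]prec-div (Linux*, Mac OS* X)
/Qprec-sqrt[-] (Windows*), -[no-]prec-sqrt (Linux*, Mac OS* X)
/Qopt-prefetch:n (Windows*), -opt-prefetch=n (Linux*, Mac OS* X)
/Qopt-block-factor:n (Windows*)
/Qopt-streaming-stores:mode (Windows*)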
Enables [disables] pointer disambiguation with the restrict keyword. Off by default. (C/C++ only)

Assumes no aliasing in the program. Off by default.

Assumes no aliasing within functions. Off by default.

Implies function arguments may be aliased [are not aliased]. On by default. (C/C++ only) -fargument-noalias often helps the compiler to vectorize loops involving function array arguments.

/Qopt-class-analysis[-] (Windows*)
    C++ class hierarchy information is used to analyze and resolve C++ virtual function calls at compile time. If a C++ application contains nonstandard C++ constructs, such as pointer down-casting, it may result in different behavior. Default is off, but it is turned on by default with the /Qipo (Windows) or -ipo (Linux and Mac OS X) compiler option, enabling improved C++ optimization. (C++ only)

-f[no-]exceptions (Linux*, Mac OS* X)
    -fexceptions, default for C++, enables exception handling table generation.
    -fno-exceptions, default for C or Fortran, may result in smaller code. For C++, it causes exception specifications to be parsed but ignored. Any use of exception handling constructs (such as try blocks and throw statements) will produce an error if any function in the call chain has been compiled with -fno-exceptions.
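The pointer disambiguation described above can be sketched with the C99 `restrict` qualifier. The function name `add_arrays` is a hypothetical example, and the sketch assumes the restrict option is enabled (or that the compiler is otherwise in a mode that recognizes the keyword). The qualifier is a promise from the programmer, not something the compiler verifies.

```c
#include <stddef.h>

/* restrict tells the compiler that dst and src never refer to
   overlapping memory for the duration of the call. With that
   guarantee, stores to dst cannot change values later read from
   src, so the loop can be reordered or vectorized without
   run-time overlap checks. */
void add_arrays(float *restrict dst, const float *restrict src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        dst[i] += src[i];
    }
}
```

Calling such a function with overlapping arrays is undefined behavior, so the qualifier should be added only where the no-overlap guarantee really holds.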
/Qvec-threshold:n (Windows*), -vec-threshold[n] (Linux*, Mac OS* X)
    Sets a threshold n for the vectorization of loops based on the probability of performance gain. 0 <= n <= 100, default n=100.
    0: Vectorize loops regardless of amount of computational work.
    100: Vectorize loops only if a performance benefit is almost certain.

/Qvec-report:n (Windows*), -vec-report[n] (Linux*, Mac OS* X)
    Controls the vectorizer's diagnostic levels. n specifies the level of detail, from 0 (no report) to 3 (maximum detail). Default is 0.
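To make the vectorization options more concrete, here is a small sketch contrasting a loop that is a typical vectorization candidate with one that is not. The function names `saxpy` and `prefix_sum` are illustrative assumptions. With a nonzero report level, a vectorization report would be expected to distinguish the two cases, although the exact messages depend on the compiler version and are not reproduced here.

```c
#include <stddef.h>

/* Independent iterations: y[i] depends only on x[i] and the old
   y[i], so several elements can be processed per SIMD instruction
   (assuming x and y do not overlap). */
void saxpy(float a, const float *x, float *y, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Loop-carried dependence: iteration i reads x[i - 1], which was
   written by iteration i - 1. A straightforward vectorizer cannot
   execute these iterations side by side. */
void prefix_sum(float *x, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++) {
        x[i] += x[i - 1];
    }
}
```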
Debug Options
/Zi (Windows*), -g (Linux*, Mac OS* X)
    Generates debug information for use with any of the common platform debuggers. Turns off /O2 (-O2) and makes /Od (-O0) the default unless /O2 (-O2) (or another O option) is specified.

/debug[:keyword] (Windows*), -debug [keyword] (Linux*, Mac OS* X)
    keyword is one of:
    none: No debugging information is generated (default).
    full (or all): Produces debugging information for full symbolic debugging of unoptimized code. Same as -g (/Zi), or as -debug (/debug) with no keyword.
    extended: Produces additional information for improved symbolic debugging of optimized code (Linux and Mac OS X only). Debug symbols will generally increase the size of object modules and may slightly degrade performance of optimized code. Also implies -debug full.
    parallel: Generates additional symbols and instrumentation for debugging threaded code. Does not imply -debug full.
Optimization Notice

Intel Compiler includes compiler options that optimize for instruction sets that are available in both Intel and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel Compiler are reserved for Intel microprocessors. For a detailed description of these compiler options, including the instruction sets they implicate, please refer to "Intel Compiler User and Reference Guides > Compiler Options." Many library routines that are part of Intel Compiler are more highly optimized for Intel microprocessors than for other microprocessors. While the compilers and libraries in Intel Compiler offer optimizations for both Intel and Intel-compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors.

While the paragraph above describes the basic optimization approach for Intel Compiler, with respect to Intel's compilers and associated libraries as a whole, Intel Compiler may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel Streaming SIMD Extensions 2 (Intel SSE2), Intel Streaming SIMD Extensions 3 (Intel SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.

Intel recommends that you evaluate other compilers to determine which best meet your requirements.
For product and purchase information, visit the Intel Software Development Products site at: www.intel.com/software/products/compilers
Intel, the Intel logo, Pentium, Intel VTune, Intel Core and Intel Cilk are trademarks of Intel Corporation in the U.S. and other countries. * Other names and brands may be claimed as the property of others. © 2010, Intel Corporation. All rights reserved.