
Optimization Tips

Prepared for CppCon 2014

Andrei Alexandrescu, Ph.D.


Research Scientist, Facebook
[email protected]

© 2014- Andrei Alexandrescu. Do not redistribute.

Beware Compiler’s Most Vexing Inlining



Inlining

• Interacts with all other optimizations


• Final code shape/size hard to estimate
• Cost function intractable
• App costs != benchmark estimates


From Regression to Win In 2 Flags

--max-inline-insns-auto=100
--early-inlining-insns=200



“Dark Matter” Code: cdtors (ctors/dtors)

• Most affected by inlining


• “Motherhood and Apple Pie”
• Implicitly called
• Often implicitly generated
• Often trivial
◦ What’s a few stores between friends?
• Deadly effects at scale
◦ Beyond traditional advice!


I-Cache

• Spills seldom occur in microbenchmarks


• Issue in large applications
◦ Exactly where it hurts most
◦ . . . and harder to trace to causes
• Hard for compiler to assess impact
• (Don’t want to lose in microbenchmarks
either!)



Tip #1: Beware Inline Destructors

• Called everywhere, implicitly


• Not reflected in source code size
◦ . . . transitively
• Often generated automatically

• Watch destructor size carefully


One Destructor Inlined


Controlling inlining

// GCC
#define ALWAYS_INLINE inline __attribute__((__always_inline__))
#define NEVER_INLINE __attribute__((__noinline__))

// VC++:
#define ALWAYS_INLINE __forceinline
#define NEVER_INLINE __declspec(noinline)
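
A hedged usage sketch (the Connection type is my example, assuming the GCC/Clang definition above): declare the heavyweight destructor NEVER_INLINE so its body is emitted once rather than at every call site.

#include <string>
#include <vector>

#define NEVER_INLINE __attribute__((__noinline__)) // as above, GCC/Clang

struct Connection {
    std::vector<std::string> buffers;
    std::string peer;
    NEVER_INLINE ~Connection() {} // teardown stays out of callers' I-cache
};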


Defang NEVER_INLINE
Defang ALWAYS_INLINE
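
The code behind these two slides isn't reproduced here; one way to make the hints defangable for measurement (my sketch, with a made-up DISABLE_INLINE_HINTS switch, not necessarily what the deck shows) is:

// GCC/Clang
#ifndef DISABLE_INLINE_HINTS
#define ALWAYS_INLINE inline __attribute__((__always_inline__))
#define NEVER_INLINE __attribute__((__noinline__))
#else
#define ALWAYS_INLINE inline // defanged: an ordinary hint
#define NEVER_INLINE         // defanged: let the compiler decide
#endif

Building with -DDISABLE_INLINE_HINTS then shows what the forced hints are actually buying (or costing).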

Case Study: Custom shared_ptr

• The go-to solution for reference counting


• Optimized for a blend of needs, each with a
cost:
◦ Compulsive atomic refcounting
◦ Custom deleters
◦ Weak Pointer Support
• No support for intrusive reference counting
◦ Remember: first cache line is where it’s at
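
For contrast, an intrusive scheme (a sketch of mine, not code from the deck) keeps the count inside the object, in the first cache line its users touch anyway:

struct RefCounted {
    unsigned refs_ = 1; // first member: shares the object's first cache line
};

template <class T> // T derives from RefCounted; call with the concrete type
void incref(T& obj) { ++obj.refs_; }

template <class T>
void decref(T* obj) {
    if (obj && --obj->refs_ == 0) delete obj;
}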



Atomics Matter

• Atomic inc/dec: 2.5x–5x slower


• 40 years of optimizing ++/-- ripples
• 4 years of optimizing atomic inc/dec ripples
• Post inlining of course


But. . . But. . . Unwitting Sharing?

• Store the thread id at the first access to the smart ptr


◦ Debug mode only
• Compare it with new access
• assert on mismatch
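
A minimal sketch of that debug check (my code; the deck shows no implementation here):

#include <cassert>
#include <thread>

class ThreadChecker {
#ifndef NDEBUG
    mutable std::thread::id owner_{}; // empty until the first access
public:
    void check() const {
        // Racy by design, but good enough for a debug-only assert.
        if (owner_ == std::thread::id{}) owner_ = std::this_thread::get_id();
        assert(owner_ == std::this_thread::get_id()
               && "smart pointer touched from a second thread");
    }
#else
public:
    void check() const {} // release builds: zero overhead
#endif
};

Embed a ThreadChecker in the smart pointer and call check() in every member.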



Classic Implementation
• Let’s assume non-intrusive for now

template <class T>
class SingleThreadPtr {
    T* p_;
    unsigned* c_;
public:
    SingleThreadPtr() : p_(nullptr), c_(nullptr) {
    }
    SingleThreadPtr(T* p)
        : p_(p)
        , c_(p ? new unsigned(1) : nullptr) {
    }
    SingleThreadPtr(const SingleThreadPtr& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        if (c_) ++*c_;
    }
    ...


Classic Implementation (cont’d)

    SingleThreadPtr(SingleThreadPtr&& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        rhs.p_ = nullptr;
        rhs.c_ = nullptr;
    }
    ~SingleThreadPtr() {
        if (c_ && --*c_ == 0) {
            delete p_;
            delete c_;
        }
    }



Herb’s Talk “Atomic Weapons”

• Focus on MT
• Use atomic<unsigned>* for c_
• Use fetch_add(1,memory_order_relaxed)
for ++
• fetch_sub(1,memory_order_acq_rel) for --
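
A sketch of those two operations on the classic layout, with std::atomic<unsigned> as the counter (my transcription of the recipe; MultiThreadPtr is my name, not the deck's):

#include <atomic>

template <class T>
class MultiThreadPtr {
    T* p_;
    std::atomic<unsigned>* c_;
public:
    explicit MultiThreadPtr(T* p)
        : p_(p), c_(p ? new std::atomic<unsigned>(1) : nullptr) {}
    MultiThreadPtr(const MultiThreadPtr& rhs) : p_(rhs.p_), c_(rhs.c_) {
        // Increment needs atomicity only, no ordering: relaxed suffices.
        if (c_) c_->fetch_add(1, std::memory_order_relaxed);
    }
    ~MultiThreadPtr() {
        // Decrement orders this owner's writes before the deleter's reads.
        if (c_ && c_->fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete p_;
            delete c_;
        }
    }
};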


Task

Make this faster



(Figure omitted. Source: “Down for the Count?”, R. Shahriyar et al.)

Observation

• Many refcounts are 0 or 1


• C++ legacy code in particular!
◦ People avoided auto_ptr
◦ tr1::shared_ptr closest portable
alternative
• Some designs use shared_ptr instead of
unique_ptr for future flexibility (rightly or
wrongly)



Tip #2: Lazy Refcount Allocation

template <class T>
class SingleThreadPtr {
    T* p_;
    mutable unsigned* c_;
public:
    SingleThreadPtr() : p_(nullptr), c_(nullptr) {}
    SingleThreadPtr(T* p) : p_(p), c_(nullptr) {}
    SingleThreadPtr(const SingleThreadPtr& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        if (!p_) return;
        if (!c_) {
            // First copy: allocate the count lazily, already at 2 owners.
            c_ = rhs.c_ = new unsigned(2);
        } else {
            ++*c_;
        }
    }
    ...

Tip #2 (cont’d)

    ...
    SingleThreadPtr(SingleThreadPtr&& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        rhs.p_ = nullptr;
        //rhs.c_ = nullptr; // UNNEEDED
    }
    ~SingleThreadPtr() {
        if (!p_) return;
        if (!c_) {
        soSueMe: delete p_;
        } else if (--*c_ == 0) {
            delete c_;
            goto soSueMe;
        }
    }



Tip #2 (alternative)

    ...
    SingleThreadPtr(SingleThreadPtr&& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        rhs.p_ = nullptr;
        rhs.c_ = nullptr; // NEEDED
    }
    ~SingleThreadPtr() {
        if (!c_) {
        soSueMe: delete p_;
        } else if (--*c_ == 0) {
            delete c_;
            goto soSueMe;
        }
    }

• Folds the null test into the delete call (delete on a null pointer is a no-op)


Performance Dynamics

• One ref: p_ && (!c_ || *c_ == 1)


• Many refs: p_ && c_ && *c_ > 1
• No deallocation of c_ going down
◦ Avoid thrashing on transitions 1 ↔ 2
• We’re not above goto
◦ Dtor still a tad larger
• Ctors smaller, use zero-init
• Can control #copies better than #creations



Tip #3: Skip Last Decrement

template <class T>
class SingleThreadPtr {
    ...
    ~SingleThreadPtr() {
        if (!p_) return;
        if (!c_) {
        soSueMe: delete p_;
        } else if (*c_ == 1) {
            delete c_;
            goto soSueMe;
        } else {
            --*c_;
        }
    }
};


Motivation

• Most objects have low refcounts


• The last decrement is a large share of all
decrements
• Avoid dirtying memory on moribund objects
• Replace the interlocked decrement with an
atomic read
◦ On x86, aligned reads are atomic anyway!
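
Continuing the MultiThreadPtr sketch from earlier (still my code, not the deck's), the destructor reads first and skips the read-modify-write when this is the last owner:

    ~MultiThreadPtr() {
        if (!c_) return;
        // Sole owner: a plain atomic load, no interlocked decrement,
        // no dirtied cache line.
        if (c_->load(std::memory_order_acquire) == 1) {
            delete p_;
            delete c_;
        } else if (c_->fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete p_;
            delete c_;
        }
    }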



Performance Dynamics

• Dtor got a tad larger


• Competition with delete
◦ If expensive, one decref won’t matter
◦ See coming Tip
• May help deleting old unused objects
◦ One less dirty page
• Generally worth the extra test
• YMMV


Tip #4: Prefer Zero of All

• Zero is “special” to the CPU


• Special assignment
• Special comparisons
• E.g. in an enum, make 0 the most frequent
value



Tip #4: Prefer Zero of All

    ...
    SingleThreadPtr(const SingleThreadPtr& rhs)
        : p_(rhs.p_)
        , c_(rhs.c_) {
        if (!p_) return;
        if (!c_) {
            // The count now stores (#owners - 1), so the hot checks
            // compare against zero.
            c_ = rhs.c_ = new unsigned(1);
        } else {
            ++*c_;
        }
    }
    ...
...


Tip #4: Prefer Zero of All

    ...
    ~SingleThreadPtr() {
        if (!p_) return;
        if (!c_) {
        soSueMe: delete p_;
        } else if (*c_ == 0) { // no owners besides us: we are the last
            delete c_;
            goto soSueMe;
        } else {
            --*c_;
        }
    }
    ...
...



Performance Dynamics

• Code is not faster!


◦ Test is 1 cycle or less either way
• Code is smaller
• Most often inc/decref inlined
• Effect on I-Cache may become noticeable

• Weird sub-tip: make default state all zeros


◦ http://goo.gl/WZH0BS


True story: > 0.5%

enum class A { foo, bar };
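
Applying the tip to that enum might look like this (my illustration; which enumerator was actually the hot one isn't stated in the deck):

// Suppose bar is what the hot path tests for: give it the value 0.
enum class A { bar, foo }; // bar == 0

bool isBar(A a) { return a == A::bar; } // a comparison against zero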



Tip #5: Use Dedicated Allocators

• No generic allocator handles small allocs well


• Keep all refcounts together
• Heap with 1 control bit per counter
◦ Only 3.125% size overhead for 32-bit counters (1 bit per 32)
◦ Cache-friendly control bit
• Alternative: freelists
◦ No per-allocation overhead
◦ Odd cache friendliness patterns
◦ Require pointer-sized count
representation
• Best: intrusive
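
A minimal freelist sketch for the counters (my code; note how a recycled slot must be able to hold a next pointer, hence the pointer-sized count representation mentioned above; not thread-safe, matching SingleThreadPtr):

class CounterFreelist {
    union Node { Node* next; unsigned count; };
    Node* head_ = nullptr;
public:
    unsigned* allocate() {
        if (head_) {               // hot path: pop a recycled slot
            Node* n = head_;
            head_ = n->next;
            n->count = 1;
            return &n->count;
        }
        Node* n = new Node;        // cold path: grab a fresh node
        n->count = 1;
        return &n->count;
    }
    void deallocate(unsigned* c) { // push the slot back; never hits free()
        Node* n = reinterpret_cast<Node*>(c);
        n->next = head_;
        head_ = n;
    }
};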


Tip #6: Use Smaller Counters

• Vast majority of objects: < 16 refs


• Prefer 16- or 8-bit counters
• Saturate them (with hysteresis)
• On saturation: leak!
◦ Such objects are long-lived anyway
◦ You may have cycles anyway
◦ Log a leakage report on exit

• Intrusive: just use whatever bits available
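
A sketch of a saturating 8-bit counter (mine, not from the deck): once it pegs at the maximum the object is treated as immortal and deliberately leaked.

#include <cstdint>

inline void incref(std::uint8_t& c) {
    if (c != UINT8_MAX) ++c;          // saturate instead of wrapping
}

inline bool decref(std::uint8_t& c) { // true => caller deletes the object
    if (c == UINT8_MAX) return false; // pegged: never counts back down; leak on purpose
    return --c == 0;
}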



Summary


To Paraphrase John Lennon

♪ You may say I am special

But I’m not the only one. . . ♪

