Want Speed, Pass by Value

An archive of David Abrahams' excellent (and completely lost) "Want Speed, Pass by Value" paper on C++ RVO optimizations.

Uploaded by

Computer Guru
C++Next
Want Speed? Pass by Value.
This entry is part of a series, RValue References: Moving Forward

Be honest: how does the following code make you feel?
    std::vector<std::string> get_names();

    std::vector<std::string> const names = get_names();
Frankly, even though I should know better, it makes me nervous. In principle, when get_names() returns, we have to copy a vector of strings. Then, we need to copy it again when we initialize names, and we need to destroy the first copy. If there are N strings in the vector, each copy could require as many as N+1 memory allocations and a whole slew of cache-unfriendly data accesses as the string contents are copied.
Rather than confront that sort of anxiety, I've often fallen back on pass-by-reference to avoid needless copies:
    void get_names(std::vector<std::string>& out_param);

    std::vector<std::string> names;
    get_names(names);
Unfortunately, this approach is far from ideal.
- The code grew by 150%.
- We've had to drop const-ness because we're mutating names.
- As functional programmers like to remind us, mutation makes code more complex to reason about by undermining referential transparency and equational reasoning.
- We no longer have strict value semantics [1] for names.
But is it really necessary to mess up our code in this way to gain efficiency? Fortunately, the answer turns out to be no (and especially not if you are using C++0x). This article is the first in a series that explores rvalues and their implications for efficient value semantics in C++.

RValues
Rvalues are expressions that create anonymous temporary objects. The name rvalue refers to the fact that an rvalue expression of builtin type can only appear on the right-hand side of an assignment. Unlike lvalues, which, when non-const, can always be used on the left-hand side of an assignment, rvalue expressions yield objects without any persistent identity to assign into. [2]
The important thing about anonymous temporaries for our purposes, though, is that they can only be used once in an expression. How could you possibly refer to such an object a second time? It doesn't have a name (thus, "anonymous") and after the full expression is evaluated, the object is destroyed (thus, "temporary")!
Once you know you are copying from an rvalue, then, it should be possible to "steal" the expensive-to-copy resources from the source object and use them in the target object without anyone noticing. In this case that would mean transferring ownership of the source vector's dynamically allocated array of strings to the target vector. If we could somehow get the compiler to execute that "move" operation for us, it would be cheap, almost free, to initialize names from a vector returned by value.
That would take care of the second expensive copy, but what about the first? When get_names returns, in principle, it has to copy the function's return value from the inside of the function to the outside. Well, it turns out that return values have the same property as anonymous temporaries: they are about to be destroyed, and won't be used again. So, we could eliminate the first expensive copy in the same way, transferring the resources from the return value on the inside of the function to the anonymous temporary seen by the caller.
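The kind of "stealing" described here can be imitated by hand with std::vector::swap, which trades the two vectors' element arrays instead of copying the strings. What follows is only a minimal sketch of that idea, not code from this article; take_contents and demo_take_contents are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): transfers the contents of
// `source` into a new vector without copying any of the strings.
std::vector<std::string> take_contents(std::vector<std::string>& source)
{
    std::vector<std::string> target;  // empty, so cheap to construct
    target.swap(source);              // trade element arrays; no string is copied
    return target;                    // `source` is left holding the empty state
}

// Usage sketch: the element array changes owner instead of being duplicated.
void demo_take_contents()
{
    std::vector<std::string> names;
    names.push_back("Howard");
    names.push_back("Andrei");
    std::string const* storage = &names[0];  // address of the first element

    std::vector<std::string> stolen = take_contents(names);
    assert(names.empty());          // the source gave up its contents
    assert(&stolen[0] == storage);  // same array, new owner
}
```

The final assertion relies on the return from take_contents being elided or performed as a move, which is what a C++11 or later compiler does; a compiler that copied the returned vector would place the strings in a fresh array.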

Copy Elision and the RVO
The reason I kept writing above that copies were made "in principle" is that the compiler is actually allowed to perform some optimizations based on the same principles we've just discussed. This class of optimizations is known formally as copy elision. For example, in the Return Value Optimization (RVO), the calling function allocates space for the return value on its stack, and passes the address of that memory to the callee. The callee can then construct a return value directly into that space, which eliminates the need to copy from inside to outside. The copy is simply "elided," or "edited out," by the compiler. So in code like the following, no copies are required:
    std::vector<std::string> names = get_names();
Also, although the compiler is normally required to make a copy when a function parameter is passed by value (so modifications to the parameter inside the function can't affect the caller), it is allowed to elide the copy, and simply use the source object itself, when the source is an rvalue.
    std::vector<std::string>
    sorted(std::vector<std::string> names)
    {
        std::sort(names.begin(), names.end());
        return names;
    }

    // names is an lvalue; a copy is required so we don't modify names
    std::vector<std::string> sorted_names1 = sorted(names);

    // get_names() is an rvalue expression; we can omit the copy!
    std::vector<std::string> sorted_names2 = sorted(get_names());
This is pretty remarkable. In principle, in the last line above, the compiler can eliminate all the worrisome copies, making sorted_names2 the same object as the one created in get_names(). In practice, though, the principle won't take us quite that far, as I'll explain later.
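One way to watch the difference between lvalue and rvalue arguments is to count copy-constructor calls. The sketch below is an illustration under stated assumptions, not part of the original article: Names is a hypothetical wrapper around the vector, and count_names, copies_for_lvalue, and copies_for_rvalue are made-up helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical wrapper (not from the article) that counts its own copies.
// Declaring the copy constructor suppresses the implicit move constructor,
// so every transfer that is not elided shows up in the counter.
struct Names
{
    static int copies;
    std::vector<std::string> items;

    explicit Names(std::size_t n) : items(n, "name") {}
    Names(Names const& other) : items(other.items) { ++copies; }
};
int Names::copies = 0;

// Returns an unnamed temporary, so the call expression is an rvalue.
Names get_names() { return Names(3); }

// Takes its argument by value, as sorted() does in the article.
std::size_t count_names(Names names) { return names.items.size(); }

// Number of copies made by one call with an lvalue argument.
int copies_for_lvalue()
{
    Names names(3);
    Names::copies = 0;
    count_names(names);               // the caller's object must be preserved
    assert(names.items.size() == 3);  // and indeed it is untouched
    return Names::copies;
}

// Number of copies made by one call with an rvalue argument.
int copies_for_rvalue()
{
    Names::copies = 0;
    count_names(get_names());         // the temporary can become the parameter
    return Names::copies;
}
```

The lvalue call always costs exactly one copy. The rvalue call should cost none on any compiler that performs the elision; from C++17 onward this particular elision is mandatory, while earlier standards only permit it.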

Implications
Although copy elision is never required by the standard, recent versions of every compiler I've tested do perform these optimizations today. But even if you don't feel comfortable returning heavyweight objects by value, copy elision should still change the way you write code.
Consider this cousin of our original sorted() function, which takes names by const reference and makes an explicit copy:
    std::vector<std::string>
    sorted2(std::vector<std::string> const& names) // names passed by reference
    {
        std::vector<std::string> r(names);         // and explicitly copied
        std::sort(r.begin(), r.end());
        return r;
    }
Although sorted and sorted2 seem at first to be identical, there could be a huge performance difference if a compiler does copy elision. Even if the actual argument to sorted2 is an rvalue, the source of the copy, names, is an lvalue [3], so the copy can't be optimized away. In a sense, copy elision is a victim of the separate compilation model: inside the body of sorted2, there's no information about whether the actual argument to the function is an rvalue; outside, at the call site, there's no indication that a copy of the argument will eventually be made.
That realization leads us directly to this guideline:
Guideline: Don't copy your function arguments. Instead, pass them by value and let the compiler do the copying.
At worst, if your compiler doesn't elide copies, performance will be no worse. At best, you'll see an enormous performance boost.
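To make the guideline concrete, here is a hedged sketch that feeds the same rvalue to a by-value function and to a by-reference function that copies internally, counting copies in each case. Names, first_by_value, and first_by_reference are hypothetical names introduced for this illustration; they do not appear in the original article.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical copy-counting wrapper (not from the article). It has no move
// constructor, so any transfer that is not elided is counted as a copy.
struct Names
{
    static int copies;
    std::vector<std::string> items;

    explicit Names(std::vector<std::string> const& v) : items(v) {}
    Names(Names const& other) : items(other.items) { ++copies; }
};
int Names::copies = 0;

Names get_names()
{
    std::vector<std::string> v;
    v.push_back("Howard");
    v.push_back("Andrei");
    v.push_back("Dave");
    return Names(v);  // unnamed temporary
}

// Follows the guideline: the parameter is the function's private copy.
std::string first_by_value(Names names)
{
    assert(!names.items.empty());
    std::sort(names.items.begin(), names.items.end());
    return names.items.front();
}

// Ignores the guideline: copies inside the body, where `names` is an lvalue.
std::string first_by_reference(Names const& names)
{
    Names r(names);  // always a real copy, even for rvalue arguments
    std::sort(r.items.begin(), r.items.end());
    return r.items.front();
}
```

Calling first_by_value(get_names()) should leave Names::copies at zero when the compiler elides the copy into the parameter (required from C++17, merely permitted before), whereas first_by_reference(get_names()) always pays for one copy.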
One place you can apply this guideline immediately is in assignment operators. The canonical, easy-to-write, always-correct, strong-guarantee, copy-and-swap assignment operator is often seen written this way:
    T& T::operator=(T const& x) // x is a reference to the source
    {
        T tmp(x);          // copy construction of tmp does the hard work
        swap(*this, tmp);  // trade our resources for tmp's
        return *this;      // our (old) resources get destroyed with tmp
    }
but in light of copy elision, that formulation is glaringly inefficient! It's now "obvious" that the correct way to write a copy-and-swap assignment is:
    T& T::operator=(T x)   // x is a copy of the source; hard work already done
    {
        swap(*this, x);    // trade our resources for x's
        return *this;      // our (old) resources get destroyed with x
    }
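For readers who want to see the by-value assignment operator in a complete class, here is a minimal sketch. IntBuffer, its copies counter, and make_buffer are hypothetical names invented for this example; the class is deliberately tiny and is only meant to show the shape of the idiom.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class (not from the article).
class IntBuffer
{
public:
    static int copies;  // counts copy-constructor calls, for illustration only

    explicit IntBuffer(std::size_t n = 0, int fill = 0)
        : size_(n), data_(n ? new int[n] : nullptr)
    {
        std::fill(data_, data_ + n, fill);
    }

    IntBuffer(IntBuffer const& other)
        : size_(other.size_), data_(other.size_ ? new int[other.size_] : nullptr)
    {
        std::copy(other.data_, other.data_ + other.size_, data_);
        ++copies;
    }

    ~IntBuffer() { delete[] data_; }

    // Copy-and-swap with the argument taken by value: the copy (if one is
    // needed at all) has already been made by the time the body runs.
    IntBuffer& operator=(IntBuffer x)
    {
        swap(*this, x);  // trade our resources for x's
        return *this;    // our old resources are destroyed with x
    }

    friend void swap(IntBuffer& a, IntBuffer& b)
    {
        std::swap(a.size_, b.size_);
        std::swap(a.data_, b.data_);
    }

    std::size_t size() const { return size_; }

    int operator[](std::size_t i) const
    {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};
int IntBuffer::copies = 0;

// Returns an unnamed temporary, so assigning from it needs no copy.
IntBuffer make_buffer() { return IntBuffer(3, 7); }
```

Assigning from an lvalue (a = b) costs one copy, made at the call site. Assigning from make_buffer() should cost none when the compiler constructs the temporary directly in the parameter x, which C++17 requires and earlier standards permit.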

Reality Bites
Of course, lunch is never really free, so I have a couple of caveats.
First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality are serious considerations for your application, it could have a real effect.
On the other hand, it's easy to build a wrapper function that localizes the copy:
    std::vector<std::string>
    sorted3(std::vector<std::string> const& names)
    {
        // copy is generated once, at the site of this call
        return sorted(names);
    }
Since the converse doesn't hold (you can't get back a lost opportunity for copy elision by wrapping), I recommend you start by following the guideline, and make changes only as you find them to be necessary.
Second, I've yet to find a compiler that will elide the copy when a function parameter is returned, as in our implementation of sorted. When you think about how these elisions are done, it makes sense: without some form of inter-procedural optimization, the caller of sorted can't know that the argument (and not some other object) will eventually be returned, so the compiler must allocate separate space on the stack for the argument and the return value.
If you need to return a function parameter, you can still get near-optimal performance by swapping into a default-constructed return value (provided default construction and swap are cheap, as they should be):
    std::vector<std::string>
    sorted(std::vector<std::string> names)
    {
        std::sort(names.begin(), names.end());
        std::vector<std::string> ret;
        swap(ret, names);
        return ret;
    }
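The effect of the swap trick can be checked with the same copy-counting approach. This is a sketch under stated assumptions rather than a measurement from the article: Names is a hypothetical wrapper that deliberately lacks a move constructor, so that returning a parameter behaves the way the text above describes, and sorted_plain and sorted_swapped are made-up names for the two variants.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical copy-counting wrapper (not from the article), with no move
// constructor so that a non-elided return is visible as a copy.
struct Names
{
    static int copies;
    std::vector<std::string> items;

    Names() {}
    explicit Names(std::vector<std::string> const& v) : items(v) {}
    Names(Names const& other) : items(other.items) { ++copies; }
};
int Names::copies = 0;

// Cheap swap: trades the underlying arrays.
void swap(Names& a, Names& b) { a.items.swap(b.items); }

Names get_names()
{
    std::vector<std::string> v;
    v.push_back("Howard");
    v.push_back("Andrei");
    return Names(v);
}

// Returns the parameter itself: the compiler cannot elide this copy.
Names sorted_plain(Names names)
{
    std::sort(names.items.begin(), names.items.end());
    return names;
}

// Swaps into a local first: returning a local is eligible for elision.
Names sorted_swapped(Names names)
{
    std::sort(names.items.begin(), names.items.end());
    Names ret;
    swap(ret, names);
    assert(names.items.empty());  // the parameter now holds ret's empty state
    return ret;
}
```

With this type, sorted_plain(get_names()) pays for one copy on return, while sorted_swapped(get_names()) pays for none on compilers that elide the return of the local ret. That elision is permitted rather than required, so the second count can vary between compilers and settings.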

More To Come
Hopefully you now have the ammunition you need to stave off anxiety about passing and returning nontrivial objects by value. But we're not done yet: now that we've covered rvalues, copy elision, and the RVO, we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!
Follow this link to the next installment.

Acknowledgements
Howard Hinnant is responsible for key insights that make this article series possible. Andrei Alexandrescu was posting on comp.lang.c++.moderated about how to leverage copy elision years before I took it seriously. Most of all, though, thanks in general to all readers and reviewers!
1. Googling for a good definition of value semantics turned up nothing for me. Unless someone else can point to one (and maybe even if they can), we'll be running an article on that topic, in which I promise you a definition, soon.
2. For a detailed treatment of rvalues and lvalues, please see this excellent article by Dan Saks.
3. Except for enums and non-type template parameters, every value with a name is an lvalue.
Posted Saturday, August 15th, 2009 under Value Semantics.
