See N3042.
The addition of nullptr and
nullptr_t is bad.
Introduction
The macro NULL, which goes back quite early, was meant to provide a tool to specify a null pointer constant such that it is easily visible and such that it makes the intention of the programmer to specify a pointer value clear. Unfortunately, the definition as it is given in the standard misses that goal, because the constant that is hidden behind the macro can be of very different nature.
A null pointer constant can be any integer constant of value 0 or such a constant converted to void*. Thereby several types are possible for NULL. Commonly used are 0 with int, 0L with long and (void*)0 with void*.
- This may lead to surprises when invoking a type-generic macro with a NULL argument.
- Conditional expressions such as (true ? 0 : NULL) and (true ? 1 : NULL) have different status depending on how NULL is defined. Whereas the first is always defined, the second is a constraint violation if NULL has type void*, and defined otherwise. In particular, the second happens to work in C++ but most of the time not in C.
- A NULL argument that is passed as a sentinel to a ... function that expects a pointer can have severe consequences. On many architectures nowadays int and void* have different sizes, and so if NULL is just 0, a wrongly sized argument is passed to the function.
- In particular, C++ can’t have NULL as (void*)0 because void* does not implicitly convert to other pointer types. Thus it is usually an integer constant of value zero. On the C side (e.g. by printf) such a passed integer constant is then interpreted as void* or char*; such a re-interpretation has undefined behavior.
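The type differences listed above can be made visible with a generic selection. The following is a minimal sketch, assuming a C11 compiler; the macro KIND_OF, the enum kind and the function names are hypothetical and appear neither in the paper nor in any library.

```c
#include <stddef.h>

/* Hypothetical classification of an expression's type. */
enum kind { KIND_INT, KIND_LONG, KIND_VOID_PTR, KIND_OTHER };

/* Hypothetical helper: selects on the type of X, never on its value. */
#define KIND_OF(X) _Generic((X), \
    int:     KIND_INT,           \
    long:    KIND_LONG,          \
    void *:  KIND_VOID_PTR,      \
    default: KIND_OTHER)

/* The three common spellings of a null pointer constant. */
enum kind kind_of_zero(void)      { return KIND_OF(0); }
enum kind kind_of_zero_long(void) { return KIND_OF(0L); }
enum kind kind_of_void_zero(void) { return KIND_OF((void *)0); }

/* Whatever this implementation's NULL happens to be. */
enum kind kind_of_null(void)      { return KIND_OF(NULL); }
```

The first three results are fixed by the language; the result of kind_of_null() depends on the implementation's definition of NULL, which is exactly the variation the list describes.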
NULL. It is not, however, an important
problem.
NULL as sentinel for pointer types could
be done by giving it the proper type. We already make
void* and char* “compatible” for the
purpose of va_arg (only).
NULL defined as
void* has no effect whatsoever on C.
Besides, the definitions of NULL for C and C++
are likely to disagree already. My system’s
NULL definition is:
/*
* Written by Todd C. Miller, September 9, 2016
* Public domain.
*/
#ifndef NULL
#if !defined(__cplusplus)
#define NULL ((void *)0)
#elif __cplusplus >= 201103L
#define NULL nullptr
#elif defined(__GNUG__)
#define NULL __null
#else
#define NULL 0L
#endif
#endif
It certainly is true that a NULL defined as an
integer constant expression can’t be used as a variadic
argument where the callee expects a pointer. There is no fix
for all pointer types other than casting the null pointer, but
for void* and char*, the fix is
forbidding a definition of NULL as an integer
constant expression.
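To make the sentinel case concrete, here is a minimal sketch of a variadic function that reads char* arguments up to a null pointer. The function count_strings is hypothetical. The caller passes a sentinel of pointer type, so the call does not depend on how NULL is defined.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical example: counts its string arguments up to a null sentinel. */
size_t count_strings(const char *first, ...)
{
    size_t n = 0;
    va_list ap;

    va_start(ap, first);
    /* Each variadic argument is read as char*; a sentinel passed as
       (char *)0 or (void *)0 is covered by the va_arg rules, while a
       plain 0 of type int is not. */
    for (const char *s = first; s != 0; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as count_strings("a", "b", (char *)0) is well defined everywhere, and so is the same call with (void *)0 as the last argument. Writing a bare 0 there is the case that has undefined behavior.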
Rationale
Why do we need a specific nullptr constant?
Null pointer constants in C are a feature that is defined somewhat orthogonally to the type system. They are based on the concept of “integer constant expressions” and may in fact have any integer type (even bool, enumerations, character constants or expressions such as x-x are possible) as long as the value can be determined at translation time and happens to be zero. On top of that ambiguity concerning integer types, it is even permitted to use an explicit cast to void* and to still obtain a null pointer constant.
The standard macro NULL inherits from these confusing definitions and has no standardized type and no standardized behavior in contexts that are different from simple conversion to a pointer type. For example, a use of NULL as an argument to a ... function is not guaranteed to work.
- If NULL has integer type but different alignment or size than void*, any access with va_arg that interprets such an argument could crash the program.
- If NULL has integer type and null pointers are not represented as all-bit zero, such a transferred integer cannot be reinterpreted as a pointer value that would be a null pointer.
- If NULL has integer type (and not void*) and if even the integer type, say long, has the correct size and alignment, an interpretation of that passed-in integer in the form char* a = va_arg(ap, char*); has undefined behavior. As an exception, va_arg allows the reinterpretation between void* and char*, for example, but not from integer type to pointer type.
Note how the last point is not even fixed by
nullptr.
Also, it is not easy to detect if an argument to a function or even macro is a null pointer constant or only an arbitrary null pointer value. In C, compile time code distinction is usually done in the preprocessor or by _Generic. The preprocessor doesn’t work with NULL because it might not even be a preprocessor constant. _Generic is difficult to use because it is based on types and not values, although there are ways to abuse properties of conditional expressions, integer constant expressions, null pointer constants and _Generic to do so.
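For reference, the kind of abuse alluded to in the quoted paragraph can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: the macro IS_NULL_POINTER_CONSTANT and the two wrapper functions are hypothetical, and the macro is only meaningful when its argument is a null pointer constant or an expression of type void*.

```c
/* Hypothetical macro. In a conditional expression, a null pointer
   constant operand makes the result take the type of the other operand
   (int* here); any other void* operand makes the result void*. The
   controlling expression of _Generic is not evaluated. */
#define IS_NULL_POINTER_CONSTANT(X) \
    _Generic((1 ? (X) : (int *)0), int *: 1, void *: 0)

/* The argument is a null pointer constant. */
int constant_case(void)
{
    return IS_NULL_POINTER_CONSTANT((void *)0);
}

/* The argument is an ordinary void* value, null or not. */
int runtime_case(void *p)
{
    return IS_NULL_POINTER_CONSTANT(p);
}
```

Note that runtime_case yields 0 even when p is null at run time, because the selection is made on the static type of the conditional expression.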
This is utter nonsense. You really don’t need to
differentiate between null pointer constants and other null
pointer values; you don’t differentiate between integer
constant expressions and other integer values or perhaps string
literals and other char[] expressions either. If,
for some reason, this was desired, the solution would be a
facility to do just that—let the function check whether the
argument is a constant expression or a literal or just any
other expression.
Another reason to strengthen the definition of null pointer constants in C is the common confusion between a null pointer and a pointer that points to the zero address in the OS, as is suggested by using integer literals such as 0 to express null pointer constants. Also, the fact that on some architectures a null pointer is not necessarily represented with an all-zero bit-pattern always needs special attention when teaching C and is quite surprising for beginners. If it were that these sophistic distinctions would be necessary for the expressiveness of the language, that could perhaps be acceptable, but here it clearly is a random burden that is imposed on generations of teachers and students that is only rooted in history and has no raison d’être as of today; all other programming languages that have concepts similar to pointers in C do quite well without this ambiguity between numbers and pointers.
The idea of nullptr is to end this ambiguity and to provide a keyword with a value and a portable type that can be used anywhere where a null pointer constant is needed.
This “ambiguity” is not changed a bit by the introduction
of nullptr. Even if we have an additional way of
expressing null pointers, the old ones will have to stay. The
overall burden on the student only increases. This does
not simplify the language.
We already have expressions with a value and a portable type
that can be used anywhere a null pointer constant is needed: a
null pointer constant; say, (void*)0.
The nullptr feature presented in this paper has the following properties.
- It has a complete object type.
Same for (void*)0.
- It does not have scalar type, so it is forbidden in arithmetic.
This is not true. In this revision of the proposal, it does have scalar type.
- It converts to any pointer type.
Same for (void*)0.
- It converts to bool by always evaluating to false.
Same for (void*)0.
- In memory, nullptr is represented with the same bit-pattern as a null pointer constant of type void*.
Same for (void*)0.
- nullptr is permitted in all “Boolean” contexts such as && operators or if statements.
Same for (void*)0.
- nullptr is permitted as argument to ..., as long as the function interprets it as pointer to void or character type.
Same for (void*)0.
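The “Same for (void*)0” remarks can be checked directly. Below is a minimal sketch, assuming a C99 or later compiler; all function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Complete object type: it has a size, namely that of void*. */
size_t size_of_null(void)
{
    return sizeof((void *)0);
}

/* Converts implicitly to any pointer type, function pointers included. */
int converts_to_any_pointer(void)
{
    int *ip = (void *)0;
    char *cp = (void *)0;
    void (*fp)(void) = (void *)0;
    return ip == 0 && cp == 0 && fp == 0;
}

/* Converts to bool, always as false. */
bool converts_to_bool(void)
{
    bool b = (void *)0;
    return b;
}

/* Permitted in "Boolean" contexts such as if statements and &&. */
int boolean_contexts(void)
{
    if ((void *)0)
        return 0;
    return !((void *)0 && 1);
}
```

Some compilers may warn about the constant conditions in boolean_contexts; the code is nevertheless valid C.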
The aim is that this feature has exactly the same behavior as the corresponding feature in C++.
If the aim were enriching the C ABI in a way compatible with C++, I would understand it, even if it were not a useful goal; this, I don’t understand. There is no reason to aim for the same behavior as a corresponding C++ feature.
Why do we need a specific nullptr_t type different from void*?
The secondary feature proposed in this paper is the type nullptr_t with the intent to allow better diagnostics for functions that possibly receive a null pointer argument and to potentially optimize the case where a null pointer constant is received.
Consider a function func that receives a pointer parameter that can either be valid or a null pointer to indicate a default choice.
// header "func.h"
void func_general(toto*);
// define a default action
// no parameter name, parameter is never read
inline void func_default(nullptr_t) {
  ...
}
#define func(P) \
  _Generic((P), \
    nullptr_t: func_default, \
    default: func_general)(P)
// one translation unit
#include "func.h"
// emit an external definition
extern void func_default(nullptr_t);
// define the general action
void func_general(toto* p) {
  // p may still have value null
  if (!p) func_default(nullptr); // may only be called with nullptr
  else {
    ...
  }
}
Here, a function func_default is defined that receives a nullptr. The function needs no access to the parameter, since that parameter can only hold one specific value. A type-generic macro func then chooses this function or the general function func_general. The translation unit that defines func_general may then emit an external definition of func_default and also use it within the definition for the case that func_general receives a parameter value that is null without being recognized as such at translation time of the call.
#include "func.h"
...
func(0);        // ok, but uses the general function and may issue a diagnostic
func((void*)0); // ok, but uses the general function, no diagnostic
func(NULL);     // ok, but uses the general function, diagnostic or not
func((toto*)0); // ok, but uses the general function, no diagnostic
func(nullptr);  // uses default action directly
The use of the macro with a null pointer constant of integer type then uses the general function and sets the parameter to null; implementations that choose to diagnose the use of null pointer constants of integer type may do so for this call.
In contrast to that, a call that uses nullptr as an argument directly resolves to func_default, may or may not inline the corresponding action, and will not trigger such a diagnosis.
The emission of a diagnosis can be forced by restricting the admissible type as shown in the definition of func_strict.
#define func_strict(P) \
  _Generic((P), \
    nullptr_t: func_default, \
    toto*: func_general)(P)
...
func_strict(0);        // invalid, int argument is not a valid choice, constraint violation
func_strict((void*)0); // invalid, void* argument is not a valid choice, constraint violation
func_strict(NULL);     // invalid, void* or integer argument is not a valid choice, constraint violation
func_strict((toto*)0); // ok, but uses the general function, no diagnostic
func_strict(nullptr);  // uses default action directly
This one example is a giant hack. It’s abusing the generic selection to check not what would ordinarily be the type, but whether the caller provided a constant expression with a certain value.
For this specific example, since you need to document the
special handling of nullptr anyway, you could’ve
simply provided the two functions as they are:
func and func_default. There is no
point in trying to squish that extra bit of information—“I
know that I definitely want the default; it doesn’t depend on
runtime information”—in the one parameter.
In general, you will note, this perceived problem is not
addressed at all: it’s not specific to pointer parameters.
You still can’t use a generic selection to find out whether
an integer is an integer constant expression 0 or any other
integer, non-zero or non-constant. (No, that’s wrong. You
can. You can build a function with an int
parameter and a type-generic macro that selects a function
without parameter if the argument is of type
nullptr_t. Go figure.)
Not only does a new type for pointer values that have
value 0 special-case the type (to pointer types), but
also the value (to 0): If your default is something other than
the null pointer, you’d be ill-advised to interpret
nullptr as that default. Suddenly,
func((toto*)0) and func(nullptr) not
only go slightly different paths (the one first to
func_general and the other directly to
func_default), but behave completely differently!
TODO: Design choices and Impact
Prior art
The concept to present a null pointer constant as a keyword that is tightly integrated into the language as is proposed here is present in most other programming languages that have the concept of pointers, for example Pascal, Lisp, Smalltalk, Ruby, Objective-C, Lua, Scala, or Go, often with other spellings such as nil, NIL, None, null or Null. The fact that C still does express this concept with other language features is a rare exception in this picture and only a historic artifact and not a necessity.
It is neither a necessity nor a bug. And note how those languages do not have their own types for null pointers.
The nullptr feature together with nullptr_t is present in C++ since C++11 and has extensive implementation and application experience in that framework. This feature is also given under a different name in the Plan 9 C compiler, named nil. It approximates some of the features provided below, but not all of them.
This is slander. The Plan 9 C compiler has nothing similar to
nullptr. The Plan 9 C library has
this line:
#define nil ((void*)0)
A plain old macro. Instead of “NULL” it’s called
“nil” and instead of being defined as any null pointer
constant it’s defined as ((void*)0). The name
is as fine as “NULL” (if not better—easier to type and
easier on the eyes) and the value is, for once, correct.
This is how it should be.
C users often shift between using literal 0 versus (void*)0 for a library-deployed, macro-based definition. There are various trade-offs for doing this (discussed as part of the design decisions above) that can make this have undesirable behaviors and qualities. Recently, users have tried to move away from their own personal definitions for portability and correctness reasons.
Actually, the design decisions do not discuss those trade-offs.
The introduction of nullptr and
nullptr_t does not address all the problems stated
and does not address any real problem better than changing
NULL’s definition to ((void*)0)
would have.