The C Object System
Laurent Deniau
CERN – European Organization for Nuclear Research
[email protected]
Abstract
The C Object System (COS) is a small C library which implements high-level concepts available in CLOS, Objective-C and other object-oriented programming languages: uniform object model (class, metaclass and property-metaclass), generic functions, multi-methods, delegation, properties, exceptions, contracts and closures. COS relies on the programmable capabilities of the C programming language to extend its syntax and to implement the aforementioned concepts as first-class objects. COS aims at satisfying several general principles like simplicity, extensibility, reusability, efficiency and portability which are rarely met in a single programming language. Its design is tuned to provide an efficient and portable implementation of message multi-dispatch and message multi-forwarding, which are the heart of code extensibility and reusability. With COS features in hand, software should become as flexible and extensible as with scripting languages and as efficient and portable as expected with C programming. Likewise, COS concepts should significantly simplify adaptive and aspect-oriented programming as well as distributed and service-oriented computing.

Categories and Subject Descriptors D.3.3 [C Programming Language]: Language Constructs and Features; D.1.5 [Programming Techniques]: Object-oriented Programming.

General Terms Object-oriented programming.

Keywords Adaptive object model, Aspects, Class cluster, Closure, Contract, Delegation, Design pattern, Exception, Generic function, Introspection, High-order message, Message forwarding, Meta class, Meta-object protocol, Multi-method, Open class model, Predicate dispatch, Programming language design, Properties, Uniform object model.

∗ COS project: http://sourceforge.net/projects/cos

1. Motivation
The C Object System (COS) is a small framework which adds an object-oriented layer to the C programming language [1, 2, 3] using its programmable capabilities (footnote 1) while following the simplicity of Objective-C [5, 6] and the extensibility of CLOS [8, 9, 10]. COS aims to fulfill several general principles rarely met in a single programming language: simplicity, extensibility, reusability, efficiency and portability.

1.1 Context
COS has been developed in the hope of solving fundamental programming problems encountered in scientific computing and more specifically in applied metrology [11, 12]. Although this domain looks simple at first glance, it nonetheless involves numerous fields of computer science: from low-level tasks like the development of drivers, protocols or state machines, the control of hardware, the acquisition of data, the synchronization of concurrent processes, or the numerical analysis and modeling of huge data sets; to high-level tasks like the interaction with databases or web servers, the management of remote or distributed resources, the visualization of complex data sets or the interpretation of scripts to make the system configurable and controllable by non-programmers [13, 14, 15]. Not to mention that scientific projects commonly have to rely on sparse human resources to develop and maintain such continually evolving systems (i.e. R&D) for the long term. The challenge is therefore ambitious, but I firmly believe that COS provides the required features to simplify the development and the support of such systems as well as a wide variety of software projects.

1.2 Principles
Given the context, it is essential to reduce the multiplicity of the technologies involved, to simplify the development process, to enhance the productivity, to guarantee the extensibility and the portability of the code and to adapt the required skills to the available resources. Hence, the qualities of the programming language are essential for the success of such projects and should focus on the following principles:

1 In the sense of “Lisp is a programmable programming language” [4].
The Counter class derives from the root class Object (the default behavior when the superclass isn't specified) and defines the attribute cnt.

Class visibility What must be visible and when? In order to manage coupling, COS provides three levels of visibility: none, declaration and definition. If you only use the generic type OBJ, nothing is required (no coupling):

OBJ gnew(OBJ cls) {
  return ginit(galloc(cls));
}

If you want to create instances of a class, only the declaration is required (weak coupling):

OBJ gnewBook(void) {
  useclass(Book); // local declaration
  return gnew(Book);
}

If you want to define subclasses, methods or instances with automatic storage duration, the class definition must be visible (strong coupling).

3.3 Class inheritance
Class inheritance is as easy in COS as in other object-oriented programming languages. Figure 3 shows the hierarchies of the core classes of COS deriving from the root classes Object and Nil. As an example, the MilliCounter class defined hereafter derives from the class Counter to extend its resolution to thousandths of count:

defclass(MilliCounter, Counter)
  int mcnt;
endclass

which gives in Objective-C:

@interface MilliCounter : Counter {
  int mcnt;

defclass(Object,_)
  U32 id; // object's class identity
  U32 rc; // reference counting
endclass

But its methods must be defined with care since they provide all the essential functionalities inherited by other classes.

Class rank COS computes at compile-time the inheritance depth of each class. The rank of a root class is zero (by definition) and each successive subclass increases the rank.

Dynamic inheritance COS provides the message gchangeClass(obj,cls) to change the class of obj to cls iff it is a superclass of obj's class; and the message gunsafeChangeClass(obj,cls,spr) to change the class of obj to cls iff both classes share a common superclass spr and the instance size of cls is less than or equal to the size of obj. These messages are useful for implementing class clusters, state machines and adaptive behaviors.

3.4 Meta classes
Like in Objective-C, a COS class definition creates a parallel hierarchy of metaclasses which facilitates the use of classes as first-class objects. Figure 4 shows the complete hierarchy of the PropMetaClass class, including its metaclasses.

Class metaclass The metaclasses are classes of classes implicitly defined in COS to ensure the coherency of the type system: to each class must correspond a metaclass [26]. Both inheritance trees are built in parallel: if a class A derives from a class B, then its metaclass mA (footnote 8) derives from the metaclass mB, except the root classes, which derive from NIL and have their metaclasses deriving from Class to close the inheritance path. Metaclasses are instances of the class MetaClass.

7 ⊥ means “end of hierarchy” or NIL, but not the class Nil.
8 The metaclass name is always the class name prefixed by an 'm'.
[Figure 4 diagram: the core classes Object, Behavior, Class, MetaClass and PropMetaClass, the metaclasses mBehavior, mClass, mMetaClass and mPropMetaClass, and the property metaclasses pmObject, pmBehavior, pmClass, pmMetaClass and pmPropMetaClass.]
Figure 4. COS core classes hierarchy with metaclasses.

Property metaclass In some design patterns like Singleton or Class Cluster, or during class initialization (section 3.6), the automatic derivation of the class metaclass from its superclass metaclass can be problematic as detailed in [27]. To solve the problem COS associates to each class a property metaclass which cannot be derived; that is, all methods specialized on the property metaclass can only be reached by the class itself. In order to preserve the consistency of the hierarchy, a property metaclass must always derive from its class metaclass, namely pmA (footnote 9) (resp. pmB) derives from mA (resp. mB) as shown in figure 4. Property metaclasses are instances of the class PropMetaClass.

Class objects With multi-methods and metaclasses in hand, it is possible to use classes as common objects. Figure 5 shows the hierarchy of the core class-objects used in COS to specialize multi-methods with specific states. For instance messages like gand, gor and gnot are able to respond to messages containing the class-predicates True, False and TrueFalse. The root class Nil is a special class-object which means no-object but is still safe for message dispatch: sending a message to Nil is safe, but not to NIL.

Type system The COS type system follows the rules of Objective-C, that is polymorphic objects have opaque types (ADT) outside their methods and are statically and strongly typed inside; not to mention that multi-methods significantly reduce the need for runtime identification of polymorphic parameters. Furthermore, the set of class, metaclass and property-metaclass forms a coherent hierarchy of classes and types which offers better consistency and more flexibility than in Objective-C and Smalltalk where metaclasses are not explicit and derive directly from Object.

9 The property metaclass name is always the class name prefixed by 'pm'.

[Figure 5 diagram, by level: Property, Predicate; name, size, TrueFalse, Ordered; True, False, Lesser, Equal, Greater.]
Figure 5. Subset of COS core class-predicates hierarchy.

3.5 Class instances
Object life cycle The life cycle of objects in COS is very similar to other object-oriented programming languages, namely it starts by creation (galloc) followed by initialization (ginit and variants) and ends with deinitialization (gdeinit) followed by destruction (gdealloc). In between, the user manages the ownership of objects (i.e. dynamic scope) with gretain, grelease and gautoRelease like in Objective-C. The copy initializer is the specialization of the generic ginitWith(_,_) for the same class twice. The designated initializer is the initializer with the most coverage which invokes the designated initializer of the superclass using next_method. Other initializers are secondary initializers which must invoke the designated initializer [7].

Object type In COS (resp. Objective-C), objects are always of dynamic type because the type of galloc (resp. alloc) is OBJ (resp. id). Since it is the first step of the life cycle of objects in both languages, the type of objects can never be known statically, except inside their own multi-methods. That is why COS (resp. Objective-C) provides the message gisKindOf(obj,cls) (resp. [obj isKindOf: cls]) to inspect the type of objects. But even so, it would be dangerous to use a static cast to convert an object into its expected type because dynamic design patterns like Class Cluster and Proxy might override gisKindOf for their use. COS also provides the message gclass(obj) which returns obj's class.

Object identity In COS, an object is bound to its class through a unique 32-bit identifier produced by a linear congruential generator which is also a generator of the cyclic groups N/2^k N for k = 2..32. This powerful algebraic property allows the class of an object to be retrieved efficiently from the components table using its identifier as an index (Figure 6). Compared to pointer-based implementations, the unique identifier has four advantages:

It ensures better behavior of cache lookups under heavy load (uniform hash), it makes the hash functions very fast (sum of shifted ids), it is smaller than pointers on 64-bit
machines and it can store extra information (high bits) like class ranks to speed up linear lookup in class hierarchies.

Figure 6. Lookup to retrieve object's class from object's id.

Automatic objects Since COS adds an object-oriented layer on top of the C programming language, it is possible to create objects with automatic storage duration (e.g. on the stack) using compound literals (C99). In order to achieve this, the class definition must be visible and the developer of the class must provide a special constructor. For example the constructor aStr("a string") (footnote 10) is equivalent to the Objective-C directive @"a string". COS already provides automatic constructors for many common objects like Char, Short, Int, Long, Float, Complex, Range, Functor and Array. Automatic constructors allow temporary objects with local scope to be created efficiently and enhance the flexibility of multi-methods. For example, the initializer ginitWith(_,_) and its variants can be used in conjunction with almost all the aforementioned automatic constructors. Thanks to the rich semantics of COS reference counting, if an automatic object receives the message gretain or gautoDelete, it is automatically cloned using the message gclone and the new copy with dynamic scope is returned.

Static objects Static objects can be built in the same way as automatic objects except that they require some care in multi-threaded environments. It is worth noting that all COS components have static storage duration and consequently are insensitive to ownership and cannot be destroyed.

3.6 Implementing classes
Class instantiations create the class objects using the keyword makclass and the same class-specifier as the corresponding defclass. COS checks at compile-time if both definitions match. The counters implementation follows:

makclass(Counter);
makclass(MilliCounter,Counter);

which is equivalent in Objective-C to:

@implementation Counter
// definition of Counter methods not shown
@end
@implementation MilliCounter : Counter
// definition of MilliCounter methods not shown
@end

4. Generics (verbs)
We have already seen in previous code samples that generics can be used as functions. But generics in fact take multiple forms and each defines:
• a function declaration (defgeneric) which ensures the correctness of the signature of its methods (defmethod), aliases (defalias) and next-methods (defnext).
• a function definition used to dispatch the message and to find the most specialized method belonging to the generic and matching the classes of the receivers.
• an object holding the generic's metadata: the selector.

A generic function has one definition of its semantics and is, in effect, a verb raised at the same level of abstraction as a noun [4]. Figure 7 summarizes the syntax of generics, half way between the syntax of generic's definition in CLOS and the syntax of method's declaration in Objective-C.

Generic rank The rank of a generic is the number of receivers in its param-list. COS supports generics from rank 1 to 5, which should be enough in practice since ranks 1 to 4 already cover all the multi-methods defined in the libraries of Cecil and Dylan [29, 30, 36].

4.1 Message dispatch
COS dispatch uses global caches (one per generic rank) implemented with hash tables to speed up method lookups. The caches solve slot collisions by growing until they reach a configurable upper bound of slots. After that, they use packed linked lists, incrementally built to hold a maximum of 3 cells. Above this length, the caches start to forget cached methods, a required behavior when dynamic class creation is supported. The lookup uses fast asymmetric hash functions (sum of shifted ids) to compute the cache slots and ensures uniform distribution even when all selectors have the same type or specializations on permutations exist.

Fast messages COS lookup is simple enough to allow some code inlining on the caller side to speed up message dispatch. Fast lookup is enabled up to the generic rank specified by COS_FAST_MESSAGE, from disabled (0) to all (5, default), before the generic definitions (defgeneric).

4.2 Declaring generics
Generic declarations are less common than class declarations but they can be useful when one wants to use generics as first-class objects. Since generic definitions are more often visible than class definitions, it is common to rename them locally as in the following short example:

void safe_print(OBJ obj) {
  usegeneric( (gprint) prn );
  if ( gunderstandMessage1(obj, prn) == True )
    gprint(obj);
}

which gives in Objective-C:

void safe_print(id obj) {
  SEL prn = @selector(print);
  if ( [obj respondsToSelector: prn] == YES )
    [obj print];
}

Methods specializers The receivers can be equivalently accessed through selfn (footnote 11) whose types correspond to their class specialization (e.g. struct Counter*) and through unnamed parameters _n whose types are OBJ for 1 ≤ n ≤ g, where g is the rank of the generic. It is important to understand that selfn and _n are bound to the same object, but selfn provides a statically typed access which allows COS objects to be treated like normal C structures.

Multi-methods Multi-methods are methods with more than one receiver and do not require special attention in COS. The following example defines the assign-sum operator (i.e. +=) specializations which add 2 or 3 Counters:

10 By convention, automatic constructors always start with an 'a'.
11 self and self1 are equivalent.
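
Returning to the Object identity paragraph of section 3.5, the algebraic property it relies on can be checked with a few lines of plain C. The sketch below is not COS code and makes several assumptions: the multiplier and increment are textbook constants chosen only because they satisfy the full-period conditions modulo a power of two (multiplier congruent to 1 modulo 4, odd increment), the names lcg_next and distinct_slots are hypothetical, and indexing a table by masking the low k bits of the identifier is one plausible reading of "using its identifier as an index".

```c
#include <stdint.h>
#include <string.h>

#define MAX_BITS 12u /* largest table exponent this sketch will check */

/* One step of a 32-bit linear congruential generator.
   Illustrative constants: 1664525 % 4 == 1 and 1013904223 is odd. */
static uint32_t lcg_next(uint32_t x) {
    return x * 1664525u + 1013904223u; /* arithmetic is modulo 2^32 */
}

/* Generates 2^k consecutive identifiers after seed and counts how many
   distinct slots they occupy in a table of 2^k entries indexed by the
   low k bits. Returns 0 if k is outside the range handled here. */
static uint32_t distinct_slots(unsigned k, uint32_t seed) {
    static unsigned char seen[1u << MAX_BITS];
    uint32_t size, mask, id = seed, count = 0;

    if (k < 2 || k > MAX_BITS) return 0;
    size = 1u << k;
    mask = size - 1u;
    memset(seen, 0, sizeof seen);

    for (uint32_t i = 0; i < size; ++i) {
        uint32_t slot;
        id = lcg_next(id);
        slot = id & mask;          /* table index taken from the low bits */
        if (!seen[slot]) { seen[slot] = 1; ++count; }
    }
    return count;
}
```

Because the low k bits of such a generator depend only on the low k bits of the previous value and cycle with period 2^k, distinct_slots(k, seed) returns 2^k for every supported k and any seed: consecutive identifiers never collide in a table of that size.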
Methods specialization Assuming for instance the class inheritance A :> B :> C, the class precedence list for the set of all pairs of specialization of A, B and C by decreasing order will be:

(C,C)(C,B)(B,C)(C,A)(B,B)(A,C)(B,A)(A,B)(A,A)

and the list of all next_method paths is:

3 forward_message(self->obj); // delegate
4 endmethod

which can be translated line-by-line into Objective-C by:

1 - (retval_t) forward:(SEL)sel :(arglist_t)args {
2 if ([self->obj respondsTo: sel] == YES)
3 return [self->obj performv:sel :args];
4 }
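
The class precedence list given under "Methods specialization" can be reproduced mechanically. The ordering rule used below is inferred from that single listed example and is an assumption, not a rule stated in the text: pairs are sorted by decreasing sum of class ranks (A=0, B=1, C=2), and ties are broken in favour of the more specialized first receiver. The names struct Pair, by_precedence and precedence_list are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Ranks of the two specializers of a method: A=0, B=1, C=2. */
struct Pair { int r1, r2; };

/* Assumed ordering: higher total rank first, then higher first-receiver rank. */
static int by_precedence(const void *pa, const void *pb) {
    const struct Pair *a = pa, *b = pb;
    int sa = a->r1 + a->r2, sb = b->r1 + b->r2;
    if (sa != sb) return sb - sa;
    return b->r1 - a->r1;
}

/* Writes the nine ordered pairs into out, which must hold at least 46 bytes. */
static void precedence_list(char *out) {
    static const char name[] = "ABC";
    struct Pair p[9];
    int n = 0;

    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) { p[n].r1 = i; p[n].r2 = j; ++n; }

    qsort(p, 9, sizeof p[0], by_precedence);

    out[0] = '\0';
    for (int k = 0; k < 9; ++k) {
        char buf[8];
        snprintf(buf, sizeof buf, "(%c,%c)", name[p[k].r1], name[p[k].r2]);
        strcat(out, buf);
    }
}
```

Calling precedence_list on a 64-byte buffer yields the same nine pairs in the same order as the list above, which suggests (but does not prove) that the inferred rule matches the one COS applies.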
property-definition:
  defproperty( property-def );

property-def:
  property-name
  ( super-property-name ) property-name

class-property-definition:
  defproperty( class-property-def );

class-property-def:
  class-name , property-attr
  class-name , property-attr , get-func(opt)
  class-name , property-attr , get-func(opt) , put-func(opt)

property-attr:
  property-name
  ( object-attribute(opt) ) property-name

{property, super-property}-name, object-attribute:
  identifier (C99)

Figure 10. Syntax summary of properties.

5.4 Properties
Property declaration is a useful programming concept which allows, amongst others, access to object attributes to be managed, objects to be used as associative arrays or objects to be made persistent. Figure 10 summarizes the syntax of properties in COS, which are just syntactic sugar on top of the definition of class-objects and the specialization of the accessors ggetAt and gputAt already mentioned in section 4.3.

Property definition Properties in COS are defined conventionally with lowercase names:

defproperty( name );
defproperty( size );
defproperty( class );
defproperty( value );

For example, the last property definition is equivalent to:

defclass(P_value, Property)
endclass

The value property is associated with the cnt attribute with read-write semantics and uses user-defined boxing (int2OBJ) and unboxing (gint). The class property is associated with the entire object (omitted attribute) with read-only semantics and uses the inherited message gclass to retrieve it.

Sometimes the abstraction or the complexity of the properties requires handwritten methods. For instance:

defmethod(OBJ, ggetAt, Person, mP_name)
  retmethod(gcat(self->fstname, self->lstname));
endmethod

is equivalent to, assuming gname(Person) is doing the gcat:

defproperty(Person, ()name, gname);

Using properties The example below displays the name property of an object (or raises the exception ExBadMessage):

void print_name(OBJ obj) {
  useproperty(name);
  gprint(ggetAt(obj, name));
}

6. Exceptions
Exceptions are non-local errors which ease the writing of interfaces since they allow problems to be solved where the solutions exist. To state it differently, without exceptions, if an exceptional condition is detected, the callee needs to return an error and let the caller take over. Applying this behavior recursively requires a lot of boilerplate code on the callers' side to check the returned status. Exceptions let the callers choose either to ignore thrown errors or to catch them and take over.

Implementing an exception mechanism in C on top of the standard setjmp and longjmp is not new. But it is uncommon to see a framework written in C which provides the full try-catch-finally statements (figure 11) with the same semantics as in other object-oriented programming languages (e.g. Java, C#). The CATCH declaration relies on the message gisKindOf to identify the thrown exception, which implies that the order of CATCH definitions matters, as usual.

The sample program hereafter gives an overview of exceptions in COS:

1 int main(void) {
2 useclass(String, ExBadAssert, mExBadAlloc);
The code above shows some typical usages:

13 These testsuites can be browsed on sf.net in the module CosBase/tests.

Table 1. Performance summary in 10^6 calls/second.

Figure 12. Performance summary in 10^6 calls/second.
Multi-threading The same performance tests have been run with POSIX multi-threads enabled. When the Thread-Local-Storage mechanism is available (Linux), no significant impact on performance has been observed (<1%). When the architecture supports only POSIX Thread-Specific-Key (Mac OS X), the performance is lowered by a factor of 1.6 and clearly becomes the bottleneck of the dispatcher.

Object creation Like other languages with reference semantics, COS puts a heavy load on the C memory allocator (e.g. malloc), which is not very fast. If the allocator is identified as the bottleneck, it can be replaced with optimized pools by overriding galloc or by faster external allocators (e.g. Google tcmalloc). COS also takes care of automatic objects which can be used to speed up the creation of local objects.

Other aspects Other features of COS do not involve such heavy machinery as message dispatch or object creation. Thereby, they all run at the full speed of C. Contracts run at the speed of the user tests since the execution path is known at compile time and flattened by the compiler optimizer. Empty try-blocks run at the speed of setjmp, which is a well-known bottleneck. Finally next_method runs at 70% of the speed of an indirect function call (i.e. late binding) because it also has to pack the closed arguments into the generic's _arg structure.

8. Component-Oriented Design Patterns
This overview of COS shows that the principles stated in the introduction are already well fulfilled. So far:
• The simplicity can be assumed from the fact that the entire language can be described within a few pages, including the grammar, some implementation details and a few examples and comparisons with other languages.
• The extensibility comes from the nature of the object model which allows behaviors to be extended (methods bound to generics), wrapped (around methods) or renamed (method aliases) with a simple user-friendly syntax. Besides, encapsulation, polymorphism, low coupling (messages) and contracts are also assets for this principle.
• The reusability comes from the key concepts of COS which enhance generic design: polymorphism, collaboration (multi-methods) and composition (delegation).
• The efficiency measurement shows that key concepts perform well compared to other mainstream languages.
• The portability comes from its nature: a C89 library.

It is widely acknowledged that dynamic programming languages significantly simplify the implementation of classical design patterns [28] when they don't supersede them by more powerful dynamic patterns [7, 33, 34]. This section focuses on how to use COS features to simplify design patterns or to turn them into reusable components, where the definition of componentization is borrowed from [16, 17]:

Component Orientation = Encapsulation + Polymorphism + Late Binding + Multi-Dispatch + Delegation.

8.1 Simple Patterns
Creational Patterns It is a well-known fact that these patterns vanish in languages supporting generic types and introspection. We have already seen gnew (p. 5), here is more:

OBJ gnewWithStr(OBJ cls, STR str) {
  return ginitWithStr(galloc(cls), str);
}

OBJ gclone(OBJ obj) {
  return ginitWith(galloc(gclass(obj)), obj);
}

The Builder pattern is a nice application of property metaclasses to turn it into the so-called Class Cluster pattern:

defmethod(OBJ, galloc, pmString)
  retmethod(_1); // lazy, delegate the task to initializers
endmethod
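
The reason constructors such as gnewWithStr and gclone stay one-liners is that classes are ordinary run-time values carrying enough information to allocate and initialize their instances. The plain C sketch below illustrates that idea only; it is not COS code, it has no message dispatch, and every name in it (struct Class, struct Counter, CounterCls, new_object, clone_object) is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical class descriptor: a run-time value describing its instances. */
struct Class {
    const char *name;
    size_t      size;              /* instance size in bytes */
    void      (*init)(void *self); /* default initializer */
};

/* Every instance starts with a pointer to its class. */
struct Counter { const struct Class *cls; int cnt; };

static void counter_init(void *self) { ((struct Counter *)self)->cnt = 0; }

static const struct Class CounterCls = {
    "Counter", sizeof(struct Counter), counter_init
};

/* Generic constructor: allocate from the class description, then initialize.
   Works for any class whose instances begin with a class pointer. */
static void *new_object(const struct Class *cls) {
    void *obj = calloc(1, cls->size);
    if (!obj) return NULL;
    *(const struct Class **)obj = cls; /* store the class in the first field */
    cls->init(obj);
    return obj;
}

/* Generic clone: introspect the class of obj to size the copy. */
static void *clone_object(const void *obj) {
    const struct Class *cls = *(const struct Class *const *)obj;
    void *copy = malloc(cls->size);
    if (copy) memcpy(copy, obj, cls->size);
    return copy;
}
```

A caller writes new_object(&CounterCls) without knowing anything about Counter beyond its class object, so a separate factory per class is unnecessary; the caller remains responsible for releasing the result with free.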