Dynamic Language Runtime
Chandrasekhar Prabala
Abstract
The Dynamic Language Runtime (DLR) is Microsoft's new addition on top of the CLR, which makes the creation and usage of dynamic languages on the .NET framework a lot easier. It also enables existing languages on the .NET framework, like C#, to add dynamic typing capabilities. This paper provides a high-level overview of the DLR and how it works. It assumes that the reader has some background knowledge of dynamically typed languages.
Introduction
The front end of a compiler, commonly known as the compile-time portion of a compiler, is divided into four phases:
1. Lexical analysis
2. Syntax checking
3. Semantic checking
4. Intermediate code generation
Lexical analysis is assigned the task of breaking down the program string into useful pieces of information known as lexemes. Syntax checking verifies whether the code that has been written obeys the grammar of the language. It creates a syntax tree, which the intermediate code generation phase then converts into a representation that the compiler can use to make intelligent choices. Sandwiched between these two phases is semantic checking. Semantic checking goes beyond the form of the code and checks whether its constructs have been used in a meaningful way.
Any language that is of practical importance today has a type system. A type gives meaning to data. It also specifies the operations that can be performed on that kind of data. Once a type has been defined and a datum is associated with that type, any further manipulation of the datum has to obey the rules laid out in the definition of that type. Based on how a programming language performs this rule verification, known as type checking, programming languages are classified into the following:
1. Statically typed languages
2. Dynamically typed languages
If we have to invoke a method, here Factorial, at run time using reflection, the code we write is given below.
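    using System;
    using System.Reflection;

    Assembly assem = Assembly.GetExecutingAssembly();
    // Create an instance of the class from its fully qualified name.
    Object fact = Activator.CreateInstance(assem.GetType("ReflectionExample.FactorialClass"));
    Type factType = fact.GetType();
    // Look the method up by name and invoke it with a boxed argument list.
    MethodInfo factMethod = factType.GetMethod("Factorial");
    object res = factMethod.Invoke(fact, new object[] { 10 });
    int factorial = Convert.ToInt32(res);
    Console.WriteLine(factorial);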
It's immediately clear how tedious and verbose it is to work with types dynamically at runtime.
A description of the architecture from a developer's perspective is presented here. For a deeper understanding, please refer to the documentation on CodePlex [1]. The picture below shows how the DLR fits into the big .NET picture.
Fig 2. (Source - www.codeplex.com/dlr)
The picture above gives a bird's-eye view of the architecture. At the top of the picture are the various languages that one can program in, and right at the bottom are various technologies and dynamic languages. With the help of appropriate binders, the DLR enables data access across these technologies and languages. Binders are discussed later in this paper.
The following three features of the DLR are of primary importance to the developer:
1. Dynamic Dispatch
2. Call Site Caching
3. Expression Trees
Dynamic Dispatch
Just because the DLR is implemented on top of the CLR, objects do not automatically acquire dynamic behaviour. For an object to behave dynamically, its class must have dynamic dispatch capabilities. For a class to have dynamic dispatch capabilities, it needs to implement the IDynamicMetaObjectProvider interface (named IDynamicObject in early DLR previews) or simply inherit from the DynamicObject class and override a few of its methods. The following example shows a Medication class that inherits from the DynamicObject class and overrides a couple of the class's methods.
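    using System.Collections.Generic;
    using System.Dynamic;

    public class Medication : DynamicObject
    {
        // Backing store: property name -> value (kept as a string).
        private Dictionary<string, string> _Medicines = new Dictionary<string, string>();

        // Called when a property is read through a dynamic reference.
        public override bool TryGetMember(GetMemberBinder binder, out object result)
        {
            string value;
            bool found = _Medicines.TryGetValue(binder.Name, out value);
            result = value;
            return found; // false lets the runtime binder report a missing member
        }

        // Called when a property is assigned through a dynamic reference.
        public override bool TrySetMember(SetMemberBinder binder, object value)
        {
            _Medicines[binder.Name] = value.ToString();
            return true;
        }
    }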
The Medication class overrides two methods, TryGetMember and TrySetMember. Implementing these methods lets the Medication class create properties and set their values dynamically. The _Medicines dictionary holds each property name and its associated value. The Medication class can now be used in the following fashion.
    dynamic medication = new Medication();
    medication.Name = "Paracetamol";
    medication.Strength = "500mg";
    medication.ExpiryDate = DateTime.Parse("12/12/2005");
    Console.WriteLine(
        "Prescribed Medication {0} is of strength {1} and its expiry date is {2}",
        medication.Name, medication.Strength, medication.ExpiryDate);
Notice how we gave our C# Medication class dynamic capabilities: we are able to add properties to it at runtime. The DynamicObject class has various other methods that one can override to add further features. For example, overriding the TryInvokeMember() method makes it possible to add methods to the object at runtime. The code snippet above also introduces the C# 4.0 dynamic keyword. The dynamic keyword declares that operations on the variable should be resolved at runtime instead of at compile time. This approach has obvious advantages. It simplifies code drastically. It brings dynamism to languages that run on the DLR.
That said, there are a few disadvantages as well. IntelliSense is not available, because the type's members are not known at compile time. There is no compile-time type checking. Dynamic dispatch comes with a slight execution overhead.
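To make the TryInvokeMember remark concrete, here is a minimal sketch of a class whose methods are registered at run time. The DynamicCalculator class, its Register helper and the "Add" method are hypothetical names chosen for this illustration; only DynamicObject, InvokeMemberBinder and TryInvokeMember come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical class for illustration; it is not part of the DLR itself.
public class DynamicCalculator : DynamicObject
{
    // Maps a method name to the delegate that implements it.
    private readonly Dictionary<string, Func<object[], object>> _methods =
        new Dictionary<string, Func<object[], object>>();

    // Registers a method under the given name at run time.
    public void Register(string name, Func<object[], object> body)
    {
        _methods[name] = body;
    }

    // Called when a method is invoked through a dynamic reference and the
    // class itself declares no matching method.
    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        Func<object[], object> body;
        if (_methods.TryGetValue(binder.Name, out body))
        {
            result = body(args);
            return true;
        }
        result = null;
        return false; // the runtime binder then reports a missing member
    }
}

public static class DynamicCalculatorDemo
{
    public static void Main()
    {
        var calculator = new DynamicCalculator();
        calculator.Register("Add", a => (int)a[0] + (int)a[1]);

        dynamic calc = calculator;
        int sum = calc.Add(2, 3); // resolved at run time via TryInvokeMember
        Console.WriteLine(sum);   // prints 5
    }
}
```

Calling a name that was never registered makes TryInvokeMember return false, which surfaces as a RuntimeBinderException at the call.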
Expression Trees
Intermediate language representations are mostly of two types. The first is a code representation, such as three-address code; the other is a tree representation, such as an abstract syntax tree. .NET uses MSIL, which is a code-based intermediate representation. With the introduction of LINQ a tree-based representation became necessary, and so expression trees were introduced. Note, however, that expression trees are ultimately converted into MSIL. C# provides facilities to express code as an expression tree. Expression trees in the DLR are the next version of the expression trees that were introduced along with LINQ in .NET 3.5. Expression trees help in expressing code as data. Following is an example.
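    Expression<Func<int, int, int>> expression = (a, b) => a + b;
    // Take the tree apart: the body is a binary node with two parameter leaves.
    BinaryExpression body = (BinaryExpression)expression.Body;
    ParameterExpression left = (ParameterExpression)body.Left;
    ParameterExpression right = (ParameterExpression)body.Right;
    Console.WriteLine(
        "The left part of the expression: {0}{4}" +
        "The NodeType: {1}{4}" +
        "The right part of the expression: {2}{4}" +
        "The Type: {3}{4}",
        left.Name, body.NodeType, right.Name, body.Type, Environment.NewLine);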
In the above example, the lambda expression (a, b) => a + b is represented as a tree data structure rather than as executable code. The code yields the output shown below.
    The left part of the expression: a
    The NodeType: Add
    The right part of the expression: b
    The Type: System.Int32
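The same tree can also be assembled by hand from the factory methods on System.Linq.Expressions.Expression and then compiled into a delegate, which shows the "code as data" idea from the other direction. This is a minimal sketch; the ExpressionTreeDemo class and its BuildAdd method are names made up for the illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Builds the tree for (a, b) => a + b node by node.
    public static Expression<Func<int, int, int>> BuildAdd()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");
        BinaryExpression sum = Expression.Add(a, b);
        return Expression.Lambda<Func<int, int, int>>(sum, a, b);
    }

    public static void Main()
    {
        Expression<Func<int, int, int>> tree = BuildAdd();
        Console.WriteLine(tree.Body.NodeType);    // the tree inspected as data
        Func<int, int, int> add = tree.Compile(); // the tree turned into IL
        Console.WriteLine(add(2, 3));             // prints 5
    }
}
```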
Expression trees help in analyzing, transforming and composing code. For example, LINQ to SQL takes queries and converts them into T-SQL queries, and PLINQ takes a LINQ query and runs it on multiple cores. The advantage is that the compiler now gets a chance to optimize the code and generate very efficient machine code. This is the version of expression trees introduced with LINQ. In this version, every node is statically typed. For the DLR, expression trees were extended to hold data whose type is discovered only later, i.e. nodes can be untyped and late bound. The extended expression trees also add more specialized nodes that support concepts unique to dynamic languages. As an example, consider the following piece of code that calculates a Fibonacci number.
    int Fibonacci(int n)
    {
        if (n < 1)
        {
            return 1;
        }
        else
        {
            return Fibonacci(n - 1) + Fibonacci(n - 2);
        }
    }
The expression tree generated for this code is as shown in the figure.
[Figure: expression tree for the Fibonacci method, with nodes labelled Parameter, Binary OP (<), Method Call and Return]
Any language that wishes to make use of the DLR has to produce these expression trees rather than IL. The DLR takes care of producing the necessary IL. The idea is shown pictorially below.
[Figure: Foo.xy -> DLR Tree -> MSIL]
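As a rough illustration of what producing such a tree involves, the sketch below builds a tree for the Fibonacci logic and lets the framework compile it to IL. It relies on node factories that arrived with .NET 4 (Expression.Variable, Expression.Block, Expression.Assign). The FibonacciTree class, and the choice to store the lambda in a variable so that the body can call itself, are assumptions made for this illustration; a real language front end would emit equivalent nodes from its parser.

```csharp
using System;
using System.Linq.Expressions;

public static class FibonacciTree
{
    // Builds and compiles a tree equivalent to the Fibonacci method.
    public static Func<int, int> Build()
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        // Variable holding the function itself, so the body can recurse.
        ParameterExpression fib =
            Expression.Variable(typeof(Func<int, int>), "fib");

        // n < 1 ? 1 : fib(n - 1) + fib(n - 2)
        Expression body = Expression.Condition(
            Expression.LessThan(n, Expression.Constant(1)),
            Expression.Constant(1),
            Expression.Add(
                Expression.Invoke(fib, Expression.Subtract(n, Expression.Constant(1))),
                Expression.Invoke(fib, Expression.Subtract(n, Expression.Constant(2)))));

        // { fib = n => body; fib }  (the block evaluates to its last expression)
        BlockExpression block = Expression.Block(
            new[] { fib },
            Expression.Assign(fib, Expression.Lambda<Func<int, int>>(body, n)),
            fib);

        // Compiling emits the IL; invoking the outer lambda returns fib.
        return Expression.Lambda<Func<Func<int, int>>>(block).Compile()();
    }

    public static void Main()
    {
        Func<int, int> fibonacci = Build();
        Console.WriteLine(fibonacci(5)); // prints 13 with the n < 1 base case
    }
}
```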
Call Site Caching
Consider a statement in a dynamic language such as if (x == 0) {...}. The way this is compiled into a call site in the DLR is that a site object is created in a static field:
static CallSite<Func<CallSite, Object, int, bool>> _site = ...
The statement itself is compiled into code that looks like the following, which calls a delegate on the site object created earlier. Here, the _site object acts as the inline cache.
if(_site.Target(_site, x, 0)) {...}
The delegate Target points to some code which knows how to do the actual operation. An example is shown below:
    static bool _0(CallSite site, object x, int y)
    {
        return site.Update(site, x, y);
    }
If we look carefully at the code, we see that x is an object. Because this is a dynamic language, x need not be an integer; it can be anything, say a BigInteger or a long, so x is typed as object. In any reasonable language 0 is an integer, so y is an int. Since an if statement needs a Boolean condition, the return type of the operation is bool. The method body calls the Update method, which updates the call site. This is where the speed comes from. Say an integer is passed as x the first time. The target is then regenerated in the following way.
    static bool _1(CallSite site, object x, int y)
    {
        if (x is int)
        {
            return (int)x == y;
        }
        else
        {
            return site.Update(site, x, y);
        }
    }
The regenerated target contains code for integer comparison, so the next time an integer is passed as x, execution is really fast. Now say a BigInteger is passed; as noted earlier, in a dynamic language the input need not always be an integer. The target is then updated in the following way.
    static bool _1(CallSite site, object x, int y)
    {
        if (x is int)
        {
            return (int)x == y;
        }
        else if (x is BigInteger)
        {
            return BigInteger.op_Equality((BigInteger)x, y);
        }
        else
        {
            return site.Update(site, x, y);
        }
    }
This target now handles both integers and BigInteger values. If something else is passed a third time, the target is updated again. This is the polymorphic part of the polymorphic inline cache.
The first time the code is executed with integers there is no target code for them, so a cache miss is said to have occurred. This is where a binder comes into the picture. A binder represents the language-specific semantics for performing a specific operation at the call site, including any metadata. Binders are called when there is a cache miss; they inspect the operands and compute how to perform the requested operation. Binders communicate with the call site using expression trees. From then on, if the code is called with integers, the code that was generated to handle integers is reused, and there is a cache hit. The first time the code is called with big integers there is again a cache miss, and the binder comes into the picture once more.
This is how call site caching works. It makes dynamic dispatch faster, and thus makes dynamic languages, which would otherwise be horribly slow, run faster on the .NET framework. It should also be noted that this mechanism is built on top of expression trees.
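The hit and miss behaviour described above can be imitated in a few lines of ordinary C#. The sketch below is a hand-rolled stand-in, not the DLR's implementation: the real runtime uses CallSite<T>, binders and expression trees, whereas the EqualsSite class, its Bind method and the Misses counter are hypothetical names that exist only to make the caching visible.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical, hand-rolled stand-in for a DLR call site for "x == y".
public class EqualsSite
{
    // Rules learned so far: an operand type and the code that handles it.
    private readonly List<Tuple<Type, Func<object, int, bool>>> _rules =
        new List<Tuple<Type, Func<object, int, bool>>>();

    // Number of cache misses, exposed only to make the behaviour visible.
    public int Misses { get; private set; }

    public bool Invoke(object x, int y)
    {
        Type type = x.GetType();
        foreach (var rule in _rules)
        {
            if (rule.Item1 == type)
            {
                return rule.Item2(x, y); // cache hit: reuse the existing rule
            }
        }

        // Cache miss: ask the "binder" for a rule and remember it.
        Misses++;
        Func<object, int, bool> operation = Bind(type);
        _rules.Add(Tuple.Create(type, operation));
        return operation(x, y);
    }

    // Plays the role of the language binder: decides how == works for a type.
    private static Func<object, int, bool> Bind(Type type)
    {
        if (type == typeof(int)) return (x, y) => (int)x == y;
        if (type == typeof(BigInteger)) return (x, y) => (BigInteger)x == y;
        throw new NotSupportedException("No rule for " + type);
    }
}

public static class InlineCacheDemo
{
    public static void Main()
    {
        var site = new EqualsSite();
        Console.WriteLine(site.Invoke(0, 0));                 // miss, True
        Console.WriteLine(site.Invoke(7, 0));                 // hit, False
        Console.WriteLine(site.Invoke(BigInteger.Zero, 0));   // miss, True
        Console.WriteLine(site.Invoke(new BigInteger(5), 0)); // hit, False
        Console.WriteLine(site.Misses);                       // prints 2
    }
}
```

The list of type tests mirrors the growing if/else chain in the generated targets; the DLR additionally compiles each rule into the target delegate so that a hit costs little more than the type test itself.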
C# 4.0
C# 4.0 introduced a lot of new features. One of the most important among them is the dynamic keyword. With the dynamic keyword, the calculator example presented earlier can be rewritten far more concisely.
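The calculator code is not reproduced in this paper, so the sketch below revisits the reflection-based Factorial call instead and shows roughly what the dynamic keyword saves. The body of FactorialClass is an assumption (the paper only shows how the class is invoked), and DynamicDemo is a name made up for the illustration.

```csharp
using System;

namespace ReflectionExample
{
    // Assumed shape of the class that the reflection example invokes.
    public class FactorialClass
    {
        public int Factorial(int n)
        {
            return n <= 1 ? 1 : n * Factorial(n - 1);
        }
    }

    public static class DynamicDemo
    {
        public static void Main()
        {
            // Created through Activator, as if the type were known only at run time.
            dynamic fact = Activator.CreateInstance(typeof(FactorialClass));
            int factorial = fact.Factorial(10); // bound at run time by the DLR
            Console.WriteLine(factorial);       // prints 3628800
        }
    }
}
```

The GetMethod, Invoke and Convert calls of the reflection version collapse into a single method call, at the price of losing compile-time checking of the member name.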
Conclusion
The Dynamic Language Runtime is a welcome addition on top of the .NET framework. It brings more variety to the framework and will attract those who believe in languages such as Python and Ruby to .NET. The DLR can be used to build custom Domain Specific Languages (DSLs), which are inherently dynamic in nature. All that one needs to do is write the parser in such a way that it outputs an expression tree. The DLR can also be used to enable meta-programming, where code is generated as and when required, lives briefly in memory, does the necessary task and then deletes itself. Meta-programming is the next big step after Model Driven Development (MDD). More importantly, the DLR will make the life of an average C# and VB developer easier by letting them work with COM, JSON and XML objects. It lets web developers use a language like Python to create robust web applications with ease. It also opens up a host of possibilities. In time we may see Java objects being consumed in .NET if an appropriate binder is written.
References
1. www.codeplex.com/dlr
2. http://msdn.microsoft.com/hi-in/vsx/default(en-us).aspx
3. Compilers: Principles, Techniques, and Tools, Aho, Sethi and Ullman (Dragon book)
4. Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages
Acknowledgements
I profusely thank S. Naveen Kumar (Principal Architect, MTC) and Sudhanshu M. Hate (Senior Technical Architect, MTC) for their support and valuable guidance in bringing out the paper in the current form. I would also like to thank Atul Gupta (Principal Architect, MTC) and Sripriya Thothadri (Technical Architect, MTC) for helping out at various stages during the review process.
2011 Infosys Technologies Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document.