AnDevCon2011 Stephen Williams - JavaGlue

AnDevCon2011 Stephen Williams

AnDevCon2011 JavaGlue

JavaGlue.00 Agenda
(updated just now by StephenWilliams)

1. About JavaGlue.01 About

This presentation can be found at: http://sdw.st/javaglue.html#tag:JavaGlue

a. Introduction
1. Improvements JavaGlue.02 Improvements
2. A JavaGlue.03 JNI Primer: Java with C and C++
a. Calling C methods
b. Passing Data
i. Scalars
ii. Strings
iii. Byte arrays
c. JNI References
i. Local References
ii. Global References
d. C to Java
i. Allocating scalar arrays
ii. Allocating strings
iii. Calling Java methods
3. JavaGlue.04 Alternatives
a. Hand-coded JNI
i. JavaGlue.04.1 JNI Diagrams
b. SWIG
c. JNA
4. JavaGlue.05 Use
a. Capabilities
b. Limitations
c. A Simple JavaGlue Example: JavaGlue.05.1 Example 1
d. JavaGlue.05.2 JavaGlue Diagrams
e. JavaGlue.07 Memory Management
f. JavaGlue.08 Utility Methods
g. JavaGlue.06 JavaGlue Build System
5. How does JavaGlue work? JavaGlue.10 Internals
6. JavaGlue.11 Advanced JNI
7. JavaGlue.12 CMake
a. Main Characteristics
b. Simple Examples
c. Building with NDK
d. Building JavaGlue Projects
e. Integrating with Java / Android / Eclipse
8. Planned Improvements
9. About Us

JavaGlue.01 About
(updated 12 hours ago by StephenWilliams)

About
Authors: Stephen Williams, with help from Kevin Campbell
and excerpts from the Ogre4j project page (link 1 below).

Introduction
JavaGlue is a fork of XBiG. The reasons for forking were to:

1. Avoid breaking current usage of XBiG,
2. Avoid worrying about breaking things while getting out new features,
3. Provide an easier-to-use build system, and, most importantly,
4. Use a more memorable (and easier to Google) name.

XBiG & NoodleGlue

The XSLT Bindings Generator - XBiG - is a project that uses XSL transformations to generate foreign
function interfaces (bindings) for libraries (http://code.google.com/p/xbig/). XBiG was derived from an
older project called NoodleGlue
(http://web.archive.org/web/20070205204525rn_1/www.noodleglue.org/noodleglue/noodleglue.html,
http://www.stuartaxon.com/2008/10/01/noodleglue-found/).

Specifically, XBiG is designed to generate Java code and JNI bindings that allow almost any native (i.e.
C or C++) library to be used from Java. XBiG was initially used to create Java bindings for OGRE
(http://www.ogre3d.org/) as the Ogre4j project (http://ogre4j.sourceforge.net/).

JavaGlue changes are minor compared to the work that obviously went into creating NoodleGlue and
XBiG. It is assumed that JavaGlue and XBiG will merge eventually.

Licensing
The code generation tool is GPL. The linkable libraries from XBiG are LGPL. JavaGlue additions are
Apache 2.0 where this doesn't conflict with XBiG licensing. Generated code, as is generally the case
with code generators and compilers, is owned by the owner of the input.

Download
JavaGlue will be on Google Code shortly. Snag it here for now: http://sdw.st/conf/AnDevCon2011/javaglue-1.0.zip

The Name
Our hope is that the name JavaGlue will be as discoverable, descriptive and generic as the tool itself
should be. It is likely that XBiG and JavaGlue will merge. May the most useful name win.

Kudos for Past & Current Work By

NoodleHeaven / Noodleglue.org (Tool released as GPL, with generated code and runtime code
having no restrictions (i.e. public domain / user owned).)
NetAllied (Believed to be current owners of XBiG copyright, tool GPL, runtime code LGPL. MIT
or Apache requested.)
Christoph Nenning

Projects with similar goals

http://www.itk.org/ITK/resources/CableSwig.html

JavaGlue.02 Improvements
(updated 15 hours ago by StephenWilliams)

Improvements of XBiG
We have enhanced XBiG significantly to meet our needs, adding the following functionality:

Support for passing null pointers as arguments to and from C/C++ functions
Efficient and easy byte array movement between Java & C++
Better handling of input include file hierarchies
Improved build system
A number of bugs fixed
Workarounds for the Android-specific issues we found along the way

Design Goals
Minimize or eliminate the need for the native code to be "JavaGlue-aware". Ideally, an existing
native library could be wrapped naturally in Java without requiring any changes to the native code.
In practice this may not completely be the case, but the required changes are fairly small and
non-invasive.
Support "callback interfaces", where native code calls back to methods on objects implemented in
Java
Ensure that applications which use code generated by JavaGlue are not bound by licensing
restrictions

JavaGlue.03 JNI Primer

(updated 15 hours ago by StephenWilliams)

JNI Diagrams

1. Calling C methods
2. Passing Data
a. Scalars
b. Strings
c. Byte arrays
3. JNI References
a. Local References
b. Global References
4. C to Java
a. Allocating scalar arrays
b. Allocating strings
c. Calling Java methods

Tagged as 'JavaGlue.03 JNI Primer':

JavaGlue.04.1 JNI Diagrams

JavaGlue.04 Alternatives
(updated 15 hours ago by StephenWilliams)

JavaGlue Alternatives
JavaGlue alternatives involve writing and maintaining a lot of metadata, fragile and verbose hand
coding, or libraries with significant run-time inefficiencies. No other freely available tool
uses only C++ header files as input and generates all of the Java and C/C++ glue code needed to
immediately write Java code that can pretty much universally use C++ objects.

Hand-coded JNI
Hand-coded JNI is time-consuming, verbose, not typesafe at all, error-prone, and hard to maintain. It also
only directly supports Java->C and C->Java calls. Handling C++ code requires you to write functions
with unmangled C linkage that take pointers as integer parameters, cast them properly, and
then make the desired C++ method calls. All scalar types need to be mapped in each direction, with
JNI methods called to convert strings, etc. Additionally, the linkage both ways is resolved at
runtime, so typos and out-of-date interfaces are not detected until a method call is attempted. How many
people have full code coverage built into their projects?

For a few C methods, or very limited linkage to C++, this is doable. There are several steps, but it isn't
too difficult. However, none of the code involved does anything useful, and debugging can be
time-consuming.
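
The late-binding problem is easy to reproduce without writing any C at all. The sketch below is only an illustration: the class and method names are hypothetical and are not part of JavaGlue or XBiG. It declares a native method whose return value would be a C++ pointer carried in a Java long, but never loads a library that implements it. The compiler accepts it, and the missing linkage only shows up when the call is attempted.

```java
public class LateLinkDemo {
    // Hypothetical binding: a C++ object pointer would travel as a Java long.
    // javac accepts this even though no native library implements it.
    private static native long createCounter();

    /** Attempts the native call and reports how the link attempt ended. */
    public static String tryCall() {
        try {
            createCounter();
            return "linked";
        } catch (UnsatisfiedLinkError e) {
            // Only reached at call time: nothing checked the linkage earlier.
            return "UnsatisfiedLinkError";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCall());
    }
}
```

A code path that is never exercised in testing would hide this error until it runs in the field, which is the point of the code coverage question above.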

http://download.oracle.com/javase/1.5.0/docs/guide/jni/
http://download.oracle.com/javase/1.5.0/docs/guide/jni/spec/jniTOC.html
http://java.sun.com/docs/books/jni/
http://java.sun.com/developer/onlineTraining/Programming/JDCBook/jni.html

Simplified Wrapper and Interface Generator (SWIG)

SWIG is a nice system that allows wrapping C/C++ code for Java, many scripting languages, and other
targets. It is flexible and tunable in many ways. However, it usually requires some configuration
files to be created by hand. Also, some of its choices are probably not as desirable as the JavaGlue
solutions, such as the way that pass by value and "char *" situations are handled. The
typed-pointers-as-strings approach is interesting, but it will be much slower than the direct passing
of pointers in JavaGlue.

http://www.swig.org/

Java Native Access (JNA)

JNA is a Java library that includes a native code module that can be used to interpretively map access to
native methods, structures, and variables. It is said to be about 1/10th the speed of similar JNI calls. It
also has some limitations, would take work to port to Android since it uses an ugly native access
method, and is not very popular.

http://jna.java.net/

Rationale from the XBiG Authors:

From: http://sourceforge.net/apps/mediawiki/ogre4j/index.php?title=White_Paper

One big point for every project that implements wrapper(s) for a library in different
programming languages is the effort to maintain the wrapper code. The target library has
their own release cycles and most major releases introduce API breaking changes. Most
of the projects deal with this issue by using code generators which create the necessary
bindings automatically. We evaluated the application of several code generators such as
SWIG (Simplified Wrapper and Interface Generator) and NoddleGlue but none of the
tested tools met our requirements. SWIG needs very much effort beforehand because
every interface that should be wrapped needs a interface description file. Both tools miss
full support of C/C++ templates which are used quite often in OGRE. For this and other
reasons we decided in autumn 2005 to implement our own generator based on the same
technologies as NoodleGlue. Since autumn 2006 the JNI code generator project is forked
from ogre4j under the name XBiG (XSLT Bindings Generator) and got its own project
space on Sourceforge.net.

NoodleGlue is the wrapper generation tool of "noodle heaven" and uses doxygen to
extract the API information from the library's source code. This approach had the
advantage that parsing and analyzing is done by a tool that is widely-used and tested with
different input languages. So the first step to our generator was already available for free.
Besides the usual outputs like a HTML documentation Doxygen provides a XML output
of the analysed source code. This output is specialized for the Doxygen task to generate
documentation, contains a lot of information that isn't necessary to generate wrapper code
and is structured in a flat (E.g. name spaces are not nested as child XML elements.)
hierarchy. For these reasons and to have the possibility to replace Doxygen with another
tool, we decided to implement a meta layer that is represented in XML too.

To convert the Doxygen output to our meta layer we're using XSLT (Extensible
Stylesheet Language Transformations) which is designed to describe conversions or
transformations of XML code with XML code. One big advantage of XSLT is that it is an
interpreted language and therefore OS (Operation System) independent. The generation
of the meta layer and the layer itself should be independent from any OS or platform to
make it possible to generate bindings for "every" language on every platform. To have a
consistent tool chain the generation of the wrapper code is done with XSLT too. This
reduces the usage of different tools and technologies to one major aspect: XML/XSLT.
As mentioned before, Doxygen could be replaced with another tool that is capable of
parsing source code and generating a XML representation of the parsed input.

JavaGlue.04.1 JNI Diagrams

(updated 5 hours ago by StephenWilliams)

Standard JNI / NDK Development Process

(Images courtesy of Marko Gargenta of Marakana.)

JavaGlue.05 Use
(updated 14 hours ago by StephenWilliams)

Capabilities
JavaGlue provides access to just about everything in C/C++ that you would reasonably want access to
from Java. Setters and getters are created for access to public data. Public constructors, destructors,
methods, and types are all available. Globals, class statics, and object members are available in a fairly
clean way. Both factory methods and Java-side 'new' of objects are supported, along with pass and return
by value. Enums, template types (possibly requiring a typedef), std::string, Vector<byte>, and unsigned
char*[] are all supported. Pointers (including null pointers) and pass by reference are handled in a
straightforward and very C++-like way. Namespaces and class hierarchies
are mapped directly to the Java package namespace. Even C++ multiple inheritance is mapped to
Java in a usable way. Type mapping can be tuned as needed. C++ items in headers can be ignored at a
couple of levels of granularity through a config file.

The net result is that, without creating any metadata or writing any glue code, you can point the build
system at a hierarchical directory of C and C++ headers, build, and write very C++-like Java code that
directly uses C++ code. And because the C++ code has been mirrored into generated Java code, Eclipse will
provide tooltip assistance while writing C++-ish Java code.

1. Global methods & members - These show up as static methods in a class called GlobalUtility
which is created as necessary at every package level.
2. Variables - Public variables get getters and setters automatically created.
3. Classes - Public classes are fully wrapped and proxied into Java classes, usually with a Java class
with the same name and an interface with 'I' prepended.
4. Methods - All public methods are available. Those returning objects by value have a slightly
modified form in Java: They return void and have an extra first parameter which must be an already
constructed object.
5. Constructors & Destructors - These are proxied normally into Java.
6. Template instances - These are supported; however, parameterized templates often must be
typedef'd to be usable. Methods with an un-typedef'd complex template type as a parameter or return
value will simply be ignored and won't exist in Java.
7. Typedefs - C++ treats typedefs as equivalent to the original type, while JavaGlue wraps them into
Java classes and interfaces of the same name.
8. Enums - Completely usable, including created mapping methods. Java use of C++ enum values
looks different than C++ use of enum, but the semantics are mapped well.
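
To make the shape of items 1 to 3 concrete, here is a plain-Java stand-in with no native code behind it. Everything in it is an assumption made for illustration: the C++ class Point, its member x, the accessor names getx/setx, and the free function sum are hypothetical, and the accessor naming in real generated code may differ. Only the 'I' prefix and the GlobalUtility class name come from the description above.

```java
public class ProxyShapeDemo {
    // Item 3: an interface with 'I' prepended to the (hypothetical) C++ class name.
    interface IPoint {
        int getx();        // Item 2: getter for a public C++ member 'x'
        void setx(int x);  // Item 2: setter for the same member
    }

    // Item 3: the Java class with the same name as the C++ class.
    // A real proxy would forward to JNI; a plain field stands in for the C++ object.
    static class Point implements IPoint {
        private int x;
        public int getx() { return x; }
        public void setx(int x) { this.x = x; }
    }

    // Item 1: free functions surface as static methods on a GlobalUtility class.
    static class GlobalUtility {
        static int sum(IPoint a, IPoint b) { return a.getx() + b.getx(); }
    }

    /** Uses the three pieces together the way application code would. */
    static int demo() {
        IPoint p = new Point();
        p.setx(3);
        IPoint q = new Point();
        q.setx(12);
        return GlobalUtility.sum(p, q);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```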

Limitations
1. Templates have to have a concrete instance
2. Parameterized templates often need to be typedef'd. Since types in any form are equivalent in C++,
this is easy.
3. Template or other code instantiation must obviously be triggered in C++. When writing code in
Java, it is too late. In many cases JavaGlue will generate code that will make it happen. Making use
of something in Java is often just a matter of adding a typedef.
4. JavaGlue will sometimes create code to access members or base classes that are not public, causing
compilation errors in the generated C++ code. This can be avoided by adding ignore statements to
ignore_list.xml or hiding code from the JavaGlue analysis.

Temporary Limitations
1. No '_' in enum type names. (Name mangling requires that '_' -> '_1'. This happens elsewhere and
needs to be fixed for enum types.)
2. The generated C++ code may have trouble finding include files in some cases. Paths weren't
preserved in the Doxygen output. JavaGlue is improved here as it tries to regenerate the original
paths, but the method isn't foolproof.
3. Can't change the name of shared libraries or generated paths. They are xbig and org.xbig currently.
Will change to org.javaglue, and be modifiable soon.
4. The original XBiG library code must end up in a separate shared library from generated application
code so that the LGPL relink requirement can be met easily. This will be fixed shortly. Please
consider the current code development only until then.
5. C++ wstring handling code is currently missing due to now-obsolete Android STL issues. Wstring
support will return soon.
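
The '_' -> '_1' rule in item 1 comes from the way JNI encodes Java names into native function names. The sketch below applies the escape rules from the JNI specification for short (non-overloaded) names. The class JniNames and its methods are hypothetical helpers written for this illustration, not JavaGlue code, and the example class and method names in main are made up.

```java
public final class JniNames {
    /** Escapes one Java name component the way JNI does for native symbol names. */
    static String mangle(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');                 // package separator
            } else if (c == '_') {
                sb.append("_1");                // literal underscore
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);                   // plain alphanumerics pass through
            } else {
                sb.append(String.format("_0%04x", (int) c)); // anything else, lowercase hex
            }
        }
        return sb.toString();
    }

    /** Builds the short-form native symbol for a method of a fully qualified class. */
    static String nativeName(String className, String method) {
        return "Java_" + mangle(className) + "_" + mangle(method);
    }

    public static void main(String[] args) {
        System.out.println(nativeName("org.xbig.test", "do_it"));
    }
}
```

A generator that applies this escaping to method names but skips it for enum type names would emit a symbol that the JVM never looks up, which matches the symptom described in item 1.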

JavaGlue.05.1 Example 1
(updated 14 hours ago by StephenWilliams)

Better examples pending.

test.h:

    #include <stdio.h>
    #include <string.h>
    #include <string>
    #include <vector>
    #include <jni.h>
    #include "basedelete.h"
    #include "jni_base.h"

    // typedef std::vector<unsigned char> ByteVector; Get this from modified Xbig now.
    class test {
    public:
        int x;
        float y;
        char *p;
        test() { p = NULL; }
        test(char *pp) { p = pp; }
        bool isNullFlag;
        // Records whether pp is NULL. The end of the next line is cut off in the source.
        char * doWhatever(char *pp) { if (pp == NULL) isNullFlag = true; else isNullFlag = false; return
        // Returns a heap-allocated copy; the caller owns it.
        std::string * mkStringP(std::string s) { return new std::string(s); }
        void setString(std::string s) { p = strdup(s.c_str()); }
        // Takes ownership of s and deletes it after copying.
        void setStringP(std::string *s) { p = strdup(s->c_str()); delete s; }
        std::string getString() { return std::string(p); }
        char *getCString() { return p; }
        char *dupcString(std::string s) { return strdup(s.c_str()); }
        bool isNull() { return isNullFlag; }
        bool isTestNull(test *tt) {
            if (tt == NULL) isNullFlag = true; else isNullFlag = false; return isNullFlag;
        }
        bool isTest(test tt) {
            // if (tt == NULL) isNullFlag = true; else isNullFlag = false; return isNullFlag;
            return false; // added: the original body is commented out and returned nothing
        }
        test* getTest() { return this; }
        test* getTestNull() { return (test*)NULL; }
        // Returns a reference to a heap-allocated vector; free it with the Delete utility.
        static base::ByteVector& mkByteVector() {
            base::ByteVector* bp = new base::ByteVector(20);
            (*bp)[0] = 'h'; (*bp)[1] = 'i';
            return *bp;
        }

        // std::wstring ws;
        // std::wstring getWString() { return ws; }

        static void bvDouble(base::ByteVector, base::ByteVector) {

        }

    };

    class test2 {
    public:
        test t;
        int i;
        test2() {
        }
        test2(int ip) {
            i=ip;
        }
    };

    class test3 {
        int j;
        void* p3;
    public:
        test3() { p3 = 0; }
        void* getP3() { return p3; }
    };

    class test4: public test, public test3 {
    // (the listing is cut off here in the source)

BasicTests.java:

    /**
     *
     */
    package test;
    import org.xbig.base.*;
    import org.xbig.std.*;

    import org.xbig.*;

    // import org.junit.Assert;
    // import org.junit.Test;
    // import junit.framework.TestCase;

    /**
     * @author swilliams
     *
     */
    public class BasicTests {
        public BasicTests() {
        }
        public static void main(String [] args) {
            BasicTests tst = new BasicTests();
            tst.test();
        }
        //@Test
        public void test() {
            setUp();
            Itest t = new test();
            // A BytePointer wrapping address 0 is how a null char* is passed to C++.
            org.xbig.base.InstancePointer ipn = new org.xbig.base.InstancePointer(0);
            org.xbig.base.BytePointer bpn = new org.xbig.base.BytePointer(ipn);
            message(" BytePointer bp = t.doWhatever(bpn);");
            BytePointer bp = t.doWhatever(bpn);
            message(" if (bp.longValue() == 0L) message(\"Got a Null!\");");
            if (bp.object.pointer == 0L) message("Got a Null!");
            message(" t.setString(\"Hi\");");
            t.setString("Hi");
            message(" message(t.getString());");
            message(t.getString());
            message(" t.setString(bpn);");
            t.setStringP(t.mkStringP("wow!"));
            message(" bp = t.getCString();");
            bp = t.getCString();
            message(" if (bp.object.pointer() == 0L) message(\"Got a Null!\");");
            if (bp.object.pointer == 0L) message("Got a Null!");
            message("doWhatever(null) isNull:");
            t.doWhatever(bpn);
            message("isNull:"+t.isNull());
            t.doWhatever(t.dupcString("Hi again!"));
            message("isNull:"+t.isNull());

            // Object pointers: a Java null maps to a C++ NULL in both directions.
            message("isTestNull:"+t.isTestNull(t));
            message("isTestNull:"+t.isTestNull(null));

            Itest tn = t.getTest();
            message("getTest:"+(tn==null)+"(test)tn.object.pointer:"+((test)tn).object.pointer);
            message("tn == null: "+(tn == null));
            tn = t.getTestNull();
            message("tn == null: "+(tn == null));
            if (tn != null)
                message("isTestNull:"+(tn==null)+"(test)tn.object.pointer:"+((test)tn).object.pointer);

            message("new ByteVector");
            ByteVector bv = new ByteVector();
            message("mkByteVector()");
            IByteVector ibv = test.mkByteVector();
            ibv.reserve(20);
            ibv.push_back((byte)'h');
            // (the listing is cut off here in the source)

JavaGlue.05.2 JavaGlue Diagrams

(updated 5 hours ago by StephenWilliams)

JavaGlue Build Diagram

JavaGlue.06 JavaGlue Build System

(updated 13 hours ago by StephenWilliams)

Host vs. Embedded Development

For a variety of reasons, it is a best practice to write as much code as possible to run cross-platform.
With respect to Android, this means that the C++ code should run on MacOS X / Linux (and Windows
too if you like) and the first layer of Java code should work in "native Java" along with Dalvik. Since
there are very few Android-specific features that native Android C/C++ can get to, this is usually not too
difficult. In the case of our recent project, we had significant C++ code in several layers that needed to
be used on Android, iOS, Qt Windows, and Qt MacOSX.

To accomplish this, and to use JavaGlue efficiently, we chose CMake as a cross-platform
build system. We also wrote generic Java code that used much of the C++ layer so that this
could all be debugged on the host before running on Dalvik/Android. After solving many problems, and
recently porting to NDK5, we have used this extensively. We do not use the NDK build system, only the
cross-compilation tools, libraries, and headers. Getting this right required a thorough analysis of
the arguments for compiling and linking shared libraries for Android on multiple architectures, MacOSX,
and Linux. Windows building is partially solved (we build the C++ code using CMake-generated Visual
Studio projects), but we don't currently build the Java/JavaGlue portion of the system there, as there
has been no interest.

CMake, Two-Pass Builds, Android

We use CMake with custom platform definitions for Android and conditionalized explicit steps to run
JavaGlue code generation, build all of the C++ code into a shared library and the Java binding code into a
JAR file, and execute some regression tests. The Android Eclipse project then references the resulting
JAR file. CMake is very nice, and seems at first to have great documentation; however, anything
beyond a trivial project suffers from a lack of good examples. Furthermore, code generation systems like
JavaGlue give most build systems fits. The main problem is that C++ and Java source files can
appear in (or disappear from) the source tree during a build because of changes to the C++ headers.
Solving this in a way that mostly preserves full dependency-based building was a key success.

This was accomplished by running the CMake build system generation step a second time
when necessary. A driver Makefile runs a first pass. If JavaGlue code generation is required, a
flag file is removed, causing the rest of the first build to short-circuit. The Makefile reacts to the missing
flag file by running a second CMake generate pass, which picks up the file changes through standard
CMake globbing, and then runs make again with the same parameters. Generally, we maintain the
ability to run a local host and Android build without a clean. A 'make Clean' wipe is needed after certain
types of changes.
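
The control flow of the two-pass build can be modelled in a few lines. The real driver is a Makefile, so the following is only a sketch of the logic. The class name, the step labels, and the headersChanged parameter (standing in for "JavaGlue code generation is required") are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class TwoPassDriver {
    /** Runs the modelled build and returns the steps taken, in order. */
    static List<String> build(Path flagFile, boolean headersChanged) throws IOException {
        List<String> steps = new ArrayList<>();

        // Pass 1: generate the build system, then build.
        steps.add("cmake generate");
        Files.write(flagFile, new byte[0]);   // flag present: the source file list is current
        steps.add("make");
        if (headersChanged) {
            steps.add("javaglue codegen");    // may add or remove C++/Java sources
            Files.deleteIfExists(flagFile);   // the rest of pass 1 short-circuits
        }

        // Pass 2: only when the flag file went missing during pass 1.
        if (!Files.exists(flagFile)) {
            steps.add("cmake generate");      // globbing picks up the new file set
            Files.write(flagFile, new byte[0]);
            steps.add("make");                // same parameters as pass 1
        }
        return steps;
    }

    public static void main(String[] args) throws IOException {
        Path flag = Files.createTempFile("javaglue", ".flag");
        System.out.println(build(flag, true));
        Files.deleteIfExists(flag);
    }
}
```

Using a file as the signal is what lets a plain Makefile detect, after the fact, that the set of source files changed underneath the generated build system.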

JavaGlue.12 CMake describes CMake in a little more depth. The example build system uses
cpp-project-template as a base build environment, with CMake used in Makefile mode and our driver.
Scripts that we use for installing the needed apt packages on Ubuntu or MacPorts packages on MacOSX are
included. Note that we install all development MacPorts packages with +universal so that we can
produce both 32-bit and 64-bit libraries.

The main JavaGlue / XBiG system is in tools/xbig. Any Java binding related code, Java or C++, goes in
bindings/java, as does the main JavaGlue CMakeLists.txt script. CMake has very good support for
out-of-source builds, so we always build in build/. Be careful not to run 'cmake' outside of build/, as the
cache is sticky and stubborn. There is a script to clean up mistakes. Currently, we use subdirectories of
build/ for host vs. Android, etc., but the current example project does not. This will likely change in the
next release.

JavaGlue.07 Memory Management

(updated 15 hours ago by StephenWilliams)

Memory Management
A JavaGlue generated Java proxy class contains:

a pointer to a C++ object,

a flag that indicates how the object was allocated,
methods to call all C++ static and member methods through generated JNI code, and
some utility methods that are not normally used directly.

A "JavaGlue object" consists of an instance of the C++ class and the corresponding Java proxy class
instance that holds a pointer. The JavaGlue object can be created in three ways:

By allocation in Java (SomeCPPClass scc = new SomeCPPClass();)

By C++ code that returns a pointer (i.e. a "factory" method)
By C++ code that returns an object by value (factory method returning by value, which involves a
copy constructor)

When a JavaGlue object is created by allocation in Java, a flag is set in the Java object so that the object
can be deleted (by calling "delete()"). If not deleted explicitly (which is recommended) then the finalizer
on the Java class will delete the object. Note that finalizers may not run predictably or be guaranteed to
run in a given JVM.

If an object is returned from a method by pointer, JavaGlue records the fact that C++ code "owns" the
allocation of that object and will refuse to call the destructor on that object by throwing an exception.
This is similar to C++ allocation / deallocation rules in a number of environments. There is a Delete
utility class that contains "factory destructors" for some cases that don't automatically end up with
accessible destructors, such as byte arrays or vectors.
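
The ownership rule can be sketched in plain Java. This is a model of the behaviour described above, not the generated code: the class and field names are hypothetical, the pointer is just a number, and the exception type is an assumption, since the text only says that an exception is thrown.

```java
public class OwnershipDemo {
    /** Hypothetical proxy: a native pointer plus a flag recording who allocated it. */
    static class Proxy {
        private long pointer;               // would hold the C++ object's address
        private final boolean createdByJava;

        Proxy(long pointer, boolean createdByJava) {
            this.pointer = pointer;
            this.createdByJava = createdByJava;
        }

        /** Frees the C++ object if Java allocated it; refuses otherwise. */
        void delete() {
            if (!createdByJava) {
                throw new IllegalStateException("C++ owns this object");
            }
            pointer = 0L;                   // a real proxy would call the native destructor here
        }

        boolean isDeleted() { return pointer == 0L; }
    }

    /** Returns true if delete() was refused for the given proxy. */
    static boolean deleteRefused(Proxy p) {
        try {
            p.delete();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Proxy fromJava = new Proxy(0x1000L, true);     // as if created with 'new' in Java
        Proxy fromFactory = new Proxy(0x2000L, false); // as if returned by a C++ factory method
        System.out.println(deleteRefused(fromJava));
        System.out.println(deleteRefused(fromFactory));
    }
}
```

Calling delete() explicitly, rather than relying on the finalizer, keeps the lifetime of the C++ object predictable.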

In C++, objects are passed to methods in one of three ways, and as return values in one of two ways:

by value: the object is on the stack (both)

by pointer: a pointer to the object is on the stack (both)
by reference: a pointer to the object is on the stack, but it is always interpreted dereferenced and
cannot be null (only as a method parameter)

An object returned by value is a copy of the object that the method returned, which typically no longer
exists. In this case, there is a potential quandary. Because Java passes everything but primitive values as
references, there would be no obvious difference between an object returned by value and a pointer to an
object. To avoid all confusion and better match the actual semantics involved, the XBiG authors
implemented the return-by-value case as a return of void with an extra first parameter that is an "out"
variable. This requires the caller to first construct an object matching the return type needed, then pass
it as the first parameter. The reference is passed to the generated C method, which invokes the
copy constructor to copy the result into the Java-allocated object. This creates the requirement that the
object have a usable-from-Java constructor (not the case for a naked parameterized template type), and
that the Java application code delete the object later.
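
A plain-Java model of the calling convention may help. It assumes a hypothetical C++ method `Vec2 midpoint(Vec2 other)` on a hypothetical class Segment; neither comes from JavaGlue, and the copy is done in Java here, where generated code would run the C++ copy constructor.

```java
public class ReturnByValueDemo {
    /** Stand-in for a proxied C++ value type (hypothetical). */
    static class Vec2 {
        double x, y;
        Vec2() { }                                   // the usable-from-Java constructor
        void copyFrom(Vec2 o) { x = o.x; y = o.y; }  // plays the role of the C++ copy
    }

    /** Stand-in for a proxied class whose C++ method returns a Vec2 by value. */
    static class Segment {
        final Vec2 start = new Vec2();

        // Generated-style signature: void return, result object passed first.
        void midpoint(Vec2 returnValue, Vec2 other) {
            Vec2 tmp = new Vec2();       // the C++ temporary that would vanish on return
            tmp.x = (start.x + other.x) / 2;
            tmp.y = (start.y + other.y) / 2;
            returnValue.copyFrom(tmp);   // copied into the caller-allocated object
        }
    }

    /** Calls midpoint the way application code would and returns the x result. */
    static double demo() {
        Segment s = new Segment();
        s.start.x = 2.0;
        Vec2 end = new Vec2();
        end.x = 4.0;
        Vec2 result = new Vec2();        // the caller constructs the result object first
        s.midpoint(result, end);
        return result.x;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a real proxy, result would own a C++ object allocated from Java, so the application would call delete() on it when finished.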

JavaGlue creates a normal Java class that acts as an "interface" class ("I" followed by the class name)
and, usually, a Java class which is a subclass of the interface class. The interface class holds references
to static class methods. There is also a global interface class for public functions that are not class
members. In cases where there is no constructor available to Java, only the "interface" ("I" class) is
generated. While Java cannot create an instance of these classes, a reference (holding a pointer) can be
returned from a C++ method and later passed as a parameter.

JavaGlue.08 Utility Methods

(updated 14 hours ago by StephenWilliams)

XBiG already included good string conversion methods. JavaGlue adds byte array / byte vector
copy/allocate methods for a number of useful cases, plus memset.

    public class ByteArray {

        // ByteArray (byte[]) / ByteVector (vector<unsigned char>) methods
        public static long memset(IByteVector bv, long b, long len) {
            return memsetByteVectorNative(bv, b, len);
        }
        public static native long memsetByteVectorNative(IByteVector bv, long b, long len);

        // Copy a C++ byte vector into a new Java byte[].
        public static byte[] byteArray(IByteVector byteVector) {
            return byteArray(byteVector, false);
        }
        public static byte[] byteArray(IByteVector byteVector, boolean fullAllocation) {
            long bvn = byteVector.getInstancePointer().pointer;
            if (bvn == 0L) return null; // Just pass it on
            return byteArrayNative(bvn, fullAllocation);
        }
        public static native byte[] byteArrayNative(long ptr, boolean fullAllocation);

        // Allocate a new C++ byte vector from a Java byte[].
        public static ByteVector byteVector(byte[] bytes) {
            return byteVector(bytes, 0);
        }
        public static ByteVector byteVector(byte[] bytes, long reserve) {
            long _returnObjPtr = byteVectorNative(bytes, reserve);
            return new ByteVector(new InstancePointer(_returnObjPtr));
        }
        public static native long byteVectorNative(byte[] bytes, long reserve);

        // Copy between an existing byte vector and an existing byte[].
        public static byte[] copy(IByteVector bv, byte[] ba) {
            long bvn = bv.getInstancePointer().pointer;
            if (bvn == 0L) return null; // Just pass it on
            return copyNativebv2ba(bvn, ba);
        }
        public static native byte[] copyNativebv2ba(long bv, byte[] ba);

        public static ByteVector copy(byte[] ba, IByteVector bv) {
            long bvn = bv.getInstancePointer().pointer;
            if (bvn == 0L) return null; // Just pass it on
            long _returnObjPtr = copyNativeba2bv(ba, bvn);
            return new ByteVector(new InstancePointer(_returnObjPtr));
        }
        public static native long copyNativeba2bv(byte[] ba, long bv);

        // ByteArray / BytePointer methods
        public static long memset(BytePointer bp, long b, long len) {
            long bpn = bp.getInstancePointer().pointer;
            return memsetBytePointerNative(bpn, b, len);
        }
        public static native long memsetBytePointerNative(long bp, long b, long len);

        public static byte[] byteArray(BytePointer bp, long size) {
            long bpn = bp.getInstancePointer().pointer;
            return byteArrayNativebp(bpn, size);
        }
        public static native byte[] byteArrayNativebp(long bp, long size);

        public static BytePointer bytePointer(byte[] bytes) {
            return bytePointer(bytes, bytes.length);
        }
        public static BytePointer bytePointer(byte[] bytes, long size) {
            return bytePointerNative(bytes, size);
        }
        public static native BytePointer bytePointerNative(byte[] bytes, long size);

        // Copy from byte array to byte pointer up to size or ba.length.
        // Return amount of data written.
        public static long copy(byte[] ba, BytePointer bp, long size) {
            long bpn = bp.getInstancePointer().pointer;
            if (bpn == 0) error(0, "Bad bp instancePointer == null");
            return copyNativeba2bp(ba, bpn, size);
        // (the listing is cut off here in the source)

The base library provides some frequently needed template instantiations, plus an accessible way to
delete allocated objects.
The Xbig_* functions are used in C/C++ -> Java code. It turns out to be difficult to look up Java
methods from C without them: the default JVM reference doesn't have a complete class loader, so all
lookups fail.
jni_base.h:

void Delete::byteVector(base::ByteVector* bv) { delete bv; }
void Delete::stringVector(base::StringVector* sv) { delete sv; }
void Delete::vectorByteVector(base::VectorByteVector* vbv) { delete vbv; }
void Delete::mapStringByteVector(base::MapStringByteVector* msbv) { delete msbv; }
void Delete::mapLongByteVector(base::MapLongByteVector* mlbv) { delete mlbv; }
void Delete::mapStringString(base::MapStringString* mss) { delete mss; }

JNIEXPORT JNIEnv* Xbig_GetEnv();
JNIEXPORT jmethodID Xbig_cpath2MID(const char* cpath, const char* meth, const char* sig);
JNIEXPORT jmethodID Xbig_cpath2MIDenv(JNIEnv* env, const char* cpath, const char* meth, const
JNIEXPORT jfieldID Xbig_cpath2FIDenv(JNIEnv* env, const char* cpath, const char* field, const
JNIEXPORT jmethodID Xbig_obj2MID(jobject obj, const char* meth, const char* sig);
JNIEXPORT jmethodID Xbig_obj2MIDenv(JNIEnv* env, jobject obj, const char* meth, const char* sig);

basedelete.h:

#include <jni.h>
#include <string>
#include <vector>
#include <map>

namespace base {
    typedef std::vector<unsigned char> ByteVector;
    typedef std::vector<std::string> StringVector;
    typedef std::vector<ByteVector> VectorByteVector;
    typedef std::map<std::string, ByteVector> MapStringByteVector;
    typedef std::map<long, ByteVector> MapLongByteVector;
    typedef std::map<std::string, std::string> MapStringString;
}
class Delete {
public:
    static void byteVector(base::ByteVector* bv);
    static void stringVector(base::StringVector* sv);
    static void vectorByteVector(base::VectorByteVector* vbv);
    static void mapStringByteVector(base::MapStringByteVector* msbv);
    static void mapLongByteVector(base::MapLongByteVector* mlbv);
    static void mapStringString(base::MapStringString* mss);
};

AnDevCon2011 JavaGlue

JavaGlue.10 Internals
(updated 12 hours ago by StephenWilliams)


JavaGlue (and XBiG) uses Doxygen to parse C++ header files, producing an XML description of all
types, methods, members, and variables. This is processed in a series of stages by an Ant-driven XSL
engine. After producing an intermediate mapping, a Java generation pass and a C++ generation pass are
made. At this point, the original C/C++ code and the generated C++ code can be compiled into a shared
library. The generated Java code can be added to an Eclipse project, or just compiled into a JAR file
which can be referenced by an Eclipse project. Once the Java code compiles, the run settings must run
the application from a directory and environment where the shared library will be found. For an Android
project, this means having the shared library under the correct libs/ARCH/ directory and the JAR file (if
that route is taken) in the lib/ directory.
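
The generated Java wrappers in this section hold the C++ pointer in a Java long and guard deletion with "remote" and "deleted" flags. As a rough illustration of that lifecycle only, here is a plain-Java sketch with no JNI involved. FakeNativeHeap and HandleSketch are hypothetical names, and a HashMap stands in for the native heap; the real generated code calls a native __delete() instead.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the native heap (assumption: purely illustrative, no JNI). */
final class FakeNativeHeap {
    private static final Map<Long, String> live = new HashMap<>();
    private static long nextAddress = 0x1000L;

    /** Pretend to run a C++ constructor and return its address. */
    static long allocate(String description) {
        long p = nextAddress;
        nextAddress += 16;
        live.put(p, description);
        return p;
    }

    /** Pretend to run the C++ destructor; a second free of the same address fails. */
    static void free(long p) {
        if (live.remove(p) == null) {
            throw new IllegalStateException("double free or bad pointer: " + p);
        }
    }

    static boolean isLive(long p) { return live.containsKey(p); }
}

/** Hypothetical wrapper that mirrors the shape of a generated class. */
final class HandleSketch {
    private long pointer;          // address of the "C++" object, 0 once deleted
    private final boolean remote;  // true: owned by the native side, Java must not delete it
    private boolean deleted;

    /** Java-created object: Java owns it and may delete it. */
    HandleSketch() { this(FakeNativeHeap.allocate("owned by Java"), false); }

    /** Wrap an existing pointer, for example one returned from a native call. */
    HandleSketch(long pointer, boolean remote) {
        this.pointer = pointer;
        this.remote = remote;
    }

    long pointer() { return pointer; }

    /** Idempotent delete for owned objects; refuses to delete remote ones. */
    void delete() {
        if (remote) {
            throw new RuntimeException("can't dispose object created by native library");
        }
        if (!deleted) {
            FakeNativeHeap.free(pointer);
            deleted = true;
            pointer = 0;
        }
    }
}
```

Calling delete() twice on an owned handle frees the underlying object only once, which is why the generated finalize() can safely call delete() as well.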

Example code produced from the examples above:

Itest4.java:

package org.xbig;

import org.xbig.base.*;
public interface Itest4 extends INativeObject, org.xbig.Itest, org.xbig.Itest3 {

}

test4.java:

package org.xbig;

import org.xbig.base.*;
public class test4 extends org.xbig.base.NativeObject implements org.xbig.Itest4 {
    static { System.loadLibrary("xbig"); }

    /**
     *
     * This constructor is public for internal useage only!
     * Do not use it!
     *
     */
    public test4(org.xbig.base.InstancePointer p) {
        super(p);
    }

    /**
     *
     * Creates a Java wrapper object for an existing C++ object.
     * If remote is set to 'true' this object cannot be deleted in Java.
     *
     */
    protected test4(org.xbig.base.InstancePointer p, boolean remote) {
        super(p, remote);
    }

    /**
     * Allows creation of Java objects without C++ objects.
     *
     * @see org.xbig.base.WithoutNativeObject
     * @see org.xbig.base.INativeObject#disconnectFromNativeObject()
     */
    public test4(org.xbig.base.WithoutNativeObject val) {
        super(val);
    }

    public void delete() {
        if (this.remote) {
            throw new RuntimeException("can't dispose object created by native library");
        }

        if(!this.deleted) {
            __delete(object.pointer);
            this.deleted = true;
            this.object.pointer = 0;
        }
    }

    public void finalize() {
        if(!this.remote && !this.deleted) {
            delete();
        }
    }

    private final native void __delete(long _pointer_);

    /** **/
    public test4() {
        super( new org.xbig.base.InstancePointer(__createtest4()), false);
    }

    private native static long __createtest4();

    /** **/
    public BytePointer doWhatever(BytePointer pp) {

class_org_xbig_test4.cpp:

#ifdef WIN32
// disable warnings
#pragma warning (disable : 4267) // conversion from 'size_t' to 'jint'
#else

#endif

// use base library for cpp2j
#include "jni_base_all.h"

// import declaration of all functions
#include "class_org_xbig_test4.h"

// import header files of original library
#include <test.h>

/*
 * Class: org.xbig.test4
 * Method: test4()
 * Type: constructor
 * Definition: test4::test4
 * Signature: ()V
 */

JNIEXPORT jlong JNICALL Java_org_xbig_test4__1_1createtest4 (
    JNIEnv* _jni_env_, /* interface pointer */
    jclass _jni_class_ /* class pointer */
)
{
    // constructor of class test4

    // parameter conversions

    // create new instance of class test4
    test4* _cpp_this = new test4();

    // return casted pointer
    jlong _jni_pointer_ = reinterpret_cast<jlong>(_cpp_this);
    return _jni_pointer_;
} /* test4::test4 */

/*
 * Class: org.xbig.test4
 * Method: doWhatever()
 * Type: non-virtual method
 * Definition: char* test::doWhatever
 * Signature: (C)C
 */

JNIEXPORT jlong JNICALL Java_org_xbig_test4__1doWhatever_1_1cp (
    JNIEnv* _jni_env_, /* interface pointer */
    jobject _jni_this_, /* Java object */
    jlong _jni_pointer_, /* C++ pointer */
    jlong pp
)
{
    // parameter conversions
    char* _cpp_pp = reinterpret_cast<char*>(pp);

    // cast pointer to C++ object
    test4* _cpp_this = reinterpret_cast<test4*>(_jni_pointer_);

    // call library method
    char* _cpp_result = _cpp_this->doWhatever(_cpp_pp) ;

AnDevCon2011 JavaGlue

JavaGlue.11 Advanced JNI

(updated 1 hour ago by StephenWilliams)

AnDevCon2011 CMake JavaGlue

JavaGlue.12 CMake
(updated 12 hours ago by StephenWilliams)

What is CMake?
From the manual:

CMake is a cross-platform build system generator. Projects specify their build process
with platform-independent CMake listfiles included in each directory of a source tree
with the name CMakeLists.txt. Users build a project by using CMake to generate a build
system for a native tool on their platform.

CMake can be very confusing at first, especially if you only read the official manual and only have very
simple projects as reference. Here are a few words that may greatly ease the learning curve:

CMake is a meta-make system. This means that CMake doesn't build anything but build scripts for
actual build systems. CMake can generate Makefiles, Xcode projects, Visual Studio projects, and
Eclipse projects, at least. The generated files may look a bit different from what you expect. Mostly this
is good because some nice automation and other capabilities are provided. CMakeLists.txt scripts
reference other CMakeLists.txt scripts in subdirectories, and they can include other script files in a
project (typically named *.cmake). CMake relies on an installed directory of modules and other scripts
that know how to find various libraries, subsystems, executables, etc. These are invoked by requesting
a standard module, like Java, which sets the corresponding variables.

Typical operations in CMake scripts involve finding system capabilities, setting variables, globbing for
source code or data files (into variables), defining source directories & files, defining libraries, and
defining executables. Dependencies can be explicitly created while many are inferred automatically.
Custom commands can be defined. Most are used at build time, but there is some limited ability to do
operations at meta-make time. Definition of most operations is at a very high, logical level. Only when
defining a new platform or doing something beyond compile, create library, link executables do you
need to work with native tool definitions, arguments, or anything platform specific.

Many CMake variables have values at meta-make time based on where they are referenced in a
CMakeLists.txt file. The key examples of this are variables for the current source and current "binary"
(i.e. build target, shadow build) directories. Generally, a particular CMakeLists.txt can only set variables
for those scripts below it, so frequently things needed by one subdirectory from another are set at a
higher level. In essence, CMake at meta-make time is a functional language with a lot of built-in
functionality, controlled by conditionalized scripting to solve all platform-specific issues.

The resulting makefiles have absolute paths everywhere, targets that flow from the top directory down to
where they need to be built, and a default output that is very clean in the absence of errors. Running
'make VERBOSE=1' gives full detail of the steps being taken. For Make, you can still do parallel makes
with -j4 (J=4 to the driver build/Makefile in the JavaGlue example project). DEBUG is the default; use
RELEASE=1 for release builds.

http://www.cmake.org/cmake/help/cmake-2-8-docs.html
https://code.google.com/p/cpp-project-template/

Others using CMake with Android:

https://code.google.com/p/android-cmake/

AnDevCon2011 Ssx

Ssx
(updated 1 minute ago by StephenWilliams)

Ssx, a new open source Java XML parsing library.

World premiere.
This presentation can be found at: http://sdw.st/ssx.html#tag:Ssx

Why another XML library / API?

Super Simple Xml (Ssx) was written because:

There was (is?) no usable DOM XML parser on Android.

The standard DOM API is broken anyway (too verbose, inefficient).
Project needed to avoid spending a lot of time on XML parsing. Typical use of SAX event
processing, combined with complex application logic and networking, would have created too
much complexity.
There was a desire to write the most concise application code possible.
Parsing XML is not a big deal: stop the madness.
XML parsing using the built-in SAX parser on Android was too slow.
Some XML features were needed that are not typically in XML APIs (.getXml()).
Typical XML libraries are far larger than they need to be.

License
Written by Stephen Williams, principal at OptimaLogic. Development was split with client. Apache 2.0
license has been approved by all parties.

Download
On Google Code soon. For now, snag it from: http://sdw.st/conf/AnDevCon2011/ssx-1.0.zip

Concise Coding

How simple can you get?

Ssx ssx = new Ssx();
Ssx.Xml fx;
fx = ssx.parse("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><xml><test>"
    + "<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>this <![CDATA[ is <nice!> ]]> ok?</test>"
    + "</xml>");
ssx.message("test:"+fx.get("test"));
ssx.message("test xml:"+fx.getNode("test").toXml());

// Parsing an Atom-style media feed: A list of entries that may contain multiple links of different types
fx = ssx.parse(feed);
for (Ssx.Xml entry = fx.getNode("entry"); entry != null; entry = entry.nextSameName()) {
    for (Ssx.Xml link = entry.getNode("link"); link != null; link = link.nextSameName()) {
        ssx.message("Link type "+link.get("@type")+" href="+link.get("@href"));
    }
}

Intro to Ssx
Ssx provides a fast DOM and SAX parsing library that is concise to use and concisely written. It is a
non-validating, "reasonably conforming" XML parser, implemented in a single Java file of about 1000
lines of code that was written and optimized in about a week. Ssx is meant for parsing typical application
and business data. It is not intended as a solution to every XML need. There are a number of permanent
(DTDs) and a couple of temporary restrictions on the range of XML handled. The embedded SAX parser,
which implements the org.xml.sax.XMLReader interface, is 240 lines of code (with the core parse loop
written in dense "paragraph mode"). This parser also directly supports efficient implementation of the
toXml() method by remembering the text parsed. In a number of cases, this can make re-serializing XML
very fast.

Ssx is standard Java that also works well with Dalvik. The only Android-specific code is what is needed
to find SAX when the built-in SAX parser is selected:

try {
    parser = XMLReaderFactory.createXMLReader();
} catch (Exception e) {
    // Try known "default" for Android:
    System.setProperty("org.xml.sax.driver","org.xmlpull.v1.sax2.Driver");
    parser = XMLReaderFactory.createXMLReader();
}

Ssx is small and simple enough to be extended as needed for a particular use. This could include
additional XPath capabilities, specialized indexing, custom validation, or integrating signing and
encryption. Internal DTD entity definitions or external entities could easily be supported.

Why
XML is, at the base, a reasonably simple data format. A number of details, like namespaces, make
parsing XML somewhat interesting. Still, existing APIs are generally far more complex than they need
to be for most applications. An application generally wants to hand data to a parser, be told if there is an
error, and be able to find and retrieve data elements. In many cases, and especially for an Android
application, there is a lot to be said for having just the code needed to solve the job, leaving the kitchen
sink in the kitchen.

Application Data Models

Applications manage data in a variety of ways. Sometimes these are straightforward, sometimes exotic
towering frameworks must be fed and cared for. Plumbing and overhead should not dwarf business
logic.

Object Mapping

A number of methods center around creating classes for every business object, then writing code to map
external representations to those objects. This includes object relational and XML mapping.
Traditionally, developers wrote copious glue code at each layer and step. Some modern systems try to
alleviate this by using language-enabled annotations or metadata files so that this mapping can be done
interpretively. This can be helpful, but often the detailed steps and care needed to get this to work rival
manual glue code.

One must ask: Is this the only way to accomplish the business logic needed? Is it the most efficient?
Easiest to understand and modify? Is this the best use of the developer's time? Consider how many lines
of code need to be written at each layer for each data element introduced. Traditionally, it is several at
least, multiplied by many layers and both directions. In the ideal, and often possible, case, far less than
one line of code per element is needed at each layer.

The ideal case can be described this way: An application architecture is established where a message
travels from point A to point B, perhaps passing through proxies, intermediate steps that may observe or
also modify or add data, perhaps storing it in message queues or in a database. Each function, library,
and application along the way may be developed separately and updated at different intervals. If the
application at point A adds a new data element, what has to change for it to get to B and perhaps back to
A? In the ideal case, only A needs to change. When B is changed, it can react to that element.
Intermediate applications should not care that something has changed because they read what they are
interested in, insert or replace data that they care about, and pass the message along.

Typical applications are not this resilient and some XML frameworks do not easily enable the best case.

Versioning & Extensions


Something that is *always* an issue is how versioning and extensions are handled. How much code has
to change? What needs to be recompiled? When? Ideally, something like adding an additional field to a
message should be able to propagate through a system without requiring lockstep upgrades and without
conflicting with existing data or databases. XML, properly used, is one way to accomplish this.
(RDF-like graph-based semantic data is a better way, but that is another story.)

Collection Objects

A collection object is an instance of a class that manages sets of data in a structured way. A classic
example is a Map<> that provides a dictionary structure. A DOM-style XML representation is a type of
collection object, although the traditional DOM API is very cumbersome.

One way to avoid a lot of manual glue code or metadata is to use collection objects of some kind to
represent messages and business objects. Interestingly, these can be made arbitrarily hierarchical, just
like an object hierarchy. They can also be wrapped with very lightweight classes so that while the
collection class may provide clean find/set/get, application specific methods can be added. The result
can be used in a typical object oriented fashion while writing very little code that is not business logic.
While Ssx doesn't have this level of API yet, the author has designed and implemented this type of
solution very successfully in the past.
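
As a rough sketch of the idea above (not Ssx code, and not the author's earlier implementation), a generic hierarchical collection object can be wrapped by a very thin application class. DataBag and Order are hypothetical names, and the "/"-separated path syntax is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Generic hierarchical collection object: values are strings, numbers, or nested DataBags. */
final class DataBag {
    private final Map<String, Object> data = new LinkedHashMap<>();

    /** Store a value under a simple key; returns this so calls can be chained. */
    DataBag set(String key, Object value) {
        data.put(key, value);
        return this;
    }

    /** Walk a "/"-separated path; returns null when any step is missing. */
    Object find(String path) {
        Object cur = this;
        for (String part : path.split("/")) {
            if (!(cur instanceof DataBag)) return null;
            cur = ((DataBag) cur).data.get(part);
        }
        return cur;
    }

    /** Text value at the path, or the given default when it is not found. */
    String get(String path, String def) {
        Object v = find(path);
        return v == null ? def : v.toString();
    }
}

/** Thin application wrapper: only business-specific accessors, no mapping code. */
final class Order {
    private final DataBag bag;

    Order(DataBag bag) { this.bag = bag; }

    String customerName() { return bag.get("customer/name", "unknown"); }

    int quantity() { return Integer.parseInt(bag.get("quantity", "0")); }

    /** The underlying collection stays available, including fields Order knows nothing about. */
    DataBag raw() { return bag; }
}
```

Adding a new field to the message requires no change to Order unless the application wants a named accessor for it.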

XML Idioms & Loose Coupling Rules

Some key XML idioms are:

Accept anything, complaining only if it is malformed or you can't find required items
When passing on data received in some sense, pass on extra information even if it is not
understood.
Carefully produce data exactly to specification.
Prefer logical structure to physical: XML can be used to represent graphs and trees. The former are
more flexible.
Use namespaces, and semantic tagging if possible, to uniquely identify the types of elements,
attributes, and relationships.
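
The first two idioms can be illustrated with a small sketch of an intermediate hop. This is a simplified stand-in that uses a flat Map rather than an XML tree; PassThroughSketch and the field names "amount" and "audited" are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PassThroughSketch {
    /**
     * An intermediate hop: it requires "amount", adds "audited",
     * and forwards every other field untouched, whether or not it understands it.
     */
    static Map<String, String> relay(Map<String, String> incoming) {
        if (!incoming.containsKey("amount")) {
            // Complain only about what this hop actually needs.
            throw new IllegalArgumentException("required item 'amount' is missing");
        }
        // Copy everything, including fields this hop has never heard of.
        Map<String, String> outgoing = new LinkedHashMap<>(incoming);
        outgoing.put("audited", "true");
        return outgoing;
    }
}
```

If the sender later adds a field such as "currency", this hop forwards it without any code change.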

Writing XML
XML is usually easy to write: Simply concatenate strings, perhaps using a template that can be updated
easily. This is also usually the fastest method. It is a big help if parsed XML or generated data structures
can be easily converted into an XML document or a fragment that can be included.
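
A minimal sketch of writing XML by concatenation follows. It is not part of Ssx; XmlWriteSketch, escape(), and entry() are hypothetical helpers, and the element names simply echo the Atom-style feed used earlier. The one thing concatenation must not skip is escaping text and attribute values.

```java
final class XmlWriteSketch {
    /** Escape the characters that are unsafe in text and in double-quoted attribute values. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Build one feed entry as an XML fragment by plain string concatenation. */
    static String entry(String title, String href, String type) {
        return "<entry><title>" + escape(title) + "</title>"
             + "<link type=\"" + escape(type) + "\" href=\"" + escape(href) + "\"/></entry>";
    }
}
```

Fragments built this way can be concatenated into a larger document or template.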

In some cases, it can be helpful to have an API that allows building the XML output, perhaps in a
non-linear way. Ssx does not yet support this, but will soon.

How it works
Key insights used in Ssx are:


A single set of Maps could efficiently represent the structure of an arbitrary XML tree.
Structure is provided as map entries from the current node to the next in more than one sense: Next
sibling, next sibling with same name, first child, parent. The "next" operation, which is very
commonly used, is very fast.
No iterator types or objects are needed: Each object is its own iterator!
Each element can be represented with a very lightweight object, with relationships held completely
in the maps.
The text values can be referenced as ranges of the original parsed data. (There are some nuances
here since XML is Unicode and the actual source may have been bytes.)
A toXml() method can be supported at every element in a very efficient way. XML can be provided
as a fragment or as fully formed XML with all namespaces defined properly, allowing the XML
subtree to be recreated exactly in a later parse with no application fixup.
A minimal form of XPath allows a DOM-like API to support most operations efficiently in a single
line of code.
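
The map-based structure described in this list might look roughly like the sketch below. This is an illustration under assumptions, not the actual Ssx internals: TreeSketch and its map names are hypothetical, and for brevity it allocates a small Node wrapper on each step, which real Dalvik-tuned code would avoid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TreeSketch {
    // All structure lives in these shared maps, keyed by node id.
    private final List<String> names = new ArrayList<>();
    private final Map<Integer, Integer> parent = new HashMap<>();
    private final Map<Integer, Integer> firstChild = new HashMap<>();
    private final Map<Integer, Integer> lastChild = new HashMap<>();
    private final Map<Integer, Integer> nextSibling = new HashMap<>();
    private final Map<Integer, Integer> nextSameName = new HashMap<>();
    private final Map<String, Integer> lastSameName = new HashMap<>(); // key: parentId + "/" + name

    /** Lightweight element: just an id. Each node acts as its own iterator. */
    final class Node {
        final int id;
        Node(int id) { this.id = id; }
        String name() { return names.get(id); }
        Node next() { return wrap(nextSibling.get(id)); }
        Node nextSameName() { return wrap(nextSameName.get(id)); }
        Node firstChild() { return wrap(firstChild.get(id)); }
        Node parent() { return wrap(parent.get(id)); }
        Node add(String childName) { return addChild(id, childName); }
    }

    private Node wrap(Integer id) { return id == null ? null : new Node(id); }

    /** Create a root element. */
    Node root(String name) {
        names.add(name);
        return new Node(names.size() - 1);
    }

    /** Append a child and record every "next" relationship as a map entry. */
    private Node addChild(int parentId, String name) {
        int id = names.size();
        names.add(name);
        parent.put(id, parentId);
        Integer prev = lastChild.put(parentId, id);
        if (prev == null) {
            firstChild.put(parentId, id);
        } else {
            nextSibling.put(prev, id);
        }
        Integer prevSame = lastSameName.put(parentId + "/" + name, id);
        if (prevSame != null) {
            nextSameName.put(prevSame, id);
        }
        return new Node(id);
    }
}
```

A loop such as for (n = e.firstChild(); n != null; n = n.nextSameName()) then visits only the children that share the first child's name, with one map lookup per step.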

Android Lessons Learned

The first complete version of Ssx parsed 64K of XML in 30 ms in "native Java" (JDK 1.6 MacOS X).
This version took 65,000 ms to parse the same data on Android. After optimization, the code still took
the same 30 ms in native Java, but was down to 300-400 ms on Android, about the 1/10 speed ratio
expected. Speed was similar between Android 1.6 and 2.2.

Some Ssx lessons on optimizing for Dalvik:

Avoid creating objects of any kind. Memory allocation and garbage collection, plus the related
copying, should always be minimized. Character objects are expensive too.
Unicode character conversion (byte[]->char, char->byte[]) is too expensive. Inline code may be
used in the future.
Avoid function calls when possible.
Using Enums is very expensive! Don't do it in tight loops. A local "int" is very fast.
For small sets, especially with a String key, HashMap is far more expensive than TreeMap. Use
TreeMap.
Even TreeMap is too expensive. Much of the CPU in Ssx is spent in TreeMap.
Direct array access is very cheap.
Reusing objects is a key technique.
When expandable objects are needed, simple arrays with amortized bounds checking / reallocation are
preferred. Once a high-water mark is hit, remember it. Use non-linear expansion in size (doubling
for instance) when data varies widely.
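
The last three points (direct array access, object reuse, and doubling with a remembered high-water mark) can be sketched as a small growable int array. GrowableInts is a hypothetical class written for this illustration, not code from Ssx.

```java
final class GrowableInts {
    private int[] data;
    private int size;

    GrowableInts(int initialCapacity) {
        data = new int[Math.max(1, initialCapacity)];
    }

    /** Append a value, doubling the backing array only when it is full. */
    void add(int v) {
        if (size == data.length) {
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = v;
    }

    /** Direct array read; the caller is expected to stay below size(). */
    int get(int i) { return data[i]; }

    int size() { return size; }

    int capacity() { return data.length; }

    /** Reuse the object for the next parse: the high-water capacity is kept. */
    void clear() { size = 0; }
}
```

After clear(), a second run of similar size performs no allocation at all, which is the point of remembering the high-water mark.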

Ssx API

AnDevCon2011 Ssx

Ssx API
(updated 16 hours ago by StephenWilliams)

All retrieval methods return null when the request cannot be found, except for the versions which are
given a default value to return.

public class Ssx { // Reusable object holding parse tree. Just call parse() to reuse.
    // Determine which SAX parser is used: internal, local, or both, and whether to time parsing.
    public static void setParseType(boolean sparsep, boolean defaultParserp, boolean timeAllp);
    // Parse XML, given as a byte array.
    public Xml parse(byte[] xmlBytes, int off, int len, String nsSet) throws IOException, ParseException;
    // Parse an XML string. 'what' is information for logging. 'timed' determines whether the parse tim
    public Xml parse(String what, String xml, boolean timed) throws IOException, ParseException;
    // The element node object. Returned from a parse and most operations.
    public class Xml implements Comparable {
        // Allows objects to be compared.
        public boolean equals(Object o);
        // Returns XML equivalent of the current node. Contained in an '<xml>' node with all active name
        public String toXml() throws UnsupportedEncodingException;
        // Returns the current node as an XML fragment.
        public String toXmlFragment() throws UnsupportedEncodingException;
        // Returns the next sibling of this element.
        public Xml next();
        // Returns the next sibling of this element that has the same name, skipping any other elements.
        public Xml nextSameName();
        // Returns the node matching the path qname.
        public Xml getNode(String qname);
        // Returns the node matching the namespace+localname.
        public Xml getNode(String ns, String localName);
        // Returns the namespace of the current node.
        String namespace();
        // Return the name of the current node.
        String name();
        // toString(), getText(), and get() all return the text for the current element.
        public String toString();
        public String getText();
        public String get();
        // Returns the text value of the given path qname.
        public String get(String qname);
        // Returns the text value of the given path qname, or the passed default value if the path is not
        public String get(String qname, String def);
        // Returns the text value of the given namespace+path, or default.
        public String get(String ns, String path, String def);
        // Returns the value of the given node as an int.
        public int getInt();
        public int getInt(int defaultInt);
        public int getInt(String path, int defaultInt);
        // Returns the value of the given node as a double.
        public double getDouble();
        public double getDouble(double defaultDouble);
        public double getDouble(String path, double defaultDouble);
        public double getDouble(String path);
    }
    // Turn on debugging or verbose tracing.
    public static void setDebug(boolean deb, boolean verb) { debug = deb; verbose = verb; }

    ////// Utility methods that are often missing or not quite usable.
    // Pull a stream into a string efficiently.
    public static String slurp(InputStream in) throws IOException;
    // Pull a stream into a byte array efficiently.
    public static byte[] slurpBytes(InputStream is) throws IOException;
    // These will change soon to take a pass list as proper url encoding varies depending on situation.
    // Urlencode a string
    public static String urlEncode(String s);
    // Does this character need encoding?
    public static boolean needsEncode(char c);
    // Urldecode a string
    public static String urlDecode(String s);
    // Coming soon: b64 codec
}


AnDevCon2011 Ssx

Ssx Part 2
(updated 16 hours ago by StephenWilliams)

Coming Soon
1. Namespaces are handled well for many cases. What is currently not supported is namespace
definitions that change during a single parse. This includes a default namespace that is redefined or
only defined for a subtree. This can be improved to handle any non-pathological use of
namespaces. Namespace parsing in attribute values may also be handled.
2. It is possible that the lexical events, DTD entity declarations, and other features may be important
enough to be implemented. Some of these could activate a flag to enable extra features when
present to keep parsing as fast as possible in other cases.
3. More incremental parsing will be supported, especially to support the streaming event DOMlet
model.
4. Additional convenience methods, such as date parse.

Streaming Event DOMlet

The next feature to be released will be the streaming event DOMlet method. A method to register a
callback for a particular element path will allow a callback during parsing with an Ssx.Xml node for the
matching element that was just completed. The application can then process that element, returning true
if the element should be removed from the parse tree to save memory. The callback can use all normal
DOM-like methods on that element or the partially completed tree as a whole.

OpenEXI
OpenEXI is an open source project that combines several open source implementations of the W3C
Efficient XML Interchange binary XML standard. The author participated in the EXI working group and
the XBC working group before it. We plan to merge and refactor the existing code base, then provide an
Ssx API for OpenEXI so that either XML or EXI can be produced or parsed by applications. We have
also begun the process of getting OpenEXI into the Apache Incubator.
http://openexi.sourceforge.net/

What is EXI?

EXI provides a very compact encoding of the XML infoset (i.e., the informational equivalent of an XML
file) with some options. These options allow encoding of a standalone XML file or an XML file with
expected structure and data types specified with an XML Schema. The resulting intermediate encoding
can then optionally include data compression (ZLIB), applied in a particular way. With a schema,
encoding can be much more compact because the schema represents redundancy in the data and certain
data can be encoded as compact binary values.


The point of all of this is to highly optimize both the processing overhead of parsing and serialization
and the size of the resulting data. Both of these greatly reduce the overall data transfer, processing,
memory usage, and latency of data.

A frequently asked question is why XML + compression isn't just as good. The two main points are that
compression makes processing speed even worse for XML, and the result is still not as compact as EXI
in many cases. EXI greatly reduces the overhead of XML, particularly when many tags and attribute
names are used. Some XML has little of that overhead, so only restricted character sets or other binary
encoding would make a difference.
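
For reference, the "XML + compression" baseline discussed here can be sketched with the JDK's java.util.zip classes. This is not EXI and makes no claim about EXI's sizes or speed; it only shows the extra deflate and inflate passes that sit on top of normal XML serialization and parsing. XmlDeflateSketch and the sample document are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class XmlDeflateSketch {
    /** Compress bytes with ZLIB/deflate. */
    static byte[] deflate(byte[] input) {
        Deflater d = new Deflater(Deflater.BEST_COMPRESSION);
        d.setInput(input);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!d.finished()) {
            int n = d.deflate(buf);
            out.write(buf, 0, n);
        }
        d.end();
        return out.toByteArray();
    }

    /** Decompress bytes produced by deflate(). */
    static byte[] inflate(byte[] compressed) {
        Inflater inf = new Inflater();
        inf.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    /** Tag-heavy sample document, the kind of XML where markup overhead dominates. */
    static byte[] sampleXml(int rows) {
        StringBuilder sb = new StringBuilder("<readings>");
        for (int i = 0; i < rows; i++) {
            sb.append("<reading sensor=\"s").append(i % 4).append("\"><value>")
              .append(i * 0.5).append("</value></reading>");
        }
        sb.append("</readings>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

The receiver still has to inflate the bytes and then run a full XML parse, which is the processing cost the paragraph above refers to.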

Another key point, and one of the key differences between EXI and most prior optimized binary
formats, is that EXI can encode any XML that it is given, whether or not it matches the optional schema.

When to use

EXI is great for large amounts of complex data or for transfer of data that is more efficient in
binary, such as floating-point values or many date/time values. It is also efficient for small
messages, which in some cases can reduce to a few bytes.
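
As a rough illustration of why floats and date/times favor binary, the sketch below compares the size
of such values as XML text with a fixed-width binary form. The class and method names are
hypothetical, and the fixed 8-byte layouts are an assumption chosen for simplicity; EXI defines its
own encodings for these types, which this sketch does not reproduce.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

final class BinaryVsTextSketch {
    // Size in bytes of a value written as UTF-8 text, the way it would appear in XML.
    static int textSize(String lexical) {
        return lexical.getBytes(StandardCharsets.UTF_8).length;
    }

    // A double stored as a fixed 8-byte IEEE 754 value.
    static byte[] binaryDouble(double value) {
        return ByteBuffer.allocate(Double.BYTES).putDouble(value).array();
    }

    // A timestamp stored as 8 bytes of milliseconds since the epoch.
    static byte[] binaryInstant(Instant when) {
        return ByteBuffer.allocate(Long.BYTES).putLong(when.toEpochMilli()).array();
    }

    public static void main(String[] args) {
        String doubleText = "3.141592653589793";
        String timeText = "2011-03-09T16:20:00Z";
        System.out.println("double as text:      " + textSize(doubleText) + " bytes");
        System.out.println("double as binary:    "
            + binaryDouble(Double.parseDouble(doubleText)).length + " bytes");
        System.out.println("date/time as text:   " + textSize(timeText) + " bytes");
        System.out.println("date/time as binary: "
            + binaryInstant(Instant.parse(timeText)).length + " bytes");
    }
}
```

The binary forms also skip the text-to-number conversion on both ends, which matters when a message
carries many such values.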

GenXDM
GenXDM enables applications to write code that uses and manipulates XML trees
without being tied to a particular XML tree representation like DOM, DOM4J, AXIOM,
or any other. It also prods developers towards an immutable view of XML trees, which
will make it easier and faster to work with XML across multiple cores and multiple
processors.

http://www.genxdm.org/
GenXDM is a great concept. The GenXDM developers are interested in Ssx and OpenEXI.
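
The idea of coding against an abstract tree model can be sketched as follows. This is not the GenXDM
API: TreeModel, DomModel, SimpleNode, and SimpleModel are hypothetical names invented for this
illustration, and only the org.w3c.dom and javax.xml.parsers calls are real JDK APIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class TreeModelSketch {
    // Hypothetical read-only view of an XML tree; N is the node type of the representation.
    interface TreeModel<N> {
        String name(N node);
        List<N> childElements(N node);
    }

    // Adapter for the standard W3C DOM.
    static final class DomModel implements TreeModel<Node> {
        public String name(Node node) { return node.getNodeName(); }
        public List<Node> childElements(Node node) {
            List<Node> result = new ArrayList<>();
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i).getNodeType() == Node.ELEMENT_NODE) result.add(children.item(i));
            }
            return result;
        }
    }

    // A tiny immutable tree, standing in for a second representation.
    static final class SimpleNode {
        final String name;
        final List<SimpleNode> children;
        SimpleNode(String name, SimpleNode... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    static final class SimpleModel implements TreeModel<SimpleNode> {
        public String name(SimpleNode node) { return node.name; }
        public List<SimpleNode> childElements(SimpleNode node) { return node.children; }
    }

    // Application code: written once against TreeModel, usable with either representation.
    static <N> int countElements(TreeModel<N> model, N node, String name) {
        int count = model.name(node).equals(name) ? 1 : 0;
        for (N child : model.childElements(node)) {
            count += countElements(model, child, name);
        }
        return count;
    }

    // Parses XML text into a DOM tree and returns its root element.
    static Node parseDom(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Node dom = parseDom("<a><b/><c><b/></c></a>");
        SimpleNode simple =
            new SimpleNode("a", new SimpleNode("b"), new SimpleNode("c", new SimpleNode("b")));
        System.out.println(countElements(new DomModel(), dom, "b"));
        System.out.println(countElements(new SimpleModel(), simple, "b"));
    }
}
```

Because countElements only reads through the model, it never mutates a tree, which fits the
immutable view the description above favors.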

Other Minimal XML Parsers for Java

First, the obligatory "you can't do that":

Some of these problems are problems that most homemade or minimal solutions haven't considered, or
haven't had the full subject knowledge to implement correctly. These problems must be handled for a
parser to be a correctly working xml parser, and once those problems are solved, you pretty much end
up with something that is similar to the projects that already exists.

http://lists.xml.org/archives/xml-dev/200401/msg00492.html

Some other small-ish libraries. None seem nearly as concise and easy to use.

XMLtp
sparta-xml
NanoXML
jdom
tinyxml
piccolo
kXML

About Us

Corporate sdw

OptimaLogic

OptimaLogic, Inc. is a lean R&D consulting organization located in Silicon Valley that provides highly
technical mobile, desktop, and server related consulting in a variety of areas. These include:

mobile & service architecture/design,
Android & NDK / iOS / WebOS,
web services,
crypto / security,
scalable server applications,
database / storage,
machine vision, and
cutting edge UI.

Recent clients include technical startups, multi-national corporations, government agencies, and
academia. With access to a wide range of resources, OptimaLogic can quickly find the optimal solution
for your most challenging projects. We have a particular interest in early stage startups and high-profile,
important projects.

Look for the Concise Coding book soon.

Stephen Williams leads and runs OptimaLogic. Resume LinkedIn

AnDevCon2011 ConciseCoding

Concise Coding

Most code is far from optimal. It is too verbose, sometimes having hundreds of classes, and uses many
lines of code to do things that should be done in a single line. Interfaces are complicated, and
combining libraries and techniques often creates a combinatorial explosion of total complexity. The
chronic application of "keep it simple", "don't put too much code in one place", and "we don't have
time to rewrite that code" results in code that is far, far more complex than it should be.

In addition to problems in project management and architecture / high-level design, detailed design and
coding frequently suffer from over-application of methods and design rules, which leads to inefficiency,
pain, suffering, and project failure. Management challenges are answered somewhat by Agile, Scrum,
etc.

There are many architectural methods and principles that are very helpful. These are also managed with
iterative & Agile methods and emerging best practice. However, in very successful projects, applying
just the right techniques sparingly to get the lowest total complexity requires competent, experienced
architect-coders with the right goals. These are guidelines for choosing those goals.

1. Apply architectural principles and design methodologies, recommended design rules, and favor use
of well-known APIs and architectures
2. Always counterbalance with consideration of overall complexity for application developers,
maintenance, and reuse
3. Favor creating tools and libraries to concentrate complexity to keep application development as
simple and concise as possible
4. Develop and use design rules for when *not* to create new classes, files, packages, etc.
a. Avoid "class diarrhea": Most developers seem to have many reasons to create new classes and
practically no reasons not to.
i. Tools are making this worse. Peephole, tool-tip driven development can lead to a system
that is impossible to grok as a whole. Development can grind to a halt as it becomes more
and more difficult to make changes.
b. Don't pollute the namespace, class "space", file "space", etc.
c. Strive to reduce total "surface area" (the total cognitive load) at each level.
d. Architect for flexibility, but recognize when the flexibility is not needed. (Do you really need
to create an indirection for a constant like "http://"? Is it going to change? Does it need
translation? What are you doing???)
5. Be object-oriented at the macro level too
a. Keep everything together when that is possible and gives the lowest complexity.
b. Expect code reuse: Can a class be copied easily to another project / package, or do all classes
form a complex web? Having to change more than a class or two for incremental additions is a
good sign that something is wrong.
c. Don't ever hide application flow, configuration, and dependencies.
d. Avoid creating interfaces and classes just to pass, return, and store tuples.
i. In some cases, Map<> or String[]/Object[] can be appropriate. (Similar to C++ pair<> or
Qt/C++ or Objective-C properties or even Lisp lists.)
ii. Use generic callback interfaces for generic solutions.
iii. If a custom composite return type or callback interface is desired, declare it in an inner
class right next to where it is used, unless it is a very standard and common element in a
system.
6. Don't avoid refactoring or even a total rewrite
a. Recognize that developers are experts at the problem *after* they have created an initial
design and implemented it. Frequently, what seemed appropriate before solving all of the
details is, later, clearly not the best. Expect this. Redesign and rewrite at stopping points or
when development is slowing.
b. Working code can usually be rewritten much faster than the original: You are not wasting all
prior effort.
c. Rewrite in parallel to existing code when possible, which can allow toggling or running both
paths and comparing the results.
7. Use or create new conventions and methods that reduce complexity
a. Little things like coding conventions
i. Favor reducing vertical white space: Seeing more code at once is useful. (I prefer K&R
for that reason.)
ii. Use special rules when necessary: "paragraph mode" for intense, dense coding: See
Ssx.SParse.
b. Big things like architectural patterns
i. Signals / Slots, message based, queues, logging/debugging, ...
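
Guideline 5.d (avoid one-off classes and interfaces for tuples) can be sketched in Java as follows.
The class, method, and sample address names are hypothetical and chosen only for this illustration;
Map.Entry, AbstractMap.SimpleImmutableEntry, and BiConsumer are standard JDK types.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

final class TupleSketch {
    // 5.d.i: return a pair through Map.Entry instead of a one-off "HostAndPort" class.
    static Map.Entry<String, Integer> splitHostPort(String address) {
        int colon = address.lastIndexOf(':');
        return new SimpleImmutableEntry<>(address.substring(0, colon),
                                          Integer.parseInt(address.substring(colon + 1)));
    }

    // 5.d.ii: a generic callback type (the JDK's BiConsumer) instead of a custom listener interface.
    static void forEachHostPort(List<String> addresses, BiConsumer<String, Integer> callback) {
        for (String address : addresses) {
            Map.Entry<String, Integer> pair = splitHostPort(address);
            callback.accept(pair.getKey(), pair.getValue());
        }
    }

    // 5.d.iii: when a custom composite type is wanted, nest it right beside its only user.
    static final class Range {
        final int min, max;
        Range(int min, int max) { this.min = min; this.max = max; }
    }

    // Finds the lowest and highest port among the given "host:port" addresses.
    static Range portRange(List<String> addresses) {
        int[] bounds = {Integer.MAX_VALUE, Integer.MIN_VALUE};
        forEachHostPort(addresses, (host, port) -> {
            bounds[0] = Math.min(bounds[0], port);
            bounds[1] = Math.max(bounds[1], port);
        });
        return new Range(bounds[0], bounds[1]);
    }

    public static void main(String[] args) {
        Range range = portRange(List.of("alpha:8080", "beta:443"));
        System.out.println(range.min + ".." + range.max);
    }
}
```

Everything lives in one file with no new top-level types beyond the sketch itself, which keeps the
"surface area" small in the sense of guideline 4.c.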