Calling R From Java: Duncan Temple Lang December 13, 2005
(Note that one does not need to create a separate instance for each evaluation; the same object can be used any number of times to evaluate expressions.) Now we want this object, e, to parse and evaluate the expression "objects()". For this, we call one of the eval() methods provided by the REvaluator class. The simplest version requires only the expression in the form of a string.

    Object val = e.eval("objects()");

This parses the expression and then evaluates it as a top-level R task. The result is then processed by the standard conversion mechanism used in the R-Java interface (including any user-registered converters) and the resulting Java object is returned. In this particular example, the R value returned from the expression is a character vector. This is converted to an array of String objects, so we can cast val to this type and work with it as a regular Java object.

    if (val != null) {
        String[] objects = (String[]) val;
        for (int i = 0; i < objects.length; i++)
            System.err.println("(" + i + ") " + objects[i]);
    }

Objects that have no translation need to be returned as an RForeignReference. They can then be manipulated in subsequent Java code in an application-specific fashion. The other versions of the eval() method in REvaluator allow one to control how the result of the R expression is converted back to a Java object. These allow one to specify whether an attempt should be made to find a converter or whether the value should be returned immediately as a foreign reference so that it can be used later in other R expressions. Alternatively, one can specify the required return type as a Java interface. The conversion code will then create a new class which extends RForeignReference and implements the methods of the Java interface by calling the corresponding R functions on that R object.
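Because the converted result arrives as a plain Object, the cast shown above is only safe when the R value really was a character vector. The following is a minimal, self-contained sketch of checking the runtime type before casting. The class ResultInspector and its describe() method are hypothetical helpers written for this illustration and are not part of the R-Java interface; the value in main() merely stands in for what an evaluator might return.

```java
import java.util.Arrays;

/** Sketch: inspecting a converted result before casting it. */
final class ResultInspector {

    /** Returns a printable description of a result, tolerating null. */
    static String describe(Object val) {
        if (val == null) {
            return "null";
        }
        if (val instanceof String[]) {
            return "String[] " + Arrays.toString((String[]) val);
        }
        if (val instanceof int[]) {
            return "int[] " + Arrays.toString((int[]) val);
        }
        if (val instanceof double[]) {
            return "double[] " + Arrays.toString((double[]) val);
        }
        // Anything else: report the class instead of guessing a cast.
        return val.getClass().getName();
    }

    public static void main(String[] args) {
        // Stand-in for the Object an evaluator might hand back.
        Object val = new String[] {"a", "b"};
        System.out.println(describe(val)); // prints "String[] [a, b]"
    }
}
```

A value that matches none of the array types is reported by class name, which is roughly where one would decide how to treat a foreign reference.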
1 Calling R functions
One will quickly find that specifying calls to R functions via strings is quite limited. It is difficult to pass values computed in earlier computations as arguments to these calls. Instead, we want to be able to call R functions through a more Java-like mechanism. We want to identify the function by name and then pass objects to it in a call of the form

    REvaluator.call(functionName, argArray, namedArgTable)

In , we can use a simple syntax such as

    functionName(arg1, arg2, .., name1=value, name2=value)

like we do in S. The standard conversion mechanism used in the inter-system interfaces is used to convert the Java arguments to R objects. This means that one can register a C routine, Java method, or S function to perform the conversion.

Here are some simple examples of calling R functions that involve only primitive arguments. These are taken from the org.omegahat.R.Java.Examples.JavaRCall class and one should consult that for more details. We start by creating the R interpreter instance in Java (ROmegahatInterpreter), which initializes the R engine, etc. Next, we create the R evaluator (REvaluator) instance which we will use to make the calls to the R engine.

[Examples:RfromJava]
    ROmegahatInterpreter interp = new ROmegahatInterpreter(ROmegahatInterpreter.fixArgs(args), false);
    REvaluator e = new REvaluator();
Now we are ready to create the different calls to the R functions. The first is a simple call to objects() with no arguments. This returns the names of the objects in the default, global environment. We know this returns an array of Strings (which may be null) and so we can perform the cast.

    String[] objects = (String[]) e.call("objects");
The second call specifies the element of R's search path whose contents are to be returned. We can specify this by name or by index. In this case, we provide the name as a Java String. To do this, we create an array containing each of the arguments (one in this case).

    String[] objects = (String[]) e.call("objects", new Object[]{"package:base"});
Now, we turn to calling the function seq() and giving it two or more arguments. The first call is equivalent to the S expression seq(as.integer(1), as.integer(10)). Again, we create an array to store the arguments. In this case, we specify the arguments as Integer objects. Finally, we invoke the REvaluator's call() method with these arguments.

    int[] seq;
    Object[] funArgs = new Object[2];
    funArgs[0] = new Integer(1);
    funArgs[1] = new Integer(10);
    seq = (int[]) e.call("seq", funArgs);

Now we add a third, named argument, specifically for the by parameter. We do this by creating a table of named arguments. We reuse the first two arguments in the array funArgs.

[Example:RfromJava]
    java.util.Hashtable namedArgs = new java.util.Hashtable(1);
    namedArgs.put("by", new Integer(2));
    seq = (int[]) e.call("seq", funArgs, namedArgs);
In some cases, it would be simpler to create an array with three arguments and to specify the names of the arguments separately. The following is equivalent to the previous example, as it specifies two parallel arrays, one giving the argument values and the other giving the argument names.

    Object[] funArgs = new Object[3];
    funArgs[0] = new Integer(1);
    funArgs[1] = new Integer(10);
    funArgs[2] = new Integer(2);
    String[] names = new String[3];
    names[2] = "by";
    Object value = e.call("seq", funArgs, names);
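To make the equivalence between the two calling styles concrete, here is a small self-contained sketch that splits a pair of parallel arrays into positional values and a name-to-value table. The class ArgSplitter is hypothetical and written only for this illustration. Treating a null or empty name as "positional" is an assumption about how the names array is interpreted, not something confirmed by the interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: separating parallel value/name arrays into positional and named arguments. */
final class ArgSplitter {
    final List<Object> positional = new ArrayList<>();
    final Map<String, Object> named = new LinkedHashMap<>();

    ArgSplitter(Object[] values, String[] names) {
        if (values.length != names.length) {
            throw new IllegalArgumentException("values and names must have the same length");
        }
        for (int i = 0; i < values.length; i++) {
            String name = names[i];
            if (name == null || name.isEmpty()) {
                // Assumption: an absent name marks a positional argument.
                positional.add(values[i]);
            } else {
                named.put(name, values[i]);
            }
        }
    }

    public static void main(String[] args) {
        Object[] funArgs = {Integer.valueOf(1), Integer.valueOf(10), Integer.valueOf(2)};
        String[] names = new String[3];
        names[2] = "by";

        ArgSplitter split = new ArgSplitter(funArgs, names);
        System.out.println(split.positional); // prints "[1, 10]"
        System.out.println(split.named);      // prints "{by=2}"
    }
}
```

Under that assumption, the parallel-array call carries the same information as the earlier call that passed a two-element array together with a table containing the single entry by.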
These examples involve simple primitive data types. Suppose we want to pass a matrix to R and have it produce a pair-wise plot. (Note that what we do with the matrix within the R call is the relevant point of this example.) There are two basic approaches (as is true in general for all the inter-system conversion approaches we use).

by-value: We can convert the Java matrix into an S matrix by copying its contents to S.

by-reference: We can create in R a proxy for the Java object and have the code that operates on it call methods on it using the $ operator or the .Java() function explicitly.
The by-value approach appears simpler but requires code specific to the Java class. It also means that one cannot share the object between Java and S so that modifications in one system are visible to the other. In our example, the code that generates the pairwise scatterplots (pairs()) is not written in a way to handle a reference to a Java object. Thus, we should use the by-value approach.

We should note, however, that we must start writing S code in a more general and flexible manner so that we can pass objects from different systems as arguments to functions and have the same behaviour. Too many S functions are written using explicit knowledge of the representation of the data type and are not abstracted to use methods on the object. This not only reduces the re-use of the code with respect to inter-system interfaces, but also makes for code that is not robust to small changes in design. The S4-style classes and, more importantly, object-oriented style classes are being added to S-Plus and R and are important tools for serious software development in the S language.

Back to our example and converting the Java matrix to an S matrix. As with all inter-system interfaces and distributed computing environments, we have to decide where to do the conversion. In other words, in which language do we write the computations that create the S matrix object from the Java instance? We can do this in R/S-Plus, Java or C. Let's take what should be the simplest approach and create an R function that converts the Java matrix. We will assume that we have a DenseDoubleMatrix2D from the Colt package (http://www.colt.org). The function is quite simple. It must obtain the values to put into the matrix and the dimensions of the matrix. The former can be retrieved by calling the toArray() method of the DenseDoubleMatrix2D object. Similarly, the number of rows and columns in the matrix can be obtained by calling the corresponding methods in the target object being converted.
Note that the toArray() method returns an array of arrays containing double values. This is converted to an R object as ...

    coltConverter <- function(x, klassName) {
      vals <- unlist(x$toArray())
      matrix(vals, x$rows(), x$columns())
    }
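One detail worth checking in a converter of this kind is element order. R's matrix() fills its result column by column unless byrow=TRUE is given. Assuming the array of arrays holds one inner array per row, flattening it in sequence yields the values row by row, so either the R side needs byrow=TRUE or the values need reordering before they cross the interface. The sketch below shows the reordering in Java; the class ColumnMajor is a hypothetical helper for illustration only and does not use Colt.

```java
import java.util.Arrays;

/** Sketch: flattening a row-oriented double[][] into column-major order. */
final class ColumnMajor {

    /** Returns the cells of a rectangular array ordered column by column. */
    static double[] flatten(double[][] rows) {
        int nrow = rows.length;
        int ncol = (nrow == 0) ? 0 : rows[0].length;
        double[] out = new double[nrow * ncol];
        for (int r = 0; r < nrow; r++) {
            if (rows[r].length != ncol) {
                throw new IllegalArgumentException("ragged array at row " + r);
            }
            for (int c = 0; c < ncol; c++) {
                // Cell (r, c) sits at offset c * nrow + r in column-major layout.
                out[c * nrow + r] = rows[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {
            {1, 2, 3},
            {4, 5, 6}
        };
        System.out.println(Arrays.toString(flatten(m)));
        // prints "[1.0, 4.0, 2.0, 5.0, 3.0, 6.0]"
    }
}
```

Passing the flattened vector to matrix(vals, 2, 3) in R would then reproduce the original 2-by-3 layout without byrow.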
Along with the converter, we have to register a function that determines whether the converter can handle the object being converted. In this case, we only handle DenseDoubleMatrix2D objects and so the function need only compare names.

    coltMatch <- function(x, klassName) {
      klassName == "cern.colt.matrix.impl.DenseDoubleMatrix2D"
    }
The final step is to register these functions with the basic conversion mechanism used by the R-Java interface.

    setJavaFunctionConverter(coltConverter, coltMatch,
                             description = "Colt DenseDoubleMatrix2D to R matrix",
                             fromJava = TRUE)
Exactly where and when this R expression is evaluated depends on the Java application and the R session. One can add it to a .First(), evaluate it using the eval() method in the REvaluator class, and so on.
This returns an object of class lm. We want to return this to the Java code. We have a variety of different options.
    public class SLinearModelFit
    {
        double[] coefficients;
        double[] residuals;
        int rank;

        public SLinearModelFit(double[] coeffs, double[] resids, int rank) {
            setCoefficients(coeffs);
            setResiduals(resids);
            setRank(rank);
        }

        public int getRank() { return(rank); }
        public void setRank(int v) { rank = v; }

        public double[] getResiduals() { return(residuals); }
        public void setResiduals(double[] vals) { residuals = vals; }

        public double[] getCoefficients() { return(coefficients); }
        public void setCoefficients(double[] vals) { coefficients = vals; }
    }

We should note that we can partially automate the creation of this class definition. We can create an instance of the lm class and then examine its elements and their types. Next, we can write a converter function in R that creates an instance of this new class.

[lmConvert.R]
    lmCvt <- function(obj, ...) {
      .JNew("org.omegahat.R.Java.Examples.SLinearModelFit",
            obj$coefficients, obj$residuals, obj$rank)
    }
Before we register the converter, we must load the Java library to make the S functions it provides available to the session. Note that at this point, we can make calls to the different functions that access the Omegahat interpreter (e.g. .Java(), .JNew(), etc.).
[lmConvert.R]
    library(Java)

Again, we must register the converter function along with its matching function, which determines whether the converter can handle a given object.

[lmConvert.R]
    setJavaFunctionConverter(lmCvt, function(x, ...) { inherits(x, "lm") },
                             description = "lm object to Java", fromJava = FALSE)
We are now in a position to write some Java code that actually calls this R code and makes use of the converters. The steps are relatively simple. We create the Omegahat interpreter and the Java version of the REvaluator. We source the R code that defines simLM() and the converter into the R session. We do this by evaluating an R expression. Note that we use a call to voidEval() since we are not interested in the return value. Now we are ready to invoke simLM(), and we do so by invoking the call() method of the REvaluator. We give it an array of the arguments, here containing just a single value which is an integer (Integer). The remainder of the code manipulates the result and prints out the coefficients, residuals and rank using the show() method of the interpreter.

[Main]
    static public void main(String[] args) {
        ROmegahatInterpreter interp = new ROmegahatInterpreter(ROmegahatInterpreter.fixArgs(args), false);
        REvaluator e = new REvaluator();

        // The file name, directory and package are quoted so that R sees strings.
        String rfile = "system.file('data', 'lmConvert.R', pkg='Java')";
        System.err.println("executing: source(" + rfile + ")");
        e.voidEval("source(" + rfile + ")");

        Object val = e.call("simLM", new Object[]{new Integer(10)});
        System.err.println("Result of simLM: " + val + " (" + val.getClass() + ")");

        org.omegahat.R.Java.Examples.SLinearModelFit f =
            (org.omegahat.R.Java.Examples.SLinearModelFit) val;
        interp.show("Coefficients:");
        interp.show(f.getCoefficients());
        interp.show("Residuals:");
        interp.show(f.getResiduals());
        interp.show("Rank:");
        interp.show(new Integer(f.getRank()));

        val = e.call("simLM", new Object[]{new Integer(10)});
    }
Having compiled the Java code and created the lmConvert.R file, we can invoke this Java application using the RJava script that is installed with the Java package. In my setup, I invoke it as follows.

    /tmp/R/tmp/Java/scripts/RJava --example --class org.omegahat.R.Java.Examples.lmTest --gui=none --silent

We specify lmTest as the class whose main() method is to be run. The remaining arguments are passed to the R startup and turn off the loading of the graphics device code and the startup message or banner. The following is the code that actually defines lmTest and allows it to be created directly from this document.

[lmTest.java]
    package org.omegahat.R.Java.Examples;
    import org.omegahat.R.Java.REvaluator;
    import org.omegahat.R.Java.ROmegahatInterpreter;

    public class lmTest {
        @use Main
    }

The makefile in the R/Java/Examples/ directory is responsible for creating the different files from this document.

    make
    make   # Do it twice for the moment!
Now, we can automatically generate a new class that implements this interface and inherits from RForeignReference. (See the function jdynamicCompile() in the Java package.) We create an instance of this class by first registering the lm object with the foreign reference manager,

    ref <- foreignReference(lmValue)

and then passing the resulting R reference object as the argument to the constructor of this new class.

    .JNew("LinearModelFitForeignReference", ref)
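The idea of a Java object whose interface methods are answered by functions on the other side of the interface can be illustrated without R at all. The sketch below swaps in the standard java.lang.reflect.Proxy facility for the generated class described above; it is not how jdynamicCompile() works. The LinearModelFit interface, the FunctionDispatcher interface and the wrap() method are all hypothetical names chosen for this illustration, and the "remote" side is simulated with a map.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

/** Sketch: an interface whose methods are forwarded, by name, to a dispatcher. */
final class ReferenceProxyDemo {

    /** Hypothetical view of a fitted model, for illustration only. */
    interface LinearModelFit {
        double[] getCoefficients();
        int getRank();
    }

    /** Hypothetical stand-in for "call the function of this name on the referenced object". */
    interface FunctionDispatcher {
        Object call(String functionName, Object[] args);
    }

    /** Builds an object implementing iface that forwards every method call to the dispatcher. */
    static <T> T wrap(Class<T> iface, FunctionDispatcher dispatcher) {
        InvocationHandler handler = (proxy, method, args) ->
            dispatcher.call(method.getName(), args == null ? new Object[0] : args);
        Object instance = Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(instance);
    }

    public static void main(String[] args) {
        // Simulated remote object: method name -> value it would return.
        Map<String, Object> fake = new HashMap<>();
        fake.put("getCoefficients", new double[] {1.5, -0.5});
        fake.put("getRank", Integer.valueOf(2));

        LinearModelFit fit = wrap(LinearModelFit.class, (name, a) -> fake.get(name));
        System.out.println(fit.getRank());               // prints "2"
        System.out.println(fit.getCoefficients().length); // prints "2"
    }
}
```

The trade-off is the same as in the by-reference approach: nothing is copied up front, but every method call crosses the interface, and a missing function on the far side only shows up when the method is invoked.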
3 Setup
There are two ways in which we might want to call R from Java. One is that we have a regular Java application and we want to embed R within it as a worker. Alternatively, we are in an R session, use R as the main controller, and from there call some Java code that needs to evaluate an R expression.

In the first case, you will need to have built R as a shared library. You can do this by passing the argument --enable-shared to the configure script when building R from source. Then, make the regular R installation and the R shared library. That is, from within the top-level directory of the R distribution issue the following commands:

    make
    cd src/main
    make libR

When Java is embedded in R, one need not do anything special other than installing and loading the Java package and following the steps to instantiate the REvaluator object and evaluate the expression(s).
4 Example
4.1 Command Line and Text Examples
When the Java package is installed, it provides a script, scripts/RJava, that allows one to run a Java application that can dynamically load R (assuming the shared library is available). You can use this to run the example provided in REvaluator by that class's main() method. Invoke it as

    Java/scripts/RJava --example --class org.omegahat.R.Java.Examples.JavaRCall --gui=none

This prints the search path and the result of the R expression objects("package:base"). It uses the Omegahat evaluator to print the results (and this is why they are truncated).

Another example allows you to type R commands at a Java-controlled prompt. Invoke this as

    Java/scripts/RJava --example --gui=none

Then, one sees the prompt [omegahat->R] and can type R expressions such as

    [omegahat->R] sum(1:3)
    6
    [omegahat->R] letters
    a b c d e f g h i j k l m n o p q r s t u v w x y z