Monday, September 20, 2010

New features of the embedding API for JRuby 1.6

This month, a lot of work has gone into JRuby's embedding API (RedBridge), mostly bug fixes. While fixing bugs, I ended up adding a new feature and changing how variables are shared. These changes will be in JRuby 1.6. A snapshot is currently available at http://ci.jruby.org/snapshots/ if you want to test it. I believe the changes make it possible to use Ruby more naturally, but they might slightly affect code that has already been written. That's why I'm writing this. If you use JRuby's embedding API or JSR223, be aware of the upcoming changes.


1. Variable sharing becomes receiver sensitive

Before the changes, the embedding API paid no attention to the difference between receivers. Here, a receiver means a Ruby receiver, that is, an object returned as the result of an evaluation. Variables and constants to be shared were injected into the top level, in other words, the runtime's top self. They were also supposed to be retrieved from the top-level variables and constants. However, this logic didn't work perfectly.

The first problem showed up in sharing instance variables. The reason is that the embedding API didn't use consistent receivers to inject and retrieve instance variables. When multiple receivers were involved, multiple values were assigned to the same key, which occasionally produced unwanted results.

In light of this, I added three methods to ScriptingContainer.

get(receiver, key)
put(receiver, key, value)
remove(receiver, key)

These methods explicitly interact with the given receiver. The existing get/put/remove methods, which take no receiver argument, use the top self as the receiver.
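
To picture what "receiver sensitive" means, here is a minimal sketch in plain Java. It is a toy model, not JRuby's actual implementation: the class name ReceiverTableSketch, the TOP_SELF sentinel, and the nested-map layout are all assumptions made for illustration. The point is only that a value is looked up by the pair (receiver, key), and that the receiver-less methods fall back to a single top-level receiver.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of a receiver-sensitive variable table (hypothetical, not JRuby code). */
final class ReceiverTableSketch {
    // Hypothetical stand-in for the runtime's top self.
    private static final Object TOP_SELF = new Object();

    // Receivers are compared by identity, so two objects of the same class never share a slot.
    private final Map<Object, Map<String, Object>> table = new IdentityHashMap<>();

    Object put(Object receiver, String key, Object value) {
        return table.computeIfAbsent(receiver, r -> new HashMap<>()).put(key, value);
    }

    Object get(Object receiver, String key) {
        Map<String, Object> vars = table.get(receiver);
        return vars == null ? null : vars.get(key);
    }

    Object remove(Object receiver, String key) {
        Map<String, Object> vars = table.get(receiver);
        return vars == null ? null : vars.remove(key);
    }

    // Receiver-less variants delegate to the top-level receiver.
    Object put(String key, Object value) { return put(TOP_SELF, key, value); }
    Object get(String key) { return get(TOP_SELF, key); }
    Object remove(String key) { return remove(TOP_SELF, key); }

    public static void main(String[] args) {
        ReceiverTableSketch vars = new ReceiverTableSketch();
        Object tree1 = new Object();
        Object tree2 = new Object();

        vars.put(tree1, "@name", "pine");
        vars.put(tree2, "@name", "poplar");
        vars.put("@name", "top level");

        System.out.println(vars.get(tree1, "@name")); // pine
        System.out.println(vars.get(tree2, "@name")); // poplar
        System.out.println(vars.get("@name"));        // top level
    }
}
```

With a table keyed by the name alone, the second put of "@name" would simply overwrite the first, which is the kind of collision described above.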

Let me show you an example. I'm going to use the Ruby code below, tree_sample.rb:

class Tree
  attr_accessor :name, :shape, :type

  def initialize name, shape, type
    @name = name
    @shape = shape
    @type = type
  end

  def name= name
    @name = name
  end

  def to_s
    "#{name.capitalize} is a(n) #{shape} shaped, #{type} tree."
  end
end

When the code is evaluated by:

ScriptingContainer container = new ScriptingContainer(LocalContextScope.SINGLETHREAD);
container.runScriptlet(PathType.CLASSPATH, "ruby/tree_sample.rb");

The runtime caches the Tree class and returns nil, which is converted to null for Java code. Then, suppose two objects are instantiated:

Object tree1 = container.runScriptlet("Tree.new('any', 'pyramidal', 'evergreen')");
Object tree2 = container.runScriptlet("Tree.new('any', 'oval', 'deciduous')");

tree1 and tree2 above are receivers. Before the changes, at the end of the runScriptlet method the container retrieved instance variables from the receivers and saved them in an internal variable table, keyed only by the instance variable name. Since there are two objects here, two values for each instance variable were assigned to a single key without any receiver information. Retrieval should be receiver sensitive, just like injection.

After the change, instance variable values are tied to both the key and the receiver. In JRuby 1.6, you'll get the expected results even when there are multiple receivers of the same class. For example, suppose callMethod is run for each receiver object:

container.callMethod(tree1, "name=", "pine");
container.callMethod(tree2, "name=", "poplar");
System.out.println(container.callMethod(tree1, "to_s", String.class));
System.out.println(container.callMethod(tree2, "to_s", String.class));

The result is as expected:

Pine is a(n) pyramidal shaped, evergreen tree.
Poplar is a(n) oval shaped, deciduous tree.

For clarity, let me add a few more lines that use the variable sharing methods:

container.put(tree1, "@name", "camellia");
container.put(tree2, "@name", "cherry");
container.put(tree1, "@shape", "oval");
container.put(tree2, "@shape", "round");

System.out.println(container.callMethod(tree1, "to_s", String.class));
System.out.println(container.callMethod(tree2, "to_s", String.class));

System.out.println("@type of tree1: " + container.get(tree1, "@type"));
System.out.println("@type of tree2: " + container.get(tree2, "@type"));

The code above prints:

Camellia is a(n) oval shaped, evergreen tree.
Cherry is a(n) round shaped, deciduous tree.
@type of tree1: evergreen
@type of tree2: deciduous


So far, I have talked about instance variables. Constants are also receiver sensitive in JRuby 1.6. Look at the code below:

ScriptingContainer container = new ScriptingContainer(LocalContextScope.SINGLETHREAD);
String script =
"COLOR = 'pink'\n" +
"class ColorSample\n" +
" COLOR = 'orange'\n" +
"end\n" +
"ColorSample.new";
Object receiver = container.runScriptlet(script);
System.out.println("top level: " + container.get("COLOR"));
System.out.println("class: " + container.get(receiver, "COLOR"));

There are two constants with the same name, COLOR. While container.get("COLOR") gets the top-level constant, container.get(receiver, "COLOR") gets the constant from the given receiver. So, the output is:

top level: pink
class: orange


How about global and local variables? Global variables are receiver insensitive because they are meant to be referenced globally. Local variables are always injected into and retrieved from the top level. It might have been possible to tie them to a receiver, but that is not a good idea because Ruby code might use gems or third-party libraries. Local variables injected from Java would be aliens to those libraries and could cause unexpected results.

For JSR223 users, I added an "org.jruby.embed.receiver" attribute, but it needs more work before it is usable.


2. Variable/constant value retrieval becomes lazy by default


Before the change, ScriptingContainer tried to retrieve as many variable/constant values as possible at the end of the runScriptlet, callMethod, and EvalUnit.run methods. This was convenient because variables/constants defined in Ruby were ready for Java to use. However, too many values were saved in the internal variable table, and those variables were then implicitly injected into Ruby code in succeeding evaluations. Obviously, this behavior hurt performance and memory usage, and it became serious when multiple gems were used.

In JRuby 1.6, all variables/constants are retrieved lazily, except persistent local variables. This means the internal variable table holds a minimum of key-value pairs. When a get method of ScriptingContainer is called, the requested key-value pair is saved in the variable table. Key-value pairs that were put into the internal variable table before the evaluation are updated right after the evaluation; others are not. Persistent local variables are the exception: their values are retrieved eagerly, as before. This is due to JRuby's internals, and it is helped by the policy that only top-level local variables are shared. I think it is safe to assume that top-level local variables are few.
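
The lazy policy can be pictured with another small plain-Java sketch. Again, this is only a model under my own assumptions, not how JRuby is implemented: the rubySide map stands in for values living in the Ruby runtime, and the names LazyTableSketch and afterEvaluation are hypothetical. It shows the two rules: keys already in the table are refreshed after an evaluation, and anything else is pulled in only when get asks for it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of lazy variable retrieval (hypothetical, not JRuby code). */
final class LazyTableSketch {
    // Stand-in for the values held by the Ruby runtime (an assumption for this sketch).
    private final Map<String, Object> rubySide;
    // Java-side variable table; kept as small as possible.
    private final Map<String, Object> table = new HashMap<>();

    LazyTableSketch(Map<String, Object> rubySide) {
        this.rubySide = rubySide;
    }

    // Values put from Java are tracked in the table and injected into the "runtime".
    void put(String key, Object value) {
        table.put(key, value);
        rubySide.put(key, value);
    }

    // After an evaluation, refresh only the keys the table already knows about.
    void afterEvaluation() {
        for (Map.Entry<String, Object> entry : table.entrySet()) {
            if (rubySide.containsKey(entry.getKey())) {
                entry.setValue(rubySide.get(entry.getKey()));
            }
        }
    }

    // Unknown keys are fetched on demand and remembered from then on.
    Object get(String key) {
        if (!table.containsKey(key) && rubySide.containsKey(key)) {
            table.put(key, rubySide.get(key));
        }
        return table.get(key);
    }

    int size() {
        return table.size();
    }

    public static void main(String[] args) {
        Map<String, Object> runtime = new HashMap<>();
        LazyTableSketch vars = new LazyTableSketch(runtime);
        vars.put("@count", 1);

        // Simulate an evaluation that changes @count and defines two more values.
        runtime.put("@count", 2);
        runtime.put("$big", "lots of data");
        runtime.put("CONST", 3);
        vars.afterEvaluation();

        System.out.println(vars.size());        // 1
        System.out.println(vars.get("@count")); // 2
        System.out.println(vars.get("CONST"));  // 3
        System.out.println(vars.size());        // 2
    }
}
```

In this model "$big" never reaches the Java-side table unless someone asks for it, which is where the savings in memory and injection work come from.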

The change should bring good results; however, JSR223 users might be affected. JSR223's SimpleBindings and SimpleScriptContext are always a headache for me, since it's very hard to add a trick to those two. One workaround is to set the lazy option to false and use ScriptEngine#getContext().getAttribute(). Another is to put a key with a dummy value into the bindings; JRubyEngine then updates the value right after the evaluation. I'll improve JSR223 support by the release of 1.6.
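
The dummy-value workaround might look roughly like the sketch below. It uses only the standard javax.script API. The engine name "jruby", the "@message" key, the Ruby snippet, and the helper names are my assumptions for illustration, and the code just reports the seeded keys when no JRuby engine is on the classpath.

```java
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import javax.script.SimpleBindings;

/** Sketch of seeding bindings with dummy values (helper names are hypothetical). */
final class SeededBindingsSketch {
    // Any placeholder works; the engine is expected to overwrite it after evaluation.
    static final Object PLACEHOLDER = "";

    // Build bindings that already contain every key we want back from Ruby.
    static Bindings seed(String... keys) {
        Bindings bindings = new SimpleBindings();
        for (String key : keys) {
            bindings.put(key, PLACEHOLDER);
        }
        return bindings;
    }

    public static void main(String[] args) throws ScriptException {
        Bindings bindings = seed("@message");

        // "jruby" is the engine name assumed here; getEngineByName returns null if it is absent.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("jruby");
        if (engine == null) {
            System.out.println("No JRuby engine found; seeded keys: " + bindings.keySet());
            return;
        }

        engine.setBindings(bindings, ScriptContext.ENGINE_SCOPE);
        engine.eval("@message = 'hello from Ruby'");
        // With the key seeded beforehand, the value should be refreshed after the evaluation.
        System.out.println(engine.getContext().getAttribute("@message"));
    }
}
```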


With these changes, users' programs will feel more natural in terms of Ruby coding, even on JRuby's embedding API. My work on this is not finished yet. I'll add more tests and some measures to make these work on JSR223.
