The dRuby Book

4.1 Passing Objects Among Processes

In dRuby, you pass objects to another process via method arguments and receive objects via a return value. You can do this either by reference or by value. In this section, let’s explore the difference.

Passing Objects in Ruby

You usually pass by reference in Ruby. Let’s try it with irb.

  ​% irb --prompt simple​
  ​>> def foo(str); str.upcase!; end​
  ​>> my_str = "Hello, World."​
  ​>> foo(my_str)​
  ​=> "HELLO, WORLD."​
  ​>> my_str​
  ​=> "HELLO, WORLD."​

In the preceding example, we defined a method called foo, which calls upcase! internally. We assigned “Hello, World.” to the my_str variable and passed it to foo. Afterward, my_str is "HELLO, WORLD." because the upcase! call inside foo modified the very string that my_str refers to.

How about pass by value, which passes a copy of the object? Let’s try again with irb.

  ​>> my_str = "Hello, World."​
  ​=> "Hello, World."​
  ​>> foo(my_str.dup)​
  ​=> "HELLO, WORLD."​
  ​>> my_str​
  ​=> "Hello, World."​

If you give a copy of the object as the method argument, the call behaves like pass by value. You pass only a copy of my_str to the foo method, so my_str stays as "Hello, World." You can see that the foo method had no impact on the original value.

When you call a method in Ruby, it always chooses to pass by reference. To pass by value, you have to call the dup method on your own (see Figure 14, Passing by reference in a normal situation and passing by value using dup).

images/d2pass.png

Figure 14. Passing by reference in a normal situation and passing by value using dup
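If you’d rather run both experiments as one script, here is a minimal sketch. The compare_calls helper is a name made up for this illustration, and String.new is used only to guarantee that the strings are not frozen.

```ruby
# Same method as in the irb session: upcases its argument in place.
def foo(str)
  str.upcase!
end

# Hypothetical helper: runs both experiments and returns the two strings.
def compare_calls
  by_reference = String.new('Hello, World.')
  foo(by_reference)        # foo works on the very same object

  by_value = String.new('Hello, World.')
  foo(by_value.dup)        # foo works on a copy; by_value is untouched

  [by_reference, by_value]
end

p compare_calls   # prints ["HELLO, WORLD.", "Hello, World."]
```

Only the string passed directly to foo changes. The one passed through dup keeps its original value.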

Passing Objects in dRuby

Now let’s look at how this differs in dRuby.

Passing Objects with Marshal

dRuby uses the Marshal module to pass an object to other processes. Marshal provides a dump method that serializes an object into a byte string and a load method that deserializes it (see Figure 15, Using Marshal to create a copy of an object). You can also use Marshal to make a deep copy (a different object with the same values).

images/d2marshal.png

Figure 15. Using Marshal to create a copy of an object

Let’s play with Marshal.

  ​% irb --prompt simple​
  ​>> class Foo​
  ​>> attr_accessor :name​
  ​>> end​
  ​>> it = Foo.new​
  ​>> it.name = 'Foo 1'​
  ​>> it​
  ​=> #<Foo:0x40200e34 @name="Foo 1">​
  ​>> str = Marshal.dump(it)​
  ​# Copy this string of bytes to the second terminal​
  ​=> "\x04\bo:\bFoo\x06:\n@nameI\"\nFoo 1\x06:\x06ET"​
  ​>> foo = Marshal.load(str)​
  # "foo" and "it" are different objects with different ids!
  ​=> #<Foo:0x401f4008 @name="Foo 1">​

Let’s try again with a different terminal, but this time let’s copy the byte string we created using Marshal.dump.

  ​% irb --prompt simple​
  ​# Define Foo class at terminal 2​
  ​>> class Foo​
  ​>> attr_accessor :name​
  ​>> end​
  ​>> # Copy and paste the dump result at terminal 1​
  ​>> str = "\x04\bo:\bFoo\x06:\n@nameI\"\nFoo 1\x06:\x06ET" # Paste here.​
  ​>> Marshal.load(str)​
  ​# You got the copy of Foo object!​
  ​=> #<Foo:0x4021c0a8 @name="Foo 1">​

Make sure that you assigned the copied string into str. It should load the Foo object that you marshaled earlier.
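The deep copy mentioned earlier is easy to try in a single process. The following is a small sketch, not anything dRuby-specific: deep_copy is a hypothetical helper name, and it compares a Marshal round trip with dup, which copies only the outer object.

```ruby
class Foo
  attr_accessor :name
end

# Hypothetical helper: copy an object by dumping it and loading it back.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

it = Foo.new
it.name = String.new('Foo 1')

shallow = it.dup          # new Foo, but @name is the same String object
deep    = deep_copy(it)   # new Foo with its own new String

p shallow.name.equal?(it.name)   # true
p deep.name.equal?(it.name)      # false
p deep.name == it.name           # true
```

This is why an object that arrives through Marshal is fully independent of the one that was sent: every object it contains has been rebuilt as well.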

Pass by Reference Value

In the previous example, we exchanged the object byte string using copy and paste manually. However, you can exchange objects across processes (and even machines) via sockets. dRuby uses Marshal to call methods, pass arguments, and return values of other objects (see Figure 16, Marshal object and transfer via socket).

images/d2marshal2.png

Figure 16. Marshal object and transfer via socket

When you use Marshal.dump and Marshal.load, Ruby always creates new objects. Does this mean dRuby always passes by value?

The answer is yes and no. When you pass by value, you simply serialize an object. When you pass by reference, instead of serializing the object, you serialize an object containing a reference to the original object. In other words, passing by reference passes the value of an object holding a reference. You can say that passing by reference passes the value of the reference object (see Figure 17, Passing by reference implemented by passing the value of the reference).

images/d2marshal3.png

Figure 17. Passing by reference implemented by passing the value of the reference
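To make the idea concrete before we look at the real thing, here is a toy model in plain Ruby. It is only a sketch of the concept: ToyRef, TOY_REGISTRY, toy_export, and toy_resolve are hypothetical names for this illustration and are not part of dRuby, and the URI is a placeholder that nothing connects to.

```ruby
# Toy model only; none of these names belong to dRuby.
ToyRef = Struct.new(:uri, :id)

TOY_REGISTRY = {}   # id => live object, kept by the process that owns it

# Registers obj and returns a small handle that can be marshaled.
def toy_export(obj, uri)
  TOY_REGISTRY[obj.object_id] = obj
  ToyRef.new(uri, obj.object_id)
end

# Turns a handle back into the original object in the owning process.
def toy_resolve(ref)
  TOY_REGISTRY.fetch(ref.id)
end

ary = [1, 2, 3]

# Pass by value: the data itself is serialized, so we get a copy.
copy = Marshal.load(Marshal.dump(ary))
copy << 4

# Pass by reference: only the handle is serialized (again, by value).
handle = Marshal.load(Marshal.dump(toy_export(ary, 'druby://example:1234')))
toy_resolve(handle) << 5

p ary    # prints [1, 2, 3, 5]
p copy   # prints [1, 2, 3, 4]
```

The handle itself is copied like any other value, yet whatever is done through it reaches the original array.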

Remember that we used DRbObject to connect to the DRb server in Using the Service from irb? Well, DRbObject is the object that has the reference of the original object. DRbObject has two constructors for different purposes. The first one is the DRbObject.new_with_uri method, which allows you to create a reference object remotely using a URI. Another way is to use DRbObject.new(obj), which creates a reference object in its own process.

  ​% irb --prompt simple -r drb/drb -r pp​
  ​>> DRb.start_service​
  ​>> ary = [1, 2, 3]​
  => [1, 2, 3]
  >> ref = DRbObject.new(ary)
  => [1, 2, 3]

DRbObject.new(obj) creates a reference object that refers to a specific object. We need to start up a server with DRb.start_service because we created a reference object for other processes to access. DRb.start_service allocates the URI for the server.

Let’s observe the inside of the ref object.

  ​>> pp ref​
  ​​
  #<DRb::DRbObject:...
   @ref=537879846,
   @uri="druby://localhost:41708">
  => [1, 2, 3]

DRbObject contains two instance variables. The first one is @uri, which holds the URI of the server that owns the object, and the other is @ref, which holds the identification number (__id__) of the object it refers to.

  ​>> ary.__id__​
  ​=> 537879846​
  ​​
  ​>> exit​

You should be able to see that @ref holds the same number as ary.__id__. Together with @uri, it is the reference information.
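The same inspection can be scripted. The sketch below reads the two values through DRbObject’s __drburi and __drbref accessors instead of pp, and then marshals the reference. Two things here are assumptions of this example rather than part of the session above: the describe_reference helper name, and the druby://localhost:0 URI, which asks for any free port. Loading the reference inside the process that owns the object hands back the original object itself.

```ruby
require 'drb/drb'

# Hypothetical helper: builds a reference to obj and reports what is inside.
def describe_reference(obj)
  DRb.start_service('druby://localhost:0')   # assumption: port 0 = any free port
  ref = DRbObject.new(obj)
  bytes = Marshal.dump(ref)                  # serializes the reference, not obj
  {
    uri_matches: ref.__drburi == DRb.uri,    # @uri is our own server's URI
    id_matches: ref.__drbref == obj.__id__,  # @ref compared with the object's id
    dump_holds_uri: bytes.include?(DRb.uri), # the URI travels inside the dump
    loads_to_original: Marshal.load(bytes).equal?(obj)
  }
ensure
  DRb.stop_service
end

p describe_reference([1, 2, 3])
```

The dumped bytes contain the URI and the id, not the array elements. That is the "value of the reference" described in the previous section.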

Next, we’ll pass objects between two terminals, so make sure you have two terminals up and running. Let’s call the first terminal (the server) terminal 1 and the second one (the client) terminal 2. At terminal 1, we’ll run DRb.start_service with a Hash as the front object. The URI will be associated with the Hash object held in the front variable. You can treat this process as a server that exposes that object.

  ​# [Terminal 1]​
  ​% irb --prompt simple -r drb/drb​
  ​>> front = {}​
  ​>> DRb.start_service('druby://localhost:1426', front)​
  ​=> #<DRb::DRbServer:0x ..... >​
  ​>> DRb.uri​
  ​=> "druby://localhost:1426"​

"druby://localhost:1426" is the URI for the service at terminal 1. You can access this via terminal 2. Let’s also start up a server at terminal 2.

  ​# [Terminal 2]​
  ​% irb --prompt simple -r drb/drb​
  ​>> DRb.start_service​
  ​>> there = DRbObject.new_with_uri("druby://localhost:1426")​
  ​=> #<DRb::DRbObject:0x2ac5fe94 @uri="druby://localhost:1426", @ref=nil>​

You can create a reference by specifying a URI. @uri holds druby://localhost:1426, which identifies the DRbServer, and @ref is set to nil. When @ref is nil, the DRbObject refers to the front object that is tied to that URI.

OK, now let’s assign "Hello, World." as the value of key 1 in the there hash object.

  ​# [Terminal 2]​
  ​>> str = "Hello, World."​
  ​>> there[1] = str​
  ​=> "Hello, World."​

Let’s examine the front object at terminal 1.

  ​# [Terminal 1]​
  ​>> front​
  ​=> {1=>"Hello, World."}​

Yes! The front object displays "Hello, World."

The String object "Hello, World." is transferred by dRuby in byte string format and restored at terminal 1. "Hello, World." should have been exchanged by value.

You might wonder how to prove that it’s really passed by value. Let’s do another experiment. Let’s change the original string using a destructive operation. You should also check that the ID of the str is the same before and after the operation.

  ​# [Terminal 2]​
  ​>> str.__id__​
  ​=> 358800978​
  ​>> str.sub!(/World/, 'dRuby')​
  ​=> "Hello, dRuby." # Destructive substitution.​
  ​>> str.__id__​
  => 358800978 # <= Make sure that they are still the same object.

Let’s see how this impacted the object at terminal 1.

  ​# [Terminal 1]​
  ​>> front​
  ​=> {1=>"Hello, World."}​

The value of the front object remains as "Hello, World." The destructive operation at terminal 2 didn’t affect terminal 1.

Let’s also try changing "Hello, World." at terminal 1.

  ​# [Terminal 1]​
  ​>> front[1].sub!(/World/, 'Ruby')​
  ​=> "Hello, Ruby."​
  ​>> puts front[1]​
  ​Hello, Ruby.​
  ​=> nil​
  ​# [Terminal 2]​
  ​>> str​
  ​=> "Hello, dRuby."​

Phew—it has no impact either.
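If you want to repeat this experiment without juggling two terminals, the following sketch forks a child process to play the server. It assumes a platform where fork is available. The helper names with_remote_hash and pass_by_value_demo, the pipe used to hand over the URI, and the druby://localhost:0 URI (any free port) are all choices made for this example.

```ruby
require 'drb/drb'

# Hypothetical helper: serves an empty Hash from a child process (the
# "terminal 1" role) and yields a reference to it (we are "terminal 2").
def with_remote_hash
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    DRb.start_service('druby://localhost:0', {})  # assumption: any free port
    writer.puts DRb.uri                           # tell the parent where we are
    writer.close
    DRb.thread.join
  end
  writer.close
  uri = reader.gets.chomp
  reader.close
  yield DRbObject.new_with_uri(uri)
ensure
  if pid
    Process.kill('KILL', pid)   # the child has nothing to clean up
    Process.wait(pid)
  end
end

# Returns [our string after sub!, the server's copy fetched afterward].
def pass_by_value_demo
  with_remote_hash do |there|
    str = String.new('Hello, World.')
    there[1] = str                  # str is marshaled; the server keeps a copy
    str.sub!(/World/, 'dRuby')      # destructive change on our side only
    [str, there[1]]
  end
end

p pass_by_value_demo   # prints ["Hello, dRuby.", "Hello, World."]
```

The string stored on the server keeps its old value because the server only ever held a copy.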

Next, we’ll pass by reference on purpose. Use DRbObject.new() to create a reference object (DRbObject). Specify the reference of the str object and set it to key 2 of the there hash.

  ​# [Terminal 2]​
  ​>> there[2] = DRbObject.new(str)​
  ​=> "Hello, dRuby."​

Let’s see what’s inside front.

  ​# [Terminal 1]​
  ​>> require 'pp'​
  ​>> pp front​
  ​{1=>"Hello, Ruby.",​
  ​ 2=>​
  ​ #<DRb::DRbObject:0x00000100a22550​
  ​ @ref=2157043220,​
  ​ @uri="druby://yourhost:1426">}​

You can see that front[2] is DRbObject. (irb in 1.8 used to print out the internals of an object, but the behavior has changed in Ruby 1.9. We’ll use pp for inspection instead.) So, what happens if you try to print it?

  ​>> puts front[1]​
  ​Hello, Ruby.​
  ​=> nil​
  ​>> puts front[2]​
  ​Hello, dRuby.​
  ​=> nil​

front[2] contains DRbObject, which is a reference. It prints out “Hello, dRuby.”

Are you still with me? To summarize what happened: puts calls the to_s method of its argument and prints the result. So, in this case, it called front[2].to_s and then printed out the result. This method invocation against DRbObject is forwarded to its original object, which is "Hello, dRuby." in terminal 2. The flow is shown in Figure 18, Calling a method from terminal 1 to terminal 2.

images/d252pass.png

Figure 18. Calling a method from terminal 1 to terminal 2

You can confirm this by calling to_s directly.

  # [Terminal 1]
  ​>> front[2].to_s​
  ​=> "Hello, dRuby."​

OK, let’s check what happens if we modify "Hello, dRuby." at terminal 2 and see how it impacts terminal 1.

  ​# [Terminal 2]​
  ​>> str.sub!(/dRuby/, 'Ruby and dRuby')​
  ​=> "Hello, Ruby and dRuby."​
  ​# [Terminal 1]​
  ​>> front[2].to_s​
  ​=> "Hello, Ruby and dRuby."​

Yay, str.sub! at terminal 2 changes terminal 1. This is just like the Ruby we know.

Hmmm, so what happens if we do it the other way around? Let’s try to change strings from terminal 1.

  ​# [Terminal 1]​
  ​>> front[2].sub!(/Ruby and dRuby/, 'World')​
  ​=> "Hello, World."​
  ​# [Terminal 2]​
  ​>> str​
  ​=> "Hello, World."​

front[2].sub!() at terminal 1 changed the string at terminal 2, just as we expected.
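The whole pass-by-reference experiment can also be sketched as one script. Here the parent process plays terminal 1 and owns front, and a forked child plays terminal 2 and owns str. The helper name, the pipe used as a "ready" signal, and the druby://localhost:0 URIs (any free port) are assumptions of this example, and fork must be available on your platform.

```ruby
require 'drb/drb'

# Hypothetical helper. The parent plays terminal 1 (it owns front) and a
# forked child plays terminal 2 (it owns str).
def pass_by_reference_demo
  front = {}
  DRb.start_service('druby://localhost:0', front)  # assumption: any free port
  uri = DRb.uri

  reader, writer = IO.pipe
  pid = fork do
    reader.close
    DRb.start_service('druby://localhost:0')       # so terminal 1 can call us
    str = String.new('Hello, dRuby.')
    there = DRbObject.new_with_uri(uri)
    there[1] = str                                 # by value
    there[2] = DRbObject.new(str)                  # by reference
    writer.puts 'ready'
    writer.close
    DRb.thread.join                                # keep str alive and reachable
  end
  writer.close
  reader.gets                                      # wait until both keys are set
  reader.close

  front[1].sub!(/dRuby/, 'Ruby')    # changes only terminal 1's copy
  before = front[2].to_s            # forwarded to str in terminal 2
  front[2].sub!(/dRuby/, 'World')   # also executed in terminal 2
  { copy: front[1], before: before, after: front[2].to_s }
ensure
  if pid
    Process.kill('KILL', pid)
    Process.wait(pid)
  end
  DRb.stop_service
end

p pass_by_reference_demo
```

In the result, :copy is the string that was passed by value and edited locally. :before and :after are read through the reference, so they show the child’s string before and after the remote sub! call.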

In this section, we experimented with intentionally passing by reference. We’ll continue the experiments, so keep irb open on both terminals!

Which Is a Server and Which Is a Client?

Remember that front[2].sub!() was calling an object at terminal 2? You may wonder which is the server and which is the client. When you observe method invocation from the dRuby point of view, the process that receives the method call is the server, and the process that sends it is the client.

  ​there[1] = "Hello, World."​

When the setter or getter of there is called at terminal 2, terminal 1 is the server because the real object of there is at terminal 1, and terminal 2 is the client.

  ​front[2].sub!​

How about this case? The receiver will be terminal 2 because that’s where the string of front[2] exists, and terminal 1 is the client.

When processes join dRuby, either process can act as a server or a client.

Now let’s quit irb at terminal 2 and call front[2].sub! from terminal 1. It should raise a DRb::DRbConnError error because terminal 1 lost the connection to terminal 2.
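Here is a rough way to see that connection error without retyping the session. This sketch uses a simpler setup than the one above: a forked child serves a Hash, and the parent calls it once while the child is running and once after the child has been killed. The helper name and the druby://localhost:0 URI (any free port) are assumptions of this example.

```ruby
require 'drb/drb'

# Hypothetical helper: talks to a Hash served by a child process, then
# makes the same call again after that process has gone away.
def call_after_peer_quit
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    DRb.start_service('druby://localhost:0', {})   # assumption: any free port
    writer.puts DRb.uri
    writer.close
    DRb.thread.join
  end
  writer.close
  there = DRbObject.new_with_uri(reader.gets.chomp)
  reader.close

  there[1] = 'Hello, World.'
  while_alive = there[1]            # the child is the server for this call

  Process.kill('KILL', pid)         # the peer "quits irb"
  Process.wait(pid)
  pid = nil

  after_quit =
    begin
      there[1]
    rescue DRb::DRbConnError => e
      e.class
    end
  [while_alive, after_quit]
ensure
  if pid
    Process.kill('KILL', pid)
    Process.wait(pid)
  end
end

p call_after_peer_quit
```

The first element is the value fetched while the peer was alive. The second should be the DRb::DRbConnError class, because the reference still exists but nothing is listening behind it.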

Here’s a summary of what we learned in this section:

Joe asks:
Why Do You Need DRb.start_service on the Client Side?

So far, we’ve been calling the DRb.start_service method even when a client connects to a server. You might be wondering why.

  ​DRb.start_service​
  ​DRbObject.new_with_uri('druby://:1234')​

The client code can access a remote object on the server side without calling DRb.start_service, like this:

  ​irb(main):015:0> h = DRbObject.new_with_uri('druby://:1234')​
  ​=> #<DRb::DRbObject:0x10031cd00 @ref=nil, @uri="druby://:1234">
  ​irb(main):013:0> h​
  ​=> #<DRb::DRbObject:0x100348c98 @ref=nil, @uri="druby://:1234">
  ​irb(main):014:0> h['a']​
  ​=> 1​

So, why do we call DRb.start_service? This will make sense once you understand how to pass by reference. Now let’s try to iterate the hash.

  ​irb(main):018:0> h.map{|a| a}​
  ​DRb::DRbConnError: DRb::DRbServerNotFound​
  ​ from /usr/lib/ruby/1.9/drb/drb.rb:1658:in `current_server'​
  ​ from /usr/lib/ruby/1.9/drb/drb.rb:1726:in​

Hmmm, we got a DRb::DRbServerNotFound error message. This is because we can’t perform a Marshal.dump on the block (a Proc object) that we gave to the map method, and therefore dRuby passes it by reference. This means that the server has to ask the client to run the block for each element. However, the client can’t create a reference to the block, and the server would have no way to connect back to the client, because the client itself has not started up the service. Let’s try the same operation after we run DRb.start_service.

  ​irb(main):019:0> DRb.start_service​
  ​irb(main):020:0> h.map{|a| a}​
  ​=> [["a", 1]]​

It should work as expected. I do recommend that you run DRb.start_service even on the client side to avoid this kind of situation.
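To reproduce both outcomes in one run, you could try something like the sketch below. A forked child serves the hash, and the parent calls map with a block before and after starting its own service. The helper name and the druby://localhost:0 URIs (any free port) are assumptions of this example. The rescue clause catches DRb::DRbError, the common parent of the dRuby error classes, so it covers the DRb::DRbConnError shown in the session above.

```ruby
require 'drb/drb'

# Hypothetical helper: calls map with a block on a remote Hash, first
# without a local service and then with one.
def map_before_and_after_start_service
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    DRb.start_service('druby://localhost:0', { 'a' => 1 })  # any free port
    writer.puts DRb.uri
    writer.close
    DRb.thread.join
  end
  writer.close
  h = DRbObject.new_with_uri(reader.gets.chomp)
  reader.close

  before =
    begin
      h.map { |a| a }             # the block can't be marshaled or referenced
    rescue DRb::DRbError => e
      e.class
    end

  DRb.start_service('druby://localhost:0')   # now the block can be referenced
  after = h.map { |a| a }                    # the server calls back for each pair
  [before, after]
ensure
  if pid
    Process.kill('KILL', pid)
    Process.wait(pid)
  end
  DRb.stop_service
end

p map_before_and_after_start_service
```

The first element is the error class raised while the client had no service. The second is the mapped array returned once the client could accept callbacks.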