The dRuby Book

6.4 Toward Applications

So far, we’ve looked into basic distributed data structures using tuplespace. The next step is to convert these data structures into classes that applications can easily reuse. As a starting point, let’s extend TSStruct, which we created earlier.

TSStruct lets you read and write a field specified by a Symbol. You create a new object by passing a name, which acts as the object’s identifier, and optionally a Struct or Hash instance that holds the initial values. You can replace existing fields in TSStruct, but you can’t add or delete fields once it has been initialized.

  class TSStruct
    def initialize(ts, name, struct=nil)
      @ts = ts
      @name = name || self
      return unless struct
      struct.each_pair do |key, value|
        @ts.write([@name, key, value])
      end
    end
    attr_reader :name

    def [](key)
      tuple = @ts.read([name, key, nil])
      tuple[2]
    end

    def []=(key, value)
      replace(key) { |old_value| value }
    end

    def replace(key)
      tuple = @ts.take([name, key, nil])
      tuple[2] = yield(tuple[2])
    ensure
      @ts.write(tuple) if tuple
    end
  end
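
As a quick check of how the class behaves, the following sketch runs TSStruct against a local Rinda::TupleSpace. The class is repeated so that the example runs on its own; the name 'counter' and its fields are invented for illustration and are not part of the original example.

```ruby
require 'rinda/tuplespace'

# TSStruct as defined above, repeated so that this sketch is self-contained.
class TSStruct
  def initialize(ts, name, struct = nil)
    @ts = ts
    @name = name || self
    return unless struct
    struct.each_pair do |key, value|
      @ts.write([@name, key, value])
    end
  end
  attr_reader :name

  def [](key)
    tuple = @ts.read([name, key, nil])
    tuple[2]
  end

  def []=(key, value)
    replace(key) { |_old_value| value }
  end

  def replace(key)
    tuple = @ts.take([name, key, nil])
    tuple[2] = yield(tuple[2])
  ensure
    @ts.write(tuple) if tuple
  end
end

ts = Rinda::TupleSpace.new

# 'counter', :count, and :label are hypothetical names for this example.
counter = TSStruct.new(ts, 'counter', { count: 0, label: 'hits' })

counter[:label] = 'visits'             # replace an existing field
counter.replace(:count) { |n| n + 1 }  # update based on the old value

p counter[:count]                          # => 1
p counter[:label]                          # => "visits"
p ts.read_all(['counter', nil, nil]).size  # => 2
```

Because both `[]=` and `replace` start with a `take`, assigning to a field that was never initialized would block until some other process writes a matching tuple. That is why fields can’t be added after initialization.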

Do you remember that each data structure requires a name that acts as an identifier?

Sem, Barrier, and RDStream all require a name of some sort. How would you generate a unique one? The tuplespace is a bag and allows duplicate entries, so it has no mechanism to detect duplicates. With this in mind, let’s think about how we should generate a unique name. The first scenario is when the tuplespace is used by only one process. The easiest way to obtain a unique identifier is to use an object itself as the name:

  key = Object.new
  key2 = Object.new
  p(key == key2) # -> false

If there is only one process, you can easily use an object as an identifier. As long as the object stays in memory, that is, until garbage collection (GC) reclaims it, no other object with the same ID will be generated. As an alternative to using Object.new, you can use the object that manages the data structure. "@name = name || self" sets self as the default value.
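
The difference between a reused name and an object used as a name can be seen in a small single-process sketch. The tuple contents ('config', :size, and the values) are made up for this example.

```ruby
require 'rinda/tuplespace'

ts = Rinda::TupleSpace.new

# A tuplespace is a bag: writing the same tuple twice stores two entries,
# so nothing stops two data structures from sharing a String name.
ts.write(['config', :size, 10])
ts.write(['config', :size, 10])
p ts.read_all(['config', :size, nil]).size  # => 2

# With plain objects as names, == means identity, so a template built
# with key matches only the tuples written with that very object.
key = Object.new
key2 = Object.new
ts.write([key, :size, 1])
ts.write([key2, :size, 2])

p ts.read([key, :size, nil])[2]      # => 1
p ts.read([key2, :size, nil])[2]     # => 2
p ts.read_all([key, nil, nil]).size  # => 1
```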

  class TSStruct
    def initialize(ts, name, struct)
      @ts = ts
      @name = name || self
      struct.each_pair do |key, value|
        @ts.write([@name, key, value])
      end
    end
    ...
  end

This writes TSStruct itself as an element of a tuple into TupleSpace. What if multiple processes access the same object via dRuby? One strategy is to have the object converted into a DRbObject, that is, a remote reference, before the name is sent to a remote location. Including DRbUndumped does exactly that.

  class TSStruct
    include DRbUndumped
    ...
  end

DRbObject consists of a URI of the process where an object exists (strictly speaking, the URI of the DRbServer running in that process) and a unique object identifier within the process. Therefore, a DRbObject can serve as a unique name within the TupleSpace where the TSStruct data exists.
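
The two parts of a reference can be inspected in a single process. This is only a sketch: the class Named is a hypothetical stand-in for TSStruct, and the address druby://127.0.0.1:0 (port 0 lets the OS pick a free port) is an assumption made for the example.

```ruby
require 'drb/drb'

# Hypothetical stand-in for TSStruct; only the mix-in matters here.
class Named
  include DRb::DRbUndumped
end

# Start a local dRuby server; no front object is needed for this sketch.
DRb.start_service('druby://127.0.0.1:0', nil)

obj = Named.new

# dRuby builds a reference like this one when an object cannot be
# marshalled by value.
ref = DRb::DRbObject.new(obj)
uri_matches = (ref.__drburi == DRb.uri)
puts ref.__drburi     # URI of the local DRbServer (the port varies)
p uri_matches         # => true

# References to the same object compare equal, which lets a tuplespace
# template match on them; a reference to another object does not.
same = (ref == DRb::DRbObject.new(obj))
other = (ref == DRb::DRbObject.new(Named.new))
p same                # => true
p other               # => false

# An object that includes DRbUndumped refuses to be marshalled by value.
dump_error = begin
  Marshal.dump(obj)
  nil
rescue TypeError => e
  e.class
end
p dump_error          # => TypeError

DRb.stop_service
```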

When the process that generates TSStruct and the process that holds TupleSpace are different, a few problems arise. One is performance, and another is object life span.

When a process accesses TSStruct, methods are invoked in both the process that generated the object and the process that holds TupleSpace where the TSStruct is stored.

There will be fewer RMI calls if the TSStruct methods run either in the client process itself or in the process that holds TupleSpace. Letting the client hold the TSStruct methods shows off the great flexibility of dRuby, but letting the TupleSpace process hold them gives better performance.

The problem of object life span is trickier. If the process that generated the object dies, then the generated TSStruct becomes unavailable, and it leaves unused tuples in TupleSpace. This also causes problems with name identification. dRuby’s URI is unique only while the process is up and running. Once the process that generated the object dies and a new dRuby service starts, the same port number might be allocated, so the same URI can end up referring to a different process.

The easiest workaround for the life span problem is for the process that holds TupleSpace to hold TSStruct as well. (It’s a bit sad to leave TupleSpace running on the back end, but you sometimes need a simpler solution for practical applications.)
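
One way to picture this workaround is a small factory object that lives next to TupleSpace. The sketch below is an assumption made for illustration, not code from dRuby or Rinda: TSStructFactory and its create method are hypothetical names, and the factory is exercised locally rather than over a real dRuby connection.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'

# TSStruct as defined earlier, with DRbUndumped so that clients would
# receive a reference instead of a copy.
class TSStruct
  include DRb::DRbUndumped

  def initialize(ts, name, struct = nil)
    @ts = ts
    @name = name || self
    return unless struct
    struct.each_pair { |key, value| @ts.write([@name, key, value]) }
  end
  attr_reader :name

  def [](key)
    @ts.read([name, key, nil])[2]
  end

  def replace(key)
    tuple = @ts.take([name, key, nil])
    tuple[2] = yield(tuple[2])
  ensure
    @ts.write(tuple) if tuple
  end
end

# Hypothetical helper: runs in the process that owns the TupleSpace,
# creates TSStruct objects there, and keeps a reference to each one so
# that the GC cannot reclaim them while clients still use them.
class TSStructFactory
  include DRb::DRbUndumped

  def initialize(ts)
    @ts = ts
    @structs = []
    @lock = Mutex.new # dRuby serves requests on multiple threads
  end

  def create(struct = nil)
    TSStruct.new(@ts, nil, struct).tap do |s|
      @lock.synchronize { @structs << s }
    end
  end
end

ts = Rinda::TupleSpace.new
factory = TSStructFactory.new(ts)

# In a deployment, the factory would be published as the front object of
# a dRuby server and clients would call create remotely. Here it is
# called in-process to keep the sketch runnable without a second process.
point = factory.create({ x: 1, y: 2 })
point.replace(:x) { |x| x + 10 }
p point[:x]  # => 11
p point[:y]  # => 2
```

With this arrangement every TSStruct method executes in the same process as TupleSpace, so a client pays for one RMI call per operation, and the name stays valid for as long as that process is running.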