The dRuby Book

8.3 Persisting a Tuple with PTupleSpace

TupleSpace in Rinda makes complex interprocess communication very easy. However, the fact that the entire tuplespace is in volatile memory is worrisome—you could lose all your data when the system crashes. That’s a bit frightening. To work around the problem, I created a persistency feature; let’s discuss its architecture and limitations.

Persistency in TupleSpace

Let’s get acquainted with the basic use of PTupleSpace while learning about its crash and recovery mechanism.

PTupleSpace is a subclass of TupleSpace that logs every change to the tuple state. When PTupleSpace is restarted, it recovers the tuples to their last logged state.

Using PTupleSpace is easy; here’s an example:

ptuple.rb
  require 'rinda/ptuplespace'
  store = Rinda::TupleStoreLog.new('ts_log')
  Rinda::setup_tuple_store(store)

  DRb.install_id_conv(Rinda::TupleStoreIdConv.new)
  ts = Rinda::PTupleSpace.new
  DRb.start_service('druby://localhost:23456', ts)
  ts.restore

  ts.write(['Hello', 'World'])
  p ts.read_all(['Hello', nil])
  p ts.take(['Hello', nil])

  x = ts.write(['Hello', 'cancel'], 2)
  p ts.read_all(['Hello', nil])
  ref = DRbObject.new(x)
  ref.cancel
  p ts.read_all(['Hello', nil])
  x = ts.write(['Hello', 'World'])

  p DRbObject.new(x)

Running the script should show output like this:

  $ ruby ptuple.rb
  [["Hello", "World"]]
  ["Hello", "World"]
  [["Hello", "cancel"]]
  []
  #<Rinda::PTupleEntry:0x00000101392388>

Let’s see what just happened. To use PTupleSpace, you first need to prepare a TupleStore, which becomes the persistency layer of the PTupleSpace process. There are different kinds of TupleStore, and you can switch between them. In this example, I used TupleStoreLog, a simple persistency layer based on Marshal. To use it, you specify the name of the directory where its logs will be saved.

  store = Rinda::TupleStoreLog.new('ts_log')
  Rinda::setup_tuple_store(store)

Once the persistency layer is set up, you can create a PTupleSpace and then recover to the previous state.

  ts = Rinda::PTupleSpace.new
  ts.restore

Calling restore rebuilds the TupleSpace from the information logged before the call, including logs left by previous runs.
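
To convince yourself that the log works, you can point a second process at the same ts_log directory. Here’s a minimal sketch, mirroring the setup lines of ptuple.rb; the [nil, nil] template simply matches every two-element tuple:

  require 'rinda/ptuplespace'

  # A sketch of a second run against the same ts_log directory: restore
  # replays the log, so tuples that survived the first run reappear.
  store = Rinda::TupleStoreLog.new('ts_log')
  Rinda::setup_tuple_store(store)
  DRb.install_id_conv(Rinda::TupleStoreIdConv.new)
  ts = Rinda::PTupleSpace.new
  ts.restore
  p ts.read_all([nil, nil])   # every two-element tuple left behind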

Hmmm, this statement is unfamiliar to me. What does this do?

  DRb.install_id_conv(Rinda::TupleStoreIdConv.new)

This is a trick that lets the same DRbObject refer to the same object after TupleSpace crashes and recovers. By default, dRuby uses Object.object_id as the reference information inside a DRbObject, and that ID is lost when the process restarts. To work around this problem, the trick uses a static (nonvolatile) ID as the reference information instead, so a process can keep referring to the same object across restarts.
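
Here’s a hedged sketch of what that stable ID buys you; the ref.dump file name and the 60-second timeout are my own choices for this illustration, not part of PTupleSpace’s API:

  # Sketch: persist a reference to a tuple entry across a restart.
  x = ts.write(['Hello', 'persistent'], 60)
  File.binwrite('ref.dump', Marshal.dump(DRbObject.new(x)))

  # ...later, in a freshly restarted process that has called ts.restore,
  # the reloaded reference still points at the same recovered tuple entry.
  ref = Marshal.load(File.binread('ref.dump'))
  ref.cancel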

You can set a timeout or cancel the timeout of the tuple in the same way as you use TupleSpace.

Comparing PTupleSpace and Key-Value Store

Now let’s compare PTupleSpace with a key-value store and discuss their similarities and differences. We’ll also discuss the pros and cons of its crash and recovery strategy, which is vital to any persistency layer.

Can We Use a Key-Value Store?

Let’s think about what TupleSpace means from an API point of view. TupleSpace is a group of tuples. This is categorized as a Bag (see Section 6.3, Basic Distributed Data Structures for more details), because you can have duplicate keys.

The acronym KVS (key-value store) is popular nowadays as part of the NoSQL movement. The definition of KVS varies, but a KVS commonly has an API similar to Ruby’s Hash. A Hash is a “dictionary” that pairs unique keys with values.

It isn’t easy to represent a dictionary using TupleSpace. Let’s consider what would happen if we represented a dictionary with a [key, value] tuple. Reading the data looks easy.

  @ts.read([key, nil])

How about adding an element?

  @ts.write([key, value])

This code doesn’t prevent you from writing duplicate keys. To prevent that, you need to lock the entire TupleSpace, remove the tuple, and then write a new one.

  def []=(key, value)
    lock = @ts.take([:global_lock])
    @ts.take([key, nil], 0) rescue nil
    @ts.write([key, value])
  ensure
    @ts.write(lock) if lock
  end

You need this global lock even when you read data, because another process may be updating the data while you are reading it.

  def [](key)
    lock = @ts.take([:global_lock])
    _, value = @ts.read([key, nil], 0) rescue nil
    return value
  ensure
    @ts.write(lock) if lock
  end

You don’t need a global lock if the number of elements doesn’t increase or decrease (as shown in Structs and Hashes). You can’t access an element while someone is updating it, but you can simply wait until the element becomes readable again. This requires only a local lock.
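
Here’s a minimal sketch of such a local lock, assuming every key’s tuple already exists; the update method name is mine. Because take removes the tuple, it doubles as a per-element lock, and concurrent readers and writers simply block until the tuple is written back:

  def update(key)
    _, value = @ts.take([key, nil])   # take doubles as a per-element lock
    value = yield(value)
  ensure
    @ts.write([key, value]) if value  # put the tuple back, releasing the lock
  end

  def [](key)
    _, value = @ts.read([key, nil])   # waits while an update is in flight
    value
  end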

How about each? There isn’t a good way to go through the entire TupleSpace. You need to run read_all, generate an array of all the elements, and then delegate each to the array.

  def each(&blk)
    lock = @ts.take([:global_lock])
    @ts.read_all([nil, nil]).each(&blk)
  ensure
    @ts.write(lock) if lock
  end

This method works when the number of elements is relatively small, but the method call becomes slower as the data grows.

It’s difficult to implement each and keys over a distributed hash table at a reasonably low execution cost. Some currently popular storage options keep their elements sorted by key, and such sorting functionality makes it much easier to browse a relatively large data space. You could even customize your key to include versioning information, like this:

  def []=(key, value)
    @rbtree[[key, Time.now]] = value
  end
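
Reading the newest version back is then just a sorted lookup. A hedged sketch, assuming the rbtree gem, whose upper_bound returns the entry with the largest key not greater than the given one; the latest method name is mine:

  def latest(key)
    k, value = @rbtree.upper_bound([key, Time.now])
    value if k && k[0] == key   # guard against matching a different key
  end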

However, it’s difficult to replicate this behavior using Rinda’s TupleSpace, because Rinda doesn’t provide sorted order functionality.

By the way, did you really want to use a Hash as your data structure? In some cases, a Bag may be sufficient.
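
If a Bag is enough, TupleSpace already is one, and none of the locking above is necessary. A minimal sketch:

  # TupleSpace is a Bag out of the box: duplicates coexist, and no lock
  # is needed to add entries.
  @ts.write(['fruit', 'apple'])
  @ts.write(['fruit', 'apple'])   # a duplicate entry is perfectly legal
  p @ts.read_all(['fruit', nil])  # => [["fruit", "apple"], ["fruit", "apple"]]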

Managing Crash and Recovery

As explained in Section 6.1, Introducing Linda and Rinda, the concept of Tuple is similar to a memo in the real world. The “memorandum” tuple walks from process to process via TupleSpace.

PTupleSpace saves only the “memorandums” stored in the TupleSpace. While a process holds a “memorandum,” it’s out of the hands of PTupleSpace, and it won’t be persisted. You can’t save the state of waiting for a tuple, either. In other words, you can’t recover the exact state of process coordination.

If you expect TupleSpace to be just storage for “memorandums,” then this should be sufficient. You can restore any information once it has been written to PTupleSpace, which is what most applications probably expect from persistency. How useful is TupleSpace, then, compared to just exposing an Array or Hash via dRuby? The strong pattern matching of Rinda’s TupleSpace could be an advantage. In exchange for this pattern-matching capability, however, I wasn’t able to use a more efficient data structure when I implemented TupleSpace: it does a linear search internally, so it may become slow as the number of tuples increases.
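
To illustrate that pattern matching: Rinda templates match each element with ===, so classes and regexps act as wildcards, something a plain key lookup can’t express. A small sketch:

  ts.write(['error', 'disk full'])
  ts.write(['info', 'started'])
  p ts.read_all([/^err/, nil])     # regexp as a wildcard: [["error", "disk full"]]
  p ts.read_all([String, String])  # class as a wildcard: any two-string tuple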

How about process coordination, which is the primary role of TupleSpace? Let’s think about what happens when PTupleSpace crashes and you need to recover it. While a PTupleSpace process is down, any RMI call that changes the state of a tuple (such as write or take) raises a dRuby exception. When PTupleSpace is restarted, it recovers to the last known state, and you have to write logic that retries any failed operations once the restart completes. These are difficult and time-consuming steps to write.
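
For instance, a retry helper might look like the following naive sketch; with_retry is my own name, and DRb::DRbConnError is the exception dRuby raises when it can’t reach the server. Note the caveat described next: blindly retrying a write can duplicate a tuple.

  def with_retry(tries = 5)
    yield
  rescue DRb::DRbConnError
    tries -= 1
    raise if tries.zero?
    sleep 1    # give PTupleSpace time to restart and recover
    retry
  end

  with_retry { ts.write(['Hello', 'retry']) }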

The use of RMI causes another problem. Think about the mutable operations that change the state of tuples, such as write and take. In a normal Ruby method call, control returns to the caller as soon as the callee’s work completes. With RMI, there is an extra step in between: socket communication between the caller and the callee. From the caller’s point of view, there is no way to tell whether an exception happened before the callee finished its work or afterward, before the result reached the caller. The process could crash after the tuple was logged to PTupleSpace’s secondary storage but right before the result reached the client. And if I changed the implementation to write the log only after all the operations complete, it could instead crash after the client received the tuple but before it was saved to storage.
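
One hedged way around this ambiguity is to make the operation idempotent yourself, for example by tagging each tuple with a unique ID so the caller can check whether a write actually landed before retrying. The write_once name and the uuid convention here are mine, and the sketch assumes the space is reachable again by the time it checks:

  require 'securerandom'

  def write_once(ts, tuple)
    tagged = [SecureRandom.uuid, *tuple]
    begin
      ts.write(tagged)
    rescue DRb::DRbConnError
      sleep 1                              # wait out a possible restart
      retry if ts.read_all(tagged).empty?  # retry only if it never landed
    end
  end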

Not only can PTupleSpace crash; its client application can also crash. This may be beyond the responsibility of PTupleSpace, but let’s think about such a situation anyway. If a process crashes after taking out a “memorandum,” there is no way to recover it. Let’s look at this short script:

  def succ
    _, count = @ts.take([:count, nil])
    count += 1
    yield(count)
  ensure
    @ts.write([:count, count]) if count
  end

This example takes [:count, Integer], increments the number, and then writes it back. While the “memorandum” is in our process, other processes can’t read or take it, so we can safely increment the counter. But what if our process dies while manipulating the memorandum? Since PTupleSpace can recover only memorandums that are inside PTupleSpace, we lose this one forever. If other processes were waiting to manipulate the counter, they will all halt. In such a situation, not only do you need to restart TupleSpace and all the processes that depend on it (which you probably have to do anyway when you are dealing with process coordination), but you also have to reset the counter. Saving the tuple in such a scenario misses the point.

PTupleSpace works to persist TupleSpace itself, but that isn’t enough to recover all the process coordination state. You may feel a bit cheated if you expected TupleSpace to handle the entire crash and recovery scenario.