8.3 Persisting a Tuple with PTupleSpace
TupleSpace in Rinda makes complex interprocess communication very easy. However, the entire tuplespace lives in volatile memory, which is worrisome: you could lose all your data when the system crashes. That's a bit frightening. To work around the problem, I created a persistency feature; let's discuss its architecture and limitations.
Persistency in TupleSpace

Let's get acquainted with the basic use of PTupleSpace while learning about its crash and recovery mechanism. PTupleSpace is a subclass of TupleSpace that logs every change of tuple state. When PTupleSpace is restarted, it recovers to the last recorded state of the tuples. Using PTupleSpace is easy; here's an example:
ptuple.rb

```ruby
require 'rinda/ptuplespace'

store = Rinda::TupleStoreLog.new('ts_log')
Rinda::setup_tuple_store(store)

DRb.install_id_conv(Rinda::TupleStoreIdConv.new)
ts = Rinda::PTupleSpace.new
DRb.start_service('druby://localhost:23456', ts)
ts.restore

ts.write(['Hello', 'World'])
p ts.read_all(['Hello', nil])
p ts.take(['Hello', nil])

x = ts.write(['Hello', 'cancel'], 2)
p ts.read_all(['Hello', nil])
ref = DRbObject.new(x)
ref.cancel
p ts.read_all(['Hello', nil])
x = ts.write(['Hello', 'World'])

p DRbObject.new(x)
```
Running the script should show output like this:

```
$ ruby ptuple.rb
[["Hello", "World"]]
["Hello", "World"]
[["Hello", "cancel"]]
[]
#<Rinda::PTupleEntry:0x00000101392388>
```
Let's see what just happened. To use PTupleSpace, you need to prepare a TupleStore first. This becomes the persistency layer of the PTupleSpace process. There are different kinds of TupleStore, so you can switch between them. In this example, I used TupleStoreLog, a simple persistency layer based on Marshal. To use it, you specify the name of the directory where TupleStoreLog saves its logs.
```ruby
store = Rinda::TupleStoreLog.new('ts_log')
Rinda::setup_tuple_store(store)
```
Once the persistency layer is set up, you can create a PTupleSpace and then recover to the previous state.
```ruby
ts = Rinda::PTupleSpace.new
ts.restore
```
The restore method rebuilds the tuplespace from the information logged before the call.
Hmmm, this statement is unfamiliar to me. What does this do?
```ruby
DRb.install_id_conv(Rinda::TupleStoreIdConv.new)
```
This is a trick so that when TupleSpace crashes, the same DRbObject can refer to the same object after the recovery. By default, dRuby's DRbObject uses Object#object_id as reference information, and this is lost when the process restarts. To work around this problem, this trick uses a static (nonvolatile) ID as reference information so that a process can refer to the same object after the process restarts.
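To make the idea concrete, here's a minimal sketch of a custom ID converter in the same spirit. The NamedIdConv class and its register method are my own illustration, not part of Rinda's actual TupleStoreIdConv: the point is only that a stable name, rather than a volatile object_id, becomes the wire-level reference.

```ruby
require 'drb/drb'

# Hypothetical converter: registered objects are referenced by a stable
# symbolic name instead of their volatile object_id, so a reference
# stays valid across a process restart. Unregistered objects fall back
# to the default object_id behavior.
class NamedIdConv < DRb::DRbIdConv
  def initialize
    @name_to_obj = {}
    @obj_to_name = {}
  end

  # Register an object under a name that survives restarts.
  def register(name, obj)
    @name_to_obj[name] = obj
    @obj_to_name[obj.object_id] = name
  end

  # Reference -> object: look up by stable name, else default behavior.
  def to_obj(ref)
    @name_to_obj.fetch(ref) { super(ref) }
  end

  # Object -> reference: use the stable name when we have one.
  def to_id(obj)
    @obj_to_name.fetch(obj.object_id) { super(obj) }
  end
end

conv = NamedIdConv.new
space = Object.new
conv.register(:ts, space)
p conv.to_id(space)   # => :ts
```

After `DRb.install_id_conv(conv)`, every DRbObject pointing at a registered object would carry the name `:ts`, which a restarted process can resolve to its freshly recovered object.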
You can set a timeout or cancel the timeout of a tuple in the same way as with TupleSpace.
Comparing PTupleSpace and Key-Value Store

Now let's compare PTupleSpace with a key-value store and discuss their similarities and differences. We'll also discuss the pros and cons of its crash and recovery strategy, which is vital to any persistency layer.
Can We Use a Key-Value Store?

Let's think about what TupleSpace means from an API point of view. TupleSpace is a group of tuples. It is categorized as a Bag (see Section 6.3, Basic Distributed Data Structures for more details), because it can hold duplicate keys.

The acronym KVS is popular nowadays as part of the NoSQL movement. The definition of KVS varies, but most share an API similar to Ruby's Hash. Hash is a "dictionary" that consists of pairs of unique keys and their values.
It isn't easy to represent a dictionary using TupleSpace. Let's consider what would happen if we represented a dictionary with a [key, value] tuple. Reading the data looks easy.
```ruby
@ts.read([key, nil])
```
How about adding an element?
```ruby
@ts.write([key, value])
```
This code doesn't prevent you from writing duplicate keys. To prevent that, you need to lock the entire TupleSpace, remove the existing tuple, and then write a new one.
```ruby
def []=(key, value)
  lock = @ts.take([:global_lock])
  @ts.take([key, nil], 0) rescue nil
  @ts.write([key, value])
ensure
  @ts.write(lock) if lock
end
```
You need this global lock even when you read data, because another thread may be updating the data while you are reading it.
```ruby
def [](key)
  lock = @ts.take([:global_lock])
  _, value = @ts.read([key, nil], 0) rescue nil
  return value
ensure
  @ts.write(lock) if lock
end
```
You don't need a global lock if the set of elements never grows or shrinks (as shown in Structs and Hashes). You can't access an element while someone is updating it, but you can simply wait until the element becomes readable again. This requires only a local lock.
How about each? There isn't a good way to iterate through the entire TupleSpace. You need to run read_all, generate an array of all the elements, and then delegate each to the array.
```ruby
def each(&blk)
  lock = @ts.take([:global_lock])
  @ts.read_all([nil, nil]).each(&blk)
ensure
  @ts.write(lock) if lock
end
```
This method works when the number of elements is relatively small, but the method call becomes slower as the data size grows.
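Putting the writer, reader, and each methods together, a Hash-like wrapper might look like this. This is a sketch rather than a library class: the TSHash name and the seeding of the [:global_lock] tuple in the constructor are my own additions.

```ruby
require 'rinda/tuplespace'

# Hypothetical Hash-like wrapper over a Rinda::TupleSpace, assembling
# the methods shown above. A single [:global_lock] tuple serializes
# all access; the constructor seeds it exactly once.
class TSHash
  include Enumerable

  def initialize(ts)
    @ts = ts
    @ts.write([:global_lock])   # the lock tuple must exist up front
  end

  def []=(key, value)
    lock = @ts.take([:global_lock])
    @ts.take([key, nil], 0) rescue nil  # drop any existing entry
    @ts.write([key, value])
  ensure
    @ts.write(lock) if lock
  end

  def [](key)
    lock = @ts.take([:global_lock])
    _, value = @ts.read([key, nil], 0) rescue nil
    return value
  ensure
    @ts.write(lock) if lock
  end

  def each(&blk)
    lock = @ts.take([:global_lock])
    @ts.read_all([nil, nil]).each(&blk)
  ensure
    @ts.write(lock) if lock
  end
end

h = TSHash.new(Rinda::TupleSpace.new)
h['lang'] = 'Ruby'
h['lang'] = 'dRuby'   # overwrites; no duplicate key
p h['lang']           # => "dRuby"
```

Note that the [nil, nil] template in each only matches two-element tuples, so the one-element [:global_lock] tuple never leaks into iteration.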
It's difficult to implement each and keys in a distributed hash table at a relatively low execution cost. Some of the currently popular storage options keep their elements sorted by key. If the storage option has sorting functionality, it is easier to browse a relatively large data space.
You could even customize your key to include versioning information like this:
```ruby
def []=(key, value)
  @rbtree[[key, Time.now]] = value
end
```
However, it's difficult to replicate this behavior using Rinda's TupleSpace, because Rinda doesn't provide sorted-order functionality.
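To see what the versioned keys buy you, here's a sketch of looking up the most recent version of a key. It uses a plain Hash with a linear scan as a stand-in for a real sorted tree such as the rbtree gem assumed above, which could answer the same query with a cheap bounded range scan; the VersionedStore name is hypothetical.

```ruby
# Stand-in for a sorted map: keys are [key, Time] pairs, so every
# write preserves history and the newest version can be recovered.
class VersionedStore
  def initialize
    @entries = {}   # [key, Time] => value
  end

  def []=(key, value)
    @entries[[key, Time.now]] = value
  end

  # Latest value for key: among all [key, t] pairs, pick the max t.
  # A sorted tree would find this without scanning every key.
  def latest(key)
    pair = @entries.keys.select { |k, _t| k == key }
                        .max_by { |_k, t| t }
    pair && @entries[pair]
  end
end

vs = VersionedStore.new
vs['greeting'] = 'Hello'
sleep 0.01
vs['greeting'] = 'Hi'
p vs.latest('greeting')   # => "Hi"
```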
By the way, did you really want to use Hash as a data structure? In some cases, Bag may be sufficient.
Managing Crash and Recovery

As explained in Section 6.1, Introducing Linda and Rinda, the concept of a Tuple is similar to a memo in the real world. The "memorandum" tuple walks from process to process via TupleSpace.

PTupleSpace saves only the "memorandum" stored in the TupleSpace. When a process holds a "memorandum," it's out of the hands of PTupleSpace, and it won't be persisted. Nor can you save the state of a process that is waiting for a tuple. You can't recover the exact state of process coordination.
If you expect TupleSpace to be just storage for a "memorandum," then this should be sufficient. You can restore any information once it is written to PTupleSpace. This is what most applications probably expect from persistency. How useful is TupleSpace, then, compared to just exposing an Array or Hash via dRuby? The strong pattern matching of Rinda's TupleSpace could be an advantage. In exchange for this strong pattern matching capability, however, I wasn't able to use a more efficient data structure when I implemented TupleSpace. It may become slow as the number of tuples increases, because it does a linear search internally.
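For reference, that pattern matching works by testing each tuple element against the template element with ===, so classes, ranges, and nil wildcards all work:

```ruby
require 'rinda/tuplespace'

ts = Rinda::TupleSpace.new
ts.write(['user', 1, 'alice'])
ts.write(['user', 2, 'bob'])
ts.write(['job', 1, 'build'])

p ts.read_all(['user', Integer, nil])   # both user tuples
p ts.read_all([nil, 1, nil])            # tuples whose 2nd field is 1
p ts.read_all(['user', 1..1, String])   # range and class templates
```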
How about process coordination, which is the primary role of TupleSpace? Let's think about what happens when PTupleSpace crashes and you need to recover it. When a PTupleSpace process halts, any mutating RMI call that changes the state of a tuple (such as write and take) raises a dRuby exception. When PTupleSpace is restarted, it recovers to the last known state, and you have to write logic that retries any failed operations once the restart completes. These are difficult and time-consuming steps to write.
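Such retry logic might be sketched as follows. The with_retry helper and its parameters are my own invention, not part of dRuby; a real version would also need to consider whether the operation is safe to repeat.

```ruby
require 'drb/drb'

# Hypothetical retry helper: rerun an RMI block while the remote
# PTupleSpace is unreachable, waiting between attempts. Gives up and
# re-raises after a fixed number of tries.
def with_retry(max_wait: 1, tries: 10)
  attempt = 0
  begin
    yield
  rescue DRb::DRbConnError
    attempt += 1
    raise if attempt >= tries
    sleep max_wait
    retry
  end
end

# Usage (assuming ts is a DRbObject for a remote PTupleSpace):
#   with_retry { ts.write(['job', 42]) }
```

The danger, discussed next, is that retrying a write or take that may already have succeeded can duplicate or lose a tuple.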
The use of RMI causes another problem. Consider the mutating operations that change the state of tuples, such as write and take. In a normal Ruby method call, control flows back to the caller as soon as the callee's work completes. In RMI, there is one extra step between those two: socket communication between the caller and the callee. From the caller's point of view, there is no way to tell whether an exception happened before the callee finished its work or before the result reached the caller. The process could crash after the tuple is logged to the secondary storage of PTupleSpace but right before the result reaches the client. If I changed the implementation to write the log after all the operations complete, it could instead crash after the client received the tuple but before the log was saved to storage.
Not only can PTupleSpace crash, but its client application can also crash. This may be beyond the responsibility of PTupleSpace, but let's think about such a situation. If a process crashes after it took out a "memorandum," there is no way to recover it. Let's look at this short script:
```ruby
def succ
  _, count = @ts.take([:count, nil])
  count += 1
  yield(count)
ensure
  @ts.write([:count, count]) if count
end
```
This example takes [:count, Integer], increments the number, and then writes it back. While the "memorandum" is in our process, other processes can't read or take the same "memorandum," so we can safely increment the counter. But what if our process dies while we are manipulating the memorandum? Since PTupleSpace can recover only a memorandum that is inside PTupleSpace, we'll lose the memorandum forever. If other processes were also waiting to manipulate the counter, all of them will halt. In such a situation, not only do you need to restart TupleSpace and all the processes that depend on it (I guess you have to do that anyway when you are dealing with process coordination), but you also have to reset the counter. Saving the tuple in such a scenario misses the point.
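One pragmatic mitigation, my own sketch rather than a Rinda feature, is to reseed the counter tuple at restart time when it has gone missing. Note that any increment that was in flight at crash time is still lost; this only unblocks the waiting processes.

```ruby
require 'rinda/tuplespace'

# Hypothetical recovery helper: if no [:count, Integer] tuple survived
# (a client died while holding the memorandum), reseed the counter so
# waiting processes can make progress again.
def ensure_counter(ts, initial = 0)
  ts.write([:count, initial]) if ts.read_all([:count, nil]).empty?
end

ts = Rinda::TupleSpace.new
ensure_counter(ts)        # counter is missing, so it gets reseeded
p ts.read([:count, nil])  # => [:count, 0]
```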
PTupleSpace works to persist TupleSpace itself, but that isn't enough to recover all the process coordination state. You may feel a bit cheated if you expected TupleSpace to handle the entire crash and recovery scenario.