9.3 Drip Compared to Hash

In this section, you’ll learn advanced usage of Drip by comparing it to a key-value store (KVS) or Hash.

Using Tags

Drip#write lets you store an object with tags. The tags must be instances of String, and you can specify multiple tags for one object. You can then read by tag name, which makes it easy to retrieve objects. By leveraging these tags, you can simulate the behavior of Hash with Drip.

Let’s treat tags as Hash keys. “Write with a tag” in Drip is equivalent to “set a value for a key” in Hash. “Read the latest value with the given tag” is equivalent to reading a value from Hash with the given key. Since the latest value in Drip corresponds to the value in Hash, the values older than the latest one make Drip behave like a Hash with version history.

Accessing Tags with head and read_tag Methods

In this section, we’ll be using the head and read_tag methods.

Drip#head(n=1, tag=nil)

head returns an array of the n most recent elements. When you specify a tag, it returns the n most recent elements that have that tag. head doesn’t block, even if Drip has fewer than n elements. It only peeks at the newest n elements.
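To make these semantics concrete, here is a minimal in-memory sketch. It is not Drip's implementation: the class name TinyStream and its integer counter keys are assumptions for illustration only, and real Drip generates its own keys and persists its elements.

```ruby
# A hypothetical in-memory stand-in for Drip, for illustration only.
class TinyStream
  def initialize
    @elements = []  # each element is [key, value, tag], oldest first
    @next_key = 0   # assumption: a simple counter instead of Drip's keys
  end

  # Append a value with an optional String tag and return the new key.
  def write(value, tag = nil)
    @next_key += 1
    @elements << [@next_key, value, tag]
    @next_key
  end

  # Return up to n of the newest elements, ordered from older to newer.
  # With a tag, only elements carrying that tag are considered.
  # Never blocks: fewer than n matches simply yields a shorter array.
  def head(n = 1, tag = nil)
    matches = tag ? @elements.select { |_key, _value, t| t == tag } : @elements
    matches.last(n)
  end
end

stream = TinyStream.new
stream.write(29, 'seki.age')
stream.write(49, 'seki.age')

p stream.head(1, 'seki.age')    # the newest element with the tag
p stream.head(10, 'seki.age')   # both elements, older first
p stream.head(1, 'sora_h.age')  # an empty array, returned immediately
```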

Drip#read_tag(key, tag, n=1, at_least=1, timeout=nil)

read_tag works like read, but it lets you specify a tag and reads only elements that have that tag. If there are fewer than at_least elements newer than the specified key, it blocks until enough elements arrive. This lets you wait until elements with a certain tag are stored.
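The blocking behavior can be sketched with a monitor and a condition variable. This is a hypothetical stand-in, not how Drip is implemented: BlockingStream and its counter keys are made up for illustration, and the timeout parameter is left out to keep the sketch short.

```ruby
require 'monitor'

# Hypothetical stand-in that shows only the blocking behavior of read_tag.
class BlockingStream
  def initialize
    @elements = []            # [key, value, tag], oldest first
    @next_key = 0             # assumption: counter keys, unlike real Drip
    @lock = Monitor.new
    @arrived = @lock.new_cond # signaled whenever an element is written
  end

  def write(value, tag = nil)
    @lock.synchronize do
      @next_key += 1
      @elements << [@next_key, value, tag]
      @arrived.broadcast      # wake up every waiting reader
      @next_key
    end
  end

  # Return up to n elements that have the tag and are newer than key,
  # waiting until at least at_least of them exist.
  def read_tag(key, tag, n = 1, at_least = 1)
    @lock.synchronize do
      loop do
        found = @elements.select { |k, _value, t| k > key && t == tag }
        return found.first(n) if found.size >= at_least
        @arrived.wait         # releases the lock while sleeping
      end
    end
  end
end

stream = BlockingStream.new
reader = Thread.new { stream.read_tag(0, 'sora_h.age') } # nothing to read yet
sleep 0.1
stream.write(12, 'sora_h.age')                           # unblocks the reader
result = reader.value
p result
```

The reader re-checks its condition every time it wakes up, so it doesn't matter whether the write happens before or after the reader starts waiting.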

Experimenting with Tags

Let’s emulate the behavior of Hash using head and read_tag. We’ll keep using the MyDrip we invoked earlier.

First, let’s set a value. This is how you usually set a value in a Hash.

	`hash['seki.age'] = 29`

And here is the equivalent operation using Drip. You write a value 29 with the tag seki.age.

	`>> MyDrip.write(29, 'seki.age')`
	`=> 1313358208178481`

Let’s use head to retrieve the value. Here is the command to take the latest element with the seki.age tag.

	`>> MyDrip.head(1, 'seki.age')`
	`=> [[1313358208178481, 29, "seki.age"]]`

Each element is an array of the form [key, value, tags]. If you’re interested only in reading values, you can assign the key and value to separate variables as follows:

	`>> k, v = MyDrip.head(1, 'seki.age')[0]`
	`=> [1313358208178481, 29, "seki.age"]`
	`>> v`
	`=> 29`

Let’s update the value. Here is the equivalent operation in Hash:

	`hash['seki.age'] = 49`

To change the value of seki.age to 49 in Drip, you do exactly the same as before. You write 49 with the tag seki.age. Let’s try to check the value with head.

	`>> MyDrip.write(49, 'seki.age')`
	`=> 1313358584380683`
	`>> MyDrip.head(1, 'seki.age')`
	`=> [[1313358584380683, 49, "seki.age"]]`

You can check the version history by retrieving the history data. Let’s use head to take the last ten versions.

	`>> MyDrip.head(10, 'seki.age')`
	`=> [[1313358208178481, 29, "seki.age"], [1313358584380683, 49, "seki.age"]]`

We asked for ten elements, but it returned an array with only two, because that’s all Drip has for the seki.age tag. Multiple results are ordered from older to newer.
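Because the results arrive oldest first, pulling the version history out of them is plain array handling. The sketch below uses the result shown above as a literal, so it runs without a Drip server. The variable names are only illustrative.

```ruby
# The array returned by head(10, 'seki.age') in the session above.
history = [
  [1313358208178481, 29, "seki.age"],
  [1313358584380683, 49, "seki.age"]
]

# Keep only the values, in the order they were written.
values = history.map { |_key, value, _tag| value }

current  = values.last  # the latest version, what a Hash would hold
previous = values[-2]   # the version before it (nil if there is none)

p values
p current
p previous
```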

What happens if you try to read a nonexistent tag (key in Hash)?

	`>> MyDrip.head(1, 'sora_h.age')`
	`=> []`

It returns an empty array without blocking. head is a nonblocking operation, so it returns an empty array when there are no matches.

If you want to wait for a new element of a specific tag, then you should use read_tag.

	`>> MyDrip.read_tag(0, 'sora_h.age')`

The call now blocks. Let’s write the value from a different terminal.

	`>> MyDrip.write(12, 'sora_h.age')`
	`=> 1313359385886937`

This will unblock the read_tag and return the value that you just set.

	`>> MyDrip.read_tag(0, 'sora_h.age')`
	`=> [[1313359385886937, 12, "sora_h.age"]]`

Let’s recap. In this section, we saw that with tags we can simulate the basic operations of setting and reading values in a Hash.

The differences are as follows:

You can’t remove an element.
It has a history of values.
There are no keys or each methods.

You can’t remove an element as you can in Hash, but you can work around this by writing nil or another special object that represents the deleted status. As a side effect of not being able to remove elements, you can see the entire history of changes.
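The workaround can be sketched as a thin Hash-like wrapper. Everything here is hypothetical: TaggedHash, the DELETED marker, and FakeStream (a tiny in-memory substitute with counter keys so the example runs on its own) are not part of Drip. The wrapper only assumes an object with Drip-style write and head methods.

```ruby
# Hypothetical marker object meaning "this entry was deleted".
DELETED = :deleted

# Tiny in-memory substitute for Drip so the sketch is self-contained.
class FakeStream
  def initialize
    @elements = []  # [key, value, tag], oldest first
  end

  def write(value, tag = nil)
    @elements << [@elements.size + 1, value, tag]
    @elements.size
  end

  def head(n = 1, tag = nil)
    @elements.select { |_k, _v, t| tag.nil? || t == tag }.last(n)
  end
end

# Hash-like view over anything that offers Drip-style write and head.
class TaggedHash
  def initialize(stream)
    @stream = stream
  end

  def []=(key, value)
    @stream.write(value, key)
  end

  # The latest element wins; a deletion marker reads as "no value".
  def [](key)
    _k, value = @stream.head(1, key).first
    value == DELETED ? nil : value
  end

  # Deleting appends a marker instead of removing anything.
  def delete(key)
    @stream.write(DELETED, key)
  end
end

stream = FakeStream.new
hash = TaggedHash.new(stream)
hash['seki.age'] = 29
hash.delete('seki.age')

p hash['seki.age']              # the entry now reads as missing
p stream.head(10, 'seki.age')   # but the full history is still there
```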

I left out the keys and each methods on purpose. They’re easy to write, and I did implement them once, but I deleted them later, so Drip has no such APIs at this moment. To implement keys, you need to collect all elements first, and this won’t scale when the number of elements becomes very large. I assume this is why many distributed hash tables don’t have keys.
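To see why keys is costly, consider this sketch of what such a method would have to do. The helper name tags_of is hypothetical, and the elements are the ones written in this section, given as a literal array rather than read from Drip.

```ruby
# Elements as [key, value, tag], copied from the session in this section.
elements = [
  [1313358208178481, 29, "seki.age"],
  [1313358584380683, 49, "seki.age"],
  [1313359385886937, 12, "sora_h.age"]
]

# A keys-like helper must visit every element ever written,
# including all the old versions, just to list the distinct tags.
def tags_of(elements)
  elements.map { |_key, _value, tag| tag }.uniq
end

p tags_of(elements)
```

The work grows with the total number of writes, not with the number of distinct tags, because old versions are never removed.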

There are also some similarities with TupleSpace. With read_tag, you can wait for new elements or for changes to existing ones. This is a limited version of the pattern-matching read in Rinda’s TupleSpace: you can wait until elements with a certain tag arrive. This matching is a lot weaker than Rinda’s pattern matching, but I expect it to be enough for the majority of applications.

When I created Drip, I tried to make the specification narrower than that of Rinda so that it would be simple enough to optimize. Rinda represents an in-memory, Ruby-like luxurious world, whereas Drip represents a simple process coordination mechanism with persistence in mind.

To verify my design expectations, we need a lot more concrete applications.

In the previous two sections, we explored Drip in comparison with Queue and Hash. You can represent some interesting data structures using this simple append-only stream. You can stream the world using Drip because you can traverse most data structures one element at a time.
