The dRuby Book

2.3 dRuby in the Real World

With dRuby, you can build distributed systems as if you were writing ordinary Ruby programs, which helps you turn your ideas for complex distributed systems into working applications quickly. dRuby offers a generic way to achieve RMI. Some people use dRuby to sketch their initial system and then swap it out for more specialized middleware as the system grows. For inspiration, here are some real-world examples of dRuby systems.

Hatena Screen Shot

Hatena[3] is Japan’s leading Web 2.0 company and provides various services, such as a blogging system and a social bookmark system. Hatena used to provide a service called Hatena Screen Shot, which generated a screenshot of a given URL and provided it as a thumbnail. The architecture of this service was unusual because it spanned different operating systems. The web frontend was built on Linux, and the screen-capturing component was built with a Windows Internet Explorer component. This is a good example of integrating a cross-platform system using dRuby. The system was also designed to run screen-capturing services in parallel so that the component could scale horizontally by simply adding more Windows machines.
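To make the idea of scaling by adding capture machines concrete, here is a minimal sketch of a job queue shared over dRuby. It is not Hatena's actual design: CaptureQueue, its methods, and the capture stand-in are hypothetical names invented for this illustration, and the workers run as threads in one process only to keep the example self-contained.

```ruby
require 'drb/drb'

# Hypothetical job queue shared between the web front end and the capture
# workers. The names here are illustrative, not Hatena's actual design.
class CaptureQueue
  def initialize
    @jobs = Queue.new      # URLs waiting to be captured
    @results = Queue.new   # [url, thumbnail] pairs reported by workers
  end

  def request(url)
    @jobs.push(url)
    nil
  end

  def next_job
    @jobs.pop              # blocks until a job is available
  end

  def report(url, thumbnail)
    @results.push([url, thumbnail])
    nil
  end

  def next_result
    @results.pop           # blocks until a worker reports back
  end
end

# Stand-in for the Windows-side screen-capturing component.
def capture(url)
  "thumbnail-of-#{url}"
end

# Port 0 lets the OS pick a free port; DRb.uri then holds the real URI.
DRb.start_service('druby://localhost:0', CaptureQueue.new)
queue = DRbObject.new_with_uri(DRb.uri)

# Each thread plays the role of one capture machine.
workers = 2.times.map do
  Thread.new do
    remote = DRbObject.new_with_uri(DRb.uri)
    while (url = remote.next_job)
      remote.report(url, capture(url))
    end
  end
end

urls = %w[http://example.com/a http://example.com/b http://example.com/c]
urls.each { |url| queue.request(url) }
results = urls.map { queue.next_result }.sort
results.each { |url, thumbnail| puts "#{url} -> #{thumbnail}" }

workers.size.times { queue.request(nil) }  # nil tells a worker to stop
workers.each(&:join)
DRb.stop_service
```

In a real deployment, each worker would be a separate process on its own machine, connecting to the front end's URI. Adding capacity would then mean starting one more worker, with no change to the front end.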

Twitter

Twitter used dRuby and Rinda before it built its own queuing system in Ruby, called Starling (which was itself later replaced by another in-house system written in Scala). At SDForum Silicon Valley Ruby Conference 2007, Blaine Cook presented a talk called “Scaling Twitter.”[4] In the presentation, Cook described dRuby as “stupid easy and reasonably fast” (though he also called it “kinda flaky”).

Buzztter

Buzztter[5] is a web service that analyzes tweets from Twitter and extracts trending keywords. The service started in 2007, before Twitter started its own “Trending Topics.” The service still provides a useful way to extract topics from Japanese tweets because Japanese sentences don’t have word separation and are therefore harder to analyze. Buzztter consists of multiple subsystems, and it used Rinda (Chapter 6, Coordinating Processes Using Rinda) as middleware to consume tweets from Twitter’s REST API. The system used Rinda for two years before replacing it with RabbitMQ[6] in 2009. In November 2007, Rinda handled 125,000 tweets (72MB) a day.
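As a rough illustration of Rinda acting as middleware between a tweet collector and an analyzer, the following sketch passes tweets through a tuplespace. The tuple layout [:tweet, id, text] and the helper names publish_tweets and consume_tweets are assumptions made for this example, not Buzztter's actual code.

```ruby
require 'rinda/tuplespace'

# Collector side: put each fetched tweet into the tuplespace.
# The [:tweet, id, text] layout is an assumption for this sketch.
def publish_tweets(space, tweets)
  tweets.each { |id, text| space.write([:tweet, id, text]) }
end

# Analyzer side: take `count` tweets out of the tuplespace.
# nil in a template matches any value; take blocks until a match exists.
def consume_tweets(space, count)
  Array.new(count) do
    _tag, id, text = space.take([:tweet, nil, nil])
    [id, text]
  end
end

space = Rinda::TupleSpace.new
publish_tweets(space, [[1, 'hello dRuby'], [2, 'hello Rinda']])
consume_tweets(space, 2).each { |id, text| puts "#{id}: #{text}" }
```

Here both sides run in one process. Because the collector and the analyzer only agree on the shape of the tuple, the tuplespace could instead be published with dRuby so that separate subsystems read and write it from their own processes.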

RWiki

RWiki is a wiki system written with dRuby and is still actively used at my workplace. It’s been running for more than ten years, and the system stores more than 40,000 pages in memory (more than 1GB). RWiki doesn’t use an RDBMS but logs pages in plain text. RWiki persists the data by replaying the log when the system restarts. RWiki uses Ruby Document (RD) as a document format. Once you write your wiki page, the page is stored as a Ruby object, and you can retrieve the content of the page (not just the entire source but also various components, such as chapters, sections, links, incoming links, and other customized attributes) via method invocations. RWiki acts as a wiki via HTTP but acts as an object database via dRuby. We use the object database in several ways. For example, we’ve been using agile methodologies for years, and we store user stories[7] and request tickets in the RWiki. Then we connect to RWiki pages via dRuby from a separate process to automatically generate TestSuite or aggregate statistical information on all the request tickets.
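The idea of a wiki that doubles as an object database can be sketched in a few lines. This is not RWiki's API: WikiPage, WikiBook, and their methods are hypothetical, and the sketch assumes links are written in RD's ((<Name>)) reference style.

```ruby
require 'drb/drb'

# Hypothetical page object; RWiki's real classes and methods differ.
class WikiPage
  include DRbUndumped   # hand out references so clients call methods remotely
  attr_reader :name, :source

  def initialize(name, source)
    @name = name
    @source = source
  end

  # Assumes links are written as ((<Name>)) in the page source.
  def links
    @source.scan(/\(\(<(.+?)>\)\)/).flatten
  end
end

# Hypothetical in-memory page store that serves as the dRuby front object.
class WikiBook
  def initialize
    @pages = {}
  end

  def store(name, source)
    @pages[name] = WikiPage.new(name, source)
    nil
  end

  def [](name)
    @pages[name]
  end

  # Names of the pages that link to the given page.
  def incoming_links(name)
    @pages.values.select { |page| page.links.include?(name) }.map(&:name)
  end
end

book = WikiBook.new
book.store('top', 'See ((<dRuby>)) and ((<Rinda>)).')
book.store('dRuby', 'Distributed Ruby. Back to ((<top>)).')

# Port 0 lets the OS pick a free port; DRb.uri then holds the real URI.
DRb.start_service('druby://localhost:0', book)
remote = DRbObject.new_with_uri(DRb.uri)

p remote['top'].links             # links found in one page
p remote.incoming_links('dRuby')  # pages that point at 'dRuby'
DRb.stop_service
```

A separate process that knows the URI could run the same queries, which is the kind of access the ticket statistics described above rely on.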

Libraries

Many libraries use dRuby to take advantage of its interprocess communication capability. Here are some examples:

god

https://github.com/mojombo/god: Process monitoring system.

RSpec

http://rspec.info/rails/runners.html: Testing framework. With the --drb option, you can speed up tests by preloading the entire Rails app.

BackgrounDRb

http://backgroundrb.rubyforge.org: Job server and scheduler. This tool off-loads longer-running tasks from the Ruby on Rails application.

They are all open source and are good examples to study to see how dRuby can be used in various ways.