The dRuby Book

2.1 Understanding Distributed Object Systems

Client-server systems are among the best-known ways to build distributed systems or applications. A distributed object system extends this client-server model. It's a library that lets you build distributed applications using object-oriented programming.

When you write a distributed application, network programming demands special attention. Without help, you may end up spending more time dealing with network programming issues than building application logic.

Many developers have built libraries that hide these complex interprocess networking protocols so that distributed applications are easier to program. Let's take a look at the main approaches.

Remote Functions with RPC and RMI

Remote Procedure Call (RPC) is a way to call remote functions as if you were calling local functions (see Figure 5, How RPC works). An RPC toolkit generates a client stub from interface descriptions. The client stub hides the network programming logic so that you don't have to worry about the location of the server or how to connect to it.

images/d2rpc.png

Figure 5. How RPC works: Client -> Stub -> Network -> Stub -> Server

In Figure 5, How RPC works, the client stub converts function calls into network communication. The "server stub" receives the communication from the client, invokes the main function, and then returns the result. The server stub is often called the skeleton or framework. It not only executes the incoming function request but also acts as a listener that waits for incoming calls.
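To make the two stubs concrete, here is a minimal sketch in Ruby. It is not how any particular RPC toolkit is implemented. The names Calculator, ClientStub, and ServerStub are hypothetical, the stubs are written by hand rather than generated from an interface description, and the "network" is just a lambda that hands bytes from one stub to the other inside a single process.

```ruby
# Server side: the "main function" that clients want to call.
module Calculator
  def self.add(a, b)
    a + b
  end
end

# Server stub (skeleton): decodes a request, invokes the function,
# and encodes the result. A real skeleton would also listen on a socket.
class ServerStub
  def initialize(target)
    @target = target
  end

  def handle(request_bytes)
    # Marshal is used only for illustration; never load untrusted bytes.
    name, args = Marshal.load(request_bytes)
    Marshal.dump(@target.public_send(name, *args))
  end
end

# Client stub: looks like a local function, but turns each call into
# bytes and hands them to a transport.
class ClientStub
  def initialize(transport)
    @transport = transport # anything that responds to #call(bytes)
  end

  def add(a, b)
    reply = @transport.call(Marshal.dump([:add, [a, b]]))
    Marshal.load(reply)
  end
end

# Stand-in for the network: a direct in-process hand-off.
skeleton = ServerStub.new(Calculator)
client = ClientStub.new(->(bytes) { skeleton.handle(bytes) })
puts client.add(2, 3) # prints 5
```

The caller of client.add never sees the encoding or the transport. Replacing the lambda with real socket code would change where the bytes travel, but not the calling code.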

Remote Method Invocation (RMI) extends method invocation to remote objects, and the concept is very similar to RPC (see Figure 6, How RMI works). The main difference is how you think about the call. RPC "calls" remote functions, whereas RMI "sends" a message to remote objects. RMI also provides a client stub and a server stub to hide the interprocess communication layer. The server stub is in charge of network server programming and identifies which object should receive the call.

images/d2rmi.png

Figure 6. How RMI works: Client -> Stub -> Network -> Stub -> Server

Clients can call methods without worrying about the location of the receiver object. You can also use the remote object reference as if it existed locally. For example, you can assign a reference to a remote object to a variable or pass it as a method argument. A library that lets you treat remote objects and local objects equally is called a distributed object system, or simply distributed objects. The term distributed object also refers to a remotely located object itself (in contrast to a local object).
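The following Ruby sketch illustrates the RMI idea under simplifying assumptions. ObjectServer, RemoteRef, Counter, and bump_twice are hypothetical names, not part of dRuby or any other library, and both sides run in one process. The server stub keeps a table of exported objects and uses the ID in each request to decide which object receives the message. The client stub is a reference that forwards whatever message it is sent.

```ruby
# Server stub: keeps a table of exported objects and routes each
# message to the object named in the request.
class ObjectServer
  def initialize
    @objects = {}
  end

  # Register an object and hand back a reference to it.
  def export(obj)
    @objects[obj.object_id] = obj
    RemoteRef.new(self, obj.object_id)
  end

  def dispatch(request_bytes)
    id, name, args = Marshal.load(request_bytes)
    Marshal.dump(@objects.fetch(id).public_send(name, *args))
  end
end

# Client stub: forwards any message it does not understand itself.
class RemoteRef
  def initialize(server, id)
    @server = server
    @id = id
  end

  def method_missing(name, *args)
    Marshal.load(@server.dispatch(Marshal.dump([@id, name, args])))
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

# An ordinary class that knows nothing about distribution.
class Counter
  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

# Works the same with a local Counter or a RemoteRef.
def bump_twice(counter)
  counter.increment
  counter.increment
end

server = ObjectServer.new
remote_counter = server.export(Counter.new) # reference held in a variable
puts bump_twice(remote_counter)             # reference passed as an argument; prints 2
```

Because bump_twice only sends messages to its argument, it cannot tell whether the receiver lives in the same process. That is the property that lets remote references be stored in variables and passed around like local objects.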

Distributed Objects from a Programming Perspective

So far, we’ve learned the semantics of distributed systems. Let’s now think about how these distributed systems affect our programming style.

When we write normal programs, all objects, variables, and methods are allocated inside one process space. Each process space is protected by the operating system (OS), so processes can't access each other's objects (see Figure 7, Location of objects and processes for a normal system).

images/d2inprc.png

Figure 7. Location of objects and processes for a normal system. Objects can’t access each other across processes.

In a distributed object system, we treat objects in other processes or on other machines as if they were in the same process space (see Figure 8, Location of objects and processes within a distributed object system).

images/d2inprc2.png

Figure 8. Location of objects and processes within a distributed object system. Objects can access each other across processes.

How differently local objects and remote objects behave depends on the implementation. For example, some systems may be able to pass remote objects as arguments or return remote objects, but others may not. Some systems may require you to take extra steps when local objects communicate with remote objects.

Furthermore, there are often differences between “objects” in distributed systems and “objects” in programming languages. The smaller the difference is, the more seamless it will be to switch between programming languages.

The Popular Distributed Object Systems

Distributed Component Object Model (DCOM), Common Object Request Broker Architecture (CORBA), and Java RMI are widely known, and of course dRuby is also a distributed object system.

While Java RMI and dRuby are tightly coupled with their hosting languages, DCOM and CORBA are language-independent systems. C++, Java, and non-OOP languages such as C can use them.

DCOM, CORBA, and Java RMI require us to define the interface for stub generation because they target statically typed languages. For example, DCOM and CORBA require us to write an interface in an Interface Definition Language (IDL); see Figure 9, Writing interfaces using IDL. Once we generate stubs from the IDL, we need to link them in advance into every client that may use these remote objects.

images/d2idl.png

Figure 9. Writing interfaces using IDL

Distributed object systems for dynamically typed languages, such as those in Cocoa/Objective-C and dRuby, don't need IDL because methods are linked at execution time. We also don't need to link to the stub of every single class. Instead, we link to only one class for the client and one for the server. This sounds much easier than in statically typed languages, but you need to be aware of one thing. When you want to copy an object of an unknown remote class (rather than just call remote methods), the class definition of the remote object must exist locally. Your programming style becomes very different between a system that needs to know the interface in advance and a system that doesn't.
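The caveat about copying can be seen with Ruby's standard Marshal module, which is assumed here as a stand-in for whatever mechanism a given system uses to copy objects between processes. Invoice and try_copy are hypothetical names, and removing the constant only simulates a receiving process that has never loaded the class.

```ruby
class Invoice
  attr_reader :total

  def initialize(total)
    @total = total
  end
end

# Try to rebuild a copy from its serialized form.
# Returns nil when the object's class is not defined in this process.
def try_copy(bytes)
  Marshal.load(bytes)
rescue ArgumentError
  nil
end

# What a "copy" would send over the wire.
bytes = Marshal.dump(Invoice.new(120))

# With the class defined locally, the copy can be rebuilt.
puts try_copy(bytes).total # prints 120

# Simulate a receiver that has never loaded the Invoice class.
Object.send(:remove_const, :Invoice)
puts try_copy(bytes).inspect # prints nil
```

Calling a method on a remote reference needs no local class definition, because only the method name and arguments travel. A copy is different: the receiver has to rebuild the object, and it can't do that without the class.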

So far, we've seen the different flavors of distributed systems in other languages. Some are language dependent, and others aren't. Also, distributed systems for statically typed languages tend to require the interface of remote objects to be defined in IDL, while those for dynamic languages don't. Next, let's see how dRuby fits into this distributed system paradigm.