The dRuby Book

10.4 Resetting Data

While you are repeating this experiment, you may want to start some of the exercises from scratch. The easiest way is to re-create the Drip database, but you may not want to delete all the data if you’ve already written some data into MyDrip.

In such a case, let’s introduce a tag that marks the start of the system. Once we write an object with the rbcrawl-begin tag, we ignore any data written before it, so the system behaves as if it had started empty.

In the example code, I used this fence for both the crawler and the indexer. When a script calls the older or head method, it compares the key of each entry (a timestamp) with the key of the fence and ignores any entry that is older than the fence.

  >> MyDrip.write('fence', 'rbcrawl-begin')
  => 1313573767321913
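
To make the filtering step concrete, here is a minimal sketch of the fence check. It does not use the real Drip: TinyDrip is a hypothetical in-memory stand-in that I assume behaves like Drip for write and head (entries shaped as [key, value, *tags]), and its keys are a simple counter rather than timestamps. The helper names fence_key and entries_after_fence are also made up for this illustration.

```ruby
# Hypothetical in-memory stand-in for the small part of Drip used here.
# It is NOT the real Drip class; keys are a counter, not timestamps.
class TinyDrip
  def initialize
    @entries = []   # each entry is [key, value, *tags], oldest first
    @last_key = 0
  end

  # Store value with the given tags and return the new key.
  def write(value, *tags)
    @last_key += 1
    @entries << [@last_key, value, *tags]
    @last_key
  end

  # Return the newest n entries carrying tag (all entries if tag is nil).
  def head(n = 1, tag = nil)
    matching = tag ? @entries.select { |entry| entry.drop(2).include?(tag) } : @entries
    matching.last(n)
  end
end

FENCE_TAG = 'rbcrawl-begin'

# Key of the most recent fence, or 0 when no fence has been written yet.
def fence_key(drip)
  key, = drip.head(1, FENCE_TAG).first
  key || 0
end

# Newest n entries with the given tag, minus anything older than the fence.
def entries_after_fence(drip, tag, n = 10)
  fence = fence_key(drip)
  drip.head(n, tag).select { |entry| entry.first > fence }
end

drip = TinyDrip.new
drip.write('old.rb', 'rbcrawl')   # written before the fence
drip.write('fence', FENCE_TAG)    # reset point
drip.write('new.rb', 'rbcrawl')   # written after the fence

visible = entries_after_fence(drip, 'rbcrawl').map { |_key, value| value }
```

Here visible holds only the value written after the fence. The old entry is still stored, so nothing is deleted; it is simply skipped by every reader that honors the fence.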

There is another use case for the fence pattern. Imagine you shut down the search system after writing the index information to disk. The next time you start it, the new search process needs to know how far indexing had progressed. In such a case, you can leave a fence that records the progress of the indexing work so that the indexer can pick up where it left off when it is restarted. The earlier example uses the fence to ignore old data, whereas this situation uses it to show which tasks remain.
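
The restart scenario might look like the following sketch. Again, this is not the book's indexer: ResumableLog is a hypothetical in-memory stand-in rather than Drip, its latest and after methods are invented for this illustration, and the rbcrawl-indexed tag name is an assumption. The idea is only that the indexer writes a progress marker after each document and, on startup, reads the latest marker to decide where to resume.

```ruby
# Hypothetical in-memory log; NOT the real Drip API.
class ResumableLog
  def initialize
    @entries = []   # each entry is [key, value, *tags], oldest first
    @last_key = 0
  end

  # Store value with the given tags and return the new key.
  def write(value, *tags)
    @last_key += 1
    @entries << [@last_key, value, *tags]
    @last_key
  end

  # Newest entry carrying tag, or nil if there is none.
  def latest(tag)
    @entries.reverse.find { |entry| entry.drop(2).include?(tag) }
  end

  # All entries carrying tag whose key is greater than the given key.
  def after(key, tag)
    @entries.select { |entry| entry.first > key && entry.drop(2).include?(tag) }
  end
end

PROGRESS_TAG = 'rbcrawl-indexed'   # assumed tag name for the progress marker

# Index every document written after the last progress marker.
# Returns the number of documents processed in this run.
def index_pending(log, indexed)
  _, last_done = log.latest(PROGRESS_TAG)   # marker value is the last indexed key
  pending = log.after(last_done || 0, 'rbcrawl')
  pending.each do |key, value|
    indexed << value                 # stand-in for the real indexing work
    log.write(key, PROGRESS_TAG)     # record how far indexing has progressed
  end
  pending.size
end

log = ResumableLog.new
log.write('a.rb', 'rbcrawl')
log.write('b.rb', 'rbcrawl')

first_run = []
index_pending(log, first_run)    # indexes a.rb and b.rb

log.write('c.rb', 'rbcrawl')     # arrives while the indexer is stopped

second_run = []
index_pending(log, second_run)   # a restarted indexer sees only c.rb
```

Because the marker lives in the same store as the data, a restarted indexer needs no state of its own. Writing a marker after every document is the simplest choice; a real indexer might write one per batch instead.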