10.4 Resetting Data
While you are repeating this experiment, you may want to start some of the exercises from scratch. The easiest way is to re-create the Drip database, but you may not want to delete all the data if you’ve already written some data into MyDrip.
In such a case, let’s introduce a tag that indicates the start of the system. When we put an object with an rbcrawl-begin tag, we’ll ignore any data prior to that so that we can treat it as if the system were empty.
In the example code, I used fence for both the crawler and the indexer. When a script calls the older or head method, it checks the key (a timestamp) of the entry it gets back and ignores the entry if it is older than the one recorded by fence.
>> MyDrip.write('fence', 'rbcrawl-begin')
=> 1313573767321913
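The check against the fence might look like the following minimal sketch. It does not talk to a real Drip server: TinyDrip is a hypothetical in-memory stand-in that only imitates the shape of write, head, and older (with a counter in place of the timestamp key), and the helper names fence_key and entries_since_fence are assumptions made for illustration.

```ruby
# Hypothetical in-memory stand-in for the small part of Drip used here.
class TinyDrip
  def initialize
    @entries = [] # each entry is [key, value, *tags], in ascending key order
    @clock = 0    # a counter stands in for the timestamp key
  end

  # Stores value with its tags and returns the new key.
  def write(value, *tags)
    @clock += 1
    @entries << [@clock, value, *tags]
    @clock
  end

  # Returns the newest n entries (optionally limited to one tag), oldest first.
  def head(n = 1, tag = nil)
    found = tag ? @entries.select { |_, _, *tags| tags.include?(tag) } : @entries
    found.last(n)
  end

  # Returns the newest entry older than key, or nil. A nil key starts from the end.
  def older(key, tag = nil)
    @entries.reverse_each do |k, v, *tags|
      next if key && k >= key
      next if tag && !tags.include?(tag)
      return [k, v, *tags]
    end
    nil
  end
end

# Key of the latest fence, or 0 when no fence has been written yet.
def fence_key(drip, fence_tag = 'rbcrawl-begin')
  k, = drip.head(1, fence_tag).first
  k || 0
end

# Walks backward with older and stops as soon as it crosses the fence.
def entries_since_fence(drip, tag)
  fence = fence_key(drip)
  result = []
  key = nil
  while (entry = drip.older(key, tag))
    key = entry[0]
    break if key < fence
    result.unshift(entry)
  end
  result
end

drip = TinyDrip.new
drip.write('old page', 'rbcrawl')
drip.write('fence', 'rbcrawl-begin')
drip.write('new page', 'rbcrawl')

fresh = entries_since_fence(drip, 'rbcrawl')
p fresh.map { |_, value| value } # prints ["new page"]
```

Nothing is deleted here. The old entry is still stored, but every reader that honors the fence behaves as if the system had just started.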
There is another use case for the fence pattern.
Imagine you shut down the search system after writing the index information to disk. The next time you start the search system, the newly started search process needs to know how far the indexing has progressed. In such a case, you can leave a fence that records the progress of the indexing work so that the indexer can resume from where it left off when it is restarted. The earlier example uses the fence as a footprint for ignoring old data, whereas this situation uses it as a footprint that shows which tasks remain.
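As a rough sketch of this second use, the indexer below writes its cursor as a progress fence after each batch and reads it back when it starts. Everything here is an assumption made for illustration: TinyDrip is a hypothetical in-memory stand-in rather than the real Drip, newer_with_tag is a simplified non-blocking lookup, and the index-progress tag and ResumableIndexer class do not come from the example code.

```ruby
# Hypothetical in-memory stand-in; the real Drip persists entries
# and uses timestamps as keys.
class TinyDrip
  def initialize
    @entries = [] # each entry is [key, value, *tags]
    @clock = 0
  end

  # Stores value with its tags and returns the new key.
  def write(value, *tags)
    @clock += 1
    @entries << [@clock, value, *tags]
    @clock
  end

  # Returns the newest n entries (optionally limited to one tag), oldest first.
  def head(n = 1, tag = nil)
    found = tag ? @entries.select { |_, _, *tags| tags.include?(tag) } : @entries
    found.last(n)
  end

  # Up to n entries carrying tag whose key is newer than key. Never blocks.
  def newer_with_tag(key, tag, n)
    @entries.select { |k, _, *tags| k > key && tags.include?(tag) }.first(n)
  end
end

# Hypothetical indexer that leaves a progress fence after every batch.
class ResumableIndexer
  PROGRESS_TAG = 'index-progress' # assumed tag name

  attr_reader :indexed

  def initialize(drip)
    @drip = drip
    @indexed = []
    # The value of the latest progress fence is the last key already indexed.
    _, saved = drip.head(1, PROGRESS_TAG).first
    @cursor = saved || 0
  end

  # Indexes one batch of entries after the cursor, then records the new cursor.
  def run_once(batch = 10)
    @drip.newer_with_tag(@cursor, 'rbcrawl', batch).each do |key, value|
      @indexed << value # a real indexer would update the on-disk index here
      @cursor = key
    end
    @drip.write(@cursor, PROGRESS_TAG)
  end
end

drip = TinyDrip.new
drip.write('a.rb', 'rbcrawl')
drip.write('b.rb', 'rbcrawl')

first_run = ResumableIndexer.new(drip)
first_run.run_once
p first_run.indexed # prints ["a.rb", "b.rb"]

drip.write('c.rb', 'rbcrawl')

restarted = ResumableIndexer.new(drip) # simulates a restart of the indexer
restarted.run_once
p restarted.indexed # prints ["c.rb"]
```

The restarted indexer handles only c.rb, because the progress fence tells it that everything up to b.rb is already in the index.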