Bluemini.com

Maintaining Cache Validity over a Multi Server Cluster

posted: 06 May 2011

Overview

Increasingly, data is held in in-memory caches on the individual servers of a multi-machine cluster, to reduce the load and latency associated with repeated lookups in a master data store. These caches are essentially key/value stores held in application memory. Validity of the data in each cache is of greater importance than synchronicity, so whilst notification of updates must be reliably cascaded out to all other nodes, the underlying data itself need not be sent. We have proposed and implemented a lightweight, difference-based approach, built on JMS messaging, that keeps cached data valid at the expense of keeping it synchronized.

Validity vs Synchronicity

A cache is only as valuable as the data that it holds. If that data is out of date (stale or invalid), it can affect the users of systems that rely on the cache and ultimately lead to further problems with data integrity later on.

Synchronicity, on the other hand, is less of an issue. If one cache contains data that another cache does not, the corresponding application can fetch the data from the master data store and save it to its own cache. This relies on all servers in the cluster having appropriate access to the underlying data store.

One important aspect of this deferred synchronicity is that whenever any cache is updated, all other cluster members must be notified of the action and be able to validate their own copy of the data, if one exists.

Hash Exchange

As discussed, it is not important that all caches contain the same data; what matters is that the data any cache does contain is up to date. Hence we needed a mechanism to validate data whenever there was potential for it to change. To achieve this, the cache stores a hash of the stored value alongside each data node. It uses this hash both when informing other caches of activity connected to a node and when receiving messages related to a node, to decide whether any cache-related action is required.
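As a rough illustration of storing a hash next to each value, here is a minimal Python sketch. The names `CacheEntry`, `value_hash` and `make_entry` are hypothetical, and serialising values as JSON with sorted keys is an assumption made here so that equal values always hash equally; the original implementation may do this differently.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


def value_hash(value: Any) -> str:
    """Return an MD5 hex digest of a JSON-serialisable value.

    Assumption: values are serialised as JSON with sorted keys, so that
    equal values always produce equal hashes.
    """
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.md5(encoded).hexdigest()


@dataclass(frozen=True)
class CacheEntry:
    """A cached value stored alongside the hash of that value."""
    data: Any
    hash: str


def make_entry(value: Any) -> CacheEntry:
    """Build a cache node from a value, computing its hash once."""
    return CacheEntry(data=value, hash=value_hash(value))


entry = make_entry({"name": "nick"})
edited = make_entry({"name": "nicholas"})
print(entry.hash == make_entry({"name": "nick"}).hash)  # same value, same hash
print(entry.hash == edited.hash)                        # edited value, different hash
```

Comparing two short hashes is enough to tell whether a local copy still matches the value another server has just stored, without ever shipping the value itself.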

The JMS Message

In current versions, no data is passed over JMS, only notifications that data has been affected in some way. When a node is inserted or modified, the name of the key that stores the data and the hash value are communicated in a message to all participants of the cache cluster.
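The notification therefore needs only two fields. The sketch below shows one possible shape for it in Python; the `CacheNotification` class, its field names and the JSON text encoding are all assumptions for illustration, since the post does not specify the wire format used in the JMS message.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheNotification:
    key: str         # name of the cache key that was inserted or modified
    value_hash: str  # hash of the value now stored under that key


def encode(note: CacheNotification) -> str:
    """Serialise a notification as JSON text (hypothetical wire format)."""
    return json.dumps({"key": note.key, "hash": note.value_hash})


def decode(body: str) -> CacheNotification:
    """Rebuild a notification from its JSON text form."""
    fields = json.loads(body)
    return CacheNotification(key=fields["key"], value_hash=fields["hash"])


body = encode(CacheNotification(key="nick", value_hash="1234567890"))
print(body)          # the message carries the key and hash, never the data
print(decode(body))
```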

  1. When data is pushed into the cache, an MD5 hash value for the data is calculated. If data already exists in the node, the hash of the incoming data is compared with that of the existing data. If the hashes match, no further action is taken. However, if the hashes do not match, or the data doesn't exist, then the data is stored in the cache and a message is transmitted to the other members of the cluster.
  2. On receipt of a message, each cache checks the referenced key and hash value. If the hashes do not match, the data is immediately flushed. If no matching key is found, or the hashes match, no action is taken. Note that data is not refreshed; it is dropped. The application is tasked with checking the cache and inserting new data if the cache doesn't contain valid data.
  3. An application may also invalidate a cache key on demand. This can be done to control the size of the cache or for other housekeeping reasons. It is not necessary to message the other cluster members in this case.
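The three rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the original implementation: the `ValidatingCache` class and its method names are hypothetical, and the `publish` callback is an assumed stand-in for whatever sends the JMS message.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

Publish = Callable[[str, str], None]  # (key, hash) -> None


class ValidatingCache:
    """In-memory key/value cache applying the three rules listed above."""

    def __init__(self, publish: Publish) -> None:
        # `publish` is a hypothetical hook standing in for the JMS producer.
        self._publish = publish
        self._entries: Dict[str, Tuple[bytes, str]] = {}  # key -> (data, hash)

    def put(self, key: str, data: bytes) -> bool:
        """Rule 1: store and notify only when the data is new or changed."""
        digest = hashlib.md5(data).hexdigest()
        existing = self._entries.get(key)
        if existing is not None and existing[1] == digest:
            return False  # hashes match: nothing to do
        self._entries[key] = (data, digest)
        self._publish(key, digest)
        return True

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        return entry[0] if entry is not None else None

    def on_message(self, key: str, digest: str) -> bool:
        """Rule 2: drop (never refresh) a local copy whose hash differs."""
        existing = self._entries.get(key)
        if existing is None or existing[1] == digest:
            return False  # unknown key or matching hash: no action
        del self._entries[key]
        return True

    def invalidate(self, key: str) -> None:
        """Rule 3: local removal on demand; no message is sent."""
        self._entries.pop(key, None)


sent = []
cache = ValidatingCache(publish=lambda key, digest: sent.append((key, digest)))
cache.put("nick", b"name=nick")                # new key: stored, one message sent
cache.put("nick", b"name=nick")                # same hash: ignored, no message
cache.on_message("nick", "not-the-same-hash")  # mismatch: entry dropped
print(len(sent), cache.get("nick"))            # prints: 1 None
```

Because a mismatch only drops the entry, the receiving server never has to trust data sent by another node. It reloads from the master data store the next time the key is requested.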

Process Flow: A data modification example

Consider a two server setup where both servers have been recently started.

  • SERVER ONE - cache = empty
  • SERVER TWO - cache = empty
  1. A user logs in to server one and requests a record. The cache is checked and found to be empty, so the data is retrieved from the database and inserted into the cache, and a new hash is created {name:nick, hash:1234567890}. According to the rules, a new message is generated with the hash value {1234567890} and transmitted to the cluster.
  2. Both servers receive the message and both check their caches. Server one's cache contains the key and the hash values match, so no further action is taken. Server two's cache is empty so, again, no further action is taken.
    • SERVER ONE - cache = {data={name=nick}, hash=1234567890}
    • SERVER TWO - cache = empty
  3. An editor visits server two to edit the same data viewed in step 1. The application checks its cache but finds no data, so it retrieves the data from the database.
  4. On saving the edit, the application invalidates the cache and then displays the record. Checking the cache, it finds it empty and therefore retrieves the data from the database and stores it in the cache. This generates a message, this time with a hash based on the newly edited data, e.g. {abcdefghij}.
  5. Both servers again receive the message and both check their caches. Server two's cache contains the key and the hash values match, so no further action is taken. Server one's cache contains the key, but the hashes don't match, so it invalidates the key.
  6. The next time the data is viewed on server one, the cache will be checked and found empty, causing a final round of insertion and messaging, after which both caches will be primed and up to date.
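The walkthrough above can be traced with a small in-process simulation. This is only a sketch under stated assumptions: the `Server` class and the `DATABASE` dictionary are hypothetical, and a synchronous loop over the cluster members stands in for the JMS broadcast, which in a real deployment would be asynchronous.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

DATABASE = {"nick": "name=nick"}  # stand-in for the master data store


class Server:
    def __init__(self, name: str, cluster: List["Server"]) -> None:
        self.name = name
        self.cache: Dict[str, Tuple[str, str]] = {}  # key -> (data, hash)
        self.cluster = cluster
        cluster.append(self)

    def view(self, key: str) -> str:
        """Serve from cache, or load from the database and prime the cache."""
        if key in self.cache:
            return self.cache[key][0]
        data = DATABASE[key]
        digest = hashlib.md5(data.encode("utf-8")).hexdigest()
        self.cache[key] = (data, digest)
        for member in self.cluster:  # broadcast to all members, sender included
            member.receive(key, digest)
        return data

    def edit(self, key: str, data: str) -> str:
        """Save to the database, invalidate locally, then redisplay."""
        DATABASE[key] = data
        self.cache.pop(key, None)
        return self.view(key)

    def receive(self, key: str, digest: str) -> None:
        entry = self.cache.get(key)
        if entry is not None and entry[1] != digest:
            del self.cache[key]  # stale copy: drop it, do not refresh

    def cached(self, key: str) -> Optional[str]:
        entry = self.cache.get(key)
        return entry[0] if entry is not None else None


cluster: List[Server] = []
one, two = Server("one", cluster), Server("two", cluster)

one.view("nick")                               # steps 1-2
print(one.cached("nick"), two.cached("nick"))  # name=nick None

two.edit("nick", "name=nicholas")              # steps 3-5
print(one.cached("nick"), two.cached("nick"))  # None name=nicholas

one.view("nick")                               # step 6
print(one.cached("nick"), two.cached("nick"))  # name=nicholas name=nicholas
```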