Accumulo tablet servers have block caches that buffer data in memory to limit reads from disk. This caching has the following benefits:

  • reduces latency when reading data
  • helps alleviate hotspots in tables

Each tablet server has an index block cache and a data block cache, each shared by all tablets it hosts (see the tablet server diagram to learn more). A typical Accumulo read operation performs a binary search over several index blocks, followed by a linear scan of one or more data blocks. Any of these blocks that is not in a cache must be retrieved from RFiles in HDFS. The index block cache is enabled for all tables by default, but the data block cache must be enabled per table by the user. It is typically enabled only for tables where read performance is critical.
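The read path described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Accumulo's implementation: the names BlockCacheSketch, Reader, DISK, and INDEX are hypothetical, the index is a single sorted array rather than multiple index blocks, and a LinkedHashMap in access order stands in for the real block cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BlockCacheSketch {

  // Hypothetical stand-in for RFile data blocks in HDFS: each block holds sorted key/value pairs.
  static final String[][][] DISK = {
    {{"apple", "1"}, {"banana", "2"}},
    {{"cherry", "3"}, {"date", "4"}},
    {{"elder", "5"}, {"fig", "6"}}
  };

  // Hypothetical single-level index: the last key stored in each data block.
  static final String[] INDEX = {"banana", "date", "fig"};

  // Binary search for the first block whose last key is >= the requested key.
  // Returns -1 when the key sorts after every block.
  static int findBlock(String key) {
    int lo = 0;
    int hi = INDEX.length - 1;
    int found = -1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (INDEX[mid].compareTo(key) >= 0) {
        found = mid;
        hi = mid - 1;
      } else {
        lo = mid + 1;
      }
    }
    return found;
  }

  static class Reader {
    private final Map<Integer, String[][]> cache;
    int diskReads = 0;
    int cacheHits = 0;

    Reader(final int maxBlocks) {
      // Access-ordered map that evicts the least recently used block once full.
      this.cache = new LinkedHashMap<Integer, String[][]>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, String[][]> eldest) {
          return size() > maxBlocks;
        }
      };
    }

    String lookup(String key) {
      int block = findBlock(key);
      if (block < 0) {
        return null;
      }
      String[][] entries = cache.get(block);
      if (entries != null) {
        cacheHits++;
      } else {
        // Cache miss: fetch the block from the simulated disk and cache it.
        entries = DISK[block];
        diskReads++;
        cache.put(block, entries);
      }
      // Linear scan of the data block for the requested key.
      for (String[] entry : entries) {
        if (entry[0].equals(key)) {
          return entry[1];
        }
      }
      return null;
    }
  }

  public static void main(String[] args) {
    Reader reader = new Reader(2);
    for (String key : new String[] {"cherry", "date", "apple", "fig", "cherry"}) {
      System.out.println(key + " -> " + reader.lookup(key));
    }
    System.out.println("disk reads: " + reader.diskReads + ", cache hits: " + reader.cacheHits);
  }
}
```

With a capacity of two blocks, the lookup of "date" is served from the cache, while the second lookup of "cherry" goes back to the simulated disk because reading "fig" evicted that block. A cache that is too small for the working set behaves the same way in this model: blocks are evicted before they can be reused.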

Configuration

The tserver.cache.manager.class property controls which block cache implementation is used within the tablet server. Users can supply their own implementation and set custom configuration properties to control its behavior (see org.apache.accumulo.core.spi.cache.BlockCacheManager.Configuration).

The index and data block caches are configured per table by the table.cache.index.enable and table.cache.block.enable properties.

While the index block cache is enabled by default for all Accumulo tables, users must enable the data block cache by setting table.cache.block.enable to true in the shell:

  config -t mytable -s table.cache.block.enable=true

Or programmatically using TableOperations.setProperty():

  client.tableOperations().setProperty("mytable", "table.cache.block.enable", "true");

The sizes of the index and data block caches (which are shared by all tablets of a tablet server) can be changed from their defaults by setting the following properties: