The accumulo command can be used to run various tools and classes from the command line.

RFileInfo

The rfile-info tool will examine an Accumulo storage file and print out basic metadata.

  $ accumulo rfile-info /accumulo/tables/1/default_tablet/A000000n.rf
  2013-07-16 08:17:14,778 [util.NativeCodeLoader] INFO : Loaded the native-hadoop library
  Locality group : <DEFAULT>
  Start block : 0
  Num blocks : 1
  Index level 0 : 62 bytes 1 blocks
  First key : 288be9ab4052fe9e span:34078a86a723e5d3:3da450f02108ced5 [] 1373373521623 false
  Last key : start:13fc375709e id:615f5ee2dd822d7a [] 1373373821660 false
  Num entries : 466
  Column families : [waitForCommits, start, md major compactor 1, md major compactor 2, md major compactor 3,
  bringOnline, prep, md major compactor 4, md major compactor 5, md root major compactor 3,
  minorCompaction, wal, compactFiles, md root major compactor 4, md root major compactor 1,
  md root major compactor 2, compact, id, client:update, span, update, commit, write,
  majorCompaction]
  Meta block : BCFile.index
  Raw size : 4 bytes
  Compressed size : 12 bytes
  Compression type : gz
  Meta block : RFile.index
  Raw size : 780 bytes
  Compressed size : 344 bytes
  Compression type : gz
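
When rfile-info is called from a script, a single field can be pulled out of the summary with standard text tools. The sketch below assumes the "Label : value" layout shown above. The helper name rfile_field and the embedded sample text are illustrative only and are not part of Accumulo.

```shell
# Hypothetical helper: print the value of one "Label : value" line
# read from stdin. Only the first matching line is reported.
rfile_field() {
  awk -v label="$1" -F'[ \t]+:[ \t]+' '
    { key = $1; sub(/^[ \t]+/, "", key) }   # drop any leading indentation
    key == label { print $2; exit }
  '
}

# Sample lines copied from the listing above. In practice, pipe in the
# output of: accumulo rfile-info <file>
sample='Locality group : <DEFAULT>
Num blocks : 1
Num entries : 466'

printf '%s\n' "$sample" | rfile_field 'Num entries'   # prints 466
```

Matching on the whole label, rather than grepping for a substring, avoids picking up log lines that happen to contain the same words.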

When trying to diagnose problems related to key size, the rfile-info tool can provide a histogram of the individual key sizes:

  $ accumulo rfile-info --histogram /accumulo/tables/1/default_tablet/A000000n.rf
  ...
  Up to size count %-age
  10 : 222 28.23%
  100 : 244 71.77%
  1000 : 0 0.00%
  10000 : 0 0.00%
  100000 : 0 0.00%
  1000000 : 0 0.00%
  10000000 : 0 0.00%
  100000000 : 0 0.00%
  1000000000 : 0 0.00%
  10000000000 : 0 0.00%

Likewise, rfile-info will dump the key-value pairs and show you the contents of the RFile:

  $ accumulo rfile-info --dump /accumulo/tables/1/default_tablet/A000000n.rf
  row columnFamily:columnQualifier [visibility] timestamp deleteFlag -> Value
  ...

Encrypted Files

To examine an encrypted RFile, the necessary encryption properties must be provided to the utility. To do this, copy the accumulo.properties file, add the necessary encryption parameters to the copy, and pass the copy to the utility with the -p argument.

For example, if using PerTableCryptoServiceFactory and the AESCryptoService, you would need the following properties in your accumulo.properties file:

  general.custom.crypto.key.uri=<path-to-key>/data-encryption.key
  instance.crypto.opts.factory=org.apache.accumulo.core.spi.crypto.PerTableCryptoServiceFactory
  table.crypto.opts.service=org.apache.accumulo.core.spi.crypto.AESCryptoService
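
The copy-and-extend step described above might look like the following sketch. Every path here is a placeholder: the script creates an empty stand-in for accumulo.properties in a temporary directory so that it runs anywhere, and the copy's file name is hypothetical. The <path-to-key> placeholder is left exactly as in the properties above and must be replaced with a real location.

```shell
# Work in a scratch directory; real deployments would copy the
# accumulo.properties file from their own configuration directory.
workdir=$(mktemp -d)
conf="$workdir/accumulo.properties"
: > "$conf"                                   # empty stand-in file

# Copy the file, then append the encryption properties to the copy only.
copy="$workdir/accumulo-rfile-info.properties"   # hypothetical name
cp "$conf" "$copy"
cat >> "$copy" <<'EOF'
general.custom.crypto.key.uri=<path-to-key>/data-encryption.key
instance.crypto.opts.factory=org.apache.accumulo.core.spi.crypto.PerTableCryptoServiceFactory
table.crypto.opts.service=org.apache.accumulo.core.spi.crypto.AESCryptoService
EOF

# Count the crypto lines that were added; the copy would then be passed
# to rfile-info with the -p argument.
grep -c 'crypto' "$copy"
```

Editing a copy keeps the properties file used by the running servers unchanged.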

Example output:

  $ accumulo rfile-info hdfs://localhost:8020/accumulo/tables/1/default_tablet/F0000001.rf -p <path-to-properties>/accumulo.properties
  Reading file: hdfs://localhost:8020/accumulo/tables/1/default_tablet/F0000001.rf
  Encrypted with Params: ...
  ...
  RFile Version : 8
  Locality group : <DEFAULT>
  Num blocks : 1
  Index level 0 : 37 bytes 1 blocks
  ...
  Meta block : BCFile.index
  Raw size : 4 bytes
  ...
  Meta block : RFile.index
  Raw size : 121 bytes
  ...
  ...

GetManagerStats

The GetManagerStats tool can be used to retrieve Accumulo state and statistics:

  $ accumulo org.apache.accumulo.test.GetManagerStats | grep Load
  OS Load Average: 0.27

FindOfflineTablets

If the Accumulo monitor shows an offline tablet, use FindOfflineTablets to find out which tablet it is.

  $ accumulo org.apache.accumulo.server.util.FindOfflineTablets
  2<<@(null,null,localhost:9997) is UNASSIGNED #walogs:2

Here’s what the output means:

  • 2<< - This is the tablet from (-inf, +inf) for the table with id 2. The command tables -l in the shell will show table ids for tables.

  • @(null, null, localhost:9997) - Location information. The format is @(assigned, hosted, last). In this case, the tablet has not been assigned, is not hosted anywhere, and was once hosted on localhost.

  • #walogs:2 - The number of write-ahead logs that this tablet requires for recovery.

An unassigned tablet with write-ahead logs is probably waiting for logs to be sorted for efficient recovery.
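
To make the field layout concrete, the sketch below splits one FindOfflineTablets line into labeled parts. It assumes the exact single-line format shown in the sample output. The helper name describe_tablet is hypothetical, and lines that do not match the pattern are silently skipped.

```shell
# Hypothetical helper: rewrite "extent@(assigned,hosted,last) is STATE #walogs:N"
# as a list of name=value pairs.
describe_tablet() {
  sed -n 's/^\(.*\)@(\([^,]*\),\([^,]*\),\([^)]*\)) is \([A-Z_]*\) #walogs:\([0-9]*\)$/tablet=\1 assigned=\2 hosted=\3 last=\4 state=\5 walogs=\6/p'
}

# Sample line copied from the output above.
printf '%s\n' '2<<@(null,null,localhost:9997) is UNASSIGNED #walogs:2' | describe_tablet
# prints: tablet=2<< assigned=null hosted=null last=localhost:9997 state=UNASSIGNED walogs=2
```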

CheckForMetadataProblems

The CheckForMetadataProblems tool can be used to make sure metadata tables are up and consistent. It will verify that the start/end of every tablet matches, and that the start and stop for the table are empty:

  $ accumulo org.apache.accumulo.server.util.CheckForMetadataProblems -u root --password
  Enter the connection password:
  Checking tables whose metadata is found in: accumulo.root (+r)
  ...All is well for table accumulo.metadata (!0)
  No problems found in accumulo.root (+r)
  Checking tables whose metadata is found in: accumulo.metadata (!0)
  ...All is well for table accumulo.replication (+rep)
  ...All is well for table trace (1)
  No problems found in accumulo.metadata (!0)

RemoveEntriesForMissingFiles

If your Hadoop cluster has lost a file due to a NameNode failure, you can remove the file reference using RemoveEntriesForMissingFiles. It checks every file reference to ensure that the file exists in HDFS and, optionally, removes the references to missing files:

  $ accumulo org.apache.accumulo.server.util.RemoveEntriesForMissingFiles -u root --password
  Enter the connection password:
  2013-07-16 13:10:57,293 [util.RemoveEntriesForMissingFiles] INFO : File /accumulo/tables/2/default_tablet/F0000005.rf is missing
  2013-07-16 13:10:57,296 [util.RemoveEntriesForMissingFiles] INFO : 1 files of 3 missing

ChangeSecret (new in 2.1)

Changes the unique secret given to the instance that all servers must know. The utility can be run using the accumulo admin command. Note that Accumulo must be shut down to run this utility.

  $ accumulo admin changeSecret
  Old secret:
  New secret:
  New instance id is 6e7f416b-c578-45df-8016-c9bc6b400e13
  Be sure to put your new secret in accumulo.properties

DeleteZooInstance (new in 2.1)

Deletes a specific instance name or id from ZooKeeper, or cleans up all old instances. The utility can be run using the accumulo admin command.

To delete a specific instance, use the -i or --instance flag.

  $ accumulo admin deleteZooInstance -i instance1
  Deleted instance: instance1

If you try to delete the current instance, a warning prompt will be displayed.

  $ accumulo admin deleteZooInstance -i uno
  Warning: This is the current instance, are you sure? (yes|no): no
  Instance deletion of 'uno' cancelled.
  $ accumulo admin deleteZooInstance -i uno
  Warning: This is the current instance, are you sure? (yes|no): yes
  Deleted instance: uno

If you have entries in ZooKeeper for old instances that you no longer need, use the -c or --clean flag. This command will not delete the instance pointed to by the local accumulo.properties file.

  $ accumulo admin deleteZooInstance -c
  Deleted instance: instance1
  Deleted instance: instance2

accumulo-util dump-zoo

To view the contents of ZooKeeper, run accumulo-util dump-zoo. It can also be run using the accumulo command.

If you would like to back up ZooKeeper, run the following command to write its contents as XML to a file.

  $ accumulo-util dump-zoo --xml --root /accumulo >dump.xml

RestoreZookeeper

An XML dump file can later be used to restore ZooKeeper. The utility can be run using the accumulo admin command.

  $ accumulo admin restoreZoo --overwrite < dump.xml

This command overwrites ZooKeeper, so take care when using it. This is also why it cannot be called using accumulo-util.

TabletServerLocks (new in 2.1)

List or delete Tablet Server locks. The utility can be run using the accumulo admin command.

  $ accumulo admin locks
  localhost:9997 TSERV_CLIENT=localhost:9997
  $ accumulo admin locks -delete localhost:9997
  $ accumulo admin locks
  localhost:9997 <none>

VerifyTabletAssignments (new in 2.1)

Verify all tablets are assigned to tablet servers. The utility can be run using the accumulo admin command.

  $ accumulo admin verifyTabletAssigns
  Checking table accumulo.metadata
  Checking table accumulo.replication
  Tablet +rep<< has no location
  Checking table accumulo.root
  Checking table t1
  Checking table t2
  Checking table t3
  $ accumulo admin verifyTabletAssigns -v
  Checking table accumulo.metadata
  Tablet !0;~< is located at localhost:9997
  Tablet !0<;~ is located at localhost:9997
  Checking table accumulo.replication
  Tablet +rep<< has no location
  Checking table accumulo.root
  Tablet +r<< is located at localhost:9997
  Checking table t1
  Tablet 1<< is located at localhost:9997
  Checking table t2
  Tablet 2<< is located at localhost:9997
  Checking table t3
  Tablet 3<< is located at localhost:9997
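
On an instance with many tables, it can help to reduce this output to just the tablets that lack a location. The following sketch assumes the "Tablet ... has no location" wording shown above. The helper name unlocated_tablets is hypothetical.

```shell
# Hypothetical helper: print only the extents of tablets reported
# as having no location.
unlocated_tablets() {
  sed -n 's/^Tablet \(.*\) has no location$/\1/p'
}

# Sample lines copied from the output above. In practice, pipe in the
# output of: accumulo admin verifyTabletAssigns
sample='Checking table accumulo.metadata
Checking table accumulo.replication
Tablet +rep<< has no location
Checking table accumulo.root'

printf '%s\n' "$sample" | unlocated_tablets   # prints +rep<<
```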

zoo-info-viewer (new in 2.1)

View Accumulo information stored in ZooKeeper in a human-readable format. The utility can be run without a running Accumulo instance. If an instance id or name is not provided on the command line, the instance will be read from HDFS; otherwise, only a running ZooKeeper instance is required to run the command.

To run the command:

  $ accumulo zoo-info-viewer [mode-options] [--outfile filename]
  mode options:
    --print-instances
    --print-id-map
    --print-props [--system] [-ns | --namespaces list] [-t | --tables list]
    --print-acls

mode: print instances

The instance name(s) and instance id(s) are stored in ZooKeeper. To see the available name-to-id mappings, run:

  $ accumulo zoo-info-viewer --print-instances
  -----------------------------------------------
  Report Time: 2022-05-31T21:07:19.673258Z
  -----------------------------------------------
  Instances (Instance Name, Instance ID)
  test_a=1111465d-b7bb-42c2-919b-111111111111
  test_b=2222465d-b7bb-42c2-919b-222222222222
  uno=9cc9465d-b7bb-42c2-919b-ddf74b610c82
  -----------------------------------------------

mode: print id-map

If a shell is not available or convenient, the zoo-info-viewer can provide the same information as the namespaces -l and tables -l commands. Note that the zoo-info-viewer output is sorted by id.

  $ accumulo zoo-info-viewer --print-id-map
  -----------------------------------------------
  Report Time: 2022-05-25T19:33:42.079969Z
  -----------------------------------------------
  ID Mapping (id => name) for instance: 8f006afd-8673-4a5a-b940-60405755197f
  Namespace ids:
  +accumulo => accumulo
  +default => ""
  1 => ns_sample1
  Table ids:
  !0 => accumulo.metadata
  +r => accumulo.root
  +rep => accumulo.replication
  2 => ns_sample1.tbl1
  3 => tbl2
  -----------------------------------------------
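
Because the id map is plain text, a table name can be resolved to its id with a small filter. The sketch below assumes the "id => name" layout and the "Table ids:" section header shown above. The helper name table_id is hypothetical.

```shell
# Hypothetical helper: print the id of the named table. Only lines after
# the "Table ids:" header are considered, so a namespace with the same
# name is not matched by mistake.
table_id() {
  awk -v name="$1" '
    /^[ \t]*Table ids:/ { in_tables = 1; next }
    in_tables && $2 == "=>" && $3 == name { print $1; exit }
  '
}

# Sample lines copied from the output above. In practice, pipe in the
# output of: accumulo zoo-info-viewer --print-id-map
sample='Namespace ids:
+accumulo => accumulo
1 => ns_sample1
Table ids:
!0 => accumulo.metadata
2 => ns_sample1.tbl1
3 => tbl2'

printf '%s\n' "$sample" | table_id tbl2   # prints 3
```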

mode: print property mappings

With Accumulo version 2.1, the storage of properties in ZooKeeper has changed, and the properties are not directly readable with the ZooKeeper zkCli utility. The properties can be listed in an Accumulo shell with the config command. However, if a shell is not available, the zoo-info-viewer utility can be used instead.

The zoo-info-viewer option --print-props with no other options will print all the configuration properties for the system, namespaces, and tables. The output can be filtered with additional options: --system prints the system configuration, -ns or --namespaces expects a list of namespace names, and -t or --tables expects a list of table names to include in the output.

  $ accumulo zoo-info-viewer --print-props
  -----------------------------------------------
  Report Time: 2022-05-31T21:18:11.562867Z
  -----------------------------------------------
  ZooKeeper properties for instance ID: 9cc9465d-b7bb-42c2-919b-ddf74b610c82
  Name: System, Data Version:0, Data Timestamp: 2022-05-31T15:51:52.772265Z:
  -- none --
  Namespace:
  Name: , Data Version:0, Data Timestamp: 2022-05-31T15:51:53.015613Z:
  -- none --
  Name: accumulo, Data Version:0, Data Timestamp: 2022-05-31T15:51:53.034172Z:
  -- none --
  Name: ns1, Data Version:0, Data Timestamp: 2022-05-31T21:17:22.927165Z:
  -- none --
  Tables:
  Name: accumulo.metadata, Data Version:2, Data Timestamp: 2022-05-31T15:51:53.511811Z:
  table.cache.block.enable=true
  table.cache.index.enable=true
  ...
  Name: accumulo.replication, Data Version:1, Data Timestamp: 2022-05-31T15:51:53.516346Z:
  table.formatter=org.apache.accumulo.server.replication.StatusFormatter
  table.group.repl=repl
  ...
  Name: accumulo.root, Data Version:2, Data Timestamp: 2022-05-31T15:51:53.501174Z:
  table.cache.block.enable=true
  table.cache.index.enable=true
  ...
  Name: ns1.tbl1, Data Version:1, Data Timestamp: 2022-05-31T21:17:41.111836Z:
  table.constraint.1=org.apache.accumulo.core.data.constraints.DefaultKeySizeConstraint
  table.iterator.majc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
  ...
  Name: tbl3, Data Version:1, Data Timestamp: 2022-05-31T21:17:54.083044Z:
  table.constraint.1=org.apache.accumulo.core.data.constraints.DefaultKeySizeConstraint
  table.iterator.majc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
  ...
  -----------------------------------------------

mode: print ACLs (new in 2.1.1)

With 2.1.1, the zoo-info-viewer option --print-acls will print the ZooKeeper ACLs for all nodes under the /accumulo/[INSTANCE_ID] path.

See troubleshooting ZooKeeper for more information on the tool output and expected ACLs.

  $ accumulo zoo-info-viewer --print-acls
  -----------------------------------------------
  Report Time: 2023-01-27T23:00:26.079546Z
  -----------------------------------------------
  Output format:
  ACCUMULO_PERM:OTHER_PERM path user_acls...
  ZooKeeper acls for instance ID: f491223b-1413-494e-b75a-c2ca018db00f
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/bulk_failed_copyq cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/bulk_failed_copyq/locks cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/compactors cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/config cdrwa:accumulo
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/coordinators cdrwa:accumulo, r:anyone
  ...
  ERROR_ACCUMULO_MISSING_SOME:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/users/root/Namespaces r:accumulo, r:anyone
  ...
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/wals/localhost:9997[100003d35cc0004]/643b14db-b929-4570-b226-620bc5ac85ff cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/wals/localhost:9997[100003d35cc0004]/ad26be2a-dc52-4e0e-8e78-8fc8c3323d51 cdrwa:accumulo, r:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/instances cdrwa:anyone
  ACCUMULO_OKAY:NOT_PRIVATE /accumulo/instances/uno cdrwa:accumulo, r:anyone
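
The report can be long, so one way to review it is to show only the nodes whose first status is something other than ACCUMULO_OKAY. The sketch below assumes the "ACCUMULO_PERM:OTHER_PERM path user_acls..." layout shown above, and assumes that any other first status deserves a closer look. The helper name acl_problems is hypothetical.

```shell
# Hypothetical helper: keep report lines whose second field is a
# ZooKeeper path and whose status does not start with ACCUMULO_OKAY.
# Header lines are dropped because their second field is not a path.
acl_problems() {
  awk '$2 ~ /^\// && $1 !~ /^ACCUMULO_OKAY:/'
}

# Sample lines copied from the output above. In practice, pipe in the
# output of: accumulo zoo-info-viewer --print-acls
sample='ACCUMULO_PERM:OTHER_PERM path user_acls...
ACCUMULO_OKAY:PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/config cdrwa:accumulo
ERROR_ACCUMULO_MISSING_SOME:NOT_PRIVATE /accumulo/f491223b-1413-494e-b75a-c2ca018db00f/users/root/Namespaces r:accumulo, r:anyone
ACCUMULO_OKAY:NOT_PRIVATE /accumulo/instances cdrwa:anyone'

# Prints only the ERROR_ACCUMULO_MISSING_SOME line.
printf '%s\n' "$sample" | acl_problems
```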

zoo-prop-editor (new in 2.1.1)

The zoo-prop-editor tool provides an emergency capability to update properties stored in ZooKeeper without a running Accumulo instance. Only ZooKeeper and Hadoop are required to be available to use the tool. With Accumulo 2.1, properties are stored in single ZooKeeper config nodes for the system, each namespace, and each table. The properties are stored compressed and cannot be directly edited using ZooKeeper command-line tools like zkCli.sh.

This tool is provided for situations where invalid properties set by the user prevent the instance from running, or where running the instance would lead to an unacceptable outcome. Users should prefer using the Accumulo shell to edit properties if at all possible. Alternatively, properties can also be viewed using the zoo-info-viewer (it also does not need a running Accumulo instance).

The zoo-prop-editor follows a command format similar to that of the shell config command. If a namespace or table is not specified, the tool assumes the system properties. If neither the set nor the delete option is provided, the tool prints the current properties.

The tool displays only the properties stored in a single ZooKeeper config node. It does not provide the property hierarchy (default -> system -> namespace -> table) that is available with the shell config command.

The output includes property metadata lines that are prefixed with a colon (:). When piping the output to follow-on commands, these lines can be suppressed with grep -v : if desired.

For example, to view the current system config node properties (no properties are set in this example):

  $ accumulo zoo-prop-editor
  : Instance name: uno
  : Instance Id: e715caf8-f576-4b5d-871a-d47add90b7ba
  : Property scope: SYSTEM
  : ZooKeeper path: /accumulo/e715caf8-f576-4b5d-871a-d47add90b7ba/config
  : Name: system
  : Id: e715caf8-f576-4b5d-871a-d47add90b7ba
  : Data version: 0
  : Timestamp: 2023-06-12T21:52:15.727028Z

For example, to view the properties for table ns1.tbl1:

  $ accumulo zoo-prop-editor -t ns1.tbl1
  : Instance name: uno
  : Instance Id: e715caf8-f576-4b5d-871a-d47add90b7ba
  : Property scope: TABLE
  : ZooKeeper path: /accumulo/e715caf8-f576-4b5d-871a-d47add90b7ba/tables/2/config
  : Name: ns1.tbl1
  : Id: 2
  : Data version: 1
  : Timestamp: 2023-06-12T21:54:31.817473Z
  table.constraint.1=org.apache.accumulo.core.data.constraints.DefaultKeySizeConstraint
  table.iterator.majc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
  table.iterator.majc.vers.opt.maxVersions=1
  table.iterator.minc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
  table.iterator.minc.vers.opt.maxVersions=1
  table.iterator.scan.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
  table.iterator.scan.vers.opt.maxVersions=1
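
The metadata filtering mentioned earlier can be sketched as follows. Note one deliberate variation: instead of grep -v :, which drops every line containing a colon anywhere, this sketch anchors the match to the start of the line so that property values containing a colon would be kept. The helper name props_only is hypothetical, and the sketch assumes metadata lines begin with a colon followed by a space, as in the output above.

```shell
# Hypothetical helper: drop metadata lines (those starting with ": ")
# and keep only the key=value property lines.
props_only() {
  grep -v '^: '
}

# Sample lines copied from the output above. In practice, pipe in the
# output of: accumulo zoo-prop-editor -t ns1.tbl1
sample=': Instance name: uno
: Property scope: TABLE
table.iterator.majc.vers.opt.maxVersions=1
table.iterator.scan.vers.opt.maxVersions=1'

# Prints the two table.iterator lines only.
printf '%s\n' "$sample" | props_only
```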

To set a property, use the -s or --set option. For example:

  $ accumulo zoo-prop-editor -t ns1.tbl1 -s table.bloom.enabled=false

To delete a property, use the -d or --delete option. For example:

  $ accumulo zoo-prop-editor -t ns1.tbl1 -d table.bloom.enabled