How to determine Accumulo table visibilities?

We have an Accumulo instance in which some tables hold data written with visibility tokens that none of our current users have. For various reasons, we do not know all of the visibility strings/tokens used in those tables, so the data is effectively orphaned. Is there a way for the Accumulo root user, or any other user, to determine which visibility strings are present in a given table without already having those tokens assigned?

There are 2 answers

3
Christopher On

There are a few ways, and most of them involve writing code.

  1. You could modify Accumulo to disable visibility filtering (requires modifying the VisibilityFilter, a built-in and mandatory iterator).
  2. You could write a custom major compaction iterator that transforms every entry's visibility to something like "SUPERUSER|OLDLABEL", and then grant the "SUPERUSER" authorization to the user who needs to inspect the data. Alternatively, the major compaction iterator could simply report the visibilities it sees to a separate log, which can be inspected later (requires the alter-table permission and the ability to add your iterator to the classpath).
  3. You could read the contents of the files directly (requires direct access to the underlying distributed filesystem; see RFile Reader and related classes or this other answer).
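The rewrite in option 2 comes down to a small string transformation on each visibility expression. The sketch below shows only that logic in Python; a real compaction iterator would be written in Java against Accumulo's iterator API. The function name `widen_visibility` is hypothetical, and the parenthesization rule assumes Accumulo's usual expression syntax, in which `&` and `|` may not be mixed without parentheses. Validate every result with Accumulo's own ColumnVisibility class before writing it back.

```python
def widen_visibility(old, extra="SUPERUSER"):
    """Return a visibility expression that `extra` OR the old expression satisfies.

    Hypothetical helper: it illustrates the string logic only and does not
    validate the expression the way Accumulo would.
    """
    if not old:
        # An empty visibility is already readable by every user; leave it alone.
        return old
    if "&" in old or "|" in old:
        # Parenthesize compound expressions so the new '|' does not mix
        # with an existing '&' at the same level.
        return f"{extra}|({old})"
    return f"{extra}|{old}"


print(widen_visibility("OLDLABEL"))
print(widen_visibility("A&B"))
```

Entries with an empty visibility are left unchanged here because they are not orphaned in the first place.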
0
MikeD On

You're going to have to read the underlying RFiles directly to do this. One way is to use the included PrintInfo admin utility. As a user that can read your table's files out of HDFS, run:

accumulo org.apache.accumulo.core.file.rfile.PrintInfo --dump [hdfs:///path/to/files/xxx.rf]

You'll have to find the files that correspond to your table, most likely by scanning the metadata table for the "file" column family. The specifics vary depending on which version of Accumulo you are using.
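Once the dump has been saved, the distinct visibility strings can be tallied with a short script. This is a minimal sketch that assumes each dumped entry is printed as `row cf:cq [visibility] timestamp deleted -> value`; check that against the output of your Accumulo version. The function name `count_visibilities` and the sample lines are hypothetical, and the pattern is a heuristic that will miscount rows or quoted visibility terms containing square brackets.

```python
import re
from collections import Counter

# Assumed entry shape (verify for your version):
#   row cf:cq [visibility] timestamp deleted -> value
ENTRY = re.compile(r" \[([^\[\]]*)\] -?\d+ (?:true|false)(?=\s|$)")


def count_visibilities(lines):
    """Return a Counter mapping each visibility expression to its entry count."""
    counts = Counter()
    for line in lines:
        match = ENTRY.search(line)
        if match:  # lines that are not entries (headers, summaries) are skipped
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines standing in for real dump output.
sample = [
    "row1 meta:name [ADMIN&AUDIT] 1400000000000 false -> alice",
    "row2 meta:name [ADMIN&AUDIT] 1400000000001 false -> bob",
    "row3 meta:name [] 1400000000002 false -> carol",
    "a header line that is not an entry",
]

for vis, n in count_visibilities(sample).most_common():
    print(n, vis or "(empty)")
```

For a real run, pass an open file holding the saved dump in place of `sample`.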