We are trying to run a simple write/scan against Accumulo (client jar 1.5.0) from a standalone Java main program (Maven shade executable), as shown below, on the AWS EC2 master (environment described below) using PuTTY:
public class AccumuloQueryApp {

  private static final Logger logger = LoggerFactory.getLogger(AccumuloQueryApp.class);

  public static final String INSTANCE = "accumulo"; // miniInstance
  public static final String ZOOKEEPERS = "ip-x-x-x-100:2181"; //localhost:28076

  private static Connector conn;

  static {
    // Accumulo
    Instance instance = new ZooKeeperInstance(INSTANCE, ZOOKEEPERS);
    try {
      conn = instance.getConnector("root", new PasswordToken("xxx"));
    } catch (Exception e) {
      logger.error("Connection", e);
    }
  }

  public static void main(String[] args) throws TableNotFoundException, AccumuloException, AccumuloSecurityException, TableExistsException {
    System.out.println("connection with : " + conn.whoami());

    BatchWriter writer = conn.createBatchWriter("test", ofBatchWriter());

    for (int i = 0; i < 10; i++) {
      Mutation m1 = new Mutation(String.valueOf(i));
      m1.put("personal_info", "first_name", String.valueOf(i));
      m1.put("personal_info", "last_name", String.valueOf(i));
      m1.put("personal_info", "phone", "983065281" + i % 2);
      m1.put("personal_info", "email", String.valueOf(i));
      m1.put("personal_info", "date_of_birth", String.valueOf(i));
      m1.put("department_info", "id", String.valueOf(i));
      m1.put("department_info", "short_name", String.valueOf(i));
      m1.put("department_info", "full_name", String.valueOf(i));
      m1.put("organization_info", "id", String.valueOf(i));
      m1.put("organization_info", "short_name", String.valueOf(i));
      m1.put("organization_info", "full_name", String.valueOf(i));
      writer.addMutation(m1);
    }
    writer.close();

    System.out.println("Writing complete ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");

    Scanner scanner = conn.createScanner("test", new Authorizations());
    System.out.println("Step 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
    scanner.setRange(new Range("3", "7"));
    System.out.println("Step 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
    scanner.forEach(e -> System.out.println("Key: " + e.getKey() + ", Value: " + e.getValue()));
    System.out.println("Step 3 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
    scanner.close();
  }

  public static BatchWriterConfig ofBatchWriter() {
    // Batch Writer Properties
    final int MAX_LATENCY = 1;
    final int MAX_MEMORY = 10000000;
    final int MAX_WRITE_THREADS = 10;
    final int TIMEOUT = 10;

    BatchWriterConfig config = new BatchWriterConfig();
    config.setMaxLatency(MAX_LATENCY, TimeUnit.MINUTES);
    config.setMaxMemory(MAX_MEMORY);
    config.setMaxWriteThreads(MAX_WRITE_THREADS);
    config.setTimeout(TIMEOUT, TimeUnit.MINUTES);

    return config;
  }
}
The connection is established correctly, but creating the BatchWriter fails, and it keeps retrying in a loop with the same error:
[impl.ThriftScanner] DEBUG: Error getting transport to ip-x-x-x-100:10011 : NotServingTabletException(extent:TKeyExtent(table:21 30, endRow:21 30 3C, prevEndRow:null))
When we run the same code (writing to Accumulo and reading from Accumulo) inside a Spark job and submit it to the YARN cluster, it runs perfectly. We are struggling to figure this out but have no clue. Please see the environment described below.
Cloudera CDH 5.8.2 on AWS environments (4 EC2 instances: one master and 3 children).
Consider the private IPs are like
- Master: x.x.x.100
- Child1: x.x.x.101
- Child2: x.x.x.102
- Child3: x.x.x.103
We have the following installation in CDH
Cluster (CDH 5.8.2)
- Accumulo 1.6 (Tracer not installed, Garbage Collector in Child2, Master in Master, Monitor in child3, Tablet Server in Master)
- HBase
- HDFS (master as name node, all 3 child as datanode)
- Kafka
- Spark
- YARN (MR2 Included)
- ZooKeeper
Hrm, that's very curious that it runs with Spark-on-YARN but not as a regular Java application. Usually, it's the other way around :)
I would verify that the JARs on the classpath of the standalone Java app match the JARs used by the Spark-on-YARN job as well as the Accumulo server classpath.
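One quick, purely illustrative way to check that from inside the standalone app is to print where the Accumulo client classes are actually loaded from. This snippet is only a sanity-check sketch (not part of the original code), and the Implementation-Version may print as null if the shaded JAR's manifest doesn't carry it:

  // Print the JAR that the Accumulo client classes are loaded from, plus the version
  // recorded in its manifest, to compare against the Spark-on-YARN job and the
  // Accumulo server classpath.
  Class<?> connectorClass = org.apache.accumulo.core.client.Connector.class;
  System.out.println("Connector loaded from: "
      + connectorClass.getProtectionDomain().getCodeSource().getLocation());
  System.out.println("Client Implementation-Version: "
      + connectorClass.getPackage().getImplementationVersion());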
If that doesn't help, try to increase the log4j level to DEBUG or TRACE and see if anything jumps out at you. If you have a hard time understanding what the logging is saying, feel free to send an email to [email protected] and you'll definitely have more eyes on the problem.
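For example, a minimal log4j.properties on the standalone app's classpath (a sketch assuming the log4j 1.x used by the 1.5/1.6 Accumulo clients; adjust to your actual logging setup) could look like:

  # Console appender with DEBUG enabled for the Accumulo client packages,
  # so ThriftScanner/BatchWriter activity (tablet locations, retries) becomes visible.
  log4j.rootLogger=INFO, console
  log4j.appender.console=org.apache.log4j.ConsoleAppender
  log4j.appender.console.layout=org.apache.log4j.PatternLayout
  log4j.appender.console.layout.ConversionPattern=%d{ISO8601} [%c] %-5p: %m%n
  log4j.logger.org.apache.accumulo=DEBUG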