Reading rows using AllRowsReader but starting from a specific row

Question

Asked by Pär Svanström on 11 June 2015 at 13:44

I have a batch job that reads through approximately 33 million rows in Cassandra, using the AllRowsReader as described in the Astyanax wiki:

new AllRowsReader.Builder<>(getKeyspace(), columnFamily)
            .withPageSize(100)
            .withIncludeEmptyRows(false)
            .withConcurrencyLevel(1)
            .forEachRow(
                row -> {
                    try {
                        return processRow(row);
                    } catch (Exception e) {
                        LOG.error("Error while processing row!", e);
                        return false;
                    }
                }
            )
            .build()
            .call();

If some sort of error stops the batch job, I would like to be able to pick up and continue reading from the row where it stopped, so that I don't have to start reading from the first row again. Is there any fast and simple way to do this?

Or is AllRowsReader not the right fit for this kind of task?

There is 1 answer

**Abhishek Garg** · Accepted Answer · 2016-02-11T20:38:16+00:00

Since nobody has answered, let me try this one. Cassandra uses a partitioner to determine which node a row is placed on. There are two main types of partitioner: 1) ordered and 2) unordered.

https://docs.datastax.com/en/cassandra/2.2/cassandra/architecture/archPartitionerAbout.html

With an ordered partitioner, rows are placed in lexicographic order of their keys. With an unordered partitioner, you have no way of knowing the order.
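To make the difference concrete, here is a minimal sketch that sorts the same keys two ways: lexicographically, and by a hash of the key. It is not Cassandra's or Astyanax's code. The class `PartitionerOrderSketch`, its helper methods, and the sample keys are hypothetical, and `md5Token` is only a simplified stand-in for the way a hashing partitioner such as RandomPartitioner derives a token from a key.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Cassandra or Astyanax.
public class PartitionerOrderSketch {

    // Simplified stand-in for a hashing partitioner:
    // the MD5 digest of the key, read as a non-negative integer.
    static BigInteger md5Token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Order an ordered partitioner would give: plain lexicographic key order.
    static List<String> inKeyOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Order a hashing partitioner would give: sorted by token, not by key.
    static List<String> inTokenOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparing(PartitionerOrderSketch::md5Token));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up row keys for the demonstration.
        List<String> keys = List.of("row-001", "row-002", "row-003", "row-004", "row-005");
        System.out.println("key order:   " + inKeyOrder(keys));
        System.out.println("token order: " + inTokenOrder(keys));
    }
}
```

Running `main` prints both orderings. The token ordering will generally look shuffled relative to the key ordering, which is why knowing a row key alone does not tell you where that row falls in a scan.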

Ordered partitioners are regarded as an anti-pattern in Cassandra because they make it difficult to distribute data evenly across the cluster. https://docs.datastax.com/en/cassandra/2.2/cassandra/planning/planPlanningAntiPatterns.html

I assume you are using an unordered partitioner. If so, there is currently no way to tell Cassandra to start reading from a particular row.

I hope this answers your question.
