We are running Cassandra on Amazon Keyspaces and connecting with the cassandra-driver for Node.js. The client works fine and we are able to perform create and update operations on the data. However, running a simple CQL query through the client returns no data (empty rows).
When I run the exact same query in the CQL editor on the AWS dashboard, it works fine and returns the required data.
Query:
SELECT * FROM <TABLE_NAME> WHERE product_id = '<PRODUCT_ID>' ALLOW FILTERING
Yes, you are using part of the partition key in your query statement, so Keyspaces filters the rows at the storage layer. After scanning a given number of rows it returns to the client. If the filter criteria matched nothing in that scan, the result is an empty page, that is, a page with no rows. This is a safeguard to avoid unbounded request time. An empty page does not mean there is no more data: you can continue to iterate over the pages until you reach the end of the iterator.
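To make the paging concrete, here is a minimal sketch of iterating over every page with the Node.js driver. It assumes a connected cassandra-driver `Client` and relies on the driver's `execute` options `prepare`, `fetchSize`, and `pageState`, plus the `pageState` property of the result. The helper name `fetchAllRows` and the default page size are made up for illustration. Treat it as a starting point, not a tuned implementation.

```javascript
// Hypothetical helper: collects rows from every page, including pages that
// come back empty because the filter matched nothing in that scan.
// `client` is assumed to be a connected cassandra-driver Client.
async function fetchAllRows(client, query, params, fetchSize = 1000) {
  const rows = [];
  let pageState; // undefined on the first request
  do {
    const result = await client.execute(query, params, {
      prepare: true,
      fetchSize,
      pageState,
    });
    // A page may contain zero rows while more pages still remain.
    for (const row of result.rows) {
      rows.push(row);
    }
    // pageState is empty (null/undefined) once the last page has been read.
    pageState = result.pageState;
  } while (pageState);
  return rows;
}
```

You would call it with the query from the question, passing the product id as a bound parameter, for example `await fetchAllRows(client, "SELECT * FROM <TABLE_NAME> WHERE product_id = ? ALLOW FILTERING", ["<PRODUCT_ID>"])`. Collecting everything in memory only makes sense for small result sets.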
The approach above is useful when grabbing the first X rows, but for full table scans I would recommend using AWS Glue.
In the following example we perform the same product lookup but with Spark and Glue. Under the hood, Spark parses the query and paginates through the results, while Glue provisions the memory and compute resources. In this job we export the results to S3. A full table scan can be terabytes in size. This architecture works for small or large tables since it uses serverless resources.
You can find the full export-to-S3 example and the Glue create script here.