Iceberg on Redshift Spectrum - Spectrum Scan Error


AWS has added support for Iceberg tables through Redshift Spectrum. I got it working, but once my Iceberg table receives some updates (say 30k), Spectrum queries against it break. I get this error:

[2024-01-30 18:31:01] [XX000] ERROR: Spectrum Scan Error: Retries exceeded
[2024-01-30 18:31:01] Detail:
[2024-01-30 18:31:01] -----------------------------------------------
[2024-01-30 18:31:01] error:  Spectrum Scan Error: Retries exceeded
[2024-01-30 18:31:01] code:      15001
[2024-01-30 18:31:01] context:   The Spectrum scan timed out, please retry.
[2024-01-30 18:31:01] query:     3259155
[2024-01-30 18:31:01] location:  dory_util.cpp:1579
[2024-01-30 18:31:01] process:   worker_thread [pid=8668]
[2024-01-30 18:31:01] -----------------------------------------------

In Athena, though, the same table can be queried in less than a minute. Also, if I run the OPTIMIZE command (https://docs.aws.amazon.com/athena/latest/ug/querying-iceberg-data-optimization.html), the query works again in Redshift Spectrum. But having to run OPTIMIZE all the time just to keep it working is impractical. I'd like to understand the problem more deeply: if anyone has more experience with this, could you help me at least understand why it happens? Here is the table DDL:

CREATE TABLE data_lake.orders (
  created_at string,
  id int,
  euro_amount double,
  local_amount double,
  local_currency string,
  pay_method string
) PARTITIONED BY (`__source_ts_ms`)
LOCATION 's3://xxxxx/iceberg-deltas-latest-updates/data_lake.db/beltegoed_order'
TBLPROPERTIES ( 'table_type'='iceberg', 'write_compression'='zstd' );

The table is created in Athena and is available in a Glue database; I then create an external schema in Redshift and access it through that. With insert-only loads it works well; the updates are the problem. I'm upserting these updates into the Iceberg table. The table has only 1M records and fewer than 30 files in total after the updates (data and metadata combined).
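
For reference, the compaction workaround mentioned above is a single statement in Athena. This is a minimal sketch, assuming the table name from the DDL and the syntax described on the linked Athena page. The second query assumes Athena's `$files` metadata table for Iceberg and is only a way to check which files the table currently references after the upserts:

```sql
-- Run in Athena, not in Redshift.
-- Rewrites the table's data files into a compacted layout (bin packing).
OPTIMIZE data_lake.orders REWRITE DATA USING BIN_PACK;

-- Assumption: the Iceberg "$files" metadata table is available for this table.
-- Lists the files referenced by the table's current snapshot.
SELECT * FROM "data_lake"."orders$files";
```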
