I'm trying to read a Delta table using the delta-rs library (Python). The table has millions of records, and we want to read it frequently through a REST API call, fetching only the specific records each request asks for.

Because the table is so large, read performance is poor: delta-rs reads the entire table and converts it to a pandas DataFrame before I can filter it for the request.

Is there a way to read only the records I need, instead of reading the whole table and then filtering (for example, with column pruning or predicate pushdown)?
Update: I followed this issue (https://github.com/delta-io/delta-rs/issues/631) and was able to get good performance by converting the DeltaTable to a PyArrow Dataset and then filtering it with DuckDB.