Filtering JDBC Ingestion with AWS Glue and PySpark


I am using AWS Glue to ingest data from a MySQL database. I know that I can pass a custom query when reading through PySpark's JDBC source. Does the same apply when ingesting from a table that a crawler added to the Data Catalog? Right now I am using this:

datasource = glueContext.create_dynamic_frame.from_catalog(database="db_name", table_name="table_name")

Is there any way to ingest only part of the table instead of the whole thing? For example, only the rows returned by select * from table where column_x > value.
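For reference, this is roughly what the PySpark JDBC route mentioned above looks like, as a minimal sketch rather than a confirmed Glue recipe. It bypasses the catalog table and reads through Spark's JDBC source, passing a subquery as the `dbtable` option so the WHERE clause runs on the MySQL side. The helper names (`build_filtered_dbtable`, `jdbc_options`, `read_filtered`) and the URL and credentials in the usage comment are hypothetical placeholders. The AWS docs also describe a `sampleQuery` key for Glue JDBC connection options; whether it applies to a given Glue version and to `from_catalog` should be checked there.

```python
import re

# Only plain identifiers are allowed, since they are interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_filtered_dbtable(table, column, min_value):
    """Return a subquery string usable as the Spark JDBC 'dbtable' option."""
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if isinstance(min_value, bool) or not isinstance(min_value, (int, float)):
        raise TypeError("min_value must be numeric")
    # Spark wraps 'dbtable' in its own SELECT, so the subquery needs an alias.
    return f"(SELECT * FROM {table} WHERE {column} > {min_value}) AS filtered"


def jdbc_options(url, user, password, table, column, min_value):
    """Assemble the option dict for spark.read.format('jdbc')."""
    return {
        "url": url,
        "user": user,
        "password": password,
        "dbtable": build_filtered_dbtable(table, column, min_value),
    }


def read_filtered(glue_context, options):
    """Read the filtered rows and convert them to a Glue DynamicFrame.

    Assumes this runs inside a Glue job, where awsglue is importable.
    """
    from awsglue.dynamicframe import DynamicFrame

    df = glue_context.spark_session.read.format("jdbc").options(**options).load()
    return DynamicFrame.fromDF(df, glue_context, "filtered_source")


# Hypothetical usage inside a Glue job (host and credentials are placeholders):
# opts = jdbc_options("jdbc:mysql://host:3306/db_name", "user", "secret",
#                     "table_name", "column_x", 100)
# datasource = read_filtered(glueContext, opts)
```

The trade-off is that the connection details have to be supplied in the job itself instead of coming from the crawler's catalog entry. Loading the full table with from_catalog and filtering afterwards also works, but it still pulls every row over JDBC first.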
