How to use the regex split filter in Data Fusion?


I'm using Cloud Data Fusion on Google Cloud Platform.

Am I supposed to put a regular expression in the Regex Path Filter field in the Advanced section of the GCS properties? For example: /[0-9]

But if I enter a value in the Regex Path Filter and run the data pipeline, I get this message: "Output records have not been generated for stage GCS. Please verify your logic, or try sending more data."

I would appreciate it if you could give me an example of what to write in the Regex Path Filter section.

Thank you for reading.


There is 1 answer

Alexandre Moraes (best answer)

Currently, there is an open issue in CDAP to update the documentation for the Regex Path Filter field.

According to the documentation, the Regex Path Filter is used only to filter files, by matching their paths against a regular expression.

For example, you can write gs://data_directory/*/file_prefix* to filter the files by prefix, or gs://data_directory/.*\.csv to filter them by extension. The Path field, by contrast, points to the GCS directory, such as gs://data_directory.
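If it helps, a candidate pattern can be tried locally before running the pipeline. The sketch below is only an approximation: it uses Python's `re` module rather than the Java regex engine that Data Fusion runs on, it assumes the filter has to match the whole file path rather than a part of it, and the object paths and the `filter_paths` helper are made up for illustration. Under those assumptions, a short pattern like /[0-9] matches nothing, which would be consistent with the "Output records have not been generated" message. The prefix example above also matches nothing when read as a strict regex, because `*` repeats the preceding character instead of acting as a wildcard; `.*` is the regex equivalent.

```python
import re

# Hypothetical object paths, invented for illustration only.
paths = [
    "gs://data_directory/2020/file_prefix_01.csv",
    "gs://data_directory/2020/other_01.csv",
    "gs://data_directory/2020/file_prefix_02.json",
]


def filter_paths(pattern, candidates):
    """Keep the candidates that the pattern matches in full.

    Full-path matching is an assumption about how the filter is applied.
    """
    compiled = re.compile(pattern)
    return [p for p in candidates if compiled.fullmatch(p)]


# Filter by extension: everything under the directory that ends in .csv.
by_extension = filter_paths(r"gs://data_directory/.*\.csv", paths)

# Filter by prefix, written with regex wildcards (.*).
by_prefix = filter_paths(r"gs://data_directory/.*/file_prefix.*", paths)

# The same idea written with glob-style stars; as a regex, "/*" means
# "zero or more slashes", so no sample path matches.
glob_style = filter_paths(r"gs://data_directory/*/file_prefix*", paths)

# The pattern from the question only matches a two-character string.
too_short = filter_paths(r"/[0-9]", paths)

# Padding it with .* lets it match any path with a digit after a slash.
any_digit = filter_paths(r".*/[0-9].*", paths)

for name, result in [
    ("by_extension", by_extension),
    ("by_prefix", by_prefix),
    ("glob_style", glob_style),
    ("too_short", too_short),
    ("any_digit", any_digit),
]:
    print(name, result)
```

If the real filter turns out to match only part of the path, the shorter patterns would behave differently, so treat this as a quick sanity check rather than a guarantee.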