I have a MEAN stack application that connects to customer databases and third-party data. From the JS front end I need to be able to read Parquet files and large (big-data) CSV files. Could you please confirm whether my understanding below is correct?
- I cannot read Parquet files directly with the Arrow libraries (because of the issue JIRA#2786), so I have to use something like parquetjs-lite instead.
- To read a big-data CSV into apache-arrow, I first have to use Python (pyarrow) to convert the CSV to the Arrow format (as in here) and then read the resulting Arrow file in my JS application. If this is correct:
  a) Can I convert any third-party CSV to Arrow, or do I need a predefined schema ahead of time?
  b) Are nulls and NaNs allowed in the CSV?
Thanks