What is the best way to feed training data from a parquet file to a TensorFlow/Keras model?

I have a training dataset stored on S3 in parquet format. I wish to load this data into a notebook (on a Databricks cluster) and train a Keras model on it. There are a few ways I can think of to train a Keras model on this dataset (rough sketches of each option follow the list):

  • read the parquet file from S3 in batches (e.g. with Pandas or pyarrow) and feed these batches to the model
  • use the TensorFlow I/O APIs (this might require copying the parquet from S3 to the notebook's local environment)
  • use the Petastorm package (from Uber) - this also might require copying the parquet to the notebook's local environment
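
For the first option, this is roughly what I have in mind: stream record batches with pyarrow/s3fs and wrap them in a tf.data generator. The bucket path, the "label" column, and NUM_FEATURES below are placeholders for my actual data:

```python
import pyarrow.parquet as pq
import s3fs
import tensorflow as tf

NUM_FEATURES = 10  # placeholder for the real feature count

def parquet_batches(path, batch_size=1024):
    fs = s3fs.S3FileSystem()
    with fs.open(path, "rb") as f:
        pf = pq.ParquetFile(f)
        # iter_batches streams the file one record batch at a time,
        # so the whole dataset never has to fit in memory
        for batch in pf.iter_batches(batch_size=batch_size):
            df = batch.to_pandas()
            features = df.drop(columns=["label"]).to_numpy("float32")
            labels = df["label"].to_numpy("float32")
            yield features, labels

dataset = tf.data.Dataset.from_generator(
    lambda: parquet_batches("my-bucket/train.parquet"),
    output_signature=(
        tf.TensorSpec(shape=(None, NUM_FEATURES), dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.float32),
    ),
).prefetch(tf.data.AUTOTUNE)

# model.fit(dataset, epochs=10)
```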
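
For the second option, a sketch using tensorflow-io's parquet reader. As far as I can tell, from_parquet yields one row at a time as a dict of column name to scalar tensor; the /dbfs path and the column names here are assumptions:

```python
import tensorflow as tf
import tensorflow_io as tfio

rows = tfio.IODataset.from_parquet("/dbfs/tmp/train.parquet")

def to_xy(row):
    # "label", "f1", "f2" are placeholder column names; depending on
    # the tensorflow-io version the dict keys may be bytes (row[b"label"])
    label = tf.cast(row["label"], tf.float32)
    features = tf.stack([tf.cast(row["f1"], tf.float32),
                         tf.cast(row["f2"], tf.float32)])
    return features, label

dataset = rows.map(to_xy).batch(1024).prefetch(tf.data.AUTOTUNE)
# model.fit(dataset, epochs=10)
```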
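
For the third option, Petastorm's make_batch_reader seems to accept an s3:// URL for plain parquet directly, so the local copy may not actually be needed; the URL and the "features"/"label" column names are placeholders:

```python
import tensorflow as tf
from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset

# make_batch_reader reads plain (non-Petastorm) parquet stores
with make_batch_reader("s3://my-bucket/train.parquet") as reader:
    # each element is a namedtuple of arrays, one field per parquet column;
    # Keras wants plain tuples, hence the map
    dataset = make_petastorm_dataset(reader)
    dataset = dataset.map(lambda batch: (batch.features, batch.label))
    # model.fit(dataset, epochs=1)  # train while the reader is still open
```

On Databricks there is also petastorm.spark.make_spark_converter, which caches a Spark DataFrame and hands it back as a tf.data dataset; that might be the more natural fit for a Databricks cluster.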

What is the best way to train a model in such a case, so that the training can later scale to larger training datasets?
