Preprocessing large data in Databricks Community Edition

I have a 16 GB dataset that I want to use in Databricks. However, the DBFS limit in Community Edition is 10 GB. Could you please help me preprocess the data so I can move it from the driver to DBFS?
The simplest option is not to use DBFS at all (it's designed only for temporary data), but to host the data and results in your own environment, such as an AWS S3 bucket or ADLS (which could mean higher transfer costs). Spark can then read the data in place, so nothing has to pass through the driver or DBFS.
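A minimal sketch of that approach, assuming the data sits in a hypothetical S3 bucket; the bucket name, path, and credentials are placeholders. Community Edition has no instance profiles, so the access keys go into the Hadoop configuration (`sc` and `spark` are predefined in Databricks notebooks):

```python
# Placeholder credentials -- supply your own.
ACCESS_KEY = "<your-aws-access-key>"
SECRET_KEY = "<your-aws-secret-key>"

# Configure the S3A connector with your credentials.
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", ACCESS_KEY)
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", SECRET_KEY)

# Spark reads the data directly from S3 in parallel;
# it is never copied to the driver or to DBFS.
df = spark.read.csv("s3a://my-bucket/path/to/dataset/",
                    header=True, inferSchema=True)
df.show(5)
```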
If you can't do that, then the solution depends on other factors, such as the input file format and whether it is compressed or uncompressed.
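For example, if the input is uncompressed text, rewriting it as a compressed columnar format is often enough to get under the 10 GB limit. A minimal sketch, assuming a hypothetical 16 GB CSV on the driver's local disk (Community Edition clusters are single-node, so `file:/` paths are visible to Spark) and a placeholder DBFS target path:

```python
local_path = "file:/tmp/big_dataset.csv"            # hypothetical source file
dbfs_path = "dbfs:/FileStore/big_dataset_parquet"   # hypothetical target

# Read the CSV from local disk.
df = (spark.read
      .option("header", True)
      .option("inferSchema", True)
      .csv(local_path))

# Rewrite as snappy-compressed Parquet. Columnar compression often
# shrinks text data considerably, which may bring 16 GB of CSV
# under the 10 GB DBFS limit.
(df.write
   .mode("overwrite")
   .option("compression", "snappy")
   .parquet(dbfs_path))
```

If the result still doesn't fit, dropping unused columns or filtering rows before the write will reduce the output further.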