DataBricks (10.2) Undocumented Case Sensitivity Related to Feature Store Database/Table Access

Question

400 views Asked by StephenDonaldHuffPhD At 21 March 2022 at 13:31

I created an input table intended to feed DataBricks Feature Store, mounted it (in Linux), and loaded it as prescribed in the DataBricks documentation (from their "RawDatasets" code example):

SourceDataFrameName_df = spark \
  .read \
  .format('delta') \
  .load("dbfs:/mnt/path/dev/version/database_name.tablename_extension")

However, this call fails with a "not-found"/"doesn't exist" error when it tries to locate the "database_name.tablename_extension" resource. That is how the name appears everywhere in the DataBricks GUI: entirely in lower case.

I have spent a lot of time reviewing the DataBricks documentation, SO, and my own DataBricks setup, but I cannot find the cause of this error. Please assist.

Original Q&A

There is 1 answer

**StephenDonaldHuffPhD** · Accepted Answer · 2022-03-21T13:31:28+00:00

This is an as-yet undocumented issue related to the nature of DataBricks Feature Store operations. Since DataBricks is largely pass-through (using registered views rather than storing the source data), the mount is a key issue here.

It may not be highlighted adequately in their documentation because it is really a Linux matter: that operating system is case-sensitive, whereas DataBricks appears to be largely case-agnostic. In this example, the original database/Linux engineer created the table/mount this way:

database_name.TableName_Extension

Since the mount references a Linux path, the path is case-sensitive, too. So, the proper way to load this source dataset from such a mount would be:

SourceDataFrameName_df = spark \
  .read \
  .format('delta') \
  .load("dbfs:/mnt/path/dev/version/database_name.TableName_Extension")

The problem is that the exact casing may be unknown (and effectively unknowable from the GUI) if the DataBricks developer/engineer and the database/Linux developer/engineer are not the same person. For example, it might have been labeled "database_name.Tablename_extension" or "database_name.TableName_EXTENSION" or any other combination of upper and lower case.

This information isn't difficult to find if the user knows to look for it. Beware.
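One way to look for it is to list the parent directory of the mount and compare names with case ignored. The following is only a minimal sketch, not an official DataBricks recipe: the helper name `match_ignoring_case` is hypothetical, and the `listing` values are stand-ins for whatever actually sits under the mount.

```python
def match_ignoring_case(wanted, names):
    """Return the entries in `names` that equal `wanted` when case is ignored."""
    target = wanted.rstrip("/").casefold()
    return [n.rstrip("/") for n in names if n.rstrip("/").casefold() == target]


# Stand-in for the names found under the parent directory of the mount
# (hypothetical values; directory names are shown with a trailing slash).
listing = ["database_name.TableName_Extension/", "database_name.other_table/"]

matches = match_ignoring_case("database_name.tablename_extension", listing)
print(matches)
```

In a notebook where `dbutils` is available, the listing could presumably be built with `[f.name for f in dbutils.fs.ls("dbfs:/mnt/path/dev/version/")]`, and the single match (if there is exactly one) appended to the parent path before calling `.load(...)`. More than one match means the name is ambiguous and has to be confirmed with whoever created the mount.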
