I created an input table intended to feed the Databricks Feature Store, mounting it (in Linux) and calling it as prescribed in the Databricks documentation (from their "RawDatasets" code example):
SourceDataFrameName_df = spark \
.read \
.format('delta') \
.load("dbfs:/mnt/path/dev/version/database_name.tablename_extension")
However, this call fails with a "not found" / "does not exist" error when locating the "database_name.tablename_extension" resource. That is exactly how the name displays everywhere within the Databricks GUI: all lower-case.
I have spent much time reviewing the Databricks documentation and Stack Overflow, and reviewing my Databricks system setup, but cannot find the cause of this error. Please assist.
This is an as-yet undocumented issue related to the nature of Databricks Feature Store operations. Since Databricks is largely pass-through (using registered views rather than storing the source data), the mount is the key issue here.
This issue may not be documented/highlighted adequately in the documentation because it is actually a Linux issue: that operating system is case-sensitive, whereas Databricks itself appears to be largely case-insensitive. In this example, the original database/Linux engineer created the table/mount this way:
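The original creation code is not shown here; this is a sketch of what it plausibly looked like, where the mixed-case name `Database_Name.TableName_Extension` is a hypothetical stand-in for whatever casing was actually used:

```python
# Hypothetical reconstruction -- the exact casing the original engineer
# chose is an assumption; only the fact that it is mixed-case matters.
def create_source_table(spark):
    (
        spark.createDataFrame([(1, "example")], ["id", "value"])
        .write
        .format("delta")
        .save("dbfs:/mnt/path/dev/version/Database_Name.TableName_Extension")
    )
```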
Since the mount references a Linux path, the path is case-sensitive, too. So, the proper way to load this source dataset from such a mount would be:
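Assuming the hypothetical mixed-case name above, the load call must repeat the exact on-disk casing (wrapped in a function here so the snippet is self-contained; `spark` is the usual Databricks session):

```python
# The path must match the on-disk casing exactly; the mixed-case name
# below is a hypothetical stand-in for whatever casing was actually used.
def load_source_dataframe(spark):
    return (
        spark.read
        .format("delta")
        .load("dbfs:/mnt/path/dev/version/Database_Name.TableName_Extension")
    )
```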
The problem is that this case-sensitive nomenclature may be unknown (and unknowable) if the Databricks developer/engineer and the database/Linux developer/engineer are not the same person! For example, the table might have been named "database_name.Tablename_extension", "database_name.TableName_EXTENSION", or any other combination thereof.
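The underlying behavior is easy to reproduce outside Databricks. This small sketch (plain Python, no Spark, using a temporary directory to stand in for the mount) shows a case-sensitive Linux filesystem rejecting a lowercased lookup of a mixed-case name, and how simply listing the directory reveals the true casing:

```python
import os
import tempfile

# Create a file with a mixed-case name; the directory stands in for the mount.
mount_dir = tempfile.mkdtemp()
true_name = "Database_Name.TableName_Extension"  # hypothetical casing
open(os.path.join(mount_dir, true_name), "w").close()

# The exact casing is found; on a case-sensitive (Linux) filesystem,
# the lowercased name is not.
print(os.path.exists(os.path.join(mount_dir, true_name)))          # True
print(os.path.exists(os.path.join(mount_dir, true_name.lower())))  # False on Linux

# Listing the directory reveals the actual on-disk casing.
print(os.listdir(mount_dir))
```

On a real cluster, listing the mount directory (for example via the `/dbfs` FUSE path or `dbutils.fs.ls`) serves the same purpose of discovering the exact casing.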
Obviously, this information isn't difficult to find once the user knows to look for it. Beware.