I am trying out Mosaic on Databricks.
Previously I was able to run the Mosaic library on DBR 11.3. However, Volumes are not supported on DBR 11.3.
So I switched to DBR 13.3 Standard, but then Mosaic could not be imported.
I installed the library as the documentation instructs. Here is the install and import snippet, exactly as in the documentation.
# Installation
%pip install databricks-mosaic --quiet # <- Mosaic 0.3 series
# %pip install "databricks-mosaic<0.5,>=0.4" --quiet # <- Mosaic 0.4 series (as available)
dbutils.library.restartPython()
# Setup libs
# -- configure AQE for more compute heavy operations
# - choose option-1 or option-2 below, essential for REPARTITION!
# spark.conf.set("spark.databricks.optimizer.adaptive.enabled", False) # <- option-1: turn off completely for full control
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", False) # <- option-2: just tweak partition management
spark.conf.set("spark.sql.shuffle.partitions", 1_024) # <-- default is 200
# -- import databricks + spark functions
from pyspark.sql import functions as F
from pyspark.sql.functions import col, udf, explode, to_json, lit
from pyspark.sql.types import *
# -- setup mosaic
import mosaic as mos
mos.enable_mosaic(spark, dbutils)
# mos.enable_gdal(spark) # <- not needed for this example
# -- other imports
import os
import pathlib
import requests
import warnings
warnings.simplefilter("ignore")
This was the only error I could see:
java.lang.Exception: DEPRECATION ERROR:
Never mind, I just found out that the error message was folded (collapsed), so I had only seen its first line. I should be using a DBR with ML. DBR 13.3 ML works fine.
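For anyone who hits the same thing, here is a minimal sketch of how one might check whether a cluster is on an ML runtime before calling `mos.enable_mosaic`. The helper name `describe_runtime` is hypothetical, and the sample strings such as `13.3.x-cpu-ml-scala2.12` reflect my assumption about how the runtime version string is laid out. Reading that string from the `spark.databricks.clusterUsageTags.sparkVersion` conf key is also an assumption, so verify it on your own cluster.

```python
import re


def describe_runtime(runtime_version: str) -> dict:
    """Parse a runtime string (assumed format, e.g. '13.3.x-cpu-ml-scala2.12').

    Hypothetical helper, not part of Mosaic or Databricks.
    """
    match = re.match(r"^(\d+)\.(\d+)", runtime_version)
    if match is None:
        raise ValueError(f"Unrecognized runtime string: {runtime_version!r}")
    # Split on '-' so that 'ml' is matched as a whole token, not a substring.
    tokens = runtime_version.lower().split("-")
    return {
        "major": int(match.group(1)),
        "minor": int(match.group(2)),
        "is_ml": "ml" in tokens,
    }


# In a notebook (assumed conf key, check it on your cluster):
# info = describe_runtime(
#     spark.conf.get("spark.databricks.clusterUsageTags.sparkVersion")
# )
# if info["major"] >= 13 and not info["is_ml"]:
#     print("Standard runtime detected; Mosaic may refuse to load here.")
```

The check only inspects the version string, so it says nothing about which Mosaic release is installed. Expanding the folded error message is still the most reliable way to see what Mosaic actually expects.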