Hyperopt spark 3.0 issues

1.2k views Asked by At

I am running runtime 8.1 (includes Apache Spark 3.1.1, Scala 2.12) trying to get hyperopt working as defined by

https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/hyperopt- spark-mlflow-integration.html

py4j.Py4JException: Method maxNumConcurrentTasks([]) does not exist

when I try to

spark_trials = SparkTrials()

Is there anything special I need to do to get this working?

Here is the cluster I am using

{
    "autoscale": {
        "min_workers": 1,
        "max_workers": 2
    },
    "cluster_name": "mlops_tiny_ml",
    "spark_version": "8.2.x-cpu-ml-scala2.12",
    "spark_conf": {},
    "aws_attributes": {
        "first_on_demand": 1,
        "availability": "SPOT_WITH_FALLBACK",
        "zone_id": "us-west-2b",
        "instance_profile_arn": "arn:aws:iam::112437402463:instance-profile/databricks_instance_role_s3",
        "spot_bid_price_percent": 100,
        "ebs_volume_type": "GENERAL_PURPOSE_SSD",
        "ebs_volume_count": 3,
        "ebs_volume_size": 100
    },
    "node_type_id": "m4.large",
    "driver_node_type_id": "m4.large",
    "ssh_public_keys": [],
    "custom_tags": {},
    "spark_env_vars": {},
    "autotermination_minutes": 120,
    "enable_elastic_disk": false,
    "cluster_source": "UI",
    "init_scripts": [],
    "cluster_id": "0xxxxxt404"
}

this is the code I am using https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/hyperopt-model-selection.html

1

There are 1 answers

4
Alex Ott On BEST ANSWER

Hyperopt is only included into the DBR ML runtimes, not into the stock runtimes. You can check it by looking into release notes for each of runtimes: DBR 8.1 vs. DBR 8.1 ML.

And from the docs:

Databricks Runtime for Machine Learning incorporates MLflow and Hyperopt, two open source tools that automate the process of model selection and hyperparameter tuning.