Adding machine-type parameters in Google Cloud Python SDK create_cluster() function

Google Cloud's Python docs samples include a script (python-docs-samples/dataproc/submit_job_to_cluster.py) that contains the following function:

def create_cluster(dataproc, project, zone, region, cluster_name):
    print('Creating cluster...')
    zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
        project, zone)
    cluster_data = {
        'projectId': project,
        'clusterName': cluster_name,
        'config': {
            'gceClusterConfig': {
                'zoneUri': zone_uri
            }
        }
    }
    result = dataproc.projects().regions().clusters().create(
        projectId=project,
        region=region,
        body=cluster_data).execute()
    return result

I was wondering if it's possible to specify the machine types for the master and worker nodes of the cluster in this function?

There is 1 answer

Answer by tix

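The following should work. It adds masterConfig and workerConfig entries, each with a machineTypeUri, next to gceClusterConfig inside config: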

def create_cluster(dataproc, project, zone, region, cluster_name):
    print('Creating cluster...')
    zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
        project, zone)
    cluster_data = {
        'projectId': project,
        'clusterName': cluster_name,
        'config': {
            'gceClusterConfig': {
                'zoneUri': zone_uri
            },
            # Machine type for the master node
            'masterConfig': {
                'machineTypeUri': 'n1-standard-1'
            },
            # Machine type for the worker nodes
            'workerConfig': {
                'machineTypeUri': 'n1-standard-4'
            }
        }
    }
    result = dataproc.projects().regions().clusters().create(
        projectId=project,
        region=region,
        body=cluster_data).execute()
    return result

https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters#ClusterConfig
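
If you would rather pass the machine types in than hard-code them, a minimal sketch could look like the following. The helper name build_cluster_data and the parameters master_machine_type and worker_machine_type are hypothetical (they are not part of the Google sample), and the defaults are just the values used in the answer above. It assumes, as the answer does, that the short machine type name is accepted for machineTypeUri.

```python
def build_cluster_data(project, zone, cluster_name,
                       master_machine_type='n1-standard-1',
                       worker_machine_type='n1-standard-4'):
    """Build the request body for clusters().create().

    Hypothetical helper: the name and the two machine-type
    parameters are illustrative, not part of the Google sample.
    """
    zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
        project, zone)
    return {
        'projectId': project,
        'clusterName': cluster_name,
        'config': {
            'gceClusterConfig': {
                'zoneUri': zone_uri
            },
            # Machine type for the master node
            'masterConfig': {
                'machineTypeUri': master_machine_type
            },
            # Machine type for the worker nodes
            'workerConfig': {
                'machineTypeUri': worker_machine_type
            }
        }
    }


def create_cluster(dataproc, project, zone, region, cluster_name,
                   master_machine_type='n1-standard-1',
                   worker_machine_type='n1-standard-4'):
    print('Creating cluster...')
    cluster_data = build_cluster_data(
        project, zone, cluster_name,
        master_machine_type, worker_machine_type)
    # Same API call as in the original sample; only the body changes.
    return dataproc.projects().regions().clusters().create(
        projectId=project,
        region=region,
        body=cluster_data).execute()
```

Building the body in a separate function means you can inspect or print the dictionary before sending it, without needing an authenticated dataproc client.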