Google Cloud's Python docs include a sample script (python-docs-samples/dataproc/submit_job_to_cluster.py) that has the following function:
    def create_cluster(dataproc, project, zone, region, cluster_name):
        print('Creating cluster...')
        zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
            project, zone)
        cluster_data = {
            'projectId': project,
            'clusterName': cluster_name,
            'config': {
                'gceClusterConfig': {
                    'zoneUri': zone_uri
                }
            }
        }
        result = dataproc.projects().regions().clusters().create(
            projectId=project,
            region=region,
            body=cluster_data).execute()
        return result
Is it possible to specify the machine types for the cluster's master and worker nodes in this function?
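Yes, this should work. The 'config' entry in the request body is a ClusterConfig, which accepts 'masterConfig' and 'workerConfig' entries alongside 'gceClusterConfig', and each of those takes a 'machineTypeUri'. The fields are documented here: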
https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters#ClusterConfig
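
As a rough sketch of what that could look like (not tested against a live project): the helper name build_cluster_data, its parameters, and the default machine types and worker count below are my own placeholders, not part of the sample script. The 'masterConfig', 'workerConfig', 'numInstances' and 'machineTypeUri' field names come from the ClusterConfig reference linked above.

```python
def build_cluster_data(project, zone, cluster_name,
                       master_type='n1-standard-4',
                       worker_type='n1-standard-4',
                       num_workers=2):
    """Build the request body for clusters().create().

    Hypothetical helper: the name, parameters and defaults are
    illustrative assumptions, not part of submit_job_to_cluster.py.
    """
    zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(
        project, zone)

    def machine_type_uri(machine_type):
        # Full machine type URL, built the same way as zone_uri above.
        return '{}/machineTypes/{}'.format(zone_uri, machine_type)

    return {
        'projectId': project,
        'clusterName': cluster_name,
        'config': {
            'gceClusterConfig': {
                'zoneUri': zone_uri
            },
            # Master node settings.
            'masterConfig': {
                'numInstances': 1,
                'machineTypeUri': machine_type_uri(master_type)
            },
            # Worker node settings.
            'workerConfig': {
                'numInstances': num_workers,
                'machineTypeUri': machine_type_uri(worker_type)
            }
        }
    }


def create_cluster(dataproc, project, zone, region, cluster_name):
    print('Creating cluster...')
    cluster_data = build_cluster_data(project, zone, cluster_name)
    # Same API call as in the original sample, only the body differs.
    return dataproc.projects().regions().clusters().create(
        projectId=project,
        region=region,
        body=cluster_data).execute()
```

Keeping the body construction in its own function means you can print and inspect the dict before sending it. Pick machine types that actually exist in the zone you are using.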