I'm trying to use a specific (non-default) service account for running kfp pipelines in Vertex AI. JSON keys are not an option.
Ideally, the code gets both the project ID and the credentials using google.auth.default(), as suggested in the google.auth user guide.
So far, I've tried:
- Using the deprecated kfp.v2.google.client.AIPlatformClient: client instantiated with the project ID specified, and the pipeline run with create_run_from_job_spec and the service_account keyword argument
- Using google.cloud.aiplatform.pipeline_jobs.PipelineJob: object instantiated with the project ID, and the pipeline run with submit and the service_account kwarg
- Creating a new run from the cloud UI (using the JSON file created by compiling the pipeline) with the service account specified
I've tried all three with both the actual pipeline (running on custom-built containers) and a minimal working example (using lightweight Python components). In all cases, when I run creds, project = google.auth.default() and then print the project and creds.service_account_email, I get a project ID I don't recognize (always the same one in all cases) and default for the service account email.
I think I must be doing something wrong, but I can't figure out what. It seems like the configuration I'm passing to the pipeline run isn't being used at all.
For reference, the MWE:
from kfp.v2 import dsl

@dsl.component(packages_to_install=['google-auth'])
def check_auth(name: str) -> str:
    # Imports go inside the function body for lightweight components.
    import google.auth
    creds, project = google.auth.default()
    print(f'Project is: {project}')
    print(f'Got creds for: {creds.service_account_email}')
    return project

@dsl.pipeline(
    name='adc-mwe-pipeline'
)
def pipeline() -> str:
    auth_check = check_auth(name='name')
    return auth_check.output

from google.cloud.aiplatform import pipeline_jobs
from kfp.v2 import compiler

# Compile the pipeline to a job spec, then submit it with the desired service account.
compiler.Compiler().compile(pipeline_func=pipeline, package_path='mwe.json')
start_pipeline = pipeline_jobs.PipelineJob(
    display_name='mwe',
    template_path='mwe.json',
    location='some-location',
    project='my-project',
    enable_caching=False
)
start_pipeline.submit(service_account="my-service-account")
I figured out that the correct way to use Application Default Credentials is to not request credentials explicitly at all, and to let the client libraries pick them up on their own.
So, for example, with BigQuery:
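Something along these lines, as a minimal sketch that assumes the google-cloud-bigquery package is installed. The run_query helper and its client parameter are made up for illustration; the only point is that bigquery.Client() is called without a credentials argument.

```python
def run_query(sql, client=None):
    """Run a query and return its rows as a list.

    `client` is a hypothetical injection point (handy for tests). When it is
    omitted, a BigQuery client is built without a credentials argument, so
    the library resolves Application Default Credentials by itself.
    """
    if client is None:
        # Imported lazily so this module loads even without the package.
        from google.cloud import bigquery
        client = bigquery.Client()  # no credentials= passed
    return list(client.query(sql).result())


# Example call (needs real credentials in the environment):
#   rows = run_query("SELECT 1 AS x")
```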
Running this on a Compute Engine instance, or in a component of a pipeline, will use the credentials of the service account attached to that instance, or of the service account used to submit the pipeline (as in the question).
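To check which identity is really in use, it may help to refresh the credentials before reading the email. As far as I understand the google-auth documentation, service_account_email on Compute Engine credentials is not guaranteed to be set until refresh has been called, which might be why the question saw default. A rough sketch, assuming google-auth and requests are installed; resolve_service_account_email is a hypothetical helper, not part of any library:

```python
def resolve_service_account_email(creds, request=None):
    """Return the service account email, refreshing the credentials if needed.

    `creds` is expected to be the first value returned by
    google.auth.default(); `request` is optional and mainly useful for tests.
    """
    email = getattr(creds, "service_account_email", None)
    if email in (None, "default"):
        if request is None:
            # Imported lazily; this transport needs the `requests` package.
            import google.auth.transport.requests
            request = google.auth.transport.requests.Request()
        creds.refresh(request)  # fetches a token and, on GCE, the real email
        email = getattr(creds, "service_account_email", None)
    return email


# Example call inside a component or on the instance:
#   import google.auth
#   creds, project = google.auth.default()
#   print(resolve_service_account_email(creds))
```

Credentials that already report a concrete email are returned untouched, so the extra network call only happens when the placeholder value is seen.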
Hope this helps someone else. Quite frustrating that it isn't documented clearly.