How do you scale Google Cloud Document AI processing?

Question

586 views Asked by Kevin Eid At 30 July 2020 at 11:01

The guide at https://cloud.google.com/document-ai/docs/process-forms shows examples of processing single files. In most cases, though, companies have whole buckets of documents. How do you scale Document AI processing in that case? Do you use Document AI in conjunction with Spark, or is there another way?

There are 2 answers

**Kevin Eid** · Answer 1 · 2020-07-30T11:24:45+00:00

I could only find the following: `batch_process_documents` processes many documents in one asynchronous request, and the results get saved in Cloud Storage.

From there, I think we can parametrise the job with a bucket prefix as its input path and distribute the work over several machines.

All of that could be orchestrated via Airflow, for example.
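To make the distribution idea concrete, here is a minimal sketch in plain Python of how a list of document URIs could be cut into batches and spread over workers. Everything in it is an assumption for illustration: the helper names `chunk_uris` and `assign_round_robin`, the bucket path, and the batch size are hypothetical, and the real per-request file limit should be checked against the Document AI quotas. Each batch would then become one batch processing request, for example one Airflow task per batch.

```python
from typing import Dict, Iterator, List


def chunk_uris(uris: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `uris`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(uris), batch_size):
        yield uris[start:start + batch_size]


def assign_round_robin(
    batches: List[List[str]], num_workers: int
) -> Dict[int, List[List[str]]]:
    """Spread batches over workers 0..num_workers-1 in round-robin order."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    plan: Dict[int, List[List[str]]] = {w: [] for w in range(num_workers)}
    for index, batch in enumerate(batches):
        plan[index % num_workers].append(batch)
    return plan


# Hypothetical input: in practice the URIs would come from listing the bucket.
uris = [f"gs://my-bucket/invoices/doc-{i}.pdf" for i in range(7)]
batches = list(chunk_uris(uris, batch_size=3))
plan = assign_round_robin(batches, num_workers=2)

for worker, assigned in plan.items():
    print(worker, [len(batch) for batch in assigned])
```

Round-robin assignment is only one option; splitting by sub-prefix (one prefix per worker) would work just as well if the bucket layout is already balanced.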

**Holt Skinner** · Answer 2 · 2022-08-02T20:57:37+00:00

You will need to use Batch Processing to handle multiple documents at once with Document AI.

This page in the Cloud Documentation shows how to make Batch Processing requests with REST and the Client Libraries.

https://cloud.google.com/document-ai/docs/send-request#batch-process

This codelab also illustrates how to do this in Python with the OCR Processor. https://codelabs.developers.google.com/codelabs/docai-ocr-python
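As a rough illustration of what a batch request looks like, the sketch below builds the URL and JSON body for a REST `batchProcess` call that reads every document under a Cloud Storage prefix. The helper names, project, processor ID, and bucket paths are hypothetical, and the field names reflect my understanding of the v1 REST API, so verify them against the documentation page linked above. The code only constructs the request; sending it would still need an authenticated HTTP client, or the client library's `batch_process_documents` method instead.

```python
import json
from typing import Any, Dict


def batch_process_url(project_id: str, location: str, processor_id: str) -> str:
    """Build the regional batchProcess endpoint for one processor."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:batchProcess"
    )


def batch_process_body(input_prefix: str, output_uri: str) -> Dict[str, Any]:
    """Build a request body that processes all files under `input_prefix`."""
    return {
        # Input: every document found under this Cloud Storage prefix.
        "inputDocuments": {"gcsPrefix": {"gcsUriPrefix": input_prefix}},
        # Output: results are written as files under this Cloud Storage URI.
        "documentOutputConfig": {"gcsOutputConfig": {"gcsUri": output_uri}},
    }


# Hypothetical identifiers, for illustration only.
url = batch_process_url("my-project", "us", "my-processor-id")
body = batch_process_body("gs://my-bucket/invoices/", "gs://my-bucket/output/")

print(url)
print(json.dumps(body, indent=2))
```

Because the call is asynchronous, the response is a long-running operation rather than the parsed documents; the results appear under the output URI once the operation finishes.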
