AWS transcription job does not complete after lambda returns

I am trying to launch an async transcription job inside a Lambda. I have a CloudWatch event configured to trigger on completion of the transcription job, so that I can perform some action in a different Lambda. The problem is that the async transcription job appears to launch successfully, with the following jobResult in the log, but the job never completes and the job-completed event is never triggered.

jobResult = java.util.concurrent.CompletableFuture@481a996b[Not completed, 1 dependents]

My code is along the following lines:

public class APIGatewayTranscriptHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        S3Client s3Client = S3Client.create();
        String fileUrl = s3Client.utilities().getUrl(GetUrlRequest.builder().bucket("srcBucket").key("fileName").build()).toString();
        Media media = Media.builder().mediaFileUri(fileUrl).build();

        StartTranscriptionJobRequest request = StartTranscriptionJobRequest.builder()
                .languageCode(LanguageCode.ES_ES)
                .media(media)
                .outputBucketName("destBucket")
                .transcriptionJobName("jobName")
                .mediaFormat("mp3")
                .settings(Settings.builder().showSpeakerLabels(true).maxSpeakerLabels(2).build())
                .build();

        TranscribeAsyncClient transcribeAsyncClient = TranscribeAsyncClient.create();
        CompletableFuture<StartTranscriptionJobResponse> jobResult = transcribeAsyncClient.startTranscriptionJob(request);
        logger.log("jobResult =  " + jobResult.toString());
        
        jobResult.whenComplete((jobResponse, err) -> {
            try {
                if (jobResponse != null) {
                    logger.log("CompletableFuture : response = " + jobResponse.toString());
                } else {
                    logger.log("CompletableFuture : NULL response: error = " + err.getMessage());
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });

        //Job is completed only if Thread is made to sleep
        /*try {
                Thread.sleep(50000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }*/

        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent();
        response.setStatusCode(200);
        Map<String, String> responseBody = new HashMap<String, String>();
        responseBody.put("Status", jobResult.toString());
        String responseBodyString = new JSONObject(responseBody).toJSONString();
        response.setBody(responseBodyString);
        return response;
    }
}

I have verified that the audio file exists in the source bucket.

The above job completes and the job completed event is triggered ONLY if I add some sleep time in the lambda after launching the job.
For example,

Thread.sleep(50000);

Everything works as expected if the sleep is added, but without Thread.sleep() the job never completes. The Lambda timeout is configured as 60 seconds. Any help or pointers would be really appreciated.

There is 1 answer

Answer by Augusto (best answer)

You are starting a CompletableFuture, but not waiting for it to complete.

Call get() on the future to block until it has finished executing.

        [...]
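        logger.log("jobResult =  " + jobResult.toString());
        // Blocks until the start request has completed. get() throws the checked
        // InterruptedException and ExecutionException, which must be handled or declared.
        jobResult.get();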

        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent();
        [...]

This also explains why it works when you do call sleep(), as it gives enough time to the Future to complete.

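Even though the call only sends an HTTPS request, the Lambda handler will return before that request has gone out (HTTPS connections are expensive to create). Once the handler returns, Lambda freezes the execution environment, so work still pending on background threads may never run.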
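
To make the difference concrete, here is a minimal sketch of the wait-before-returning pattern using only the JDK, so it runs without the AWS SDK. The class name AwaitFutureSketch and the method startJobAsync are hypothetical stand-ins for transcribeAsyncClient.startTranscriptionJob(request), and the returned status string is invented for illustration. In the real SDK, the future should complete once the service has acknowledged the start request, not when the transcription itself finishes, so the wait is normally short. A bounded get(timeout, unit) is used here on the assumption that you would rather fail than exhaust the Lambda timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AwaitFutureSketch {

    // Hypothetical stand-in for the async SDK call: the result is produced
    // on another thread after a simulated network delay.
    public static CompletableFuture<String> startJobAsync(String jobName, long delayMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "IN_PROGRESS:" + jobName;
        });
    }

    // Starts the job and blocks until the future completes or the timeout elapses.
    // Checked exceptions are wrapped so the caller (e.g. a handler) stays simple.
    public static String startAndWait(String jobName, long delayMillis, long timeoutMillis) {
        CompletableFuture<String> jobResult = startJobAsync(jobName, delayMillis);
        try {
            return jobResult.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for job start", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Job start did not complete", e);
        }
    }

    public static void main(String[] args) {
        // Without waiting, the future is typically still pending at this point,
        // which is the state the handler in the question returns in.
        CompletableFuture<String> notAwaited = startJobAsync("jobName", 500);
        System.out.println("done before waiting? " + notAwaited.isDone());

        // With the bounded wait, the result is available before we return.
        System.out.println("after waiting: " + startAndWait("jobName", 200, 5000));
    }
}
```

In the handler from the question, the equivalent change would be to wait on jobResult before building the APIGatewayProxyResponseEvent, with a timeout comfortably below the 60-second Lambda timeout.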