Creating Python zip for AWS Lambda using Bazel

1.7k views Asked by At

I've a monorepo that contains a set of Python AWS lambdas and I'm using Bazel for building and packaging the lambdas. I'm now trying to use Bazel to create a zip file that follows the expected AWS Lambdas packaging and that I can upload to Lambda. Wondering what's the best way to do this with Bazel?

Below are a few different things I've tried thus far:

Attempt 1: py_binary

BUILD.bazel

py_binary(
name = "main_binary",
srcs = glob(["*.py"]),
main = "main.py",
visibility = ["//appcode/api/transaction_details:__subpackages__"],
deps = [
        requirement("Faker"),
    ],
)

Problem:

This generates the following:

  • main_binary (python executable)
  • main_binary.runfiles
  • main_binary.runfiles_manifest

Lambda expects the handler to be in the format of lambda_function.lambda_handler. Since main_binary is an executable vs. a python file, it doesn't expose the actual handler method and the lambda blows up because it can't find it. I tried updating the handler configuration to simply point to the main_binary but it blows up because it expects two arguments(i.e. lambda_function.lambda_handler).

Attempt 2: py_library + pkg_zip

BUILD.bazel

py_library(
name = "main",
srcs = glob(["*.py"]),
visibility = ["//appcode/api/transaction_details:__subpackages__"],
deps = [
        requirement("Faker"),
    ],
)

pkg_zip(
name = "main_zip",
srcs =["//appcode/api/transaction_details/src:main" ],
)

Problem:

This generates a zip file with:

  • main.py
  • __init__.py

The zip file now includes the main.py but none of its runtime dependencies. Thus the lambda blows up because it can't find Faker.

Other Attempts:

I've also tried using the --build_python_zip flag as well as the @bazel_tools//tools/zip:zipper with a generic rule but they both lead to similar outcomes as the two previous attempts.

2

There are 2 answers

0
Ricardo Fonseca On BEST ANSWER

Below are the changes I made to the previous answer to generate the lambda zip. Thanks @jvolkman for the original suggestion.

project/BUILD.bazel: Added rule to generate requirements_lock.txt from project/requirements.txt

load("@rules_python//python:pip.bzl", "compile_pip_requirements")

compile_pip_requirements(
    name = "requirements",
    extra_args = ["--allow-unsafe"],
    requirements_in = "requirements.txt",
    requirements_txt = "requirements_lock.txt",
)

project/WORKSPACE.bazel: swap pip_install with pip_parse

workspace(name = "mdc-eligibility")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_python",
    sha256 = "9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6",
    strip_prefix = "rules_python-0.8.0",
    url = "https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz",
)

load("@rules_python//python:repositories.bzl", "python_register_toolchains")
python_register_toolchains(
    name = "python3_9",
    python_version = "3.9",
)

load("@rules_python//python:pip.bzl", "pip_parse")
load("@python3_9//:defs.bzl", "interpreter")
pip_parse(
   name = "mndc-eligibility-deps",
   requirements_lock = "//:requirements_lock.txt",
   python_interpreter_target = interpreter,
   quiet = False
)
load("@mndc-eligibility-deps//:requirements.bzl", "install_deps")
install_deps()

project/build_rules/lambda_packaging/lambda.bzl: Modified custom rule provided by @jvolkman to include source code in the resulting zip code.

def contains(pattern):
    return "contains:" + pattern

def startswith(pattern):
    return "startswith:" + pattern

def endswith(pattern):
    return "endswith:" + pattern

def _is_ignored(path, patterns):
    for p in patterns:
        if p.startswith("contains:"):
            if p[len("contains:"):] in path:
                return True
        elif p.startswith("startswith:"):
            if path.startswith(p[len("startswith:"):]):
                return True
        elif p.startswith("endswith:"):
            if path.endswith(p[len("endswith:"):]):
                return True
        else:
            fail("Invalid pattern: " + p)

    return False

def _short_path(file_):
    # Remove prefixes for external and generated files.
    # E.g.,
    #   ../py_deps_pypi__pydantic/pydantic/__init__.py -> pydantic/__init__.py
    short_path = file_.short_path
    if short_path.startswith("../"):
        second_slash = short_path.index("/", 3)
        short_path = short_path[second_slash + 1:]
    return short_path

# steven chambers

def _py_lambda_zip_impl(ctx):
    deps = ctx.attr.target[DefaultInfo].default_runfiles.files

    f = ctx.outputs.output

    args = []
    for dep in deps.to_list():
        short_path = _short_path(dep)

        # Skip ignored patterns
        if _is_ignored(short_path, ctx.attr.ignore):
            continue

        args.append(short_path + "=" + dep.path)

    # MODIFICATION: Added source files to the map of files to zip
    source_files = ctx.attr.target[DefaultInfo].files
    for source_file in source_files.to_list():
        args.append(source_file.basename+"="+source_file.path)

    ctx.actions.run(
        outputs = [f],
        inputs = deps,
        executable = ctx.executable._zipper,
        arguments = ["cC", f.path] + args,
        progress_message = "Creating archive...",
        mnemonic = "archiver",
    )

    out = depset(direct = [f])
    return [
        DefaultInfo(
            files = out,
        ),
        OutputGroupInfo(
            all_files = out,
        ),
    ]

_py_lambda_zip = rule(
    implementation = _py_lambda_zip_impl,
    attrs = {
        "target": attr.label(),
        "ignore": attr.string_list(),
        "_zipper": attr.label(
            default = Label("@bazel_tools//tools/zip:zipper"),
            cfg = "host",
            executable = True,
        ),
        "output": attr.output(),
    },
    executable = False,
    test = False,
)

def py_lambda_zip(name, target, ignore, **kwargs):
    _py_lambda_zip(
        name = name,
        target = target,
        ignore = ignore,
        output = name + ".zip",
        **kwargs
    )

project/appcode/api/transaction_details/src/BUILD.bazel: Used custom py_lambda_zip rule to zip up py_library

load("@mndc-eligibility-deps//:requirements.bzl", "requirement")
load("@python3_9//:defs.bzl", "interpreter")
load("//build_rules/lambda_packaging:lambda.bzl", "contains", "endswith", "py_lambda_zip", "startswith")

py_library(
    name = "main",
    srcs = glob(["*.py"]),
    visibility = ["//appcode/api/transaction_details:__subpackages__"],
    deps = [
            requirement("Faker"),
        ],
)

py_lambda_zip(
    name = "lambda_archive",
    ignore = [
        contains("/__pycache__/"),
        endswith(".pyc"),
        endswith(".pyo"),

        # Ignore boto since it's provided by Lambda.
        startswith("boto3/"),
        startswith("botocore/"),

        # With the move to hermetic toolchains, the zip gets a lib/ directory containing the
        # python runtime. We don't need that.
        startswith("lib/"),
    ],
    target = ":main",
)
2
jvolkman On

We use @bazel_tools//tools/zip:zipper with a custom rule. We also pull serverless in using rules_nodejs and run it through bazel, which causes the package building to happen prior to running sls deploy.

We use pip_parse from rules_python. I'm not sure whether the _short_path function below will work with pip_install or other mechanisms.

File filtering is supported, although it's awkward. Ideally the zip generation would be handled by a separate binary (i.e., a Python script) which would allow filtering using regular expressions/globs/etc. Bazel doesn't support regular expressions in Starlark, so we use our own thing.

I've included an excerpt:

lambda.bzl

"""
Support for serverless deployments.
"""

def contains(pattern):
    return "contains:" + pattern

def startswith(pattern):
    return "startswith:" + pattern

def endswith(pattern):
    return "endswith:" + pattern

def _is_ignored(path, patterns):
    for p in patterns:
        if p.startswith("contains:"):
            if p[len("contains:"):] in path:
                return True
        elif p.startswith("startswith:"):
            if path.startswith(p[len("startswith:"):]):
                return True
        elif p.startswith("endswith:"):
            if path.endswith(p[len("endswith:"):]):
                return True
        else:
            fail("Invalid pattern: " + p)

    return False

def _short_path(file_):
    # Remove prefixes for external and generated files.
    # E.g.,
    #   ../py_deps_pypi__pydantic/pydantic/__init__.py -> pydantic/__init__.py
    short_path = file_.short_path
    if short_path.startswith("../"):
        second_slash = short_path.index("/", 3)
        short_path = short_path[second_slash + 1:]
    return short_path

def _py_lambda_zip_impl(ctx):
    deps = ctx.attr.target[DefaultInfo].default_runfiles.files

    f = ctx.outputs.output

    args = []
    for dep in deps.to_list():
        short_path = _short_path(dep)

        # Skip ignored patterns
        if _is_ignored(short_path, ctx.attr.ignore):
            continue

        args.append(short_path + "=" + dep.path)

    ctx.actions.run(
        outputs = [f],
        inputs = deps,
        executable = ctx.executable._zipper,
        arguments = ["cC", f.path] + args,
        progress_message = "Creating archive...",
        mnemonic = "archiver",
    )

    out = depset(direct = [f])
    return [
        DefaultInfo(
            files = out,
        ),
        OutputGroupInfo(
            all_files = out,
        ),
    ]

_py_lambda_zip = rule(
    implementation = _py_lambda_zip_impl,
    attrs = {
        "target": attr.label(),
        "ignore": attr.string_list(),
        "_zipper": attr.label(
            default = Label("@bazel_tools//tools/zip:zipper"),
            cfg = "host",
            executable = True,
        ),
        "output": attr.output(),
    },
    executable = False,
    test = False,
)

def py_lambda_zip(name, target, ignore, **kwargs):
    _py_lambda_zip(
        name = name,
        target = target,
        ignore = ignore,
        output = name + ".zip",
        **kwargs
    )

BUILD.bazel

load("@npm_serverless//serverless:index.bzl", "serverless")
load(":lambda.bzl", "contains", "endswith", "py_lambda_zip", "startswith")

py_binary(
    name = "my_lambda_app",
    ...
)

py_lambda_zip(
    name = "lambda_archive",
    ignore = [
        contains("/__pycache__/"),
        endswith(".pyc"),
        endswith(".pyo"),
        
        # Ignore boto since it's provided by Lambda.
        startswith("boto3/"),
        startswith("botocore/"),

        # With the move to hermetic toolchains, the zip gets a lib/ directory containing the
        # python runtime. We don't need that.
        startswith("lib/"),
    ],
    target = ":my_lambda_app",

    # Only allow building on linux, since we don't want to upload a lambda zip file
    # with e.g. macos compiled binaries.
    target_compatible_with = [
        "@platforms//os:linux",
    ],
)

# The sls command requires that serverless.yml be in its working directory, and that the yaml file
# NOT be a symlink. So this target builds a directory containing a copy of serverless.yml, and also 
# symlinks the generated lambda_archive.zip in the same directory.
#
# It also generates a chdir.js script that we instruct node to execute to change to the proper working directory.
genrule(
    name = "sls_files",
    srcs = [
        "lambda_archive.zip",
        "serverless.yml",
    ],
    outs = [
        "sls_files/lambda_archive.zip",
        "sls_files/serverless.yml",
        "sls_files/chdir.js",
    ],
    cmd = """
        mkdir -p $(@D)/sls_files
        cp $(location serverless.yml) $(@D)/sls_files/serverless.yml
        cp -P $(location lambda_archive.zip) $(@D)/sls_files/lambda_archive.zip

        echo "const fs = require('fs');" \
             "const path = require('path');" \
             "process.chdir(path.dirname(fs.realpathSync(__filename)));" > $(@D)/sls_files/chdir.js
    """,
)

# Usage:
#   bazel run //:sls -- deploy <more args>
serverless(
    name = "sls",
    args = ["""--node_options=--require=./$(location sls_files/chdir.js)"""],
    data = [
        "sls_files/chdir.js",
        "sls_files/serverless.yml",
        "sls_files/lambda_archive.zip",
    ],
)

serverless.yml

service: my-app

package:
  artifact: lambda_archive.zip

# ... other config ...