Lifetime of literal values inside Zig comptime function


Literal's lifetime inside a function

In the following code, is the pointer returned by the function a bad pointer?

fn hello() *const u8 {
    return &'A';
}

No, because the 'A' literal has a static lifetime (or does it?). But what about the following function?

fn world(c: u8) []const u8 {
    return &.{ c };
}

Still no, as far as I can tell, because again the array literal has a static lifetime. But this leads to strange behavior when the function is called multiple times:

const a = world('a');
const b = world('b');
print("{s} {s}\n", .{a, b});
// observed output: b b

The previous value ('a') has been overwritten, as the referenced literal seems to be the same across all calls to world().
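For comparison, here is a minimal sketch of a runtime variant that sidesteps the question entirely by letting the caller own the storage. The name `worldInto` and the `out` parameter are assumptions made for illustration, not something from the code above:

```zig
const std = @import("std");

// Hypothetical variant: the caller provides the one-byte buffer, so the
// returned slice lives exactly as long as the caller's buffer does.
fn worldInto(c: u8, out: *[1]u8) []const u8 {
    out[0] = c;
    return out; // *[1]u8 coerces to []const u8
}

pub fn main() void {
    var buf_a: [1]u8 = undefined;
    var buf_b: [1]u8 = undefined;
    const a = worldInto('a', &buf_a);
    const b = worldInto('b', &buf_b);
    std.debug.print("{s} {s}\n", .{ a, b });
}
```

Each call writes into a different buffer, so the two slices cannot alias each other.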

Comptime function

What happens now if the parameter of the world() function is comptime?

fn world(comptime c: u8) []const u8 {
    return &.{ c };
}

Apparently, it creates a distinct static literal for each call:

const a = world('a');
const b = world('b');
print("{s} {s}\n", .{a, b});
// observed output: a b

Macro to build literal tree

Now, the same problem but applied to a concrete situation.

I have a Regex type representing a regular expression:

const Regex = union(enum) {
    literal: u8,
    sequence: []const Regex,
    repeat: *const Regex,
    choice: []const Regex,
};

But directly writing a literal regex formula can be cumbersome:

const regex: Regex = .{ .sequence = &.{ .{ .literal = 'h' }, .{ .choice = &.{ ... } } } };

To make literal regular expressions easier to write, some helper functions can come to the rescue:

// make the regular expression optional (aka the `?` operator)
fn opt(comptime self: Regex) Regex {
    // choice between itself and an empty sequence
    return .{ .choice = &.{ self, .{ .sequence = &.{} } } };
}

We could also imagine a function that converts a string literal to a regex (a sequence of literals).
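A rough sketch of what such a string helper might look like. The name `str` is made up for illustration, and the code assumes a Zig version with multi-object for loops (0.11 or later). The array is built in a comptime block and bound to a const, so the slice points at comptime-known data rather than at a runtime local:

```zig
const std = @import("std");

const Regex = union(enum) {
    literal: u8,
    sequence: []const Regex,
    repeat: *const Regex,
    choice: []const Regex,
};

// Hypothetical helper: turn a comptime-known string into a sequence of literals.
fn str(comptime s: []const u8) Regex {
    // Build the array entirely at compile time.
    const items = comptime blk: {
        var buf: [s.len]Regex = undefined;
        for (s, 0..) |c, i| {
            buf[i] = .{ .literal = c };
        }
        break :blk buf;
    };
    // `items` is a comptime-known const, so `&items` does not refer to
    // stack memory of this call.
    return .{ .sequence = &items };
}
```

Used as `const hello = str("hello");`, this should give a `.sequence` of five `.literal` nodes, one per byte of the string.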

This seems to work as long as the function parameters are comptime. But I don't want approximate confidence, I want full confidence in its correctness. Please correct me if my understanding of the lifetime of literals is wrong.

Zig's comptime print

It turns out Zig has such a function in its standard library: std.fmt.comptimePrint. It creates a local variable and returns a pointer to it.

pub inline fn comptimePrint(comptime fmt: []const u8, args: anytype) *const [count(fmt, args):0]u8 {
    comptime {
        var buf: [count(fmt, args):0]u8 = undefined;
        _ = bufPrint(&buf, fmt, args) catch unreachable;
        buf[buf.len] = 0;
        return &buf;
    }
}

But what do the inline and the comptime { ... } do here? Why are they necessary? What would happen if they weren't there? Is the returned buffer's lifetime static?

1 Answer

Answered by sigod

inline forces the function to be semantically inlined at the callsite. comptime {} guarantees that the code is evaluated at compile-time. This means that the content of the buffer is known at compile-time and likely lives in the global constant data section at runtime. In other words, yes, the lifetime of the buffer is static.
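As a small illustration of that point, here is a sketch that calls std.fmt.comptimePrint twice with different arguments and uses both results afterwards. The wrapper function `label` is a name invented for this example:

```zig
const std = @import("std");

// Hypothetical wrapper: each distinct comptime argument produces its own
// compile-time formatted string.
fn label(comptime n: u32) []const u8 {
    return std.fmt.comptimePrint("item-{d}", .{n});
}

pub fn main() void {
    const a = label(1);
    const b = label(2);
    // Both slices stay valid after the calls return, because the formatted
    // bytes are compile-time constants rather than stack data.
    std.debug.print("{s} {s}\n", .{ a, b });
}
```

The two results do not overwrite each other, since each string is computed during compilation and stored as constant data.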