What is a bucket brigade?


I would really love to implement a php_user_filter::filter(). But for that I need to know what a bucket brigade is. It seems to be a resource that I can operate on with the stream_bucket_* functions, but the documentation is not really helpful. The best I could find are the examples in stream_filter_register().

I'm especially curious what stream_bucket_new() and stream_bucket_make_writeable() can do.


Update: It seems that PHP is exposing an internal data structure of Apache.


There are 2 answers

bwoebi On BEST ANSWER

Ah, welcome to the least documented parts of the PHP manual! [I opened a bug report about it; maybe this answer will be helpful for documenting it: https://bugs.php.net/bug.php?id=69966]

The bucket brigade

To start with your initial question: "bucket brigade" is just the name of the resource type userfilter.bucket brigade.

Two different brigades are passed as the first and second parameters to php_user_filter::filter(). The first brigade holds the input buckets you read from; the second brigade is initially empty, and you write to it.

Regarding your update about the data structure… It's basically just a doubly linked list of strings. But it may well be that the name was stolen from there ;-)

stream_bucket_prepend() / stream_bucket_append()

stream_bucket_prepend(resource $brigade, stdClass $bucket): null
stream_bucket_append(resource $brigade, stdClass $bucket): null

The expected $brigade is the output brigade, i.e. the second parameter of php_user_filter::filter().

The $bucket is a stdClass object like the one returned by stream_bucket_make_writeable() or stream_bucket_new().

These two functions just prepend or append the passed bucket to the brigade.
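As a rough illustration of the difference between the two, here is a minimal sketch of a filter that wraps every chunk written through it. The filter name wrap, the class wrap_filter and the "<" / ">" markers are made up for this example; the `: int` return type matches the signature newer PHP versions declare and can be dropped on older ones.

```php
<?php
// Hypothetical filter: wraps every chunk written through it in "<" and ">".
class wrap_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        $sawData = false;
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);   // goes to the tail of $out
            $sawData = true;
        }
        if ($sawData) {
            // Becomes the new head of $out, ahead of the data appended above.
            stream_bucket_prepend($out, stream_bucket_new($this->stream, "<"));
            // Becomes the new tail of $out.
            stream_bucket_append($out, stream_bucket_new($this->stream, ">"));
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("wrap", "wrap_filter");

$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "wrap", STREAM_FILTER_WRITE);
fwrite($fp, "abc");
rewind($fp);
$result = stream_get_contents($fp);
echo $result, "\n"; // prints "<abc>"
```

The $sawData guard is there because the filter is also invoked with an empty input brigade on flush, and the sketch should not emit empty "<>" pairs in that case.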

stream_bucket_new()

To demystify this function, first look at its signature:

stream_bucket_new(resource $stream, string $buffer): stdClass

The first argument is the $stream you're writing this bucket to. The second is the $buffer this new bucket will contain.

[I'd like to note here that the $stream parameter actually is not very significant; it's just used to check whether we need to allocate memory persistently so that it survives through requests. I just suppose that you can make PHP nicely segfault by passing a persistent stream in here, when operating on a non-persistent filter...]

This creates a userfilter.bucket resource, which is assigned to a property named bucket of a (stdClass) object. That object also has two other properties, data and datalen, which contain the buffer and the buffer size of this bucket.

It returns that stdClass object, which you can pass to stream_bucket_prepend() and stream_bucket_append().
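To make the shape of the returned object concrete, here is a small sketch that creates a bucket outside of any filter and inspects it. Using a php://memory stream as the first argument is just a convenient assumption for the demo; inside a real filter you would pass $this->stream.

```php
<?php
// Any open stream will do for the demo; php://memory is an arbitrary choice.
$stream = fopen("php://memory", "w+");

$bucket = stream_bucket_new($stream, "hello");

var_dump(is_object($bucket));          // the bucket wrapper is an object
var_dump($bucket->data);               // the buffer passed in: "hello"
var_dump($bucket->datalen);            // its length in bytes: 5
var_dump(is_resource($bucket->bucket)); // the underlying bucket resource
```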

stream_bucket_make_writeable()

stream_bucket_make_writeable(resource $brigade): stdClass|null

It shifts the first bucket off the $brigade and returns it. If the $brigade is already empty, it returns null.
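The shift-until-null behaviour can be observed with a small logging filter. This is only a sketch; the class drain_demo_filter, the filter name drain_demo and the static $log array are invented for the demonstration.

```php
<?php
// Hypothetical filter that records what stream_bucket_make_writeable() returns.
class drain_demo_filter extends php_user_filter {
    public static $log = [];

    public function filter($in, $out, &$consumed, $closing): int {
        // Each call removes one bucket from the head of $in.
        while (($bucket = stream_bucket_make_writeable($in)) !== null) {
            self::$log[] = $bucket->data;
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        // $in is drained now, so one more call yields null.
        self::$log[] = var_export(stream_bucket_make_writeable($in), true);
        return PSFS_PASS_ON;
    }
}

stream_filter_register("drain_demo", "drain_demo_filter");

$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "drain_demo", STREAM_FILTER_WRITE);
fwrite($fp, "abc");

// Snapshot taken right after the single write.
$log = drain_demo_filter::$log;
print_r($log);
```

After the single fwrite() the log holds the data of the one bucket followed by the string "NULL".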

Further notes

When php_user_filter::filter() is called, the $stream property of the filter object is set to the stream currently being processed. That's also the stream you need to pass to stream_bucket_new() when calling it. (The $stream property will be unset again after the call. You can't reuse it in e.g. php_user_filter::onClose()).

Also note that even though the bucket comes with a $datalen property, you do not need to update it if you change the $data property before passing the bucket to stream_bucket_prepend() or stream_bucket_append().
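A quick way to convince yourself of this is a filter that makes the data longer and leaves datalen alone. The sketch below assumes a made-up filter named exclaim that adds "!!!" to every chunk.

```php
<?php
// Hypothetical filter: grows each chunk without touching datalen.
class exclaim_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen; // length of the original input
            $bucket->data .= "!!!";        // payload is now 3 bytes longer
            // datalen is deliberately left stale; the new length of data is used.
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("exclaim", "exclaim_filter");

$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "exclaim", STREAM_FILTER_WRITE);
fwrite($fp, "hi");
rewind($fp);
$result = stream_get_contents($fp);
echo $result, "\n"; // prints "hi!!!"
```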

The implementation requires (well, it expects it, or will throw a warning) that you read all the data from the $in brigade before returning.

There is another case of the documentation lying to us: in php_user_filter::onCreate(), the $stream property is not set. It will only be set during the filter() method call.
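This can be checked with a small probe. The sketch below uses an invented probe_filter class that records whether $stream is usable in onCreate() and in filter(); the static $seen array exists only for this demo.

```php
<?php
// Hypothetical filter that records when $this->stream is available.
class probe_filter extends php_user_filter {
    public static $seen = [];

    public function onCreate(): bool {
        // Runs when the filter is attached; no stream is associated yet.
        self::$seen['onCreate'] = isset($this->stream);
        return true;
    }

    public function filter($in, $out, &$consumed, $closing): int {
        // During filter() the property holds the stream resource.
        self::$seen['filter'] = is_resource($this->stream);
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("probe", "probe_filter");

$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "probe", STREAM_FILTER_WRITE);
fwrite($fp, "x");

var_dump(probe_filter::$seen);
```

Here onCreate should be recorded as false and filter as true.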

Generally, don't use filters with non-blocking streams. I tried that once and it went horribly wrong … And it's not likely that's ever going to be fixed...

Sum up (examples)

Let's start with the simplest case: writing back what we got in.

class simple_filter extends php_user_filter {
    function filter($in, $out, &$consumed, $closing) {
        // Shift buckets off the input brigade until it is empty.
        while ($bucket = stream_bucket_make_writeable($in)) {
            // Report how many input bytes were consumed.
            $consumed += $bucket->datalen;
            // Hand the bucket on unchanged.
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("simple", "simple_filter");

All that happens here is taking the buckets from the $in bucket brigade and putting them into the $out bucket brigade.

Okay, now let's manipulate our input.

class reverse_filter extends php_user_filter {
    function filter($in, $out, &$consumed, $closing) {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            // Reverse the bytes inside this bucket.
            $bucket->data = strrev($bucket->data);
            // Prepend, so that several buckets in one call also end up in reverse order.
            stream_bucket_prepend($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("reverse", "reverse_filter");

Now we've registered the reverse filter, which reverses your string (each write is reversed on its own here; write order is still preserved). So we manipulate the bucket data and prepend the bucket to $out.
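To see the "each write on its own" behaviour, the filter can be attached to a stream and fed two writes. The sketch below repeats the reverse_filter class so that it runs on its own; the php://memory target and the sample strings are assumptions for the demo.

```php
<?php
// Same filter as above, repeated so this snippet is self-contained.
class reverse_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            $bucket->data = strrev($bucket->data);
            stream_bucket_prepend($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("reverse", "reverse_filter");

$fp = fopen("php://memory", "w+");
stream_filter_append($fp, "reverse", STREAM_FILTER_WRITE);

// Each fwrite() arrives in its own filter() call as a single bucket.
fwrite($fp, "abc");
fwrite($fp, "def");

rewind($fp);
$result = stream_get_contents($fp);
echo $result, "\n"; // prints "cbafed", not "fedcba"
```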

Now, what's the use case for stream_bucket_new()? Usually you can just append to $bucket->data; you can even concatenate all the data into the first bucket. But on a flush or close the input brigade may be empty, and if you still want to send a last bucket, you need to create one yourself.

class append_filter extends php_user_filter {
    public $stream;

    function filter($in, $out, &$consumed, $closing) {
        // Pass all incoming buckets through unchanged.
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        // always append a terminating \n
        if ($closing) {
            // $in is empty at this point, so a fresh bucket is needed.
            $bucket = stream_bucket_new($this->stream, "\n");
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("append", "append_filter");
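For completeness, here is a sketch of how the closing case might be exercised. It repeats a compact version of append_filter so that it runs standalone, writes to a temporary file (the tempnam() prefix and the sample text are arbitrary), and closes the stream explicitly with fclose(), on the assumption that this is when the filter sees $closing set.

```php
<?php
// Compact copy of the filter above, so this snippet is self-contained.
class append_filter extends php_user_filter {
    public function filter($in, $out, &$consumed, $closing): int {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        if ($closing) {
            // No input bucket is available here, so create one.
            stream_bucket_append($out, stream_bucket_new($this->stream, "\n"));
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register("append", "append_filter");

// Arbitrary temporary file used as the write target.
$path = tempnam(sys_get_temp_dir(), "flt");
$fp = fopen($path, "w");
stream_filter_append($fp, "append", STREAM_FILTER_WRITE);
fwrite($fp, "line one");
fclose($fp); // triggers the final filter() call with $closing set

$written = file_get_contents($path);
unlink($path);
var_dump($written);
```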

With that (and the existing documentation about php_user_filter class), one should be able to do all sorts of magic userland stream filtering by combining all these powerful possibilities into even stronger code.

rexfordkelly On

I thought I would contribute some background info.

First, the terms buckets and brigades. Turns out there is a thing called a bucket brigade... sort of a tag team effort for fighting fires... where you have a chain of people who stand still but pass buckets of water to the person next to them, producing a constant flow of buckets full of water.

Also, as pointed out above, PHP's adoption of buckets and brigades perhaps comes from Apache's [Buckets and Brigades](http://www.apachetutor.org/dev/brigades); that page gives a great explanation of the methodology and reasoning.

But essentially the idea is: if you need to modify some content before it is sent, doing it mid-stream has many benefits, especially when you model your streams using buckets and brigades.