Unwrap / amalgamate PHP code from several .php files


For debugging purposes, when working on PHP projects with many files and many includes (for example, WordPress code), I would sometimes like to see the "unwrapped" code, that is, to amalgamate / flatten all files into one big PHP file ("flatten" is the term used in Photoshop-like tools for merging many layers into one).

How can I amalgamate multiple PHP files into one?

Example:

$ php index.php --amalgamation

would take these files as input:

  • vars.php

    <?php
    $color = 'green';
    $fruit = 'apple';
    ?>
    

  • index.php

    <?php
    include 'vars.php';
    echo "A $color $fruit";
    ?>
    

and produce this amalgamated output:

<?php
$color = 'green';
$fruit = 'apple';
echo "A $color $fruit";
?>

(It should also work with nested includes, e.g. if index.php includes vars.php, which itself includes abc.php.)

Answer by Markus AO (best answer)

We can write an amalgamation/bundling script that fetches a given file's contents and matches any instances of include|require, and then fetches any referred files' contents, and substitutes the include/require calls with the actual code.

The following is a rudimentary implementation that will work (based on a very limited test on files with nested references) with any number of files that include/require other files.

<?php

// Main file that references further files:
$start = 'test/test.php';

function bundle_files(string $filepath)
{
    // Fetch current code
    $code = file_get_contents($filepath);
    
    // Set directory for referred files
    $dirname = pathinfo($filepath, PATHINFO_DIRNAME);
    
    // Match and substitute include/require(_once) with code:
    $rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';

    $code = preg_replace_callback($rx, function($m) use ($dirname) {
        // Ensure a valid filepath or abort:
        if($path = realpath($dirname . '/' . $m['path'])) {
            return bundle_files($path);         
        } else {
            die("Filepath Read Fail: {$dirname}/{$m['path']}");
        }
    }, $code);
    
    // Remove opening PHP tags, note source filepath
    $code = preg_replace('~^\s*<\?php\s*~i', "\n// ==== Source: {$filepath} ====\n\n", $code);
    
    // Remove closing PHP tags, if any
    $code = preg_replace('~\?>\s*$~', '', $code);   
    
    return $code;
}

$bundle = '<?php ' . "\n" . bundle_files($start);

file_put_contents('bundle.php', $bundle);
echo $bundle;

Here we use preg_replace_callback() to match and substitute in order of appearance, with the callback calling the bundling function on each matched filepath and substituting include/require references with the actual code. The function also includes a comment line indicating the source of the included file, which may come in handy if/when you're debugging the compiled bundle file.
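
To see which statements the pattern actually picks up, here is a small sketch that runs the same regex over a string rather than a file. The sample source and its file names (lib/helpers.php, paren.php, concat.php) are made up for illustration; only the pattern comes from the script above.

```php
<?php

// Same pattern as in bundle_files() above.
$rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';

// Hypothetical sample source, held in a string instead of a file.
$sample = <<<'SRC'
<?php
include 'vars.php';
require_once "lib/helpers.php";
include('paren.php');
include __DIR__ . '/concat.php';
SRC;

preg_match_all($rx, $sample, $matches);

// Paths the bundler would try to inline, in order of appearance.
$matched = $matches['path'];
print_r($matched);
```

Only the first two statements are matched. The parenthesised form and the concatenated path slip through untouched, so they would remain as live include calls in the bundle.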

Notes/Homework:

  • You may need to refine the base directory reference routine. (Expect trouble with "incomplete" filepaths that rely on PHP include_path.)
  • There is no control of _once, code will be re-included. (Easy to remedy by recording included filepaths and skipping recurrences.)
  • Matching is only made on "path/file.php", i.e. unbroken strings inside single/double quotes that directly follow the keyword. Concatenated strings are not matched, and neither is the parenthesised form include('file.php').
  • Paths including variables or constants are not understood. For that to be possible, the files would have to be evaluated, and without side effects.
  • If you use declare(strict_types=1);, place it once at the top of the bundle and remove any later instances.
  • There may be other side-effects from the bundling of files that are not addressed here.
  • The regex does no lookbehind/around to see if your include/require is commented out!
  • If your code jumps in and out of PHP mode and blurts out HTML, all bets are off.
  • Managing the inclusion of autoloaded classes is beyond this snippet.
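
The _once remedy mentioned in the notes (record included filepaths and skip recurrences) could look roughly like the sketch below. It is a variant of the function above, not a tested replacement: the name bundle_files_once, the $seen parameter, the "Skipped" comment text and the throwaway demo files are all my own assumptions. It also throws an exception instead of calling die(), and it writes the source comment through a callback so that a "$" or "\" in a path is not read as a backreference.

```php
<?php

// Sketch: like bundle_files(), but remembers which files were already inlined.
// $seen is a hypothetical extra parameter, shared by reference across the recursion.
function bundle_files_once(string $filepath, array &$seen = []): string
{
    $seen[realpath($filepath) ?: $filepath] = true;

    $code = file_get_contents($filepath);
    $dirname = pathinfo($filepath, PATHINFO_DIRNAME);
    $rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';

    $code = preg_replace_callback($rx, function ($m) use ($dirname, &$seen) {
        $path = realpath($dirname . '/' . $m['path']);
        if ($path === false) {
            throw new RuntimeException("Filepath Read Fail: {$dirname}/{$m['path']}");
        }
        // Group 3 is "_once" for include_once/require_once, empty otherwise.
        $once = ($m[3] ?? '') !== '';
        if ($once && isset($seen[$path])) {
            return "// ==== Skipped (already included): {$path} ====";
        }
        return bundle_files_once($path, $seen);
    }, $code);

    // Replace the opening tag with a source note; strip a trailing closing tag.
    $code = preg_replace_callback('~^\s*<\?php\s*~i', function () use ($filepath) {
        return "\n// ==== Source: {$filepath} ====\n\n";
    }, $code);

    return preg_replace('~\?>\s*$~', '', $code);
}

// Demo with throwaway files in the system temp directory.
$dir = sys_get_temp_dir() . '/bundle_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/vars.php", "<?php\n\$color = 'green';\n");
file_put_contents("$dir/a.php", "<?php\nrequire_once 'vars.php';\n\$fruit = 'apple';\n");
file_put_contents("$dir/index.php", "<?php\nrequire_once 'vars.php';\ninclude 'a.php';\necho \"A \$color \$fruit\";\n");

$bundle = "<?php\n" . bundle_files_once("$dir/index.php");
echo $bundle;

// Clean up the demo files.
array_map('unlink', glob("$dir/*.php"));
rmdir($dir);
```

In the demo, vars.php is requested twice with require_once (from index.php and from a.php) but its code is inlined only the first time. Plain include/require statements are still inlined on every occurrence, which mirrors how PHP itself treats them.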

Please report any glitches and edge cases. Feel free to develop and (freely) share.