PHP text to Array Regex


I need to split a text that is delimited by markers of the form "~N" (N is a number from 0 to 14) and ends at "~end". Each position of the array should hold the text between two consecutive markers. Does anybody have an idea?

TEXT: (the original has a lot more text and may contain HTML)

 ~0 
    aaaaaa1
    aaaaaaaaaa
    ~1 
    bbbbbbbbbb
    sdf23
    324 <br>
    sdfs
    ~2 
    cccccccccc
    ~3 
    ddddddddddd 

    ~13 
    eeeeeeeeeee 

    ~14 
    fffffffffff 
            ~end

Expected array:

 Array
                (
                    [0] =>  aaaaaa1
                            aaaaaaaaaa

                    [1] => bbbbbbbbbbb
                            sdf23
                            324 <br>
                            sdfs 

                    [2] => cccccccccc 

                    [3] => dddddddddd
                    .
                    .
                    .
                    .
                    [13] => eeeeeee 

                    [14] => fffffff 


                )

My PHP attempt with a regex (it fails):

$texto = "
 ~0 
    123hola321
    yyyyyyyyyyy
    ~1 
    rrrrrrrrrrrr
    sdf23
    324 <br>
    sdfs
    ~2 
    cccccccccc
    ~3 
    ddddddddddd 

    ~13 
    ddddddddddd 

    ~14 
    ddddddddddd 
            ~end  ";


$regex = '/^~(\d{1,2}.\n)(.*?)/m';
echo $regex;
preg_replace($regex,$texto,$matches);


echo "<pre>";
print_r($matches);
echo "</pre>";

//      ^~(\d{1,2}.\n)    

// ~\d{1,2} (.*?)2$
// 
//  ^~\d{1,2}(.*?)end$

thx


There are 2 answers

victorelec14 (best answer):
$texto = "
~0 
123hola321
yyyyyyyyyyy
xxxxxxxx
ffffffffff
~1 
rrrrrrrrrrrr
~2 
cccccccccc
~3 
ddddddddddd 

~3 
ddddddddddd 
        ~end  ";


$arr = preg_split('#~\d{1,2}.(\r\n|\n|\r)#', $texto);

echo "<pre>";
print_r($arr);
echo "</pre>";

jeroen:

I would use preg_split() instead:

$arr = preg_split('/~\d{1,2}/', $texto);

No need to capture everything in between.

Of course, this only works if the keys are sequential and start at 0, or if the keys don't matter.
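If the numbers in the markers do matter (the question's sample jumps from ~3 to ~13), one possible alternative is to capture the number and use it as the array key. The following is only a sketch under assumptions: the sample text is made up to resemble the question's input, and it assumes that "~" followed by digits never occurs inside the content itself.

```php
<?php
// Hypothetical sample input, shaped like the text in the question.
$texto = "
 ~0
    aaaaaa1
    aaaaaaaaaa
    ~1
    bbbbbbbbbb
    ~13
    eeeeeeeeeee
    ~14
    fffffffffff
            ~end  ";

// Group 1: the marker number. Group 2: the text up to the next marker.
// The 's' flag lets '.' match newlines; the lookahead stops right before
// the next "~N" or "~end" without consuming it.
$pattern = '/~(\d{1,2})\s*(.*?)\s*(?=~\d{1,2}|~end)/s';
preg_match_all($pattern, $texto, $matches, PREG_SET_ORDER);

$arr = [];
foreach ($matches as $m) {
    // The key is the number taken from the marker, not the position.
    $arr[(int) $m[1]] = $m[2];
}

echo "<pre>";
print_r($arr);
echo "</pre>";
```

Here the resulting keys are 0, 1, 13 and 14, whereas preg_split() would number the pieces by position.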

Edit: If you want to trim the resulting strings in the process, you should not just add a catch-all character such as the dot . to the regex; the dot matches any character, so it can remove valid characters from your results.
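A small illustration of that risk, using a made-up input in which the text follows the marker directly, with no space in between:

```php
<?php
// Hypothetical input: no whitespace between the marker and the text.
$texto = "~0abc~1def";

// The dot consumes one arbitrary character after the number.
$withDot = preg_split('/~\d{1,2}./', $texto, -1, PREG_SPLIT_NO_EMPTY);

// Without the dot the text is left intact.
$withoutDot = preg_split('/~\d{1,2}/', $texto, -1, PREG_SPLIT_NO_EMPTY);

print_r($withDot);     // the leading "a" and "d" are lost
print_r($withoutDot);  // "abc" and "def" are kept whole
```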

Instead, only remove the white-space with this:

$arr = preg_split('/\s*~\d{1,2}\s*/', $texto);

\s* matches zero or more white-space characters (spaces, tabs, newlines, etc.).
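Note that with the pattern above the first element is an empty string (the piece before ~0) and the last element still contains "~end". A possible variation, sketched here with a made-up sample text, treats "~end" as a delimiter too and passes PREG_SPLIT_NO_EMPTY so the empty pieces are dropped:

```php
<?php
// Hypothetical sample input, shaped like the text in the question.
$texto = "
 ~0
    123hola321
    yyyyyyyyyyy
    ~1
    324 <br>
    sdfs
    ~2
    cccccccccc
            ~end  ";

// Split on "~N" or "~end" together with the whitespace around them.
// PREG_SPLIT_NO_EMPTY drops the empty pieces before "~0" and after "~end".
$arr = preg_split('/\s*~(?:\d{1,2}|end)\s*/', $texto, -1, PREG_SPLIT_NO_EMPTY);

echo "<pre>";
print_r($arr);
echo "</pre>";
```

The keys are again positional (0, 1, 2), so this only fits the case where the marker numbers are sequential or don't matter.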