Split text into an array of N elements with a balanced number of words in each element


I want to split a large text into 10 pieces (roughly equal parts). I use this function:

function chunk($msg) {
    $msg = preg_replace('/[\r\n]+/', ' ', $msg);
    // define the character length of each text piece
    $chunks = wordwrap($msg, 10000, "\n");
    return explode("\n", $chunks);
}

$arrayys = chunk($t);
foreach ($arrayys as $partt) {
    echo $partt . "<br/><br/><br/>";
}

But is it possible to define the number of words in each text piece (instead of the character length)? How can I split the text into words in such a situation?


There are 4 answers

user1915746 (BEST ANSWER)

I would suggest using explode() (http://php.net/manual/en/function.explode.php) to split the string by spaces. You will then get an array of words that you can iterate over to build your text pieces, as sketched below.
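A minimal sketch of that approach (the chunkWords() helper and the fixed group count of 10 are illustrative, not part of the answer):

<?php
// Hypothetical helper: split $msg into roughly $groups pieces with a
// balanced number of words in each piece.
function chunkWords(string $msg, int $groups = 10): array {
    $msg = preg_replace('/\s+/', ' ', trim($msg));            // collapse whitespace, as in the question
    $words = explode(' ', $msg);                              // split the string by spaces
    $perGroup = max(1, (int) ceil(count($words) / $groups));  // balanced word count per piece
    return array_map(
        fn(array $group) => implode(' ', $group),             // rebuild each text piece
        array_chunk($words, $perGroup)
    );
}

foreach (chunkWords($t) as $part) {
    echo $part . "<br/><br/><br/>";
}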

Shivanshu

From the docs:

<?php
$text = "ABCDEFGHIJK.";
$newtext = wordwrap($text,3,"\n",true);
echo "$newtext\n";
?>

OUTPUT:

ABC
DEF
GHI
JK.

Shankar Narayana Damodaran

You can do something like this; it breaks your text into equal parts. The text in $str is 20 characters long, so it is broken into 10 parts of 2 characters each.

Say your large text is 1000 characters long; you would then get 10 equal parts of 100 characters each.

<?php
$div = 10; // split equally into 10 parts (note: by characters, not by words)
$str = "abcdefghijklmnopqrst";
print_r(array_chunk(str_split($str), strlen($str) / $div));

OUTPUT:

Array
(
    [0] => Array
        (
            [0] => a
            [1] => b
        )

    [1] => Array
        (
            [0] => c
            [1] => d
        )

    [2] => Array
        (
            [0] => e
            [1] => f
        )

    [3] => Array
        (
            [0] => g
            [1] => h
        )

    [4] => Array
        (
            [0] => i
            [1] => j
        )

    [5] => Array
        (
            [0] => k
            [1] => l
        )

    [6] => Array
        (
            [0] => m
            [1] => n
        )

    [7] => Array
        (
            [0] => o
            [1] => p
        )

    [8] => Array
        (
            [0] => q
            [1] => r
        )

    [9] => Array
        (
            [0] => s
            [1] => t
        )

)
mickmackusa
  • find the offset of each "word" in the text,
  • count the words then divide by 10 to determine the desired number of words per group,
  • isolate the first offset of each group,
  • extract segments of the original text between one group's first word offset and the next group's first word offset.

Code: (Demo)

$offsets = array_keys(str_word_count($text, 2)); // offset of each word in the text
$totalPerGroup = intdiv(count($offsets), 10);    // desired number of words per group
$chunks = array_chunk($offsets, $totalPerGroup); // group the word offsets
$starts = array_column($chunks, 0);              // first word offset of each group
var_export(
    array_map(
        // extract from one group's first offset up to the next group's first offset
        fn($start, $end) => substr($text, $start, $end ? $end - $start : $end),
        $starts,
        array_slice($starts, 1) + [null]
    )
);

Sample input:

$text = <<<TEXT
The answer was within her reach. It was hidden in a box and now that box sat directly in front of her. She'd spent years searching for it and could hardly believe she'd finally managed to find it. She turned the key to unlock the box and then gently lifted the top. She held her breath in anticipation of finally knowing the answer she had spent so much of her time in search of. As the lid came off she could see that the box was empty.
TEXT;

Output:

array (
  0 => 'The answer was within her reach. It was ',
  1 => 'hidden in a box and now that box ',
  2 => 'sat directly in front of her. She\'d spent ',
  3 => 'years searching for it and could hardly believe ',
  4 => 'she\'d finally managed to find it. She turned ',
  5 => 'the key to unlock the box and then ',
  6 => 'gently lifted the top. She held her breath ',
  7 => 'in anticipation of finally knowing the answer she ',
  8 => 'had spent so much of her time in ',
  9 => 'search of. As the lid came off she ',
  10 => 'could see that the box was empty.',
)

Of course, to remove the trailing whitespace from each piece, wrap the substr() call in rtrim(), for example:
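A minimal tweak of the array_map() call above (same variables as in the demo code):

var_export(
    array_map(
        // rtrim() drops the trailing space that substr() keeps
        fn($start, $end) => rtrim(substr($text, $start, $end ? $end - $start : $end)),
        $starts,
        array_slice($starts, 1) + [null]
    )
);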