Split file after n number of non-consecutive empty lines

Question

Asked by gmtek on 08 June 2022 at 07:04 (133 views)

I am trying to split a big text file after every n empty lines. The file uses exactly one empty line as a data separator, like below:

Lorem ipsum
Lorem ipsum
Lorem ipsum

Lorem ipsum
Lorem ipsum

Lorem ipsum

Lorem ipsum
Lorem ipsum

Lorem
Lorem

...

I have tried to use csplit:

csplit data.txt /^$/ {3}

My expectation is that after 3 empty lines (not consecutive ones, but once the cursor has passed 3 empty lines) it splits the file, and then continues to do so. But it actually splits the file at each empty line.
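For context, csplit reads {3} as "repeat the previous pattern three more times", so each of the first four matches of /^$/ becomes a split point of its own. A minimal sketch of that behaviour, assuming a csplit with the POSIX -s option and the sample data above saved as data.txt:

```shell
# Recreate the sample input: five blocks separated by single empty lines.
printf '%s\n' 'Lorem ipsum' 'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem' 'Lorem' > data.txt

# /^$/ cuts just before the first empty line; {3} repeats that pattern
# three more times, so the file is cut at each of the four empty lines.
# -s only suppresses the byte counts csplit normally prints.
csplit -s data.txt '/^$/' '{3}'

# One output file per block (xx00 to xx04), not one file per three blocks.
ls xx0*
```

Every piece after the first starts with the empty line that triggered the split, which is why the repeat count cannot express "every third match".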

My expected files:

xx00

Lorem ipsum
Lorem ipsum
Lorem ipsum

Lorem ipsum
Lorem ipsum

Lorem ipsum

xx01

Lorem ipsum
Lorem ipsum

Lorem
Lorem

Any suggestions?

There are 4 answers

anubhava On 08 June 2022 at 08:21

This gawk command should also work, using an empty RS (RT is specific to GNU awk):

awk -v n=3 -v RS= '{ORS=RT; print > sprintf("xx%02d", int((NR-1)/n))}' file
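RT is only available in GNU awk. Where gawk is not installed, a rough equivalent can be sketched with paragraph mode alone, assuming it is acceptable to re-emit every separator as exactly one empty line (data.txt is a hypothetical input holding the sample from the question):

```shell
# Recreate the sample input: five blocks separated by single empty lines.
printf '%s\n' 'Lorem ipsum' 'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem' 'Lorem' > data.txt

# RS = ""     : paragraph mode, each record is one block of non-empty lines.
# ORS = "\n\n": every block is written back followed by one empty line.
# Record NR goes to chunk int((NR-1)/n), so blocks 1..3 land in xx00, etc.
awk -v n=3 'BEGIN {RS = ""; ORS = "\n\n"}
            {print > sprintf("xx%02d", int((NR-1)/n))}' data.txt

wc -l xx00 xx01
```

Unlike the RT version, this sketch always writes an empty line after the last block of each file, including the final one.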

dan On 08 June 2022 at 08:28

awk is good for this.

Split every n empty lines, naming the files with:

No leading zeroes:

awk -v n=3 '
BEGIN {f = 0}
$0 == "" {++c}
c <= n {print > ("xx" f)}
c == n {c = 0; ++f}' file

A minimum width, padded with zeroes:

awk -v n=3 -v width=2 '
BEGIN {f = sprintf("%0*d", width, 0)}
$0 == "" {++c}
c <= n {print > ("xx" f)}
c == n {c = 0; ++f; f = sprintf("%0*d", width, f)}' file

To remove the trailing empty line in each file, just change c <= n to c < n.
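As a quick check of that last remark, here is a sketch run against the sample from the question. The file name data.txt is hypothetical, and f is initialised to 0 here so that the first file gets a digit in its name as well. With c < n, the n-th empty line is counted but not printed:

```shell
# Recreate the sample input: five blocks separated by single empty lines.
printf '%s\n' 'Lorem ipsum' 'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem' 'Lorem' > data.txt

awk -v n=3 '
BEGIN    { f = 0 }               # index of the current output file
$0 == "" { ++c }                 # count empty lines within the current file
c < n    { print > ("xx" f) }    # skip only the n-th (closing) empty line
c == n   { c = 0; ++f }          # start the next file
' data.txt

wc -l xx0 xx1
```

The target of the redirection is parenthesised because an unparenthesised concatenation after > is not parsed the same way by every awk.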

RARE Kpop Manifesto On 08 June 2022 at 13:25

removed './xx00'
removed './xx01'
removed './awkprof.out'

    {m,g}awk '{
        print >> sprintf("xx%0*.f%.*s", __-(_~_),
                 int(_/__),_<_,_+=!NF) }' FS='^$' __=3

-rw-r--r--  1 501  75 Jun  8 09:19:10 2022 xx00
-rw-r--r--  1 501  37 Jun  8 09:19:10 2022 xx01


../../Desktop/testdiremptylines/

     1  Lorem ipsum
     2  Lorem ipsum
     3  Lorem ipsum
     4  
     5  Lorem ipsum
     6  Lorem ipsum
     7  
     8  Lorem ipsum
     9  

 xx00

     1  Lorem ipsum
     2  Lorem ipsum
     3  
     4  Lorem
     5  Lorem

 xx01

Renaud Pacalet On 08 June 2022 at 07:15 BEST ANSWER

With awk (tested with GNU and BSD awk):

awk -v max=3 '{print > sprintf("xx%02d", int(n/max))} /^$/ {n += 1}' file
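To see how the counter drives the file names, the command can be run against the sample from the question. This is only an illustrative sketch, with data.txt as a hypothetical file name:

```shell
# Recreate the sample input: five blocks separated by single empty lines.
printf '%s\n' 'Lorem ipsum' 'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem ipsum' '' \
              'Lorem ipsum' 'Lorem ipsum' '' \
              'Lorem' 'Lorem' > data.txt

# n is the number of empty lines seen so far; each line goes to file
# int(n/max). The print rule runs before n is incremented, so an empty
# line is still written to the file it closes.
awk -v max=3 '{print > sprintf("xx%02d", int(n/max))} /^$/ {n += 1}' data.txt

wc -l xx00 xx01
```

On this input xx00 receives the first three blocks together with their three empty lines, and xx01 receives the remaining two blocks.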
