Break text into sentences using bash


A sentence is one that ends with a period (.), an exclamation mark (!), or a question mark (?). I tried:

tr '\n' ' ' <  input | sed -e 's/[.] \s*/. \\n/g'

I see \n added in my file but the line does not really break there.

I am using bash version 3.2 on Mac OS X Mavericks.


There is 1 answer

2
AKS On

See if this works. '\012' is the newline character, written in the octal form that the tr command understands; you replace it with a blank space. sed then "captures" a full stop (.), an exclamation mark (!), or a question mark (?) using \( and \), and whichever character matched becomes available as \1. After \1 you put \n, which is a new line for sed. The sed delimiter character that I used in the following example is #.

tr '\012' ' ' < someInputFile.txt | sed 's#\([.?!]\)#\1\n#g'
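
A note for OS X: the stock sed there is BSD sed, which, as far as I know, does not turn \n in the replacement into a line break the way GNU sed does. Below is a minimal sketch of a more portable variant, assuming a POSIX sed. It ends the replacement with a backslash followed by a real line break. The function name split_sentences is made up for illustration, and the extra ` *` (which drops the spaces after each sentence) is my addition, not part of the command above.

```shell
#!/usr/bin/env bash
# split_sentences is a hypothetical helper name, not from the original post.
# Reads text on stdin and writes one sentence per line on stdout.
split_sentences() {
  # 1. tr joins all input lines into one long line.
  # 2. sed matches '.', '?' or '!' plus any spaces that follow, keeps the
  #    punctuation (\1), and inserts a line break. The break is written as
  #    a backslash followed by a literal newline, so the "/g'" part must
  #    start at column 0 of the next line.
  tr '\n' ' ' | sed 's/\([.?!]\) */\1\
/g'
}

# Example: two input lines containing three sentences.
printf 'Hello world. How are\nyou? Fine!\n' | split_sentences
```

This still splits on every period, so abbreviations such as "e.g." or decimal numbers would be broken too; treat it as a starting point rather than a complete sentence splitter.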