How to remove word from text file containing X number of characters?


I found quite a few posts here suggesting solutions using awk and sed, but none of them seems to do the job. Either the whole line is removed, or nothing at all is removed. I'm also not a command-line wizard and my knowledge is fairly limited, so I decided to ask for help here. The tool doesn't matter, whether it's awk, grep, or sed. I honestly can't tell the difference in this case, so use whatever you feel is best.

What I have is several files with a few million lines each, and the lines look something like this:

50somethingcharactergibberish shortrword
50somethingcharactergibberish shortrword
50somethingcharactergibberish shortrword
50somethingcharactergibberish shortrword
50somethingcharactergibberish shortrword
50somethingcharactergibberish shortrword

And this goes on for several million lines. What I need to do is remove the 50somethingcharactergibberish and leave only the shortword. The problem is that there is no pattern: the long word sometimes starts with a letter and sometimes with a number. So I assume I'll eventually have to count the characters.


There are 2 answers

3
Inian On BEST ANSWER

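The most minimal awk that could work for you is something like the following. It blanks the first field and prints what remains, so each output line keeps the leading space that separated the two words: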

awk '!($1="")' million-line-file
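To see the effect on input shaped like the question's, here is a small sketch. The two sample lines are made up for illustration, and it assumes every line holds exactly two whitespace-separated words. If the leading space is unwanted, printing the second field directly is one possible alternative:

```shell
# Hypothetical sample input, not taken from the original post.
sample='a1b2c3gibberish shortword
9z8y7xgibberish other'

# Assigning to $1 makes awk rebuild the record as "" + OFS + $2,
# so every output line starts with a space.
blanked=$(printf '%s\n' "$sample" | awk '!($1="")')

# Printing only the second field drops both the long word and the separator.
second=$(printf '%s\n' "$sample" | awk '{print $2}')

# Brackets make the leading space visible.
printf '[%s]\n' "$blanked" "$second"
```

Neither variant needs to know how long the first word is or which character it starts with, since awk splits on whitespace.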
0
tomc On

awk is overkill for this; try cut:

cut -f2 -d ' ' 2col.list > 2ndcol.list

This tells cut to take the second field (-f2), treating a single space as the field delimiter (-d ' '), for every row in the input file, and redirects that second field into the output file.
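A quick way to check the behaviour before running it on the real files is to pipe a couple of lines through it. The sample lines below are invented for illustration. The second command is a variant, assuming the part to keep might itself contain spaces: -f2- selects the second field through the end of the line.

```shell
# Hypothetical sample input, not taken from the original post.
sample='a1b2c3gibberish shortword
9z8y7xgibberish other'

# Second space-delimited field of each line.
second=$(printf '%s\n' "$sample" | cut -d ' ' -f2)

# Everything after the first space, in case the kept part has spaces in it.
rest=$(printf '%s\n' 'x1y2z3gibberish two words' | cut -d ' ' -f2-)

printf '%s\n' "$second" "$rest"
```

Note that cut treats every single space as a delimiter, so this relies on exactly one space between the long word and the short one.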