How to delete every X line of a very large data file?

I've got a very large .csv file, which contains 10 million lines of data. The file size is around 250 MB. Each line contains three values and looks like this:

-9.8199980e-03,183,-4.32

I want to delete every 2nd line or, for example, copy every 10th line straight into a new file. Which program should I use, and can you also post the code?

I tried it with Scilab and Excel; they either couldn't open the file or loaded only a small part of it. I can open the file in Notepad++, but when I tried to record and run a macro that deletes every 2nd line, it crashed.

There is 1 answer

Mark Setchell On

I would recommend you install gawk/awk from here and harness the power of this brilliant tool.

If you want every other line (this keeps lines 1, 3, 5, ... and drops every 2nd line):

gawk "NR%2" original.csv > new.csv

If you want every 10th line:

gawk "NR%10==0" original.csv > new.csv
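
Both cases follow one pattern, so the step size can be passed in as a variable. The following is a minimal sketch, assuming a POSIX-style shell (in Windows cmd, use double quotes around the awk program instead of single quotes). The file names `sample.csv`, `every_nth.csv` and `without_nth.csv` are made up for illustration, and the small sample file only serves to check the behaviour before running on the real data:

```shell
# Hypothetical 20-line test file; each line holds its own line number.
awk 'BEGIN { for (i = 1; i <= 20; i++) print i }' > sample.csv

# Keep every n-th line (here lines 10 and 20); n is passed in with -v.
awk -v n=10 'NR % n == 0' sample.csv > every_nth.csv

# Delete every n-th line instead, keeping all the others.
awk -v n=2 'NR % n != 0' sample.csv > without_nth.csv
```

awk reads its input one line at a time, so memory use stays small even for a 10-million-line file.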