Concatenating CSV files in bash preserving the header only once


Imagine I have a directory containing many subdirectories each containing some number of CSV files with the same structure (same number of columns and all containing the same header).

I am aware that I can run from the parent folder something like

find ./ -name '*.csv' -exec cat {} \; > ~/Desktop/result.csv

And this will work fine, except for the fact that the header is repeated each time (once for each file).

I'm also aware that I can do something like sed 1d <filename> or tail -n +<N+1> <filename> to skip the first line of a file.
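For example, on a small sample file both commands drop only the header line (the file name and contents here are illustrative):

```shell
# Create a sample file matching the header/row layout described above
# (the path is illustrative).
tmp=$(mktemp -d)
printf 'A,B,C\n1,2,3\n4,5,6\n' > "$tmp/sample.csv"

tail -n +2 "$tmp/sample.csv"   # prints: 1,2,3 then 4,5,6
sed 1d "$tmp/sample.csv"       # identical output

rm -rf "$tmp"
```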

But in my case, it seems a bit more specialised. I want to preserve the header once for the first file and then skip the header for every file after that.

Is anyone aware of a way to achieve this using standard Unix tools (like find, head, tail, sed, awk etc.) and bash?

For example input files

   /folder1
            /file1.csv
            /file2.csv
   /folder2
            /file1.csv

Where each file has the header:

A,B,C

and each file has one data row:

1,2,3

The desired output would be:

A,B,C
1,2,3
1,2,3
1,2,3

Marked As Duplicate

I feel this is different to other questions like this and this specifically because those solutions reference file1 and file2 in the solution. My question asks about a directory structure with an arbitrary number of files where I would not want to type out each file one by one.


There are 2 answers

anubhava (best answer)

You may use this find + xargs + awk:

find . -name '*.csv' -print0 | xargs -0 awk 'NR==1 || FNR>1'

The condition NR==1 || FNR>1 is true for the very first line of the combined input (the header of the first file, which is kept) and for every non-first line of each file (the data rows). The header line of every subsequent file fails both tests and is skipped.
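One caveat worth noting: if the file list is long enough that xargs has to split it across multiple awk invocations, NR resets for each invocation and the header slips through again. A plain loop over find's output avoids that; a sketch, with result.csv as an illustrative output name:

```shell
# Concatenate all CSVs under the current directory, keeping one header.
first=1
find . -name '*.csv' -print0 |
while IFS= read -r -d '' f; do
    if [ "$first" -eq 1 ]; then
        cat "$f"          # first file: keep its header
        first=0
    else
        tail -n +2 "$f"   # every later file: skip the header
    fi
done > result.csv
```

Note that find's traversal order is unspecified; pipe through `sort -z` between find and the loop if a stable file order matters.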

moo
$ {
>   cat real-daily-wages-in-pounds-engla.tsv
>   tail -n +2 real-daily-wages-in-pounds-engla.tsv
> } | cat

You can group multiple commands with { ...; } so their combined output flows through a single pipe. tail -n +2 prints every line of a file except the first, so the header appears only once.
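The same grouping idea extends to an arbitrary set of files without awk. A sketch assuming bash 4+ (for the recursive ** glob) and the directory layout from the question:

```shell
# Emit the header from the first match, then the data rows of every match.
shopt -s globstar            # bash >= 4: ** matches files in subdirectories
files=(**/*.csv)
{
    head -n 1 "${files[0]}"      # header, once
    for f in "${files[@]}"; do
        tail -n +2 "$f"          # data rows only
    done
} > result.csv
```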