Join 3 or more CSVs that all contain the same first column

I am looking to join 3 (or more) CSVs into 1 CSV. Every CSV contains the same data in its first column: column 1, line 1 is the same in all 3+ CSVs, column 1, line 2 is the same in all of them, and so on.

Sample data

csv_1:

date,a
1,b
2,c
3,d

csv_2:

date,t
1,h
2,j
3,k

csv_3:

date,q
1,w
2,e
3,r

Output CSV:

date,a,t,q
1,b,h,w
2,c,j,e
3,d,k,r

I was hoping to achieve this with Miller. It works great for combining 2 CSVs with this command:

mlr --csv join -u -j Date -f csv_1 csv_2 > output.csv

Sadly, if I add another CSV like this:

mlr --csv join -u -j Date -f csv_1 csv_2 csv_3 > output.csv

it adds 2 more columns to the end of the file.

Can this be achieved with miller?

I have googled this problem and can't find a solution. I tried the join above, but it only works for 2 files.

There are 2 answers

aborruso

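Miller's join verb takes one left file (-f) and joins it against the main input, so chain one join per extra file with then (here 01.csv, 02.csv and 03.csv stand for csv_1, csv_2 and csv_3):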

mlr --csv join -j date -f 01.csv then join -j date -f 02.csv 03.csv

and you get

date,t,a,q
1,h,b,w
2,j,c,e
3,k,d,r
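
The then chain extends to any number of files: every file except the last becomes the -f (left) input of its own join, and the last file is the main input. A minimal sketch of a helper that builds such a command line follows. The name build_join_cmd is hypothetical, and the sketch assumes at least two files whose names contain no spaces or shell metacharacters.

```shell
#!/bin/sh
# Sketch: print a chained Miller command for N CSV files sharing one key column.
# build_join_cmd is a hypothetical helper, not part of Miller.
# Usage: build_join_cmd KEY FILE1 FILE2 [FILE3 ...]
build_join_cmd() {
    key=$1
    shift
    cmd="mlr --csv"
    sep=" "
    # Every file except the last becomes the left (-f) input of one join verb.
    while [ "$#" -gt 1 ]; do
        cmd="$cmd${sep}join -j $key -f $1"
        sep=" then "
        shift
    done
    # The last file is the main input that flows through the whole chain.
    printf '%s %s\n' "$cmd" "$1"
}

build_join_cmd date 01.csv 02.csv 03.csv
```

For the three files above this prints the same command as in the answer. In the output shown there, the columns of the last -f file come first, so passing the files in a different order changes the column order of the result.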

aborruso

If all CSV files have their rows in the same order, you can paste the files side by side and then remove the duplicated first columns. In the Unix shell you can run:

paste -d ',' *.csv | mlr --csv label ref_date then cut -x -r -f "^date"

and you get

ref_date,a,t,q
1,b,h,w
2,c,j,e
3,d,k,r
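
If Miller is not available, the same side-by-side idea can be sketched with paste and awk alone. This is only an illustration under stated assumptions: every file has exactly two columns, the rows are in the same order in every file, and no field contains a quoted comma. The function name merge_two_col_csvs is hypothetical, and the sample files are the ones from the question with the header spacing normalized.

```shell
#!/bin/sh
# Sketch: merge two-column CSVs that share an identical first column.
# Assumptions: same row order everywhere, two columns per file, no quoted commas.
merge_two_col_csvs() {
    paste -d ',' "$@" |
        awk -F ',' -v OFS=',' '{
            out = $1                       # keep the shared key column once
            for (i = 2; i <= NF; i += 2)   # then each value column: 2, 4, 6, ...
                out = out OFS $i
            print out
        }'
}

# Sample data from the question, written to a temporary directory.
dir=$(mktemp -d)
printf 'date,a\n1,b\n2,c\n3,d\n' > "$dir/csv_1"
printf 'date,t\n1,h\n2,j\n3,k\n' > "$dir/csv_2"
printf 'date,q\n1,w\n2,e\n3,r\n' > "$dir/csv_3"

merge_two_col_csvs "$dir/csv_1" "$dir/csv_2" "$dir/csv_3"
```

Unlike the join approach, this never compares keys: it trusts the row order, so a single missing or reordered line in one file silently misaligns the result.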