column join variable number of columns and rearrangement

534 views Asked by At

I want to join the following 3 tables row-wise, but also while moving some columns.

 q)
newtab1:([]basecol1:10 20 30;basecol2:40 50 60;L1Col1:"a","b","c";L1Col2:"d","e","f";L1Col3:"g","h","i")
newtab2:([]basecol1:100 200 300;basecol2:400 500 600;L1Col1:"a","b","c";L1Col2:"d","e","f";L2Col1:"j","k","l";L2Col2:"m","n","o";L1Col3:"g","h","i";L2Col3:"p","q","r")
newtab3:([]basecol1:1 2 3;basecol2:4 5 6;L1Col1:"a","b","c";L1Col2:"d","e","f";L2Col1:"j","k","l";L2Col2:"m","n","o";L3Col1:"s","t","u";L3Col2:"v","w","x";L1Col3:"g","h","i";L2Col3:"p","q","r";L3Col3:"y","z","z")
q)newtab3

basecol1 basecol2 L1Col1 L1Col2 L2Col1 L2Col2 L3Col1 L3Col2 L1Col3 L2Col3 L3Col3
--------------------------------------------------------------------------------
1        4        a      d      j      m      s      v      g      p      y
2        5        b      e      k      n      t      w      h      q      z
3        6        c      f      l      o      u      x      i      r      z

That is, "Col3" for each "L" should be moved to the end

I want to fill all the columns that don't have data with zeroes. That is, I need to add 3 columns to table 1 (and move one column), and 6 columns to table 2 (and move 2 columns). The columns are always labeled with this same pattern. I started by creating the column names like this:

LegColNames:("Col1";"Col2";"Col3")
L1LegColNames: "L1",/:legColNames

but this makes them a string, how do I get those to be column names?

FWIW, in the real implementation, this will be joining hundreds of tables, and I'll need to take the table with the maximum number of Ls, and add the appropriate numbers of columns of zeroes to each table to make it equal that maximum. It can go up as high as L13.

2

There are 2 answers

2
Ryan Hamilton On

Always break these things down into steps:

q)t:uj/[(newtab1;newtab2;newtab3)]  / join all tables

q)t2:0^t / this would fill the number columns
q)t2:flip {$[10h=type x;"0"^x;x]} each flip t / fill character columns with zeros

q){c:cols x; (c iasc c like "*Col3") xcols x}t2 / pull our columns to the end

basecol1 basecol2 L1Col1 L1Col2 L2Col1 L2Col2 L3Col1 L3Col2 L1Col3 L2Col3 L3Col3
--------------------------------------------------------------------------------
10       40       a      d      0      0      0      0      g      0      0
20       50       b      e      0      0      0      0      h      0      0
30       60       c      f      0      0      0      0      i      0      0
100      400      a      d      j      m      0      0      g      p      0
200      500      b      e      k      n      0      0      h      q      0
300      600      c      f      l      o      0      0      i      r      0
1        4        a      d      j      m      s      v      g      p      y
2        5        b      e      k      n      t      w      h      q      z
3        6        c      f      l      o      u      x      i      r      z

The last part is the only bit that will be more difficult for many more tables. You'll want to parse the ending number and then find the sorted indices of the columns.

  • Ryan
2
WooiKent Lee On

I'm starting to think that your fields are not char type but string type (list of char list) since you pointed out that "" is one of the value of the field. If that's the case, using suggestion from @Ryan then:

q)newtab1:([]basecol1:string 10 20 30;basecol2:string 40 50 60;L1Col1:string 3?`2;L1Col2:string 3?`2;L1Col3:string 3?`2)
q)newtab2:([]basecol1:string 100 200 300;basecol2:string 400 500 600;L1Col1:string 3?`2;L1Col2:string 3?`2;L2Col1:string 3?`2;L2Col2:string 3?`2;L1Col3:string 3?`2;L2Col3:string 3?`2)
q)newtab3:([]basecol1:string 1 2 3;basecol2:string 4 5 6;L1Col1:string 3?`2;L1Col2:string 3?`2;L2Col1:string 3?`2;L2Col2:string 3?`2;L3Col1:string 3?`2;L3Col2:string 3?`2;L1Col3:string 3?`2;L2Col3:string 3?`2;L3Col3:string 3?`2)
q)t:uj/[(newtab1;newtab2;newtab3)]

q)@[`t;cols t;{?[x like "";count[x]#enlist 1#"0";x]}]      / using vector condition

q){c:cols x; (c iasc c like "*Col3") xcols x}t
basecol1 basecol2 L1Col1 L1Col2 L2Col1 L2Col2 L3Col1 L3Col2 L1Col3 L2Col3 L3Col3
--------------------------------------------------------------------------------
"10"     "40"     "do"   "ma"   ,"0"   ,"0"   ,"0"   ,"0"   "gm"   ,"0"   ,"0"
"20"     "50"     "nc"   "bo"   ,"0"   ,"0"   ,"0"   ,"0"   "kp"   ,"0"   ,"0"
"30"     "60"     "df"   "mn"   ,"0"   ,"0"   ,"0"   ,"0"   "ec"   ,"0"   ,"0"
"100"    "400"    "ll"   "fp"   "ml"   "ce"   ,"0"   ,"0"   "ch"   "ho"   ,"0"
"200"    "500"    "cn"   "ol"   "fm"   "ij"   ,"0"   ,"0"   "bo"   "of"   ,"0"
"300"    "600"    "pk"   "pf"   "ii"   "ap"   ,"0"   ,"0"   "ed"   "ii"   ,"0"
,"1"     ,"4"     "jo"   "dh"   "cm"   "om"   "oa"   "ak"   "ap"   "no"   "kj"
,"2"     ,"5"     "kg"   "mh"   "ll"   "jk"   "nn"   "og"   "nn"   "gc"   "ig"
,"3"     ,"6"     "nh"   "mg"   "ob"   "fl"   "mi"   "lf"   "am"   "jo"   "nf"

0^ and "0"^ won't work on empty string:

q)"0"^""
""