Concatenate the rows based on number (Google Refine, Excel/Google Spreadsheet)

Question

Concatenate the rows based on number (Google Refine, Excel/Google Spreadsheet)

321 views Asked by armando85 At 20 December 2016 at 22:44

I have large amount of rows on a csv file, which look like:

name a,1
name b,1
name c,1
name d,2
name e,2

I need to concatenate the rows based on number. Result should be:

name a|name b|name c
name d|name e

How can I do it in Google Refine or in Excel/Google Spreadsheet?

I am thinking of it, but with no solution.

Thank you a lot!

Original Q&A

There are 2 answers

arkottke On 21 December 2016 at 00:32

If you can use Python it would be pretty easy to do this manipulation. In the the code below, the name and group are read from "input.csv", and the grouped names (along with the group) are written to "output.csv". A defaultdict is used to create empty lists to store the group members.

import collections
import csv

grouped = collections.defaultdict(list)
with open('input.csv') as fp:
    reader = csv.reader(fp)
    for row in reader:
        name, group = row
        grouped[group].append(name)


with open('output.csv', 'w', newline='') as fp:
    writer = csv.writer(fp, delimiter='|')
    for key in sorted(grouped.keys()):
        writer.writerow([key] + grouped[key])

**Ettore Rizza** · Accepted Answer · 2016-12-21T10:27:52+00:00

Here is a proposal with Open refine. The only Grel formula i used is:

row.record.cells['myColumn'].value.join('|')

screencast

And here is the JSOn, assuming that your first column is named "myColumn" and the second "number" :

[
  {
    "op": "core/column-addition",
    "description": "Create column test at index 2 based on column number using expression grel:value",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "newColumnName": "test",
    "columnInsertIndex": 2,
    "baseColumnName": "number",
    "expression": "grel:value",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-move",
    "description": "Move column test to position 0",
    "columnName": "test",
    "index": 0
  },
  {
    "op": "core/blank-down",
    "description": "Blank down cells in column test",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "columnName": "test"
  },
  {
    "op": "core/column-addition",
    "description": "Create column concatenation at index 2 based on column myColumn using expression grel:row.record.cells['myColumn'].value.join('|')",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "omitError": false,
          "expression": "isBlank(value)",
          "selectBlank": false,
          "selection": [
            {
              "v": {
                "v": false,
                "l": "false"
              }
            }
          ],
          "selectError": false,
          "invert": false,
          "name": "ee",
          "omitBlank": false,
          "type": "list",
          "columnName": "ee"
        }
      ]
    },
    "newColumnName": "concatenation",
    "columnInsertIndex": 2,
    "baseColumnName": "myColumn",
    "expression": "grel:row.record.cells['myColumn'].value.join('|')",
    "onError": "set-to-blank"
  }
]

TechQA.

Concatenate the rows based on number (Google Refine, Excel/Google Spreadsheet)

There are 2 answers

Related Questions in EXCEL

Related Questions in OPENREFINE

Popular Questions

Popular Tags

Trending Questions