Powerquery: how to import multiple text files from a folder in separate columns in the same excel sheet

1.2k views Asked by At

I want to import multiple text files from a folder, each file containing two columns, into a single excel sheet, so that every new file starts in a new column. Ideally, I need the two columns from the first file and only the second column from every additional text file. In powerquery, I tried to use the "Import From Folder (Import metadata and links about files in a folder)" functionality followed by query editor and expanding the binaries and the result was that every new file was appended at the end of the previous one. But I want every file to start a new column in the same sheet and I don't know how to do that.

How can I direct powerquery to do that? Thanks in advance for your help!

1

There are 1 answers

0
MarcelBeug On

My proposal includes 2 rather difficult steps added via the advanced editor, but it is dynamic with regard to the number of .txt files in the folder. I added a ton of comments so it should be self explanatory.

/*  In this query, .txt files from a folder are combined.
Each source file has 2 columns.
The resulting table consists of both columns from the first file and each second column from the other files.
Tables are joined using each first column as key and with a left outer join
It is assumed that each file has column headers in the first row, that the first column header is the same for each file
    and, preferably, the second column header differs per file, although this is not necessary.
This query is tested with the following file contents:

File1.txt:
    ID,File1
    1,A
    2,B
    3,C
    4,D

File2.txt:
    ID,File2
    1,W
    2,X
    3,Y

Another file was added later on, to test for .txt files being added to the folder: works fine!
*/    

let
// Standard UI:
Source = Folder.Files("C:\Users\Marcel\Documents\Forum bijdragen\StackOverflow Power Query\Multiple files in 1 folder"),

// Standard UI; step renamed
FilteredTxt = Table.SelectRows(Source, each [Extension] = ".txt"),

// Standard UI; step renamed
RemovedColumns = Table.RemoveColumns(FilteredTxt,{"Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),

// UI add custom column "FileContents" with formula Csv.Document([Content]); step renamed
AddedFileContents = Table.AddColumn(RemovedColumns, "FileContents", each Csv.Document([Content])),

// Standard UI; step renamed
RemovedBinaryContent = Table.RemoveColumns(AddedFileContents,{"Content"}),

// In the next 3 steps, temporary names for the new columns are created ("Column2", "Column3", etcetera)
// Standard UI: add custom Index column, start at 2, increment 1
#"Added Index" = Table.AddIndexColumn(RemovedBinaryContent, "Index", 2, 1),

// Standard UI: select Index column, Transform tab, Format, Add Prefix: "Column"
#"Added Prefix" = Table.TransformColumns(#"Added Index", {{"Index", each "Column" & Text.From(_, "en-US"), type text}}),

// Standard UI:
#"Renamed Columns" = Table.RenameColumns(#"Added Prefix",{{"Index", "ColumnName"}}),
// Now we have the names for the new columns

// Advanced Editor: create a list with records with FileContents (tables) and ColumnNames (text) (1 list item (or record) per txt file in the folder)
// From this list, the resulting table will be build in the next step.
ListOfRecords = Table.ToRecords(#"Renamed Columns"),

// Advanced Editor: use List.Accumulate to build the table with all columns, 
// starting with Column1 of the first file (Table.FromList(ListOfRecords{0}[FileContents][Column1], each {_}),)
// adding Column2 of each file for all items in ListOfRecords.
BuildTable = List.Accumulate(ListOfRecords,
                             Table.FromList(ListOfRecords{0}[FileContents][Column1], each {_}),
                             (TableSoFar,NewColumn) => 
                              Table.ExpandTableColumn(Table.NestedJoin(TableSoFar, "Column1", NewColumn[FileContents], "Column1", "Dummy", JoinKind.LeftOuter), "Dummy", {"Column2"}, {NewColumn[ColumnName]})),

// Standard UI
#"Promoted Headers" = Table.PromoteHeaders(BuildTable)

 in

    #"Promoted Headers"