Is there a way to perform a cross join on two tables in Excel using VBA/Power Query?

162 views Asked by At

What I have currently are two worksheets: one that is a rolling list of company projects that will be added to every so often, and another that provides a list of tasks that are associated with every project (about 110 tasks). My ideal state is where our administrator adds a new project to the project sheet, I want to run a script or process that adds all new projects (denoted by [project id]) to a master sheet (where every time this runs, it will append the new data to the bottom of the records), and makes a row for every task, with the task in an adjacent column. [edit] - This process will need to be automated, either by way of scheduled refresh or a button/trigger initiated by the recording of the new project(s).

Project Table

Project ID Attribute
PID1 XXXXX
PID2 XXXXX
PID3 XXXXX

Task Table

Checkpoint Task
CP1 Task 1
CP2 Task 2
CP3 Task 3

New Table

Project ID Checkpoint Task
PID1 CP1 Task 1
PID1 CP2 Task 2
PID1 CP3 Task 3
PID2 CP1 Task 1
PID2 CP2 Task 2
PID2 CP3 Task 3

I wanted to run SQL to get this done, because that would be done easily, but I cannot figure out how to do that without using Power Automate. However, I can't use PA because these files are on SharePoint and that requires a different license than what I have.

I tried using Power Query, but since I do not have any common columns, I can't get the correct columns to populate to a flat table. I also tried Power Automate but ran into the licensing issue. I have entertained the idea of using Access but have not done anything with that yet.

2

There are 2 answers

4
Rory On

Load both tables as queries. Then in one add a custom column and for the formula simply refer to the other query (e.g. if you are in Projects, use =Tasks as the formula). That will create a table for each row. Then simply expand the table using the double-headed arrow at the top of the column.

1
Sam Nseir On

This answer also uses Power Query and is very similar to the previous answer, but this one only appends new rows to the main table while retaining any additional columns with data entry.

Queries in Power Query:

Query Load To …
tblProject Connection only
tblTask Connection only
tblMain Existing Connection only
tblMain Table

Queries tblProject, tblTask, tblMain Existing - are simple and pull in the respective table in the worksheets.

Query tblMain is the one that does all the work and it is also the one that loads in as Table. tblMain Existing loads in this same table from the worksheet. Steps of tblMain:

  • Reference tblProject
  • Left Anti (rows only in the first) merge with tblMain Existing on ProjectID. This will leave you with only the new projects added since last refresh.
  • Remove column of the merge
  • Add a new column Tasks referencing tblTask (same as previous answer)
  • Expand Tasks
  • Append tblMain - Existing with the previous step
    = Table.Combine({#"tblMain Existing", #"Expanded Tasks"})

I've created a sample spreadsheet: https://1drv.ms/x/s!AtpcKzRsDrY1o7Vk6bgpWE8AH7wTqw?e=umWQ8x