union data in multiple tabs


I'm using the Google Analytics add-on for Sheets. The particular query I'd like to run returns 70k rows of data. Since Google Analytics only returns a maximum of 10k rows at a time, I broke the query out into 7 queries with start indexes of 1, 10,001, 20,001, etc.

So now I have 7 tabs of data each with 10k rows.

I would like to join them all together in one big table. I've looked into the =query() function and fiddled with it with no success. I've also looked at =join() but also no success.

Even if I was successful with those options I'm not sure they are the "best" ones either.

Thanks to SO I have some limited experience with Google Apps Script (but altogether pretty basic). I'm not sure if I should try to figure out how to do this with a script or if there's a handy built-in function.

I'm also conscious of breaching my sheet's cell count limit of 200k.

For example, I already have the data spread across the 7 tabs. It seems inefficient to bring it all together into 1 big table, since that duplicates the data.

My end goal is to query the data as a group using, for example, the =sum(filter()) and/or =query() functions. I need to slice and dice this data in several ways, but it's currently spread across 7 tables.

  1. How can I join the tabs into one big table?
  2. Is one big table a good idea, and if not, is there a better way?

There are 2 answers

0
pointNclick On

From your description, I feel you're not using join() correctly. It is not the same as a database join; it concatenates the cells of a row (or column) into a single string, separated by a delimiter.
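To make the difference concrete, here is a minimal Apps Script sketch that contrasts a string join with stacking the rows of several tabs into one table. It is only an illustration: `stackTables` and `unionTabs` are hypothetical helper names, the tab names are whatever you pass in, and it assumes every tab has the same columns with a header in row 1.

```javascript
// Pure helper: stack several 2D arrays (the shape Range.getValues() returns)
// into one table, keeping the header row from the first table only.
function stackTables(tables) {
  if (tables.length === 0 || tables[0].length === 0) return [];
  const header = tables[0][0];
  const body = tables.flatMap(function (table) {
    return table.slice(1); // drop each tab's own header row
  });
  return [header].concat(body);
}

// Apps Script wrapper (assumption: tabNames and targetName are your own tab names).
// Only runs inside Google Sheets, where SpreadsheetApp is available.
function unionTabs(tabNames, targetName) {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const tables = tabNames.map(function (name) {
    return ss.getSheetByName(name).getDataRange().getValues();
  });
  const combined = stackTables(tables);
  if (combined.length === 0) return;
  const target = ss.getSheetByName(targetName) || ss.insertSheet(targetName);
  target.clearContents();
  target.getRange(1, 1, combined.length, combined[0].length).setValues(combined);
}

// For contrast: joining a row produces one string, not a table.
const joined = ["2015-01-01", "organic", 42].join(",");
```

Note that writing the combined table to a new tab doubles the number of cells in use, which matters for the cell limit mentioned in the question.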

The 200K cell limit is not the only concern: script execution time could also be a problem, depending on how much data each sheet holds. With 20 columns per row, a single sheet of 10K rows already reaches the 200K limit on its own. At this point I would draw your attention to the Apps Script Best Practices, specifically using the Cache Service to reduce the turnaround time of subsequent calls. As the documentation notes, the first call still takes the same amount of time, but all later calls are much faster.
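The cell arithmetic is easy to check with a quick sketch. The 20-column figure is only the hypothetical used above, and 200,000 is the limit quoted in the question; `cellCount` is a made-up helper name.

```javascript
// Cells used = rows x columns x tabs.
function cellCount(rows, columns, tabs) {
  return rows * columns * tabs;
}

const CELL_LIMIT = 200000; // limit quoted in the question

// One tab of 10K rows and 20 columns already uses the whole budget.
const perTab = cellCount(10000, 20, 1);

// All 7 tabs at 20 columns would need seven times that.
const allTabs = cellCount(10000, 20, 7);

// Most columns that fit if all 70K rows must stay under the limit.
const maxColumns = Math.floor(CELL_LIMIT / (10000 * 7));
```

Under these assumptions, 70K rows fit within the limit only if each row has at most 2 columns, and that is before any combined copy of the data is made.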

With 7 sheets and a lot of data to go through, the Cache Service helps because you can query the cached data instead of re-reading the sheets. Cached entries expire after 10 minutes by default, but with putAll(values, expirationInSeconds) you can keep them available for up to 21600 seconds (6 hours).

Write a function that caches all of your data, and use the service to fetch whatever you need. I feel that is your best bet in your case. Also, this might be a helpful link for some of the FAQ related to Apps Script.
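A caching function along those lines might look like the sketch below. Treat it as a starting point under stated assumptions rather than a finished implementation: `buildCacheEntries`, `cacheTab` and `readCachedTab` are hypothetical names, and the chunk size of 500 rows is an arbitrary guess chosen because the Cache Service stores strings and limits the size of each value, so a whole tab will not fit under one key.

```javascript
// Pure helper: split rows into chunks and serialise each chunk to JSON,
// keyed as "<prefix>_<chunkIndex>". Cache values must be strings.
// Note: Date cells become ISO strings after JSON.stringify.
function buildCacheEntries(rows, chunkSize, prefix) {
  const entries = {};
  for (let i = 0; i < rows.length; i += chunkSize) {
    const key = prefix + "_" + (i / chunkSize);
    entries[key] = JSON.stringify(rows.slice(i, i + chunkSize));
  }
  return entries;
}

// Apps Script only: read one tab and cache it for 6 hours (21600 seconds).
// Returns the keys so the caller can fetch the chunks later.
function cacheTab(tabName) {
  const rows = SpreadsheetApp.getActiveSpreadsheet()
    .getSheetByName(tabName)
    .getDataRange()
    .getValues();
  const entries = buildCacheEntries(rows, 500, tabName); // 500 is an assumption
  CacheService.getScriptCache().putAll(entries, 21600);
  return Object.keys(entries);
}

// Apps Script only: rebuild the rows from cached chunks.
// Returns null if any chunk is missing, so the caller knows to re-read the sheet.
function readCachedTab(keys) {
  const cached = CacheService.getScriptCache().getAll(keys);
  let rows = [];
  for (const key of keys) {
    if (!(key in cached)) return null; // expired or evicted
    rows = rows.concat(JSON.parse(cached[key]));
  }
  return rows;
}
```

Calling `cacheTab` once per tab and keeping the returned keys lets later runs call `readCachedTab` instead of reading the sheets again, which is where the faster turnaround comes from.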

4
Tim On

You are using the wrong tool for the job. Import all the sheets into Google Fusion Tables, then use a single common key field (create one before uploading if it doesn't already exist) to join them in a new merged table.

It will execute much faster, and you will be able to filter the results and select the columns with a View, then download the output into a new spreadsheet, which you can also upload as a Google Sheet :)