We have a lot of content spread across multiple Word/Excel files. Currently, a workflow generates a huge script file that is then executed on InDesign Server to place items, add pages, etc. So it's all done with scripting and XML import for the data. This works fine, but it is slow and does not scale well. We are now exploring whether we can automate the generation of a publication using IDML and skip scripting as much as possible, instead handing InDesign a file that already contains all the content, fully mapped to the correct styles.
The document size ranges from 100 to 500 pages per document. Each document consists primarily, but not exclusively, of text and tables. The goal is a book containing multiple documents so we can use a global TOC and global page numbers.
Here are the current steps we have in our prototype:
- We have an INDD layout that we export as IDML, with a primary text frame where the content needs to go
- We have a variable number of Word documents in a specific order that need to be added to the document (problem: the number of pages is unknown and changes all the time)
- The Word documents contain text with styles (1:1 mapping to InDesign styles, so it's quite easy to generate paragraphs in the IDML XML format)
- In the same Word files, tables are referenced via placeholders that mark where each table needs to be inserted in the content flow
- The data for the tables lives in Excel files; generating tables for IDML is straightforward too
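To make the mapping concrete, here is a minimal sketch of what our generator emits. The element names follow the IDML schema (`ParagraphStyleRange`, `CharacterStyleRange`, `Table`, `Cell`, etc.), but the style names, `Self` IDs, and attribute sets here are simplified placeholders for illustration — a real exporter should be validated against an IDML file exported from InDesign itself.

```python
# Sketch: one Word paragraph -> one IDML ParagraphStyleRange, and Excel
# rows -> a minimal IDML Table. Attribute coverage is deliberately
# incomplete; style names like "Heading1"/"TableBody" are placeholders.
import xml.etree.ElementTree as ET

def paragraph_xml(style_name: str, text: str) -> ET.Element:
    """Build a ParagraphStyleRange for a paragraph mapped 1:1 to an ID style."""
    psr = ET.Element(
        "ParagraphStyleRange",
        AppliedParagraphStyle=f"ParagraphStyle/{style_name}",
    )
    csr = ET.SubElement(
        psr, "CharacterStyleRange",
        AppliedCharacterStyle="CharacterStyle/$ID/[No character style]",
    )
    ET.SubElement(csr, "Content").text = text
    ET.SubElement(csr, "Br")  # paragraph break
    return psr

def table_xml(rows, table_id="u100") -> ET.Element:
    """Build a minimal IDML Table from a list of row tuples."""
    n_rows, n_cols = len(rows), len(rows[0])
    table = ET.Element(
        "Table", Self=table_id,
        HeaderRowCount="0", FooterRowCount="0",
        BodyRowCount=str(n_rows), ColumnCount=str(n_cols),
    )
    for c in range(n_cols):
        ET.SubElement(table, "Column", Self=f"{table_id}Column{c}", Name=str(c))
    for r in range(n_rows):
        ET.SubElement(table, "Row", Self=f"{table_id}Row{r}", Name=str(r))
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            cell = ET.SubElement(
                table, "Cell",
                Self=f"{table_id}Cell{c}:{r}", Name=f"{c}:{r}",
            )
            cell.append(paragraph_xml("TableBody", str(value)))
    return table

print(ET.tostring(paragraph_xml("Heading1", "Overview"), encoding="unicode"))
```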
Our current strategy is to fill a story that is linked to the primary text frame with Auto Text Flow enabled. That lets us dump all the content into a single IDML Story file. We then trigger a recompose on the document when we execute our generator script on InDesign Server, so InDesign calculates the text flow and generates pages for the overflowing content. So far so good: this usually works very well, and compared to our previous workflow it is quite fast.
But while testing a couple of different layouts we made some observations:
- Single-column layouts are no issue at all, and generating the PDF/INDD files is also fast enough on InDesign Server
- A two-column layout without tables is slower, but not by much
- Adding tables into the mix with a two-column layout is where the performance issues start
To summarize the performance issues:
- Without any style information, InDesign will (obviously) try to squish the tables into the first column
- For testing purposes, we added a "span" style on the parent paragraph of each table that spans both columns and forces the text to flow properly. It helps with table/text overlapping and made some things a little faster
- The biggest issue we see is with full-page tables spanning multiple pages: we manage to crash InDesign locally on our machines, and InDesign Server at some point just gives up
- To combat overflowing tables, we added a very simple calculation that makes sure each table fits onto a page (in combination with the previously mentioned "span" paragraph style)
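A rough sketch of those two workarounds, for anyone trying the same thing. The `SpanColumnType` attribute is our reading of the IDML schema and should be verified against an exported IDML; the page/row measurements are placeholder values, and the fit check is only a heuristic because real row heights depend on content and composition.

```python
# (a) a paragraph style whose SpanColumnType makes the table's anchor
#     paragraph span both columns; (b) a crude pre-split so no table
#     chunk exceeds one page. All numbers are illustrative assumptions.
import xml.etree.ElementTree as ET

def span_style(name="TableAnchor"):
    # Attribute name/value taken from our reading of the IDML schema;
    # verify against an IDML exported from a hand-built document.
    return ET.Element(
        "ParagraphStyle",
        Self=f"ParagraphStyle/{name}", Name=name,
        SpanColumnType="SpanColumns",
    )

PAGE_BODY_HEIGHT_PT = 720.0   # usable type area height (placeholder)
ROW_HEIGHT_PT = 14.0          # assumed minimum body-row height
HEADER_HEIGHT_PT = 18.0       # assumed header-row height

def split_rows(rows):
    """Yield chunks of rows that should each fit on one page (heuristic)."""
    per_page = int((PAGE_BODY_HEIGHT_PT - HEADER_HEIGHT_PT) // ROW_HEIGHT_PT)
    for i in range(0, len(rows), per_page):
        yield rows[i:i + per_page]

print([len(c) for c in split_rows(list(range(120)))])  # → [50, 50, 20]
```

Splitting one oversized table into several per-page tables up front means InDesign never has to compose a table across a page boundary, which is exactly the case that crashed for us.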
We have checked a couple of other things:
- Made sure the paragraph composer in the table cells is set to the Adobe Single-line Composer
- Disabled preflight
- Made sure the Typical display performance setting is active (on the local InDesign installation, where it applies)
- Added enableRedraw = false; this did not change anything performance-wise on InDesign Server (was to be expected, but still, thanks for the hint :))
- Based on other feedback, we added more pages with empty text frames, so the content is now split across these additional frames/stories instead of a single big one. The performance got worse: it now takes roughly three times as long as before
I am very interested in learning more about performance optimization, settings that should be verified, and general thoughts about our process, and whether there is something else in InDesign that could help us. The biggest issue IMHO is the dynamic number of pages resulting from an unknown (and large) amount of content. I am not sure whether performance would be better if we could split the content into more stories so there isn't just one big one, but I have no idea how we could identify those page breaks.
Thank you very much for reading this and your thoughts!
Update: In case anyone else runs into such an issue: we managed to solve almost all performance issues by wrapping each table in a text frame and moving the actual table content into its own story. There are other issues, but at the very least performance is much better now.
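For completeness, a sketch of how that fix looks in the generated IDML: instead of placing the `Table` inline in the main story, we anchor a `TextFrame` in the flow and give it its own `Story` containing just the table. The `Self`/`ParentStory` IDs and the attribute set are placeholders; anchored-object settings are omitted entirely, so compare against an IDML exported from a hand-built document to get the exact attributes.

```python
# Sketch: anchor a TextFrame in the main story's character range and
# move the table into that frame's own story (a separate
# Stories/Story_<id>.xml in the IDML package). Attributes simplified.
import xml.etree.ElementTree as ET

def anchored_table_frame(anchor_csr: ET.Element,
                         frame_id: str, story_id: str) -> ET.Element:
    """Anchor a TextFrame in the flow that points at its own story."""
    return ET.SubElement(
        anchor_csr, "TextFrame",
        Self=frame_id, ParentStory=story_id,
    )

def table_story(story_id: str, table: ET.Element) -> ET.Element:
    """A separate Story holding just the table."""
    story = ET.Element("Story", Self=story_id)
    psr = ET.SubElement(
        story, "ParagraphStyleRange",
        AppliedParagraphStyle="ParagraphStyle/TableBody",
    )
    csr = ET.SubElement(
        psr, "CharacterStyleRange",
        AppliedCharacterStyle="CharacterStyle/$ID/[No character style]",
    )
    csr.append(table)
    return story

csr = ET.Element("CharacterStyleRange")
anchored_table_frame(csr, "u200", "u201")
story = table_story("u201", ET.Element("Table", Self="u300"))
print(story.find(".//Table").get("Self"))  # → u300
```

Our guess as to why this helps: each table now composes inside its own small story rather than participating in the recompose of one 500-page story, so a change in one table no longer forces InDesign to reflow everything after it.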