How to browse through a TTML file and extract all the time/caption pairs into a JSON file

I have a TTML file that contains video captions. I want to go through all the time/caption pairs and place them into a JSON file. I tried https://www.npmjs.com/package/ttml?activeTab=readme, but it did not work. Any ideas? Thank you.

There are 2 answers

0
Pierre-Anthony Lemieux On

For folks who prefer Python, ttconv can split TTML/IMSC documents into a series of Intermediate Synchronic Documents (ISDs), each one corresponding to a period of time during which the content of the TTML/IMSC document is static.

import ttconv.imsc.reader
import ttconv.isd
import xml.etree.ElementTree as et

tt_doc = """<?xml version="1.0" encoding="UTF-8"?>
  <tt xml:lang="fr" xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="1s" end="2s">Hello</p>
      <p begin="3s" end="4s">Bonjour</p>
    </div>
  </body>
  </tt>"""

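# parse the TTML/IMSC document into the ttconv data model
m = ttconv.imsc.reader.to_model(et.ElementTree(et.fromstring(tt_doc)))

# times at which the presented content changes
st = ttconv.isd.ISD.significant_times(m)

for t in st:
  # snapshot of the content presented from time t until the next significant time
  isd = ttconv.isd.ISD.from_model(m, t)

  # convert ISD to JSON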

ttconv also supports conversion from TTML/IMSC to SRT, which is a simple text-based format. All styling information is lost in that conversion, however.

tt.py convert -i <input .ttml file> -o <output .srt file> --otype SRT --itype TTML
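
If installing ttconv is not an option and the document is as simple as the sample above, the time/caption pairs can be pulled out with the Python standard library alone. The following is only a rough sketch under strong assumptions: every p element carries its own begin and end, times are plain offsets in seconds such as "1s", and no timing is inherited from parent elements. The helper names parse_seconds and flat_captions are made up for this illustration and are not part of ttconv.

```python
import json
import xml.etree.ElementTree as ET

TTML_NS = "{http://www.w3.org/ns/ttml}"


def parse_seconds(value):
    """Parse offset times such as '1s' or '2.5s' only (assumption of this sketch)."""
    if not value.endswith("s") or value.endswith("ms"):
        raise ValueError(f"unsupported time expression: {value!r}")
    return float(value[:-1])


def flat_captions(ttml_text):
    """Return one dict per <p> element that has both begin and end attributes."""
    root = ET.fromstring(ttml_text)
    captions = []
    for p in root.iter(TTML_NS + "p"):
        begin, end = p.get("begin"), p.get("end")
        if begin is None or end is None:
            continue  # timing inherited from a parent is not handled here
        captions.append({
            "begin": parse_seconds(begin),
            "end": parse_seconds(end),
            "text": "".join(p.itertext()).strip(),
        })
    return captions


# Same content as the sample document above, without the XML declaration
tt_doc = """<tt xml:lang="fr" xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="1s" end="2s">Hello</p>
      <p begin="3s" end="4s">Bonjour</p>
    </div>
  </body>
</tt>"""

captions = flat_captions(tt_doc)
as_json = json.dumps(captions, ensure_ascii=False, indent=2)
print(as_json)
```

Anything beyond this flat case (nested timing, clock times, frames, regions) is better left to a real TTML processor such as ttconv.
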
2
Nigel Megitt On

Try looking at https://github.com/sandflow/imscJS for code that extracts the Intermediate Synchronic Documents (ISDs) - e.g. the file isd.js may be relevant.

By the way, it's worth noting that the data model in TTML doesn't exactly match the idea of a mapping between pairs of times and individual captions. You may get duplications.

Each ISD is a snapshot between two moments on the timeline in which the presented content does not change.

This is an important distinction because in TTML it is possible to have the same "caption" appear at times that overlap with other captions appearing and disappearing, for example:

...
<div begin="10s" end="20s">
  <p>This text appears at 10s and disappears by 20s</p>
  <p end="5s">This text appears at 10s and disappears by 15s</p>
  <p begin="5s">This text appears at 15s and disappears by 20s</p> 
</div>
...

So the result in ISDs is:

0->10s: [nothing]

10s->15s:
This text appears at 10s and disappears by 20s
This text appears at 10s and disappears by 15s

15s->20s:
This text appears at 10s and disappears by 20s
This text appears at 15s and disappears by 20s

20s->: [nothing]

As you can see, the first line appears in two ISDs. It's up to you how your application deals with this, of course.
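
To make the duplication concrete, here is a small Python sketch of the slicing idea applied to the example above. It is not how imscJS or any other library implements ISD generation; the function name synchronic_intervals is hypothetical, and the begin/end times are assumed to have already been resolved by hand to absolute seconds (the div's begin of 10s added to each child's offset).

```python
def synchronic_intervals(captions):
    """Slice the timeline at every begin/end time and list the captions
    active in each slice.

    captions: list of (begin, end, text) tuples with absolute times in seconds.
    """
    # Every begin or end time is a point where the presented content may change
    times = sorted({t for begin, end, _ in captions for t in (begin, end)})
    slices = []
    for start, stop in zip(times, times[1:]):
        # A caption is active in a slice if it covers the whole slice
        active = [text for begin, end, text in captions
                  if begin <= start and stop <= end]
        slices.append({"begin": start, "end": stop, "captions": active})
    return slices


# Absolute times resolved by hand from the <div begin="10s" end="20s"> example
captions = [
    (10, 20, "This text appears at 10s and disappears by 20s"),
    (10, 15, "This text appears at 10s and disappears by 15s"),
    (15, 20, "This text appears at 15s and disappears by 20s"),
]

isds = synchronic_intervals(captions)
for isd in isds:
    print(isd["begin"], "->", isd["end"], isd["captions"])
```

The sketch only covers the span between the first and last time, so the empty periods before 10s and after 20s are not emitted. The first caption shows up in both slices, which is the duplication described above; an application that wants one entry per caption would need to merge adjacent slices that repeat the same text.
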