TL;DR: I want to hack the internals of CKEditor so that it produces an alternate format for rich text (instead of HTML), and I'd like an opinion from an expert regarding the feasibility of that idea.
I'm working on a project that requires a collaborative rich-text editor (like Google Docs), and I'm planning to use an operational transformation (OT) library, ShareJS, to implement it. But OT is difficult to implement on top of HTML because of tag-nesting rules. For example, a naive OT implementation would be prone to producing this kind of garbage:
<b>overlapping bold <i>and</b> italic text.</i>
The correct way to represent such text in HTML would be something more like this:
<b>overlapping bold </b>
<b><i>and</i></b>
<i> italic text.</i>
Or better yet, something like this:
<span class="bold">overlapping bold </span>
<span class="bold italic">and</span>
<span class="italic"> italic text.</span>
But, to get those kinds of representations, the OT implementation needs to know all the rules of HTML tag-nesting and how to correct erroneous merges.
I've been thinking about a possible solution using an alternate form of markup that doesn't enforce tag-nesting rules at all. Something like this:
BOLD (start: 0, length: 20)
ITALIC (start: 17, length: 16)
TEXT:overlapping bold and italic text.
Using a format like that, I could use a plain-vanilla OT library to manage the ongoing diff/rebase/merge operations, and then transform the resultant document into HTML at the last moment before updating the GUI on both sides of the collaboration.
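To make that last step concrete, here is a rough TypeScript sketch of the "transform into HTML at the last moment" part. The names `FormatSpan`, `RichDoc` and `renderHtml` are made up for illustration, and it assumes style names are plain identifiers that are safe to use as CSS classes. It only shows the idea; it is not wired into CKEditor or ShareJS.

```typescript
// Hypothetical types for the range-based format sketched above.
interface FormatSpan {
  style: string;  // e.g. "bold" or "italic" (assumed to be a safe class name)
  start: number;  // offset of the first covered character
  length: number; // number of covered characters
}

interface RichDoc {
  text: string;
  spans: FormatSpan[];
}

// Escape the characters that are significant in HTML text content.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Keep an offset inside the bounds of the text.
function clamp(n: number, max: number): number {
  return Math.min(Math.max(n, 0), max);
}

// Cut the text at every span boundary. Each resulting segment has one fixed
// set of styles, so the emitted tags never need to overlap.
function renderHtml(doc: RichDoc): string {
  const end = doc.text.length;
  const cuts: number[] = [0, end];
  for (const s of doc.spans) {
    cuts.push(clamp(s.start, end));
    cuts.push(clamp(s.start + s.length, end));
  }
  cuts.sort((a, b) => a - b);

  let html = "";
  for (let i = 0; i + 1 < cuts.length; i++) {
    const from = cuts[i];
    const to = cuts[i + 1];
    if (from === to) continue; // duplicate boundary, empty segment

    // Collect every style whose span fully covers this segment.
    const styles: string[] = [];
    for (const s of doc.spans) {
      const covers = s.start <= from && s.start + s.length >= to;
      if (covers && styles.indexOf(s.style) < 0) styles.push(s.style);
    }

    const piece = escapeHtml(doc.text.slice(from, to));
    html += styles.length > 0
      ? '<span class="' + styles.join(" ") + '">' + piece + "</span>"
      : piece;
  }
  return html;
}

const example: RichDoc = {
  text: "overlapping bold and italic text.",
  spans: [
    { style: "bold", start: 0, length: 20 },
    { style: "italic", start: 17, length: 16 },
  ],
};
console.log(renderHtml(example));
```

For the example document this produces the three `<span>` elements from the "better yet" version above, on a single line.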
The easiest way to implement this would be to ask CKEditor for its HTML output and then reconstruct the document in the new format after the fact. But OT would require me to perform that transformation on every keypress, and that seems too heavyweight. For performance reasons, I wonder whether it would be possible to override the default HTML writer within CKEditor, asking it to produce the alternate format directly as it walks the DOM.
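For comparison, the after-the-fact reconstruction could look roughly like the sketch below. It takes an HTML string (whatever the editor hands back) and rebuilds the range format. Everything here is an assumption for illustration: the names `htmlToRichDoc`, `styleForTag` and `addSpan` are hypothetical, only `<b>` and `<i>` are understood, other tags are ignored, and only three entities are decoded. A real version would need a proper HTML parser.

```typescript
interface FormatSpan { style: string; start: number; length: number; }
interface RichDoc { text: string; spans: FormatSpan[]; }

// Assumed mapping from tag name to style name; only two tags are handled.
function styleForTag(tag: string): string | null {
  if (tag === "b") return "bold";
  if (tag === "i") return "italic";
  return null;
}

// Decode the few entities this sketch knows about ("&amp;" last, so that
// "&amp;lt;" becomes "&lt;" rather than "<").
function decodeEntities(s: string): string {
  return s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}

// Record a styled range, extending an earlier range of the same style when
// the new one touches or overlaps it.
function addSpan(spans: FormatSpan[], style: string, start: number, end: number): void {
  if (end <= start) return; // empty element, nothing to record
  for (const s of spans) {
    if (s.style === style && s.start + s.length >= start) {
      s.length = Math.max(s.start + s.length, end) - s.start;
      return;
    }
  }
  spans.push({ style, start, length: end - start });
}

function htmlToRichDoc(html: string): RichDoc {
  const doc: RichDoc = { text: "", spans: [] };
  const depth: { [style: string]: number } = {};   // open-tag count per style
  const startOf: { [style: string]: number } = {}; // offset where a style opened
  const token = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;

  let m: RegExpExecArray | null;
  while ((m = token.exec(html)) !== null) {
    if (m[3]) {
      doc.text += decodeEntities(m[3]); // plain text run
      continue;
    }
    const style = styleForTag(m[2].toLowerCase());
    if (style === null) continue; // unknown tag: ignored in this sketch

    const open = depth[style] || 0;
    if (m[1] === "/") {
      if (open === 0) continue; // stray closing tag
      depth[style] = open - 1;
      if (open === 1) addSpan(doc.spans, style, startOf[style], doc.text.length);
    } else {
      if (open === 0) startOf[style] = doc.text.length;
      depth[style] = open + 1;
    }
  }

  // Anything still open at the end runs to the end of the text.
  for (const style in depth) {
    if (depth[style] > 0) addSpan(doc.spans, style, startOf[style], doc.text.length);
  }
  return doc;
}

const parsed = htmlToRichDoc("<b>overlapping bold </b><b><i>and</i></b><i> italic text.</i>");
console.log(JSON.stringify(parsed));
```

Because this tokenizer tracks each style independently and never checks nesting, the badly nested input from the first example yields the same two ranges as the well-formed one. Even so, it rescans the whole document each time, which is exactly the per-keypress cost I would like to avoid.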
What do you think?
This may be a very ignorant answer, but I have been thinking along the lines of doing something like this with CKEditor as well. My thought (and I have not yet tried this) would be to hook into the Undo/Redo buffers and use them as a sort of poor man's OT that gets sent back and forth across the wire. It would probably require writing your own Undo/Redo buffers (or significantly expanding what is already there). All edits would then become a series of Command Pattern instances, and the Undo/Redo buffers would become a giant list of playable commands.
Like I said, I haven't tried implementing this at all yet. This might be a terrible idea for a bunch of reasons. It is just my first thought on how to implement some sort of collaborative editing feature with CKEditor.
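Untested against CKEditor itself, but the "list of playable commands" idea could look roughly like this in TypeScript. `EditCommand`, `CommandLog` and `replay` are hypothetical names, and none of this is CKEditor's actual undo API. It works on plain text only, to keep the sketch short.

```typescript
// Hypothetical command shape; this is not CKEditor's undo API.
interface EditCommand {
  kind: "insert" | "delete";
  pos: number;  // character offset where the edit happens
  text: string; // inserted text, or the text that was removed
}

function applyCommand(doc: string, c: EditCommand): string {
  if (c.kind === "insert") {
    return doc.slice(0, c.pos) + c.text + doc.slice(c.pos);
  }
  return doc.slice(0, c.pos) + doc.slice(c.pos + c.text.length);
}

// The inverse of an insert is a delete of the same text, and vice versa.
function invert(c: EditCommand): EditCommand {
  return { kind: c.kind === "insert" ? "delete" : "insert", pos: c.pos, text: c.text };
}

class CommandLog {
  doc: string;
  private done: EditCommand[] = [];
  private undone: EditCommand[] = [];

  constructor(initial: string) {
    this.doc = initial;
  }

  execute(c: EditCommand): void {
    this.doc = applyCommand(this.doc, c);
    this.done.push(c);
    this.undone = []; // a fresh edit discards the redo stack
  }

  undo(): void {
    const c = this.done.pop();
    if (c === undefined) return;
    this.doc = applyCommand(this.doc, invert(c));
    this.undone.push(c);
  }

  redo(): void {
    const c = this.undone.pop();
    if (c === undefined) return;
    this.doc = applyCommand(this.doc, c);
    this.done.push(c);
  }

  // The applied commands as JSON, i.e. what would go over the wire.
  serialize(): string {
    return JSON.stringify(this.done);
  }
}

// Replay a serialized log on a copy of the same starting document.
function replay(initial: string, serialized: string): string {
  const commands = JSON.parse(serialized) as EditCommand[];
  let doc = initial;
  for (const c of commands) doc = applyCommand(doc, c);
  return doc;
}

const log = new CommandLog("italic text.");
log.execute({ kind: "insert", pos: 0, text: "bold and " });
log.execute({ kind: "delete", pos: 0, text: "bold " });
log.undo();
console.log(log.doc);
console.log(replay("italic text.", log.serialize()));
```

One of the "bunch of reasons" this might fall short: replaying only gives the same result when both sides start from the same document and apply the commands in the same order. If two people edit at the same time, the offsets in one person's commands no longer match the other's document, and fixing that up is the part a real OT library does.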