Regex-rule for matching single word during input (TipTap InputRule)


I'm currently experimenting with TipTap, an editor framework. My goal is to build a custom Node extension for TipTap that wraps every single word in <w> tags as the user types. TipTap lets me write an InputRule with a regex for this purpose.

For example, the rule /(?:^|\s)((?:~)((?:[^~]+))(?:~))$/ matches text between two tildes (~text~) and wraps it in <strike> tags.
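To make the groups in that pattern concrete, here is a minimal sketch that runs it against plain strings outside the editor. The helper name `describeMatch` is made up for illustration and is not part of TipTap.

```typescript
// Pattern quoted above: optional leading whitespace, then ~text~ at the end of the input.
const strikeRegex = /(?:^|\s)((?:~)((?:[^~]+))(?:~))$/;

// Hypothetical helper for illustration: reports what each group captured.
function describeMatch(
  input: string
): { full: string; withTildes: string; inner: string } | null {
  const match = strikeRegex.exec(input);
  if (match === null) return null;
  return {
    full: match[0],             // whole match, including the leading space
    withTildes: match[1] ?? "", // first group: the text with both tildes
    inner: match[2] ?? "",      // second group: the text between the tildes
  };
}

console.log(describeMatch("hello ~text~"));
// { full: ' ~text~', withTildes: '~text~', inner: 'text' }
console.log(describeMatch("hello ~text"));
// null, because the closing tilde has not been typed yet
```

The trailing `$` is what ties the rule to typing: the pattern only matches when the closing delimiter is the last character of the input.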

Click here for my Codesandbox

I have been trying for a long time and can't figure it out. Here are the rules that I tried:

/**
 * Regex that matches a word node during input
 */

// Matches words between two tilde characters; I'm using this expression from the documentation as my starting point.
//const inputRegex = /(?:^|\s)((?:~)((?:[^~]+))(?:~))$/

// Matches a word, but appends the following text to that word without the space in between
//const inputRegex = /\b\w+\b\s$/

// Matches a word, but appends the following text to the previous word without the space in between; works with double spaces
//const inputRegex = /(?:^|\s\b)(?:[^\s])(\w+\b)(?:\s)$/

// Matches a word, but swallows every second character
//const inputRegex = /\b([^\s]+)\b$/g

// Matches only every second word
//const inputRegex = /\b([^\s]+)\b\s(?:\s)$/

// Matches every word, but swallows the spaces; works if I insert double spaces
const inputRegex = /\b([^\s]+)(?:\b)\s$/

There are 2 answers

1
viv3k On

The problem here is the choice of delimiter: a space.

This becomes clear when we look at the code in markInputRule.ts (line 37, to be precise):

    if (captureGroup) {
        const startSpaces = fullMatch.search(/\S/)
        const textStart = range.from + fullMatch.indexOf(captureGroup)
        const textEnd = textStart + captureGroup.length

        const excludedMarks = getMarksBetween(range.from, range.to, state.doc)

When '~' is used as the delimiter, the input rule computes the start and end positions of the enclosed text, strips the delimiters, and passes only the enclosed text to the extension's tag (CustomItalic, in your case). You can test this by typing strike-through text enclosed in '~': the '~' characters are removed and the remaining text is placed inside the strike-through tag.

This is exactly the cause of your double-space problem: when your regex matches a word followed by a space, the space is treated like a delimiter and removed before the text is placed inside the tag.
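The effect can be reproduced on plain strings. The following is only a sketch: `simulateMarkRule` is a hypothetical helper that replays the offset arithmetic from the snippet above, and it assumes that the last capture group is the text to be marked and that everything else in the match (apart from leading whitespace) is dropped. The real rule works on document positions rather than strings.

```typescript
interface RuleEffect {
  marked: string;        // text that would end up inside the tag
  removedBefore: string; // matched text dropped in front of the capture group
  removedAfter: string;  // matched text dropped behind the capture group
}

// Hypothetical helper, not TipTap API. Pass a regex without the g flag.
function simulateMarkRule(text: string, regex: RegExp): RuleEffect | null {
  const match = regex.exec(text);
  if (match === null) return null;

  const fullMatch = match[0];
  // Assumption: the last capture group holds the text to be marked.
  const captureGroup = match[match.length - 1] ?? "";
  // Offset of the first non-whitespace character inside the full match.
  const startSpaces = Math.max(0, fullMatch.search(/\S/));
  const textStart = fullMatch.indexOf(captureGroup);
  const textEnd = textStart + captureGroup.length;

  return {
    marked: captureGroup,
    removedBefore: fullMatch.slice(startSpaces, textStart),
    removedAfter: fullMatch.slice(textEnd),
  };
}

console.log(simulateMarkRule("hello ~text~", /(?:^|\s)((?:~)((?:[^~]+))(?:~))$/));
// { marked: 'text', removedBefore: '~', removedAfter: '~' }
console.log(simulateMarkRule("hello world ", /\b([^\s]+)(?:\b)\s$/));
// { marked: 'world', removedBefore: '', removedAfter: ' ' }
```

In the second call the trailing space plays the same role as the closing '~' in the first: it is part of the match but not part of the capture group, so it is discarded.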

I have tried to work around this using negative look-ahead patterns, but the problem remains in the code of the file mentioned above.

What I would suggest is to copy the code in markInputRule.ts and write a custom InputRule that fits your requirements, which would be much easier than working with the built-in one. Hope this helps.

0
Itchy On

I assume the problem lies in the "space". Depending on the browser, the final "space" is either not represented at all in the underlying HTML (Firefox) or replaced with &nbsp; (e.g. Chrome). I suggest you replace the \s with (\s|\&nbsp;) in your regex.
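Before changing the pattern, it may be worth checking what the regex actually receives. The sketch below relies on standard JavaScript behaviour: `\s` already matches the non-breaking space character U+00A0, while `&nbsp;` inside a pattern only matches those six literal characters. Whether the editor hands the rule a U+00A0 character or the entity text is an assumption to verify in your own setup.

```typescript
const nbsp = "\u00A0"; // the character that the HTML entity &nbsp; stands for
const wordThenSpace = /\b([^\s]+)(?:\b)\s$/;

// \s covers both the regular space and the non-breaking space.
const matchesRegularSpace = wordThenSpace.test("word ");
const matchesNbspChar = wordThenSpace.test("word" + nbsp);

// The entity as literal text is not whitespace, so \s alone does not match it...
const matchesEntityText = wordThenSpace.test("word&nbsp;");
// ...and an alternative such as (\s|&nbsp;) only helps for that literal text.
const altMatchesEntityText = /\b([^\s&]+)(?:\s|&nbsp;)$/.test("word&nbsp;");

console.log({ matchesRegularSpace, matchesNbspChar, matchesEntityText, altMatchesEntityText });
```

If the second check is true in your environment, a non-breaking space in the text is already covered by `\s`.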