Get ANSI color for character at index


I have developed the couleurs NPM package, which can be configured to add an rgb method to String.prototype:

> console.log("Hello World!".rgb(255, 0, 0)) // "Hello World!" in red
Hello World!
undefined
> "Hello World!".rgb(255, 0, 0)
'\u001b[38;5;196mHello World!\u001b[0m'

This works fine. What's the proper way to get the ANSI color/style of the character at index i?

This can probably be hacked together with some regular expressions, but I'm not sure that's a good approach (if a correct implementation is available, I'm not against it). I'd prefer a native way to get the color/style by accessing the character as interpreted by the tty.

> function getStyle (input, i) { /* get style at index `i` */ return style; }

> getStyle("Hello World!".rgb(255, 0, 0), 0); // Get style of the first char
{
   start: "\u001b[38;5;196m",
   end: "\u001b[0m",
   char: "H"
}
> getStyle("Hello " + "World!".rgb(255, 0, 0), 0); // Get style of the first char
{
   start: "",
   end: "",
   char: "H"
}

Things get complicated when we have multiple combined styles:

> console.log("Green and Italic".rgb(0, 255, 0).italic())
Green and Italic
undefined
> getStyle("Green and Italic".rgb(0, 255, 0).italic(), 0);
{
   start: "\u001b[3m\u001b[38;5;46m",
   end: "\u001b[0m\u001b[23m",
   char: "G"
}
> getStyle(("Bold & Red".bold() + " but this one is only red").rgb(255, 0, 0), 0);
{
   start: "\u001b[38;5;196m\u001b[1m",
   end: "\u001b[22m\u001b[0m",
   char: "B"
}
> getStyle(("Bold & Red".bold() + " but this one is only red").rgb(255, 0, 0), 11);
{
   start: "\u001b[38;5;196m",
   end: "\u001b[0m",
   char: "b"
}
> ("Bold & Red".bold() + " but this one is only red").rgb(255, 0, 0)
'\u001b[38;5;196m\u001b[1mBold & Red\u001b[22m but this one is only red\u001b[0m'

Like I said, I'm looking for a native way (maybe using a child process).

So, how can I get the complete ANSI style for the character at index i?


There are 2 answers

Jongware (best answer)

There are a couple of ways to 'add' formatting to text, and this is one of them. The problem is that you are mixing text and styling in the same object -- a text string. It's similar to RTF:

Here is some \b bold\b0 and {\i italic} text\par

but different from, say, the native format of Word .DOC files, which works with text runs:

(text) Here is some bold and italic text\r
(chp)  13 None
       4  sprmCFBold
       5  None
       6  sprmCFItalic
       6  None

-- the number at the left is the count of characters with a certain formatting.

The latter format is what you are looking for, since you want to index characters in the plain text. Subtracting the formatting lengths will show which character you are interested in. Depending on how often you expect to ask for the formatting, you can either scan the string once per query, or parse it once and cache the result somewhere.
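To make the text-run idea concrete, here is a minimal sketch of the "parse once and cache" variant. The names toRuns and styleAt are hypothetical, not part of couleurs. It assumes every escape sequence has the form ESC [ ... m, and it only treats the global reset specially: any other code (including "off" codes such as 22) is simply appended to the active list, so the returned style is replayable but not minimal.

```javascript
// Hypothetical helper: split an ANSI-styled string into plain text plus runs.
// Each run stores how many plain characters share the same active codes.
function toRuns(input) {
    // A capturing group makes split() keep the escape sequences:
    // even entries are plain text, odd entries are escape sequences.
    var parts = input.split(/(\u001B\[[^m]*m)/);
    var text = '', runs = [], active = [];

    parts.forEach(function (part, n) {
        if (n % 2 === 1) {
            if (part === '\u001B[0m') {
                active = [];            // global reset clears everything
            } else {
                active.push(part);      // assumption: keep all other codes
            }
        } else if (part.length > 0) {
            text += part;
            runs.push({ count: part.length, style: active.join('') });
        }
    });
    return { text: text, runs: runs };
}

// Hypothetical lookup: walk the runs until the plain-text index is covered.
function styleAt(parsed, index) {
    var seen = 0, i;
    for (i = 0; i < parsed.runs.length; i++) {
        seen += parsed.runs[i].count;
        if (index < seen) {
            return parsed.runs[i].style;
        }
    }
    return '';                          // index is past the end of the text
}

var parsed = toRuns('\u001B[38;5;196m\u001B[1mBold & Red\u001B[22m but this one is only red\u001B[0m');
console.log(parsed.text);
console.log(JSON.stringify(styleAt(parsed, 0)));
```

The parsed object can be kept around and queried repeatedly, which is the main reason to prefer runs over rescanning the escaped string.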

A one-time run needs to inspect each element of the encoded string, incrementing the "text" index when not inside a color string, and updating the 'last seen' color string if it is. I added a compatible getCharAt function for debugging purposes.

var str = '\u001b[38;5;196m\u001b[1mBo\x1B[22mld & Red\u001b[22m but this one is only red\u001b[0m';

const map = {
    bold: ["\x1B[1m", "\x1B[22m" ]
  , italic: ["\x1B[3m", "\x1B[23m" ]
  , underline: ["\x1B[4m", "\x1B[24m" ]
  , inverse: ["\x1B[7m", "\x1B[27m" ]
  , strikethrough: ["\x1B[9m", "\x1B[29m" ]
};

String.prototype.getColorAt = function(index)
{
    var strindex=0, color=[], cmatch, i,j;

    while (strindex < this.length)
    {
        cmatch = this.substr(strindex).match(/^(\u001B\[[^m]*m)/);
        if (cmatch)
        {
            // Global reset?
            if (cmatch[0] == '\x1B[0m')
            {
                color = [];
            } else
            {
                // Off code? (map is an object, so walk its keys)
                var isOff = false;
                for (i in map)
                {
                    if (map[i][1] == cmatch[0])
                    {
                        // Remove the matching On code
                        for (j=color.length-1; j>=0; j--)
                        {
                            if (color[j] == map[i][0])
                                color.splice (j,1);
                        }
                        isOff = true;
                        break;
                    }
                }
                // Not an Off code: remember it as an active style
                if (!isOff)
                    color.push (cmatch[0]);
            }
            strindex += cmatch[0].length;
        } else
        {
            /* a regular character! */
            if (!index)
                break;
            strindex++;
            index--;
        }
    }
    return color.join('');
}

String.prototype.getCharAt = function(index)
{
    var strindex=0, cmatch;

    while (strindex < this.length)
    {
        cmatch = this.substr(strindex).match(/^(\u001B\[[^m]*m)/);
        if (cmatch)
        {
            strindex += cmatch[0].length;
        } else
        {
            /* a regular character! */
            if (!index)
                return this.substr(strindex,1);
            strindex++;
            index--;
        }
    }
    return '';
}

console.log (str);

var color = str.getColorAt (1);
var text = str.getCharAt (1);
// JSON.stringify shows the escape codes instead of letting the terminal interpret them
console.log ('color is '+JSON.stringify(color)+', char is '+text);

The returned color is still in its original escaped encoding. You can make it return a constant of some kind instead by adding the constants to your original map object.
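As a rough illustration of returning constants instead of raw escapes, the sketch below translates an escaped style string into a list of names. The names table and the describeStyle function are hypothetical, and the 'color256:N' label is just an assumed convention for the 38;5;N foreground codes seen in the question.

```javascript
// Hypothetical lookup table: "on" escape code -> style name.
var names = {
    '\u001B[1m': 'bold',
    '\u001B[3m': 'italic',
    '\u001B[4m': 'underline',
    '\u001B[7m': 'inverse',
    '\u001B[9m': 'strikethrough'
};

// Hypothetical helper: turn a string of escape codes into readable names.
function describeStyle(escaped) {
    var codes = escaped.match(/\u001B\[[^m]*m/g) || [];
    return codes.map(function (code) {
        // 38;5;N selects foreground color N from the 256-color palette
        var color = code.match(/^\u001B\[38;5;(\d+)m$/);
        if (color) {
            return 'color256:' + color[1];
        }
        return names[code] || 'unknown';
    });
}

console.log(describeStyle('\u001B[38;5;196m\u001B[1m'));
```

Feeding it the result of getColorAt gives something easier to compare in code than the raw escape string.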

georg

I can't provide you with a full solution, but here's a sketch:

  • maintain a stack which accumulates the current format
  • split the string into chunks: escape sequence | just a character
  • iterate over this list of chunks
  • if it's just a char, save its index + the current state of the stack
  • if it's an escape, either push the respective format onto the stack, or pop the format from it
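The steps above could be sketched roughly as follows. This is only an illustration: indexStyles and the pairs table are hypothetical names, the table covers just three on/off pairs, and a global reset is assumed to empty the whole stack. Colors are only ever removed by that reset here.

```javascript
// Assumed on/off pairs: the second code undoes the first.
var pairs = [
    ['\u001B[1m', '\u001B[22m'],   // bold
    ['\u001B[3m', '\u001B[23m'],   // italic
    ['\u001B[4m', '\u001B[24m']    // underline
];

// Hypothetical helper: returns one entry per plain character,
// holding the character and the formats active at that point.
function indexStyles(input) {
    // chunks: escape sequence | just a character
    var chunks = input.match(/\u001B\[[^m]*m|[\s\S]/g) || [];
    var stack = [], result = [];

    chunks.forEach(function (chunk) {
        if (chunk.length === 1) {
            // just a char: save it with the current state of the stack
            result.push({ char: chunk, start: stack.join('') });
            return;
        }
        var pair = pairs.filter(function (p) { return p[1] === chunk; })[0];
        if (chunk === '\u001B[0m') {
            stack = [];                        // reset: drop every format
        } else if (pair) {
            var at = stack.lastIndexOf(pair[0]);
            if (at !== -1) {
                stack.splice(at, 1);           // pop the matching format
            }
        } else {
            stack.push(chunk);                 // push the new format
        }
    });
    return result;
}

var styles = indexStyles('\u001B[38;5;196m\u001B[1mBold & Red\u001B[22m but this one is only red\u001B[0m');
console.log(styles[11].char, JSON.stringify(styles[11].start));
```

Since the result is indexed by plain-text position, looking up the style of character i is then just styles[i].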

You can also use this algorithm to convert an escaped string into html, and use XML methods to walk the result tree.

BTW, the latter would also be nice the other way round. How about this:

console.log("<font color='red'>hi <b>there</b></font>".toANSI())