Removing lines of text that match strings in an array

I haven't been able to find a satisfactory way to remove lines of text that match keywords in an array of strings. A typical application is YAML frontmatter, from which we want to filter out properties that do not apply.

Let's say we have a string like this:

let str = `
up: some text here
archetype: "[[Atlas]]"
related: 
status: 
type: 
uuid: '202403041152'
`

I am using backticks to enter the string to make it easier to enter one line of text at a time.

We want to exclude some properties from the frontmatter string by listing them in an array:

const exclude = ["status", "type"]

How do we remove from the multi-line string every line whose key is listed in the exclude array?

There are 2 answers

f0nzie On

I have found that the most consistent results come from splitting the YAML string into an array of lines, and then iterating over exclude and applying a regex to each line.

A function that removes every line of str whose key is listed in the exclude array:

// Drop entries that are empty or contain only whitespace
function removeWhitespace(txt) {
    return txt.filter(entry => entry.trim() !== '');
}

function removeExcludedKeys(str, exclude) {
    let strA = str.split("\n");     // convert to an array of lines
    strA = removeWhitespace(strA);  // drop blank lines
    exclude.forEach(ex => {
        // \b stops "type" from matching inside "archetype"
        const regex = new RegExp("\\b" + ex + ".*", "gi");
        strA = strA.map(p => p.replace(regex, ''));
    });
    // matched lines are now empty strings, so filter again before joining
    return removeWhitespace(strA).join("\n");
}

Called with:

console.log(removeExcludedKeys(str, exclude))

Output as follows:

up: some text here
archetype: "[[Atlas]]"
related:
uuid: '202403041152'

I came up with this solution after getting false positives: whenever I tried to remove type, the property archetype was unintentionally removed as well. The function removeWhitespace is needed twice: first to drop the blank lines introduced by entering the string between backticks, and then to drop the empty entries left in the array after the replacement.
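The false positive described above can be reproduced in a few lines. This is only a minimal sketch: the variable names and the `anchored` variant are illustrative assumptions, not part of the original answer. It also shows that `\b` alone still matches a keyword that appears inside a value, which anchoring the pattern to the start of the line would avoid.

```javascript
// Two sample lines: a key that ends in "type", and a value that contains "type"
const keyLine = 'archetype: "[[Atlas]]"';
const valueLine = 'up: some type here';

// No word boundary: "type" also matches inside "archetype"
const naive = new RegExp("type" + ".*", "gi");
// Word boundary, as in the answer above: "archetype" is left alone
const bounded = new RegExp("\\b" + "type" + ".*", "gi");
// Hypothetical stricter variant: the key must start the line and be followed by a colon
const anchored = new RegExp("^\\s*" + "type" + "\\s*:.*", "i");

const naiveKey = keyLine.replace(naive, "");          // "arche"
const boundedKey = keyLine.replace(bounded, "");      // unchanged
const boundedValue = valueLine.replace(bounded, "");  // "up: some " (value truncated)
const anchoredValue = valueLine.replace(anchored, ""); // unchanged

console.log({ naiveKey, boundedKey, boundedValue, anchoredValue });
```

Whether the anchored form is preferable depends on the input: it assumes every property sits on its own line in `key: value` form.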

f0nzie On

@Yogi, I added a trim() to remove the extra whitespace at the top of the resulting string.

let textContent = `
up: some text here
archetype: "[[Atlas]]"
related: 
status: 
type: 
uuid: '202403041152'
`

const exclude = ["status", "type", "related"]

// generate the regex and remove the matching lines
// regex = (^\s*key:.*$\n)|(^\s*key:.*$\n)|...

const removeExcludedKeys = (str = '', exclude = []) =>
    str.replace(
        new RegExp(
            exclude.map(capture => `(^\\s*${capture}:.*$\\n)`).join('|'),
            'gmiu'   // global, multiline, case-insensitive, unicode
        ),
        ''
    ).trim()

console.log(removeExcludedKeys(textContent, exclude))

Result:

up: some text here
archetype: "[[Atlas]]"
uuid: '202403041152'

I also tried to move the regex pattern into a variable defined outside the function, but I was unsuccessful. Something like:

let regex = "`(^\\s*${capture}:.*$\\n)`"
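The attempt above most likely fails because a double-quoted string is not a template literal: the backticks and `${capture}` stay in the string as literal characters and are never interpolated. One possible workaround is to define the pattern outside as a small function that returns the pattern string for a given key. The sketch below assumes this approach; `linePattern` and `escapeRegExp` are hypothetical helper names, and the `\n?` (so that a final line without a trailing newline is also removed) is an addition not present in the original regex.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Pattern builder defined outside the main function (hypothetical name).
// A template literal is interpolated where it is evaluated, so it is wrapped in a function.
const linePattern = (key) => `(^\\s*${escapeRegExp(key)}:.*$\\n?)`;

const removeExcludedKeys = (str = "", exclude = []) => {
  // An empty alternation would match everywhere, so return early
  if (exclude.length === 0) return str.trim();
  const regex = new RegExp(exclude.map(linePattern).join("|"), "gmiu");
  return str.replace(regex, "").trim();
};

const sample = `
up: some text here
type: 
uuid: '202403041152'
`;

console.log(removeExcludedKeys(sample, ["type"]));
```

Escaping the keys is optional for plain names such as status or type, but it keeps a key that contains a dot or a bracket from being read as regex syntax.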