Parsing a variably space delimited list with nom

Question

Asked by Brian Kung on 23 April 2021 at 20:31

How can I consume a list of tokens that may or may not be separated by a space?

I'm trying to parse Chinese romanization (pinyin) in the cedict format with nom (6.1.2). For example, "ni3 hao3 ma5" is, due to human error in transcription, sometimes written as "ni3hao3ma5" or "ni3hao3 ma5" (note the variable spacing).

I have written a parser that handles individual syllables, e.g. ["ni3", "hao3", "ma5"], and I'm trying to use nom::multi::separated_list0 to parse the whole list like so:

nom::multi::separated_list0(
    nom::character::complete::space0,
    syllable,
)(i)?;

However, I get an Err(Error(Error { input: "", code: SeparatedList })) after all the tokens have been consumed.

**Brian Kung** · Answer 1 · 2021-04-23T20:31:47+00:00

The problem with using

nom::multi::separated_list0(
    nom::character::complete::space0,
    syllable,
)(i)?;

is that the space0 delimiter also matches the empty string. Once every token has been consumed, space0 still succeeds at the end of the input without consuming anything, and separated_list0 rejects a separator that makes no progress (its guard against looping forever), hence the Err(Error(Error { input: "", code: SeparatedList })).
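To make the failure mode concrete, here is a simplified, dependency-free model of a separated-list loop with a "separator must consume input" guard. It is a sketch of the behaviour described above, not nom's actual source; the function names and the Err(remaining input) return shape are assumptions made for illustration.

```rust
// Simplified model of a separated-list loop. NOT nom's source code;
// all names and return types here are hypothetical.

/// Element parser: one or more ASCII letters followed by optional digits.
/// Returns (remaining input, matched text).
fn syllable(input: &str) -> Option<(&str, &str)> {
    let letters = input.chars().take_while(|c| c.is_ascii_alphabetic()).count();
    if letters == 0 {
        return None;
    }
    let digits = input[letters..].chars().take_while(|c| c.is_ascii_digit()).count();
    let end = letters + digits;
    Some((&input[end..], &input[..end]))
}

/// Separator in the style of `space0`: zero or more spaces, never fails.
fn space0(input: &str) -> &str {
    input.trim_start_matches(' ')
}

/// Returns the parsed items, or Err(remaining input) if the separator
/// matched without consuming anything.
fn separated_list<'a>(mut input: &'a str) -> Result<Vec<&'a str>, &'a str> {
    let mut items = Vec::new();
    match syllable(input) {
        None => return Ok(items),
        Some((rest, item)) => {
            items.push(item);
            input = rest;
        }
    }
    loop {
        let after_sep = space0(input);
        // Guard: a separator that consumed nothing could loop forever.
        if after_sep.len() == input.len() {
            return Err(after_sep);
        }
        match syllable(after_sep) {
            None => return Ok(items),
            Some((rest, item)) => {
                items.push(item);
                input = rest;
            }
        }
    }
}

fn main() {
    // Every syllable is consumed, then the separator matches "" at the end.
    assert_eq!(separated_list("ni3 hao3 ma5"), Err(""));
    // With no space between syllables the guard trips right after "ni3".
    assert_eq!(separated_list("ni3hao3"), Err("hao3"));
}
```

In this model the list is read correctly right up to the end of the input; the error comes only from the final zero-width separator match, which mirrors the empty input field in the reported error.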

The solution in my case was to use nom::multi::many1 instead of nom::multi::separated_list0 and to handle the optional spaces in the inner parser, like so:

use nom::{character, multi, sequence, IResult};

// `Syllable` is my own type; its definition is not shown here.

fn syllables(i: &str) -> IResult<&str, Vec<Syllable>> {
    // many1 instead of separated_list0
    multi::many1(syllable)(i)
}

fn syllable(i: &str) -> IResult<&str, Syllable> {
    let (rest, (_, pronunciation, tone)) = sequence::tuple((
        // handle the optional leading space here
        character::complete::space0,
        character::complete::alpha1,
        character::complete::digit0,
    ))(i)?;

    Ok((rest, Syllable::new(pronunciation, tone)))
}
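The same idea can be checked without nom or the Syllable type. The sketch below is a hand-rolled, dependency-free stand-in, so every name in it is hypothetical: each element skips its own leading spaces, and the element parser is repeated until it fails, with at least one match required.

```rust
// Dependency-free sketch of "repeat an element that skips its own
// leading spaces". All names are hypothetical; this is not nom.

/// Parses optional spaces, then letters, then optional digits.
/// Returns (remaining input, syllable text), or None if no letters follow.
fn spaced_syllable(input: &str) -> Option<(&str, &str)> {
    let trimmed = input.trim_start_matches(' ');
    let letters = trimmed.chars().take_while(|c| c.is_ascii_alphabetic()).count();
    if letters == 0 {
        return None;
    }
    let digits = trimmed[letters..].chars().take_while(|c| c.is_ascii_digit()).count();
    let end = letters + digits;
    Some((&trimmed[end..], &trimmed[..end]))
}

/// Applies `spaced_syllable` until it fails; requires at least one match.
fn syllables(mut input: &str) -> Option<(&str, Vec<&str>)> {
    let mut items = Vec::new();
    while let Some((rest, item)) = spaced_syllable(input) {
        items.push(item);
        input = rest;
    }
    if items.is_empty() {
        None
    } else {
        Some((input, items))
    }
}

fn main() {
    // All three spellings from the question yield the same syllables.
    let expected = vec!["ni3", "hao3", "ma5"];
    for text in ["ni3 hao3 ma5", "ni3hao3ma5", "ni3hao3 ma5"] {
        assert_eq!(syllables(text), Some(("", expected.clone())));
    }
}
```

Because the space is only consumed as part of a following syllable, trailing whitespace is left in the remaining input in this sketch, so a caller that requires the whole string to be consumed would need to trim it separately.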
