Issue trying to compile Spirit.Qi parser


Below is a fully self-contained example. The problem appears to be lines 84-89: if those lines are commented out, the example compiles. I'm trying to parse each line of a file, where a line holds five colon-delimited items and the last three are optional. The single function takes a boost::filesystem::path, maps the file into memory using Boost.Interprocess, and parses it.

Examples of what I want this to parse:

a:1
a:2:c
a:3::d
a:4:::e
a:4:c:d:e
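
As a cross-check on the intended results, here is a minimal plain C++17 sketch that does not use Spirit at all. The names `parsed_line` and `parse_line` are hypothetical, `std::optional` stands in for `boost::optional`, and treating an empty field (as in `a:3::d`) as absent is an assumption about the desired behaviour rather than something the grammar guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the five-field record described in the question.
struct parsed_line {
    std::string a;
    unsigned short b = 0;
    std::optional<std::string> c, d, e;
};

// Splits one line (without its line terminator) on ':'.
// The fifth field keeps any further colons, like +(char_ - eol) would.
// Returns std::nullopt if the two mandatory fields are missing or malformed.
std::optional<parsed_line> parse_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (fields.size() < 4) {
        const auto colon = line.find(':', start);
        if (colon == std::string::npos) break;
        fields.push_back(line.substr(start, colon - start));
        start = colon + 1;
    }
    fields.push_back(line.substr(start));  // remainder of the line
    assert(fields.size() >= 1 && fields.size() <= 5);

    if (fields.size() < 2 || fields[0].empty() || fields[1].empty())
        return std::nullopt;

    // Second field: decimal digits that fit into an unsigned short.
    unsigned long value = 0;
    for (char ch : fields[1]) {
        if (ch < '0' || ch > '9') return std::nullopt;
        value = value * 10 + static_cast<unsigned long>(ch - '0');
        if (value > std::numeric_limits<unsigned short>::max())
            return std::nullopt;
    }

    parsed_line out;
    out.a = fields[0];
    out.b = static_cast<unsigned short>(value);

    // Assumption: an empty optional field counts as "not present".
    auto field_or_none = [&](std::size_t i) -> std::optional<std::string> {
        if (i < fields.size() && !fields[i].empty()) return fields[i];
        return std::nullopt;
    };
    out.c = field_or_none(2);
    out.d = field_or_none(3);
    out.e = field_or_none(4);
    return out;
}
```

Under these assumptions, `parse_line("a:3::d")` yields a record with `a`, `b` and `d` set and with `c` and `e` empty, which is the shape a working Spirit grammar would also need to produce.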

The results should be stored in a vector<file_line>, where file_line is a struct with five members, the last three being optional. Here is the code, and the errors:

Code

#if defined(_MSC_VER) && (_MSC_VER >= 1020)
# pragma warning(disable : 4512) // assignment operator could not be generated
# pragma warning(disable : 4127) // conditional expression is constant
# pragma warning(disable : 4244) // 'initializing' : conversion from 'int' to 'char', possible loss of data
#endif

#include <boost/fusion/adapted/struct/adapt_struct.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/qi.hpp>
#include <boost/spirit/home/qi/string.hpp>
#include <boost/spirit/home/karma.hpp>
#include <boost/spirit/home/karma/binary.hpp>
#include <boost/spirit/home/phoenix.hpp>
#include <boost/spirit/home/phoenix/bind.hpp>
#include <boost/spirit/home/phoenix/core.hpp>
#include <boost/spirit/home/phoenix/operator.hpp>
#include <boost/spirit/home/phoenix/statement/sequence.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/filesystem/operations.hpp>

#include <string>

// This struct and fusion adapter is for parsing file servers in colon-newline format. 
struct file_line
{
  std::string a;
  unsigned short b;
  boost::optional<std::string> c;
  boost::optional<std::string> d;
  boost::optional<std::string> e;
};
BOOST_FUSION_ADAPT_STRUCT(
  file_line,
  (std::string, a)
  (unsigned short, b)
  (boost::optional<std::string>, c)
  (boost::optional<std::string>, d)
  (boost::optional<std::string>, e)
)

void
import_proxies_colon_newline(const boost::filesystem::path& file)
{
  using namespace boost::spirit;
  using qi::parse;
  using qi::char_;
  using qi::eol;
  using qi::eoi;
  using qi::lit;
  using qi::ushort_;

  // <word>:<ushort>:[word]:[word]:[word]
  if(boost::filesystem::exists(file) && 0 != boost::filesystem::file_size(file))
  {
    // Use Boost.Interprocess for fast sucking in of the file. It works great, and provides the bidirectional
    // iterators that we need for spirit.
    boost::interprocess::file_mapping mapping(file.file_string().c_str(), boost::interprocess::read_only);
    boost::interprocess::mapped_region mapped_rgn(mapping, boost::interprocess::read_only);

    const char*       beg = reinterpret_cast<char*>(mapped_rgn.get_address());
    char const* const end = beg + mapped_rgn.get_size();

    // And parse the data, putting the results into a vector of file_line.
    std::vector<file_line> output;

    parse(beg, end,

          // Begin grammar
          (
            *(
                *eol
              >> +(char_ - (':' | eol) 
              >> ':' >> ushort_         
              >> -(':'
                    >> *(char_ - (':' | eol)) 
                    >> (eol | 
                          -(':'
                              >> *(char_ - (':' | eol)) 

                              // This doesn't work. Uncomment it, won't compile. No idea why. It's the same
                              // as above.
                              >> (eol |
                                    -(':'
                                        >>
                                        +(char_ - eol) 
                                      )
                                )
                          )
                        )
                  )
              >> *eol
            )
          )
          // End grammar, begin output data

          ,output
          );
  }
}

Error Messages from MSVC 10

Since questions are limited to 30,000 characters, I'll only display the first few here. The example should produce the same errors when compiled on your machine.

1>C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/support/container.hpp(101): error C2955: 'boost::Container' : use of class template requires template argument list
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/concept_check.hpp(602) : see declaration of 'boost::Container'
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/qi/operator/kleene.hpp(65) : see reference to class template instantiation 'boost::spirit::traits::container_value<Container>' being compiled
1>          with
1>          [
1>              Container=char
1>          ]
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/qi/detail/fail_function.hpp(38) : see reference to function template instantiation 'bool boost::spirit::qi::kleene<Subject>::parse<Iterator,Context,Skipper,Attribute>(Iterator &,const Iterator &,Context &,const Skipper &,Attribute &) const' being compiled
1>          with
1>          [
1>              Subject=boost::spirit::qi::difference<boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::char_,boost::spirit::char_encoding::standard>>,boost::spirit::qi::alternative<boost::fusion::cons<boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard,true,false>,boost::fusion::cons<boost::spirit::qi::eol_parser,boost::fusion::nil>>>>,
1>              Iterator=const char *,
1>              Context=const boost::fusion::unused_type,
1>              Skipper=boost::fusion::unused_type,
1>              Attribute=char
1>          ]

...snip...

1>C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/support/container.hpp(102): fatal error C1903: unable to recover from previous error(s); stopping compilation
Best answer, by hkaiser

I already answered on the Spirit mailing list, but let me post it here for the sake of completeness as well.


Your example is far from minimal. I see no reason why you left interprocess, filesystem or Karma references in the code. This just makes diagnosing things so much more difficult for everybody willing to help. Moreover, you have a mismatched parenthesis in there somewhere; I assume you forgot to close the +(char_ - (':' | eol).

Ok, let's look closer. This is your (simplified) grammar. It does not do anything useful anymore, but attribute-wise it should behave the same as the original one:

*(+char_ >> -(*char_ >> (eol | -(*char_ >> (eol | -(':' >> +char_))))))

The exposed (propagated) attribute of this grammar is:

vector<
  tuple<
    std::vector<char>,
    optional<
      tuple<
        std::vector<char>,
        variant<
          char,
          optional<
            tuple<
              std::vector<char>,
              variant<
                char,
                optional<
                  std::vector<char>
                >
              >
            >
          >
        >
      >
    >
  >
>

Attribute compatibility rules can do quite a bit, but they certainly can't map a std::string onto a variant<char, vector<char> >. Moreover, I believe you no longer understand your grammar yourself, so why would you expect Spirit to get it right in this case?

What I'd suggest is that you start by simplifying your grammar, factoring things out into rules. That not only makes it easier to understand, but also lets you tell Spirit which attribute you expect to get back from which subpart of your grammar. For instance:

rule<char const*, std::string()> e1 = +~char_(":\r\n");
rule<char const*, std::string()> e2 = *~char_(":\r\n");
rule<char const*, std::string()> e3 = +~char_("\r\n");
rule<char const*, unsigned short()> u = ':' >> ushort_;
rule<char const*, file_line()> fline = 
    *eol >> e1 >> u
         >> -(':' >> e2 >> (eol | -(':' >> e2 >> (eol | -(':' >> e3))))) >> *eol;

which makes the overall grammar more readable already:

*fline

Pretty, huh?

If you think about it further, you will realize that writing

foo >> (eol | -bar) >> *eol

is equivalent to:

foo >> -bar >> *eol
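
The equivalence can be checked with a small stand-alone sketch. It hand-rolls two recognisers in plain C++17 instead of using Spirit; `foo` and `bar` are modelled as the literals `"foo"` and `":bar"`, and all function names are hypothetical. The key assumption is that `bar` can never begin with a line break, which holds in the grammar above because every optional part starts with `':'`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

constexpr std::size_t no_match = static_cast<std::size_t>(-1);

// Matches the literal `lit` at `pos`; advances `pos` only on success.
bool match_lit(std::string_view in, std::size_t& pos, std::string_view lit) {
    if (in.substr(pos, lit.size()) != lit) return false;
    pos += lit.size();
    return true;
}

// Rough stand-in for qi::eol: accepts "\r", "\n" or "\r\n".
bool match_eol(std::string_view in, std::size_t& pos) {
    std::size_t p = pos;
    if (p < in.size() && in[p] == '\r') ++p;
    if (p < in.size() && in[p] == '\n') ++p;
    if (p == pos) return false;
    pos = p;
    return true;
}

// foo >> (eol | -bar) >> *eol
// Returns the number of characters consumed, or no_match.
std::size_t with_alternative(std::string_view in) {
    std::size_t pos = 0;
    if (!match_lit(in, pos, "foo")) return no_match;
    if (!match_eol(in, pos)) match_lit(in, pos, ":bar");  // -bar may match nothing
    while (match_eol(in, pos)) {}
    return pos;
}

// foo >> -bar >> *eol
std::size_t without_alternative(std::string_view in) {
    std::size_t pos = 0;
    if (!match_lit(in, pos, "foo")) return no_match;
    match_lit(in, pos, ":bar");  // -bar may match nothing
    while (match_eol(in, pos)) {}
    return pos;
}

// Both forms should consume exactly the same prefix of every sample.
void check_samples() {
    const std::string_view samples[] = {
        "foo", "foo\n", "foo:bar", "foo:bar\r\n\n", "foo\n:bar", "fo"};
    for (std::string_view in : samples) {
        assert(with_alternative(in) == without_alternative(in));
    }
}
```

Calling `check_samples()` from `main` exercises both forms on a handful of inputs. The sketch only compares how much input is consumed. In Spirit the two forms also expose different attributes, which is the second reason to prefer the shorter one.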

which simplifies it even more:

rule<char const*, file_line()> f = 
    *eol >> e1 >> u >> -(':' >> e2 >> -(':' >> e2 >> -(':' >> e3) ) ) >> *eol;

What you can see now is that your grammar produces at least 5 sub-attributes, while your file_line has only four members. You need to adjust your file_line structure accordingly.

The above does compile now (Boost SVN trunk), but it fails to produce the correct results. If I feed it "a:4:c:d:e", I get the results: output[0].a == "a", output[0].b == 4, and output[0].c == "cde". Let's analyze why that happens.

Again, attribute compatibility rules can do only part of the work. In this case file_line::a gets mapped onto e1, file_line::b onto u, while file_line::c gets mapped onto the whole rest of the expression. That's what you would expect, actually, as the optional breaks the sequence into 3 elements. Your attribute is 'flattened', while the grammar is not.

There are two solutions: a) change your attribute to match the structure of the grammar:

struct file_line
{
  std::string a;
  unsigned short b;
  boost::optional<
    fusion::vector<
      std::string, 
      boost::optional<
        fusion::vector<std::string, boost::optional<std::string> >
      >
    >
  > c;
};

or b) use semantic actions to set the elements of your attribute (which is what I would do).
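
With option a), the nested attribute usually still has to be unpacked into the flat five-field record afterwards. The following is a minimal sketch of that step in plain C++17, with no Spirit involved. `std::optional` and `std::pair` are used as stand-ins for `boost::optional` and `fusion::vector`, and the names `nested_line`, `flat_line`, `flatten` and `self_check` are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Nested shape mirroring -(':' >> e2 >> -(':' >> e2 >> -(':' >> e3))).
using tail_e   = std::optional<std::string>;
using tail_de  = std::optional<std::pair<std::string, tail_e>>;
using tail_cde = std::optional<std::pair<std::string, tail_de>>;

struct nested_line {
    std::string a;
    unsigned short b = 0;
    tail_cde c;  // holds c, and inside it d, and inside that e
};

// The flat record the question originally asked for.
struct flat_line {
    std::string a;
    unsigned short b = 0;
    std::optional<std::string> c, d, e;
};

// Walks the nesting one level at a time; a missing level leaves
// all deeper fields empty.
flat_line flatten(const nested_line& n) {
    flat_line f;
    f.a = n.a;
    f.b = n.b;
    if (n.c) {
        f.c = n.c->first;
        const tail_de& de = n.c->second;
        if (de) {
            f.d = de->first;
            f.e = de->second;
        }
    }
    return f;
}

// Builds the nested value for "a:4:c:d:e" by hand and flattens it.
void self_check() {
    nested_line n;
    n.a = "a";
    n.b = 4;
    n.c = std::make_pair(
        std::string("c"),
        tail_de(std::make_pair(std::string("d"), tail_e("e"))));
    const flat_line f = flatten(n);
    assert(f.c && *f.c == "c");
    assert(f.d && *f.d == "d");
    assert(f.e && *f.e == "e");
}
```

The extra conversion step is the price of option a). Option b) avoids it by assigning each member directly as the corresponding sub-parser matches.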