I want to build a config file that describes the structure of command packets.

When the actual binary data arrives, I want to match each packet against this config file and then process it (convert the data to CSV format).

There are several command types.

Each time a packet arrives, I have to take its opcode, look it up in the config file, and get back the command format that describes the layout of that packet.

The command format may look like below:

opcode - 1 byte - integer

command - 4 bytes - string

...

Not all commands have the same number of fields or the same format.

I want to retrieve all these details for each command. I could represent them in XML and parse it with a library such as libxml2.

A sample XML format is given below:

 <cmd type="Multiplication">
    <field name="opcode" type="string" bytes="4"/>
    <field name="Multiplicand" type="number" bytes="2"/>
    <field name="Multiplier" type="number" bytes="2"/>
 </cmd>

But this approach is rather slow.

My idea is to represent the command packet formats as structures instead. But since C and C++ have no reflection, the members of a structure cannot be enumerated at run time, so I would need one parsing function per structure (command).

Please suggest some way to store the formats such that one generic function can parse the binary data just by looking at this format.

  • The languages can be C or C++.
  • Performance is top priority so XML and similar types are discouraged.
  • In-memory data structures are preferred.

Any help is highly appreciated.

1 Answer

Jorge Perez On

I think your best bet is to represent the file as a collection of commands, each stored as a std::variant (C++17) over the possible command types. For example, let's say you have three kinds of commands:

struct Constant {
    short value;
};
struct UnaryOperation {
    unsigned char opcode;
    short value;
};
struct BinaryOperation {
    unsigned char opcode;
    short value1;
    short value2;
};

Representing an unknown command. If you have an unknown command, you can represent it as a variant of the three types:

#include <variant>  // std::variant and std::visit (C++17)

using Command = std::variant<Constant, UnaryOperation, BinaryOperation>;
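As a small illustration of how such a variant behaves, here is a minimal sketch, assuming C++17. The three structs are repeated so that the snippet compiles on its own, and the helper names isBinary and constantOr are made up for this example.

```cpp
#include <variant>

struct Constant { short value; };
struct UnaryOperation { unsigned char opcode; short value; };
struct BinaryOperation { unsigned char opcode; short value1; short value2; };

using Command = std::variant<Constant, UnaryOperation, BinaryOperation>;

// Hypothetical helper: true when the command currently holds a BinaryOperation.
bool isBinary(const Command& c) {
    return std::holds_alternative<BinaryOperation>(c);
}

// Hypothetical helper: returns the payload of a Constant,
// or `fallback` when the command holds any other alternative.
short constantOr(const Command& c, short fallback) {
    if (const Constant* k = std::get_if<Constant>(&c)) return k->value;
    return fallback;
}
```

A Command always holds exactly one of the three alternatives, so the type of the packet travels with the data and no separate tag field is needed.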

Applying a function based on the type of command. Let's say you have different functions for each command:

short eval(Constant c) {
    return c.value;
}
short eval(UnaryOperation u) {
    switch(u.opcode) {
        // stuff
    }
}
short eval(BinaryOperation b) {
    switch(b.opcode) {
        // stuff
    }
}

We can use std::visit to evaluate an arbitrary Command:

short evaluate_command(Command const& command) {
    // std::visit takes the visitor first and the variant second.
    // Overload resolution inside the lambda picks the right eval().
    return std::visit([](auto const& cmd) { return eval(cmd); }, command);
}
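The eval overloads above leave the switch bodies open. To show the dispatch end to end, here is a self-contained sketch with made-up opcode values (kNegate, kAdd and kMultiply are assumptions for this example only, not part of any real protocol):

```cpp
#include <stdexcept>
#include <variant>

struct Constant { short value; };
struct UnaryOperation { unsigned char opcode; short value; };
struct BinaryOperation { unsigned char opcode; short value1; short value2; };
using Command = std::variant<Constant, UnaryOperation, BinaryOperation>;

// Hypothetical opcode values, chosen only for this example.
constexpr unsigned char kNegate = 0x01;
constexpr unsigned char kAdd = 0x10;
constexpr unsigned char kMultiply = 0x11;

short eval(Constant c) { return c.value; }

short eval(UnaryOperation u) {
    switch (u.opcode) {
        case kNegate: return static_cast<short>(-u.value);
        default: throw std::invalid_argument("unknown unary opcode");
    }
}

short eval(BinaryOperation b) {
    switch (b.opcode) {
        case kAdd: return static_cast<short>(b.value1 + b.value2);
        case kMultiply: return static_cast<short>(b.value1 * b.value2);
        default: throw std::invalid_argument("unknown binary opcode");
    }
}

// The generic lambda is instantiated once per alternative, and each
// instantiation calls the eval() overload that matches that alternative.
short evaluate_command(const Command& command) {
    return std::visit([](const auto& cmd) { return eval(cmd); }, command);
}
```

With these assumed opcodes, evaluating BinaryOperation{kMultiply, 6, 7} goes through the BinaryOperation overload and multiplies the two values.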

Parsing a command. A std::variant can be constructed implicitly from a value of any of the types it's defined over. That means that once you can tell from the stream which command comes next, parsing reduces to reading that command's fields and returning the matching struct.

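#include <istream>
#include <stdexcept>

enum class OpType : unsigned char {
    ConstantOp, UnaryOp, BinaryOp
};

// Reads sizeof(T) raw bytes into a trivially copyable value.
// operator>> does formatted (text) extraction, so binary data needs read().
// Assumes the stream is in binary mode and uses the host's byte order.
template <class T>
T readRaw(std::istream& in) {
    T value{};
    if (!in.read(reinterpret_cast<char*>(&value), sizeof value))
        throw std::runtime_error("unexpected end of stream");
    return value;
}

// Command can be automatically constructed from a Constant, a UnaryOperation, or a BinaryOperation
Command readFromStream(std::istream& i) {
    // Read the type of the operation
    const auto type = readRaw<OpType>(i);

    // Return either a Constant, a UnaryOperation, or a BinaryOperation
    switch (type) {
        case OpType::ConstantOp: {
            const auto value = readRaw<short>(i);
            return Constant{value};
        }
        case OpType::UnaryOp: {
            const auto op = readRaw<unsigned char>(i);
            const auto value = readRaw<short>(i);
            return UnaryOperation{op, value};
        }
        case OpType::BinaryOp: {
            const auto op = readRaw<unsigned char>(i);
            const auto value = readRaw<short>(i);
            const auto value2 = readRaw<short>(i);
            return BinaryOperation{op, value, value2};
        }
    }
    // Reached only when the type byte matches no known OpType.
    throw std::runtime_error("unknown operation type");
}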
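If the packet layouts have to stay as data rather than as C++ types, which is what the question asks for, a descriptor table is one possible alternative to the variant approach. The following is only a sketch under assumptions the question does not state: the first byte of a packet is the opcode, numbers are unsigned and big-endian and at most 8 bytes wide, and text fields are raw ASCII. All the names here (FieldKind, FieldSpec, CommandSpec, FormatTable, packetToCsv) are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical descriptor types; none of these names come from the question.
enum class FieldKind { Number, Text };

struct FieldSpec {
    std::string name;
    FieldKind kind;
    std::size_t bytes;  // assumed to be <= 8 for Number fields
};

struct CommandSpec {
    std::string name;
    std::vector<FieldSpec> fields;  // the fields that follow the opcode byte
};

// In-memory format table, keyed by opcode.
using FormatTable = std::unordered_map<std::uint8_t, CommandSpec>;

// Decodes one packet into a CSV line by walking the matching CommandSpec.
// Returns std::nullopt for an unknown opcode or a truncated packet.
std::optional<std::string> packetToCsv(const FormatTable& table,
                                       const std::vector<std::uint8_t>& packet) {
    if (packet.empty()) return std::nullopt;
    const auto it = table.find(packet[0]);
    if (it == table.end()) return std::nullopt;

    std::string line = it->second.name;
    std::size_t pos = 1;  // skip the opcode byte
    for (const FieldSpec& f : it->second.fields) {
        if (f.bytes > packet.size() - pos) return std::nullopt;
        line += ',';
        if (f.kind == FieldKind::Number) {
            // Assemble an unsigned big-endian integer byte by byte.
            std::uint64_t v = 0;
            for (std::size_t i = 0; i < f.bytes; ++i) v = (v << 8) | packet[pos + i];
            line += std::to_string(v);
        } else {
            for (std::size_t i = 0; i < f.bytes; ++i)
                line += static_cast<char>(packet[pos + i]);
        }
        pos += f.bytes;
    }
    return line;
}
```

The table can be filled once at startup, from whatever config format is convenient, so the cost of parsing the config is paid only once and each packet costs one hash lookup plus a walk over its fields. The trade-off against the variant approach is that field types are checked at run time instead of at compile time.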