Using TextFieldParser.FixedWidth in variable length strings


I am working on a parser intended to read data in a fixed-width format (10 columns of 8 characters each). However, not every line conforms to this layout, and the non-conforming lines can still contain valid data. It is not safe to assume that there is an escape character (such as the + in the figure below), as that is only one of several formats.

I tried using TextFieldParser with FieldType.FixedWidth and ten fields of width 8, but any line that does not match this layout raises a MalformedLineException and ends up in ErrorLine instead.

  1. Parsing from inside my exception-handling block doesn't seem like good practice. Is it?

  2. Since only the discrepant lines require additional work, is a brute-force submethod the best approach? All of my data comes in 8-character blocks. The final block in a line can be tricky, in that it may be shorter if it was entered manually. (This is predicated on #1 being OK to do.)

  3. Is there a better tool to be using? I feel like I'm trying to fit a square peg in a round hole with a fixed-width TextFieldParser.
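The brute-force approach from point 2 could look something like the sketch below. It is only an illustration: `FixedWidthSketch` and `SplitFixed` are hypothetical names, not part of TextFieldParser, and it assumes every field starts on a multiple of the block width, with only the final block allowed to be short.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthSketch
{
    // Hypothetical helper: cuts one line into blocks of `width` characters.
    // The last block may be shorter than `width`; each block is trimmed.
    public static string[] SplitFixed(string line, int width)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));

        var fields = new List<string>();
        for (int start = 0; start < line.Length; start += width)
        {
            // Take a full block, or whatever is left at the end of the line.
            int length = Math.Min(width, line.Length - start);
            fields.Add(line.Substring(start, length).Trim());
        }
        return fields.ToArray();
    }
}
```

With a width of 8, a line holding two full blocks followed by a two-character block comes back as three trimmed fields, and a blank block in the middle of a line comes back as an empty string rather than being dropped.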

Note: Delimited parsing is not an option, see the 2nd figure.

Edit for clarification: the text below is a pair of excerpts from input decks for NASTRAN, a finite element code. I am aiming for a generalized parsing method that reads the text files in and then hands the split-up string[] arrays off to other methods, which process each card into a specific mapped object (e.g. in the image below, the two object types are RBE3 and SET1).

Extracted Method:

    using System.Collections.Generic;
    using System.Diagnostics;
    using Microsoft.VisualBasic.FileIO;

    public static IEnumerable<string[]> ParseFixed(string fileName, int width, int colCount)
    {
        var fieldArrayList = new List<string[]>();
        using (var tfp = new TextFieldParser(fileName))
        {
            tfp.TextFieldType = FieldType.FixedWidth;

            // Every column gets the same width (8 characters x 10 columns in my case).
            var fieldWidths = new int[colCount];
            for (int i = 0; i < fieldWidths.Length; i++)
            {
                fieldWidths[i] = width;
            }

            // Lines starting with "$" are comments and are skipped by the parser.
            tfp.CommentTokens = new string[] { "$" };
            tfp.FieldWidths = fieldWidths;
            tfp.TrimWhiteSpace = true;

            while (!tfp.EndOfData)
            {
                try
                {
                    fieldArrayList.Add(tfp.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    // Lines that do not match the 8 x 10 layout end up here.
                    Debug.WriteLine(ex.ToString());
                    // parse atypical lines here...?
                    continue;
                }
            }
        }
        return fieldArrayList;
    }
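For the "parse atypical lines here" comment, one possible shape is sketched below. It reads the rejected line back from the parser's `ErrorLine` property and slices it manually. `FallbackSketch`, `ParseFixedWithFallback` and `SliceLine` are hypothetical names, and the sketch assumes a rejected line is still laid out in blocks of `width` characters with only the last block short. Whether doing this inside the catch block is good practice is exactly what question 1 asks, so treat it as an illustration rather than a recommendation.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO;

public static class FallbackSketch
{
    public static List<string[]> ParseFixedWithFallback(string fileName, int width, int colCount)
    {
        var rows = new List<string[]>();
        using (var tfp = new TextFieldParser(fileName))
        {
            tfp.TextFieldType = FieldType.FixedWidth;
            var widths = new int[colCount];
            for (int i = 0; i < widths.Length; i++) widths[i] = width;
            tfp.FieldWidths = widths;
            tfp.CommentTokens = new[] { "$" };
            tfp.TrimWhiteSpace = true;

            while (!tfp.EndOfData)
            {
                try
                {
                    rows.Add(tfp.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // ErrorLine holds the text of the line that was just rejected.
                    rows.Add(SliceLine(tfp.ErrorLine, width));
                }
            }
        }
        return rows;
    }

    // Hypothetical helper: cut the line into `width`-character blocks,
    // letting the final block be shorter, and trim each block.
    private static string[] SliceLine(string line, int width)
    {
        var fields = new List<string>();
        for (int start = 0; start < line.Length; start += width)
        {
            int length = Math.Min(width, line.Length - start);
            fields.Add(line.Substring(start, length).Trim());
        }
        return fields.ToArray();
    }
}
```

One consequence of this design is that rows coming from the fallback path can have fewer than `colCount` entries, so the downstream card-processing methods would need to tolerate arrays of varying length.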

Example data (Show All Characters turned on for better visibility)

More example data
