Python pandas, read_fwf

I'm trying to read a table from a text file (or StringIO) into pandas. To accomplish this, I use pandas.read_fwf.

However, I'm facing problems with the automatic column width detection. In my case it works properly for columns 1-3 but not for column 4, which contains informal text of undefined width.

The detection works well for the first three columns, because their widths can be determined properly from the headers. The start of the 4th column can also be determined properly, as it is aligned with the corresponding header.

However, pandas refuses to put all remaining text into the 4th column. It either creates several Unnamed: X columns, each holding one word of the informal text, or it creates one named column that contains only the first word of the informal text.

Here is the column format:

CL              NAME          STATE        INFO
some category   some_name     some_state   some informal info text
...

I'd like all categories to end up in column 1, all names in column 2, all states in column 3, and the complete info text in column 4.

The two options I tried were:

  • x1 = pandas.read_fwf(infile, infer_nrows=1)
    

    -> Results in an INFO column containing only the first word of the info text.

                           CL          NAME  ... Unnamed: 5 Unnamed: 6
    0          some category     some_name  ...        NaN        NaN
    
  • x2 = pandas.read_fwf(infile)
    

    -> Results in several unnamed columns each containing one word of the info text.

        CL             NAME             STATE     INFO
    0   some      some_name        some_state     some
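
For reference, here is a minimal sketch of one approach that might give the desired result: skip the automatic inference and pass explicit colspecs derived from the header line, leaving the last column open-ended. It assumes that every column starts exactly under its header and that header names contain no spaces. The sample rows and the make_line helper are made up for illustration, and the open-ended (start, None) tuple relies on read_fwf slicing each line with the given bounds, so it is worth checking against your pandas version.

```python
import io
import re

import pandas as pd


def make_line(cl, name, state, info):
    # Hypothetical helper: pads the fields to the widths used in the
    # question's layout (columns start at offsets 0, 16, 30 and 43).
    return cl.ljust(16) + name.ljust(14) + state.ljust(13) + info


# Made-up sample data in the same layout as the table in the question.
text = "\n".join([
    make_line("CL", "NAME", "STATE", "INFO"),
    make_line("some category", "some_name", "some_state", "some informal info text"),
    make_line("other cat", "name_2", "state_2", "short note"),
]) + "\n"

# Take the column start offsets from the header line only.
header = text.splitlines()[0]
starts = [m.start() for m in re.finditer(r"\S+", header)]

# Each column ends where the next one starts; the last column gets an
# open end (None) so that it takes the rest of the line.
colspecs = list(zip(starts, starts[1:] + [None]))

df = pd.read_fwf(io.StringIO(text), colspecs=colspecs)
print(df)
```

If the open-ended tuple is not accepted, a large fixed end offset for the last column (larger than the longest line) should behave the same way.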
    