read.table expects 39 columns in .txt file, when there are only 20


I'm trying to read a .txt file. For some reason I get the following error message:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 39 elements

I used the following code:

FAMA <- read.table("FAMAFrench.txt", header = TRUE)

There are only 20 columns of data, and none of the rows have gaps. The columns are delimited by spaces. There are 1166 rows of data.

The .txt file looks like this:

month  <= 0      Lo 30  Med 40   Hi 30   Lo 20   Qnt 2   Qnt 3   Qnt 4   Hi 20   Lo 10   Dec 2   Dec 3   Dec 4   Dec 5   Dec 6   Dec 7   Dec 8   Dec 9   Hi 10
192607  -99.99    0.14    1.54    3.42    0.37    0.48    1.68    1.41    3.67   -0.12    0.52   -0.05    0.82    1.39    1.89    1.62    1.29    3.53    3.71
192608  -99.99    3.19    2.73    2.91    2.21    3.58    3.70    1.50    3.07    1.13    2.55    4.00    3.32    2.76    4.41    1.54    1.49    0.61    3.79
192609  -99.99   -1.73   -0.88    0.80   -1.39   -1.25    0.07   -0.23    0.81    0.59   -2.00   -2.01   -0.77   -0.01    0.13   -1.93    0.74   -0.77    1.25
192610  -99.99   -2.94   -3.26   -2.79   -2.56   -3.99   -2.65   -3.36   -2.74   -4.29   -2.01   -3.25   -4.45   -3.02   -2.39   -3.55   -3.26   -3.36   -2.56
192611  -99.99   -0.38    3.73    2.74   -0.95    3.03    3.50    3.25    2.71   -3.28   -0.23    0.08    4.90    3.56    3.45    3.60    3.05    3.86    2.40
192612  -99.99    4.15    1.66    3.04    2.45    3.47    1.37    2.81    3.00   -2.49    3.93    5.60    2.22    1.33    1.40    1.81    3.37    3.11    2.97

There is 1 answer

jay.sf

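The column names themselves contain single spaces (e.g. Lo 30, Med 40), so with the default whitespace separator the header line splits into 39 fields, while each data row has only 20. That is why read.table complains that line 1 did not have 39 elements. The columns are separated by two or more spaces, but the sep argument of read.table can only be a single byte, so you can't ask it to split on multiple spaces. What you can do is skip the first row (i.e. the header) and read it in separately using readLines. Then you can strsplit it at '\\s{2,}' (two or more whitespace characters) and setNames with it.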

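# Read the data rows only, skipping the header line
read.table('file.txt', skip=1) |>
  # Split the header at runs of two or more whitespace characters and use the pieces as names
  setNames(el(strsplit(readLines('file.txt', 1), '\\s{2,}')))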
#    month   <= 0 Lo 30 Med 40 Hi 30 Lo 20 Qnt 2 Qnt 3 Qnt 4 Hi 20 Lo 10 Dec 2 Dec 3 Dec 4 Dec 5 Dec 6 Dec 7 Dec 8 Dec 9 Hi 10
# 1 192607 -99.99  0.14   1.54  3.42  0.37  0.48  1.68  1.41  3.67 -0.12  0.52 -0.05  0.82  1.39  1.89  1.62  1.29  3.53  3.71
# 2 192608 -99.99  3.19   2.73  2.91  2.21  3.58  3.70  1.50  3.07  1.13  2.55  4.00  3.32  2.76  4.41  1.54  1.49  0.61  3.79
# 3 192609 -99.99 -1.73  -0.88  0.80 -1.39 -1.25  0.07 -0.23  0.81  0.59 -2.00 -2.01 -0.77 -0.01  0.13 -1.93  0.74 -0.77  1.25
# 4 192610 -99.99 -2.94  -3.26 -2.79 -2.56 -3.99 -2.65 -3.36 -2.74 -4.29 -2.01 -3.25 -4.45 -3.02 -2.39 -3.55 -3.26 -3.36 -2.56
# 5 192611 -99.99 -0.38   3.73  2.74 -0.95  3.03  3.50  3.25  2.71 -3.28 -0.23  0.08  4.90  3.56  3.45  3.60  3.05  3.86  2.40
# 6 192612 -99.99  4.15   1.66  3.04  2.45  3.47  1.37  2.81  3.00 -2.49  3.93  5.60  2.22  1.33  1.40  1.81  3.37  3.11  2.97
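
To check where the 39 comes from, here is a minimal self-contained sketch. It assumes a temporary file (the hypothetical path tmp) holding the header and the first two data rows copied from the question, uses count.fields to count whitespace-separated fields per line, and then applies the same header-splitting idea. Passing the names via col.names with check.names = FALSE is just one alternative to setNames, not the only way to do it.

```r
# Hypothetical sample: header plus two data rows copied from the question
sample_lines <- c(
  "month  <= 0      Lo 30  Med 40   Hi 30   Lo 20   Qnt 2   Qnt 3   Qnt 4   Hi 20   Lo 10   Dec 2   Dec 3   Dec 4   Dec 5   Dec 6   Dec 7   Dec 8   Dec 9   Hi 10",
  "192607  -99.99    0.14    1.54    3.42    0.37    0.48    1.68    1.41    3.67   -0.12    0.52   -0.05    0.82    1.39    1.89    1.62    1.29    3.53    3.71",
  "192608  -99.99    3.19    2.73    2.91    2.21    3.58    3.70    1.50    3.07    1.13    2.55    4.00    3.32    2.76    4.41    1.54    1.49    0.61    3.79"
)
tmp <- tempfile(fileext = ".txt")  # hypothetical path, stands in for the real file
writeLines(sample_lines, tmp)

# Diagnosis: number of whitespace-separated fields on each line.
# The header yields 39 because every name except "month" contains a space.
n_fields <- count.fields(tmp)
n_fields
# 39 20 20

# Fix: split the header at runs of two or more whitespace characters ...
hdr <- strsplit(trimws(readLines(tmp, n = 1)), "\\s{2,}")[[1]]

# ... and read the data rows with those names, keeping them unmodified
fama <- read.table(tmp, skip = 1, col.names = hdr, check.names = FALSE)
str(fama)
```

Because the names contain spaces, columns have to be accessed with backticks or double brackets, e.g. fama$`Lo 30` or fama[["Lo 30"]].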