R. Why is the output data becoming ranks of the data instead of the original data?


I want to combine the 3rd column and the 8th column into one column. There are two problems in my code. The original data looks like this.

incidence<-read.csv("incidence.csv",head=F);incidence<-incidence[c(-1,-2),]

incidence[,3]
[1] 15266   1340    14842   7819    130516  8256    No Data No Data 1578    35914   27963  
[12] 3419    2379    No Data 22153   9482    8931    10433   No Data 3401    No Data 14764  
[23] 38551   9166    10448   19225   2071    5667    4934    2572    25518   5409    No Data
[34] 27011   2105    25539   5702    10365   40827   No Data 12829   1339    18739   40457  
[45] 4505    1779    24387   No Data 7586    17666   1629    No Data
46 Levels: 10365 10433 10448 12829 130516 1339 1340 14764 14842 15266 1578 1629 17666 ... Number of New Cases

The 8th column looks like this:

incidence[,8]
[1] 18705   1693    15199   8774    160836  9393    No Data No Data 1578    48646   38417  
[12] 4892    3241    No Data 23053   10599   6728    13365   No Data 3429    No Data 16927  
[23] 45537   12103   10930   19225   1954    5001    5152    2123    28859   6165    No Data
[34] 32294   1928    46637   No Data 11689   48231   No Data 11979   0       23199   50551  
[45] 5541    1917    20037   No Data 9400    20452   1752    No Data
45 Levels: 0 10599 10930 11689 11979 12103 13365 15199 1578 160836 16927 1693 1752 ... Number of New Cases

When I try to combine these data, I get the ranks of the original data, and I end up with 2 rows instead of 1 column. I do not know why.

rbind(incidence[,3],incidence[,8])
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
[1,]   10    7    9   40    5   41   45   45   11    30    27    29    21    45    20    44
[2,]   14   12    8   41   10   42   44   44    9    33    29    34    27    44    23     2
     [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31]
[1,]    42     2    45    28    45     8    31    43     3    16    18    37    35    25    23
[2,]    40     7    44    28    44    11    30     6     3    16    18    35    37    22    25
     [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46]
[1,]    36    45    26    19    24    38     1    33    45     4     6    15    32    34    14
[2,]    39    44    26    17    31    44     4    32    44     5     1    24    36    38    15
     [,47] [,48] [,49] [,50] [,51] [,52]
[1,]    22    45    39    13    12    45
[2,]    20    44    43    21    13    44

There is 1 answer

nicola

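The "ranks" you see are the integer codes of factor levels: because of the "No Data" strings, the columns were read as factors rather than numbers, and rbind discards the factor class and keeps only the internal codes. Missing data in R are represented by the NA value. Since not every data source writes missing values as NA, read.table (and therefore read.csv) lets you specify how missing values are marked through the na.strings argument. Try reading the file with: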

read.csv("incidence.csv",head=F,na.strings="No Data") 

This way, the columns you are interested in can be parsed as numeric, and you avoid factor/character/numeric conversion problems afterwards.
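As a rough illustration of both problems, here is a minimal sketch on made-up data. The `csv_text` sample, its two header rows, and the two-column layout are assumptions standing in for incidence.csv, which is not shown in the question. The `skip = 2` argument is also an assumption: the question drops the first two rows after reading, and "Number of New Cases" appears among the factor levels, which suggests those rows hold text that would otherwise keep the columns from being numeric.

```r
# Hypothetical sample standing in for incidence.csv: two text header rows,
# then data with "No Data" marking missing values.
csv_text <- "Area,Area
Number of New Cases,Number of New Cases
15266,18705
1340,1693
No Data,No Data
7819,8774"

# Reproduce the problem: the text entries make every column non-numeric.
# stringsAsFactors = TRUE mimics the factor columns shown in the question
# (it was the default before R 4.0.0).
raw <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = TRUE)
raw <- raw[c(-1, -2), ]

# rbind() drops the factor class, so the result holds integer level codes,
# and it stacks the two vectors as 2 rows.
codes <- rbind(raw[, 1], raw[, 2])
print(codes)

# Fix: skip the header rows and declare "No Data" as missing.
fixed <- read.csv(text = csv_text, header = FALSE, skip = 2,
                  na.strings = "No Data")
str(fixed)

# c() concatenates the two columns into one vector instead of two rows.
combined <- c(fixed[, 1], fixed[, 2])
print(combined)
```

If the data are already loaded as factors, as.numeric(as.character(x)) recovers the original values, turning "No Data" into NA with a warning. For a one-column layout rather than a plain vector, data.frame(value = combined) is one option.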