Is it possible to coerce NAs into logicals in R?


I'm trying to sort a data frame like this:

State     Value
   AK         1
   WA         3
   LA        NA
   AK        NA
   OR         1
   DL         1

and then find, say, the state at rank 2.

If I sort by the order of "Value", it'd become something like this:

State     Value
   AK         1
   OR         1
   DL         1
   WA         3
   LA        NA
   AK        NA

But this is not the ranking I want. I want rows with the same Value to be sorted alphabetically by State, which means rank 2 should be DL instead of OR.

The way I do it is I use a while loop to check if there are states above and below OR with the same Value, and get all the states with same Value. Take them out and reorder them and then map their positions.

count <- 2
ucount <- 0
dcount <- count

while (data$Value[count] == data$Value[count + 1]) {
    dcount <- dcount + 1
    count <- dcount
}

count <- num

while (data$Value[count] == data$Value[count - 1]) {
    ucount <- ucount + 1
    count <- count - 1
}

if (dcount == 2 && ucount == 0) {
    result <- as.character(data$State[2])
}

else if (dcount != num) {
    result <- as.character(sort(data[ucount:dcount, "State"])[1])
}

else {
    result <- as.character(sort(data[ucount:dcount, "State"])[ucount + 1])
}

result

This code, though ugly, works, but only when there are no NA values directly below the tied rows.

If we don't have

State     Value
   WA         3

The data frame would become

State     Value
   AK         1
   OR         1
   DL         1
   LA        NA
   AK        NA

Then the code won't work because it'll try to compare with the Value of LA, which is NA. I want to tell the while loop to stop when it sees an NA, but R doesn't seem to let me do anything with NA.
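For reference, here is a minimal sketch of the NA behavior described above. In R, comparing a value with NA gives NA rather than TRUE or FALSE, and `while` or `if` stops with "missing value where TRUE/FALSE needed" when its condition is NA. `is.na()` and `isTRUE()` are base R functions; the vector `x` and the helper `run_length_from` are hypothetical names used only for illustration.

```r
# Hypothetical example vector: two tied values followed by an NA.
x <- c(1, 1, NA)

# A comparison involving NA yields NA, which `while` cannot use as a condition.
cmp <- x[2] == x[3]

# is.na() always returns TRUE or FALSE, so it is safe inside a condition.
last_is_missing <- is.na(x[3])

# isTRUE() maps an NA comparison to FALSE, so the loop simply stops at an NA.
# Hypothetical helper: count how many consecutive entries, starting at `start`,
# share the value at `start`. Indexing past the end also gives NA, so the loop
# stops there too.
run_length_from <- function(values, start) {
  n <- 1
  while (isTRUE(values[start + n] == values[start])) {
    n <- n + 1
  }
  n
}

tied <- run_length_from(x, 1)
```

Wrapping the comparison in `isTRUE()` is one way to make the loop condition well defined; testing `is.na()` explicitly before comparing would work as well.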

I know this is probably not a very smart way to do it, but it's the only way I can think of. I'm still pretty new to R, and I hope this is not a stupid question. Thanks for your help!

PS I've checked the post "How to sort a dataframe by column(s)?", which is indeed similar to what I want to do. However, my question is about how to deal with NAs. I'm happy if there's an alternative to my approach (which I know is not very smart, haha) that solves my problem nicely, but I'm still hoping this question will give me some insight into NAs.


There are 2 answers

Josh O'Brien On

The function order() accepts multiple arguments, allowing you to sort first on Value and then -- for breaking any ties -- by the alphabetical order of State:

df[order(df$Value, as.character(df$State)), ]
  State Value
1    AK     1
6    DL     1
5    OR     1
2    WA     3
4    AK    NA
3    LA    NA
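As a self-contained sketch of the same idea, the snippet below rebuilds the example data, sorts it, and picks out the state at rank 2. The names `df`, `sorted` and `rank` are assumptions made for illustration, and Value is assumed to be numeric. `na.last = TRUE` is the default for `order()` and is spelled out only to show where the NA rows end up.

```r
# Example data from the question (Value assumed numeric).
df <- data.frame(
  State = c("AK", "WA", "LA", "AK", "OR", "DL"),
  Value = c(1, 3, NA, NA, 1, 1),
  stringsAsFactors = FALSE
)

# Sort by Value, break ties by State, and keep rows with NA Value at the end.
sorted <- df[order(df$Value, df$State, na.last = TRUE), ]

# Look up the state at a given rank (hypothetical variable name).
rank <- 2
result <- sorted$State[rank]
```

With this data, `result` is "DL", which is the rank-2 state the question asks for.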
giraffehere On

Judging by your solution, something tells me you come from a C++-esque coding background, haha.

order can be used as in Josh's answer above; however, I think arrange from the (quite widely used) plyr package is the most readable. Use it like this:

> library(plyr)
> DF <- data.frame(State = c("AK", "WA", "LA", "AK", "OR", "DL"), Value = c("1", "3", NA, NA, "1", "1"))
> DF
  State Value
1    AK     1
2    WA     3
3    LA  <NA>
4    AK  <NA>
5    OR     1
6    DL     1
> DF <- arrange(DF, Value, State) # Sort by Value, then by State.
> DF
  State Value
1    AK     1
2    DL     1
3    OR     1
4    WA     3
5    AK  <NA>
6    LA  <NA>

Note that as seen above, arrange also resets the row names, which may or may not be something you want.
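For comparison, here is a small base R sketch (no plyr needed) of the row-name difference. Subsetting with `order()` keeps the original row names, and assigning `NULL` to `rownames()` renumbers them. The names `df` and `sorted` are assumptions for illustration, and Value is assumed to be numeric here rather than character as in the example above.

```r
# Same example data, built without factors.
df <- data.frame(
  State = c("AK", "WA", "LA", "AK", "OR", "DL"),
  Value = c(1, 3, NA, NA, 1, 1),
  stringsAsFactors = FALSE
)

# Base R sort: the original row names travel with the rows.
sorted <- df[order(df$Value, df$State), ]
kept_names <- rownames(sorted)

# Reset the row names to 1..n, similar to what arrange() returns.
rownames(sorted) <- NULL
reset_names <- rownames(sorted)
```

Keeping the original row names can be handy for tracing a sorted row back to its position in the input; resetting them is purely cosmetic.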