I am using a dataframe that has multiple NA values so I was thinking about sorting the attributes based on their NA values.
I was trying to use a for loop and this is what I have so far:
> data <- read.csv("C:/Users/Nikita/Desktop/first1k.csv")
> for (i in 1:length(data) ) {
+ temp <- c(sum(is.na(data[i])))}
> temp
[1] 0
This is the first time I am using a for loop in R, so I am sure it is just a silly syntax problem, but I can't work out what exactly is wrong.
Ultimately, I need a list that shows the name of the attribute and its NA count. This way I could sort the list and get the desired information. Here is some mock data to make it easier.
data <- data.frame(A = c(500, 600, 700, 1000),
B = c(500, 600, 700, NA),
C = c(NA, NA, 500, 700),
D = c(800, NA, 933, NA),
E = c(NA, NA, NA, NA))
Edit:
Thank you all for the help. All three solutions worked for me. I do wonder, though, whether there is a one-line way to sort those attributes before I export them to a file. Like I mentioned before, I am quite new to R, so I am not sure if it is possible.
Edit 2: When I run the sort, it gives me the following error:
temp <- sort(temp)
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
'x' must be atomic
Any idea why?
The right way to write iterative code in R is to avoid explicit for loops. Use apply (and company) instead. @jeremycg gave you the right R-ish answer. Regarding your code, it needs a few changes to work. You had temp rewritten at each iteration, so only the value from the final pass survived. Moreover, you didn't write the labels of your variables into temp. Hence the output you see is the number of NAs in the last column of your dataset.
Regarding OP's edit
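A minimal sketch of both points, using the mock data from the question. The names na_loop, na_counts, na_sorted, na_list and na_from_list are only illustrative. It also assumes that the temp which failed in sort() was a list (for example, the result of lapply), which would explain the "'x' must be atomic" error, since sort() expects an atomic vector.

```r
# Mock data from the question
data <- data.frame(A = c(500, 600, 700, 1000),
                   B = c(500, 600, 700, NA),
                   C = c(NA, NA, 500, 700),
                   D = c(800, NA, 933, NA),
                   E = c(NA, NA, NA, NA))

# Loop version: preallocate a named vector and fill one slot per column,
# instead of overwriting a single value on every iteration
na_loop <- setNames(integer(ncol(data)), names(data))
for (i in seq_along(data)) {
  na_loop[i] <- sum(is.na(data[[i]]))
}

# Vectorised version: is.na() gives a logical matrix, colSums() counts TRUEs
na_counts <- colSums(is.na(data))

# One line: count and sort, columns with the most NAs first
na_sorted <- sort(colSums(is.na(data)), decreasing = TRUE)

# If the counts are held in a list (assumed cause of the error),
# flatten them with unlist() before sorting
na_list <- lapply(data, function(col) sum(is.na(col)))
na_from_list <- sort(unlist(na_list))
```

Because the sorted result is a named vector, it keeps the column names next to the counts, so it can be turned into a data frame and exported if needed.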