How to find non-NA values (sample size) of a large data frame?

I have a large data frame that contains lots of NAs. The rows are soil samples from different plots, and the columns are chemical variables. I wanted to create a column or data frame with the sample size of each variable to identify which variables may be undersampled.

When I looked online, the answers I found were either specific to correlation tests or focused on counting occurrences of specific values, not simply on whether a value is non-NA, so they did not help me.

I can brute-force the issue by counting NAs in each column and subtracting those from the number of samples, but I have 400 columns and don't know how to write a function to do that.

Sample ID   C:N   %Fe
Plot1       46    3
Plot2       NA    5

If this were the table, I'd want a column or data frame of "C:N sample size" = 1, %Fe = 2. This is where it's odd, because there would only be 1 row for each column variable, so I guess I'd want to make it as a new data frame or table.

If there are any links to good guides for making reprexes of data frames in R, I'd also appreciate that; this is my first question.

Thank you!

There is 1 answer

TarJae

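This will give you the number of NAs per column in your_dataframe. To get the non-NA count (the sample size) per column, subtract the result from nrow(your_dataframe), or sum !is.na(.) instead of is.na(.).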

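library(dplyr)  # provides the %>% pipe
library(purrr)  # provides map_df()

# Count the NA values in each column; the result is a one-row tibble
# with one column per variable
your_dataframe %>%
  map_df(~ sum(is.na(.)))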
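
For the sample sizes themselves, a base R alternative may be simpler. The sketch below is not part of the answer above; it rebuilds the two-row table from the question as a small reproducible data frame and counts the non-NA values per column with colSums(). The names soil, n_obs, and sample_sizes are made up for illustration, and it assumes the first column holds the sample ID.

```r
# Toy data mirroring the table in the question
soil <- data.frame(
  SampleID = c("Plot1", "Plot2"),
  `C:N`    = c(46, NA),
  `%Fe`    = c(3, 5),
  check.names = FALSE  # keep "C:N" and "%Fe" as literal column names
)

# Non-NA count per chemical variable (the ID column is dropped first)
n_obs <- colSums(!is.na(soil[, -1]))

# Long format: one row per variable, which is easier to scan with 400 columns
sample_sizes <- data.frame(variable = names(n_obs), n = unname(n_obs))

# Least-sampled variables first
sample_sizes[order(sample_sizes$n), ]
```

With the toy data this gives n = 1 for C:N and n = 2 for %Fe, matching the result described in the question. Building a small data frame by hand like this (or pasting the output of dput(head(your_dataframe))) is also a common way to share reproducible example data.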