I have a large data frame that contains lots of NAs. The rows are soil samples from different plots, and the columns are chemical variables. I wanted to create a column or data frame with the sample size of each variable to identify which variables may be undersampled.
When I looked online, the answers I found were either specific to correlation tests or focused on counting occurrences of specific values rather than simply counting non-NA values, so they did not help me.
I can brute-force this by counting the NAs in each column and subtracting that from the number of samples, but I have 400 columns and don't know how to write a function that does this for all of them.
Sample ID | C:N | %Fe |
---|---|---|
Plot1 | 46 | 3 |
Plot2 | NA | 5 |
If this were the table, I'd want a column or data frame with "C:N sample size" = 1 and "%Fe sample size" = 2. The odd part is that there would be only one value for each column variable, so I guess the result would have to be a new data frame or table.
If there are any links to good guides for making reprexes for data frames in R, I'd also appreciate that. This is my first question.
Thank you!
This will give you the `NA`s per column in `your_dataframe`.
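
A minimal sketch of one way to do this in base R, assuming the data frame is named `your_dataframe`; the data below is made up to mirror the table in the question, and `na_counts`, `sample_sizes`, and `sample_size_table` are illustrative names. `colSums(is.na(...))` counts the `NA`s per column, and negating the test counts the non-`NA` values, which is the sample size asked for.

```r
# Hypothetical data mirroring the example table in the question.
# check.names = FALSE keeps column names such as "C:N" and "%Fe" unchanged.
your_dataframe <- data.frame(
  `Sample ID` = c("Plot1", "Plot2"),
  `C:N` = c(46, NA),
  `%Fe` = c(3, 5),
  check.names = FALSE
)

# Number of NAs in each column
na_counts <- colSums(is.na(your_dataframe))

# Number of non-NA values (the sample size) in each column
sample_sizes <- colSums(!is.na(your_dataframe))

# The same counts as a data frame with one row per variable
sample_size_table <- data.frame(
  variable = names(sample_sizes),
  n = unname(sample_sizes)
)

print(sample_size_table)
```

Because `colSums` works on every column at once, this should scale to 400 columns without writing a loop or a custom function. For sharing data in a question, pasting the output of `dput(head(your_dataframe))` is a common way to make the example reproducible.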