I have human datasets with Ensembl gene IDs, and I want to annotate them with gene symbols instead. One of these datasets has exactly 20176 genes. I tried two methods, but both returned NAs for some genes.
- First method:
library(biomaRt)
library(org.Hs.eg.db)
keytypes(org.Hs.eg.db)
Data <- read.csv("Data.csv", header = TRUE, row.names = 1)
Data$SYMBOL <- mapIds(org.Hs.eg.db, keys = row.names(Data), keytype = "ENSEMBL", column = "SYMBOL")
but I found exactly 3845 NAs:
sum(is.na(Data))
- Second method:
`library("EnsDb.Hsapiens.v86")
keytypes(EnsDb.Hsapiens.v86)
mapIds <- mapIds(EnsDb.Hsapiens.v86, keys = genes$'row.names(Data)', keytype = "GENEID", column = "SYMBOL")`
but I also found 761 NAs.
Is there a newer version of EnsDb.Hsapiens, or another package, that I could use to get all the gene symbols without any NAs?
My gene names: https://docs.google.com/document/d/1VVtveHXbOXt8m02ttcAmjHxF59YTFFgOEvyBhyqw13w/edit?usp=sharing
After downloading your shared data, steps taken:
At the above site they don't say explicitly what is returned if a match is not found; one would think `NA`. It appears the recommendation is to update to the version running the site above.
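Until the annotation is updated, one possible stopgap is to combine the two lookups from the question, since each may resolve IDs the other misses. The sketch below is base R only and uses made-up IDs and symbols rather than real annotation results. `coalesce_symbols()` is a hypothetical helper, not part of any package. It assumes both lookups are character vectors named by Ensembl ID.

```r
# Toy stand-ins for the two lookups in the question; the IDs and symbols
# below are made up for illustration and are not real annotation results.
ids <- c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D")
from_orgdb <- c(ENSG_A = "GENE1", ENSG_B = NA, ENSG_C = NA, ENSG_D = "GENE4")
from_ensdb <- c(ENSG_A = "GENE1", ENSG_B = "GENE2", ENSG_C = NA, ENSG_D = NA)

# Hypothetical helper: take the symbol from the first source, fill the gaps
# from the second source, and fall back to the Ensembl ID itself if both
# sources return NA.
coalesce_symbols <- function(ids, first, second) {
  out <- unname(first[ids])
  gaps <- is.na(out)
  out[gaps] <- unname(second[ids[gaps]])
  gaps <- is.na(out)
  out[gaps] <- ids[gaps]
  out
}

symbols <- coalesce_symbols(ids, from_orgdb, from_ensdb)
```

With the real data, `first` and `second` would be the two `mapIds()` results, provided both are keyed by `row.names(Data)`. Keeping the Ensembl ID as the last fallback is a design choice that avoids NA row labels. IDs with no symbol in either source stay unresolved.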