R biomaRt package: obtaining all values in linked databases


A bioinformatics programming question. In R, I have a classic speciesA-to-speciesB gene symbol conversion, in this example from mouse to human, which I'm performing using biomaRt, and specifically the getLDS function.

library(biomaRt)

x <- c("Lbp", "Ndufv3", "Ggt1")

# Convert mouse MGI symbols to human HGNC symbols via the linked Ensembl datasets
convert <- function(x) {
    human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
    mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")

    newgenes <- getLDS(
        attributes = "mgi_symbol",
        filters = "mgi_symbol",
        values = x,
        mart = mouse,
        attributesL = "hgnc_symbol",
        martL = human,
        uniqueRows = TRUE
    )
    # Return the mouse/human symbol pairs without duplicate rows
    unique(newgenes)
}
conversion <- convert(x)

However, I would like to obtain ALL IDs present in the linked databases: in other words, all mouse/human pairs (in this example). Is there something I can pass to the values parameter of getLDS so that it retrieves every ID, not just those specified in the x variable? I am after a full map, tens of thousands of lines long, listing all orthologous relationships between the symbols of the two databases.

Any ideas or workarounds? Thanks a lot!

There is 1 answer

Tonio On BEST ANSWER

I believe a workaround could be to retrieve all IDs from the BioMart web interface itself, here: https://www.ensembl.org/biomart/martview/

  • Click on choose database -> Ensembl Genes
  • Choose dataset -> your selected species (e.g. Mouse genes)
  • Click on Results -> Check "Unique results only" -> Go
  • Profit

The list retrieved this way currently has 53605 IDs, which is, I believe, what you need.
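
The same export can probably be done from R as well. The sketch below assumes that calling getBM without filters or values returns the whole dataset, and that the mouse dataset exposes the human homolog symbol under the attribute name hsapiens_homolog_associated_gene_name (check with listAttributes(mouse) before relying on it). The helper names fetch_mouse_human_pairs and clean_pairs are made up for illustration and are not part of biomaRt.

```r
# Sketch only: requires the biomaRt package and network access to Ensembl.
# The helper names below are hypothetical, not part of biomaRt.

# Fetch every mouse gene together with its human homolog symbol.
# Omitting `filters` and `values` asks BioMart for the whole dataset.
fetch_mouse_human_pairs <- function() {
    mouse <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
    biomaRt::getBM(
        attributes = c(
            "ensembl_gene_id",
            "external_gene_name",                    # mouse symbol
            "hsapiens_homolog_associated_gene_name"  # human symbol (assumed name)
        ),
        mart = mouse,
        uniqueRows = TRUE
    )
}

# Keep only rows where both symbols are present, and drop duplicate pairs.
clean_pairs <- function(pairs,
                        from = "external_gene_name",
                        to = "hsapiens_homolog_associated_gene_name") {
    has_both <- !is.na(pairs[[from]]) & nzchar(pairs[[from]]) &
        !is.na(pairs[[to]]) & nzchar(pairs[[to]])
    out <- unique(pairs[has_both, c(from, to), drop = FALSE])
    rownames(out) <- NULL
    out
}

# Usage (not run here, since it downloads the full table):
# all_pairs <- clean_pairs(fetch_mouse_human_pairs())
# head(all_pairs)
```

Genes without a human homolog come back with an empty symbol, which is why the cleaning step filters on both columns; skip it if you want to keep unmatched mouse genes in the map.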

Enjoy!