Dear Stackoverflowers,
I am currently coding an application that displays sequences and genes of 11 nematode species (such as C. elegans).
I am using R Shiny together with the gggenomes package, which you can think of as ggplot, but designed to display alignments between genes across several sequences.
gggenomes needs three data frames to work: seqs, genes and links.
These data frames contain the following columns:
- genes: seq_id, start, end, length, orthogroup
- seqs: seq_id, start, end, length
- links: seq_id, start, end, seq_id2, start2, end2
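To make the expected input concrete, here is a toy version of the three data frames. All values are invented for illustration, the last links column is assumed to be end2, and the orthogroup column is spelled Orthogroup to match the plotting call further down in this post.

```r
# Toy data only: coordinates and orthogroup IDs are made up for illustration.
seqs <- data.frame(
  seq_id = c("bovis CBOVI.ctg00005_chrIV", "elegans IV"),
  start  = c(1, 1),
  end    = c(10000, 12000),
  length = c(10000, 12000)
)

# One gene per sequence, both assigned to the same (hypothetical) orthogroup.
genes <- data.frame(
  seq_id     = c("bovis CBOVI.ctg00005_chrIV", "elegans IV"),
  start      = c(1000, 1500),
  end        = c(3000, 3800),
  length     = c(2001, 2301),
  Orthogroup = c("OG0001", "OG0001")
)

# One link connecting the gene on the first sequence to the gene on the second.
# Assumption: the last column is end2.
links <- data.frame(
  seq_id  = "bovis CBOVI.ctg00005_chrIV",
  start   = 1000,
  end     = 3000,
  seq_id2 = "elegans IV",
  start2  = 1500,
  end2    = 3800
)
```

Every seq_id used in genes and links also appears in seqs, which is how the three tables refer to each other.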
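Here is an example:
    library(gggenomes)  # also attaches ggplot2, which provides theme() and labs()

    p <- gggenomes(seqs = seqs, genes = genes, links = links) +
      geom_seq() +                                          # sequence backbones
      geom_gene(aes(fill = Orthogroup), stroke = 0.5) +     # genes, colored by orthogroup
      geom_bin_label(fontface = "italic", size = 5, expand_left = 0.8) +  # labels left of each sequence
      geom_link(offset = 0.25) +                            # links between sequences
      theme(axis.text.x = element_text(size = 15)) +
      labs(fill = "Orthogroups")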
Since it works like ggplot, it also uses geoms and aesthetics (aes).
One last thing you need to know:
geom_bin_label is a geom that takes the seq_id column from the seqs data frame and plots the sequence name to the left of each sequence.
Here is a plot generated with gggenomes using geom_bin_label:
On the plot, the seq_ids are constructed like this: "species_name sequence_name".
Example: "bovis CBOVI.ctg00005_chrIV"
WHAT I WANT TO DO
- Align the species_name values and the sequence_name values so that they form two neat columns on the plot.
For example:
bovis          CBOVI.ctg00005_chrIV      sequences here...
becei          CSP29.scaffold174_cov172  ...
panamensis     CSP28.scaffold107_cov92   ...
inopinata      SP34_chr4                 ...
elegans        IV                        ...
tropicalis     Scaffold629               ...
remanei        IV                        ...
latens         scaffold_77               ...
tribulationis  CSP40_scaffold02881       ...
briggsae       IV                        ...
nigoni         CM008512.1                ...
Reminder: in the seq_id column, the seq_ids are written like this: "species sequence".
- Color the species_name in red and the sequence_name in blue (the colors are arbitrary, I just want the two parts displayed in different colors).
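For the first point, the kind of splitting and padding I have in mind can be sketched in base R. This is only an illustration on a few seq_ids taken from the list above, not a working gggenomes solution: the variable names are my own, and padding with spaces only lines up visually when the labels are drawn in a monospaced font.

```r
# A few seq_ids in the "species sequence" format described above.
seq_ids <- c(
  "bovis CBOVI.ctg00005_chrIV",
  "becei CSP29.scaffold174_cov172",
  "elegans IV"
)

# Split at the first space: species before it, sequence name after it.
species  <- sub(" .*$", "", seq_ids)
seq_name <- sub("^[^ ]* ", "", seq_ids)

# format() pads a character vector with trailing spaces to a common width,
# so each part becomes a fixed-width column.
aligned <- paste(format(species), format(seq_name))

# Print one label per line to check the columns by eye.
cat(aligned, sep = "\n")
```

Having species and seq_name as separate vectors is also what the second point would need, since each part could then be drawn as its own text layer with its own color.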
I hope you will be able to help me. It looks like a simple display problem, but it is actually quite tricky.
Here are some links that may help:
https://thackl.github.io/gggenomes/reference/index.html
https://thackl.github.io/gggenomes/reference/geom_bin_label.html