I have a data frame in which each row is a taxon and each column is a taxonomic level describing it. I need to turn this data frame into two data frames, one of nodes and one of edges, for use in igraph. Not every level (column) is filled in for every taxon, so there are NAs. Here is an example extracted from my data:

    object_taxonomy = data.frame(
      class = c("Insecta", "Insecta", "Insecta", "Insecta", "Insecta",
                "Insecta", "Insecta", "Insecta", "Insecta", "Insecta"),
      order = c("Hymenoptera", "Hemiptera", "Hemiptera", "Hemiptera", "Hemiptera",
                "Hemiptera", "Hemiptera", "Hymenoptera", "Hemiptera", "Hemiptera"),
      superfamily = c("Chalcidoidea", "Coccoidea", "Coccoidea", "Coccoidea", "Coccoidea",
                      "Coccoidea", NA, "Chalcidoidea", "Coccoidea", "Coccoidea"),
      family = c("Azotidae", "Diaspididae", "Diaspididae", "Diaspididae", "Diaspididae",
                 "Diaspididae", "Margarodidae", "Encyrtidae", "Diaspididae", "Diaspididae"),
      subfamily = c(NA, NA, NA, NA, NA, NA, NA, "Encyrtinae", NA, NA),
      tribe = c(NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
                NA_character_, NA_character_, NA_character_, NA_character_, NA_character_),
      genus = c("Ablerus", "Chionaspis", "Diaspidiotus", "Diaspidiotus", "Diaspidiotus",
                "Lepidosaphes", "Kuwania", "Lakshaphagus", "", "Diaspidiotus"),
      scientificName = c("Ablerus celsus", "Chionaspis salicis", "Diaspidiotus ostreaeformis",
                         "Diaspidiotus perniciosus", "Diaspidiotus prunorum", "Lepidosaphes ulmi",
                         "Kuwania rubra", "Lakshaphagus merceti", "", "Diaspidiotus gigas")
    )

Creating the nodes data frame is no problem: I use `gather` and filter out the NAs. What I need help with is the edges data frame, in which a missing level (NA) should be skipped, so that a taxon is linked to its nearest non-missing parent.
What I would like to achieve is this:
| From | To |
|---|---|
| Insecta | Hymenoptera |
| Insecta | Hemiptera |
| Hymenoptera | Chalcidoidea |
| Hemiptera | Coccoidea |
| Chalcidoidea | Azotidae |
| Chalcidoidea | Encyrtidae |
| Coccoidea | Diaspididae |
| Hemiptera | Margarodidae |
| Encyrtidae | Encyrtinae |
| Azotidae | Ablerus |
| Encyrtinae | Lakshaphagus |
| Diaspididae | Chionaspis |
| Diaspididae | Diaspidiotus |
| Diaspididae | Lepidosaphes |
| Margarodidae | Kuwania |
| Ablerus | Ablerus celsus |
| Lakshaphagus | Lakshaphagus merceti |
| Chionaspis | Chionaspis salicis |
| Diaspidiotus | Diaspidiotus ostreaeformis |
| Diaspidiotus | Diaspidiotus perniciosus |
| Diaspidiotus | Diaspidiotus prunorum |
| Lepidosaphes | Lepidosaphes ulmi |
| Kuwania | Kuwania rubra |
Here are two options for getting the edge list.
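A minimal base R sketch of one way to build such an edge list, offered as an illustration rather than as the code originally posted here. The helper name `taxonomy_edges` and the toy table `tax` (a trimmed stand-in for `object_taxonomy`) are hypothetical. The sketch assumes the columns are ordered from the highest level to the lowest and treats both `NA` and `""` as missing.

```r
# Hypothetical helper (not from the original post): build parent -> child
# edges from a wide taxonomy table, skipping missing levels.
taxonomy_edges <- function(df) {
  m <- as.matrix(df)  # character matrix, one row per taxon
  per_row <- lapply(seq_len(nrow(m)), function(i) {
    x <- m[i, ]
    x <- unname(x[!is.na(x) & nzchar(x)])  # drop NA and "" levels
    # Pair each remaining level with the next one down
    data.frame(from = head(x, -1), to = tail(x, -1),
               stringsAsFactors = FALSE)
  })
  edges <- unique(do.call(rbind, per_row))  # drop edges repeated across rows
  rownames(edges) <- NULL
  edges
}

# Toy data, assumed for illustration only
tax <- data.frame(
  order       = c("Hymenoptera", "Hemiptera", "Hemiptera", "Hemiptera"),
  superfamily = c("Chalcidoidea", NA, "Coccoidea", "Coccoidea"),
  family      = c("Azotidae", "Margarodidae", "Diaspididae", "Diaspididae"),
  genus       = c("Ablerus", "Kuwania", "Chionaspis", ""),
  stringsAsFactors = FALSE
)

edges <- taxonomy_edges(tax)
print(edges)
```

Because missing levels are dropped before pairing, the row with an `NA` superfamily yields `Hemiptera -> Margarodidae` directly. On the real data the call would be `taxonomy_edges(object_taxonomy)`.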
`igraph` approach: you can try adding `path` by rows in `object_taxonomy`.
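The row-by-row idea might look roughly like the sketch below. It is an assumption about the approach, not the original code: instead of igraph's `path()` helper it adds each row's consecutive parent/child pairs with `add_edges()`, which amounts to laying down one path per row. The toy table `tax` is hypothetical, and the sketch assumes taxon names are unique across levels, since vertices are matched by name.

```r
library(igraph)

# Toy data, assumed for illustration only (stand-in for object_taxonomy)
tax <- data.frame(
  order       = c("Hymenoptera", "Hemiptera", "Hemiptera", "Hemiptera"),
  superfamily = c("Chalcidoidea", NA, "Coccoidea", "Coccoidea"),
  family      = c("Azotidae", "Margarodidae", "Diaspididae", "Diaspididae"),
  genus       = c("Ablerus", "Kuwania", "Chionaspis", ""),
  stringsAsFactors = FALSE
)

m <- as.matrix(tax)
labels <- unique(m[!is.na(m) & nzchar(m)])  # every non-missing taxon name

# Start from a directed graph holding one named vertex per taxon
g <- add_vertices(make_empty_graph(directed = TRUE),
                  length(labels), name = labels)

for (i in seq_len(nrow(m))) {
  x <- m[i, ]
  x <- unname(x[!is.na(x) & nzchar(x)])  # skip NA and "" levels
  if (length(x) >= 2) {
    # Interleave parent/child pairs: p1, c1, p2, c2, ...
    g <- add_edges(g, as.vector(rbind(head(x, -1), tail(x, -1))))
  }
}

g <- simplify(g)  # collapse edges that were added by more than one row
edges <- as_data_frame(g, what = "edges")
print(edges)
```

This gives the graph and the edge data frame in one pass. If the same name can occur at two different levels, the vertices would need a level-qualified name instead.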