I have been playing with Obsidian and R, trying to clean up the YAML properties of dozens of papers. I have used dplyr, tidyr, purrr, and ymlthis to achieve this, but I am still finding nested lists challenging. Here is a case.
Let's say we have a paper with this data (originally in YAML):
li <- list(
aliases = "Al-Kadem2016",
type = "paper",
title = "Real-Time Estimation of Flow Rate in Dry Gas Wells - A New Field",
citationKey = "Al-Kadem2016",
id = "SPE-182754-MS",
DOI = "https://doi.org/10.2118/182754-MS",
ISBN = "",
volume = "",
url = "",
authors = list(
"Mohammad S. Al-Kadem",
"Mohammad S. Al Dabbous",
"Ali S. Al Mashhad",
"Hassan A. Alsadah",
"Dhafer Al Shehri"
),
institutions = list(
'[[Saudi Aramco]]',
'[[KFUPM]]'
),
published = '2016-04-25',
conferenceName = "Saudi Arabia Annual Technical Symposium and Exhibition",
location = "Dammam, Saudi Arabia",
publisher = "SPE",
created = '2023-01-26',
cdow = "Thursday",
downloaded = '2023-01-26',
summary = "An empirical correlation was developed to calculate the real-time flow rate in dry gas wells at the surface utilizing the most appropriate parameters: upstream flowing wellhead pressure, downstream flowing wellhead pressure, upstream flowing wellhead temperature, and choke size.",
critique = "Shared with VFM team. Contains **Panhandle choke correlation original and
modified**. Covers some empirical choke correlations, applicability and limitations.",
rating = 4,
pages = 10,
references = 5,
review = "",
tags = "",
has_glossary = FALSE,
cited_by = "",
up = '[[Choke Correlation]]',
down = "",
related = list(
'[[Gas Wells]]',
'[[Atlas/Concepts/Empirical Correlations]]',
'[[Atlas/Concepts/Flow Rate Estimation]]',
'[[Atlas/Objects/Venturi Flowmeter]]',
'[[Atlas/Objects/Plant Information]]'
),
research = "CHK",
notes = "what-makes(iiii) + doi + authors + tags(iiii) + snips(iiii) + view + defs(iiiii)",
cover = '[[poster-20231026193936.png]]',
poster = '[[poster-20231026193936.png]]',
pdf_annotated = FALSE,
zotero_up = FALSE,
zotero_notes = FALSE,
zotero_highlights = FALSE,
uuid = '20230126065407',
added = '2023-01-01',
pdf_attached = "spe-000000.pdf",
pdf_highlights = FALSE
)
li
which yields:
$aliases
[1] "Al-Kadem2016"
$type
[1] "paper"
$title
[1] "Real-Time Estimation of Flow Rate in Dry Gas Wells - A New Field"
$citationKey
[1] "Al-Kadem2016"
$id
[1] "SPE-182754-MS"
$DOI
[1] "https://doi.org/10.2118/182754-MS"
...
...
...
$zotero_highlights
[1] FALSE
$uuid
[1] "20230126065407"
$added
[1] "2023-01-01"
$pdf_attached
[1] "spe-000000.pdf"
$pdf_highlights
[1] FALSE
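As a quick check of which keys hold more than one value, base R's lengths() can be applied to the list. This is only a sketch on a small stand-in for li (the name li_small and its four keys are chosen here for illustration):

```r
# Small stand-in for the full `li` list (hypothetical subset of its keys)
li_small <- list(
  type         = "paper",
  authors      = list("Mohammad S. Al-Kadem", "Mohammad S. Al Dabbous"),
  institutions = list("[[Saudi Aramco]]", "[[KFUPM]]"),
  rating       = 4
)

# lengths() returns the number of elements held by each top-level key
n_values <- lengths(li_small)

# Keep only the keys that hold a nested list with more than one value
multi_valued <- n_values[n_values > 1]
print(multi_valued)
```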
I have been able to produce this R code to extract keys and values for the paper nested list:
# Get keys and values from the frontmatter
fm_pluck <- li
for (i in seq_along(fm_pluck)) {
  element  <- fm_pluck[i]        # one-element list, keeps the key name
  key      <- names(element)
  values   <- unlist(element)    # flatten nested values into a vector
  n_values <- length(values)
  cat(sprintf("%2d %2d %-15s", i, n_values, key))
  if (n_values > 1) {
    # Multi-valued key: print each value on its own indexed line
    cat("\n")
    for (v in seq_len(n_values)) {
      cat(sprintf("%20d %-12s \n", v, values[v]))
    }
  } else {
    cat(sprintf("%-20s \n", values))
  }
}
which gives me something like this:
1 1 aliases Al-Kadem2016
2 1 type paper
3 1 title Real-Time Estimation of Flow Rate in Dry Gas Wells - A New Field
4 1 citationKey Al-Kadem2016
5 1 id SPE-182754-MS
6 1 DOI https://doi.org/10.2118/182754-MS
7 1 ISBN
8 1 volume
9 1 url
10 5 authors
1 Mohammad S. Al-Kadem
2 Mohammad S. Al Dabbous
3 Ali S. Al Mashhad
4 Hassan A. Alsadah
5 Dhafer Al Shehri
11 2 institutions
1 [[Saudi Aramco]]
2 [[KFUPM]]
12 1 published 2016-04-25
...
...
...
In other words, I am getting what I want but I find the code lacking; there are two for-loops in there.
I was wondering if anyone could suggest more efficient and elegant code to print the keys and values on the screen. I am aware of ymlthis::as_yml(), but my goal is not pretty-printing; it is learning techniques for traversing nested lists.
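To make the request concrete, here is a minimal sketch of one loop-free direction, assuming purrr is installed. It relies on sprintf() being vectorized, so the inner loop disappears, and on purrr::pmap() to walk values, keys and positions in parallel. The names li_small and format_entry are hypothetical, and li_small is only a short stand-in for the full list. This is one possibility rather than a definitive answer.

```r
library(purrr)

# Small stand-in for the full `li` list (hypothetical subset of its keys)
li_small <- list(
  aliases      = "Al-Kadem2016",
  ISBN         = "",
  authors      = list("Mohammad S. Al-Kadem", "Mohammad S. Al Dabbous"),
  rating       = 4,
  has_glossary = FALSE
)

# Hypothetical helper: turn one key and its values into a character vector of lines
format_entry <- function(values, key, i) {
  values <- unlist(values)               # flatten a nested list into a vector
  n      <- length(values)
  header <- sprintf("%2d %2d %-15s", i, n, key)
  if (n > 1) {
    # sprintf() is vectorized: one formatted line per value, no inner loop
    c(header, sprintf("%20d %-12s", seq_along(values), values))
  } else {
    paste0(header, sprintf("%-20s", values))
  }
}

# pmap() iterates over values, keys and positions in parallel
lines <- unlist(
  pmap(list(li_small, names(li_small), seq_along(li_small)), format_entry),
  use.names = FALSE
)
writeLines(lines)
```

Building the lines first and printing them once with writeLines() keeps the formatting separate from the side effect, which also makes the result easy to inspect or write to a file.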