I have a list of S4 objects of class Seurat, where each object has several slots:

> lapply(seurat.objects, slotNames)
$gw14
 [1] "raw.data"     "data"         "scale.data"   "var.genes"    "is.expr"
 [6] "ident"        "meta.data"    "project.name" "dr"           "assay"
[11] "hvg.info"     "imputed"      "cell.names"   "cluster.tree" "snn"
[16] "calc.params"  "kmeans"       "spatial"      "misc"         "version"

$gw17
 [1] "raw.data"     "data"         "scale.data"   "var.genes"    "is.expr"
 [6] "ident"        "meta.data"    "project.name" "dr"           "assay"
[11] "hvg.info"     "imputed"      "cell.names"   "cluster.tree" "snn"
[16] "calc.params"  "kmeans"       "spatial"      "misc"         "version"

$gw18
 [1] "raw.data"     "data"         "scale.data"   "var.genes"    "is.expr"
 [6] "ident"        "meta.data"    "project.name" "dr"           "assay"
[11] "hvg.info"     "imputed"      "cell.names"   "cluster.tree" "snn"
[16] "calc.params"  "kmeans"       "spatial"      "misc"         "version"

$gw19
 [1] "raw.data"     "data"         "scale.data"   "var.genes"    "is.expr"
 [6] "ident"        "meta.data"    "project.name" "dr"           "assay"
[11] "hvg.info"     "imputed"      "cell.names"   "cluster.tree" "snn"
[16] "calc.params"  "kmeans"       "spatial"      "misc"         "version"

I want to replace the data frame stored in the @meta.data slot of each list element with the corresponding data frame in a second list, metadata.

> lapply(metadata, head)
$gw14
# A tibble: 98,879 x 7
   cell.name           nGene  nUMI orig.ident pct.mito pct.ribo age
   <chr>               <int> <int> <chr>         <dbl>    <dbl> <chr>
 1 AAACCTGAGAGGTTAT_1    598  1202 CGE            0.02    0.24  gw14
 2 AAACCTGAGCAGGTCA_2    582   914 CGE            0.01    0.17  gw14
 3 AAACCTGAGGAGCGAG_3    493  1225 CGE            0.01    0.43  gw14
 4 AAACCTGAGGGCATGT_4    414   731 CGE            0.02    0.290 gw14
 5 AAACCTGAGTGATCGG_5    449   794 CGE            0.03    0.27  gw14
 6 AAACCTGCAAAGTGCG_6   1055  2439 CGE            0.02    0.25  gw14
 7 AAACCTGCAATCGGTT_7    724  1485 CGE            0.01    0.23  gw14
 8 AAACCTGCACTTGGAT_8    514   885 CGE            0       0.18  gw14
 9 AAACCTGCAGACGCCT_9    593  1215 CGE            0.03    0.27  gw14
10 AAACCTGCAGCATACT_10   411   795 CGE            0.02    0.290 gw14
# ... with 98,869 more rows

$gw17
# A tibble: 61,578 x 7
   cell.name           nGene  nUMI orig.ident pct.mito pct.ribo age
   <chr>               <int> <int> <chr>         <dbl>    <dbl> <chr>
 1 AAACCTGAGAAGGACA_1    401   733 CGE            0.03     0.3  gw17
 2 AAACCTGAGCACCGTC_2    351   687 CGE            0.01     0.33 gw17
 3 AAACCTGAGCCAGAAC_3    408   824 CGE            0.01     0.3  gw17
 4 AAACCTGAGTGGCACA_4    557  1041 CGE            0.01     0.25 gw17
 5 AAACCTGCACACAGAG_5   1650  3609 CGE            0.02     0.19 gw17
 6 AAACCTGCAGCCACCA_6    295   730 CGE            0.01     0.05 gw17
 7 AAACCTGCAGTCGTGC_7   1136  2263 CGE            0.01     0.21 gw17
 8 AAACCTGCATATGCTG_8    733  1561 CGE            0.01     0.26 gw17
 9 AAACCTGCATTAGGCT_9   1344  3463 CGE            0.02     0.28 gw17
10 AAACCTGGTACCGCTG_10   915  2031 CGE            0.03     0.23 gw17
# ... with 61,568 more rows

$gw18
# A tibble: 113,918 x 7
   cell.name           nGene  nUMI orig.ident pct.mito pct.ribo age
   <chr>               <int> <int> <chr>         <dbl>    <dbl> <chr>
 1 AAACCTGAGCTAGTCT_1   1506  5420 CGE            0.03    0.37  gw18
 2 AAACCTGAGGGCACTA_2   1177  3580 CGE            0.02    0.27  gw18
 3 AAACCTGCAATCTGCA_3   1111  3204 CGE            0.04    0.33  gw18
 4 AAACCTGCAATGAATG_4   1323  4864 CGE            0.04    0.4   gw18
 5 AAACCTGCAGCCTTGG_5   1451  4840 CGE            0.02    0.23  gw18
 6 AAACCTGCAGGTGGAT_6   1402  4685 CGE            0.02    0.2   gw18
 7 AAACCTGCATCCTTGC_7   1917  6749 CGE            0.02    0.24  gw18
 8 AAACCTGGTAAACACA_8   1224  3925 CGE            0.02    0.33  gw18
 9 AAACCTGGTCATGCCG_9   2726 10896 CGE            0.03    0.28  gw18
10 AAACCTGGTGTAACGG_10   967  3034 CGE            0.03    0.290 gw18
# ... with 113,908 more rows

$gw19
# A tibble: 65,955 x 7
   cell.name           nGene  nUMI orig.ident pct.mito pct.ribo age
   <chr>               <int> <int> <chr>         <dbl>    <dbl> <chr>
 1 AAACCTGCAAGGCTCC_1    473   887 CGE            0       0.23  gw19
 2 AAACCTGCACCAGCAC_2    582  1400 CGE            0.01    0.290 gw19
 3 AAACCTGGTCTGATTG_3    570  1372 CGE            0.03    0.290 gw19
 4 AAACCTGGTGCACTTA_4    573  1279 CGE            0.02    0.32  gw19
 5 AAACCTGGTGTAATGA_5    617  1429 CGE            0.02    0.28  gw19
 6 AAACCTGTCATAAAGG_6   1470  3837 CGE            0.02    0.26  gw19
 7 AAACCTGTCCAACCAA_7    663  1720 CGE            0.02    0.33  gw19
 8 AAACCTGTCTTAACCT_8    418   807 CGE            0.02    0.19  gw19
 9 AAACGGGAGATGCCAG_9   1092  3306 CGE            0.02    0.45  gw19
10 AAACGGGAGTCCTCCT_10  1894  6252 CGE            0.04    0.32  gw19
# ... with 65,945 more rows

The best solution I can come up with is below, but I'm sure there has to be a better way.

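library(magrittr)  # provides %>%
library(tibble)    # provides column_to_rownames()

test <- lapply(names(seurat.objects) %>% setNames(nm = .),
               function(x) {
                 # move cell.name into the row names, then overwrite the slot
                 seurat.objects[[x]]@meta.data <- metadata[[x]] %>%
                   column_to_rownames(var = "cell.name")
                 return(seurat.objects[[x]])
               })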

This solution preserves the entire S4 object while modifying only the @meta.data slot, and it also preserves the names of the list elements, but it's a rather convoluted path. Thanks for your advice.

1 Answer

Alexis

In R, every operation is a function call, including assignments. You can type ?Extract in the console to see the documentation for the base replacement operators, such as [<-, [[<-, and $<-. S4 objects have a replacement function of their own for slots: slot<-. So whenever you write something like S4obj@x <- "foo", the function call `slot<-`(S4obj, "x", value="foo") could be used instead. That means that you can do what you want with:

Map("slot<-", seurat.objects, "meta.data", value=metadata)
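
To make the equivalence between the replacement syntax and the explicit function call concrete, here is a minimal sketch on a toy class. The class Point and its slot x are hypothetical and exist only for illustration; they are not part of Seurat.

```r
library(methods)

# Hypothetical toy class, used only to illustrate the idea.
setClass("Point", representation(x = "numeric"))

# Usual replacement syntax for a slot.
p1 <- new("Point", x = 1)
p1@x <- 10

# The same assignment written as an explicit call to `slot<-`.
p2 <- `slot<-`(new("Point", x = 1), "x", value = 10)

# Base replacement functions can be called the same way.
v1 <- c(1, 2, 3)
v1[2] <- 0
v2 <- `[<-`(c(1, 2, 3), 2, 0)

same_slot <- identical(p1, p2)
same_vec  <- identical(v1, v2)
```

Both comparisons should come out TRUE, which is why a replacement function can be handed to Map or lapply by name.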

However, there is a gotcha you should be aware of. R usually has copy-on-modify semantics, which means that a copy of an object is made before it is modified. For example:

vecs <- list(1:2, 3:4)
vecs2 <- lapply(vecs, "[<-", 1L, 0L)

> vecs
[[1]]
[1] 1 2

[[2]]
[1] 3 4

> vecs2
[[1]]
[1] 0 2

[[2]]
[1] 0 4

This doesn't always apply: environments and reference classes have reference semantics, so they are modified in place. For example:

envs <- list(new.env(), new.env())
envs2 <- lapply(envs, "[[<-", "foo", "bar")

> sapply(envs, ls)
[1] "foo" "foo"
> sapply(envs2, ls)
[1] "foo" "foo"

In this case, the environments in envs were not copied before modifying them for envs2, so both lists hold the same objects.

Because of what turns out to be a known bug (I observed it in R v3.6.0), the following also modifies the original objects without copying:

setClass("Foo", representation(x="integer"))
s4s <- list(new("Foo", x=0L), new("Foo", x=1L))
s4s2 <- Map("slot<-", s4s, "x", value=list(2L, 3L))

> s4s
[[1]]
An object of class "Foo"
Slot "x":
[1] 2


[[2]]
An object of class "Foo"
Slot "x":
[1] 3


> s4s2
[[1]]
An object of class "Foo"
Slot "x":
[1] 2


[[2]]
An object of class "Foo"
Slot "x":
[1] 3

So, if you want to avoid that, use the form suggested by akrun, which performs the assignment on the function's local copy and returns the modified object:

Map(function(x, y) { slot(x, "meta.data") <- y; x }, seurat.objects, metadata)
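
As a rough check of that last form, the sketch below runs it on a toy class standing in for Seurat. The class Toy, the lists objs and new.meta, and their contents are all assumptions made up for the example.

```r
library(methods)

# Hypothetical stand-in for a Seurat object: one data.frame slot.
setClass("Toy", representation(meta.data = "data.frame"))

objs <- list(
  gw14 = new("Toy", meta.data = data.frame(n = 1L)),
  gw17 = new("Toy", meta.data = data.frame(n = 2L))
)
new.meta <- list(
  gw14 = data.frame(n = 10L),
  gw17 = data.frame(n = 20L)
)

# Assign inside a function so that only the local copy is modified,
# then return that copy.
updated <- Map(function(obj, md) {
  slot(obj, "meta.data") <- md
  obj
}, objs, new.meta)

result_names <- names(updated)            # names taken from the first list
new_value    <- updated$gw14@meta.data$n  # value in the returned object
old_value    <- objs$gw14@meta.data$n     # value in the original object
```

Map pairs the two lists by position rather than by name, so it may be worth indexing the second list as metadata[names(seurat.objects)] first if the orders could differ. If the metadata tibbles still carry cell.name as a column, the column_to_rownames step from the question can be applied to the second argument inside the function.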