I have a variable which contains the actor names.
(actor=structure(c(4L, 1L, 6L, 2L, 5L, 3L), .Label = c("Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman",
"Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington",
"Jennifer Lawrence, Josh Hutcherson, Liam Hemsworth, Stanley Tucci",
"Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe",
"Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley, Max von Sydow",
"Robert Downey Jr., Chris Evans, Scarlett Johansson, Jeremy Renner"
), class = "factor"))
# [1] Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe
# [2] Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman
# [3] Robert Downey Jr., Chris Evans, Scarlett Johansson, Jeremy Renner
# [4] Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington
# [5] Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley, Max von Sydow
# [6] Jennifer Lawrence, Josh Hutcherson, Liam Hemsworth, Stanley Tucci
# 6 Levels: Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman ...
I want to extract all the complete actor names from it (name + surname) and make them columns in an output matrix.
If you want the unique actor names, you can convert the factor to character with as.character, split each element on the commas with strsplit, combine all vectors in the resulting list with unlist, and keep the unique names with unique, storing the result as all.actors.
Because it starts from as.character(actor), this approach uses only the actors that show up in the factor actor, even if that factor has many more levels that are unused. If you use levels(actor) instead, you will get all the actors in the factor's levels, regardless of whether they are used in actor. You can use whichever you prefer when defining all.actors.
If you want a matrix indicating the inclusion of each actor in each element of actor, you can then check, for each element, which names in all.actors appear among its split names. This gives one row per element of actor and one column per actor.
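The steps described above can be sketched as follows. This is a minimal sketch and not necessarily the exact code originally posted: it rebuilds actor from the strings shown in the question, and the names cast and inclusion are hypothetical helpers introduced here for illustration. It assumes the names are always separated by a comma followed by a single space.

```r
# Sample data from the question: a factor whose elements are
# comma-separated lists of actor names.
actor <- factor(c(
  "Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe",
  "Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman",
  "Robert Downey Jr., Chris Evans, Scarlett Johansson, Jeremy Renner",
  "Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington",
  "Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley, Max von Sydow",
  "Jennifer Lawrence, Josh Hutcherson, Liam Hemsworth, Stanley Tucci"
))

# Split each element into its individual names
# (assumption: the separator is exactly ", ").
cast <- strsplit(as.character(actor), ", ", fixed = TRUE)

# Unique names among the elements actually present in actor.
all.actors <- unique(unlist(cast))

# Inclusion matrix: one row per element of actor, one column per actor,
# with 1 where the actor appears in that element and 0 otherwise.
inclusion <- t(sapply(cast, function(x) as.integer(all.actors %in% x)))
colnames(inclusion) <- all.actors

inclusion
```

To get a column for every level of the factor, including unused ones, all.actors could instead be built from strsplit(levels(actor), ", ", fixed = TRUE) while cast stays based on as.character(actor).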