I have a bipartite graph and a data frame in which each row is associated with one vertex on the first side of the graph. The graph is not connected, and after I extract its largest component I have to subset the data frame to the vertices that remain (my attempt does not give the correct answer). Another option would be to set the rows of the data frame as attributes of the vertices on the first side of the graph, but I don't know how to do that. Here is a toy example:
library(igraph)
edgelist = matrix(c("A","a","A","b","B","b","C","c","D","c"),ncol=2,byrow=TRUE)
bg <- graph.data.frame(edgelist, directed=FALSE)
V(bg)$type <- V(bg)$name %in% edgelist[,1]
summary(bg)
V(bg)[V(bg)$type==1]
df = data.frame(id=c("A","B","C","D"), x=runif(4,10,50), y=sample(4), z=rnorm(4))
gclust = clusters(bg)
numClust = gclust$no
numLCC = max(gclust$csize) # size of the largest component (csize[1] is only the first component)
bg2 = induced.subgraph(bg, which(gclust$membership ==which.max(gclust$csize)))
remained_set1 <- V(bg2)[V(bg2)$type==1]
df[as.character(df[,1])%in%remained_set1,] # wrong answer
Try using the names of the vertices and flipping the %in%.
This returns the rows for A and B in your example.
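
As a rough sketch of that idea (not necessarily the exact code the answer had in mind), the subset can be done by comparing df$id against the character names of the remaining vertices instead of against the vertex sequence itself. It also shows one possible way to handle the second option from the question, copying the data frame columns onto the vertices before taking the subgraph. The names kept_ids, df_lcc and row_of_vertex are made up for illustration, and the snippet assumes a reasonably recent igraph that provides graph_from_data_frame, components, induced_subgraph and set_vertex_attr.

```r
library(igraph)

# Toy data from the question
edgelist <- matrix(c("A","a","A","b","B","b","C","c","D","c"),
                   ncol = 2, byrow = TRUE)
bg <- graph_from_data_frame(as.data.frame(edgelist, stringsAsFactors = FALSE),
                            directed = FALSE)
V(bg)$type <- V(bg)$name %in% edgelist[, 1]
df <- data.frame(id = c("A","B","C","D"),
                 x = runif(4, 10, 50), y = sample(4), z = rnorm(4))

# Option 2 from the question: copy each data frame column onto the vertices.
# row_of_vertex is NA for second-side vertices, so they get NA attributes.
row_of_vertex <- match(V(bg)$name, df$id)
for (col in setdiff(names(df), "id")) {
  bg <- set_vertex_attr(bg, col, value = df[[col]][row_of_vertex])
}

# Largest connected component; vertex attributes are carried over
comp <- components(bg)
bg2 <- induced_subgraph(bg, which(comp$membership == which.max(comp$csize)))

# Option 1: subset the data frame using vertex *names* (character),
# not the vertex sequence object
kept_ids <- V(bg2)$name[V(bg2)$type]
df_lcc <- df[df$id %in% kept_ids, ]
df_lcc
```

With the attributes attached first, the same values can also be read straight off the subgraph, for example V(bg2)$x[V(bg2)$type], without going back to the data frame.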