I have an Excel workbook with 10 sheets. Each sheet represents a cell cluster, with the first column holding the signature genes for that cluster. The data structure is as follows:
Gene p_val avg_logFC pct.1 pct.2 p_val_adj
Col15a1 1.4665E-146 1.18233234 0.735 0.217 3.0925E-142
Gsn 4.4013E-143 1.028705212 1 0.91 9.2811E-139
... ... ... ... ... ...
My code is as follows:
library(readxl)
# Read each of the 10 sheets, keeping only the gene name and avg_logFC columns
all_data <- lapply(1:10, function(i) {
  data <- read_excel("cells.xlsx", sheet = i)
  colnames(data)[1] <- "Gene"
  data <- data[, c(1, 3)]
  row.names(data) <- data$Gene
  return(data)
})
library(Seurat)
# Try to build one Seurat object per sheet and run the standard workflow
seurat_list <- lapply(all_data, function(data) {
  pbmc <- CreateSeuratObject(counts = as.matrix(data))
  pbmc <- NormalizeData(pbmc)
  pbmc <- FindVariableFeatures(pbmc)
  pbmc <- ScaleData(pbmc)
  pbmc <- RunPCA(pbmc, verbose = FALSE)
  pbmc <- RunUMAP(pbmc, reduction = "pca", dims = 1:10)
  return(pbmc)
})
However, I encountered the following error:
Warning: Data is of class matrix. Coercing to dgCMatrix.
Normalizing layer: counts
Performing log-normalization
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
'x' incorrect
Warning message:
In matrix(data = as.numeric(x = x), ncol = nc) : NAs introduced by coercion
Could anyone please help me solve this problem? Thanks a lot!
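A Seurat object expects a gene-by-cell expression matrix, not processed and aggregated results like yours. Your sheets look like differential expression output (one row per gene, with p-values and log fold changes), so there is no meaningful way to convert these files into a Seurat object. The "NAs introduced by coercion" warning comes from passing the Gene column through as.matrix(): the matrix becomes character, and the gene names cannot be converted to numbers. To build your object you need per-cell expression values, be it raw counts or some sort of normalized values. Please see the Seurat vignettes to get the basics right.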
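To make the difference concrete, here is a minimal sketch, not a fix for your data. The first part rebuilds two rows of your table and shows why the coercion yields NAs. The second part builds a hypothetical gene-by-cell matrix of the shape CreateSeuratObject() is meant to receive; the gene names, cell names, and Poisson counts are made up for illustration, and the Seurat call only runs if the package is installed.

```r
# Toy marker table shaped like one sheet of the workbook
# (the two rows are copied from the question).
markers <- data.frame(
  Gene = c("Col15a1", "Gsn"),
  avg_logFC = c(1.18233234, 1.028705212),
  stringsAsFactors = FALSE
)

# A character column next to a numeric one gives a character matrix,
# so the gene names turn into NA when coerced to numeric.
marker_matrix <- as.matrix(markers)
print(is.character(marker_matrix))
n_na <- sum(is.na(suppressWarnings(as.numeric(marker_matrix))))
print(n_na)

# Hypothetical counts matrix: genes in rows, cells in columns.
# Names and values are invented purely to show the expected shape.
set.seed(1)
counts <- matrix(
  rpois(20 * 30, lambda = 5),
  nrow = 20,
  dimnames = list(paste0("Gene", 1:20), paste0("Cell", 1:30))
)

# Only attempt the Seurat call when the package is available.
if (requireNamespace("Seurat", quietly = TRUE)) {
  obj <- Seurat::CreateSeuratObject(counts = counts)
  print(dim(obj))
}
```

Your marker sheets have one value per gene for a whole cluster, so they can never fill a matrix like counts. They are better kept as ordinary data frames for annotation or plotting.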