Code
library(Seurat)
library(tidyverse)
This is a Quarto document which nicely combines both R code, its results and text explanation to enable an easy and interactive access for both learners, readers and supervisors to such analyses. To learn more about Quarto see https://quarto.org. By default all code lines are blended, but you can show them by clicking on the code button.
For more information about Seurat R package please visit a dedicated documentation page with all details regarding embedded functions and their usage. If you use custom packages in your publications, do not forget to cite them using the information provided at the maintainers web-site.
This is a basic Seurat analysis of scRNA-seq data downstream of CellRanger pipeline. The reference paper authors previously assembled feature count matrix, metadata and row data Seurat object and uploaded it on GEO repository. Analysis in this document is required before annotating specific cell types and analyzing differential gene expression. We aim to have dimensionality reduction data like tSNE or UMAP available for visualizing results further. This analysis was taken from a youtube tutorial almost without modifications, but using current dataset.
For this analysis we need Seurat and tidyverse packages
library(Seurat)
library(tidyverse)
Loading our dataset
<- readRDS("./../input/GSE238137_blood-liver_MAIT-Tmem_seurat.rds")
data.initial data.initial
An object of class Seurat
39947 features across 89456 samples within 5 assays
Active assay: SCT (17646 features, 0 variable features)
3 layers present: counts, data, scale.data
4 other assays present: RNA, ADT, HTO, integrated
2 dimensional reductions calculated: pca, umap
Let’s reduce our seurat object until desired cell populations and assays
DefaultAssay(data.initial) <- "RNA" # to allow other assay layers to be deleted
<- DietSeurat(
data.initial
data.initial,features = NULL,
assays = "RNA",
dimreducs = NULL,
graphs = NULL,
misc = TRUE)
.2exp <- data.initial[, data.initial$experiment == "Exp 2"]
data.mait.Tmem
.2exp <- data.mait.Tmem.2exp[, data.mait.Tmem.2exp$tissue == "Liver"]
data.liver.mait.Tmem.2exp data.liver.mait.Tmem
An object of class Seurat
19289 features across 17069 samples within 1 assay
Active assay: RNA (19289 features, 0 variable features)
2 layers present: counts, data
Let’s first calculate the percentage of mitochondrial gene transcripts using build-in Seurat function
.2exp[["percent.mt"]] <- PercentageFeatureSet(data.liver.mait.Tmem.2exp, pattern = "^MT-")
data.liver.mait.TmemIdents(data.liver.mait.Tmem.2exp) <- data.liver.mait.Tmem.2exp$cell_type
Plotting Violin plot showing all information about QC metrics
VlnPlot(data.liver.mait.Tmem.2exp, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)
And FeatureScatter plot
FeatureScatter(data.liver.mait.Tmem.2exp, feature1 = "nCount_RNA", feature2 = "nFeature_RNA") +
geom_smooth(method = 'lm')
`geom_smooth()` using formula = 'y ~ x'
In our case the authors already performed QC filtering, so we skip this step.
# data.liver.mait.Tmem.2exp <- subset(Garner.Trem.2exp, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
.2exp[["RNA"]]@meta.features <- data.frame(row.names = rownames(data.liver.mait.Tmem.2exp[["RNA"]])) # Assign to Seurat objects rownames correctly
data.liver.mait.Tmem<- NormalizeData(data.liver.mait.Tmem.2exp)
data <- FindVariableFeatures(data) data
<- head(VariableFeatures(data), 10)
top10 top10
[1] "HSPA6" "HSPA1A" "CCL20" "HSPA1B" "CCL4L2" "GNLY" "CCL3L1" "CCL3"
[9] "EGR1" "XCL1"
<- rownames(data)
all.genes <- ScaleData(data, features = all.genes) data
Centering and scaling data matrix
print(data[["pca"]], dims = 1:5, nfeatures = 5)
PC_ 1
Positive: HLA-DRB1, GZMH, XCL2, HLA-DPA1, ARID5B
Negative: KLRB1, IL7R, SLC4A10, CEBPD, RORA
PC_ 2
Positive: NR4A2, ACTB, DUSP2, IER2, PPP1R15A
Negative: MT-ND3, HLA-G, MT-CYB, FTH1, PDE3B
PC_ 3
Positive: METRNL, PPP2R5C, FAM177A1, TUBB4B, VPS37B
Negative: RPS26, TXNIP, MYL12A, ANXA1, CD52
PC_ 4
Positive: DUSP4, COTL1, ZFP36, ATP1B3, ALOX5AP
Negative: NKG7, FGFBP2, FCGR3A, GZMB, GNLY
PC_ 5
Positive: JUND, ZFP36, DUSP2, MAFF, PPP1R15A
Negative: SAMSN1, ARRDC3, MALAT1, GLIPR1, ITK
DimHeatmap(data, dims = 1, cells = 500, balanced = TRUE)
ElbowPlot(data)
Visualize PCA plot with clusters
DimPlot(data, reduction = "pca", group.by = "RNA_snn_res.0.5", label = TRUE)
Idents(data) <- "RNA_snn_res.0.5"
Plot UMAP to visualize clusters spatial distribution
DimPlot(data, reduction = "umap", group.by = "RNA_snn_res.0.5")
data
An object of class Seurat
19289 features across 17069 samples within 1 assay
Active assay: RNA (19289 features, 2000 variable features)
3 layers present: counts, data, scale.data
2 dimensional reductions calculated: pca, umap
#saveRDS(data, file = "./../output/Garner.seurat.2exp.liver.mait_analyzed.rds")
sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Zurich
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[5] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[9] ggplot2_3.5.0 tidyverse_2.0.0 Seurat_5.0.3 SeuratObject_5.0.1
[13] sp_2.1-3
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.16.0 jsonlite_1.8.8
[4] magrittr_2.0.3 spatstat.utils_3.0-4 ggbeeswarm_0.7.2
[7] farver_2.1.1 rmarkdown_2.26 vctrs_0.6.5
[10] ROCR_1.0-11 spatstat.explore_3.2-7 htmltools_0.5.8.1
[13] sctransform_0.4.1 parallelly_1.37.1 KernSmooth_2.23-22
[16] htmlwidgets_1.6.4 ica_1.0-3 plyr_1.8.9
[19] plotly_4.10.4 zoo_1.8-12 igraph_2.0.3
[22] mime_0.12 lifecycle_1.0.4 pkgconfig_2.0.3
[25] Matrix_1.6-5 R6_2.5.1 fastmap_1.1.1
[28] fitdistrplus_1.1-11 future_1.33.2 shiny_1.8.1.1
[31] digest_0.6.35 colorspace_2.1-0 patchwork_1.2.0
[34] tensor_1.5 RSpectra_0.16-1 irlba_2.3.5.1
[37] labeling_0.4.3 progressr_0.14.0 fansi_1.0.6
[40] spatstat.sparse_3.0-3 timechange_0.3.0 mgcv_1.9-1
[43] httr_1.4.7 polyclip_1.10-6 abind_1.4-5
[46] compiler_4.3.3 withr_3.0.0 fastDummies_1.7.3
[49] MASS_7.3-60.0.1 tools_4.3.3 vipor_0.4.7
[52] lmtest_0.9-40 beeswarm_0.4.0 httpuv_1.6.15
[55] future.apply_1.11.2 goftest_1.2-3 glue_1.7.0
[58] nlme_3.1-164 promises_1.2.1 grid_4.3.3
[61] Rtsne_0.17 cluster_2.1.6 reshape2_1.4.4
[64] generics_0.1.3 gtable_0.3.4 spatstat.data_3.0-4
[67] tzdb_0.4.0 data.table_1.15.4 hms_1.1.3
[70] utf8_1.2.4 spatstat.geom_3.2-9 RcppAnnoy_0.0.22
[73] ggrepel_0.9.5 RANN_2.6.1 pillar_1.9.0
[76] spam_2.10-0 RcppHNSW_0.6.0 later_1.3.2
[79] splines_4.3.3 lattice_0.22-6 survival_3.5-8
[82] deldir_2.0-4 tidyselect_1.2.1 miniUI_0.1.1.1
[85] pbapply_1.7-2 knitr_1.45 gridExtra_2.3
[88] scattermore_1.2 xfun_0.43 matrixStats_1.2.0
[91] stringi_1.8.3 lazyeval_0.2.2 yaml_2.3.8
[94] evaluate_0.23 codetools_0.2-20 cli_3.6.2
[97] uwot_0.1.16 xtable_1.8-4 reticulate_1.35.0
[100] munsell_0.5.1 Rcpp_1.0.12 globals_0.16.3
[103] spatstat.random_3.2-3 png_0.1-8 ggrastr_1.0.2
[106] parallel_4.3.3 dotCall64_1.1-1 listenv_0.9.1
[109] viridisLite_0.4.2 scales_1.3.0 ggridges_0.5.6
[112] leiden_0.4.3.1 rlang_1.1.3 cowplot_1.1.3