vignettes/Session_1_sequencing_assays.Rmd
Session_1_sequencing_assays.RmdThis workshop introduces spatial transcriptomics analysis using the
Bioconductor framework, with a particular focus on the
SpatialExperiment package. Participants will learn
essential concepts and practical skills for analyzing spatially-resolved
genomic data.
By the end of this session, participants will be able to: -
Understand the fundamentals of spatial transcriptomics data analysis -
Work with the SpatialExperiment package and related tools -
Perform basic data manipulation and visualization of spatial data -
Apply quality control measures to spatial transcriptomics data - Conduct
dimensionality reduction and clustering analyses - Interpret spatial
patterns in gene expression data
Here we briefly describe some of the major technologies. This section is contributed by Dr Luciano Martellotto.
Spatial-omics encompasses a suite of powerful methods that reveal not only which genes are active in a tissue but also exactly where those genes are switched on. One widely used strategy involves laying a thin slice of tissue onto a specially prepared glass slide that carries an array of microscopic “spots,” each spot marked with its own unique molecular barcode. As the tissue is gently broken down, the messenger RNA molecules released from each cell adhere to the underlying spots and pick up that spot’s barcode. By sequencing the barcodes together with the captured RNA, researchers can reconstruct a two-dimensional map of gene expression. For example, the Visium platform from 10x Genomics uses this barcoded-surface approach to chart gene activity across tumour biopsies, helping oncologists to identify pockets of treatment-resistant cells within a cancerous mass.
An alternative method, known as combinatorial FISH (fluorescence in situ hybridisation), skips the need for physical barcodes by using fluorescent probes that bind directly to RNA molecules within intact tissue. Each probe is tagged with a small coloured label, and by carrying out multiple rounds of staining, imaging and probe removal, a unique sequence of coloured dots is generated for each target gene. It’s akin to reading a barcode of coloured spots: once the entire sequence of images has been captured, computational decoding reveals which gene each pattern corresponds to and pinpoints its exact location. This technique underlies MERFISH (Multiplexed Error-Robust FISH), which neuroscientists often employ to map hundreds of genes simultaneously in brain sections, illuminating the molecular identities of different neuronal subtypes.
In-situ sequencing offers yet another route to spatially resolved transcriptomics by performing the sequencing reactions directly within fixed tissue sections. Rather than relying on pre-made probes, this approach uses a series of enzymatic ligation or polymerisation steps to read out the RNA sequence base by base. At each cycle, fluorescently labelled reagents indicate which nucleotide (A, C, G or T) has been incorporated, and repeated imaging across multiple cycles yields short sequence reads in situ. Once these reads are matched to a reference genome, they reveal precisely where specific transcripts lie. Developmental biologists have harnessed this method—pioneered by technologies such as Fluorescent In Situ Sequencing (FISSEQ)—to follow gene expression patterns during embryo formation, tracking how cells differentiate according to their spatial context.

The Visium CytAssist platform from 10x Genomics brings the power of spatial transcriptomics into a streamlined, sequencing-based workflow. At its heart lies a standard glass slide bearing an 11 mm by 11 mm capture area patterned with roughly 14 000 microscopic spots (or 5 000 spots on a smaller 6.5 mm by 6.5 mm format). Each spot is densely coated with millions of identical oligonucleotides, each bearing a unique spatial barcode, a unique molecular identifier (UMI) and a poly(dT) tail designed to bind the polyadenylated tails of mRNA. When a fresh‐frozen or FFPE tissue section is mounted onto this slide, RNA molecules released during permeabilisation will hybridise to these oligos, effectively “stamping” each transcript with its precise tissue coordinates.
The CytAssist instrument automates the critical steps of permeabilisation, RNA digestion and probe release. Rather than capturing native transcripts directly, Visium employs probe hybridisation: a comprehensive set of probes tiles the entire transcriptome (v2 chemistry covers some 18 000 human or 19 000 mouse genes), binding selectively to their target RNAs. Once the tissue has been permeabilised, these probes are enzymatically released and immediately recaptured by the underlying barcoded array. A short extension reaction then attaches the probe insert to the spatial barcode and UMI, before a denaturation step frees the complete construct for library preparation.
Sequencing libraries are configured so that Read 1 decodes the slide’s spatial barcode and the UMI, while Read 2 reads into the ligated probe insert, revealing the gene identity. To ensure robust detection of both abundant and rare messages, Visium recommends a minimum of 25 000 read‐pairs per covered spot. Optional immunofluorescence staining can be performed in parallel, providing morphological and protein‐level context alongside the transcriptomic data.
In practice, Visium CytAssist has found widespread use across many fields. Cancer researchers have applied it to map immune cell infiltration and stromal niches within melanoma or breast carcinoma biopsies. Developmental biologists use it to chart gene expression gradients in embryonic tissues, revealing how cells acquire distinct identities in different locations. Even neuroscientists have begun to dissect the molecular architecture of brain regions, linking spatial patterns of gene activity with anatomy and function. By combining a turnkey instrument with a comprehensive probe set and high‐throughput sequencing, Visium offers an accessible route to the spatial “geography” of gene expression in virtually any tissue.

The Visium HD system represents a next-generation leap in spatial transcriptomics, offering subcellular resolution on a standard CytAssist instrument. Instead of discrete 55 µm spots, the Visium HD slide presents a continuous lawn of capture oligonucleotides across a 6.5 mm × 6.5 mm area, each oligo bearing a unique spatial barcode and UMI. These barcodes are patterned in a fine grid of 2 µm × 2 µm squares, which are digitally binned into 8 µm × 8 µm “pixels” for data analysis. In practice, this means that gene expression can be mapped at roughly one-cell or even subcellular scale—more than a six-fold improvement in resolution compared with the original Visium array.
As with the standard Visium workflow, fresh-frozen or FFPE tissue sections are first stained (H&E or immunofluorescence, if desired) and imaged for morphological context. The CytAssist then automates permeabilisation, RNA digestion and probe‐release steps: a comprehensive probe set tiles the entire transcriptome, binding each target mRNA; released probes are recaptured by the underlying barcoded oligo lawn; and a short extension reaction fuses the probe insert to its spatial barcode and UMI. After denaturation frees these constructs, they undergo library preparation and high-throughput sequencing. Read 1 decodes the spatial barcode and UMI, while Read 2 reads into the probe insert to identify the gene. To cover the full 6.5 mm capture area at HD resolution, Visium HD recommends approximately 275 million read-pairs per run.
### STOmics
BGI’s STOmics system brings spatial transcriptomics onto DNA nanoball (DNB) patterned chips that can cover areas from a few square millimetres right up to an entire microscope slide, offering both enormous scale and subcellular resolution. The process begins with the creation of a dense array of molecular “nanoballs,” each just 220 nm across and stamped onto the chip in a precise grid. During chip manufacture, each nanoball is endowed with three key elements: a poly-T tail for capturing polyadenylated mRNA, a unique molecular identifier (UMI) to count individual transcripts, and a coordinate identifier (CID) that records its exact X–Y position on the array.
Once a freshly frozen or paraformaldehyde-fixed tissue section has been mounted and (optionally) stained for nuclei or protein markers, the chip is brought into contact with the specimen so that mRNA diffuses down into the nanoball layer and hybridises to the poly-T oligos. Reverse transcription then converts these captured RNAs into complementary DNA, preserving both their sequence information and spatial tag. Library construction and high-throughput sequencing follow much as in conventional RNA-seq, but every read now carries the CID and UMI, which bioinformatics pipelines use to reconstruct a high-density map of gene expression.
The sheer density of the DNB pattern—over 25 000 spots per 100 µm² in the highest-resolution formats—means that STOmics can detect transcripts at nearly subcellular scale, revealing fine-grained differences in gene activity within single cells or across tiny tissue niches. At the same time, chip formats up to 174 cm² in area allow researchers to profile entire organs or large tissue biopsies in one run, without stitching together multiple fields of view. In practice, developmental biologists have used this platform to survey gene expression across whole zebrafish embryos, while tumour biologists have mapped the spatial organisation of immune infiltrates in large cancer resections. By marrying nanometre-scale resolution with slide-wide coverage, BGI’s STOmics empowers scientists to explore biological landscapes from the level of subcellular compartments all the way up to entire tissue architectures.

Bioconductor is an open-source, open-development software project built on the R programming language. It provides powerful tools for analyzing and comprehending high-throughput genomic data.
Spatial omics analysis combines traditional genomic data with spatial information, enabling researchers to understand gene expression patterns within their tissue context. Bioconductor provides several specialized tools for this purpose:
In the following sections of this workshop, we will explore practical examples and dive deeper into how Bioconductor is used in spatial omics analyses, including hands-on coding examples. Stay tuned for an engaging journey through spatial omics with Bioconductor!
The SpatialExperiment package provides a robust
framework for handling spatial transcriptomics data. It extends the
SingleCellExperiment class by adding functionality specific
to spatial data:
Righelli et al. doi: 10.1093/bioinformatics/btac299

# Install BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
# Install SpatialExperiment package
BiocManager::install("SpatialExperiment")We’ll work with data from the spatialLIBD package, which
contains spatial transcriptomics data from human dorsolateral prefrontal
cortex. This dataset provides an excellent example for learning spatial
analysis techniques as it includes:
# Load the SpatialExperiment package
# For SpatialExperiment the following might be needed
# 1. bash: module load ImageMagick/6.9.11-22
# 2. R: devtools::install_github("ropensci/magick")
library(SpatialExperiment)Visualisation functions for spatial transcriptomics data.
Utility packages for single-cell data.
In this section, we explore the handling and processing of spatial
transcriptomics data using the spatialLIBD and
ExperimentHub packages. The following R code block
retrieves a specific dataset from the ExperimentHub, a
Bioconductor project designed to manage and distribute large biological
data sets. The code efficiently fetches the data, removes any existing
dimensional reductions, and filters the dataset to include only selected
samples. This approach is essential for analysing spatial patterns in
gene expression across multiple samples, and the code below exemplifies
how to manipulate these datasets in preparation for further analysis.
This process is adapted from the Banksy package’s vignette,
which provides advanced methods for multi-sample spatial
transcriptomics.
Maynard and Torres et al., doi: 10.1038/s41593-020-00787-0, tutorial
library(spatialLIBD)
library(ExperimentHub)
# To avoid error for SPE loading
# https://support.bioconductor.org/p/9161859/#9161863
setClassUnion("ExpData", c("matrix", "SpatialExperiment"))
spatial_data <- tryCatch({
ExperimentHub::ExperimentHub() |>
spatialLIBD::fetch_data(eh = _, type = "spe")
}, error = function(e) {
message("ExperimentHub failed, using Zenodo fallback: ", conditionMessage(e))
options(timeout = max(300, getOption("timeout")))
tmp <- tempfile(fileext = ".rds")
utils::download.file(
"https://zenodo.org/records/11233385/files/tidySpatialWorkshop2024_spatial_data.rds",
destfile = tmp,
mode = "wb"
)
readRDS(tmp)
})
names(libd_layer_colors) = gsub("ayer", "", names(libd_layer_colors))
# Clear the reductions
reducedDims(spatial_data) = NULL
# Select only 3 samples
spatial_data = spatial_data[, spatial_data$sample_id %in% c("151673", "151675", "151676")]
# Display the object
spatial_data## class: SpatialExperiment
## dim: 33538 10691
## metadata(0):
## assays(2): counts logcounts
## rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
## ENSG00000268674
## rowData names(9): source type ... gene_search is_top_hvg
## colnames(10691): AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 ...
## TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
## colData names(69): sample_id Cluster ... array_row array_col
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
From: https://bookdown.org/sjcockell/ismb-tutorial-2023/practical-session-2.html
If ExperimentHub should not work. The
spatial_data object from the previous code block can be
downloaded from Zenodo
- 10.5281/zenodo.11233385
We shows metadata for each cell, helping understand the dataset’s structure.
| sample_id | Cluster | sum_umi | sum_gene | subject | position | replicate | subject_position | discard | key | cell_count | SNN_k50_k4 | SNN_k50_k5 | SNN_k50_k6 | SNN_k50_k7 | SNN_k50_k8 | SNN_k50_k9 | SNN_k50_k10 | SNN_k50_k11 | SNN_k50_k12 | SNN_k50_k13 | SNN_k50_k14 | SNN_k50_k15 | SNN_k50_k16 | SNN_k50_k17 | SNN_k50_k18 | SNN_k50_k19 | SNN_k50_k20 | SNN_k50_k21 | SNN_k50_k22 | SNN_k50_k23 | SNN_k50_k24 | SNN_k50_k25 | SNN_k50_k26 | SNN_k50_k27 | SNN_k50_k28 | GraphBased | Maynard | Martinowich | layer_guess | layer_guess_reordered | layer_guess_reordered_short | expr_chrM | expr_chrM_ratio | SpatialDE_PCA | SpatialDE_pool_PCA | HVG_PCA | pseudobulk_PCA | markers_PCA | SpatialDE_UMAP | SpatialDE_pool_UMAP | HVG_UMAP | pseudobulk_UMAP | markers_UMAP | SpatialDE_PCA_spatial | SpatialDE_pool_PCA_spatial | HVG_PCA_spatial | pseudobulk_PCA_spatial | markers_PCA_spatial | SpatialDE_UMAP_spatial | SpatialDE_pool_UMAP_spatial | HVG_UMAP_spatial | pseudobulk_UMAP_spatial | markers_UMAP_spatial | spatialLIBD | ManualAnnotation | in_tissue | array_row | array_col | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AAACAAGTATCTCCCA-1 | 151673 | 7 | 8458 | 3586 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACAAGTATCTCCCA-1 | 6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 6 | 5 | 5 | 4 | 7 | 2_3 | 2 | Layer3 | Layer3 | L3 | 1407 | 0.1663514 | 3 | 3 | 2 | 2 | 2 | 1 | 1 | 3 | 1 | 1 | 3 | 1 | 1 | 3 | 1 | 7 | 1 | 1 | 2 | 1 | L3 | NA | TRUE | 50 | 102 |
| AAACAATCTACTAGCA-1 | 151673 | 4 | 1667 | 1150 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACAATCTACTAGCA-1 | 16 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 10 | 10 | 10 | 10 | 4 | 2_3 | 3 | Layer1 | Layer1 | L1 | 204 | 0.1223755 | 2 | 5 | 3 | 7 | 8 | 2 | 2 | 2 | 2 | 1 | 7 | 5 | 2 | 2 | 3 | 2 | 1 | 4 | 2 | 3 | L1 | NA | TRUE | 3 | 43 |
| AAACACCAATAACTGC-1 | 151673 | 8 | 3769 | 1960 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACACCAATAACTGC-1 | 5 | 2 | 3 | 4 | 5 | 8 | 9 | 10 | 11 | 10 | 9 | 10 | 11 | 11 | 12 | 13 | 12 | 11 | 11 | 12 | 20 | 20 | 21 | 22 | 21 | 22 | 8 | WM | WM | WM | WM | WM | 430 | 0.1140886 | 4 | 4 | 7 | 5 | 5 | 5 | 5 | 6 | 6 | 1 | 5 | 4 | 4 | 5 | 3 | 5 | 7 | 5 | 3 | 2 | WM | NA | TRUE | 59 | 19 |
| AAACAGAGCGACTCCT-1 | 151673 | 6 | 5433 | 2424 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACAGAGCGACTCCT-1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 2 | 2 | 2 | 7 | 6 | 6 | 6 | Layer3 | Layer3 | L3 | 1316 | 0.2422234 | 3 | 3 | 2 | 3 | 6 | 1 | 1 | 3 | 1 | 3 | 3 | 3 | 1 | 2 | 2 | 3 | 4 | 2 | 1 | 1 | L3 | NA | TRUE | 14 | 94 |
| AAACAGCTTTCAGAAG-1 | 151673 | 3 | 4278 | 2264 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACAGCTTTCAGAAG-1 | 4 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 6 | 6 | 6 | 5 | 4 | 4 | 3 | 3 | 5 | 5 | Layer5 | Layer5 | L5 | 651 | 0.1521739 | 2 | 1 | 1 | 4 | 2 | 1 | 1 | 7 | 3 | 4 | 2 | 1 | 2 | 4 | 1 | 3 | 3 | 8 | 4 | 4 | L5 | NA | TRUE | 43 | 9 |
| AAACAGGGTCTATATT-1 | 151673 | 3 | 4004 | 2178 | Br8100 | 0 | 1 | Br8100_pos0 | FALSE | 151673_AAACAGGGTCTATATT-1 | 6 | 2 | 4 | 5 | 6 | 5 | 6 | 7 | 8 | 12 | 12 | 13 | 14 | 15 | 16 | 17 | 17 | 17 | 16 | 17 | 16 | 16 | 17 | 18 | 17 | 18 | 3 | 5 | 5 | Layer6 | Layer6 | L6 | 621 | 0.1550949 | 6 | 7 | 8 | 8 | 4 | 1 | 7 | 1 | 3 | 4 | 6 | 8 | 7 | 7 | 7 | 3 | 3 | 3 | 4 | 4 | L6 | NA | TRUE | 47 | 13 |
We access and display feature-related information from the dataset.
| source | type | gene_id | gene_version | gene_name | gene_source | gene_biotype | gene_search | is_top_hvg | |
|---|---|---|---|---|---|---|---|---|---|
| ENSG00000243485 | havana | gene | ENSG00000243485 | 5 | MIR1302-2HG | havana | lincRNA | MIR1302-2HG; ENSG00000243485 | FALSE |
| ENSG00000237613 | havana | gene | ENSG00000237613 | 2 | FAM138A | havana | lincRNA | FAM138A; ENSG00000237613 | FALSE |
| ENSG00000186092 | ensembl_havana | gene | ENSG00000186092 | 6 | OR4F5 | ensembl_havana | protein_coding | OR4F5; ENSG00000186092 | FALSE |
| ENSG00000238009 | ensembl_havana | gene | ENSG00000238009 | 6 | AL627309.1 | ensembl_havana | lincRNA | AL627309.1; ENSG00000238009 | FALSE |
| ENSG00000239945 | havana | gene | ENSG00000239945 | 1 | AL627309.3 | havana | lincRNA | AL627309.3; ENSG00000239945 | FALSE |
| ENSG00000239906 | havana | gene | ENSG00000239906 | 1 | AL627309.2 | havana | antisense | AL627309.2; ENSG00000239906 | FALSE |
Here, we perform a preliminary examination of the assay data contained within the spatial dataset.
assay(spatial_data)[1:20, 75:100]## 20 x 26 sparse Matrix of class "dgCMatrix"
## [[ suppressing 26 column names 'AACCAGTATCACTCTT-1', 'AACCATAGGGTTGAAC-1', 'AACCATGGGATCGCTA-1' ... ]]
##
## ENSG00000243485 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000237613 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000186092 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000238009 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000239945 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000239906 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000241599 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000236601 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000284733 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000235146 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000284662 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000229905 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000237491 . . . . . . . . . . . . . 1 . . . . . . 1 . . . . 2
## ENSG00000177757 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000225880 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000230368 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000272438 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000230699 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000241180 . . . . . . . . . . . . . . . . . . . . . . . . . .
## ENSG00000223764 . . . . . . . . . . . . . . . . . . . . . . . . . .
Spatial transcriptomics data requires specialized visualization
approaches to understand both molecular and spatial aspects
simultaneously. The ggspavis package provides powerful
tools for this purpose:
# image data
imgData(spatial_data)## DataFrame with 3 rows and 4 columns
## sample_id image_id data scaleFactor
## <character> <character> <list> <numeric>
## 1 151673 lowres #### 0.0450045
## 2 151675 lowres #### 0.0450045
## 3 151676 lowres #### 0.0450045
# Simple visualization of spatial data (one panel per section via sample_id)
ggspavis::plotCoords(spatial_data, sample_id = "sample_id")
Exercise 1.0
Calculate how many spots have been profiled, per sample. And calculate what is the % of the total available spots in Visium low resolution.
Calculate the sparcity (i.e. percentage of 0s) per sample
We can enhance our understanding by adding layer annotations. In this dataset, layers L1-6 represent different cortical layers, while WM indicates white matter:
# Plot spots with anatomical annotations
ggspavis::plotCoords(
spatial_data,
annotate = "spatialLIBD",
sample_id = "sample_id",
pal = libd_layer_colors
)
Explore additional visualisation features offered by the Visium platform, exposing the H&E (hematoxylin and eosin) image.
ggspavis::plotVisium(spatial_data, point_size = 0.5)
This visualisation focuses on specific tissue features within the dataset, emphasising areas of interest. 1. Mitochondrial Content: High mitochondrial gene expression often indicates stressed or dying cells 2. Library Size: Total RNA content per spot 3. Number of Detected Genes: Diversity of gene expression per spot
ggspavis::plotVisium(
spatial_data,
annotate = "spatialLIBD",
highlight = "in_tissue",
point_size =0.5,
) +
facet_wrap(~sample_id)
We will use the scater package McCarthy
et al. 2017 to compute the three primary QC metrics we discussed
earlWe’llUsing the scater Package for QC Metrics: We’ll apply the
scater package to compute three primary quality control
metrics. We’ll also use ggspavis for visualisation along
with some custom plotting techniques.
Previously, we visualised both on- and off-tissue spots. Moving forward, we focus on on-tissue spots for more relevant analyses. This block shows how to filter out off-tissue spots to refine the dataset.
Source OSTAWorkshopBioc2021
## Dataset dimensions before the filtering
dim(spatial_data)## [1] 33538 10691
Filtering Dataset to Retain Only On-Tissue Spots: We then refine our dataset by keeping only those spots that are on-tissue, aligning with our focus for subsequent analysis. The dimensions of the dataset after filtering are displayed to confirm the changes.
## Subset to keep only on-tissue spots
spatial_data <- spatial_data[, colData(spatial_data)$in_tissue == 1]
dim(spatial_data)## [1] 33538 10691
Next, we identify mitochondrial genes, as they are indicative of cell health. Cells with high mitochondrial gene expression typically indicate poor health or dying cells, which we aim to exclude.
## Classify genes as "mitochondrial" (is_mito == TRUE)
## or not (is_mito == FALSE)
is_gene_mitochondrial <- grepl("(^MT-)|(^mt-)", rowData(spatial_data)$gene_name)
rowData(spatial_data)$gene_name[is_gene_mitochondrial]## [1] "MT-ND1" "MT-ND2" "MT-CO1" "MT-CO2" "MT-ATP8" "MT-ATP6" "MT-CO3"
## [8] "MT-ND3" "MT-ND4L" "MT-ND4" "MT-ND5" "MT-ND6" "MT-CYB"
After identifying mitochondrial genes, we apply quality control metrics to further clean the dataset. This involves adding per-cell QC measures and setting a threshold to exclude cells with high mitochondrial transcription.
## Add per-cell QC metrics to spatial data using identified mitochondrial genes
spatial_data <- scater::addPerCellQC(
spatial_data,
subsets = list(mito = is_gene_mitochondrial)
)## Warning in scater::addPerCellQC(spatial_data, subsets = list(mito = is_gene_mitochondrial)): 'scater::addPerCellQC' is deprecated.
## Use 'scrapper::quickRnaQc.se' instead.
## See help("Deprecated")
## Warning in .per_cell_qc_metrics(assay(x, assay.type), subsets = subsets, : 'perCellQCMetrics' is deprecated.
## Use 'scrapper::computeRnaQcMetrics' instead.
## See help("Deprecated")
## Select expressed genes threshold
qc_mitochondrial_transcription <- colData(spatial_data)$subsets_mito_percent > 30
## Check how many spots are filtered out
table(qc_mitochondrial_transcription)## qc_mitochondrial_transcription
## FALSE TRUE
## 10599 92
After applying the QC metrics, it’s crucial to visually assess their impact. This step involves checking the spatial pattern of the spots removed based on high mitochondrial transcription, helping us understand the distribution and quality of the remaining dataset.
## Add threshold in colData
colData(spatial_data)$qc_mitochondrial_transcription <- qc_mitochondrial_transcription
## Visualize spatial pattern of filtered spots
plotObsQC(
spatial_data,
plot_type = "spot",
annotate = "qc_mitochondrial_transcription"
) +
facet_wrap(~sample_id)
This analysis focuses on examining the distribution of library sizes
across different spots. Each curve is coloured by tissue
layer (spatialLIBD), so you can contrast QC
metrics across anatomical regions; panels separate the three tissue
sections (sample_id).
library(ggplot2)
## Density of library size by cortical layer (spatialLIBD) within each section
data.frame(colData(spatial_data)) |>
ggplot(aes(x = sum, color = spatialLIBD)) +
geom_density(linewidth = 0.7) +
scale_x_log10() +
scale_color_manual(values = libd_layer_colors, na.value = "grey60") +
facet_wrap(~sample_id, ncol = 1, scales = "free_y") +
xlab("Library size") +
ylab("Density") +
labs(color = "Tissue layer") +
theme_classic() +
theme(legend.position = "bottom")
Setting Library Size Threshold: After examining the library sizes, a threshold is applied to identify spots with library sizes below 700, which are considered for potential exclusion from further analysis.
## Select library size threshold
qc_total_counts <- colData(spatial_data)$sum < 700
## Check how many spots are filtered out
table(qc_total_counts)## qc_total_counts
## FALSE TRUE
## 10639 52
Incorporating Library Size Threshold in Dataset: This step involves adding the library size threshold to the dataset’s metadata and examining the spatial pattern of the spots that have been removed based on this criterion.
## Add threshold in colData
colData(spatial_data)$qc_total_counts <- qc_total_counts
## Check for putative spatial pattern of removed spots
plotObsQC(
spatial_data,
plot_type = "spot",
annotate = "qc_total_counts",
) +
facet_wrap(~sample_id)
This analysis examines how many genes are expressed per spot, using
density curves coloured by tissue layer (spatialLIBD) and
faceted by section (sample_id).
## Density of detected genes by tissue layer within each section
data.frame(colData(spatial_data)) |>
ggplot(aes(x = detected, color = spatialLIBD)) +
geom_density(linewidth = 0.7) +
scale_x_log10() +
scale_color_manual(values = libd_layer_colors, na.value = "grey60") +
facet_wrap(~sample_id, ncol = 1, scales = "free_y") +
xlab("Number of genes with > 0 counts") +
ylab("Density") +
labs(color = "Tissue layer") +
theme_classic() +
theme(legend.position = "bottom")
Setting Gene Expression Threshold: This block applies a threshold to identify spots with fewer than 500 detected genes, considering these for exclusion to ensure data quality.
## Select expressed genes threshold
qc_detected_genes <- colData(spatial_data)$detected < 500
## Check how many spots are filtered out
table(qc_detected_genes)## qc_detected_genes
## FALSE TRUE
## 10642 49
Incorporating Gene Expression Threshold in Dataset: After setting the gene expression threshold, it is added to the dataset’s metadata. The spatial pattern of spots removed based on this threshold is then examined.
## Add threshold in colData
colData(spatial_data)$qc_detected_genes <- qc_detected_genes
## Check for putative spatial pattern of removed spots
plotObsQC(
spatial_data,
plot_type = "spot",
annotate = "qc_detected_genes",
) +
facet_wrap(~sample_id)
Exploring the Relationship Between Library Size and Number of Genes Detected: This scatter plot uses the same tissue-layer colours, so you can see how depth and complexity co-vary with anatomy within each section.
## Library size vs detected genes, coloured by tissue layer
data.frame(colData(spatial_data)) |>
ggplot(aes(sum, detected, color = spatialLIBD)) +
geom_point(shape = 16, size = 0.35, alpha = 0.55) +
scale_x_log10() +
scale_y_log10() +
scale_color_manual(values = libd_layer_colors, na.value = "grey60") +
facet_wrap(~sample_id, ncol = 1) +
xlab("Library size") +
ylab("Number of genes with > 0 counts") +
labs(color = "Tissue layer") +
theme_classic() +
theme(legend.position = "bottom")
After applying all QC filters, this block combines them and stores the results in the dataset. The spatial patterns of all discarded spots are then reviewed to ensure comprehensive quality control.
## Store the set in the object
colData(spatial_data)$discard <- qc_total_counts | qc_detected_genes | qc_mitochondrial_transcription
## Check the spatial pattern of combined set of discarded spots
plotObsQC(
spatial_data,
plot_type = "spot",
annotate = "discard",
) +
facet_wrap(~sample_id)
The final step in data preprocessing involves removing all spots identified as low-quality based on the previously applied thresholds, refining the dataset for subsequent analyses.
spatial_data = spatial_data[,!colData(spatial_data)$discard ]Exercise 1.0.1
It is good practice to perform quality control independently for each sample and different cell types. This because samples and cell types (or tissue regions) could have a distinct baseline distributions of quality control factors (e.g. mitochondrial transcription).
subsets_mito_percent grouping by
sample, using ggplot2.colData as already described) as colors, using
ggplot2
Exercise 1.0.2
Thresholding is a easy-to-understand approach, but often arbitrary. A
better strategy is outlier detection. With this strategy the baseline
distribution of a QC factor (e.g. mitochondrial transcription) will be
used to detect anomalous spots/cells. Read the documentation of
scater::isOutlier, and use it to label outlier spots for
mitochondrial transcription.
Then, note which method is the most stringent, between our thresholding and outlier-detection.
Dimensionality reduction is essential in spatial transcriptomics due to the high-dimensional nature of the data, which includes vast gene expression profiles across various spatial locations. Techniques such as PCA (Principal Component Analysis) and UMAP (Uniform Manifold Approximation and Projection) are particularly valuable. PCA helps to reduce noise and highlight the most significant variance in the data, making it simpler to uncover underlying patterns and correlations. UMAP, ofen calculated from principal components (and not directly from features) preserves both global and local data structures, enabling more nuanced visualisations of complex cellular landscapes. Together, these methods facilitate a deeper understanding of spatial gene expression, helping to reveal biological insights such as cellular heterogeneity and tissue structure, which are crucial for both basic biological research and clinical applications.
genes <- !grepl(pattern = "^Rp[l|s]|Mt", x = rownames(spatial_data))
dec = scran::modelGeneVar(spatial_data, subset.row = genes)
# Visualisation
plot(dec$mean, dec$total, xlab = "Mean log-expression", ylab = "Variance")
curve(metadata(dec)$trend(x), col = "blue", add = TRUE)
# Get top variable genes
dec = scran::modelGeneVar(spatial_data, subset.row = genes, block = spatial_data$sample_id)
hvg = scran::getTopHVGs(dec, n = 1000)
rowData(spatial_data[head(hvg),])[,c("gene_id", "gene_name")]## DataFrame with 6 rows and 2 columns
## gene_id gene_name
## <character> <character>
## ENSG00000123560 ENSG00000123560 PLP1
## ENSG00000197971 ENSG00000197971 MBP
## ENSG00000110484 ENSG00000110484 SCGB2A2
## ENSG00000131095 ENSG00000131095 GFAP
## ENSG00000173786 ENSG00000173786 CNP
## ENSG00000109846 ENSG00000109846 CRYAB
With the highly variable genes, we perform PCA to reduce dimensionality, followed by UMAP to visualise the data in a lower-dimensional space, enhancing our ability to observe clustering and patterns in the data.
spatial_data <-
spatial_data |>
scuttle::logNormCounts() |>
scater::runPCA(subset_row = hvg)## Warning in .library_size_factors(assay(x, assay.type), ...): 'librarySizeFactors' is deprecated.
## Use 'scrapper::centerSizeFactors' instead.
## See help("Deprecated")
## Warning in .local(x, ...): 'normalizeCounts' is deprecated.
## Use 'scrapper::normalizeCounts' instead.
## See help("Deprecated")
## Check correctness - names
reducedDimNames(spatial_data)## [1] "PCA"
reducedDim(spatial_data, "PCA")[1:5, 1:5]## PC1 PC2 PC3 PC4 PC5
## AAACAAGTATCTCCCA-1 4.323918 -0.9600003 -0.3514095 -2.2100416 -0.05912948
## AAACAATCTACTAGCA-1 1.560326 2.1527877 -0.8244904 0.3135456 -0.76248069
## AAACACCAATAACTGC-1 -19.168276 -2.2825249 -0.1300977 -0.2591148 -2.48548951
## AAACAGAGCGACTCCT-1 3.070884 1.0284710 -1.0845869 -2.6911326 0.88723306
## AAACAGCTTTCAGAAG-1 1.588394 -2.2461043 2.9044546 -0.5902962 0.13658943
As for single-cell data, we need to verify that there is not significant batch effect. If so we need to adjust for it (a.k.a. integration) before calculating principal component. Many adjustment methods to output adjusted principal components directly.
You can appreciate that, in this case, selecting within-sample variable genes, we do not see major batch effects across samples. We see two major pixel clusters.
We can appreciate that there are no major batch effects across samples, and we don’t see grouping driven by sample_id.
set.seed(42)
spatial_data <- scater::runUMAP(spatial_data, dimred = "PCA")
scater::plotUMAP(spatial_data, colour_by = "sample_id", point_size = 0.2) 
Exercise 1.1
Visualise where the two macro clusters are located spatially. We will
take a very pragmatic approach and get cluster label from splitting the
UMAP coordinated in two (colData() and
reducedDim() will help us, see above), and then visualise
it with ggspavis.
SpatialExperiment object based on the UMAP1
dimension so to divide those 2 clusterHighly variable genes (HVGs), as selected above with
scran::modelGeneVar(), are genes with large expression
variance after accounting for the mean-variance relationship. This is
useful, but it does not explicitly ask whether expression varies
with spatial position. Spatially variable genes (SVGs) are
genes whose expression pattern is spatially structured across the
tissue.
One Bioconductor-native approach is nnSVG.
The method fits a spatial model for each gene using a nearest-neighbor
Gaussian process. In practice, this compares a spatial model, where
nearby spots can have correlated expression, against a non-spatial
model. Genes are ranked by evidence for spatial structure, with results
stored in rowData() of the
SpatialExperiment.
The nearest-neighbor approximation makes the Gaussian process scalable to thousands of spots, but a full analysis can still take time. For this workshop, we run one tissue section with stringent gene filtering, then use the top genes as a small case study.
# Run one section for a clear workshop example
spatial_data_svg <- spatial_data[, spatial_data$sample_id == "151673"]
# Keep only genes with enough expression to make SVG fitting fast and stable
spatial_data_svg <- nnSVG::filter_genes(
spatial_data_svg,
filter_genes_ncounts = 10,
filter_genes_pcspots = 5
)
# Keep the most expressed genes to keep workshop runtime short.
n_keep <- min(1500, nrow(spatial_data_svg))
top_gene_idx <- order(
rowMeans(assay(spatial_data_svg, "logcounts")),
decreasing = TRUE
)[seq_len(n_keep)]
spatial_data_svg <- spatial_data_svg[top_gene_idx, ]
# Detect spatially variable genes
set.seed(123)
spatial_data_svg <- nnSVG::nnSVG(spatial_data_svg)
# Inspect top-ranked SVGs
svg_results <- rowData(spatial_data_svg) |>
as.data.frame() |>
dplyr::arrange(rank)
svg_results |>
dplyr::select(gene_name, rank, LR_stat, padj, prop_sv) |>
head(12) |>
knitr::kable(format = "html", digits = 3)| gene_name | rank | LR_stat | padj | prop_sv | |
|---|---|---|---|---|---|
| ENSG00000197971 | MBP | 1 | 6270.376 | 0 | 0.839 |
| ENSG00000123560 | PLP1 | 2 | 4674.557 | 0 | 0.724 |
| ENSG00000131095 | GFAP | 3 | 3886.406 | 0 | 0.793 |
| ENSG00000110484 | SCGB2A2 | 4 | 3798.250 | 0 | 0.810 |
| ENSG00000173786 | CNP | 5 | 3388.665 | 0 | 0.641 |
| ENSG00000109846 | CRYAB | 6 | 3369.094 | 0 | 0.681 |
| ENSG00000160307 | S100B | 7 | 2367.239 | 0 | 0.570 |
| ENSG00000167996 | FTH1 | 8 | 2086.292 | 0 | 0.488 |
| ENSG00000107317 | PTGDS | 9 | 2032.456 | 0 | 0.639 |
| ENSG00000154146 | NRGN | 10 | 2001.645 | 0 | 0.674 |
| ENSG00000171617 | ENC1 | 11 | 1924.199 | 0 | 0.478 |
| ENSG00000132639 | SNAP25 | 12 | 1913.332 | 0 | 0.520 |
The top ranked genes describe major anatomical and cell-state
structure in this section. MBP, PLP1, and
CNP are myelin/oligodendrocyte-associated genes, and they
mark white matter-rich areas. GFAP, S100B, and
CRYAB are glial or stress-associated genes.
NRGN, SNAP25, VSNL1, and
NEFL are neuronal or synaptic genes, and tend to highlight
cortical grey matter regions. This is exactly the type of signal SVG
methods are designed to recover: not only variable genes, but genes
whose variability is spatially organised.
We can visualise a few top SVGs on the tissue coordinates.
svg_case_study_genes <- c("MBP", "PLP1", "CNP", "GFAP", "S100B", "NRGN", "SNAP25", "NEFL")
svg_case_study_ids <- rownames(spatial_data_svg)[
match(svg_case_study_genes, rowData(spatial_data_svg)$gene_name)
]
valid_gene_idx <- !is.na(svg_case_study_ids)
svg_case_study_genes <- svg_case_study_genes[valid_gene_idx]
svg_case_study_ids <- svg_case_study_ids[valid_gene_idx]
if (length(svg_case_study_ids) == 0) {
stop("None of the case-study genes were found in this sample.")
}
spot_gene_index <- rep(seq_len(ncol(spatial_data_svg)), times = length(svg_case_study_genes))
spatial_data_svg_long <- spatial_data_svg[, spot_gene_index]
colData(spatial_data_svg_long)$svg_gene <- rep(svg_case_study_genes, each = ncol(spatial_data_svg))
colData(spatial_data_svg_long)$svg_logcounts <- as.numeric(
t(assay(spatial_data_svg, "logcounts")[svg_case_study_ids, ])
)
ggspavis::plotCoords(
spatial_data_svg_long,
annotate = "svg_logcounts",
pal = "viridis",
point_size = 0.6
) +
facet_wrap(~svg_gene, ncol = 4) +
labs(color = "logcounts") +
theme(legend.position = "bottom")
Because the UMAP above was built from transcriptomic similarity rather than physical coordinates, colouring the same SVGs on UMAP asks a slightly different question: do spatial patterns also correspond to expression-defined clusters?
umap_svg_df <- data.frame(
UMAP1 = rep(reducedDim(spatial_data, "UMAP")[, 1], times = length(svg_case_study_genes)),
UMAP2 = rep(reducedDim(spatial_data, "UMAP")[, 2], times = length(svg_case_study_genes)),
gene = rep(svg_case_study_genes, each = ncol(spatial_data)),
logcounts = as.numeric(t(assay(spatial_data, "logcounts")[svg_case_study_ids, ]))
)
umap_svg_df |>
ggplot(aes(UMAP1, UMAP2, color = logcounts)) +
geom_point(size = 0.2) +
facet_wrap(~gene, ncol = 4) +
scale_color_viridis_c(na.value = "grey80") +
labs(color = "logcounts") +
theme_classic() +
theme(legend.position = "bottom")
In this example, the strongest SVGs mostly agree with the visible
UMAP structure. Myelin genes such as MBP define the white
matter-associated side of the embedding, while neuronal genes such as
NRGN mark the complementary cortical/grey matter side.
However, SVGs and UMAP clusters are not identical concepts: an SVG can
follow a smooth anatomical gradient or layer-specific pattern without
becoming a perfectly separated UMAP cluster.
Other useful approaches include spatialDE,
another Gaussian-process method; DESpace,
which tests for genes associated with spatial domains or clusters; and
Voyager,
which provides spatial autocorrelation statistics such as Moran’s I and
Geary’s C through a geospatial analysis framework. The Bioconductor OSTA
chapter on feature
selection and testing provides a broader discussion of these
choices.
Clustering in spatial transcriptomics is crucial for understanding the intricate cellular composition of tissues. This technique groups cells/pixels based on similar gene expression profiles, enabling the identification of distinct cell types and states within a tissue’s spatial context. Clustering reveals patterns of cellular organisation and differentiation, and interactions in the microenvironment.
First, we establish the number of nearest neighbors to use in the
k-NN graph. This graph forms the basis for clustering, using the
Walktrap algorithm to detect community structures that suggest natural
groupings or clusters in the data. buildSNNGraph is from
the scan package.
## Set number of Nearest-Neighbours (NNs)
k <- 10
## Build the k-NN graph
g_walk <-
spatial_data |>
scran::buildSNNGraph(
k = 10,
use.dimred = "PCA"
) |>
igraph::cluster_walktrap()## Warning in .buildSNNGraph(reducedDim(x, use.dimred), d = NA, transposed = TRUE, : 'buildSNNGraph' is deprecated.
## Use 'bluster::makeSNNGraph' instead.
## See help("Deprecated")
clus <- g_walk$membership
## Check how many
table(clus)## clus
## 1 2 3 4 5 6 7
## 966 2856 751 2336 3084 280 274
Applying Clustering Labels and Visualising Results: After determining clusters, we apply these labels back to the spatial data and visualise the results using UMAP. This allows us to observe how the data clusters in a reduced dimension space, and further visualise how these clusters map onto the physical tissue sections for context.
We can appreciate here that we get two main pixel clusters.
colLabels(spatial_data) <- factor(clus)
scater::plotUMAP(spatial_data, colour_by = "label") + scale_color_brewer(palette = "Paired")## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

Those two clusters group the white matter from the rest of the layers.
## Plot in tissue map
ggspavis::plotCoords(spatial_data, annotate = "label", sample_id = "sample_id") +
scale_color_brewer(palette = "Paired")## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

As for comparison, we show the manually annotated regions. We can see that while the single cell style clustering catchers, the overall tissue, architecture, a lot of details are not retrieved. We clusters cannot faithfully recapitulate the tissue morphology. However, they might represent specific cell types within morphological regions.
## Plot ground truth in tissue map
print(
ggspavis::plotCoords(
spatial_data,
annotate = "spatialLIBD",
sample_id = "sample_id",
pal = libd_layer_colors
)
)
To cluster spatial regions (i.e. tissue domain) rather than single-cell types, the clustering algorithms need to take spatial context into account. For example what is the transcriptional profile of the neighbouring pixels or neighbouring cells.
BANKSY combines molecular and spatial information. BANKSY leverages the fact that a cell’s state can be more fully represented by considering both its own transcriptome “nd that of its local microenvironment.This algorithm embeds cells within a combined space that incorporates their own transcriptome and that of their locell’svironment, representing both the cell state and the surrounding microenvironment.
Overview of the algorithm
* Construct a neighborhood graph between cells in physical space (k-nearest neighbors or radius nearest neighbors).
* We use neighborhood graph to compute two matrices:
– ** Average neighborhood expression matrix
– ** “Azimuthal Gabor filter” matrix. It represents the transcriptomic microenvironment around each cell. It measures the gradient of gene expression in each cell’s neighborhood.
* These matrices are then scaled on the basis of a mixing parameter λ, which controls their relative weighting
* Concatenate these two matrices with the original gene–cell expression matrix
* Combine these three matrices by direct product
library(Banksy)
# scale the counts, without log transformation
spatial_data = spatial_data |> logNormCounts(log=FALSE, name = "normcounts")Highly-variable genes
The Banksy documentation, suggest the use of Seurat for
the detection of highly variable genes.
library(Seurat)
# Convert to list
spatial_data_list_for_seurat <- lapply(unique(spatial_data$sample_id), function(x)
spatial_data[, spatial_data$sample_id == x ]
)
seu_list <- lapply(spatial_data_list_for_seurat, function(x) {
x <- as.Seurat(x, data = NULL)
NormalizeData(x, scale.factor = 5000, normalization.method = 'RC')
})
# Compute HVGs
hvgs <- lapply(seu_list, function(x) {
VariableFeatures(FindVariableFeatures(x, nfeatures = 2000))
})
hvgs <- Reduce(union, hvgs)
rm(seu_list, spatial_data_list_for_seurat)
rowData(spatial_data[head(hvgs),])[,c("gene_id", "gene_name")]We now split the data by sample, to compute the neighbourhood matrices.
# Convert to list
spatial_data_list <- lapply(unique(spatial_data$sample_id), function(x)
spatial_data[
hvgs,
spatial_data$sample_id == x
]
)
spatial_data_list <- lapply(
spatial_data_list,
computeBanksy, # Compute the component neighborhood matrices
assay_name = "normcounts"
)
# Rejoin the datasets
spe_joint <- do.call(cbind, spatial_data_list)Here, we perform PCA using the BANKSY algorithm on the joint dataset. The group argument specifies how to treat different samples, ensuring that features are scaled separately per sample group to account for variations among them.
Note: this step takes long time
spe_joint <- runBanksyPCA( # Run PCA on the Banskly matrix
spe_joint,
lambda = 0.2, # spatial weighting parameter. Larger values (e.g. 0.8) incorporate more spatial neighborhood
group = "sample_id", # Features belonging to the grouping variable will be z-scaled separately.
seed = 42
)Once the dimensional reduction is complete, we cluster the spots
across all samples and use connectClusters to visually
compare these new BANKSY clusters against manual annotations.
spe_joint <- clusterBanksy( # clustering on the principal components computed on the BANKSY matrix
spe_joint,
lambda = 0.2, # spatial weighting parameter. Larger values (e.g. 0.8) incorporate more spatial neighborhood
resolution = 0.7, # numeric vector specifying resolution used for clustering (louvain / leiden).
seed = 42
)
colData(spe_joint)$clust_annotation = colData(spe_joint)$Cluster
spe_joint <- connectClusters(spe_joint)As an optional step, we smooth the cluster labels for each sample independently, which can enhance the visual coherence of clusters, especially in heterogeneous biological samples.
From SpiceMix paper Chidester et al., 2023
spatial_data_list <- lapply(
unique(spe_joint$sample_id),
function(x)
spe_joint[, spe_joint$sample_id == x]
)
spatial_data_list <- lapply(
spatial_data_list,
smoothLabels,
cluster_names = "clust_M0_lam0.2_k50_res0.7",
k = 6L,
verbose = FALSE
)
names(spatial_data_list) <- paste0("sample_", unique(spe_joint$sample_id))The raw and smoothed cluster labels are stored in the
colData slot of each SingleCellExperiment or
SpatialExperiment object.
cluster_metadata =
colData(spatial_data_list$sample_151673)[, c("clust_M0_lam0.2_k50_res0.7", "clust_M0_lam0.2_k50_res0.7_smooth")]
knitr::kable(head(cluster_metadata, 10), format = "html")Using cluster comparison metrics like the adjusted Rand index (ARI) we evaluate the performance of our clustering approach. This statistical analysis helps validate the clustering results against known labels or pathologies.
The Adjusted Rand Index (ARI) is a measure of the similarity between two data clusterings. Measures degree of overlapping between two partitions.
compareClusters(spatial_data_list$sample_151673, func = 'ARI')We calculate the ARI for each sample to assess the consistency and accuracy of our clustering across different samples.
ari <- sapply(spatial_data_list, function(x) as.numeric(tail(compareClusters(x, func = "ARI")[, 1], n = 1)))
ariVisualising Clusters and Annotations on Spatial Coordinates: We utilise the ggspavis package to visually map BANKSY clusters and manual annotations onto the spatial coordinates of the dataset, providing a comprehensive visual overview of spatial and clustering data relationships.
When using knitr (including vignettes), only the
last ggplot produced in a code chunk is written out by
default. If you need several figures from the same narrative block,
split them into separate chunks below or wrap each ggplot in
print().
# Use scater:::.get_palette('tableau10medium')
library(cowplot)
pal <- c(
"#1abc9c", "#3498db", "#9b59b6", "#e74c3c", "#34495e",
"#f39c12", "#d35400", "#7f8c8d", "#2ecc71", "#e67e22"
)
ggspavis::plotCoords(
do.call(cbind, spatial_data_list),
annotate = sprintf("%s_smooth", "clust_M0_lam0.2_k50_res0.7"),
pal = pal,
sample_id = "sample_id"
) +
theme(legend.position = "none") +
labs(title = "BANKSY clusters (smoothed)")
ggspavis::plotCoords(
do.call(cbind, spatial_data_list),
annotate = sprintf("%s", "clust_M0_lam0.2_k50_res0.7"),
pal = pal,
sample_id = "sample_id"
) +
theme(legend.position = "none") +
labs(title = "BANKSY clusters")
ggspavis::plotCoords(
spatial_data,
annotate = "spatialLIBD",
sample_id = "sample_id",
pal = libd_layer_colors
) +
theme(legend.position = "none") +
labs(title = "spatialLIBD regions")Exercise 1.2
We have applied cluster smoothing using smoothLabels.
How much do you think this operation has affected the cluster labels. To
find out,
plotObsQC that we have used
above.One of the popular algorithms for spatial deconvolution is SPOTlight. Elosua-Bayes et al., 2021.
SPOTlight uses a seeded non-negative matrix factorization regression, initialized using cell-type marker genes and non-negative least squares.
Here, we retrieve and prepare a single-cell RNA reference. The dataset in question, zhong-prefrontal-2018, originates from a study by Zhong et al. (2018), which offers a comprehensive single-cell transcriptomic survey of the human prefrontal cortex during development . Utilising the scRNAseq package, the dataset is fetched and subsequently processed to aggregate counts across cells sharing the same sample and cell type, thereby reducing data complexity and enhancing interpretability. Further filtering steps ensure the removal of empty columns and entries with missing cell type annotations. Finally, the logNormCounts function from the scuttle package is applied to perform log-normalisation, a crucial step for mitigating technical variability and preparing the data for accurate comparative analyses .
# Get reference
library(scRNAseq)
brain_reference <- fetchDataset("zhong-prefrontal-2018", "2023-12-22")
brain_reference =
brain_reference |>
scuttle::aggregateAcrossCells(ids = paste(brain_reference$sample, brain_reference$cell_types, sep = "_"))
brain_reference = brain_reference[, brain_reference |> assay() |> colSums() > 0]
brain_reference = brain_reference[, !brain_reference$cell_types |> is.na()]
brain_reference =
brain_reference |>
logNormCounts()| developmental_stage | gender | sample | cell_types | week | Gene_num | Pre_Map_Reads | Aligned_Reads | MappingRate | ids | ncells | sizeFactor | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GW08_PFC1_GABAergic neurons | 8 weeks after gestation | female | GW08_PFC1 | GABAergic neurons | GW08 | NA | NA | NA | NA | GW08_PFC1_GABAergic neurons | 3 | 0.0382891 |
| GW08_PFC1_Microglia | 8 weeks after gestation | female | GW08_PFC1 | Microglia | GW08 | NA | NA | NA | NA | GW08_PFC1_Microglia | 2 | 0.2713548 |
| GW08_PFC1_Neurons | 8 weeks after gestation | female | GW08_PFC1 | Neurons | GW08 | NA | NA | NA | NA | GW08_PFC1_Neurons | 15 | 0.6444229 |
| GW08_PFC1_Stem cells | 8 weeks after gestation | female | GW08_PFC1 | Stem cells | GW08 | NA | NA | NA | NA | GW08_PFC1_Stem cells | 3 | 0.1336195 |
| GW09_PFC1_GABAergic neurons | 9 weeks after gestation | female | GW09_PFC1 | GABAergic neurons | GW09 | NA | NA | NA | NA | GW09_PFC1_GABAergic neurons | 3 | 0.1548780 |
| GW09_PFC1_Neurons | 9 weeks after gestation | female | GW09_PFC1 | Neurons | GW09 | NA | NA | NA | NA | GW09_PFC1_Neurons | 35 | 3.9421720 |
These are the cell types included in our reference, and the number of pseudobulk samples we have for each cell type.
table(brain_reference$cell_types)##
## Astrocytes GABAergic neurons Microglia Neurons
## 18 38 28 39
## OPC Stem cells
## 27 25
These are the number of samples we have.
table(brain_reference$sample)##
## GW08_PFC1 GW09_PFC1 GW10_PFC1_1 GW10_PFC1_2 GW10_PFC2_1 GW10_PFC2_2
## 4 3 3 3 4 4
## GW10_PFC3 GW12_PFC1 GW13_PFC1 GW16_PFC1_1 GW16_PFC1_2 GW16_PFC1_3
## 3 5 3 4 3 4
## GW16_PFC1_4 GW16_PFC1_5 GW16_PFC1_6 GW16_PFC1_7 GW16_PFC1_8 GW16_PFC1_9
## 5 5 5 5 5 5
## GW19_PFC1_1 GW19_PFC1_2 GW19_PFC1_3 GW23_PFC1_1 GW23_PFC1_2 GW23_PFC1_3
## 5 5 4 4 5 3
## GW23_PFC2_1 GW23_PFC2_2 GW23_PFC2_3 GW23_PFC2_4 GW23_PFC2_5 GW26_PFC1_1
## 5 4 5 5 4 5
## GW26_PFC1_10 GW26_PFC1_2 GW26_PFC1_3 GW26_PFC1_4 GW26_PFC1_5 GW26_PFC1_6
## 5 6 5 5 5 6
## GW26_PFC1_7 GW26_PFC1_8 GW26_PFC1_9
## 5 6 5
Initially, the code prepares the spatial data by setting gene names as row identifiers.
spatial_data_gene_name = spatial_data
rownames(spatial_data_gene_name) <- rowData(spatial_data_gene_name)$gene_name
spatial_data_gene_name = logNormCounts(spatial_data_gene_name)## Warning in .local(x, ...): 'normalizeCounts' is deprecated.
## Use 'scrapper::normalizeCounts' instead.
## See help("Deprecated")
We then identify and score relevant marker genes based on their expression and significance across different cell types. The result is a curated list of high-potential marker genes, organised and ready for deeper analysis and interpretation in the context of spatial gene expression patterns.
This function provides a convenience wrapper for marker gene
identification between groups of cells, based on running
pairwiseTTests.
This function represents a simpler and more intuitive summary of the differences between the groups. We do this by realizing that the p-values for these types of comparisons are largely meaningless; individual cells are not meaningful units of experimental replication, while the groups themselves are defined from the data. Thus, by discarding the p-values, we can simplify our marker selection by focusing only on the effect sizes between groups.
mgs <- scran::scoreMarkers(
brain_reference,
groups = brain_reference$cell_types,
# Omit mitochondrial genes and keep all the genes in spatial
subset.row =
grep("(^MT-)|(^mt-)|(\\.)|(-)", rownames(brain_reference), value=TRUE, invert=TRUE) |>
intersect(rownames(spatial_data_gene_name))
)## Warning in .local(x, ...): 'summarizeAssayByGroup' is deprecated.
## Use 'scrapper::aggregateAcrossCells' or 'beachmat::tatami.sums.by.group' instead.
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
## Warning in computeMinRank(res[[i]][[j]]): 'computeMinRank' is deprecated.
## Use 'scrapper::summarizeEffects' instead.
## See help("Deprecated")
# Select the most informative markers
mgs_df <- lapply(names(mgs), function(i) {
x <- mgs[[i]]
# Filter and keep relevant marker genes, those with AUC > 0.8
x <- x[x$mean.AUC > 0.8, ]
# Sort the genes from highest to lowest weight
x <- x[order(x$mean.AUC, decreasing = TRUE), ]
# Add gene and cluster id to the dataframe
x$gene <- rownames(x)
x$cluster <- i
data.frame(x)
})
mgs_df <- do.call(rbind, mgs_df)
head(mgs_df)## self.average other.average self.detected other.detected mean.logFC.cohen
## CLU 13.04434 7.354836 1 0.9349685 3.284103
## PTN 14.23937 9.908782 1 0.9848571 3.639877
## CST3 15.01614 10.143497 1 0.9842105 4.276129
## HOPX 12.16743 5.769645 1 0.8571423 2.918201
## TTYH1 11.46129 6.643598 1 0.9026910 2.646161
## MLC1 10.48779 3.884452 1 0.6874225 3.283323
## min.logFC.cohen median.logFC.cohen max.logFC.cohen rank.logFC.cohen
## CLU 2.095170 2.996598 5.187289 3
## PTN 1.435501 3.862314 6.435137 2
## CST3 1.771824 3.953989 7.535800 1
## HOPX 1.554792 3.515078 3.958490 7
## TTYH1 1.622048 2.806502 3.775169 14
## MLC1 1.610386 3.780573 4.493997 1
## mean.AUC min.AUC median.AUC max.AUC rank.AUC mean.logFC.detected
## CLU 0.9969218 0.9866667 1.0000000 1 1 0.09372303
## PTN 0.9943210 0.9777778 1.0000000 1 1 0.02107825
## CST3 0.9917695 0.9588477 1.0000000 1 1 0.02243015
## HOPX 0.9900576 0.9547325 1.0000000 1 1 0.21245867
## TTYH1 0.9897760 0.9650206 0.9955556 1 1 0.14541250
## MLC1 0.9879012 0.9506173 1.0000000 1 1 0.53653045
## min.logFC.detected median.logFC.detected max.logFC.detected
## CLU 1.601713e-16 5.573335e-02 0.20979238
## PTN -3.203427e-16 1.601713e-16 0.05573335
## CST3 -3.203427e-16 0.000000e+00 0.11215073
## HOPX 1.091634e-01 2.337602e-01 0.32736198
## TTYH1 0.000000e+00 1.137066e-01 0.38994652
## MLC1 2.371477e-01 4.066253e-01 1.12454510
## rank.logFC.detected gene cluster
## CLU 1557 CLU Astrocytes
## PTN 1557 PTN Astrocytes
## CST3 1557 CST3 Astrocytes
## HOPX 974 HOPX Astrocytes
## TTYH1 1157 TTYH1 Astrocytes
## MLC1 518 MLC1 Astrocytes
We now utilise SPOTlight to deconvolve
tisslet’smposition from our independent mouse brain reference. The
result is visualised through a scatter pie plot, which provides a
graphical representation of the spatial distribution of cell types
within a let’sfic sample. This visualisation aids in understanding the
spatial heterogeneity.
library(SPOTlight)
res <- SPOTlight(
x = brain_reference |> assay("logcounts"),
y = spatial_data_gene_name |> assay("logcounts"),
groups = brain_reference$cell_types,
group_id = "cluster",
mgs = mgs_df,
weight_id = "mean.AUC",
gene_id = "gene"
)##
## iter | tol
## ---------------
## 1 | 3.51e-01
## 2 | 1.67e-02
## 3 | 6.16e-03
## 4 | 4.62e-03
## 5 | 1.04e-02
## 6 | 1.03e-02
## 7 | 4.00e-03
## 8 | 1.20e-03
## 9 | 4.73e-04
## 10 | 2.38e-04
## 11 | 1.30e-04
## 12 | 7.58e-05
## 13 | 4.57e-05
## 14 | 2.74e-05
## 15 | 1.70e-05
## 16 | 1.11e-05
## 17 | 7.60e-06
cell_first_sample = colData(spatial_data_gene_name)$sample_id=="151673"
plotSpatialScatterpie(
x = spatial_data_gene_name[,cell_first_sample],
y = res$mat[cell_first_sample,],
cell_types = colnames(res$mat[cell_first_sample,]),
img = FALSE,
scatterpie_alpha = 1,
pie_scale = 0.4
) 
Let’s visualise without pit’syte_cell and endothelial cells, which oclet’sa lot of the visual spectrum.
plotSpatialScatterpie(
x = spatial_data_gene_name[,cell_first_sample],
y = res$mat[cell_first_sample,-c(2,9)],
cell_types = colnames(res$mat[cell_first_sample,-c(2,9)]),
img = FALSE,
scatterpie_alpha = 1,
pie_scale = 0.4
) 
Let’s also exclude without muscle_cell, which occupy a lot of the visual spectrum.
plotSpatialScatterpie(
x = spatial_data_gene_name[,cell_first_sample],
y = res$mat[cell_first_sample,-c(2, 9, 5)],
cell_types = colnames(res$mat[cell_first_sample,-c(2, 9, 5)]),
img = FALSE,
scatterpie_alpha = 1,
pie_scale = 0.4
) 
No, let’s look at the correlation matrices to see which cell type are most often occurring rather than mutually exclusive within our data set.
plotCorrelationMatrix(res$mat)## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the ggcorrplot package.
## Please report the issue at <https://github.com/kassambara/ggcorrplot/issues>.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

mat_df = as.data.frame(res$mat)Exercise 1.4
Rather than looking at the correlation matrix, overall, let’s observe whether the correlation structure amongst cell types is consistent across samples. Do you think it’s consistent or noticeably different?
Exercise 1.5
Some of the most positive correlations in the new matrix are seen between:
Microglia are the resident immune cells of the central nervous system, constantly surveying the parenchyma and clearing debris.
Neurons are the electrically excitable cells that transmit and process information via synaptic connections.
Astrocytes are star-shaped glia that support neuronal metabolism, regulate extracellular ions and neurotransmitter uptake.
Stem.cells denote undifferentiated progenitors capable of self-renewal and differentiation into multiple neural lineages.
Let us now visualise where these pairs of cell types most co-occur in your spatial map. For each pair, carry out the following:
0 = neither condition met1 = both abundances > 0.12 = summed abundance > 0.4You should end up with two analogous visualisations:
Feel free to reuse your previous code, simply substituting the cell-type columns and updating the thresholds as above.
Session Information
## R version 4.6.0 (2026-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] SPOTlight_1.17.0 scRNAseq_2.27.0
## [3] ExperimentHub_3.3.0 AnnotationHub_4.3.0
## [5] BiocFileCache_3.3.0 dbplyr_2.5.2
## [7] spatialLIBD_1.25.0 scran_1.41.0
## [9] scater_1.41.1 scuttle_1.23.1
## [11] ggspavis_1.19.0 ggplot2_4.0.3
## [13] SpatialExperiment_1.23.0 SingleCellExperiment_1.35.1
## [15] SummarizedExperiment_1.43.0 Biobase_2.73.1
## [17] GenomicRanges_1.65.0 Seqinfo_1.3.0
## [19] IRanges_2.47.1 S4Vectors_0.51.2
## [21] BiocGenerics_0.59.2 generics_0.1.4
## [23] MatrixGenerics_1.25.0 matrixStats_1.5.0
##
## loaded via a namespace (and not attached):
## [1] fs_2.1.0 ProtGenerics_1.45.0 bitops_1.0-9
## [4] httr_1.4.8 RColorBrewer_1.1-3 doParallel_1.0.17
## [7] tools_4.6.0 alabaster.base_1.13.0 R6_2.6.1
## [10] DT_0.34.0 nnSVG_1.17.0 HDF5Array_1.41.0
## [13] lazyeval_0.2.3 uwot_0.2.4 rhdf5filters_1.25.0
## [16] GetoptLong_1.1.1 withr_3.0.2 gridExtra_2.3
## [19] cli_3.6.6 textshaping_1.0.5 scatterpie_0.2.6
## [22] alabaster.se_1.13.0 labeling_0.4.3 sass_0.4.10
## [25] S7_0.2.2 pbapply_1.7-4 pkgdown_2.2.0
## [28] yulab.utils_0.2.4 Rsamtools_2.29.0 systemfonts_1.3.2
## [31] sessioninfo_1.2.3 attempt_0.3.1 limma_3.69.1
## [34] RSQLite_3.52.0 shape_1.4.6.1 BiocIO_1.23.3
## [37] dplyr_1.2.1 Matrix_1.7-5 ggbeeswarm_0.7.3
## [40] abind_1.4-8 lifecycle_1.0.5 yaml_2.3.12
## [43] edgeR_4.11.0 rhdf5_2.57.0 SparseArray_1.13.2
## [46] paletteer_1.7.0 grid_4.6.0 blob_1.3.0
## [49] promises_1.5.0 dqrng_0.4.1 crayon_1.5.3
## [52] lattice_0.22-9 beachmat_2.29.0 cowplot_1.2.0
## [55] GenomicFeatures_1.65.0 cigarillo_1.3.0 KEGGREST_1.53.0
## [58] magick_2.9.1 pillar_1.11.1 knitr_1.51
## [61] ComplexHeatmap_2.29.0 metapod_1.21.0 rjson_0.2.23
## [64] codetools_0.2-20 glue_1.8.1 ggfun_0.2.0
## [67] data.table_1.18.4 vctrs_0.7.3 png_0.1-9
## [70] gypsum_1.9.0 gtable_0.3.6 rematch2_2.1.2
## [73] cachem_1.1.0 xfun_0.57 S4Arrays_1.13.0
## [76] mime_0.13 ggside_0.4.1 iterators_1.0.14
## [79] statmod_1.5.2 bluster_1.23.0 bit64_4.8.2
## [82] alabaster.ranges_1.13.0 filelock_1.0.3 RcppAnnoy_0.0.23
## [85] GenomeInfoDb_1.49.0 bslib_0.11.0 irlba_2.3.7
## [88] vipor_0.4.7 otel_0.2.0 colorspace_2.1-2
## [91] DBI_1.3.0 tidyselect_1.2.1 BRISC_1.0.6
## [94] bit_4.6.0 compiler_4.6.0 curl_7.1.0
## [97] httr2_1.2.2 BiocNeighbors_2.7.2 h5mread_1.5.0
## [100] desc_1.4.3 DelayedArray_0.39.2 plotly_4.12.0
## [103] rtracklayer_1.73.0 scales_1.4.0 rappdirs_0.3.4
## [106] stringr_1.6.0 digest_0.6.39 alabaster.matrix_1.13.0
## [109] rmarkdown_2.31 benchmarkmeData_2.0.0 XVector_0.53.0
## [112] htmltools_0.5.9 pkgconfig_2.0.3 sparseMatrixStats_1.25.0
## [115] fastmap_1.2.0 ensembldb_2.37.0 rlang_1.2.0
## [118] GlobalOptions_0.1.4 htmlwidgets_1.6.4 UCSC.utils_1.9.0
## [121] shiny_1.13.0 farver_2.1.2 jquerylib_0.1.4
## [124] jsonlite_2.0.0 BiocParallel_1.47.0 config_0.3.2
## [127] BiocSingular_1.29.0 RCurl_1.98-1.18 magrittr_2.0.5
## [130] Rhdf5lib_2.1.0 Rcpp_1.1.1-1.1 viridis_0.6.5
## [133] stringi_1.8.7 alabaster.schemas_1.13.0 MASS_7.3-65
## [136] plyr_1.8.9 parallel_4.6.0 ggrepel_0.9.8
## [139] Biostrings_2.81.1 circlize_0.4.18 locfit_1.5-9.12
## [142] rdist_0.0.5 igraph_2.3.1 reshape2_1.4.5
## [145] ScaledMatrix_1.21.0 BiocVersion_3.24.0 XML_3.99-0.23
## [148] evaluate_1.0.5 golem_0.5.1 BiocManager_1.30.27
## [151] tweenr_2.0.3 foreach_1.5.2 httpuv_1.6.17
## [154] polyclip_1.10-7 RANN_2.6.2 tidyr_1.3.2
## [157] purrr_1.2.2 benchmarkme_1.0.8 clue_0.3-68
## [160] alabaster.sce_1.13.0 ggforce_0.5.0 BiocBaseUtils_1.15.1
## [163] rsvd_1.0.5 xtable_1.8-8 restfulr_0.0.16
## [166] AnnotationFilter_1.37.0 RSpectra_0.16-2 ggcorrplot_0.1.4.1
## [169] later_1.4.8 viridisLite_0.4.3 ragg_1.5.2
## [172] tibble_3.3.1 memoise_2.0.1 beeswarm_0.4.0
## [175] AnnotationDbi_1.75.0 GenomicAlignments_1.49.0 cluster_2.1.8.2
## [178] shinyWidgets_0.9.1
References
<mangiola.s at wehi.edu.au>↩︎