Bioconductor Meets Brazil: A One-Day Immersion in RNA-seq with tidybulk and Carpentries

We are thrilled to share that we recently organized the first Bioconductor (bioC-RNAseq) course in the state of Minas Gerais—one of the few such events ever held in Brazil.

The course took place at the Federal University of Minas Gerais (UFMG), supported by the graduate programs in Bioinformatics and in Biochemistry & Immunology. It was part of the 10-year anniversary edition of the Bioinformatics Summer Course.

Our experience in organizing this course builds upon insights from our publication, “From In-Person to the Online World: Insights Into Organizing Events in Bioinformatics”. A key highlight was the collaboration between departments at Instituto de Ciências Biológicas (ICB) and Departamento de Ciência da Computação/Instituto de Ciências Exatas (DCC/ICEx), fostering interdisciplinary learning and research.

The new tidy sccomp interface


We announce the new tidy interface for sccomp, which improves modularity and clarity. The main change is that sccomp is now split into functions that can be chained with the pipe operator |>.

| Function | Description |
| --- | --- |
| Estimation: sccomp_estimate() | Usually run once in the analysis (per model). |
| Testing: sccomp_test() | Can be run multiple times, depending on how many contrasts you want to test (e.g. age, untreated vs treated). |
| Outlier removal: sccomp_remove_outliers() | Usually run once after sccomp_estimate(), in case you want estimates that are not influenced by outlier data points. |
| Unwanted variation removal: sccomp_remove_unwanted_variation() | Run after sccomp_estimate(); produces a dataset that preserves only the variability of your factor of interest. |
| Data replication: sccomp_replicate() | Run after sccomp_estimate(); produces a dataset representing the theoretical data distribution according to the model (from the posterior distribution). |
| Plotting: plot() | Run after sccomp_test(); outputs a series of summary plots. |
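The modules listed above can be chained in a single pipeline. The following is a minimal sketch, not an authoritative reference: the input data frame `cell_counts`, its columns (`sample`, `cell_group`, `count`, `type`) and the wrapper `run_sccomp_pipeline()` are hypothetical, and the argument names (`formula_composition`, `.sample`, `.cell_group`, `.count`, `cores`) are assumed from the sccomp release current when this post was written. They may differ in your installed version, so check `?sccomp_estimate` before use.

```r
# Hypothetical wrapper around the modular sccomp workflow.
# The sccomp package must be installed when the function is called.
# Argument names are assumptions and may differ between sccomp versions.
run_sccomp_pipeline <- function(cell_counts) {
  cell_counts |>
    # Estimation: fit the model once per formula
    sccomp::sccomp_estimate(
      formula_composition = ~ type,   # `type` is a hypothetical covariate
      .sample = sample,
      .cell_group = cell_group,
      .count = count,
      cores = 1
    ) |>
    # Optional: re-estimate after excluding outlier data points
    sccomp::sccomp_remove_outliers(cores = 1) |>
    # Testing: with no arguments, test the coefficients of the fitted model
    sccomp::sccomp_test()
}

# Usage (not run here, because fitting requires sccomp and its Stan backend):
# results <- run_sccomp_pipeline(cell_counts)
# plot(results)
```

Because estimation and testing are separate steps, the fitted object can be saved after `sccomp_estimate()` and passed to `sccomp_test()` several times with different contrasts, without refitting the model.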

A reminder: what is sccomp

sccomp is a statistical model developed for differential variability analysis in compositional data, primarily used in cellular omics fields such as single-cell genomics, proteomics, and microbiomics (Mangiola et al. 2023). It addresses limitations of existing methods for differential abundance analysis in two ways: it models the properties of compositional count data, which were previously not adequately addressed, and it tackles cell-group-specific differential variability. To do so, the model uses a constrained Beta-binomial distribution. Key capabilities of sccomp include improved differential abundance analyses through cross-sample information borrowing, outlier identification and exclusion, realistic data simulation, and cross-study knowledge transfer. Together, these features provide a more comprehensive and accurate framework for analyzing cellular omics data and for identifying biological drivers such as markers of disease progression in cancer and pathogen infection.
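To give some intuition for why a Beta-binomial is a natural choice for cell-group counts, the base R sketch below simulates counts from a plain, unconstrained Beta-binomial and from a binomial with the same mean. This is an illustration only and is not sccomp's constrained model; all values (`n_samples`, `total_cells`, `mu`, `phi`) are hypothetical.

```r
# Illustrative only: a plain (unconstrained) Beta-binomial, not sccomp's model.
set.seed(1)

n_samples   <- 2000   # hypothetical number of samples
total_cells <- 1000   # hypothetical number of cells per sample
mu          <- 0.2    # mean proportion of the cell group
phi         <- 20     # precision; smaller values give more variability

# Beta-binomial: draw a per-sample proportion, then a count given that proportion
proportions <- rbeta(n_samples, mu * phi, (1 - mu) * phi)
counts_bb   <- rbinom(n_samples, total_cells, proportions)

# Binomial with the same mean proportion, for comparison
counts_binom <- rbinom(n_samples, total_cells, mu)

# Compare the spread of the two sets of counts
c(beta_binomial = var(counts_bb), binomial = var(counts_binom))
```

The Beta-binomial counts are far more variable than the binomial ones, because the proportion itself varies from sample to sample. That extra, group-specific variability is the quantity a differential variability analysis is concerned with.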