Analysis Type : DA report

Authors	Affiliation
Eugénie Lohmann	CRCM (CiBi Group)
Adrien Mazuel	CRCM (CiBi Group)

Published	July 6, 2023

Input parameters
	Selection
project	Demonstration on ChKV
rds	demochkv.rds
path	/home/lohmann/immunopanc_jg/12_demochkv_sg/11_concat/
adcode	TRUE
analyse_type	DA
min_cells	5
min_samples	1
p_value	0.001
cluster_id	cluster_id
conditions	cond
batch	acqdate
individuals	patient_id
ncells	Freq
sorted	1
log_bar	1
cluster_column_id	marker
id_mfi	shared_col

This document has been formatted using knitr (Xie 2015) and quarto (Allaire 2022).

1 Features summary

The sample table lists, for each sample, the condition (cond), batch (acqdate), individual (patient_id), and number of cells (Freq).

Information on metaclusters is grouped in this table with the number of cells and the corresponding description.

A total of 39 markers are included in this analysis report.

The exhaustive list of markers is as follows: Bi209Di, Dy161Di, Dy162Di, Dy163Di, Dy164Di, Er166Di, Er167Di, Er168Di, Er170Di, Eu151Di, Eu153Di, Gd155Di, Gd156Di, Gd158Di, Gd160Di, Ho165Di, In113Di, In115Di, Lu175Di, Nd142Di, Nd143Di, Nd144Di, Nd145Di, Nd146Di, Nd148Di, Nd150Di, Pt194Di, Pt198Di, Sm147Di, Sm149Di, Sm152Di, Sm154Di, Tb159Di, Tm169Di, Yb171Di, Yb172Di, Yb173Di, Yb174Di, Yb176Di.

Differentially Abundant Clusters (DAC) are clusters whose cell abundance differs between two biological conditions.

The following report provides a turnkey analysis using the diffcyt package.

2 Differential Abundance test with edgeR

2.1 Design

The design is based on the comparison of the two levels of the cond column.
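As an illustration of what such a design looks like, here is a minimal base R sketch. The sample sheet and the level names `ctrl` and `inf` are hypothetical, and `model.matrix()` stands in for the design helper used in the report, so the result is only an approximation of the object the report builds.

```r
# Hypothetical sample sheet: two conditions, three samples each.
meta <- data.frame(
  sample_id = paste0("s", 1:6),
  cond = factor(c("ctrl", "ctrl", "ctrl", "inf", "inf", "inf"))
)

# Intercept plus one indicator column for the second level of cond.
design <- model.matrix(~ cond, data = meta)

# Contrast selecting the condition coefficient (second column of the design).
contrast <- matrix(c(0, 1), ncol = 1)

print(design)
```

With two levels, the design has two columns, which is why a contrast of length two is enough to test the condition effect.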

2.2 Differential Abundance (DA) table result

Sorted by adjusted p-value.
Clusters without enough cell counts per patient are excluded.
The DA table can be exported in different formats.

The following clusters are excluded for lack of sufficient cell counts: T Cell_unassigned, Debris.
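The exclusion rule can be sketched in base R as follows. The count matrix is made up, and the rule shown (keep a cluster if it has at least `min_cells` cells in at least `min_samples` samples) is an assumption about how the `min_cells` and `min_samples` parameters are applied, not a copy of the package code.

```r
# Hypothetical cluster-by-sample count matrix.
counts <- matrix(
  c(120, 95, 130, 110,
      0,  2,   1,   3,
     40,  0,  55,  60),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("c1", "c2", "c3"), paste0("s", 1:4))
)

min_cells <- 5    # value from the input parameters
min_samples <- 1  # value from the input parameters

# Count, per cluster, the samples reaching min_cells, then apply min_samples.
keep <- rowSums(counts >= min_cells) >= min_samples
counts_kept <- counts[keep, , drop = FALSE]

print(rownames(counts_kept))
```

In this toy example, cluster `c2` never reaches 5 cells in any sample, so it is dropped before testing.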

3 Volcano Plot

log2(Fold Change) on the x-axis.
-log10(adjusted p-value) on the y-axis.
The size of each dot indicates the number of cells in the cluster.
“Significant” DACs are highlighted in red; the associated cutoffs are a combination of:
- absolute log2(FC) > 1.
- adjusted p-value cutoff: 0.001.
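The two cutoffs above combine as a logical AND. A minimal sketch on a made-up result table follows; the column names `logFC` and `p_adj` and all values are illustrative assumptions.

```r
# Hypothetical DA results for four clusters.
res <- data.frame(
  cluster_id = c("a", "b", "c", "d"),
  logFC = c(2.3, -1.8, 0.4, -3.0),
  p_adj = c(1e-5, 2e-4, 1e-6, 0.02)
)

lfc_cutoff <- 1    # absolute log2 fold change threshold
p_cutoff <- 0.001  # adjusted p-value threshold

# A cluster is highlighted only if it passes both cutoffs.
res$significant <- abs(res$logFC) > lfc_cutoff & res$p_adj < p_cutoff

# y-axis value of the volcano plot.
res$neg_log10_p <- -log10(res$p_adj)

print(res)
```

Cluster `c` has a very small adjusted p-value but a small fold change, and cluster `d` has a large fold change but a p-value above the cutoff, so neither is highlighted.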

(a) example plot with ggiraph

Figure 1: Volcano Plot DA edgeR

4 Number of cells per Cluster and P-value association

Number of cells per cluster in relation to low adjusted p-values.

Mean % of total cells on the x-axis.
-log10(adjusted p-value) on the y-axis.
The size of each dot indicates the number of cells in the cluster.
“Significant” DACs are highlighted in red; the associated cutoffs are a combination of:
- mean % of total cells > 1.
- adjusted p-value cutoff: 0.001.
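The report does not spell out how the mean % of total cells is computed. One plausible reading, sketched here on made-up counts, is to convert counts to per-sample percentages and average them across samples for each cluster.

```r
# Hypothetical cluster-by-sample counts; each sample totals 1000 cells.
counts <- matrix(
  c(500, 450, 520,
      5,   8,   3,
    495, 542, 477),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("c1", "c2", "c3"), paste0("s", 1:3))
)

# Percentage of each sample's cells falling in each cluster.
pct <- sweep(counts, 2, colSums(counts), "/") * 100

# Mean percentage per cluster across samples (assumed x-axis value).
mean_pct <- rowMeans(pct)

# Hypothetical adjusted p-values, one per cluster.
p_adj <- c(c1 = 1e-4, c2 = 1e-5, c3 = 0.5)

flagged <- mean_pct > 1 & p_adj < 0.001
print(data.frame(mean_pct, p_adj, flagged))
```

This view separates abundant significant clusters (`c1`) from rare ones (`c2`), whose low p-values rest on very few cells.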

(a) simple dot plot with ggplot

Figure 2: Abundance on DA clusters

4.1 Significant cluster list

Significant cluster IDs, sorted by adjusted p-value: 20, 21, 26, 16, 22, 2, 8, 17, 10, 19, 13.

5 Violin plot

Violin plot of transformed DAC expression.

Shows the entire distribution of DACs between groups.
Arcsinh (asinh) transformation.
Sorted by adjusted p-value.
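For readers unfamiliar with the transformation, here is a minimal sketch of an arcsinh transform in base R. The cofactor of 5 is a common convention for mass cytometry data and is an assumption here; the report does not state which cofactor, if any, is applied.

```r
# Hypothetical raw values spanning several orders of magnitude.
x <- c(0, 5, 50, 500)

cofactor <- 5  # assumed value, not taken from the report

# asinh is roughly linear near zero and logarithmic for large values.
y <- asinh(x / cofactor)

print(round(y, 3))
```

Unlike a plain logarithm, the transform is defined at zero, which matters for clusters or markers with no signal in some samples.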

6 Heatmap of significant clusters

Exploration of the variability of patient_id across DACs.

Visualize samples organized according to hierarchical grouping or by condition on DAC.

Percentage heatmap with samples in columns and clusters in rows.

By default, clusters and samples are grouped using hierarchical clustering. It is also possible to classify samples by condition, gender or patient, and to give clusters as an ordered vector.

Group abundance is shown in the right-hand barplot with log values.
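The default ordering described above can be approximated with base R. In this sketch the counts are invented, percentages are computed within each sample (an assumption about the normalisation), and `hclust()` with Euclidean distances stands in for whatever clustering settings the heatmap function uses.

```r
# Hypothetical cluster-by-sample counts.
counts <- matrix(
  c(300, 280,  90, 110,
     50,  60, 400, 380,
    150, 160, 160, 140,
     20,  25, 210, 190),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cluster_", 1:4), paste0("sample_", 1:4))
)

# Percentage of each sample's cells per cluster (columns sum to 100).
pct <- prop.table(counts, margin = 2) * 100

# Hierarchical clustering of rows (clusters) and columns (samples).
row_order <- hclust(dist(pct))$order
col_order <- hclust(dist(t(pct)))$order
pct_ordered <- pct[row_order, col_order]

# Log-scaled total abundance per cluster, as in the side barplot.
log_abundance <- log10(rowSums(counts))[row_order]

print(round(pct_ordered, 1))
```

To group samples by condition instead, `col_order` would be replaced by an ordering built from the sample metadata.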

Code

# css: www/custom.css
# js: www/script.js
#| include=F
library(R.AnalytiCyte)
library(readr)
library(dplyr)
library(SummarizedExperiment)
library(diffcyt)
library(ComplexHeatmap)
library(RColorBrewer)
library(randomcoloR)
library(colorRamp2)
library(data.table)
library(tidyr)
library(ggplot2)
library(ggrepel)
library(grid)
library(kableExtra)
library(tibble)
library(plotly)
library(reactable)





# parameters set

se <- dataSE$SE_abundance
rowData(se) |>
  as.data.frame() %>%
  mutate(cluster_id = factor(cluster_id, levels = rownames(se))) %>%
  arrange(cluster_id) -> rowData(se)

desired_width <- 10 # Figure width; adjust to control the scaling
desired_height <- 10 # Figure height; adjust to control the scaling


knitr::opts_chunk$set(
  fig.width = desired_width,
  fig.height = desired_height
)


se_mfi <- dataSE$SE_mfi
applied_fct_plot <- "median"
t_metadata <- "meta"
perCellCounts <- "perCellCounts"
id <- "shared_col"
scaled_MFI <- "scaled_MFI"

Conditions <- params$conditions
batch <- params$batch

patient <- params$individuals
size_ <- params$ncells


col_interest <- Conditions
col_cell <- size_
separator_ <- Conditions
col_ <- Conditions
shape_ <- Conditions

r_order <- rowData(se) |>
  as.data.frame() |>
  dplyr::arrange(desc(n_cells)) |>
  rownames()


cluster_column_id <- params$cluster_column_id # : "ftr (sec)"
id_mfi <- params$id_mfi # : "file"

sorted <- params$sorted
log_bar <- params$log_bar

# Settings shared by the DA and DS analyses
if (params$analyse_type %in% c("DA", "DS")) {
  trend_method_ <- "none"
  min_cells_ <- params$min_cells
  min_samples_ <- params$min_samples # declared as min_samples in the input parameters
  normalize_ <- FALSE
  norm_factors_ <- "TMM"
  c_id <- params$cluster_id
  pval <- params$p_value
}


R.AnalytiCyte::create_dt(colData(se) %>%
  as.data.frame(), length = 10, filter = "top")


R.AnalytiCyte::create_dt(rowData(se) %>%
  as.data.frame(), length = 10, filter = "top")
markers <- names(assays(se_mfi))




labs <- knitr::all_labels()

library(ggiraph)

meta <- getFeature(se_object = se, target_vector = c("metadata", t_metadata))
# sprintf('## Design Table')
design <- diffcyt::createDesignMatrix(
  experiment_info = meta, cols_design = c(col_interest)
)

# contrast <- diffcyt::createContrast(c(0,1,0,0))
contrast <- diffcyt::createContrast(c(0, 1))
# data.frame(parameters = colnames(design), contrast)



res_DA <- diffcyt::testDA_edgeR(
  se,
  design,
  contrast,
  trend_method = trend_method_,
  min_cells = min_cells_,
  min_samples = min_samples_,
  normalize = normalize_,
  norm_factors = norm_factors_
)





table_Resul_annot <- merge(rowData(se), rowData(res_DA), by = c_id)




# DA table: drop rows with missing values, then sort by adjusted p-value
R.AnalytiCyte::create_dt(
  na.omit(table_Resul_annot) |>
    as.data.frame() |>
    arrange(p_adj),
  filter = "top", length = 10
)



# Volcano plot  ####
R.AnalytiCyte::volcano_plot(
  annotation = "CellSubset",
  se = se,
  deres = res_DA,
  exprs = "counts",
  annot_se = c_id,
  annot_deres = c_id,
  t_metadata = t_metadata,
  col_cell = col_cell,
  col_interest = col_interest,
  sorted = FALSE,
  target = "rowData",
  id = id,
  pvalue_ = pval
)

R.AnalytiCyte::Abundance_size_sig(
  annotation = "CellSubset",
  se = se,
  deres = res_DA,
  exprs = "counts",
  annot_se = c_id,
  annot_deres = c_id,
  t_metadata = t_metadata,
  col_cell = col_cell,
  col_interest = col_interest,
  sorted = FALSE,
  target = "rowData",
  id = id,
  pvalue_ = pval
)



# Keep only the clusters below the adjusted p-value cutoff
Sub_DA_res <- subset(table_Resul_annot, p_adj < as.numeric(pval)) |> as.data.frame()


# R.AnalytiCyte::create_dt(Sub_DA_res, filter="top")


significant <- Sub_DA_res |>
  arrange(p_adj) |>
  pull(c_id)
names <- Sub_DA_res |>
  arrange(p_adj) |>
  pull(CellSubset)
significant <- droplevels(significant)
# significant <- factor(significant, levels = significant)



if (length(significant) > 0) {
  R.AnalytiCyte::boxplot_like_catalyst(
    rowdata = TRUE,
    se = se,
    exprs = "perCellCountsNorm",
    t_metadata = t_metadata,
    col_cell = col_cell,
    col_interest = col_interest,
    sorted = sorted,
    annot_se = c_id,
    target = "rowData",
    split_ = "CellSubset",
    col_ = col_,
    separator_ = col_,
    shape_ = patient,
    id = id,
    sub_split_ = names,
    applied_fct_plot = "geom_violin"
  )
}

if (length(significant) > 0) {
  R.AnalytiCyte::heatmap_abundance_like_catalyst(
    # TODO: add the option to sort columns and rows
    se = se,
    metadata_sub = c("acqdate", "patient_id", "cond"),
    exprs = "counts",
    t_metadata = t_metadata,
    separator_ = col_,
    patient = patient,
    id = id,
    metadata_sort = NULL,
    clustr = TRUE,
    clustc = TRUE,
    margin_ = 1,
    q_ = 0.01,
    round_ = 8,
    fontsize_ = 8,
    subset_patient = NULL, # reduction of patients of interest, order of the heatmap if clustr F
    subset_cluster = as.character(significant), # reduction of clusters of interest, order of the heatmap if clustc F
    log_bar = TRUE
  )
}

Sources

Allaire, JJ. 2022. Quarto: R Interface to ’Quarto’ Markdown Publishing System. https://CRAN.R-project.org/package=quarto.

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman; Hall/CRC. https://yihui.org/knitr/.