Four Tools. One Canvas. One Export.

Salmon's workflow is structurally different from alignment-based RNA-seq. There is no HISAT2 or STAR step and no BAM file. Reads go directly to Salmon after trimming. FastQC, BBDuk, Salmon, and MultiQC are all pre-configured in the GenXflo library.

Why Salmon Skips Alignment and When That Matters

featureCounts and StringTie require a BAM file from HISAT2 or STAR. They count reads that have been explicitly aligned to the genome. Salmon uses a different model: it maps reads directly to a transcript index (quasi-mapping in early releases, selective alignment in current ones), estimates transcript abundances using an expectation-maximisation algorithm, and produces TPM and count estimates without ever writing a genome-aligned BAM file.
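The abundance-estimation idea can be illustrated with a toy example. The sketch below is not Salmon's implementation; it is a minimal expectation-maximisation loop under simplifying assumptions (each read is just a list of compatible transcripts, and the function names em_abundance and tpm are invented for illustration). It shows how reads that match several transcripts are shared out in proportion to the current abundance estimates.

```python
def em_abundance(compat, eff_len, n_iter=200):
    """Toy EM. compat[i] lists the transcript indices read i is compatible with."""
    n_tx = len(eff_len)
    theta = [1.0 / n_tx] * n_tx  # fraction of reads per transcript, uniform start
    counts = [0.0] * n_tx
    for _ in range(n_iter):
        counts = [0.0] * n_tx
        # E-step: split each read among its compatible transcripts, weighted by
        # the current abundance per unit of effective length.
        for txs in compat:
            weights = [theta[t] / eff_len[t] for t in txs]
            total = sum(weights)
            for t, w in zip(txs, weights):
                counts[t] += w / total
        # M-step: the new read fractions are the normalised expected counts.
        n_reads = sum(counts)
        theta = [c / n_reads for c in counts]
    return counts


def tpm(counts, eff_len):
    """Convert estimated counts to transcripts per million."""
    rate = [c / l for c, l in zip(counts, eff_len)]
    total = sum(rate)
    return [1e6 * r / total for r in rate]


# Hypothetical data: 3 reads unique to transcript 0, 1 unique to transcript 1,
# and 4 reads compatible with both. Both transcripts have the same length.
reads = [[0]] * 3 + [[1]] + [[0, 1]] * 4
lengths = [1000.0, 1000.0]
est_counts = em_abundance(reads, lengths)  # converges to about 6 and 2 reads
est_tpm = tpm(est_counts, lengths)
```

The four ambiguous reads end up split three to one, mirroring the ratio of the uniquely mapping reads. Salmon itself works on far larger inputs with additional bias models, so treat this only as intuition for why no BAM file is needed.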

This makes Salmon substantially faster and less storage-intensive for large cohorts. The trade-off is that you get transcript-level quantification without the genome-aligned BAM that other downstream analyses might need. If your goal is differential expression at the gene or transcript level and you do not need the BAM for any other purpose, Salmon is the more efficient choice. If you need the BAM for variant calling, peak calling, or visualisation in IGV, an alignment-based pipeline is required.

FastQC

Quality control
Checks raw reads for quality and adapter contamination before trimming. FastQC summary statistics feed into the MultiQC report at the end of the pipeline through a channel connection in GenXflo.

BBDuk

Adapter trimming
Removes adapter sequences before quantification. Salmon's selective alignment tolerates low-quality bases, but adapter contamination can still reduce quantification accuracy. BBDuk parameters are configured in the GenXflo form and documented in the exported config file.
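For readers who want to see what such a trimming step looks like outside the form, here is a hedged sketch that assembles a BBDuk command for paired-end reads. The parameter values follow the adapter-trimming example in the BBTools documentation; the function name, file names, and the choice of values are assumptions for illustration, not necessarily what GenXflo exports.

```python
def bbduk_command(r1, r2, out1, out2, adapters="adapters.fa"):
    """Build a BBDuk adapter-trimming command as an argument list (not executed here)."""
    return [
        "bbduk.sh",
        f"in1={r1}", f"in2={r2}",        # paired-end input FASTQ files
        f"out1={out1}", f"out2={out2}",  # trimmed output FASTQ files
        f"ref={adapters}",               # adapter reference FASTA (assumed path)
        "ktrim=r",                       # trim adapters from the 3' (right) end
        "k=23", "mink=11", "hdist=1",    # k-mer size, short end k-mers, 1 mismatch
        "tpe", "tbo",                    # trim pairs to equal length, trim by overlap
    ]


# Hypothetical sample file names.
cmd = bbduk_command("s1_R1.fastq.gz", "s1_R2.fastq.gz",
                    "s1_R1.trim.fastq.gz", "s1_R2.trim.fastq.gz")
print(" ".join(cmd))
```

Keeping the command as a list makes it straightforward to pass to subprocess.run and to log exactly which parameters were used.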

Salmon

Selective alignment and quantification
Quantifies transcript abundance from trimmed reads against an index built from a transcript FASTA. The transcript index path, library type, and mapping validation (selective alignment) mode are configured through the GenXflo parameter form. Library type must match the sequencing protocol, for example stranded versus unstranded and paired-end versus single-end. Salmon produces a quant.sf file per sample containing TPM and estimated count values at transcript level, which can be aggregated to gene level with tools such as tximeta in R.
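As a rough illustration of this step, the sketch below builds a salmon quant command and then sums a quant.sf table to gene level. The transcript IDs, numbers, gene names, and helper function names are made up for the example. The simple summation is an assumption chosen for clarity; tximeta and tximport do more careful work, such as handling average transcript lengths.

```python
import csv
import io
from collections import defaultdict


def salmon_quant_command(index, r1, r2, out_dir, lib_type="A", threads=8):
    """Build a paired-end salmon quant command (not executed here).

    lib_type="A" asks Salmon to infer the library type automatically.
    --validateMappings requests selective alignment; recent Salmon
    releases enable it by default.
    """
    return ["salmon", "quant", "-i", index, "-l", lib_type,
            "-1", r1, "-2", r2, "--validateMappings",
            "-p", str(threads), "-o", out_dir]


def gene_level(quant_sf, tx2gene):
    """Sum TPM and NumReads per gene from an open quant.sf handle."""
    genes = defaultdict(lambda: {"TPM": 0.0, "NumReads": 0.0})
    for row in csv.DictReader(quant_sf, delimiter="\t"):
        gene = tx2gene.get(row["Name"])
        if gene is None:
            continue  # skip transcripts with no gene assignment
        genes[gene]["TPM"] += float(row["TPM"])
        genes[gene]["NumReads"] += float(row["NumReads"])
    return dict(genes)


# Hypothetical quant.sf content using Salmon's column layout.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1500\t1300.0\t10.0\t100.0\n"
    "tx2\t900\t700.0\t5.0\t50.0\n"
    "tx3\t2000\t1800.0\t20.0\t30.0\n"
)
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}  # assumed mapping
result = gene_level(example, tx2gene)
```

Here geneA receives the combined values of tx1 and tx2, while geneB carries tx3 alone.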

MultiQC

Report aggregation
Aggregates FastQC results and Salmon mapping statistics across all samples. Salmon writes per-sample log and metadata files that include the mapping rate, which MultiQC displays side by side for comparison. In GenXflo, MultiQC receives these files through channel connections, so every sample that completes the pipeline appears in the report.
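The same mapping rates can be inspected by hand. Salmon records a percent_mapped value in each sample's aux_info/meta_info.json file; the sketch below collects it across samples and flags low values. The directory layout (one Salmon output folder per sample), the sample names, and the 50% threshold are assumptions for illustration only.

```python
import json
import tempfile
from pathlib import Path


def mapping_rates(results_dir):
    """Return {sample: percent_mapped} for every Salmon output folder found."""
    rates = {}
    for meta in sorted(Path(results_dir).glob("*/aux_info/meta_info.json")):
        sample = meta.parent.parent.name  # the folder named after the sample
        with meta.open() as fh:
            rates[sample] = json.load(fh)["percent_mapped"]
    return rates


def flag_low(rates, threshold=50.0):
    """List samples whose mapping rate falls below an arbitrary threshold."""
    return sorted(s for s, r in rates.items() if r < threshold)


# Demonstration with fabricated metadata files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for sample, pct in [("sampleA", 87.5), ("sampleB", 42.0)]:
        aux = Path(tmp) / sample / "aux_info"
        aux.mkdir(parents=True)
        (aux / "meta_info.json").write_text(json.dumps({"percent_mapped": pct}))
    rates = mapping_rates(tmp)
    flagged = flag_low(rates)
```

A low mapping rate is often the first visible sign of a wrong transcript index or library type, so it is worth checking before any downstream analysis.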

Salmon in a Script vs Salmon in GenXflo

Without GenXflo

  • Library type flag easy to misconfigure, producing silently wrong quantification
  • Transcript index path hardcoded, pipeline breaks when paths change
  • No container: Salmon version inconsistent across team environments
  • Per-sample quant.sf files require manual collection before downstream analysis
  • No documented link between quantification parameters and the abundance estimates produced

With GenXflo

  • Library type configured in the form and documented in the exported config file
  • Transcript index path is a pipeline parameter, swappable without code changes
  • Container image pins the exact Salmon version across all environments
  • Channel routing collects per-sample quant.sf files with consistent output paths
  • Every parameter documented in the config file alongside the workflow script

Questions About Salmon and GenXflo

When should I use Salmon rather than HISAT2 plus FeatureCounts?

Use Salmon when your goal is differential expression analysis at the gene or transcript level and you do not need a genome-aligned BAM for any other purpose. Salmon is faster and produces less intermediate data. Use HISAT2 plus FeatureCounts when you need the BAM file for variant calling, visualisation in IGV, or any other downstream step that requires genome-level alignment.

Does Salmon quantification work with DESeq2?

Yes. Salmon quant.sf files are compatible with the tximeta and tximport R packages, which import Salmon output into a format DESeq2 and edgeR can use directly. This approach also carries the transcript-to-gene mapping into the differential expression analysis, along with uncertainty estimates if Salmon was run with bootstrap or Gibbs sampling.

Does the pipeline run on HPC or cloud?

Yes. The exported Nextflow DSL2 code runs on local machines, HPC clusters, and cloud platforms including AWS, Azure, and Google Cloud.

Build Your Salmon Pipeline Today

No BAM files. Transcript quantification connected directly to MultiQC. Reproducible across every sample.