Five Tools. One Canvas. One Export.

StringTie and FeatureCounts solve slightly different problems. FeatureCounts produces gene-level read counts, while StringTie produces transcript-level abundances and can detect novel isoforms. Both are in the GenXflo library and connect to the same HISAT2 alignment output.

What GenXflo Does Across This Pipeline

StringTie requires a coordinate-sorted BAM from HISAT2 and a reference GTF. Both the BAM routing and the GTF parameter need to be correct or the transcript assembly produces unreliable results. GenXflo validates the BAM input type and keeps the GTF path as a documented parameter in the config file.

BAM routing validated

GenXflo checks that HISAT2 sorted BAM output is compatible with StringTie input before generating any code.
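As an illustration of what a sort-order check can look like outside GenXflo, here is a minimal Python sketch. It is not GenXflo's own validation logic. It assumes the header text has already been obtained, for example with samtools view -H, and the function names are hypothetical.

```python
def sort_order(header_text: str) -> str:
    """Return the SO value from the @HD header line, or 'unknown' if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def is_stringtie_ready(header_text: str) -> bool:
    """StringTie expects alignments sorted by coordinate."""
    return sort_order(header_text) == "coordinate"


# Example header, as it might be printed by `samtools view -H sample.bam`
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n"
print(is_stringtie_ready(header))
```

A name-sorted or unsorted BAM fails this check, which is the kind of mismatch worth catching before any assembly runs.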

GTF path as parameter

Reference annotation path is a config parameter, not hardcoded. Update it for a different genome build without touching the pipeline script.
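The idea of keeping the annotation path out of the script can be sketched in a few lines of Python. This is an assumption-laden illustration, not the exported GenXflo code: the parameter names, default values, and output naming are hypothetical. The StringTie options used (-G, -o, -p) are standard.

```python
import shlex

# Hypothetical defaults; in practice these would live in the config file.
DEFAULT_PARAMS = {"reference_gtf": "annotation/genes.gtf", "threads": 4}


def stringtie_command(sample_id, bam, params):
    """Build a StringTie command from parameters instead of hardcoded paths."""
    out_gtf = f"{sample_id}.stringtie.gtf"
    return [
        "stringtie", bam,
        "-G", params["reference_gtf"],  # reference annotation
        "-o", out_gtf,                  # assembled transcripts
        "-p", str(params["threads"]),   # worker threads
    ]


# Switching genome build means overriding one parameter, not editing the script.
params = {**DEFAULT_PARAMS, "reference_gtf": "annotation/other_build.gtf"}
print(shlex.join(stringtie_command("sampleA", "sampleA.sorted.bam", params)))
```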

Strandedness documented

StringTie strandedness setting is configured in the form and exported to the config file. Misconfigurations become visible and traceable.
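One way to keep strandedness traceable is to derive every tool's flag from a single setting. The Python sketch below assumes paired-end libraries, and the setting names (unstranded, reverse, forward) are hypothetical labels rather than GenXflo's own. The flags themselves are the documented HISAT2 and StringTie options.

```python
# One strandedness setting, mapped to each tool's flag (paired-end assumed).
STRAND_FLAGS = {
    "unstranded": {"hisat2": [], "stringtie": []},
    # fr-firststrand libraries, for example dUTP-based protocols
    "reverse": {"hisat2": ["--rna-strandness", "RF"], "stringtie": ["--rf"]},
    # fr-secondstrand libraries
    "forward": {"hisat2": ["--rna-strandness", "FR"], "stringtie": ["--fr"]},
}


def strand_flags(setting: str, tool: str) -> list:
    """Return the command-line flags for a tool, rejecting unknown settings."""
    if setting not in STRAND_FLAGS:
        raise ValueError(f"unknown strandedness setting: {setting!r}")
    return list(STRAND_FLAGS[setting][tool])


print(strand_flags("reverse", "hisat2"), strand_flags("reverse", "stringtie"))
```

Rejecting unknown values outright is a deliberate choice here: a typo should stop the run rather than fall back to unstranded behavior without notice.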

Reproducible across samples

Nextflow channels run every sample through the same steps with consistent output paths. MultiQC gets every sample in its report.

Why StringTie and How It Fits the Workflow

FastQC

Quality control
Checks raw reads before trimming. Per-base quality and adapter content inform BBDuk configuration. Its output reaches MultiQC at the end through a direct channel connection in GenXflo.

BBDuk

Adapter trimming
Removes adapters and quality-trims reads. For RNA-seq, the minimum read length threshold after trimming matters because very short reads can map ambiguously to spliced transcripts. Parameters are configured through the GenXflo form and documented in the exported config.
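For reference, a trimming call with an explicit minimum length might be assembled as in the Python sketch below. The BBDuk parameter names are real, but the values, including the minimum length of 25, are illustrative assumptions and not GenXflo defaults. The function name is hypothetical.

```python
def bbduk_command(r1, r2, out1, out2, adapters, min_len=25, trim_q=10):
    """Build a paired-end BBDuk adapter and quality trimming command."""
    if min_len < 1:
        raise ValueError("min_len must be a positive read length")
    return [
        "bbduk.sh",
        f"in={r1}", f"in2={r2}",
        f"out={out1}", f"out2={out2}",
        f"ref={adapters}",                      # adapter sequences (FASTA)
        "ktrim=r", "k=23", "mink=11", "hdist=1",
        "tpe", "tbo",                           # paired-end trimming helpers
        "qtrim=rl", f"trimq={trim_q}",          # quality-trim both ends
        f"minlen={min_len}",                    # discard reads shorter than this
    ]


print(" ".join(bbduk_command("a_1.fq.gz", "a_2.fq.gz",
                             "a_1.trim.fq.gz", "a_2.trim.fq.gz", "adapters.fa")))
```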

HISAT2

Alignment
Aligns reads to the reference genome with splice-site awareness. StringTie reads the result as a coordinate-sorted BAM; HISAT2 itself writes SAM, so a sort-and-convert step (typically samtools sort) produces that BAM. HISAT2 is a natural pairing for StringTie: both come from the same research group, and HISAT2's --dta option reports alignments tailored for transcript assemblers such as StringTie. GenXflo connects them through a validated canvas link.
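A hedged sketch of the alignment step, written in Python as a command builder: it pipes HISAT2 output into samtools sort so StringTie receives a coordinate-sorted BAM. The function name, index name, and file names are hypothetical, and this is not the code GenXflo exports.

```python
import shlex


def hisat2_to_sorted_bam(index, r1, r2, out_bam, threads=4):
    """Return a shell pipeline: HISAT2 alignment piped into samtools sort."""
    align = [
        "hisat2", "--dta",        # alignments tailored for transcript assemblers
        "-p", str(threads),
        "-x", index,              # HISAT2 index basename
        "-1", r1, "-2", r2,       # paired-end reads; SAM goes to stdout
    ]
    sort = ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"]
    return f"{shlex.join(align)} | {shlex.join(sort)}"


print(hisat2_to_sorted_bam("grch38/genome", "a_1.trim.fq.gz",
                           "a_2.trim.fq.gz", "a.sorted.bam"))
```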

StringTie

Transcript quantification
Assembles transcripts from the aligned reads and estimates FPKM and TPM values at both gene and transcript level. Unlike FeatureCounts, it can detect novel transcripts not present in the reference GTF. The strandedness flag and reference GTF path are configured in the GenXflo parameter form and documented in the exported config file.
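FPKM and TPM are two scalings of the same measurement. The Python sketch below shows the standard relationship between them within one sample. It illustrates the arithmetic only and is not a description of StringTie's internal code. The transcript names and values are made up.

```python
def fpkm_to_tpm(fpkm: dict) -> dict:
    """Rescale FPKM values so they sum to one million within a sample."""
    total = sum(fpkm.values())
    if total == 0:
        return {name: 0.0 for name in fpkm}
    return {name: value * 1e6 / total for name, value in fpkm.items()}


# Hypothetical FPKM values for three transcripts in one sample
example = {"tx1": 10.0, "tx2": 30.0, "tx3": 60.0}
tpm = fpkm_to_tpm(example)
print(tpm)
print(sum(tpm.values()))
```

Because TPM always sums to the same total per sample, proportions are easier to compare across samples than with FPKM.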

MultiQC

Report aggregation
Aggregates FastQC and HISAT2 alignment statistics across all samples into a single HTML report. In GenXflo, MultiQC receives its inputs through channel connections, so the report covers every sample that ran through the pipeline rather than whichever files were collected by hand.

StringTie in a Script vs StringTie in GenXflo

Without GenXflo

  • Strandedness flag easy to misconfigure, producing silently wrong transcript abundances
  • Reference GTF path hardcoded, so the pipeline fails when paths change across environments
  • No containers: StringTie version drifts across team members and HPC modules
  • Per-sample GTF outputs require manual collection before any downstream merging step
  • No documented connection between quantification results and the parameters used

With GenXflo

  • Strandedness set in the form and documented in the config file alongside the workflow
  • Reference GTF is a pipeline parameter, not a hardcoded string
  • Container image pins the exact StringTie version across all environments
  • Per-sample outputs collected through Nextflow channels with consistent paths
  • Full audit trail: every quantification result traces to its parameters via the config file
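The manual collection step mentioned in the lists above can be sketched in Python as follows. It gathers per-sample GTFs into the list file that stringtie --merge takes as input. The directory layout, file-name pattern, and function names are assumptions for illustration, not GenXflo output conventions.

```python
from pathlib import Path


def write_merge_list(results_dir, list_path, pattern="*.stringtie.gtf"):
    """Write one per-sample GTF path per line, in a stable sorted order."""
    gtfs = sorted(Path(results_dir).glob(pattern))
    if not gtfs:
        raise FileNotFoundError(f"no files matching {pattern} in {results_dir}")
    Path(list_path).write_text("".join(f"{p}\n" for p in gtfs))
    return gtfs


def merge_command(reference_gtf, list_path, out_gtf="merged.gtf"):
    """Build the StringTie merge command for the collected GTFs."""
    return ["stringtie", "--merge", "-G", reference_gtf,
            "-o", out_gtf, str(list_path)]


print(" ".join(merge_command("annotation/genes.gtf", "mergelist.txt")))
```

Sorting the paths keeps the list identical between runs, which makes the merge step easier to reproduce and to diff.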

Questions About This Pipeline

Should I use StringTie or FeatureCounts for my RNA-seq experiment?

FeatureCounts produces gene-level read counts and is straightforward to use with DESeq2 or edgeR. StringTie produces transcript-level quantification and can detect novel isoforms not in the reference annotation. If you need gene-level counts for standard differential expression, FeatureCounts is simpler. If you need isoform resolution or are working with organisms with incomplete annotations, StringTie is the better choice. Both are in the GenXflo library and connect to the same HISAT2 output.

Can I use STAR instead of HISAT2 with StringTie?

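Yes. STAR is also in the GenXflo library. Swap it for HISAT2 on the canvas. StringTie connects to coordinate-sorted BAM output regardless of which aligner produced it. One caveat for unstranded libraries: STAR needs --outSAMstrandField intronMotif so that spliced alignments carry the XS strand attribute StringTie relies on.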

Does the pipeline run on HPC or cloud?

Yes. The exported Nextflow DSL2 code runs on local machines, HPC clusters, and cloud platforms including AWS, Azure, and Google Cloud.

Build Your StringTie Pipeline Today

Transcript-level quantification connected to alignment on one canvas. Reproducible across every sample.