SNP = Single Nucleotide Polymorphism
SNP analysis = identifying and interpreting single‑base differences between genomes.
What it answers
How different are two strains?
Which mutations define a lineage?
Are mutations under selection?
What is the evolutionary distance?
Two ways to do SNP analysis
A. Using FASTQ (raw reads) → TRUE variant calling
Pipeline:
Align reads to a reference genome (Bowtie2, BWA)
Call variants (bcftools, FreeBayes, GATK)
Produce VCF
Filter + interpret
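This is the gold standard: every call is backed by read depth and by base and mapping qualities, so weak calls can be filtered out on that evidence.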
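The four pipeline steps above can be sketched as a list of shell commands. This is a minimal sketch, not a validated pipeline: the function name, file names, thread count, and the QUAL threshold of 30 are assumptions made for illustration, and paired-end reads are assumed. Nothing is executed; the function only builds the command strings.

```python
import shlex


def build_snp_calling_commands(ref_fasta, reads_1, reads_2, out_prefix,
                               threads=4, min_qual=30):
    """Return shell command strings for a minimal align -> call -> filter run.

    Nothing is executed here. All names and thresholds are illustrative
    assumptions; haploid organisms may also need a ploidy setting.
    """
    ref = shlex.quote(ref_fasta)
    r1, r2 = shlex.quote(reads_1), shlex.quote(reads_2)
    index = shlex.quote(f"{out_prefix}.idx")
    bam = shlex.quote(f"{out_prefix}.sorted.bam")
    raw_vcf = shlex.quote(f"{out_prefix}.raw.vcf.gz")
    filtered_vcf = shlex.quote(f"{out_prefix}.filtered.vcf.gz")
    keep = shlex.quote(f"QUAL>={min_qual}")
    return [
        # 1. Index the reference, then align reads and sort the alignments.
        f"bowtie2-build {ref} {index}",
        f"bowtie2 -p {threads} -x {index} -1 {r1} -2 {r2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # 2-3. Call variants and write a compressed VCF.
        f"bcftools mpileup -Ou -f {ref} {bam} | bcftools call -mv -Oz -o {raw_vcf}",
        # 4. Keep only calls at or above the quality threshold.
        f"bcftools view -i {keep} -Oz -o {filtered_vcf} {raw_vcf}",
    ]


for command in build_snp_calling_commands("ref.fa", "s_1.fq.gz", "s_2.fq.gz", "sample"):
    print(command)
```

Review the printed commands before running them; real projects usually add read trimming, duplicate handling, and depth-based filters.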
B. Using FASTA assemblies → whole‑genome alignment
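Tools: MAFFT, Parsnp, snippy-core (combines per-sample Snippy results into a core SNP alignment)
This is not true variant calling, because no read-level evidence stands behind each difference, but you can still extract SNPs from the alignment.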
Think of it like
Comparing two books line‑by‑line to find spelling differences
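The line-by-line comparison can be sketched in a few lines of Python. This assumes the two sequences have already been aligned (for example with MAFFT) so that they have equal length and use "-" for gaps; the function name `find_snps` is hypothetical. Columns with a gap or an ambiguous base such as N are skipped, because they are not single-base substitutions.

```python
def find_snps(aligned_ref, aligned_query):
    """List single-base differences between two already-aligned sequences.

    Returns (ref_position, ref_base, query_base) tuples. ref_position is
    1-based and counted on the ungapped reference sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    snps = []
    ref_pos = 0
    for r, q in zip(aligned_ref.upper(), aligned_query.upper()):
        if r != "-":
            ref_pos += 1  # advance only where the reference has a base
        if r in bases and q in bases and r != q:
            snps.append((ref_pos, r, q))
    return snps


print(find_snps("ACGT-ACGTA", "ACCTTAC-TG"))
# → [(3, 'G', 'C'), (9, 'A', 'G')]
```

The insertion after reference position 4 and the deletion at position 7 are ignored on purpose: they are indels, not SNPs.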
How these tools fit together in real workflows
If you have FASTQ reads
Bowtie2 → align
bcftools → call SNPs
MACSE → codon-aware alignment of the coding sequences extracted for each sample
HyPhy → selection analysis on that codon alignment
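Selection analysis rests on the distinction between SNPs that change the encoded amino acid (nonsynonymous) and those that do not (synonymous). The sketch below shows only that classification step for a single substitution; it is not what HyPhy does internally, which fits codon models across a phylogeny. It assumes an in-frame coding sequence, the standard genetic code, and a 0-based position; `classify_snp` is a hypothetical name.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, position, alt_base):
    """Classify a substitution in an in-frame coding sequence.

    position is 0-based within cds. Returns (kind, ref_aa, alt_aa).
    """
    cds = cds.upper()
    start = position - position % 3       # first base of the affected codon
    offset = position - start             # index of the SNP inside the codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, ref_aa, alt_aa


cds = "ATGGCTAAA"  # Met-Ala-Lys
print(classify_snp(cds, 5, "C"))  # → ('synonymous', 'A', 'A')
print(classify_snp(cds, 6, "G"))  # → ('nonsynonymous', 'K', 'E')
```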
If you have Nanopore reads
EPI2ME → basecalling + variant calling
Or run your own pipeline
If you have FASTA assemblies
Use them for resistance gene detection (AMRFinderPlus)
Use them for pangenomes (Panaroo, which takes annotated assemblies in GFF3 format)
Use them for whole‑genome SNP alignment (not variant calling)
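Once a whole-genome or core SNP alignment exists, a common next step is a pairwise SNP distance table, which answers "how different are two strains?". A minimal sketch, assuming the alignment is already loaded as a dictionary of equal-length sequences; the function name and the sample names are hypothetical. Columns where either sequence has a gap or an ambiguous base are not counted.

```python
from itertools import combinations


def snp_distances(alignment):
    """Count pairwise SNP differences in a multiple sequence alignment.

    alignment maps sample name -> aligned sequence (all the same length).
    Returns a dict mapping (name_a, name_b) -> number of differing columns.
    """
    if len({len(seq) for seq in alignment.values()}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    valid = set("ACGT")
    distances = {}
    for a, b in combinations(alignment, 2):
        distances[(a, b)] = sum(
            1
            for x, y in zip(alignment[a].upper(), alignment[b].upper())
            if x in valid and y in valid and x != y
        )
    return distances


example = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "TCGTAC-A",
}
print(snp_distances(example))
# → {('s1', 's2'): 1, ('s1', 's3'): 2, ('s2', 's3'): 1}
```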