Preparing mixed orientation reads for dada2

Hi @shreyaramachandran,

This may help, though it is a bit onerous.

That being said...

We've been discussing ways to implement the vsearch --orient ... command into QIIME 2 for orienting fastq files. You could try running this vsearch command externally on each of your R1 and R2 reads, or you can use reads that are already merged. vsearch is available within your QIIME 2 environment. It might also be easiest to run this prior to importing into QIIME 2.

tl;dr:

vsearch \
   --orient R1-seqs.fastq \
   --db reference-database.fasta \
   --fastqout R1-seqs-oriented.fastq \
   --notmatchedfq R1-seqs-not-oriented.fastq
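If you want a quick sense of how many reads vsearch managed to orient, comparing record counts between the input and the --fastqout file is one way to do it. This is only a sketch: the function names are made up for illustration, and it assumes plain-text (not gzipped) FASTQ with the usual four lines per record.

```shell
# Count FASTQ records, assuming the standard 4-lines-per-record layout
# (no wrapped sequence lines). Plain-text input only; decompress .gz first.
count_fastq_records() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Print "<oriented> of <total> reads oriented" for an input/output pair.
# $1 is the file given to --orient, $2 is the file written by --fastqout.
report_oriented() {
    local total oriented
    total=$(count_fastq_records "$1")
    oriented=$(count_fastq_records "$2")
    echo "${oriented} of ${total} reads oriented"
}

# Example, using the file names from the command above:
# report_oriented R1-seqs.fastq R1-seqs-oriented.fastq
```

A large gap between the two counts could suggest the reference database does not cover your amplicon well.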

More detail

You can download and export any of the marker gene reference databases from the QIIME 2 Data resources page and use the exported FASTA as your input to --db.

More details can be found within the vsearch manual.

So you can do the following...

Export SILVA reference sequences (FASTA)
(You can obtain from the Data resources page linked above.)

qiime tools export \
    --input-path silva-138-99-seqs.qza \
    --output-path silva-138-99-seqs-export/

Export your raw fastqs (R1 & R2, or merged) if you already imported them
Otherwise, just use the fastqs you have prior to importing into QIIME 2.

qiime tools export \
    --input-path raw-seqs.qza \
    --output-path raw-seqs-export

Run vsearch to orient your fastqs
Again, you need to run this on the R1 (forward) and R2 (reverse) reads separately. There is a chance that one read of a pair will be oriented and the other will not, leaving the two files with mismatched pairs. Hopefully this will be minimal, though one or both files may need some minor manual edits.

Or, if you plan to use deblur as your denoising approach, you can simply merge your reads with vsearch and then run the vsearch --orient command on the merged reads. From there you can run deblur on your oriented merged reads. This avoids the task of running vsearch --orient on R1 and R2 separately.
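For the merge-first route, the two vsearch calls could be chained roughly like this. It is a sketch I have not run on real data: the function name, argument order, and output file names are placeholders, and merging is left at vsearch's defaults, which you may want to tune.

```shell
# Merge paired reads, then orient the merged reads against a reference.
# Arguments (all placeholders): R1 fastq, R2 fastq, reference FASTA, output prefix.
merge_then_orient() {
    local r1="$1" r2="$2" db="$3" prefix="$4"

    # Step 1: merge the read pairs with default merge settings.
    vsearch \
        --fastq_mergepairs "$r1" \
        --reverse "$r2" \
        --fastqout "${prefix}-merged.fastq" || return 1

    # Step 2: orient the merged reads against the reference database.
    vsearch \
        --orient "${prefix}-merged.fastq" \
        --db "$db" \
        --fastqout "${prefix}-merged-oriented.fastq"
}

# Example (hypothetical file names):
# merge_then_orient R1-seqs.fastq R2-seqs.fastq \
#     silva-138-99-seqs-export/dna-sequences.fasta sample-1
```

The separate R1/R2 route looks like this for a single file: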

vsearch \
  --orient R1-seqs.fastq \
  --db silva-138-99-seqs-export/dna-sequences.fasta \
  --fastqout oriented.fastq \
  --notmatchedfq not-oriented.fastq
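If R1 and R2 end up with different sets of reads after orienting them separately, one way to put them back in sync is to keep only the reads whose IDs appear in both files. The awk sketch below is an illustration rather than something I have vetted: the function name is made up, and it assumes four-line FASTQ records, that the read ID is the first whitespace-delimited token of the header (with any trailing /1 or /2 dropped), and that both files still list their reads in the original order.

```shell
# Print the FASTQ records of "$2" whose read ID also appears in "$1".
# Assumes 4-line records; the read ID is the first whitespace-delimited
# token of the header line, with the leading "@" and any trailing /1 or /2 removed.
keep_shared_reads() {
    awk '
        function read_id(header,    id) {
            id = header
            sub(/^@/, "", id)
            sub(/[ \t].*$/, "", id)
            sub(/\/[12]$/, "", id)
            return id
        }
        # First file: remember the ID on every header line.
        NR == FNR { if (FNR % 4 == 1) seen[read_id($0)] = 1; next }
        # Second file: decide on each header line whether to keep the record.
        FNR % 4 == 1 { keep = (read_id($0) in seen) }
        keep
    ' "$1" "$2"
}

# Example (hypothetical file names), run once in each direction:
# keep_shared_reads R2-seqs-oriented.fastq R1-seqs-oriented.fastq > R1-seqs-paired.fastq
# keep_shared_reads R1-seqs-oriented.fastq R2-seqs-oriented.fastq > R2-seqs-paired.fastq
```

After running it in both directions, the two output files should hold the same reads in the same order, which is what paired-end import expects.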

Import oriented fastqs into QIIME 2
Then you can simply import these fastqs as you would normally do.
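As one possible way to do the import, you could write a manifest file and import with the paired-end manifest format. This is a sketch under a few assumptions: the helper name, sample ID, and file names are hypothetical, and it assumes your quality scores are Phred33 (pick the format that matches your data otherwise).

```shell
# Write a paired-end manifest (PairedEndFastqManifestPhred33V2 layout) to stdout.
# Arguments come in triples: sample ID, R1 path, R2 path.
# The manifest expects absolute file paths.
write_manifest() {
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    while [ "$#" -ge 3 ]; do
        printf '%s\t%s\t%s\n' "$1" "$2" "$3"
        shift 3
    done
}

# Example (hypothetical sample ID and file names):
# write_manifest sample-1 \
#     "$PWD/R1-seqs-oriented.fastq" "$PWD/R2-seqs-oriented.fastq" > manifest.tsv
# qiime tools import \
#     --type 'SampleData[PairedEndSequencesWithQuality]' \
#     --input-path manifest.tsv \
#     --output-path oriented-demux.qza \
#     --input-format PairedEndFastqManifestPhred33V2
```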

Note: I've not completely vetted this strategy myself, but I figured this will provide you with a more tenable place to start.
