Hello,
It's been several years since I've posted on the forum and it's changed a lot, so I hope this is posted in the right place. I have a sequencing run that contains sequence data from 2 different primer sets that amplify 2 different gene regions, both targeting the same group of species. We first ran 2 separate PCRs for each sample, then did a second PCR to attach the Illumina overhang adapter sequences. The amplicons from both primer sets were pooled together by sample, then we used Illumina tags to index our samples. These samples were then pooled into one library and sequenced on the MiSeq (2x250 bp). I now have demultiplexed fastq files that are labeled by sample, but each sample's fastq file contains reads from both primer sets. What is the best/easiest way to go about analyzing the sequence data?
Should I:
Option 1: pre-filter the fastq files by primer set first, then import the separate files into QIIME2 and use cutadapt then DADA2? If so, what would be the best method of sorting the fastq files by primer set? However, according to the forum thread "Analyzing Different sample batches in the same sequencing run", it is not recommended to split the data from a single sequencing run because DADA2 will run better with more samples from the same run.
or
Option 2: import into QIIME2 as is, with sequence data from both primer sets, use cutadapt to trim both primer sets, then DADA2? This approach was recommended in the forum thread "Separating two different amplicons from demultiplexed data". The only problem with this option is my downstream workflow: I export the rep seqs file and use standalone blastn, because we are interested in seeing which species we can detect with these primers, and it doesn't make sense to spend time building a reference database that encompasses everything in GenBank when standalone blast can do the same job. Therefore, if I trim the primers, I won't be able to tell which sequences belong to which primer set after DADA2...
I want to be able to differentiate the taxonomy assignment per primer set downstream so I am worried about not being able to differentiate which sequence belongs to which primer set if I trim with cutadapt before DADA2.
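To make the pre-filtering step in Option 1 concrete, here is a rough sketch of the kind of sorting I have in mind, written in plain Python. Everything in it is an assumption on my part: the primer names and sequences are placeholders (not our real primers), the helper names are made up, and it only does an exact match (with IUPAC degenerate bases) at the very start of the forward read, with no allowance for mismatches or indels. It also ignores the reverse reads, which would have to be split using the same read IDs to keep the pairs in sync.

```python
import re

# IUPAC degenerate base codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer (may contain IUPAC codes) into a regex anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer.upper()))


def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between or after records
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        yield header, seq, plus, qual


def split_by_primer(records, primers):
    """Bin records by the first primer that matches the start of the read.

    Reads matching no primer go into the "unassigned" bin.
    """
    patterns = {name: primer_regex(seq) for name, seq in primers.items()}
    bins = {name: [] for name in primers}
    bins["unassigned"] = []
    for record in records:
        for name, pattern in patterns.items():
            if pattern.match(record[1]):
                bins[name].append(record)
                break
        else:
            bins["unassigned"].append(record)
    return bins


if __name__ == "__main__":
    # Placeholder primers and reads, for illustration only
    primers = {"geneA": "ACGTN", "geneB": "TTRGC"}
    fastq_lines = [
        "@read1\n", "ACGTGAAA\n", "+\n", "IIIIIIII\n",
        "@read2\n", "TTAGCAAA\n", "+\n", "IIIIIIII\n",
        "@read3\n", "GGGGGGGG\n", "+\n", "IIIIIIII\n",
    ]
    for name, recs in split_by_primer(read_fastq(fastq_lines), primers).items():
        print(name, [r[0] for r in recs])
```

Real reads would come from the (gzipped) per-sample fastq files rather than an in-memory list, and I don't know whether a hand-rolled split like this is preferable to something the existing tools already offer, which is part of what I'm asking.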
If I missed any relevant forums related to my questions, please share them with me.
Thank you for reading through my long post!
Yer