The MiSeq was run with the demultiplexing option, so I skipped the demultiplexing step in the tutorial.
My question here is: what trunc-len value would you use in the DADA2 step when the R2 read quality is all over the place like this?
My second question is in a similar vein: it is about saving these files even though the R2 reads were bad.
I understand that I can look at the raw R1 and R2 sequences by running 'more xxx.fastq' in the terminal, but where can I find the joined sequences?
My thought was that, since I know my index and adapter sequences, if I can compare the reads in R1 and R2 with the joined sequences, I can get a rough idea of how many sequences were lost.
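One rough way to estimate how many sequences are lost at joining is to count the records in the input and joined FASTQ files and compare the totals. The sketch below is only an illustration, not part of QIIME 2: the helper names and the file names in the usage comment are hypothetical, and it assumes standard 4-line FASTQ records, plain or gzipped.

```python
import gzip
from pathlib import Path


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming 4 lines per record.

    Blank lines are ignored, so files containing zero-length reads
    are not handled by this sketch.
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{path} does not look like a 4-line-per-record FASTQ")
    return n_lines // 4


def fraction_lost(n_input, n_joined):
    """Fraction of input read pairs with no joined sequence."""
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    return 1 - n_joined / n_input


# Hypothetical usage (file names are placeholders, not from the thread):
# n_r1 = count_fastq_reads("sample_R1.fastq.gz")
# n_joined = count_fastq_reads("sample_joined.fastq.gz")
# print(f"{fraction_lost(n_r1, n_joined):.1%} of pairs were not joined")
```

Since R1 and R2 hold one read per pair, counting R1 alone is enough for the input side.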
In QIIME 1 there was a paired-end joining step that allowed me to inspect the joined sequences, but I can't seem to find one in QIIME 2. Perhaps R1 and R2 are joined during import?
Lastly, I have total reads, PF reads and demultiplexed sequence counts of 43,659,742, 13,505,280 and 5,434,962 respectively. Do you think it's the bad R2 reads that caused the significant drop in numbers?
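To put the drop in proportion, the three counts above can be turned into retention fractions: roughly 31% of total reads passed filter, roughly 40% of PF reads were demultiplexed, and roughly 12% of total reads survived both steps. A minimal sketch of that arithmetic follows; the function name is made up for illustration and it assumes every count is positive.

```python
def stage_retention(counts):
    """Return (stage, count, fraction of previous stage, fraction of first stage)."""
    rows = []
    first = counts[0][1]
    previous = first
    for name, n in counts:
        rows.append((name, n, n / previous, n / first))
        previous = n
    return rows


# Counts quoted in the post above.
counts = [
    ("total reads", 43_659_742),
    ("PF reads", 13_505_280),
    ("demultiplexed", 5_434_962),
]

for name, n, vs_prev, vs_total in stage_retention(counts):
    print(f"{name:>15}: {n:>12,}  {vs_prev:6.1%} of previous  {vs_total:6.1%} of total")
```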
I would personally scrap the reverse reads and proceed with only the forward reads as though it were single-end data.
Post-denoising, I am not sure if there's an easy way to match these up. You could join paired ends with qiime vsearch join-pairs and then look at the reads, but I'm not sure that does what you want.
qiime vsearch join-pairs. Use that prior to deblur or OTU picking, but NOT with dada2 (which joins pairs after denoising).
Probably. I would just scrap the reverse reads... that quality profile does not look good and I would be really suspicious of using those data.
I tried the join-pairs command and it seems quite straightforward. The quality scores were good up to 250 bp, but from 250 to 400 bp they were all over the place.
After that, I ran DADA2 with trunc-len values of 240 and 400 to compare the two and, as expected, DADA2 with a trunc-len of 400 wasn't even able to run properly (error: 'No reads passed the filter').
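One possible contributor to that error: DADA2 discards any read shorter than the truncation length, so if few or no reads reach 400 bases, nothing is left after truncation. Quality filtering of the low-quality tail may also play a part. The sketch below only illustrates the length rule; it is not DADA2's code, and the read lengths are hypothetical values chosen for illustration.

```python
def truncate_reads(reads, trunc_len):
    """Keep only reads with at least trunc_len bases, cut to exactly trunc_len."""
    return [seq[:trunc_len] for seq in reads if len(seq) >= trunc_len]


# Hypothetical joined-read lengths, for illustration only.
reads = ["A" * 251, "A" * 300, "A" * 390]

print(len(truncate_reads(reads, 240)))  # all three reads are long enough
print(len(truncate_reads(reads, 400)))  # no read reaches 400 bases
```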
I will just use the first 250 bp for the analysis as you suggested.
By the way, do you know what might have caused the bad R2 reads? I was thinking maybe the initial concentration of the library was too high?
I really appreciate your help and hope this thread helps others too.