I'll try to answer as much as I can as succinctly as possible.
So it seems that the problem is in the filtering/merging/dada2 step.
I know that ITS is variable in length. However, the amplicons we got, at least the ones visible on the gel, were of more or less comparable size: a nice band with little smear, and any smear was longer, not shorter. In any case, if this were the problem, I would expect QIIME1 to fail in merging as well.
You do not mention if/how reads were joined in QIIME 1.
I used:
multiple_join_paired_ends.py
parameters:
join_paired_ends:min_overlap 10
join_paired_ends:perc_max_diff 15
split_libraries_fastq:phred_offset 33
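For anyone unfamiliar with these two joining settings, here is a rough sketch of what they constrain: the minimum number of overlapping bases and the maximum percentage of mismatches allowed within that overlap. This is only an illustration, not the actual joining algorithm used by QIIME1; the function names are hypothetical, quality scores are ignored, and taking the longest acceptable overlap is my own simplifying assumption.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string containing only A, C, G, T or N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def try_join(fwd, rev, min_overlap=10, perc_max_diff=15):
    """Join a read pair, or return None if no acceptable overlap exists.

    Hypothetical helper: the end of the forward read is compared with the
    start of the reverse-complemented reverse read.
    """
    rev_rc = reverse_complement(rev)
    longest = min(len(fwd), len(rev_rc))
    # Assumption: try the longest overlap first and accept the first one
    # that satisfies both thresholds.
    for overlap in range(longest, min_overlap - 1, -1):
        tail = fwd[-overlap:]
        head = rev_rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        # Integer form of: mismatches / overlap <= perc_max_diff / 100
        if mismatches * 100 <= perc_max_diff * overlap:
            return fwd + rev_rc[overlap:]
    return None
```

With min_overlap 10 and perc_max_diff 15, a pair whose reads overlap by fewer than 10 bases, or whose overlap has too many low-quality mismatches, would simply come back unjoined.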
You do not mention if/how you trimmed amplicons in QIIME 1 or 2, specifically to address the read-through issue
No trimming on the right side of the reads was done in either case, because merging worked well without it in QIIME1 and no read-through was expected. I also checked a selection of the unjoined reads, and they looked like they had simply failed to join because of low quality.
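For reference, read-through would show up as the reverse complement of the opposite primer appearing inside a read, which happens when the amplicon is shorter than the read length. A minimal sketch of such a check is below. The function names are hypothetical, the primer sequence has to be supplied by the user, and I am assuming exact matching with no degenerate (IUPAC) bases, which real ITS primers may contain.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string containing only A, C, G, T or N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def has_read_through(read, opposite_primer):
    """Return True if the read runs into the opposite primer.

    Hypothetical helper: exact substring match only, so sequencing errors
    or degenerate primer positions would be missed.
    """
    return reverse_complement(opposite_primer) in read


def fraction_with_read_through(reads, opposite_primer):
    """Fraction of reads (a list of strings) that show read-through."""
    if not reads:
        return 0.0
    hits = sum(has_read_through(r, opposite_primer) for r in reads)
    return hits / len(reads)
```

A fraction close to zero in the forward reads would be consistent with no read-through being present.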
Stats files
datasummary.qzv (285.7 KB)
denoisetable.qzv (310.9 KB)
You have read-through in QIIME 2 but not QIIME 1 (possibly because you did not join in QIIME 1? so the single-end seqs are not long enough to have this problem), so the non-biological DNA is causing classification and alignment-to-expected issues.
I did the joining step in both QIIME1 and QIIME2. For QIIME1 the summary of the samples was as follows (many more joined reads than in QIIME2):
Summary.txt (590 Bytes)
I am curious, how are you deciding what are expected sequences? Are you sequencing positive controls or is this just based on previous examination of these samples?
Based on previous examination of these samples. Also, if we see specific taxa in the QIIME1 representative sequences but nothing even remotely similar in QIIME2, something is clearly wrong.
Thanks!