I have a bunch of 18S data (2x300 bp) from stool samples, and the quality is pretty good: over 95% of reads pass the filtering stage of DADA2 denoising for all samples. I am running qiime2-2021.11 in a Singularity container. At most 10% of reads are lost at the chimera-removal step, which I don't think is a problem. However, over half the samples are losing 50-90% of their reads at the merging step, so I am losing a lot of data there. This is after primer removal, by the way, leaving 285 nt forward reads and 284 nt reverse reads.
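For context, my understanding (an assumption, happy to be corrected) is that DADA2's mergePairs requires a minimum overlap of 12 nt by default, so a quick sanity check of the longest amplicon my trimmed reads could still merge across:

```python
# Rough sanity check: with DADA2's default minimum overlap (12 nt, I believe),
# how long can an amplicon be before a read pair can no longer merge?
FWD_LEN = 285      # forward read length after primer removal
REV_LEN = 284      # reverse read length after primer removal
MIN_OVERLAP = 12   # DADA2 mergePairs default (assumption)

max_mergeable = FWD_LEN + REV_LEN - MIN_OVERLAP
print(max_mergeable)  # amplicons longer than this cannot merge
```

So if a large fraction of my 18S amplicons are longer than that, the merging losses would make sense regardless of read quality.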
I have seen papers using OTUs where read pairs are joined with artificial linkers, and I suppose the idea is that at the taxonomy assignment step the ambiguity is resolved by the assignment algorithm looking at the unambiguous ends. I have tried feeding artificially linked read pairs (after trimming and quality filtering) into DADA2 as single-end data, but because of the ambiguous nucleotides, every read from every sample gets filtered out without fail. Is there a way to turn off this filtering, or would this generally be a bad idea with ASVs? The other option I can think of is using GGGs/AAAs/TTTs/CCCs instead of NNNs in the linker to get past the filtering step, then changing them back before inputting the representative sequences into the assignment algorithm, but I am not confident this is the best way to do things.