I am quite new to fungal amplicon analysis, and I have had many discussions about whether to use DADA2 (ASVs) or VSEARCH with an OTU-based approach to obtain the famous count table. As with most things in bioinformatics, there is no consensus and no gold standard, only "the way we do it is...". So the only thing one can do is run the same analysis with both ASVs and OTUs and hope the results are similar.
On top of this, I was told that, when following the OTU route, joining R1 and R2 before extracting the ITS region leads to better results. Recently, I posted about combining R1 and R2, but there are issues with the quality scores of the merged sequence. So I thought: why not simply join them instead? With --fastq_join, R1 and R2 are not overlapped, so the quality scores are not affected, but a padding sequence is inserted between the two reads, which, I suppose, ends up right in the middle of the ITS sequence that I want to extract. From vsearch's manual:
--fastq_join filename
Join paired-end sequence reads into one sequence and add a gap between them using a
padding sequence.
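To make clear what I mean, here is a minimal Python sketch of how I understand the joined read to be built. It is not vsearch code: `join_pair` and `reverse_complement` are hypothetical helpers, and I am assuming that the output is R1, then the padding, then the reverse complement of R2, with a default padding of eight N's with quality "I" (my reading of the manual, please correct me if this is wrong).

```python
# Hypothetical helpers, NOT part of vsearch: they only mimic my understanding
# of what --fastq_join produces (R1 + padding + reverse complement of R2).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]

def join_pair(r1_seq, r1_qual, r2_seq, r2_qual,
              pad_seq="NNNNNNNN", pad_qual="IIIIIIII"):
    """Join a read pair without overlapping it.

    The padding defaults are an assumption (eight N's, quality 'I').
    R2 is reverse-complemented, so its quality string is reversed as well.
    """
    joined_seq = r1_seq + pad_seq + reverse_complement(r2_seq)
    joined_qual = r1_qual + pad_qual + r2_qual[::-1]
    return joined_seq, joined_qual

# Toy read pair, far shorter than real reads
seq, qual = join_pair("ACGTAC", "FFFFFF", "TTGCA", "FFF##")
print(seq)   # ACGTACNNNNNNNNTGCAA
print(qual)  # FFFFFFIIIIIIII##FFF
```

If this picture is right, the original base qualities are kept as they are, but the run of N's sits between the two reads, and that is the part I am worried about.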
Question: will the padding affect the extracted ITS sequence and the subsequent clustering at 97%, 99%, or whatever threshold? And what about the BLAST search afterwards?