I hope everyone is safe.
I am an amateur who wants to study microbial diversity analysis/amplicon studies.
For my graduate studies, I need to study the algal diversity of a particular environment. My dataset consists of paired-end sequences. After importing the dataset, I visualized the sequence quality and the sequence counts. I then ran the vsearch join-pairs command to merge the forward and reverse reads, and it removed most of the sequences from the sample. When I changed the minimum overlap length to 50 bp, the sequence count dropped even further. I want to understand why most of the sequences were removed.
NOTE: My amplicon size is 350 bp, and the forward and reverse reads are 300 bp each, so the overlap should be ~250 bp. However, joining the paired-end sequences produced reads of ~600 bp.
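To make the arithmetic in my note explicit, here is a minimal Python sketch. The function names are my own, not part of QIIME 2 or vsearch, and it assumes both reads are full length and untrimmed. It shows the overlap I expected and the overlap that a ~600 bp joined read would imply.

```python
def expected_overlap(fwd_len, rev_len, amplicon_len):
    """Overlap (bp) expected if the joined read spans exactly the amplicon."""
    return fwd_len + rev_len - amplicon_len


def implied_overlap(fwd_len, rev_len, joined_len):
    """Overlap (bp) implied by an observed joined-read length."""
    return fwd_len + rev_len - joined_len


# Values from my note: 300 bp reads, 350 bp amplicon, ~600 bp joined reads.
print(expected_overlap(300, 300, 350))  # 250
print(implied_overlap(300, 300, 600))   # 0
```

So a ~600 bp joined read would correspond to almost no overlap, which does not match the ~250 bp I expected from a 350 bp amplicon.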
Sequence counts before and after joining:
demux_seqs.qzv (317.3 KB)
demux-joined.qzv (298.9 KB)