How to proceed?


I have data from an Illumina MiSeq run. Our sequencing facility provided the sequences already demultiplexed, so I have 211 folders, one for each library I prepared in the lab, each with a unique tag combination. However, we used a two-step PCR approach with tagged primers, so my sequences still contain the overhang sequences and Illumina adapters. How should I proceed? I have already assessed the sequence quality with FastQC and I will trim/filter my sequences based on that assessment. Now I want to continue with the downstream steps of the pipeline. Should I use cutadapt to trim the overhang and adapter sequences? I am new to bioinformatics and a bit lost.

Hi, welcome to the forum!
You could try the following steps:

  1. Import your reads, library by library, into QIIME 2. This tutorial may help. Check the Casava format.
  2. Use Cutadapt to remove adapters/primers.
  3. Denoise your datasets with DADA2 or Deblur to obtain ASVs. Alternatively, you can use VSEARCH to process your data and cluster it into OTUs.
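
If it helps, the three steps above can be sketched roughly as follows. This is only a sketch under several assumptions that are not in your post: that the reads are paired-end, that each of the 211 folders holds exactly one `*_R1_*.fastq.gz` and one `*_R2_*.fastq.gz` file, and that you import through a manifest file rather than the Casava layout (a manifest is convenient when the files sit in separate folders). The helper names, the output file names, the primer arguments and the truncation lengths are all hypothetical placeholders; pass the sequences that actually appear at the 5' end of your reads, and pick truncation lengths from your own quality plots.

```python
import csv
from pathlib import Path


def build_manifest_rows(root):
    """Return (sample_id, forward_path, reverse_path) for each library folder.

    Assumes one R1 and one R2 file per folder; the folder name is used
    as the sample ID.
    """
    rows = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        forward = sorted(folder.glob("*_R1_*.fastq.gz"))
        reverse = sorted(folder.glob("*_R2_*.fastq.gz"))
        if len(forward) != 1 or len(reverse) != 1:
            raise ValueError(f"expected one R1 and one R2 file in {folder}")
        rows.append(
            (folder.name, str(forward[0].resolve()), str(reverse[0].resolve()))
        )
    return rows


def write_manifest(rows, manifest_path):
    """Write a tab-separated paired-end manifest (QIIME 2 V2 layout)."""
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(
            ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
        )
        writer.writerows(rows)


def pipeline_commands(manifest_path, forward_primer, reverse_primer,
                      trunc_len_f, trunc_len_r):
    """Build (but do not run) the import, trim and denoise commands."""
    import_cmd = [
        "qiime", "tools", "import",
        "--type", "SampleData[PairedEndSequencesWithQuality]",
        "--input-path", str(manifest_path),
        "--input-format", "PairedEndFastqManifestPhred33V2",
        "--output-path", "demux.qza",
    ]
    trim_cmd = [
        "qiime", "cutadapt", "trim-paired",
        "--i-demultiplexed-sequences", "demux.qza",
        "--p-front-f", forward_primer,   # placeholder: your forward sequence
        "--p-front-r", reverse_primer,   # placeholder: your reverse sequence
        "--p-discard-untrimmed",         # drop read pairs without a primer match
        "--o-trimmed-sequences", "trimmed.qza",
    ]
    denoise_cmd = [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", "trimmed.qza",
        "--p-trunc-len-f", str(trunc_len_f),  # placeholder values, choose
        "--p-trunc-len-r", str(trunc_len_r),  # them from the quality plots
        "--o-table", "table.qza",
        "--o-representative-sequences", "rep-seqs.qza",
        "--o-denoising-stats", "denoising-stats.qza",
    ]
    return [import_cmd, trim_cmd, denoise_cmd]
```

To use it, call `write_manifest(build_manifest_rows("path/to/libraries"), "manifest.tsv")`, then run each list returned by `pipeline_commands(...)` with `subprocess.run(cmd, check=True)` inside your QIIME 2 environment, or simply print the commands and paste them into the terminal. If your reads turn out to be single-end, the import type and the Cutadapt/DADA2 actions would need to be the single-end equivalents instead.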