Separating 16S, 18S, ITS, COI and chloroplast sequences

mars · September 12, 2024, 7:12am

I have about 500 FASTQ files, and each file contains the following five types of amplicon sequences:

16S (1 primer pair)
18S (1 primer pair)
ITS (1 primer pair)
COI (2 primer pairs)
Chloroplast (3 primer pairs)

Given that, I need to separate the reads and make 5 independent FASTQ files from each original FASTQ file. Is there a way in QIIME 2 to extract reads based on their primers, or is there any other method to accurately pull out the respective reads?

I would really appreciate the help.

Many thanks in advance!

timanix · September 12, 2024, 7:38am

Hello!

I am not sure this is the best way to handle it, but here is what I would try:

Import the samples into QIIME 2.
Use cutadapt to trim the primers with the "--p-discard-untrimmed" and (optionally) "--p-minimum-length" parameters. The idea is that cutadapt will discard any sequence that does not contain the primers, so if you run it once for each primer set you used, each run keeps only the reads from that amplicon.
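
To make the "run it once per primer set" idea concrete, here is a minimal Python sketch that only builds the command lines, one per primer set. It assumes paired-end reads already imported as a hypothetical demux.qza, and the primer strings are placeholders (not real sequences) that must be replaced with the actual primers. The output file names and the minimum length of 100 are also arbitrary choices for illustration.

```python
import shlex

# The eight primer sets mentioned in this thread; the names are illustrative.
PRIMER_SET_NAMES = [
    "16S", "18S", "ITS",
    "COI_1", "COI_2",
    "chloroplast_1", "chloroplast_2", "chloroplast_3",
]

# PLACEHOLDERS ONLY: replace each value with the real forward/reverse primer.
PRIMER_SETS = {
    name: (f"{name}_FWD_SEQ", f"{name}_REV_SEQ") for name in PRIMER_SET_NAMES
}


def trim_paired_command(name, fwd, rev, demux="demux.qza", min_length=100):
    """Build one `qiime cutadapt trim-paired` call for a single primer set.

    Reads without the primers are discarded, so the output artifact should
    contain only the amplicon targeted by this primer set.
    """
    return [
        "qiime", "cutadapt", "trim-paired",
        "--i-demultiplexed-sequences", demux,   # hypothetical input artifact
        "--p-front-f", fwd,
        "--p-front-r", rev,
        "--p-discard-untrimmed",
        "--p-minimum-length", str(min_length),  # arbitrary example value
        "--o-trimmed-sequences", f"trimmed_{name}.qza",
    ]


commands = {
    name: trim_paired_command(name, fwd, rev)
    for name, (fwd, rev) in PRIMER_SETS.items()
}

for cmd in commands.values():
    print(shlex.join(cmd))
```

Each printed line can be reviewed and then run in the shell; every run reads the same input artifact and writes its own trimmed artifact.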

PS. To all: Please feel free to jump in with better suggestions. I am also curious.

Best,
Timur

salias · September 12, 2024, 9:53am

Hi!

I'm stepping in just to say that I agree with @timanix: the easiest option here is to use q2-cutadapt.

Also, I see that your idea is to have one FASTQ file for COI (including the amplicons from both primer sets) and another FASTQ file for chloroplast (including all three amplicons).

I think each amplicon should be treated separately. So you would end up with 8 groups instead of 5, one for each primer set.
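
As a rough illustration of what "one group per primer set" means, the sketch below assigns each read to a group by checking whether its 5' end matches a forward primer, expanding IUPAC degenerate bases. It is a simplification and not how cutadapt works internally: there is no mismatch tolerance and no reverse-primer check. The two primers and the three reads are toy sequences invented for the example.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_pattern(primer):
    """Compile a primer containing degenerate bases into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


# Toy forward primers, invented for illustration only.
TOY_FORWARD_PRIMERS = {
    "COI_1": primer_pattern("GGWACW"),
    "COI_2": primer_pattern("TCNACY"),
}


def assign_primer_set(read, patterns=TOY_FORWARD_PRIMERS):
    """Return the first primer set whose primer matches the start of the read."""
    for name, pattern in patterns.items():
        if pattern.match(read):  # match() anchors at the start of the read
            return name
    return None  # no primer found: the read would be discarded


reads = ["GGTACAGGATTT", "TCGACCAAAGGA", "ACGTACGTACGT"]
groups = {}
for read in reads:
    groups.setdefault(assign_primer_set(read), []).append(read)
```

Here the first read lands in COI_1, the second in COI_2, and the third matches neither primer, which corresponds to what "--p-discard-untrimmed" would throw away.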

If you still want to merge amplicons, you can always use "qiime feature-table merge" and "qiime feature-table merge-seqs" once you have built a feature table for each primer set (suggested by @llenzi - thanks for that!).
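
For reference, a small Python sketch of what those two merge calls could look like for the two COI primer sets. All file names are hypothetical. One assumption worth checking against your QIIME 2 version: as far as I know, the default overlap method refuses to merge tables that share sample IDs, so an explicit method such as "sum" is used here because the same samples appear in both tables.

```python
import shlex


def merge_commands(tables, rep_seqs, prefix="merged"):
    """Build the two QIIME 2 merge calls for per-primer-set outputs."""
    table_cmd = ["qiime", "feature-table", "merge"]
    for table in tables:
        table_cmd += ["--i-tables", table]
    table_cmd += [
        "--p-overlap-method", "sum",  # assumption: same samples in every table
        "--o-merged-table", f"{prefix}_table.qza",
    ]

    seqs_cmd = ["qiime", "feature-table", "merge-seqs"]
    for seqs in rep_seqs:
        seqs_cmd += ["--i-data", seqs]
    seqs_cmd += ["--o-merged-data", f"{prefix}_rep_seqs.qza"]
    return table_cmd, seqs_cmd


# Hypothetical artifacts produced by denoising each COI primer set separately.
table_cmd, seqs_cmd = merge_commands(
    tables=["table_COI_1.qza", "table_COI_2.qza"],
    rep_seqs=["rep_seqs_COI_1.qza", "rep_seqs_COI_2.qza"],
    prefix="COI",
)
print(shlex.join(table_cmd))
print(shlex.join(seqs_cmd))
```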

Best,

Sergio

timanix · September 12, 2024, 10:13am

Thank you for stepping in!

I want to add that if you merge amplicons from different primer sets and then run a beta diversity PCoA, don't be surprised if you get a Y shape or see samples clustering along an almost perfect line. Different primers produce different amplicons, which will be denoised to different ASVs, so there will be no ASVs shared between the different primer sets.