Training a SILVA classifier (QIIME 2) by extracting reads for two primer pairs: the V3-V4 and V4 regions of the 16S rRNA gene

Hello, I am new to QIIME 2. I have completed one of the tutorials and am now analyzing bacterial microbiome samples from pigeon fecal matter. I am using qiime2-amplicon-2024.2 on Ubuntu 22.04 under the Windows Subsystem for Linux (WSL2). I have a question and would be very grateful for any help. I have 8 stool samples in total: four were sequenced with primers for the V3-V4 hypervariable region of the 16S rRNA gene, and the other four with primers for the V4 region.
I am at the step of training the classifier for taxonomic annotation, using the SILVA database. Before training, the region amplified by the PCR primers must be extracted from the reference sequences. The command would be the following:

qiime feature-classifier extract-reads \
--i-sequences silva-138-99-seqs.qza \
--p-f-primer CCTAYGGGRBGCASCAG \
--p-r-primer GGACTACNNGGGTATCTAAT \
--p-min-length 100 \
--p-max-length 610 \
--o-reads ref-seqs.qza

My question is whether both primer pairs, for the V3-V4 and V4 regions, can be given in a single command, for example like this:

qiime feature-classifier extract-reads
--i-sequences silva-138-99-seqs.qza
--p-f-primer CCTAYGGGRBGCASCAG \ Hypervariable Region V3-V4
--p-r-primer GGACTACNNGGGTATCTAAT \ Hypervariable Region V3-V4
--p-f-primer GTGCCAGCMGCCGCGGTAA \ Hypervariable Region V4
--p-r-primer GGACTACHVGGGTWTCTAAT \ Hypervariable Region V4
--p-min-length 150
--p-max-length 500
--o-reads ref-seqs.qza
If this is not possible, what would you recommend? Should I process the two groups of samples separately?

Hi @LisethBolCol,

It is not possible to pass more than one primer pair to qiime feature-classifier extract-reads. If you really want an amplicon-specific classifier, I'd recommend making two separate classifiers, one for V3-V4 and another for V4.
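As a rough sketch of the two-classifier approach, something like the script below could work. The primers and the V3-V4 length bounds are taken from the question. Everything else is an assumption for illustration: the V4 length bounds (100 and 400), the taxonomy file name silva-138-99-tax.qza, the output file names, and the helper functions run and extract_and_train. The script only prints the commands by default so they can be reviewed first; set RUN=1 to actually execute them.

```shell
#!/usr/bin/env bash
set -eu

# Dry run by default: print each command instead of executing it.
# Set RUN=1 in the environment to really run QIIME 2.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Hypothetical helper: extract one amplicon region from the SILVA
# reference sequences, then train a naive Bayes classifier on it.
# Usage: extract_and_train REGION FWD_PRIMER REV_PRIMER MIN_LEN MAX_LEN
extract_and_train() {
  local region="$1" fwd="$2" rev="$3" min_len="$4" max_len="$5"

  run qiime feature-classifier extract-reads \
    --i-sequences silva-138-99-seqs.qza \
    --p-f-primer "$fwd" \
    --p-r-primer "$rev" \
    --p-min-length "$min_len" \
    --p-max-length "$max_len" \
    --o-reads "ref-seqs-${region}.qza"

  # silva-138-99-tax.qza is an assumed name for the matching taxonomy file.
  run qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads "ref-seqs-${region}.qza" \
    --i-reference-taxonomy silva-138-99-tax.qza \
    --o-classifier "classifier-${region}.qza"
}

# V3-V4: primers and length bounds as given in the question.
extract_and_train v3v4 CCTAYGGGRBGCASCAG GGACTACNNGGGTATCTAAT 100 610

# V4: primers from the question; the length bounds are an assumption.
extract_and_train v4 GTGCCAGCMGCCGCGGTAA GGACTACHVGGGTWTCTAAT 100 400
```

Each group of four samples would then be classified with the classifier trained on its own region.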

However, I suggest that you read the following related post, which highlights some caveats of trying to combine data from multiple variable regions:


Thank you very much for helping me and providing clarity on the subject.