The bacterial 16S rRNA gene includes nine hypervariable regions, V1-V9. How can I differentiate between those regions using QIIME 2? How can I attribute each identified bacterial OTU to a specific 16S hypervariable region (e.g. V1-V2, V3-V4, V5-V6)? Is there a QIIME 2 script for this purpose?
Please clarify: do you have a mixture of sequences from different 16S regions, and want to separate them by region before proceeding?
You can use q2-quality-control's exclude-seqs or filter-reads for this purpose, depending on what type of sequence you want to filter (exclude-seqs works on feature sequences, filter-reads on demultiplexed reads). Use reference sequences from the different regions and adjust the % identity accordingly.
Thanks for the reply. I have sequences corresponding to different hypervariable regions of the 16S rRNA gene. I want to differentiate between those regions and estimate the relative abundance of bacterial taxa for each region (V1, V2, V3, ... V9).
How can I separate those regions using QIIME 2?
How can I attribute the corresponding bacterial taxa to each region?
Thanks @Nicholas_Bokulich. My question is how to define the sequences to exclude when using qiime quality-control exclude-seqs. Could I search NCBI for sequences corresponding to, for example, the V4 region, and which file format can I use as input? How can I generate these two inputs?
--i-query-sequences query-seqs.qza
--i-reference-sequences reference-seqs.qza
No, NCBI probably would not have such sequences in an easily indexed form, but I could be wrong.
Rather, grab some reference sequences (a random subsample is fine, you do not need all of them) and use qiime feature-classifier extract-reads to trim them to the different hypervariable regions, using the primers for those regions.
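To make the idea of primer-based trimming concrete, here is a minimal pure-Python sketch of cutting one region out of a full-length 16S sequence with a degenerate primer pair. It is only an illustration of the concept, not how extract-reads is implemented: it requires exact matches (allowing IUPAC degeneracy) in one orientation, whereas the real tool tolerates mismatches. The function names are made up for this example.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_to_regex(primer):
    """Turn a degenerate primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_region(sequence, f_primer, r_primer):
    """Return the sequence between the two primers (primers excluded).

    Returns None when either primer site is not found.
    """
    pattern = re.compile(
        primer_to_regex(f_primer)
        + "(.+?)"
        + primer_to_regex(reverse_complement(r_primer))
    )
    match = pattern.search(sequence.upper())
    return match.group(1) if match else None


if __name__ == "__main__":
    # 515F / 806R are a commonly used V4 primer pair (example only).
    f_primer = "GTGCCAGCMGCCGCGGTAA"
    r_primer = "GGACTACHVGGGTWTCTAAT"
    # Toy sequence: flank + forward site + insert + reverse site + flank.
    toy = "AAAA" + "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGGT" + "ATTAGATACCCTGGTAGTCC" + "CCCC"
    print(extract_region(toy, f_primer, r_primer))
```

Running the same function with a different primer pair on the same reference gives the corresponding region, which is what you want for building one trimmed reference set per region.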
Get a FASTA-format file and import it as a FeatureData[Sequence] artifact.
--i-query-sequences: these would be your query sequences, presumably processed upstream in QIIME 2. If not, import a FASTA file as FeatureData[Sequence].
--i-reference-sequences: these would be the trimmed reference sequences. Import a FASTA file as FeatureData[Sequence] and then trim it using extract-reads as described above.
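Putting those steps together, a rough sketch using the QIIME 2 Python (Artifact) API might look like the following. Treat it as a starting point under stated assumptions: the primer pairs are just commonly used examples (341F/805R and 515F/806R) and should be replaced with the primers from your own study, the file paths and the function name split_by_region are hypothetical, and the 0.9 identity threshold is an arbitrary placeholder you would need to tune.

```python
# Example primer pairs per region. These are commonly used pairs chosen only
# for illustration (an assumption); substitute the primers used in your study.
REGION_PRIMERS = {
    "V3-V4": ("CCTACGGGNGGCWGCAG", "GACTACHVGGGTATCTAATCC"),  # 341F / 805R
    "V4": ("GTGCCAGCMGCCGCGGTAA", "GGACTACHVGGGTWTCTAAT"),    # 515F / 806R
}


def split_by_region(query_fasta, reference_fasta, perc_identity=0.9):
    """Return {region: artifact of query sequences that hit that region}.

    query_fasta and reference_fasta are paths to FASTA files (hypothetical
    inputs); perc_identity is a placeholder threshold, not a recommendation.
    """
    # Imported here so the module still loads where QIIME 2 is not installed.
    from qiime2 import Artifact
    from qiime2.plugins import feature_classifier, quality_control

    # Import both FASTA files as FeatureData[Sequence] artifacts.
    query = Artifact.import_data("FeatureData[Sequence]", query_fasta)
    reference = Artifact.import_data("FeatureData[Sequence]", reference_fasta)

    hits_by_region = {}
    for region, (f_primer, r_primer) in REGION_PRIMERS.items():
        # Trim the reference sequences down to this region.
        trimmed = feature_classifier.methods.extract_reads(
            sequences=reference, f_primer=f_primer, r_primer=r_primer
        ).reads
        # Keep the query sequences that match the trimmed references.
        result = quality_control.methods.exclude_seqs(
            query_sequences=query,
            reference_sequences=trimmed,
            method="vsearch",
            perc_identity=perc_identity,
        )
        hits_by_region[region] = result.sequence_hits
    return hits_by_region
```

Note that overlapping regions (V4 sits inside V3-V4, for example) can produce hits in more than one bin, so you may need to tighten perc_query_aligned or pick non-overlapping primer pairs. Each returned artifact can be saved with .save() and used to filter your feature table before computing relative abundances per region.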