How to differentiate between 16S hypervariable regions using QIIME 2?

M_F · August 15, 2021, 5:22am

Hello,

The bacterial 16S rRNA gene includes 9 hypervariable regions, V1-V9. How can I differentiate between those regions using QIIME 2? How can we assign each identified bacterial OTU to a specific 16S hypervariable region (e.g., V1-V2, V3-V4, V5-V6, ...)? Is there a QIIME 2 script for this purpose?

Thank you for your help

Nicholas_Bokulich · August 16, 2021, 2:33pm

Hi @M_F ,

Please clarify: do you have a mixture of sequences from different 16S regions, and want to separate them by region before proceeding?

You can use q2-quality-control's exclude-seqs or filter-reads for this purpose (depending on what type of sequence you want to filter); just use reference sequences from the different domains and adjust the % identity accordingly.

Or do you have a different use case?
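For the first case (a mixture of regions), a minimal sketch of the exclude-seqs step might look like the following. It assumes the mixed sequences are already in a FeatureData[Sequence] artifact and that a region-specific reference artifact exists; the function name split_by_region, the file names, and the 0.90 thresholds are illustrative assumptions, not recommended values.

```shell
# Sketch only: split query sequences into those that align to a
# region-specific reference (hits) and those that do not (misses).
split_by_region() {
  # $1 = query sequences (.qza), $2 = region reference (.qza), $3 = output prefix
  qiime quality-control exclude-seqs \
    --i-query-sequences "$1" \
    --i-reference-sequences "$2" \
    --p-method vsearch \
    --p-perc-identity 0.90 \
    --p-perc-query-aligned 0.90 \
    --o-sequence-hits "${3}-hits.qza" \
    --o-sequence-misses "${3}-misses.qza"
}

# Example call (hypothetical file names; needs an activated QIIME 2 environment):
# split_by_region query-seqs.qza ref-seqs-v4.qza v4
```

The hits are the query sequences that match the region's reference; the misses could then be compared against the next region's reference in turn.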

M_F · August 17, 2021, 2:08am

Hi @Nicholas_Bokulich ,

Thanks for the reply. I have sequences corresponding to different hypervariable regions of the 16S rRNA gene. I want to differentiate between those regions and estimate the relative abundance of bacterial taxa according to those different domains (V1, V2, V3, ..., V9).
How can I separate those domains using QIIME 2?
How can I assign the corresponding bacterial taxa to each domain?

Thanks for your help

Nicholas_Bokulich · August 17, 2021, 5:00am

Follow the tutorials for taxonomic classification as usual after using q2-quality-control to separate each domain.
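As a rough sketch of what that could look like for a single region: the function name classify_region and all file names are hypothetical, the feature table is assumed to share feature IDs with the region's sequences, and the classifier is assumed to be a trained (or pre-trained) sklearn classifier artifact.

```shell
# Sketch only: classify one region's sequences and summarize relative abundance.
classify_region() {
  # $1 = region label, $2 = region sequences (.qza),
  # $3 = feature table (.qza), $4 = sklearn classifier (.qza)
  # Keep only the features whose sequences belong to this region.
  qiime feature-table filter-features \
    --i-table "$3" \
    --m-metadata-file "$2" \
    --o-filtered-table "${1}-table.qza" &&
  # Assign taxonomy to this region's sequences.
  qiime feature-classifier classify-sklearn \
    --i-classifier "$4" \
    --i-reads "$2" \
    --o-classification "${1}-taxonomy.qza" &&
  # Collapse to level 6 (genus in 7-rank taxonomies such as Greengenes).
  qiime taxa collapse \
    --i-table "${1}-table.qza" \
    --i-taxonomy "${1}-taxonomy.qza" \
    --p-level 6 \
    --o-collapsed-table "${1}-table-l6.qza" &&
  # Convert counts to relative frequencies per sample.
  qiime feature-table relative-frequency \
    --i-table "${1}-table-l6.qza" \
    --o-relative-frequency-table "${1}-rel-l6.qza"
}

# Example call (hypothetical file names):
# classify_region v4 v4-hits.qza table.qza classifier.qza
```

Running this once per region gives one relative-frequency table per domain, which can then be compared across regions.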

Good luck!

M_F · August 17, 2021, 5:08am

Thanks @Nicholas_Bokulich. The question is how to define the sequences to exclude using qiime quality-control exclude-seqs. Can I search NCBI for sequences corresponding to, for example, the V4 domain, and which file format can I use as input? How can I generate these inputs?
--i-query-sequences query-seqs.qza
--i-reference-sequences reference-seqs.qza

Thanks

Nicholas_Bokulich · August 17, 2021, 5:16am

No, NCBI probably would not have such sequences in an easily indexed form, but I could be wrong.

Rather, grab some reference sequences (this can be a random subsample; you do not need all of them) and use qiime feature-classifier extract-reads to trim them to the different hypervariable domains using primers for those domains.

Which file format can I use as input? Get a FASTA-format file and import it as a FeatureData[Sequence] artifact.

--i-query-sequences query-seqs.qza: this would be your query sequences, presumably processed upstream in QIIME 2. If not, import a FASTA file as FeatureData[Sequence].

--i-reference-sequences reference-seqs.qza: these would be the trimmed reference sequences. Import a FASTA file as FeatureData[Sequence] and then trim using extract-reads as described above.
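The import and trimming steps could be sketched roughly as follows. Note that extract-reads is provided by the q2-feature-classifier plugin. The function name make_region_reference and the file names are hypothetical, and the 515F/806R primer pair in the example call is only an assumption for the V4 region; substitute the primers that were actually used for each domain.

```shell
# Sketch only: build a region-specific reference from a full-length 16S FASTA file.
make_region_reference() {
  # $1 = reference FASTA, $2 = forward primer, $3 = reverse primer, $4 = region label
  # Import the FASTA file as a FeatureData[Sequence] artifact.
  qiime tools import \
    --type 'FeatureData[Sequence]' \
    --input-path "$1" \
    --output-path "ref-seqs-full.qza" &&
  # Trim the references to the region flanked by the given primers.
  qiime feature-classifier extract-reads \
    --i-sequences "ref-seqs-full.qza" \
    --p-f-primer "$2" \
    --p-r-primer "$3" \
    --o-reads "ref-seqs-${4}.qza"
}

# Example call for V4 (assumed primers 515F/806R; hypothetical file names):
# make_region_reference ref-seqs.fasta GTGCCAGCMGCCGCGGTAA GGACTACHVGGGTWTCTAAT v4
```

The resulting ref-seqs-v4.qza would then serve as the --i-reference-sequences input to exclude-seqs for that region.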

Good luck!

M_F · August 17, 2021, 5:23am

Thank you @Nicholas_Bokulich
