Hi, I am currently working on sequencing data analysis. The reads were filtered using DADA2, and clustering was performed with QIIME VSEARCH cluster-features-de-novo (--p-perc-identity 0.97).
Afterwards, I got the feature ID, sequence length, and sequence information as:
I also generated a table_abundance.tsv at 0.97 sequence similarity, like:
(I converted it to Excel.) The feature ID (OTU ID) is a string of characters. I have seen papers that conducted OTU-level analysis and labeled the OTUs as OTU1, OTU2, etc. Is there a way to convert this feature table so that the OTU IDs read OTU1, OTU2, ... instead of the character strings?
Thanks so much for your help!
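One possible way to do the relabeling outside QIIME is sketched below. It is only a minimal sketch: it assumes the table was exported as a tab-separated file whose first column is the feature ID and whose header lines start with "#". The function name rename_features and the two feature IDs in the example are made up for illustration. OTUs are numbered in order of appearance, not by abundance.

```python
import csv
import io


def rename_features(tsv_text, prefix="OTU"):
    """Replace feature IDs in the first column with OTU1, OTU2, ...

    Returns the rewritten table text and the old-ID -> new-ID mapping.
    (Hypothetical helper; assumes a tab-separated export.)
    """
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out, mapping = [], {}
    for row in rows:
        # Leave blank lines and header/comment lines (starting with "#") unchanged.
        if not row or row[0].startswith("#"):
            out.append(row)
            continue
        # Assign the next label the first time an ID is seen.
        new_id = mapping.setdefault(row[0], f"{prefix}{len(mapping) + 1}")
        out.append([new_id] + row[1:])
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(out)
    return buf.getvalue(), mapping


# Toy table with placeholder feature IDs (not real data).
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tsampleA\tsampleB\n"
    "4b5eeb300368260019c1fbc7a3c718fc\t120\t0\n"
    "fe30ff0f71a38a39cf1717ec2be3a2fc\t3\t57\n"
)
renamed, mapping = rename_features(example)
print(renamed)
print(mapping)
```

Keeping the mapping is the important part: the same old-ID to new-ID pairs would have to be applied to the representative sequences, otherwise OTU1 in the table no longer points to a sequence.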
Here is a follow-up question regarding the data analysis.
As mentioned before, and as in this paper (Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding), the reads were filtered using DADA2, and clustering was performed with QIIME VSEARCH cluster-features-de-novo (--p-perc-identity 0.97).
I then ran the command
qiime vsearch uchime-denovo
to remove possible chimeras.
I noticed that before running qiime vsearch uchime-denovo, the total frequency in cluster_table.qzv was 3014234 and the number of features was 1149. After the command, the total frequency dropped slightly to 2977404, while the number of features dropped to 270. What caused the drop in the number of features?
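To put the two drops side by side, here is a quick calculation using only the four numbers reported above. It assumes that the lost reads all belong to the removed features, which would hold if the table was filtered by feature ID only.

```python
# Counts reported in the post.
freq_before, feat_before = 3_014_234, 1_149
freq_after, feat_after = 2_977_404, 270

reads_lost = freq_before - freq_after      # reads belonging to removed features (assumed)
features_lost = feat_before - feat_after   # features flagged and removed

print(f"features removed: {features_lost} ({features_lost / feat_before:.1%})")
print(f"reads removed:    {reads_lost} ({reads_lost / freq_before:.2%})")
print(f"mean reads per removed feature: {reads_lost / features_lost:.1f}")
```

So 879 features disappeared but they carried only 36830 reads in total, which is a small fraction of the data. Under the assumption above, the removed features were on average very low in abundance.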
To answer my own second question: I realized that after removing the chimeras, some low-frequency features had been removed.
E.g., the minimum frequency before running qiime vsearch uchime-denovo is 0; the minimum frequency after running it is 20688.
This is probably one reason.
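One way to check this directly would be to export both tables to TSV and list the features that are missing afterwards, together with their total frequencies. The sketch below assumes tab-separated exports with "#" header lines. The helper names and the toy tables are hypothetical, not output from my data.

```python
def feature_totals(tsv_text):
    """Map feature ID -> total frequency across all samples."""
    totals = {}
    for line in tsv_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        totals[fields[0]] = sum(float(x) for x in fields[1:])
    return totals


def removed_features(before_tsv, after_tsv):
    """List (feature_id, total_frequency) for features absent after filtering, rarest first."""
    before = feature_totals(before_tsv)
    kept = set(feature_totals(after_tsv))
    gone = [(fid, tot) for fid, tot in before.items() if fid not in kept]
    return sorted(gone, key=lambda item: item[1])


# Toy tables with made-up IDs and counts.
before_tsv = "#OTU ID\ts1\ts2\nfeatA\t900\t850\nfeatB\t2\t0\nfeatC\t0\t5\n"
after_tsv = "#OTU ID\ts1\ts2\nfeatA\t900\t850\n"

for fid, total in removed_features(before_tsv, after_tsv):
    print(fid, total)
```

If most of the removed features have small totals, that would support the idea that the filtering mainly dropped low-frequency features.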