Thank you for your detailed explanation. I now have a better understanding of my questions.
After you gave me these specific examples, I was able to understand the differences completely. It looks like the pre-trained 16S classifiers provided on the QIIME 2 data resources website use the "taxonomy_7_levels" file instead of "consensus_taxonomy_7_levels", and I have used one of them for my 16S taxonomy classification. To stay consistent, I might keep using my old 18S classifier, which was trained using "consensus_taxonomy_7_levels". Does that make sense to you?
I am working on both 16S and 18S data. I am satisfied with the 16S results from the "7_levels" files. For 18S, though, I am really not sure how eukaryotic taxonomy works or what the best way to present the results is. It has so many more taxonomic ranks than bacteria, so I thought putting all of the results at the same level might be better for interpretation. I want to create a phyloseq object, which requires specifying the name of each taxonomic rank. If I use the 7-level file, I do not know what names to give the ranks beyond level 3 for the 18S data.
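One fallback I can think of is to use neutral placeholder names for the ranks until I know the proper ones. Below is a minimal Python sketch of that idea, assuming the taxonomy is a semicolon-separated string. The function name `split_taxonomy`, the `Rank1` to `Rank7` labels, and the example string are all made up for illustration; they are not SILVA's or phyloseq's official rank names.

```python
def split_taxonomy(taxon, n_levels=7, prefix="Rank"):
    """Split a semicolon-separated taxonomy string into a fixed number of levels.

    The labels (Rank1, Rank2, ...) are placeholders, not official rank names.
    Missing levels are filled with None; extra levels are dropped.
    """
    parts = [p.strip() for p in taxon.split(";") if p.strip()]
    parts = parts[:n_levels] + [None] * (n_levels - len(parts))
    return {f"{prefix}{i}": part for i, part in enumerate(parts, start=1)}


# Illustrative input only, not taken from a real classification result.
row = split_taxonomy("Eukaryota; Opisthokonta; Fungi")
```

The resulting keys could then be used as column names in the taxonomy table and renamed later.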
I followed the classifier training tutorial to train my Bayesian classifier for 18S data several months ago. After refreshing my memory, I was able to remember how to import the files as qza files.
That makes sense to me. DADA2 took an unusually long time to process all 10 samples because of the orientation problem. I understand that other downstream analyses, such as beta diversity, can also be affected, since the distance matrices might not be correct under these conditions.
You mentioned that I can either split my samples into two sets and run classify-sklearn on each separately, or use classify-consensus-vsearch to solve this issue. However, I still cannot run the beta diversity analysis unless I correct the orientation problem, right?
Do you mean the vsearch plugin for clustering and dereplicating the sequences? Are you suggesting that I use this method instead of DADA2? If I use vsearch for clustering, do I still need to worry about the downstream taxonomic classification and beta diversity issues caused by the orientation problem? For example, can I use classify-sklearn to assign taxonomy after using the vsearch method?
If I still want to use DADA2 for denoising, then as you recommend, I need to reverse the orientation of the reads in the raw fastq files. But how can I do that for a fastq file? Could you please let me know which tools I can use to change the orientation of the reads?
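As far as I understand, changing the orientation of a read means reverse-complementing the sequence and reversing the quality string so the scores stay aligned with the bases. Here is a minimal Python sketch of that, assuming plain four-line FASTQ records with no trailing blank lines. The function names are hypothetical, this is not a QIIME 2 command, and it re-orients every read it is given; deciding which reads actually need flipping is not covered.

```python
import gzip
from itertools import islice

# Only A/C/G/T/N are complemented; other IUPAC codes are left unchanged.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(header, seq, plus, qual):
    """Reverse-complement the bases and reverse the quality string."""
    return header, seq.translate(COMPLEMENT)[::-1], plus, qual[::-1]


def reverse_complement_lines(lines):
    """Yield re-oriented FASTQ lines, assuming exactly four lines per record."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            break
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        for field in reverse_complement_record(*record):
            yield field + "\n"


def reverse_complement_fastq(in_path, out_path):
    """Hypothetical wrapper for gzip-compressed FASTQ files."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(reverse_complement_lines(src))
```

If something like this is reasonable, I could call `reverse_complement_fastq` on each affected file (the paths would be my own) before importing the data again.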
Thank you so much for taking the time to help me.