Is it possible to process sequences that were generated with different primer pairs whose amplicons overlap? For example, some of my sequences come from 515F-806R and others from 341F-806R, so both sets share the 515F-806R region. Sure, I can process them separately in QIIME 2 following the protocol, but then the results are not directly comparable: the final OTU numbers are different.
Would it be fine to trim them all to the same 515F-806R region, then run Deblur/DADA2, then classify the taxonomy? Or should I analyze them separately in Deblur/DADA2 to get the feature tables, and if so, is there another way to get the taxonomy?
Hi @Lu_Yang, while you wait for an expert answer on your question, have you looked at the q2-fragment-insertion plug-in? I haven't used it personally, but it sounds like it was designed with this very issue in mind: using different regions for analysis. Check out their tutorial here.
You have a few different options here:
1. @Mehrbod_Estaki's suggestion is excellent (thank you for the suggestion!): comparing datasets amplified with different primers is indeed what q2-fragment-insertion was designed for (to my knowledge). I would think that plugin would be most advantageous, however, when comparing datasets with non-overlapping amplicons. So you have other options.
2. The process that you describe, trimming the longer reads to 515F-806R, is absolutely okay. Of course you lose the additional sequence information, but it sounds like that is not important here. You would trim the reads, process separately with DADA2/Deblur, then merge together into a single feature table and a single representative sequence file. Then classify taxonomy on the merged sequences.
3. You could process separately (without trimming), then classify taxonomy separately. Then collapse both feature tables at level 7 (species) taxonomy, and merge those tables. All downstream analyses (e.g., diversity analyses) would then be based on taxonomic information, and the precise primer site does not matter too much. However, this is definitely the weakest option of the three, because (1) the longer reads may yield deeper taxonomic assignments, so collapsing at species level may still yield very different profiles between the datasets, and (2) collapsing on taxonomy reduces the amount of information you have, i.e., diversity analyses with ASVs are much more sensitive for differentiating sub-groups within your data. This approach would really only be most advantageous when trying to merge datasets from very different amplicons (e.g., 16S rRNA genes and a protein-coding gene).
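To make the trim-then-merge route a bit more concrete, here is a rough sketch of the merge and classify steps. All file names are hypothetical placeholders, and it assumes both datasets were already trimmed to 515F-806R and denoised with settings that leave the features covering the same region at the same length (otherwise identical organisms will not collapse into identical ASVs). The small `run` helper is just my own convenience wrapper: it prints each command and only executes it if `qiime` is on your PATH. Flag names can differ between QIIME 2 releases, so please check `--help` for your version.

```shell
# Sketch only: every .qza name below is a hypothetical placeholder.
# run: print the command; execute it only when QIIME 2 is installed.
run() {
  echo "+ $*"
  if command -v qiime >/dev/null 2>&1; then "$@"; fi
}

# Merge the per-primer feature tables (both already limited to 515F-806R).
run qiime feature-table merge \
  --i-tables table-515F.qza \
  --i-tables table-341F-trimmed.qza \
  --o-merged-table table-merged.qza

# Merge the matching representative sequences.
run qiime feature-table merge-seqs \
  --i-data rep-seqs-515F.qza \
  --i-data rep-seqs-341F-trimmed.qza \
  --o-merged-data rep-seqs-merged.qza

# Classify taxonomy once, on the merged sequences.
run qiime feature-classifier classify-sklearn \
  --i-classifier classifier-515F-806R.qza \
  --i-reads rep-seqs-merged.qza \
  --o-classification taxonomy.qza
```

Classifying once on the merged sequences means every feature gets its taxonomy from the same classifier, which keeps the two datasets comparable.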
This code has been running for more than an hour and still no results have come out. I only have 3 samples in the test. Is there a setting to use multiple threads, as in the DADA2 procedure? Thanks.
b. The code as listed in the GitHub protocol:
qiime fragment-insertion sepp
(Does the rep-seqs.qza in this step mean the merged result of the rep-seqs.qza from primer set 1 and the rep-seqs.qza from primer set 2? I am not sure I understand this correctly.)
Then I will get the final taxonomy as a taxonomy.qza, similar to the one produced in the Moving Pictures tutorial.
(2) About trimming the longer reads to the overlapping region, i.e. 515F-806R: is there any code or protocol for this in QIIME 2 or elsewhere? Thanks.
(3) I used the following command to collapse the table: "qiime taxa collapse --i-table table.qza --i-taxonomy taxonomy.qza --p-level 7 --o-collapsed-table table-I7.qza". I have tried it, and indeed the resulting table is only at the species level, NOT at the OTU level. I don't think I will continue with this procedure, but it was very informative for me. Thanks.
Yes, you can use q2-cutadapt to trim your demultiplexed reads prior to denoising. See this tutorial.
Yes, that is the point: to collapse at species level (since OTUs from different primer regions will be unique even if the regions overlap). But I think we can all agree this is not the best solution for your case.
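As a rough idea of what the trimming step could look like for the 341F-806R reads, here is a sketch. The primer sequences are the commonly published 515F/806R sequences, so please substitute the exact primers from your own protocol. The file names are hypothetical, and the sketch assumes paired-end reads in which the 515F site falls inside the forward read and the 806R primer is still present at the start of the reverse read. As before, `run` is only a helper that prints the command and executes it if `qiime` is installed.

```shell
# Sketch only: file names are hypothetical; primers are assumptions to verify.
run() {
  echo "+ $*"
  if command -v qiime >/dev/null 2>&1; then "$@"; fi
}

FWD=GTGCCAGCMGCCGCGGTAA    # commonly published 515F (assumption)
REV=GGACTACHVGGGTWTCTAAT   # commonly published 806R (assumption)

# Remove everything up to and including the 515F site in forward reads,
# and the 806R primer in reverse reads; drop pairs where no primer is found.
run qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux-341F-806R.qza \
  --p-front-f "$FWD" \
  --p-front-r "$REV" \
  --p-discard-untrimmed \
  --o-trimmed-sequences demux-trimmed-515F-806R.qza
```

If the reverse primer was already removed from your reads, `--p-discard-untrimmed` would throw away those pairs, so you may need to leave that flag off in that case.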
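For completeness, the collapse-then-merge route could be sketched roughly like this. The dataset labels and file names are hypothetical, and it assumes each dataset already has its own feature table and taxonomy. The `taxa collapse` call is the one quoted earlier in this thread, run once per dataset, followed by a merge of the two species-level tables.

```shell
# Sketch only: dataset labels and .qza names are hypothetical placeholders.
run() {
  echo "+ $*"
  if command -v qiime >/dev/null 2>&1; then "$@"; fi
}

# Collapse each dataset's table to level 7 (species) using its own taxonomy.
for ds in 515F 341F; do
  run qiime taxa collapse \
    --i-table "table-$ds.qza" \
    --i-taxonomy "taxonomy-$ds.qza" \
    --p-level 7 \
    --o-collapsed-table "table-$ds-l7.qza"
done

# Merge the two species-level tables; feature IDs are now taxonomy strings.
run qiime feature-table merge \
  --i-tables table-515F-l7.qza \
  --i-tables table-341F-l7.qza \
  --o-merged-table table-merged-l7.qza
```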
Regarding your questions about q2-fragment-insertion, you can check out this post about this community plugin, or perhaps @Stefan can answer your questions.
I have successfully finished the q2-fragment-insertion procedure, and I think I understand it now. But I am still looking forward to @Stefan's answer on multithreading in q2-fragment-insertion, and I would like to make sure that my understanding is right.
You can invoke the help message for the plugin by executing qiime fragment-insertion sepp --help. This gives you the information you are looking for: --p-threads INTEGER is the number of threads to use [default: 1].
The runtime does not depend on the number of samples, only on the number of representative sequences. If you have ~4,000 sequences, typical runtime is about 2.5 h, but it scales very well when running in parallel, so with 4 threads the same task can be completed within ~50 min.
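A multi-threaded call might then look roughly like the sketch below. The input and output file names are hypothetical, and depending on your plugin version additional inputs may be required (newer releases also expect a reference database via `--i-reference-database`), so please treat the `--help` output as the authority. The `run` helper only prints the command and executes it if `qiime` is installed.

```shell
# Sketch only: file names are hypothetical; check --help for your version.
run() {
  echo "+ $*"
  if command -v qiime >/dev/null 2>&1; then "$@"; fi
}

THREADS=4  # set to the number of cores you can spare

# Newer plugin releases additionally require --i-reference-database.
run qiime fragment-insertion sepp \
  --i-representative-sequences rep-seqs-merged.qza \
  --p-threads "$THREADS" \
  --o-tree insertion-tree.qza \
  --o-placements insertion-placements.qza
```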
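To put those two numbers in perspective, here is the back-of-the-envelope arithmetic, using only the approximate figures quoted above (about 2.5 h on 1 thread, about 50 min on 4 threads). Your own timings will depend on hardware and on the number of representative sequences.

```shell
# Approximate figures from this thread, in minutes.
serial_min=150    # ~2.5 h with 1 thread
parallel_min=50   # ~50 min with 4 threads
threads=4

# Speedup = serial time / parallel time.
speedup=$((serial_min / parallel_min))

# Parallel efficiency (%) = speedup / threads * 100.
efficiency=$((100 * serial_min / (parallel_min * threads)))

echo "speedup: ${speedup}x, parallel efficiency: ${efficiency}%"
# prints: speedup: 3x, parallel efficiency: 75%
```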