I am working on 16S seqs that were amplified with an archaea-specific primer set; however, the taxonomic analysis, run on seqs prepared as described for QIIME 2, shows that a large portion of the amplicons are bacterial. This bacterial contamination will be an obstacle to diversity analyses of the archaeal sequences alone.
Is there any way to remove the bacterial seqs from the rep-seqs and table generated in the DADA2 step, so that I can build a tree from the bacteria-free sequences and continue with the alpha and beta diversity analyses?
Thanks for all your cooperation.
qiime taxa filter-seqs \
  --i-sequences arch-seqs.qza \
  --i-taxonomy arch_taxonomy.qza \
  --p-include Archaea \
  --o-filtered-sequences seqs-no-bact.qza
That resulted in the error below.
(1/1) Invalid value for '--i-sequences': Expected an artifact of at least type FeatureData[Sequence]. An artifact of type SampleData[PairedEndSequencesWithQuality] was provided.
Could you let me know what I have to do to solve this problem?
I had not used rep-seqs.qza; I used the imported seqs instead. Yes, I have now found that rep-seqs works for this. Thanks for your comments.
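The error above is about artifact types: the filter expects FeatureData[Sequence], not the imported paired-end reads. As a rough way to check which type a .qza file holds, the sketch below reads the `type:` field from the archive's metadata.yaml. It assumes a .qza is a zip archive with `<uuid>/metadata.yaml` at the top level; the helper name `artifact_type` is hypothetical, and `qiime tools peek` reports the same information.

```python
import zipfile


def artifact_type(qza_path):
    """Return the semantic type recorded in a .qza archive's metadata.yaml."""
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Assumed layout: <uuid>/metadata.yaml at the top of the archive.
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("type:"):
                        # Drop the key, surrounding whitespace, and any quotes.
                        return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError("no top-level metadata.yaml with a 'type:' field found")
```

Calling `artifact_type("rep-seqs.qza")` on the DADA2 output would be one way to confirm the file is a FeatureData[Sequence] artifact before passing it to the filter.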
Quick question,
Why not use --p-exclude bacteria instead of --p-include Archaea if we want to remove bacterial seqs from a mix of archaeal and bacterial seqs?
It sounds like my seqs may also contain fungal and other eukaryotic seqs.
Let me make sure I understand: does "--p-include" specifically select the archaeal seqs from sequences annotated as archaea, bacteria, fungi, and other eukaryotes?
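As a rough illustration of the include/exclude question above, the sketch below mimics the filter in plain Python. It assumes the default matching is a case-insensitive substring test on the taxonomy string; the feature IDs, the labels, and the helper `keep_ids` are made up for illustration and are not QIIME 2 code.

```python
def keep_ids(taxonomy, include=None, exclude=None):
    """Return feature IDs that survive include/exclude filtering.

    taxonomy: dict mapping feature ID -> taxonomy string.
    include/exclude: comma-separated search terms, or None.
    """
    def terms(arg):
        # Split on commas, lowercase, and drop empty terms.
        return [t.strip().lower() for t in arg.split(",") if t.strip()] if arg else []

    inc, exc = terms(include), terms(exclude)
    kept = []
    for feature_id, taxon in taxonomy.items():
        label = taxon.lower()
        # With include terms, a feature must match at least one of them.
        if inc and not any(t in label for t in inc):
            continue
        # A feature matching any exclude term is dropped.
        if any(t in label for t in exc):
            continue
        kept.append(feature_id)
    return kept


# Toy taxonomy (hypothetical IDs and labels).
toy = {
    "f1": "D_0__Archaea;D_1__Euryarchaeota",
    "f2": "D_0__Bacteria;D_1__Cyanobacteria",
    "f3": "D_0__Eukaryota;D_1__Opisthokonta",
    "f4": "Unassigned",
}
only_archaea = keep_ids(toy, include="Archaea")   # ['f1']
no_bacteria = keep_ids(toy, exclude="Bacteria")   # ['f1', 'f3', 'f4']
```

Under these assumptions, including Archaea keeps only the sequences annotated as archaea, while excluding Bacteria would leave eukaryotic and unassigned sequences in place.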
Thanks for all your answers, which have helped me with this.
Let me shift the topic to a basic question.
Do you know why, in several samples, more of my seqs are assigned to bacteria (13-90% of total seqs) than to archaea, even though they were amplified with archaea-specific primers and I extracted the Silva reference reads using the primer sequences? Do you think it is a primer issue or a method problem (e.g., the Silva-reference-based classifier)?
Usually you would expect archaea to account for a relatively small share of both taxa and abundance. Bacteria-specific primers reduce the archaeal amplicons even further, whereas archaea-specific primers help to amplify more archaeal reads. But they are probably just not specific enough to exclude bacteria and eukaryotes at the PCR step.
I am currently analyzing an archaeal dataset whose distribution across domains is no better.
Following on from this, I would now like to delete the phylum Cyanobacteria from the purified bacterial seqs. I tried --p-exclude P_cyanobacteria, but it did not seem to work: the cyanobacterial seqs are still in the resulting file. Which command(s) do I have to use for this?
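One possible reason the exclusion had no effect: the search term has to occur in the taxonomy strings exactly as they are written. The sketch below assumes case-insensitive substring matching and uses two label styles commonly seen for this phylum (`p__Cyanobacteria` and `D_1__Cyanobacteria`). The actual labels in the taxonomy file may differ, so it is worth checking them first. The helper `term_matches` is hypothetical.

```python
def term_matches(term, taxon):
    """Case-insensitive substring test (an assumed model of the default matching)."""
    return term.lower() in taxon.lower()


# Example labels; the real taxonomy strings may be formatted differently.
labels = [
    "k__Bacteria; p__Cyanobacteria",      # Greengenes-style label
    "D_0__Bacteria;D_1__Cyanobacteria",   # Silva-style label
]

for term in ("P_cyanobacteria", "Cyanobacteria"):
    hits = [term_matches(term, label) for label in labels]
    print(term, hits)
```

Neither label contains the literal text `P_cyanobacteria`, whereas both contain `Cyanobacteria`. Under that assumption, `--p-exclude Cyanobacteria` (without the `P_` prefix) would be the term to try, applied to both the sequences and the feature table.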