I am working with a subset of my dataset (10 libraries out of 200). I demultiplexed them using process_shortreads from the "Stacks" pipeline. Afterwards, I imported them into QIIME 2 and trimmed the primers using Cutadapt. Then I denoised the sequences with DADA2. After checking the DADA2 stats, I see that most of my reads did not pass the quality filtering step, and I ended up with very few rep-seqs and a low total feature count. The commands I used were the following:
The visualization of the demultiplexed sequences is the following: demux-ITS-2.qzv (311.6 KB)
Then I trimmed the primers with Cutadapt using the following command:
qiime cutadapt trim-paired
Then I denoised the sequences with the following command:
qiime dada2 denoise-paired
I have seen other posts related to this question, but they dealt with the truncation length of the sequences. Since I am not truncating the sequences, I am wondering whether there are other ways to solve the problem. I believe that changing the default values for "--p-max-ee-f", "--p-max-ee-r", and "--p-trunc-q" would help me. However, I do not fully understand their meaning.
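For readers with the same question, here is a minimal Python sketch of what these three parameters control, assuming the usual DADA2 definitions: the expected errors of a read are the sum of its per-base error probabilities, 10^(-Q/10), and a read is cut at the first base whose quality is less than or equal to the trunc-q value before the expected errors are counted. The function names and the example quality profiles are hypothetical and only for illustration; this is not the code that QIIME 2 or DADA2 actually runs, and it ignores pairing and length truncation.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def truncate_at_quality(phred_scores, trunc_q):
    """Cut the read at the first base with quality <= trunc_q."""
    for i, q in enumerate(phred_scores):
        if q <= trunc_q:
            return phred_scores[:i]
    return phred_scores


def passes_filter(phred_scores, max_ee=2.0, trunc_q=2):
    """Keep a read if, after quality truncation, its expected errors <= max_ee."""
    kept = truncate_at_quality(phred_scores, trunc_q)
    return len(kept) > 0 and expected_errors(kept) <= max_ee


# Hypothetical read: 200 good bases (Q30) followed by a 50-base low-quality tail (Q10).
full_read = [30] * 200 + [10] * 50
trimmed_read = full_read[:200]

print(expected_errors(full_read))     # tail dominates: 200 * 0.001 + 50 * 0.1
print(passes_filter(full_read))       # fails with max_ee = 2.0
print(passes_filter(trimmed_read))    # passes once the tail is removed
```

The sketch shows why a low-quality tail can sink an otherwise good read: a handful of Q10 bases contribute more expected errors than hundreds of Q30 bases. Raising the max-ee values lets such reads through, while truncating removes the tail so the read passes under the default threshold.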
Thank you for your feedback. I will try truncating the reads and see what the results are! I will also read the information you provided regarding the expected error parameters. I will keep you posted!
However, I have run into another problem when performing the taxonomic assignment. First, I downloaded all the ITS-2 sequences for Magnoliopsida from GenBank with RESCRIPt. The command I used was the following:
qiime rescript get-ncbi-data
--p-query 'txid3398[ORGN] AND (ITS2 OR Internal Transcribed Spacer 2) NOT environmental sample[Title] NOT environmental samples[Title] NOT environmental[Title] NOT uncultured[Title] NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]'
Then I proceeded to taxonomic assignment with the following command:
Agreed! About 20% of the reads cannot be joined, but this is pretty good given the quality.
ITS classification is harder than 16S classification for a number of reasons. For a first run with a custom database (that you just made!), I think getting to order is pretty good!
Yes. It's all about classify-consensus-vsearch.
After finding hits in the database with vsearch, the taxonomies of these hits are compared. For each level where >50% of the hits agree, a classification is given, until you reach a level where there is no longer a consensus.
For your data, the level in which hits no longer agree is mostly Order.
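The consensus step described above can be sketched in a few lines of Python. This is a simplified illustration, not the plugin's actual code: the function name, the `threshold` argument, the "Unassigned" label, and the example lineages are all hypothetical, and the strict ">50%" cutoff is taken from the description above.

```python
from collections import Counter


def consensus_lineage(hit_lineages, threshold=0.5):
    """Return the deepest lineage prefix shared by more than `threshold` of the hits."""
    hits = [tuple(rank.strip() for rank in lineage.split(";"))
            for lineage in hit_lineages]
    if not hits:
        return "Unassigned"

    consensus = ()
    max_depth = max(len(h) for h in hits)
    for depth in range(1, max_depth + 1):
        # Count full prefixes so identical names under different parents stay distinct.
        prefixes = Counter(h[:depth] for h in hits if len(h) >= depth)
        prefix, count = prefixes.most_common(1)[0]
        if count / len(hits) > threshold:
            consensus = prefix
        else:
            break  # no majority at this level, so stop descending
    return "; ".join(consensus) if consensus else "Unassigned"


# Made-up top hits for one query sequence.
hits = [
    "Magnoliopsida; Rosales; Rosaceae; Rubus",
    "Magnoliopsida; Rosales; Rosaceae; Prunus",
    "Magnoliopsida; Rosales; Moraceae; Ficus",
    "Magnoliopsida; Rosales; Rhamnaceae; Rhamnus",
]
print(consensus_lineage(hits))
```

In this example all four hits agree on class and order, but only two of four agree on a family, which is not more than 50%, so the assignment stops at order. Fewer, more similar hits or a lower consensus cutoff would let the assignment go deeper, at the cost of more confident-looking but possibly wrong labels.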
Thanks for your feedback! I will have a look at what you suggest and come back to you with the results. I hope I can get further than order, since it is essential for our study to reach at least the genus level! Luckily for us, we used two markers that can be combined to reach deeper taxonomic levels. We are studying plant-bird interactions, so stopping at the order level is not very meaningful.
Thanks a lot again! I am very grateful to the QIIME 2 community and especially to you moderators.