I'm new to bioinformatics, but by reading the QIIME 2 documentation and forum I'm slowly succeeding in understanding what to do. To consolidate this knowledge and help other self-taught newcomers, I'd like to ask about some of my doubts here, hoping for your usual help. Thank you in advance.

Here are my questions:

  1. What should I do if very few reads pass primer trimming even though I supplied the correct primer pair?

  2. Adding "--p-match-read-wildcards" to the primer-trimming command sometimes increases the number of trimmed reads, but does that reduce the quality of the analysis because of the uncertain base calls in those reads? Is it better to leave those reads discarded?

  3. Is a merge rate of about 30-50% after DADA2 acceptable if read quality is very low?

  4. Can I use --p-perc-identity 0.97 with the SILVA 99% database to classify the rep-seqs?

  5. Why are very few rep-seqs (about 30%) sometimes found in the database after the classification step? Can a classifier trained on my reads improve this result?

Hi @leandro_di_gloria,

These are a lot of questions! Let's see if we can break things down.

I would double check whether your primers have already been trimmed (some sequencing facilities will do this for you, especially if they're proprietary primers). Otherwise, I would check whether you're using the --discard-untrimmed flag. This discards any sequence in which the primer isn't found, which is a problem if the primers have already been trimmed.
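If it helps, here is a minimal sketch of one way to check whether the primers are still present, by counting how many reads start with the primer. It is not part of QIIME 2 or cutadapt; the function names, the `window` parameter, and the example reads are hypothetical, and the forward primer shown (515F) is only an assumption, so substitute your own. It expands IUPAC degenerate bases and does not allow mismatches, so treat the number as a rough indication.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer with degenerate bases into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_with_primer(reads, primer, window=10):
    """Fraction of reads where the primer starts within the first `window` bases."""
    pattern = primer_regex(primer)
    reads = list(reads)
    if not reads:
        return 0.0
    hits = 0
    for read in reads:
        match = pattern.search(read.upper())
        if match is not None and match.start() <= window:
            hits += 1
    return hits / len(reads)


# Hypothetical example: two reads still carry the primer, two do not.
forward_primer = "GTGYCAGCMGCCGCGGTAA"  # assumed primer, replace with yours
example_reads = [
    "GTGCCAGCAGCCGCGGTAATACGGAGGGTGC",
    "GTGTCAGCCGCCGCGGTAATACGTAGGGTGC",
    "TACGGAGGGTGCAAGCGTTAATCGGAATTAC",
    "TACGTAGGGTGCGAGCGTTGTCCGGAATTAC",
]
print(fraction_with_primer(example_reads, forward_primer))
```

A fraction close to zero on a sample of your raw reads would suggest the primers were already removed, in which case discarding untrimmed reads would throw away almost everything.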

I think I'd personally discard reads with wildcards in your adapters, but I also tend to just stick with the defaults on things like this.

There have been so many discussions of dada2 parameters around here; you may find it useful to search for these discussions to see if they help you make a choice. I would look at your summary statistics to see where you're losing reads. Ultimately, though, you and your team are the only people who can answer this question for you.
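As a rough illustration of what I mean by looking at where you lose reads, here is a small sketch that takes the read counts for one sample at each step and reports the percentage kept relative to the previous step. The step names follow the columns I'd expect in the DADA2 denoising stats, but the counts below are made up and the helper names are hypothetical, so plug in your own numbers.

```python
def step_retention(counts):
    """Given ordered (step, reads) pairs, return (step, pct_of_previous) for each later step."""
    retention = []
    for (_, previous), (step, current) in zip(counts, counts[1:]):
        # Guard against division by zero when a step has no reads left
        pct = 100.0 * current / previous if previous else 0.0
        retention.append((step, round(pct, 1)))
    return retention


def biggest_loss(counts):
    """Name of the step that keeps the smallest share of the reads entering it."""
    return min(step_retention(counts), key=lambda item: item[1])[0]


# Made-up counts for a single sample, for illustration only
sample_counts = [
    ("input", 10000),
    ("filtered", 8000),
    ("denoised", 7600),
    ("merged", 3800),
    ("non-chimeric", 3420),
]

for step, pct in step_retention(sample_counts):
    print(f"{step}: {pct}% of the previous step")
print("largest loss at:", biggest_loss(sample_counts))
```

In this invented example the largest drop is at merging, which would point toward the truncation lengths and read overlap rather than the quality filter, but your own numbers may tell a different story.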

I think you need to make a separate post about your taxonomic classification because I don't have nearly enough details in this to be able to answer this and it's separate from the other two questions you're asking.


Thank you so much Justine, you've made that very clear. :grin: :pray:
Following your advice I'll create another post about taxonomic classification if no one else answers.

Have a great day!

