Your data looks pretty good.
You truncated at 200 and then merged the paired-end reads, so your merged sequences should be roughly 400 long. I say "around 400" because the forward and reverse reads overlap, and that overlap makes the merged sequences somewhat shorter than 400.
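As a rough sketch of that arithmetic, assuming both reads were truncated at 200: the merged length is the two truncation lengths added together, minus the overlap. The overlap values below are made up for illustration and are not measured from your run.

```python
def merged_length(trunc_len_f: int, trunc_len_r: int, overlap: int) -> int:
    """Expected length of a merged read pair.

    overlap is the number of bases shared by the forward and reverse reads.
    It depends on the amplicon, so the values used below are hypothetical.
    """
    if overlap < 0:
        raise ValueError("overlap cannot be negative")
    if overlap > min(trunc_len_f, trunc_len_r):
        raise ValueError("overlap cannot exceed the shorter read")
    return trunc_len_f + trunc_len_r - overlap


# Both reads truncated at 200, with a few hypothetical overlaps.
for overlap in (12, 20, 30):
    print(overlap, merged_length(200, 200, overlap))
```

The larger the overlap, the further below 400 the merged sequences land.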
I would like to take a look at your data though to confirm. Would you be able to upload the rep-seq-dada2-3.qzv, stats-dada2-3.qzv and demux.qzv so that I can take a look?
Thanks cherman, here are the data you need.
Could you tell me how to check that the data are good from these images?
And could you help me take a look at my other post? It is also related to DADA2.
First of all, your data at this point looks pretty normal.
Let's walk through these visualizations to get a better idea of what they are telling you. Sound good?
When looking at the interactive quality plot, you can analyze the quality of the data. "Good data" is really hard to quantify, but a good rule of thumb is that anything around 30 is a pretty good quality score. Why do you want to truncate? In part, it is so that dada2 doesn't get lost in the weeds of a lot of poor-quality data. In your case there is really no "bad data" (this would be where you see a fall-off in the quality of the data; you can see this here), so you can truncate at the very end of your sequence length. If you wanted, you could truncate at 210, but 200 works just fine!
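If it helps to see the rule of thumb spelled out, here is a minimal sketch. It assumes you have copied the per-position median quality scores out of the plot into a list; the function name `suggest_trunc_len` and the example scores are hypothetical, and this is not something QIIME 2 runs for you.

```python
def suggest_trunc_len(median_quality, min_quality=30):
    """Count the leading positions whose median quality is >= min_quality.

    median_quality: per-position median quality scores (index 0 = first base).
    Returns len(median_quality) if quality never drops below the threshold.
    """
    for position, score in enumerate(median_quality):
        if score < min_quality:
            return position
    return len(median_quality)


# Hypothetical profiles: one that stays high, one that falls off at base 150.
steady = [36] * 210
falling = [36] * 150 + [28] * 60

print(suggest_trunc_len(steady))
print(suggest_trunc_len(falling))
```

This stops at the first position below the threshold, so a single noisy dip would cut the reads early. Treat the result as a starting point, and remember that paired reads still need enough length left to overlap when they are merged.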
This is an overview of the quality control that dada2 did on your data. The short and sweet of it is that you don't want to lose too many sequences at any one step of the quality control, the steps being the columns at the top. Looking at your data here, nothing is out of the ordinary.
The rep-seq file shows all your features and their sequences. The range of sequence lengths you have seems good to me, like I said previously. Most of your data is around 400 in length and looks good.
I know that was a lot! I hope this helps not only with this dataset but with future analyses as well!
Sorry for the confusion. All I was saying there was that the stats-dada2.qzv file is an overview of what happened in the dada2 step. If you look at the stats-dada2.qzv, you will see that the column headers are the steps of dada2 (input, filter, denoise, merge, and non-chimeric). For example, if I was looking at a stats-dada2.qzv and saw that I lost 75% of all my sequences in the non-chimeric step, then I would know that a large portion of my data was chimeras and that there was something wrong in the data processing. If you don’t see a huge loss for all of your samples at any one step, then your data has gone through the quality control step effectively! In your specific case, I don’t see anything that would make me think that something is wrong with your data.
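To make that check concrete, here is a small sketch that works out how much each sample lost at each step and flags big drops. The read counts are invented, the step labels in `STEPS` are my assumption about how you would key the counts, and the 50% cutoff is an arbitrary illustration rather than a QIIME 2 rule.

```python
# Assumed step labels, in the order dada2 applies them.
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def step_retention(counts):
    """Fraction of reads kept at each step relative to the step before it."""
    retention = {}
    for previous, current in zip(STEPS, STEPS[1:]):
        before = counts[previous]
        if before == 0:
            continue  # nothing left to lose, so skip this step
        retention[current] = counts[current] / before
    return retention


def flag_large_losses(samples, max_loss=0.5):
    """List (sample, step, fraction lost) where one step lost more than max_loss."""
    flagged = []
    for sample_id, counts in samples.items():
        for step, kept in step_retention(counts).items():
            lost = 1.0 - kept
            if lost > max_loss:
                flagged.append((sample_id, step, round(lost, 2)))
    return flagged


# Invented counts: sample-B loses 75% of its reads at the non-chimeric step.
samples = {
    "sample-A": {"input": 10000, "filtered": 9000, "denoised": 8800,
                 "merged": 8000, "non-chimeric": 7600},
    "sample-B": {"input": 10000, "filtered": 9200, "denoised": 9000,
                 "merged": 8400, "non-chimeric": 2100},
}
print(flag_large_losses(samples))
```

A single flagged sample is less worrying than the same step being flagged for every sample, which is the pattern worth investigating.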
Like many things in QIIME 2, there is no specific number. All I can clarify for you is that if there is a recognizable pattern of all sequences being thrown out in one dada2 step, then you should investigate that and make sure there isn’t an issue with the sequencing data.
As for the ITS data, all the rules for how to make sure that the data made it through dada2 are the same.
Hope this helps!