I want to improve my denoising stats

@Mehrbod_Estaki, hi, I hope you are doing well. I followed your suggestion and got somewhat better results, but now I would like to improve my denoising stats.

  1. What is your sample type (soil, mouse species, human oral etc.)
    Currently, I am working on an invertebrate gut microbiome project.
  2. How were these reads sequenced? (ex. Illumina 2x300 run)
    My amplicon reads come from a 2x150 run on an Illumina HiSeq 2500.
  3. What region (ex. V4? V3-V4, ITS etc)
    It's the V3-V4 region.
  4. Why do you think 3167 features is quite low? What do you expect?
    When I looked at some of the tutorials on the website, I realized I need to improve my denoising stats. I think my merging stats are not in a very good range (see image).
    Also, sir, I have one more question: since I don't have the primer information, how can I make my own V3-V4 region classifier for taxonomic classification?

Cool!

Something doesn’t add up here. The V3-V4 region is too long to be merged with a 2x150 Illumina run. Are you sure about these descriptions? Your stats results below indicate that your reads are merging just fine, which would not be possible with 150 bp reads of V3-V4.
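As a rough sanity check on the merging argument, here is a minimal sketch of the overlap arithmetic. It assumes a V3-V4 amplicon of about 460 bp (the real length depends on the primers used, which are unknown here) and a 12-base minimum overlap for merging; `expected_overlap`, `AMPLICON_LEN`, and `MIN_OVERLAP` are illustrative names, not part of any tool.

```python
def expected_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Bases shared by the forward and reverse reads of one amplicon.

    A negative value means the reads leave a gap and cannot be merged.
    """
    return fwd_len + rev_len - amplicon_len


# Assumption: ~460 bp V3-V4 amplicon; the real length depends on the primers.
AMPLICON_LEN = 460
# Assumption: at least 12 overlapping bases are needed to merge a read pair.
MIN_OVERLAP = 12

for read_len in (150, 250):
    overlap = expected_overlap(read_len, read_len, AMPLICON_LEN)
    verdict = "can merge" if overlap >= MIN_OVERLAP else "cannot merge"
    print(f"2x{read_len}: overlap = {overlap} bases, {verdict}")
```

Under these assumptions a 2x150 run leaves a gap of 160 bases between the reads, while a 2x250 run gives about 40 bases of overlap before any truncation.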

I don’t know much about invertebrate guts, but I seem to recall that diversity there is lower than in the mouse/human gut. So the 3167 features you described might not necessarily be low at all, unless you have evidence from previous work that more features are expected. Remember these are the # of unique features you discovered across all your samples, not to be confused with the total # of reads that the dada2-stats shows.
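To make the difference between unique features and total reads concrete, here is a small sketch on a made-up feature table. The sample names, feature IDs, and counts are all hypothetical and are not taken from your data.

```python
# Hypothetical feature table: sample -> {feature ID: read count}.
# All names and numbers are made up for illustration.
feature_table = {
    "sample_A": {"ASV_1": 30000, "ASV_2": 25000, "ASV_3": 5000},
    "sample_B": {"ASV_1": 40000, "ASV_3": 18000},
}


def count_unique_features(table):
    """Number of distinct feature IDs observed across all samples."""
    return len({feature for counts in table.values() for feature in counts})


def reads_per_sample(table):
    """Total read count for each sample."""
    return {sample: sum(counts.values()) for sample, counts in table.items()}


print("unique features:", count_unique_features(feature_table))
print("reads per sample:", reads_per_sample(feature_table))
```

In this toy table there are only 3 unique features even though each sample has tens of thousands of reads, so a modest feature count and a high read count can easily go together.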

Your dada2 stats results actually look pretty good to me and are on par with what we’d expect. You have loads of reads (50k+) to continue with your analysis without worrying, but if you really wanted to try and improve your results we would need a bit more information. For example, you could upload the demux.qzv as well as your dada2-denoise stats visualization artifact (not just the image) here. The artifacts hold important information in their provenance tab that can help us explore your approach so far.
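If you want to see where reads are lost, one option is to express each step of the denoising stats as a percentage of the input reads. The sketch below assumes per-sample counts named `input`, `filtered`, `denoised`, `merged`, and `non-chimeric`, which is how I recall the stats table being laid out; the function name and the numbers are hypothetical.

```python
def retention(stats):
    """Percent of input reads surviving each denoising step for one sample."""
    steps = ("filtered", "denoised", "merged", "non-chimeric")
    return {step: round(100 * stats[step] / stats["input"], 1) for step in steps}


# Hypothetical counts for a single sample, not taken from the uploaded files.
example = {
    "input": 100000,
    "filtered": 85000,
    "denoised": 84000,
    "merged": 70000,
    "non-chimeric": 62000,
}

for step, percent in retention(example).items():
    print(f"{step}: {percent}% of input")
```

A large drop between two consecutive steps points at the stage worth investigating, for example a drop at `merged` suggests the reads were truncated too short to overlap.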

In the future, please try to avoid asking multiple unrelated questions in the same thread. This helps us keep the forum organized and makes it easier to search the archives for past questions (including this one, which has been answered before).

I would recommend asking your sequencing facility or whoever did the amplicon preparation for this information. It will be required for publication anyways so you might as well get it. If for some strange reason this information is lost, you can always use a generic classifier that isn’t trained on your specific primers, albeit with slightly less accuracy. This is available on the data resource page.

Thank you, sir, for the information.

  1. Something doesn’t add up here. The V3-V4 region is too long to be merged with a 2x150 Illumina run. Are you sure about these descriptions? Your stats results below indicate that your reads are merging just fine, not possible with 150 reads of V3-V4.
    I am also confused about the 2x150 (that is what the brochure given to me by the vendor company states). I think they may have made a mistake and it might be 2x250. I am saying this only because when I demultiplexed, it showed 250 bases for the forward reads (R1) and 250 bases for the reverse reads (R2); if I am wrong, please correct me. Here is an image of the brochure.
  2. uploading the demux.qzv as well as your dada2-denoise stats visualization artifact (not just the image) here.
    Here are my demux.qzv file, demux_seqs.qzv (290.8 KB), and my DADA2 denoising stats, denoising-stats.qzv (1.2 MB).
  3. In the future please try to avoid asking multiple unrelated questions in the same thread. This helps us keep the forum organized and easier to search the archives for past questions (including this one has been answered before).
    Got it, sir. I will remember this and make sure it does not happen again.

Hi @SAMRENDRA01,
Thanks for providing the additional information.
Yes, it must have been a typo on their end; these are clearly 2x250 runs. As you pointed out, the quality scores show a length of 250 on both the forward and reverse reads.
After reviewing your DADA2 parameters, I honestly think you are about as good as you can get. You’re not really able to truncate much more from your reads, as you would risk losing proper merging, and that is the only thing I can think of that could improve this.
That being said, you have an excellent number of reads per sample. The smallest one is still over 58K! I would consider even 5k sufficient to move forward.
My recommendation: consider this an excellent run and a successful denoising, and move on to the next step of your analysis with plenty of confidence.
All the best!

Thank you so much, sir :hugs:, you are doing great work. Now I have to troubleshoot my taxonomic classifier; I will post that problem in a new thread :slightly_smiling_face:

@Mehrbod_Estaki sir, I am confused about this data.
Before I ran this data, I didn’t remove the adapter sequences and primers. So is it necessary to cut the primers and remove the adapters?