Advice Needed on Pipeline/Merging/Emperor Comparison

Hello,

I was wondering if I might have someone take a look at the pipeline I've developed. Primarily, I'd like a check on the order in which the commands are performed, to determine whether I have anything in the wrong place. The main commands begin just over 1/3 of the way down. It should be importing EMP-formatted paired-end reads and then doing everything necessary to complete the analysis through taxonomy and basic alpha diversity.

I am currently finding that the Emperor plots obtained through this Q2 pipeline look nothing like those previously obtained in Q1 for the same data set. I realize that there will be some difference due to adjustment of methods, but the change is far larger than expected. I suspect I have an error in some integral component of the pipeline.
I had previously thought that dada2 also merged, but this doesn't appear to be the case. If this is my issue, how might I fix this now that Vsearch allows for direct merging?

Thanks for any thoughts,
Willis

Q2_V6.5.txt (11.7 KB)

It’s a lot to dig through but everything looks okay to me. You can check out the moving pictures or other tutorials to get an idea of the typical order for each command and it looks like you are following that order.

You would need to share these emperor plots to give us a better idea. Indeed, different methodology will lead to major changes but the same general trends (or better) would be expected with QIIME2. But I do not think this is an emperor issue; I think you are correct and this is most likely a paired-end joining issue.

You would need to share the dada2 log (obtained by running the command with the --verbose flag) so we can assess read counts on input, denoising, merging, chimera checking, and output. What trimming parameters are you using and how does this square up with your quality plots? Paired-end read joining is indeed one of the main stumbling blocks. We will need all of the above to help diagnose this, and please check through this forum first: there are lots of threads covering paired-end joining issues and how to diagnose/correct these that may save you some time. Some cryptic issues can complicate assessing appropriate read joining (and these are other things to search for in the forum history):

  1. If reads are already trimmed, e.g., have undergone any sort of quality trimming prior to importing to QIIME2, or if q2-quality-filter is used prior to dada2.
  2. If amplicons are very variable in length, e.g., ITS domains are hypervariable and many reads may not merge with a given read length.
  3. Sometimes it is better to trim your reads more to obtain shorter, higher quality reads than trying to obtain longer, lower-quality reads (which can sometimes cause reads to be dropped and/or muddle read joining). Just make sure you have a minimum 20 nt overlap between forward/reverse reads.
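
To make the overlap check in point 3 concrete, here is a rough Python sketch of the arithmetic. It assumes you know (or can estimate) your amplicon length after primer removal and the lengths at which you truncate the forward and reverse reads. The function names and the 253 nt amplicon length are hypothetical and only for illustration; the 20 nt minimum is the figure suggested above. Plug in your own numbers.

```python
def expected_overlap(amplicon_len, trunc_len_f, trunc_len_r):
    """Bases shared by the truncated forward and reverse reads.

    A negative value means the two reads do not reach each other at all.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(amplicon_len, trunc_len_f, trunc_len_r, min_overlap=20):
    """True if the expected overlap meets the minimum (20 nt assumed here)."""
    return expected_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


if __name__ == "__main__":
    amplicon_len = 253  # illustrative value only, not taken from the posted pipeline
    for trunc_f, trunc_r in [(150, 150), (130, 130)]:
        overlap = expected_overlap(amplicon_len, trunc_f, trunc_r)
        verdict = "ok" if can_merge(amplicon_len, trunc_f, trunc_r) else "too short"
        print(f"trunc {trunc_f}/{trunc_r}: overlap {overlap} nt ({verdict})")
```

If your amplicons vary in length (point 2), it is probably safest to run this with the longest amplicon you expect, since those are the reads that will fail to merge first.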

I hope that helps!
