Is there a way to reverse complement trimmed reads after running cutadapt in qiime2?
Specifically, I have mixed-orientation reads, in both the forward and reverse directions, generated from two different primers. I can separate the reads by orientation using cutadapt, so that the output gives me one artifact for the forward reads and one for the reverse reads.
I would like to be able to reverse complement the reverse reads artifact and concatenate them to the forward artifact to submit to dada2 denoising.
Does this seem reasonable and is it possible to do this in qiime2?
Mixed-orientation reads are a bane; I am glad you could use cutadapt to re-orient them in your case!
Do you really need to reverse complement? Shouldn’t the reverse reads in this case be in the correct orientation since cutadapt is detecting the primer in that read? QIIME 2 does not have a method for reverse-complementation at the moment, so you would need to export the reads, reverse complement outside of QIIME 2, then re-import.
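If you do go the export and re-import route, the reverse-complement step itself is simple. Here is a minimal sketch in plain Python, assuming standard A/C/G/T/N bases and 4-line FASTQ records; the function names are hypothetical and not part of QIIME 2 or cutadapt. Note that the quality string has to be reversed along with the sequence so each score stays with its base.

```python
# Complement table for the standard bases plus N, in both cases.
# Assumption: other IUPAC ambiguity codes are not expected; any character
# missing from this table is passed through unchanged by str.translate.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def revcomp_fastq_record(header, seq, qual):
    """Reverse complement one FASTQ record.

    The quality string is reversed (not complemented) so that each
    quality score still lines up with its base.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return header, reverse_complement(seq), qual[::-1]


# Small usage example with a made-up read.
record = revcomp_fastq_record("@read1", "ACGTTN", "IIIH#!")
print(record)  # ('@read1', 'NAACGT', '!#HIII')
```

In practice you would apply this to every record of the exported FASTQ files before re-importing them.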
Either way, I would recommend denoising separately and then merging afterwards. Not only would this be easier to do in QIIME 2, but concatenating the reads before denoising could cause issues with dada2, since the error profiles of the reverse reads are likely quite different from those of the forward reads, and the two sets may need to be trimmed etc. separately.
Hi @Nicholas_Bokulich, thank you for your feedback. I have been looking into this notion of merging tables after denoising.
So, just to follow up on your suggestion: I have reads from V2f and V2r. I denoised these separately and now have one dada2 feature table for V2f and one for V2r. You suggested merging afterwards; however, my forward and reverse reads are from the same samples, so I assume the overlap method parameter should be ‘sum’. Do you agree?
--p-overlap-method TEXT Choices('error_on_overlapping_sample',
Method for handling overlapping ids.
Yes! Sum would work. Note, though, that if the two runs have very different read counts, then summing will cause reads from one run or the other to dominate. You may want to check the read counts for the overlapping samples to decide; one option is to filter these reads out of one table or the other if you have a “preferred” run.
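To make the point about one run dominating concrete, here is a toy sketch in plain Python. It only illustrates the idea of summing counts for overlapping sample IDs and comparing per-sample read totals; it is not how QIIME 2 implements the merge, and the tables, sample IDs, and feature IDs are made up.

```python
from collections import defaultdict


def merge_sum(*tables):
    """Merge tables of the form {sample: {feature: count}}, summing counts
    wherever the same sample and feature appear in more than one table."""
    merged = defaultdict(lambda: defaultdict(int))
    for table in tables:
        for sample, features in table.items():
            for feature, count in features.items():
                merged[sample][feature] += count
    return {sample: dict(features) for sample, features in merged.items()}


def sample_totals(table):
    """Total read count per sample."""
    return {sample: sum(features.values()) for sample, features in table.items()}


# Hypothetical tables from the two separately denoised read sets.
v2f = {"S1": {"asvA": 900, "asvB": 100}, "S2": {"asvA": 50}}
v2r = {"S1": {"asvC": 40}, "S2": {"asvA": 10, "asvC": 30}}

merged = merge_sum(v2f, v2r)
print(sample_totals(v2f))     # {'S1': 1000, 'S2': 50}
print(sample_totals(v2r))     # {'S1': 40, 'S2': 40}
print(sample_totals(merged))  # {'S1': 1040, 'S2': 90}
```

In this made-up case, sample S1 is almost entirely V2f reads after summing, while S2 is roughly balanced. That is the kind of difference worth checking before deciding how to merge.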
You may also want, as an initial QC pass over your data, to relabel the sample IDs in one run so that you can merge and run core-metrics to compare the replicate samples. Ideally these replicate samples should cluster together and you will not see clear differences between runs. Batch effects are common, and since you’ve replicated samples on each run it gives you a great opportunity to make sure this is not a problem in your data!
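The relabeling idea can be sketched the same way. This is a toy illustration in plain Python, assuming a table shaped like {sample: {feature: count}} and a hypothetical "_V2r" suffix; in a real analysis you would also need matching rows in your sample metadata for the new IDs.

```python
def relabel_samples(table, suffix):
    """Return a copy of the table with `suffix` appended to every sample ID,
    plus a map from each new ID back to the original ID."""
    relabeled = {}
    id_map = {}
    for sample, features in table.items():
        new_id = f"{sample}{suffix}"
        relabeled[new_id] = dict(features)
        id_map[new_id] = sample
    return relabeled, id_map


# Hypothetical tables that share sample IDs across the two runs.
run1 = {"S1": {"asvA": 900}, "S2": {"asvA": 50}}
run2 = {"S1": {"asvC": 40}, "S2": {"asvC": 30}}

run2_relabeled, id_map = relabel_samples(run2, "_V2r")

# After relabeling, the sample IDs no longer overlap, so the two tables
# can be merged side by side and the replicates compared directly.
overlap = set(run1) & set(run2_relabeled)
print(sorted(run2_relabeled))  # ['S1_V2r', 'S2_V2r']
print(overlap)                 # set()
```

The `id_map` keeps track of which relabeled sample corresponds to which original sample, which is what lets you check whether replicates cluster together.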