Cutadapt in Ion Torrent sequences

Hey there!

I am analyzing some Ion Torrent sequences. Until this run, we have been using the 16S Ion Torrent metagenomic kit, so I followed the steps recommended in the post (Possible Analysis Pipeline for Ion Torrent 16S Metagenomics Kit Data in QIIME2?).

However, this time my colleagues used two commercial primers for the amplification, since the samples for this run were obtained in a different way. Therefore, I was considering using cutadapt before the dada2 step to remove those primer sequences.

The problem is that cutadapt only has trim-paired and trim-single options, but Ion Torrent does not work like Illumina and its reads come in both directions, so which option should I use?

If it is possible to use cutadapt with Ion Torrent sequences and I use it to remove my primers, it is not necessary to trim again in the dada2 step, is it?

If it is not possible, what other option could I use? Should I remove those primer sequences with the dada2 trimming option? (In that case, if my primers are 20 nucleotides long, is it enough to trim 20 bases?)

Thank you so much in advance :slight_smile:

Miriam

Hi @MiriamGorostidi,

I think this is the first time the team has come across this, so we had to discuss. I'm not totally sure this will work, but it's worth trying :woman_shrugging:.

I'm assuming that you already know the primer sequences.

Your idea of using cutadapt is exactly what I would do. I would use trim-single, with the --discard-untrimmed flag, although I'm struggling with a good way to check this.

No, Cutadapt will trim your primers for you, so you do not need to trim them again.

If you use Sidle (use the memory refactor branch as of right now), you should be able to just scaffold the forward and reverse reads as separate regions and combine them.

Does that pipeline make sense?

Best,
Justine

Hello @jwdebelius !

Sorry for replying so late; I did not check the forum and almost forgot about it...

I will follow your advice and apply cutadapt with the trim-single and --discard-untrimmed options then. That is what you were suggesting, right?

However, if trimming with dada2 is similar to cutadapt, would it be enough to trim the first 20 bases of the sequences (since my primers are 20 nt long)?

Thank you so much in advance :slight_smile:

Hi @MiriamGorostidi,

Yes! That would be my suggestion.

No, DADA2 and Cutadapt trim differently. Cutadapt looks for the primer sequence in each read and removes it. DADA2 just removes the number of base pairs you specify, so it can't handle partial primer sequences or reads where the primers have already been trimmed. Since you have multiple regions and need to separate them, Cutadapt will let you do that. Once you've run the trimming with Cutadapt, you don't need to trim with DADA2.
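
To make the difference concrete, here is a toy pure-bash sketch. It is not how either tool is implemented. The two reads are made up for illustration (only the primer comes from this thread), and `trim_primer` is a hypothetical helper that mimics the idea of sequence-aware trimming with exact matching only.

```shell
#!/usr/bin/env bash
# Toy illustration only: neither Cutadapt nor DADA2 works on bash strings.
primer="AGAGTTTGATCCTGGCTCAG"   # forward primer from this thread (20 nt)

# Made-up reads: one starts with the full primer, one has lost its first 3 bases.
full_read="${primer}TTTTGGGGCCCCAAAA"
partial_read="${primer:3}TTTTGGGGCCCCAAAA"

# Fixed-length trimming (the idea behind --p-trim-left 20): always drop 20 bases.
fixed_full="${full_read:20}"
fixed_partial="${partial_read:20}"

# Sequence-aware trimming (the idea behind Cutadapt): remove the primer, or the
# longest primer suffix of at least 3 bases, only if the read starts with it.
trim_primer() {
  local read="$1" p="$2" i suffix
  for (( i = 0; i <= ${#p} - 3; i++ )); do
    suffix="${p:i}"
    if [[ "$read" == "$suffix"* ]]; then
      printf '%s\n' "${read:${#suffix}}"
      return 0
    fi
  done
  printf '%s\n' "$read"   # no primer found: return the read unchanged
}

aware_full="$(trim_primer "$full_read" "$primer")"
aware_partial="$(trim_primer "$partial_read" "$primer")"

echo "fixed trim, full primer:    $fixed_full"
echo "fixed trim, partial primer: $fixed_partial"
echo "aware trim, full primer:    $aware_full"
echo "aware trim, partial primer: $aware_partial"
```

With the partial primer, the fixed 20-base trim also removes three bases of real sequence, while the sequence-aware version returns the same insert for both reads. Real Cutadapt additionally tolerates mismatches, which this sketch does not.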

Best,
Justine

THANK YOU SO MUCH @jwdebelius !! You have been really helpful :smiling_face_with_three_hearts:

Hello @jwdebelius !!

I have tried what you told me and now I have more questions... (sorry about this):

  1. I applied cutadapt to my samples.qza with the following code:
qiime cutadapt trim-single \
 --p-cores 4 \
 --i-demultiplexed-sequences ${DIR}/samples.qza \
 --p-anywhere AGAGTTTGATCCTGGCTCAG \
 --p-anywhere GGCTGCTGGCACGTAGTTAG \
 --p-match-read-wildcards \
 --p-match-adapter-wildcards \
 --p-discard-untrimmed \
 --o-trimmed-sequences ${DIR}/samples-trimmed.qza \
 --quiet
  2. Then, in order to use dada2, I figured that a visualization was needed first, so that we know where to truncate the reads, right?
qiime demux summarize \
  --i-data ${DIR}/samples-trimmed.qza \
  --o-visualization ${DIR}/samples-trimmed-demux.qzv
  3. Finally, I performed the dada2 step:
qiime dada2 denoise-pyro \
  --i-demultiplexed-seqs ${DIR}/samples-trimmed.qza \
  --p-trunc-len 175 \
  --p-trim-left 0 \
  --p-n-threads 2 \
  --o-representative-sequences ${DIR}/Dada2_output/rep-seqs-cutadapt-dada2-pyro-conTrun175.qza \
  --o-table ${DIR}/Dada2_output/table-cutadapt-dada2-pyro-conTrun175.qza \
  --o-denoising-stats ${DIR}/Dada2_output/stats-cutadapt-dada2-pyro-conTrun175.qza \
  --verbose

Even though everything seems to run without errors, I have come across some strange things when checking the dada2 output.

Many sequences in the rep-seqs-dada2.qza file seem to be really similar, see figure (I think it is the first time I have seen such high similarity between sequences in this file). Moreover, when reviewing the stats-dada2.qza file, the percentage of input reads that passed the filter is really small.


I guess I am doing something wrong...

Thank you so much

Best,

Miriam

Hi @MiriamGorostidi,

I probably would have trimmed the forward adapter, but it looks like this is mostly working.

Please check your sequence quality and trim length, because that's where you're losing data (likely due to a quality drop at the end). There are a lot of posts on the forum about how to judge dada2 quality; it might be good to look at those to better understand how to use your dada2 summary to guide trimming.
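
One way to see where the reads are being lost is to turn the denoising stats artifact into a viewable table. This is a sketch, not the only way to do it: the input path is the one from the commands above, the `.qzv` output name is derived from it, and `RUN_QIIME` is a hypothetical opt-in switch so the command is only printed unless you ask for it to run.

```shell
#!/usr/bin/env bash
# Same ${DIR} as in the commands above; defaults to the current directory.
DIR="${DIR:-.}"
stats="${DIR}/Dada2_output/stats-cutadapt-dada2-pyro-conTrun175.qza"

# Build the command as an array so it can be inspected before running.
tabulate_cmd=(
  qiime metadata tabulate
  --m-input-file "$stats"
  --o-visualization "${stats%.qza}.qzv"
)

# Print the command for review.
printf '%s ' "${tabulate_cmd[@]}"; printf '\n'

# Hypothetical opt-in: run it only when RUN_QIIME=1 is set.
if [[ "${RUN_QIIME:-0}" == 1 ]]; then
  "${tabulate_cmd[@]}"
fi
```

In the resulting table, comparing the input and filtered columns per sample shows whether the loss happens at the filtering step, which would point to the truncation length.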

Best,
Justine

Perfect @jwdebelius !

I have rerun the command only removing the forward adapter.
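
For readers following along, a forward-only rerun could look roughly like the sketch below. The exact command used is not shown in this thread, so this is an assumption: it uses `--p-front` (the q2-cutadapt option for a 5' adapter) with the forward primer from the earlier command, the output file name is hypothetical, and `RUN_QIIME` is a hypothetical opt-in switch.

```shell
#!/usr/bin/env bash
# Same ${DIR} as in the earlier commands; defaults to the current directory.
DIR="${DIR:-.}"

# Assumed forward-only variant of the earlier trim-single call.
trim_cmd=(
  qiime cutadapt trim-single
  --i-demultiplexed-sequences "${DIR}/samples.qza"
  --p-front AGAGTTTGATCCTGGCTCAG
  --p-discard-untrimmed
  --p-cores 4
  --o-trimmed-sequences "${DIR}/samples-trimmed-fwd.qza"   # hypothetical name
)

# Print the command for review.
printf '%s ' "${trim_cmd[@]}"; printf '\n'

# Hypothetical opt-in: run it only when RUN_QIIME=1 is set.
if [[ "${RUN_QIIME:-0}" == 1 ]]; then
  "${trim_cmd[@]}"
fi
```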

As for the truncation length, I had followed the commonly used quality criteria to decide where to cut the sequences. However, when I tried truncating earlier, I obtained better results.
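
As a sketch of what truncating earlier means in practice, the denoising command from above can be parameterised on the truncation length. The value 150 below is purely a placeholder, since the length that actually worked is not stated in this thread. The output file names and the `RUN_QIIME` switch are also hypothetical.

```shell
#!/usr/bin/env bash
# Same ${DIR} as in the earlier commands; defaults to the current directory.
DIR="${DIR:-.}"
TRIMMED="${TRIMMED:-${DIR}/samples-trimmed.qza}"   # whichever trimmed artifact you use
TRUNC_LEN="${TRUNC_LEN:-150}"                      # placeholder, pick from your quality plot
OUT="${DIR}/Dada2_output"

denoise_cmd=(
  qiime dada2 denoise-pyro
  --i-demultiplexed-seqs "$TRIMMED"
  --p-trunc-len "$TRUNC_LEN"
  --p-trim-left 0
  --p-n-threads 2
  --o-representative-sequences "${OUT}/rep-seqs-trunc${TRUNC_LEN}.qza"
  --o-table "${OUT}/table-trunc${TRUNC_LEN}.qza"
  --o-denoising-stats "${OUT}/stats-trunc${TRUNC_LEN}.qza"
)

# Print the command for review.
printf '%s ' "${denoise_cmd[@]}"; printf '\n'

# Hypothetical opt-in: run it only when RUN_QIIME=1 is set.
if [[ "${RUN_QIIME:-0}" == 1 ]]; then
  "${denoise_cmd[@]}"
fi
```

Putting the truncation length in the output names keeps runs with different settings side by side, so their denoising stats can be compared.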

Thank you so much again!!

Best,

Hi @MiriamGorostidi,

I'm glad it worked for you!

Best,
Justine
