q2-cutadapt output

jessica.song · November 2, 2020, 9:32am

Thank you for walking me through this. I tried the settings you suggested and sent you the files by DM, as requested. As you can see, the output did not change much. I also tried trimming the primers with DADA2 alone. Having compared runs with and without cutadapt, trimming only with DADA2 results in approximately 50% of reads being classified downstream as Unidentified Bacteria.

Given that the 'linked adapters' setup with cutadapt has given me the most promising output of all the different configurations, would you say it is safe to proceed with that but without discarding untrimmed reads? I am uncomfortable with retaining these untrimmed reads, mostly because I am not sure what that could introduce to downstream analyses.

What do you think?

Jessica

SoilRotifer · November 2, 2020, 2:37pm

Good news @jessica.song! I think I solved this.

First, thank you for sending me the PDF describing the protocol. I noticed a key statement:

pairs of primers (Fw-Rev or Rev-Fw) had to be present in the sequence fragments

That is, this particular sequencing facility must expect mixed-orientation reads. Therefore you need to supply both primers to each of the --p-front-* parameters (each written in 5'-3' orientation).
Like so:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux.qza \
  --p-cores 4 \
  --p-front-f CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAAKCC \
  --p-front-r GACTACHVGGGTATCTAAKCC CCTACGGGNGGCWGCAG \
  --p-overlap 3 \
  --p-error-rate 0.1 \
  --p-match-read-wildcards \
  --p-match-adapter-wildcards \
  --p-discard-untrimmed \
  --o-trimmed-sequences primer-trimmed-demux-2.qza \
  --verbose > cutadapt-log-2.txt
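
To see why both primers have to be listed for each read, here is a rough shell sketch. It is not how cutadapt works internally: it only does exact matching anchored at the very start of the read, with no error tolerance and no overlap setting. It expands the IUPAC codes that occur in these two primers (N, W, H, V, K) into regex character classes and reports which primer a read begins with. The function names and the two example reads are made up for illustration.

```shell
# Expand the IUPAC ambiguity codes found in these two primers into regex classes.
# Only N, W, H, V and K are handled, because no other codes occur here.
iupac_to_regex() {
  printf '%s' "$1" | sed -e 's/N/[ACGT]/g' -e 's/W/[AT]/g' \
                         -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/K/[GT]/g'
}

FWD=$(iupac_to_regex CCTACGGGNGGCWGCAG)
REV=$(iupac_to_regex GACTACHVGGGTATCTAAKCC)

# Report which primer (if any) a read starts with.
primer_at_start() {
  if printf '%s\n' "$1" | grep -Eq "^$FWD"; then
    echo forward
  elif printf '%s\n' "$1" | grep -Eq "^$REV"; then
    echo reverse
  else
    echo none
  fi
}

# Two invented reads from the same "R1" file, one in each orientation.
primer_at_start CCTACGGGAGGCAGCAGTGGGGAATATTG   # prints "forward"
primer_at_start GACTACCAGGGTATCTAATCCTGTTTGCT   # prints "reverse"
```

With mixed-orientation data, a read in the forward file can start with either primer, so passing only one primer to --p-front-f would leave roughly the other orientation untrimmed (and discarded under --p-discard-untrimmed).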

The output from the first sample looks like this:

=== Summary ===

Total read pairs processed:         133,428
  Read 1 with adapter:              133,350 (99.9%)
  Read 2 with adapter:              132,063 (99.0%)
Pairs written (passing filters):    131,987 (98.9%)

Total basepairs processed:    77,399,028 bp
  Read 1:                     38,699,572 bp
  Read 2:                     38,699,456 bp
Total written (filtered):     71,548,932 bp (92.4%)
  Read 1:                     35,774,760 bp
  Read 2:                     35,774,172 bp
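
As a quick sanity check on a summary like this, the percentages can be recomputed from the raw counts. The snippet below is only a sketch using the figures from this one sample. The variable names and the pct helper are my own, not anything cutadapt provides.

```shell
# Figures copied from the cutadapt summary for the first sample.
pairs_in=133428
pairs_out=131987
bp_in=77399028
bp_out=71548932

# Percentage of the first argument relative to the second, to one decimal place.
pct() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'; }

echo "Pairs kept:      $(pct "$pairs_out" "$pairs_in")%"   # 98.9%
echo "Bases kept:      $(pct "$bp_out" "$bp_in")%"         # 92.4%
echo "Pairs discarded: $((pairs_in - pairs_out))"          # 1441
```

Only about 1% of read pairs are lost to --p-discard-untrimmed here, while about 7.6% of bases are removed, which is what you would expect when a primer is trimmed from nearly every read.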

-Cheers!
-Mike

SoilRotifer · November 2, 2020, 3:02pm

@jessica.song, I forgot to mention that you'll still need to orient this output in the same direction before any downstream analyses. You can do this with the RESCRIPt action orient-seqs.

-Good luck!

jessica.song · November 2, 2020, 3:05pm

Hi @SoilRotifer,

You did it! I just ran the command on all my samples and it looks perfect! Terrible oversight on my part. I cannot thank you enough for your time and for providing the commands!

Jessica

SoilRotifer · November 2, 2020, 3:06pm

Glad we could help!
