Thank you for walking me through this. I have tried the settings you suggested and sent the files you requested by DM. As you can see, the output did not change much. I have also tried trimming the primers with DADA2 alone. Having run it both with and without cutadapt, I find that trimming only with DADA2 results in approximately 50% of reads being classified downstream as Unidentified Bacteria.
Given that the ‘linked adapters’ setup with cutadapt has given me the most promising output of all the different configurations, would you say it is safe to proceed with that but without discarding untrimmed reads? I am uncomfortable with retaining these untrimmed reads, mostly because I am not sure what that could introduce to downstream analyses.
First, thank you for sending me the PDF describing the protocol. I noticed a key statement:
pairs of primers (Fw-Rev or Rev-Fw) had to be present in the sequence fragments
That is, this particular sequencing facility expects mixed-orientation reads. Therefore you need to supply both primers to each of the --p-front-* parameters (--p-front-f and --p-front-r), each written in 5' - 3' orientation.
Like so:
=== Summary ===
Total read pairs processed: 133,428
Read 1 with adapter: 133,350 (99.9%)
Read 2 with adapter: 132,063 (99.0%)
Pairs written (passing filters): 131,987 (98.9%)
Total basepairs processed: 77,399,028 bp
Read 1: 38,699,572 bp
Read 2: 38,699,456 bp
Total written (filtered): 71,548,932 bp (92.4%)
Read 1: 35,774,760 bp
Read 2: 35,774,172 bp
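To make the mixed-orientation point concrete, here is a minimal Python sketch of why both primers have to be listed on each side. It is a toy model only: the primer sequences and reads are made up, matching is exact (cutadapt itself tolerates mismatches), and `trim_pair` is a hypothetical helper, not part of cutadapt or QIIME 2.

```python
# Toy illustration only: these primers and reads are invented, not the facility's.
FWD = "GTGCCAGC"  # hypothetical forward primer, 5'->3'
REV = "GGACTACT"  # hypothetical reverse primer, 5'->3'


def trim_pair(read1, read2, front_f, front_r):
    """Trim a 5' primer from each read of a pair.

    front_f lists the primers accepted at the start of read 1,
    front_r those accepted at the start of read 2.
    Returns None when either read starts with none of its listed primers,
    which mimics discarding untrimmed pairs.
    """
    def trim(read, primers):
        for primer in primers:
            if read.startswith(primer):
                return read[len(primer):]
        return None

    trimmed1 = trim(read1, front_f)
    trimmed2 = trim(read2, front_r)
    if trimmed1 is None or trimmed2 is None:
        return None
    return trimmed1, trimmed2


pairs = [
    (FWD + "AAAA", REV + "CCCC"),  # Fw-Rev orientation
    (REV + "GGGG", FWD + "TTTT"),  # Rev-Fw orientation
]

# One primer per side: the Rev-Fw pair is lost.
one_primer_each = [trim_pair(r1, r2, [FWD], [REV]) for r1, r2 in pairs]

# Both primers on both sides: both orientations are trimmed and kept.
both_primers_each = [trim_pair(r1, r2, [FWD, REV], [REV, FWD]) for r1, r2 in pairs]

print(one_primer_each)
print(both_primers_each)
```

With a single primer per side, every Rev-Fw pair counts as untrimmed, which is consistent with a large share of reads being dropped or left untrimmed in the earlier runs.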
@jessica.song, I forgot to mention that you’ll still need to orient this output into the same direction before any downstream analyses. You can do this with the RESCRIPt action orient-seqs.
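As a rough picture of what "orienting into the same direction" means, the Python sketch below flips a sequence to its reverse complement when that strand shares more k-mers with a reference. This is an assumption-laden toy, not how RESCRIPt's orient-seqs is implemented: the reference, the sequences, the k-mer size, and the `orient` helper are all made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def kmers(seq, k=4):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orient(seq, reference, k=4):
    """Return seq or its reverse complement, whichever shares more k-mers
    with the reference. Ties keep the sequence unchanged."""
    ref_kmers = kmers(reference, k)
    flipped = reverse_complement(seq)
    forward_hits = len(kmers(seq, k) & ref_kmers)
    flipped_hits = len(kmers(flipped, k) & ref_kmers)
    return flipped if flipped_hits > forward_hits else seq


# Hypothetical reference and two copies of the same fragment,
# one of them in the opposite orientation.
reference = "ACGTTGCAAGGCTTACGGATCCA"
same_strand = "TTGCAAGGCTTACGG"
opposite_strand = reverse_complement(same_strand)

oriented = [orient(s, reference) for s in (same_strand, opposite_strand)]
print(oriented)
```

After this step both copies read in the same direction, which is the property downstream steps rely on.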
You did it! I just ran the command on all my samples and it looks perfect! Terrible oversight on my part. I cannot thank you enough, for your time and for providing the commands!