Hi Nicholas,
The primers are in fact on the sequences. I had entered the code incorrectly, so I re-ran it and got a file showing that the primers have been trimmed, but the run also produced a warning.
Unfortunately, I did not copy the warning, but it said something along the lines of: "One or more of your adapter sequences may be incomplete ... The adapter is preceded by an 'A' quite often, so the results should be interpreted with care."
At first I was not concerned since it's just a warning, but after looking at my file, I see that most of the bars starting at position 97 are pink, and it says:
*"The plot at position 97 was generated using a random sampling of 9999 out of 7713524 sequences without replacement. This position (97) is greater than the minimum sequence length observed during subsampling (96 bases). As a result, the plot at this position is not based on data from all of the sequences, so it should be interpreted with caution when compared to plots for other positions. Outlier quality scores are not shown in box plots for clarity."*
Did I set up the code incorrectly?
# --p-adapter-f: RC of the F primer (found at the 3' end of the sequence if you have primer read-through)
# --p-front-f: should be the primer at the 5' end of the read
qiime cutadapt trim-paired \
--i-demultiplexed-sequences demux.qza \
--p-adapter-f [RC of SD-Bact-0341-b-S-17 (F)] \
--p-front-f [SD-Bact-0785-a-A-21 (R)] \
--p-adapter-r [RC of SD-Bact-0785-a-A-21 (R)] \
--p-front-r [SD-Bact-0341-b-S-17 (F)] \
--o-trimmed-sequences demux-trimmed.qza
Hi,
The primers were present, and after checking my code and rerunning it I get a trimmed file, but I am concerned about the forward reads. When I zoom in to choose trim and trunc parameters for the forward reads, I see that starting at position 80 the bars are pink and this message shows up:
The plot at position 80 was generated using a random sampling of 9999 out of 7713524 sequences without replacement. This position (80) is greater than the minimum sequence length observed during subsampling (79 bases). As a result, the plot at this position is not based on data from all of the sequences, so it should be interpreted with caution when compared to plots for other positions. Outlier quality scores are not shown in box plots for clarity.
I am unsure what to do with this, but I feel like I did something wrong, since the "pink bars" on my ITS dataset did not show up until I scrolled to the left tail.
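Since the message refers to reads shorter than the plotted position, one way to see how many reads are affected is to tabulate read lengths directly. The following is only a rough sketch: the function name `fastq_length_counts` is made up for illustration, it assumes standard 4-line FASTQ records, and the export directory and file-name pattern in the comments are assumptions about how the artifact unpacks.

```shell
# Tabulate read lengths from FASTQ text on stdin, printing "length count" pairs.
# Assumes standard 4-line FASTQ records (the sequence is line 2 of each record).
fastq_length_counts() {
  awk 'NR % 4 == 2 { n[length($0)]++ } END { for (len in n) print len, n[len] }' | sort -n
}

# Hypothetical usage after exporting the trimmed artifact
# (output directory and *_R1_* file pattern are assumptions):
#   qiime tools export --input-path Bacteria-demux-trimmed.qza --output-path exported-trimmed
#   gzip -dc exported-trimmed/*_R1_*.fastq.gz | fastq_length_counts
```

If only a handful of reads are shorter than position 80, the pink bars would reflect a small number of short reads rather than a problem with the whole run.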
Also, I ran the code twice, once with this code: Bacteria-demux-trimmed.qzv (306.0 KB)
qiime cutadapt trim-paired \
--i-demultiplexed-sequences Bacteria-demux-paired-end.qza \
--p-adapter-f GGATTAGATACCCBDGTAGTC \
--p-front-f GACTACHVGGGTATCTAATCC \
--p-adapter-r CTGCWGCCNCCCGTAGG \
--p-front-r CCTACGGGNGGCWGCAG \
--o-trimmed-sequences Bacteria-demux-trimmed.qza
Then with this code, because when I checked the files I noticed that there were sequences preceding and following the primers. I am not sure if this is right, but I gave it a try, and in both cases I got back the same results. (P.S. I decided to run this since the previous code gave me a warning along the lines of "WARNING: One or more of your adapter sequences may be incomplete, ... usually preceded by an A". Per one of your previous comments, this is just a warning and not an error and it should be fine, but I ran it just to see.) Bacteria-demux-trimmed-1.qzv (305.9 KB)
qiime cutadapt trim-paired \
--i-demultiplexed-sequences Bacteria-demux-paired-end.qza \
--p-adapter-f GGATTAGATACCCBDGTAGTCCCTGACTTGG \
--p-front-f GGACTACHVGGGTATCTAATCC \
--p-adapter-r CTGCWGCCNCCCGTAGGC \
--p-front-r CCTACGGGNGGCWGCAG \
--o-trimmed-sequences Bacteria-demux-trimmed-1.qza \
--verbose
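For anyone checking the adapter values, the `--p-adapter-*` sequences in the first run are meant to be the reverse complements of the `--p-front-*` primers. Below is a small shell sketch for confirming that, including the IUPAC degenerate codes; the helper name `revcomp` is just an illustration, and it assumes `rev` and `tr` are available on the system.

```shell
# Reverse-complement a primer, including IUPAC degenerate bases.
# W, S and N are their own complements, so they are left unchanged.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVacgtrykmbdhv' 'TGCAYRMKVHDBtgcayrmkvhdb'
}

revcomp GACTACHVGGGTATCTAATCC   # prints GGATTAGATACCCBDGTAGTC
revcomp CCTACGGGNGGCWGCAG       # prints CTGCWGCCNCCCGTAGG
```

Both outputs match the `--p-adapter-f` and `--p-adapter-r` values used in the first run.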
This is the original file
Bacteria-demux-summary.qzv (300.7 KB)