"fastx_trimmer"-trimmed .fastq files fail at "qiime vsearch join-pairs"

Xiaofan_Wang · December 31, 2020, 7:38am

Hello.
I am facing a problem and hope someone can offer some ideas.
I used the "fastx_trimmer" command to trim both the forward and reverse fastq files before the "qiime tools import" and "qiime vsearch join-pairs" steps. However, over 99% of the reads were removed during the qiime vsearch join-pairs step (sequence count without trimming: 55162; sequence count after trimming: 396). I have had good experience with the same pipeline before, but not this time, and I really have no clue why. Can anyone help, please?
Here are the commands I used to trim, import, and join the fastq files:

fastx_trimmer -f 27 -i /Users/xxxxx/SRS4615107_1.fastq -o /Users/xxxxx/SRS4615107.trimmed_1.fastq -Q 33
fastx_trimmer -f 27 -i /Users/xxxxx/SRS4615107_2.fastq -o /Users/xxxxx/SRS4615107.trimmed_2.fastq -Q 33

qiime tools import --type SampleData[PairedEndSequencesWithQuality] --input-path IDtrial.txt --output-path deblur_input.trial --input-format PairedEndFastqManifestPhred33

qiime vsearch join-pairs --i-demultiplexed-seqs deblur_input.trial.qza --o-joined-sequences deblur_input_joined.trial.qza

Also, below is one of the sequences before and after trimming:
head -n 20 SRS4615107_1.fastq
@SRS4615107.5 M00161_31_000000000-BFW3D_1_1101_19926_1347 length=301
ACCGGTATGTACGGACTACTGGGGTTTCTAATCCTGTTTGCTACCCACGCTTTCGTGTCTCAGCGTCAGTTACGGTCCAGAGAGCCGTCTACACCACCGGTGTTCCTCCTGATATCTACGCATTTCACCGCTACACCAGGAATTCCACTCTCCCATCCCGCACTCTAGCCTTCCAGTTTTCGGCGCACCCTCCCGGTTGAGCCGGGAGATTTCACACCAAACTTGGAGAGCCGCCTACACACCCTTTACGCCCAGTAACTCCGAACACCGCTCGCTGCCTACGTATTACCGCGCCGGCTGG
+SRS4615107.5 M00161_31_000000000-BFW3D_1_1101_19926_1347 length=301
CCCCCGGGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGFGFGGGGGEGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGDDFEGGEGGGGGGG>FGGGCGGGGGGGGGGGGCGGGGCFGGGGGGG@+>FGDGG)9<);7ECCGGGFF@GGGE>7?FG7DG3:>GF6*4D>G6EB):F7@FFC?FFEF:GG)7(49)1

head -n 20 SRS4615107.trimmed_1.fastq
@SRS4615107.5 M00161_31_000000000-BFW3D_1_1101_19926_1347 length=301
TCTAATCCTGTTTGCTACCCACGCTTTCGTGTCTCAGCGTCAGTTACGGTCCAGAGAGCCGTCTACACCACCGGTGTTCCTCCTGATATCTACGCATTTCACCGCTACACCAGGAATTCCACTCTCCCATCCCGCACTCTAGCCTTCCAGTTTTCGGCGCACCCTCCCGGTTGAGCCGGGAGATTTCACACCAAACTTGGAGAGCCGCCTACACACCCTTTACGCCCAGTAACTCCGAACACCGCTCGCTGCCTACGTATTACCGCGCCGGCTGG
+SRS4615107.5 M00161_31_000000000-BFW3D_1_1101_19926_1347 length=301
GGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGFGFGGGGGEGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGDGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGDDFEGGEGGGGGGG>FGGGCGGGGGGGGGGGGCGGGGCFGGGGGGG@+>FGDGG)9<);7ECCGGGFF@GGGE>7?FG7DG3:>GF6*4D>G6EB):F7@FFC?FFEF:GG)7(49)1

Thanks to everyone in advance.

Nicholas_Bokulich · January 5, 2021, 3:13pm

Hi @Xiaofan_Wang,
I suspect the reads are just failing to join because they are being trimmed too short to overlap. Q=33 is pretty high.
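To make the overlap reasoning concrete, here is a rough sketch of the arithmetic. The helper name `expected_overlap` is hypothetical, and the 460-base amplicon length is an assumed value for illustration only, since the target region is not stated in this thread. The 275-base read length comes from the example read above (301 bases minus the 26 removed by `-f 27`).

```shell
# Hypothetical helper: expected overlap (in bases) between a forward and a
# reverse read that together span one amplicon.
# Usage: expected_overlap FWD_LEN REV_LEN AMPLICON_LEN
expected_overlap() {
  echo $(( $1 + $2 - $3 ))
}

# Illustrative numbers only: two 275-base reads and an ASSUMED 460-base
# amplicon (the real amplicon length is not given in this thread).
expected_overlap 275 275 460
```

A result at or below zero would mean the reads cannot overlap at all, and a small positive value may still fall under the minimum overlap that the joining step requires.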

At least for the sake of testing, I suggest lowering the Q-score threshold and trying again.
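While testing, it may also help to confirm how long the reads actually are before and after trimming. The following is only a sketch: `fastq_length_hist` is a hypothetical helper name, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```shell
# Hypothetical helper: print "length count" pairs for the reads in a FASTQ file.
# Assumes each record is exactly 4 lines, so the sequence is line 2 of every 4.
fastq_length_hist() {
  awk 'NR % 4 == 2 { n[length($0)]++ } END { for (len in n) print len, n[len] }' "$1" | sort -n
}
```

For example, running `fastq_length_hist SRS4615107.trimmed_1.fastq` and the same on the `_2` file would show whether the trimmed forward and reverse reads are still long enough to overlap.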

Different sequencing runs can have different quality profiles; this one may just be noisier than others.

As an aside, it would be possible to replicate this pipeline in QIIME 2 (see the q2-quality-filter plugin, which can do q-score-based trimming and filtering). The advantage of doing so is that all processing decisions would be recorded in provenance.

Good luck!
