How to set trim and trunc in a curvilinear plot?

Wang_cs001632 · December 20, 2019, 2:23pm

Could you diagnose whether this is a technical error or a normal outcome? How should we set the trim and trunc parameters based on this result?

colinbrislawn · December 20, 2019, 2:52pm

Good morning, @Wang_cs001632,

Welcome to the QIIME 2 forums!

That is an unusual quality score outcome!

Can you tell me more about this data set? How long is the amplicon that was sequenced?
This looks like paired-end reads that were joined... what tool was used to join them?

Colin

Wang_cs001632 · December 21, 2019, 2:26am

Thanks for your timely reply. I have added a description of the pipeline and a FASTQ file as an example.
s107_1_L001_R1_001.fastq.gz (6.8 MB)
Illumina MiSeq

Trimmomatic (v0.35) to remove noise

FLASH (v1.2.11) to join sequences: paired-end reads must overlap by at least 10 bp, the maximum allowable overlap is 200 bp, and the maximum allowable mismatch ratio within the overlap is 20%.

QIIME (v8.0) [6] was used to process the raw data. First, the sequence data were assigned to samples according to their barcode sequences, and a barcode-free sequence file was generated for each sample. In general, the file containing barcodes can be used directly for downstream bioinformatic analysis. However, if higher-quality, more accurate results are needed, the sequences should be further denoised.

Sequences containing ambiguous bases or long single-base repeats (homopolymers), and sequences that are too short or too long, reduce the quality of the analysis, so they were removed (only sequences longer than 200 bp were retained).

The further noise-removing process and parameters were set as: remove reads containing N bases, the longest permitted single-base repeat sequence (such as AAAAAA) was 6 bases, and the maximum allowed primer mismatch was 2 bases. Mismatches during barcode matching are not allowed.

After the quality control steps above, we obtained the final clean tag sequences (FASTQ).

QIIME 2 code:

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path mouse --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-single-end.qza

Your kind help will greatly accelerate our study!

Nicholas_Bokulich · December 23, 2019, 4:40pm

Hi @Wang_cs001632,
DADA2 cannot be used to denoise pre-joined reads.

You can use q2-deblur to denoise those sequences, and the quality is good enough that I recommend not using any trimming/truncation.
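
For readers who want to try this, here is a minimal sketch of a Deblur run without trimming. It assumes the amplicon is 16S (the thread does not say; `denoise-other` exists for other markers and needs reference sequences), that the input is the `demux-single-end.qza` artifact from the import command above, and that the output file names, which are hypothetical, suit your project. In q2-deblur, `--p-trim-length` is required and `-1` disables trimming. The script prints the command first and runs it only if QIIME 2 and the input file are present.

```shell
#!/bin/sh
# Input artifact from the import step; output names below are hypothetical.
INPUT=demux-single-end.qza
# -1 tells q2-deblur not to trim reads to a fixed length.
TRIM_LENGTH=-1

# Build the command as a string so it can be reviewed before running.
DEBLUR_CMD="qiime deblur denoise-16S \
  --i-demultiplexed-seqs $INPUT \
  --p-trim-length $TRIM_LENGTH \
  --p-sample-stats \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza \
  --o-stats deblur-stats.qza"

echo "$DEBLUR_CMD"

# Run only when QIIME 2 is on the PATH and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$INPUT" ]; then
  $DEBLUR_CMD
fi
```

Check `qiime deblur denoise-16S --help` in your own release before relying on these flags, since plugin options can change between versions.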

You could replicate the entire pipeline you have described in QIIME 2: use q2-vsearch join-pairs instead of FLASH, and q2-demux to demultiplex instead of QIIME 1. Just pointing this out in case you want to implement your entire workflow in QIIME 2.
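
As a rough sketch of the joining step in QIIME 2, the command below mirrors the 10 bp minimum overlap from the FLASH settings described earlier. It assumes a demultiplexed paired-end artifact already exists; the file names are hypothetical. The action and output names match QIIME 2 releases from around the time of this thread, and later releases may name them differently, so confirm with `qiime vsearch --help`. As before, the script prints the command and runs it only if QIIME 2 and the input are available.

```shell
#!/bin/sh
# Hypothetical artifact names; replace with your own.
PAIRED=demux-paired-end.qza
JOINED=demux-joined.qza
# Minimum overlap in bp, matching the FLASH setting described in the thread.
MIN_OVERLAP=10

# Build the command as a string so it can be reviewed before running.
JOIN_CMD="qiime vsearch join-pairs \
  --i-demultiplexed-seqs $PAIRED \
  --p-minovlen $MIN_OVERLAP \
  --o-joined-sequences $JOINED"

echo "$JOIN_CMD"

# Run only when QIIME 2 is on the PATH and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$PAIRED" ]; then
  $JOIN_CMD
fi
```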

Good luck!