Quality filtering of QIITA/EBI sequences

Hi everyone,

I would like to analyze some QIITA studies, but it seems the sequences (submitted to EBI) were trimmed and filtered using the QIIME 1 script split_libraries_fastq.py. As far as I know, the script includes a quality filtering step based on the Phred score, and its parameters are quite similar to those of the QIIME 2 quality-filter plugin.

I am not sure whether the two are functionally equivalent. I noticed that the Phred threshold of 3 is quite low, and I am planning to increase it to 20. Would it be legitimate to download the filtered sequences from EBI and filter them again using the QIIME 2 quality filter? Also, if a base falls below the defined threshold, is it trimmed from the sequence or replaced by a wildcard N?

Another question (which may not be related to QIIME 2): is there an easy way to identify the primers that were used in sequencing? I got some sequences online, but the metadata are incomplete and the primer is unknown. Most sequences seem to have a 515f primer, yet when I use cutadapt only around 50% of the sequences are trimmed (see attachment). In general, is it possible to have multiple primers in one set of sequences? Any suggestions would be greatly appreciated.

attachment

Best,
Xi

Regarding the primer situation, any chance you were impacted by this bug? It's possible it's the same primer and our wrapping code mixed up the forward and reverse reads.

Hi @ebolyen,

Thank you very much! It seems the 515f primer shows up randomly in both forward and reverse reads, and that might be the reason only 50% of the sequences were trimmed. Thanks!

Best,
Xi

Sounds like that bug might NOT be the problem. If only 50% of the sequences are being trimmed, your input sequences may be in mixed orientations; i.e., your "forward" and "reverse" reads are actually mixtures of 515f->806r and 806r->515f amplicons. Can you confirm? You can search this forum to see how other users have addressed this issue, but the short answer right now is that we don't have a way to demultiplex these yet in QIIME 2.
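One rough way to confirm mixed orientations is to count how many reads in the "forward" file begin with each primer. The sketch below is only an illustration: the primer sequences are the commonly used 515f/806r pair and are an assumption about this dataset, the function names are made up for this example, and reading the FASTQ file is left out.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumed primers (a commonly used 515f/806r pair); replace with your own.
PRIMER_515F = "GTGYCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACNVGGGTWTCTAAT"


def starts_with_primer(read, primer, max_mismatches=1):
    """Return True if the read begins with the (degenerate) primer."""
    if len(read) < len(primer):
        return False
    # zip stops at the end of the primer, so only the read prefix is compared.
    mismatches = sum(
        base not in IUPAC[code] for base, code in zip(read.upper(), primer)
    )
    return mismatches <= max_mismatches


def classify_orientation(reads):
    """Count how many reads start with each primer, or with neither."""
    counts = {"515f": 0, "806r": 0, "neither": 0}
    for read in reads:
        if starts_with_primer(read, PRIMER_515F):
            counts["515f"] += 1
        elif starts_with_primer(read, PRIMER_806R):
            counts["806r"] += 1
        else:
            counts["neither"] += 1
    return counts


# Toy reads made up for illustration (not real data).
toy_reads = [
    "GTGCCAGCAGCCGCGGTAATACGGAGGGT",   # begins with 515f
    "GGACTACAGGGGTATCTAATCCTGTT",      # begins with 806r
    "ACGTACGTACGTACGTACGTACGT",        # begins with neither
]
print(classify_orientation(toy_reads))
```

If a single "forward" file shows a sizeable share of reads starting with each primer, that would be consistent with mixed orientations.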

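Correct, split_libraries_fastq.py and q2-quality-filter do the same thing. In QIIME 1 this filtering was built into demultiplexing; in QIIME 2 we split demultiplexing and quality filtering apart, largely because some denoising methods like DADA2 don't require the extra filtering.

Increasing the Phred threshold would be wise, and in line with our own recommendations for QIIME 1-style filtering.

Yes, it is legitimate to download the filtered sequences from EBI and filter them again if you are trying to apply a more stringent filter.

Bases below the threshold are trimmed, not replaced with N.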
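To make "trimmed" concrete, here is a simplified sketch of Phred-based truncation. It is not the actual code of either tool: the function and parameter names are made up for this example, Phred+33 encoding is assumed, and the exact boundary conditions in the real implementations may differ.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def truncate_read(seq, quality_string, min_quality=20, window=3,
                  min_length_fraction=0.75):
    """Truncate a read at the first run of `window` low-quality bases.

    Returns the truncated sequence, or None if what remains is shorter
    than `min_length_fraction` of the original read.
    """
    scores = phred33_scores(quality_string)
    run_start, run_length = None, 0
    cut = len(seq)  # default: keep the whole read
    for i, q in enumerate(scores):
        if q < min_quality:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length >= window:
                cut = run_start  # cut where the low-quality run began
                break
        else:
            run_length = 0
    truncated = seq[:cut]
    if len(truncated) < min_length_fraction * len(seq):
        return None  # too short after truncation: discard the read
    return truncated


# 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
print(truncate_read("ACGTACGTACGT", "IIIIIIIII###"))
print(truncate_read("ACGTACGTACGT", "IIII########"))
```

The point of the sketch is that the low-quality tail is cut off and the read is shortened (or dropped if too little remains); no base is rewritten as N.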

I hope that helps!

Thank you for the suggestions! @Nicholas_Bokulich
