Removing non-biological sequences from raw reads

Hi,

This is a very simple question. I have received some reads, and I am not sure whether they still contain primers, adapters, or other non-biological sequences. I want to first check whether such sequences are present, and then remove them before downstream analysis. Is there a straightforward way to do this in QIIME 2?

Hi!
In case nobody provides a better answer: I checked my dataset after importing and before the primer removal step with this command:

qiime demux summarize \
    --i-data demux-paired-end.qza \
    --o-visualization demux-paired-end.qzv 

Then I opened it in a browser and searched manually for partial sequences of possible primers/adapters.

If you find primers, you can remove them with cutadapt:
https://docs.qiime2.org/2019.10/plugins/available/cutadapt/

Hi and thanks for your answer.
How did you search for primers/adapters manually?

I knew the region, so I took commonly used primers for it and searched the .qzv file, opened in a browser, for partial sequences of those primers. In my case I confirmed that the primers had already been removed and proceeded with the analysis.

But how do you see the sequences in demux.qzv?
Because all I see is quality scores.

Sorry, I think I copied the wrong command. I have already deleted this part of the analysis from my laptop, and I can't access my work machine this week.
You probably need this one:

qiime feature-table tabulate-seqs \
    --i-data rep-seqs.qza \
    --o-visualization rep-seqs.qzv

I remember being able to run it on my .qza file before DADA2 in an older version of QIIME 2.

The easiest one-step procedure may just be to run q2-cutadapt without looking. If primers or adapters are present, you will see a reduction in read length. If not, then there should be no effect!
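If you want to check for that reduction in read length yourself, here is a minimal Python sketch. It assumes both the original and the trimmed reads have been exported to FASTQ with four lines per record; the function names are my own for illustration and are not part of QIIME 2 or cutadapt.

```python
from collections import Counter


def read_lengths(lines):
    """Count how many reads have each length.

    `lines` is any iterable of text lines from a FASTQ file with
    exactly four lines per record (an assumption of this sketch).
    """
    lengths = Counter()
    for line_number, line in enumerate(lines):
        if line_number % 4 == 1:  # second line of each record is the sequence
            lengths[len(line.strip())] += 1
    return lengths


def mean_length(lengths):
    """Return the mean read length from a {length: count} mapping."""
    total_reads = sum(lengths.values())
    if total_reads == 0:
        return 0.0
    return sum(length * count for length, count in lengths.items()) / total_reads
```

To use it, open each exported file in text mode (`gzip.open(path, "rt")` for .fastq.gz files), pass the handle to `read_lengths`, and compare `mean_length` for the original and trimmed files. A clearly lower value after trimming would suggest that primers or adapters were present and removed.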

You would need to export the sequences and look at them directly.

tabulate-seqs can't help you here, since it requires a FeatureData[Sequence] artifact as input. You can only examine the fastq sequences by exporting them from QIIME 2. Nothing wrong with that... just export and then search for your primer sequences. One reason to just use q2-cutadapt for this is that it will perform a search for degenerate primers so that you don't need to.
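If you do want to search the exported reads yourself, the degenerate bases are the awkward part. Below is a rough Python sketch, not anything from QIIME 2 or cutadapt: it expands IUPAC codes into a regular expression and reports the fraction of reads with the primer near their start. The function names, the 40-base search window, and the four-lines-per-record FASTQ layout are all assumptions for illustration. Unlike cutadapt, it only finds exact matches and does not tolerate sequencing errors.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching every variant."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_sequences(lines):
    """Yield the sequence line of each record from FASTQ text lines.

    Assumes exactly four lines per record.
    """
    for line_number, line in enumerate(lines):
        if line_number % 4 == 1:
            yield line.strip().upper()


def fraction_with_primer(sequences, primer, window=40):
    """Return the fraction of reads with the primer in their first `window` bases."""
    pattern = primer_to_regex(primer)
    total = hits = 0
    for seq in sequences:
        total += 1
        if pattern.search(seq[:window]):
            hits += 1
    return hits / total if total else 0.0
```

For example, with an exported file opened in text mode as `handle`, `fraction_with_primer(read_sequences(handle), "GTGYCAGCMGCCGCGGTAA")` would report how many forward reads start with that primer. The primer here is only a placeholder; substitute the ones used for your own region. A fraction near 1 suggests the primers are still in the reads, and a fraction near 0 suggests they were already removed.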

Also, a general question: do you know how these reads were processed? Were they sequenced on Illumina, and which V-region was sequenced?
