Dereplication error

gamoc · March 2, 2019, 10:16pm

I have some Illumina MiSeq samples for a 16S rRNA metagenomic analysis. The reads came paired, zipped, and in FASTQ format. I’ve already imported them into a .qza artifact, joined the paired ends, applied quality control (q-score 20), trimmed (removed reads shorter than 200 bp), and discarded singletons (abundance lower than 2), but I’m stuck at the dereplication step.
I ran the following command:

qiime vsearch dereplicate-sequences --i-sequences seqs.qza --o-dereplicated-sequences derep-seqs --o-dereplicated-table derep-table

However, I got this error:

Parameter ‘sequences’ received an argument of type FeatureData[Sequence]. An argument of subtype SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequences] is required.

Debug info has been saved to /tmp/qiime2-q2cli-err-mwsuvmto.log

What am I doing wrong? I need to dereplicate these reads before I sort out the chimeras and start to cluster to obtain the feature tables/sequences.

Nicholas_Bokulich · March 4, 2019, 2:11pm

Hi @gamoc,
It looks like seqs.qza is the wrong type of input. It appears to be a FeatureData[Sequence] artifact, which is the type of output you would get from dereplicate-sequences. So you are either inputting the wrong artifact (e.g., you already ran this command and are mixing up your files) or else you imported this file as the wrong format. You may want to review this tutorial for a similar workflow and importing instructions for the relevant data type.
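One way to confirm which case applies is to look at the artifact's semantic type before running the command. The sketch below wraps `qiime tools peek`, which prints the UUID, type, and data format of an artifact. The helper name `is_sample_data` is hypothetical, and the file name `seqs.qza` is taken from the command in the question.

```shell
# Hypothetical helper: succeeds only if the artifact's semantic type
# starts with SampleData, which is what dereplicate-sequences accepts.
is_sample_data() {
  # `qiime tools peek` prints "UUID:", "Type:" and "Data format:" lines.
  qiime tools peek "$1" | grep -q '^Type:[[:space:]]*SampleData'
}

# Only run the check when QIIME 2 is installed and the file exists.
if command -v qiime >/dev/null 2>&1 && [ -f seqs.qza ]; then
  if is_sample_data seqs.qza; then
    echo "seqs.qza has a SampleData type and can be dereplicated"
  else
    echo "seqs.qza is not SampleData; pass the joined, filtered reads instead"
  fi
fi
```

If the type printed is FeatureData[Sequence], the file is either the output of an earlier dereplication run or was imported with the wrong type.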

gamoc · March 6, 2019, 11:37pm

Thank you, Nicholas! Let me rephrase my question to wrap this up: may I skip the dereplication step and go straight to OTU picking? The clustering process already arranges the sequences into clusters, so wouldn't identical (replicated) sequences end up in the same cluster anyway?

Nicholas_Bokulich · March 7, 2019, 12:51am

No. Dereplicated seqs are required. Dereplication speeds up the process, but most importantly the OTU clustering methods in QIIME 2 require specific input types, namely a feature table (FeatureTable[Frequency]) and a FeatureData[Sequence] artifact, which are the outputs of the dereplicate-sequences method.

And yes, identical sequences would end up in the same cluster.
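As a rough sketch of how the two steps fit together, the function below runs dereplication and then passes both of its outputs to de novo clustering. The function name, the output file names, and the 0.97 identity threshold are assumptions for illustration only. The input must be a SampleData[...] artifact such as the joined, quality-filtered reads.

```shell
# Sketch: dereplicate, then cluster de novo.
# File names and the 0.97 identity threshold are illustrative assumptions.
derep_and_cluster() {
  reads="$1"  # a SampleData[...] artifact, e.g. joined, quality-filtered reads

  # Step 1: produces a feature table and FeatureData[Sequence].
  qiime vsearch dereplicate-sequences \
    --i-sequences "$reads" \
    --o-dereplicated-table derep-table.qza \
    --o-dereplicated-sequences derep-seqs.qza || return 1

  # Step 2: clustering takes both dereplication outputs as input.
  qiime vsearch cluster-features-de-novo \
    --i-table derep-table.qza \
    --i-sequences derep-seqs.qza \
    --p-perc-identity 0.97 \
    --o-clustered-table otu-table.qza \
    --o-clustered-sequences otu-seqs.qza
}

# Example call (hypothetical input file name):
# derep_and_cluster joined-filtered.qza
```

Because the clustering step takes the table and sequences as separate inputs, dereplication cannot be skipped even though identical reads would fall into one cluster.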
