I have some Illumina MiSeq samples for a 16S rRNA metagenomic analysis. The reads came paired, zipped, and in FASTQ format. I've already imported them into a .qza artifact, joined the paired ends, applied quality control (Q-score 20), trimmed (discarded reads shorter than 200 bp), and discarded singletons (abundance lower than 2), but I'm stuck at the dereplication step.
I typed the following script
Parameter ‘sequences’ received an argument of type FeatureData[Sequence]. An argument of subtype SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequences] is required.
Debug info has been saved to /tmp/qiime2-q2cli-err-mwsuvmto.log
What am I doing wrong? I need to dereplicate these reads before I filter out the chimeras and start clustering to obtain the feature table and sequences.
Hi @gamoc,
It looks like seqs.qza is the wrong type of input. It appears to be a FeatureData[Sequence] artifact, which is the type of output you would get from dereplicate-sequences. So you are either inputting the wrong artifact (e.g., you already ran this command and are mixing up your files) or else you imported this file as the wrong format. You may want to review this tutorial for a similar workflow and importing instructions for the relevant data type.
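If it helps to confirm what type an artifact really is before passing it to a command, `qiime tools peek` reports it. The same check can be sketched in plain Python, assuming the usual .qza layout (a zip archive with a top-level UUID directory holding a `metadata.yaml` that has a `type:` line). The function name `peek_semantic_type` is hypothetical and used only for illustration; it is not part of QIIME 2.

```python
import zipfile


def peek_semantic_type(qza_path):
    """Return the semantic type recorded in a .qza archive's metadata.yaml.

    Assumption: the archive contains <uuid>/metadata.yaml with a line
    such as 'type: FeatureData[Sequence]'.
    """
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Only look at metadata.yaml directly under the top-level directory.
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("type:"):
                        # Drop the key and any surrounding quotes.
                        return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError("no top-level metadata.yaml with a 'type:' line found")
```

Going by the error message above, calling `peek_semantic_type("seqs.qza")` on the artifact from the question would be expected to report `FeatureData[Sequence]` rather than one of the `SampleData[...]` types the command asks for.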
Thank you, Nicholas! Let me rephrase my question to wrap this up: may I skip the dereplication step and go straight to OTU picking? The clustering process already arranges the sequences into clusters, so wouldn't identical (replicated) sequences end up in the same cluster anyway?
No. Dereplicated sequences are required. Dereplication speeds up the process, but most importantly the OTU clustering methods in QIIME 2 require specific input types, namely a feature table and FeatureData[Sequence] artifacts, which are the outputs of the dereplicate method.
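To make it concrete why dereplication yields those two outputs, here is a minimal conceptual sketch in plain Python. It is not the QIIME 2 or vsearch implementation: the function name `dereplicate`, the dictionary-based inputs and outputs, and the choice of a SHA-1 hash of the sequence as the feature ID are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def dereplicate(reads_by_sample):
    """Collapse identical reads into unique features.

    reads_by_sample: dict mapping sample ID -> list of read strings.
    Returns (table, sequences):
      table     -- sample ID -> {feature ID: count in that sample}
      sequences -- feature ID -> the unique sequence
    """
    table = {}
    sequences = {}
    for sample_id, reads in reads_by_sample.items():
        counts = Counter()
        for read in reads:
            seq = read.upper()
            # Illustrative feature ID: SHA-1 hash of the sequence itself.
            feature_id = hashlib.sha1(seq.encode("ascii")).hexdigest()
            sequences[feature_id] = seq
            counts[feature_id] += 1
        table[sample_id] = dict(counts)
    return table, sequences


reads = {"s1": ["ACGT", "ACGT", "TTGA"], "s2": ["ACGT"]}
table, sequences = dereplicate(reads)
print(len(sequences))  # 2 unique sequences across both samples
```

The per-sample counts play the role of the feature table and the unique sequences play the role of the FeatureData[Sequence] artifact, which is why the clustering step cannot start from the raw per-sample reads alone.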