First, thanks a lot for all the development in qiime2 and the help provided here, you are amazing people!
I need to analyze several 454 and ion-torrent datasets. I would like to use clustering-vsearch-otus first, and it would be great to do the whole analysis in qiime2. I know that there is a straightforward tutorial for this already in qiime2: Clustering sequences into OTUs using q2-vsearch.
However, the tutorial starts with data that have already been quality-controlled.
Short question: how can I quality-control my 454 data in qiime2 before clustering, including trimming all reads to a fixed length? (I would like to avoid switching back and forth between qiime1 and qiime2.)
Same question, expanded (attached: quality-demux-filter-stats.qzv, 1.2 MB):
According to the qiime2 tutorial, the data must meet these requirements before clustering:
- non-biological sequences are removed
- reads are all trimmed to the same length
- low-quality reads are discarded
So far I have been able to do all of these steps in qiime2 except TRIMMING to the same length (I want 300 bp, based on a FastQC analysis of the raw data).
I am running the following commands (following the flowchart for single-end barcoded data in Overview of QIIME 2 Plugin Workflows, QIIME 2 2018.11.0 documentation):
qiime cutadapt demux-single --i-seqs raw_data.qza --o-per-sample-sequences demux.qza --o-untrimmed-sequences untrimmed.qza --m-barcodes-file metadata.tsv --m-barcodes-column BarcodeSequence
qiime cutadapt trim-single --i-demultiplexed-sequences demux.qza --p-front AGAGTTTGATCMTGGCTCAG --p-adapter GCTGCCTCCCGTAGGAGT --o-trimmed-sequences demux-trimmed.qza
qiime quality-filter q-score --i-demux demux-trimmed.qza --o-filtered-sequences quality-demux-filtered.qza --o-filter-stats quality-demux-filter-stats.qza --p-min-quality 25
How can I now trim all reads to 300 bp? I don't understand the purpose of the option “--p-min-length-fraction” here, as it apparently cannot be used to set a fixed read length (say 300 bp).
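To make the goal concrete, this is roughly the operation I am after, sketched outside qiime2 with plain awk. It is only an illustration under assumptions: `truncate_fastq` is a hypothetical helper name (not a qiime2 command), the input is assumed to be an uncompressed FASTQ with exactly four lines per record and Unix line endings, and reads shorter than the target length are simply dropped.

```shell
# Hypothetical helper (NOT part of qiime2): keep only reads with at least
# LEN bases and cut each kept read (and its quality string) to exactly LEN.
# Assumes a 4-line-per-record FASTQ on stdin; writes FASTQ to stdout.
truncate_fastq() {
  len="${1:-300}"   # target length, defaults to 300 bp
  awk -v len="$len" '
    NR % 4 == 1 { header = $0 }          # @read-id line
    NR % 4 == 2 { seq = $0 }             # bases
    NR % 4 == 0 {                        # quality line closes the record
      if (length(seq) >= len) {
        print header
        print substr(seq, 1, len)        # first LEN bases
        print "+"
        print substr($0, 1, len)         # matching first LEN quality scores
      }
    }
  '
}
```

It would be called as `truncate_fastq 300 < sample.fastq > sample.trunc300.fastq` (file names are placeholders). What I am looking for is the equivalent of this step inside qiime2, so that the truncated reads stay in a .qza artifact.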
Thanks in advance for your help!
NOTE: I guess the “denoise-pyro” method (denoise-pyro: Denoise and dereplicate single-end pyrosequences, QIIME 2 2018.11.0 documentation) is not what I should use for this (?), because it applies the DADA2 error-correction algorithm and its output is an ASV table, which does not make sense to me as input for vsearch clustering. Please correct me if I am wrong.