Importing and Classifying already quality filtered, de-noised .fastq sequences

Purrsia_Felidae · July 9, 2018, 6:35pm

Hi there!!

I have Illumina 2x150bp paired-end .fastq sequences. They are not joined. These sequences are ready to go: they are already denoised, quality filtered, etc. I need to bypass the DADA2/Deblur steps because they aren't necessary. What I am eventually trying to do is get taxonomic information using the SILVA_132 database.

In QIIME 1 I used to run the following command:
pick_open_reference_otus.py -i seqs.fna -p SILVA_123_db.parameters.txt -r SILVA_128_QIIME_release/rep_set/rep_set_all/97/97_otus.fasta -o M40A_silva123_out

I am trying to run the equivalent in QIIME 2.

I followed this for import: Importing data — QIIME 2 2017.12.0 documentation
following the instructions under "Fastq manifest" formats.

Here is my command:
qiime tools import --type SampleData[PairedEndSequencesWithQuality] --input-path manifest_test.txt --output-path demux.test.qza --source-format PairedEndFastqManifestPhred33

This generated a demux.test.qza.
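For reference, a manifest like manifest_test.txt can be generated rather than typed by hand. The following is only a minimal sketch: the make_manifest function name and the <sample>_R1.fastq / <sample>_R2.fastq naming scheme are assumptions for illustration, not something from my actual data layout. It writes the comma-separated header (sample-id,absolute-filepath,direction) that the PairedEndFastqManifestPhred33 format expects, followed by one forward and one reverse line per sample.

```shell
# make_manifest READS_DIR MANIFEST
# Sketch: write a paired-end "Fastq manifest" for qiime tools import.
# ASSUMPTION: reads are named <sample>_R1.fastq and <sample>_R2.fastq.
# The body runs in a subshell ( ... ) so its variables stay local.
make_manifest() (
  reads_dir="$1"
  manifest="$2"

  # The manifest needs absolute file paths.
  abs_dir="$(cd "$reads_dir" && pwd)"

  # Header line of the manifest.
  echo "sample-id,absolute-filepath,direction" > "$manifest"

  for fwd in "$abs_dir"/*_R1.fastq; do
    # Skip the unexpanded pattern when no file matches.
    [ -e "$fwd" ] || continue
    sample="$(basename "$fwd" _R1.fastq)"
    rev="$abs_dir/${sample}_R2.fastq"
    if [ ! -e "$rev" ]; then
      echo "missing reverse reads for $sample" >&2
      exit 1
    fi
    printf '%s,%s,forward\n' "$sample" "$fwd" >> "$manifest"
    printf '%s,%s,reverse\n' "$sample" "$rev" >> "$manifest"
  done
)
```

Calling make_manifest reads manifest_test.txt (where reads is a hypothetical directory name) would then produce a file that can be passed to --input-path in the import command above.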

From here, I found this suggestion to bypass DADA2:

This was the 'general' CLI suggestion:
qiime vsearch dereplicate-sequences \
  --i-sequences demux.qza \
  --o-dereplicated-table table \
  --o-dereplicated-sequences rep-seqs

But, as you can see, it seems I need a 'table' to pass to the --o-dereplicated-table option, since I get 'Error: Missing option: --o-dereplicated-table' if I don't pass this flag. The only way I can find to create this table is to go through the DADA2 steps, which are unnecessary for my data set.

May I ask if there are any other options on how to generate a rep-seqs.qza?

Next, once I have generated the rep-seqs.qza, I want to run the following command:

qiime feature-classifier classify-sklearn \
  --i-classifier silva-132-99-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza

I found the silva-132-99-nb-classifier.qza here: Silva 132 classifiers

From here, I can attempt to move forward.

Many thanks for your advice!!

thermokarst · July 17, 2018, 1:46pm

Hey there @Purrsia_Felidae!

I would advise against skipping that level of QA/QC - perhaps this is a good opportunity for you to compare a few methods on one dataset!

Looks like you have a typo or a missing value in that command. The --o-* flags are for the outputs of the command, meaning the things the command generates or creates. In this case: sequences in, feature table and rep seqs out. Make sense? You won't be able to use your demuxed seqs as input to that command, though. You could convert these reads to the QIIME 1 post-split_libraries.py format, and then import and dereplicate them.
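Very roughly, that conversion route could look like the sketch below. Treat it as an illustration under assumptions, not a tested recipe for your data: it assumes each sample is a single FASTQ file of already joined (or single-end) reads with plain four-line records, and the function names fastq_to_seqs_fna and import_and_dereplicate are made up for this example. The first function writes FASTA records with >sampleid_N headers in the style of a QIIME 1 seqs.fna file; the second runs the import and dereplication steps, where table.qza and rep-seqs.qza are files the command creates.

```shell
# fastq_to_seqs_fna SAMPLE_ID FASTQ
# Sketch: print one sample's reads as QIIME 1 style seqs.fna records.
# ASSUMPTION: four-line FASTQ records, reads already joined or single-end.
fastq_to_seqs_fna() {
  awk -v sid="$1" '
    # Line 2 of every 4-line record is the sequence itself.
    NR % 4 == 2 { n++; printf(">%s_%d\n%s\n", sid, n, $0) }
  ' "$2"
}

# import_and_dereplicate SEQS_FNA
# Import the FASTA file, then dereplicate it with vsearch.
# Requires an activated QIIME 2 environment; not called automatically here.
import_and_dereplicate() {
  qiime tools import \
    --input-path "$1" \
    --output-path seqs.qza \
    --type 'SampleData[Sequences]'

  # Sequences in; feature table and representative sequences out.
  qiime vsearch dereplicate-sequences \
    --i-sequences seqs.qza \
    --o-dereplicated-table table.qza \
    --o-dereplicated-sequences rep-seqs.qza
}
```

You would call fastq_to_seqs_fna once per sample, appending the output of each call to a single seqs.fna, and then run import_and_dereplicate seqs.fna. The resulting rep-seqs.qza is the kind of artifact that classify-sklearn takes as --i-reads.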

I would highly encourage you to take some time reviewing the extensive QIIME 2 documentation, perhaps starting with the Overview tutorial!

Purrsia_Felidae · July 20, 2018, 5:21pm

Thank you for your suggestions and insight.
