Creating a V3 classifier killed.

pasare · January 7, 2021, 2:13pm

Hello Mike Robeson, I am getting this error message while trying to use RESCRIPt to create a V3 classifier.

qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads silva-138-ssu-nr99-seqs-388f-518r-uniq.qza \
    --i-reference-taxonomy silva-138-ssu-nr99-tax-388f-518r-derep-uniq.qza \
    --o-classifier silva-138-ssu-nr99-388f-518r-classifier.qza

Killed

Could this be related to my computing power?

SoilRotifer · January 7, 2021, 2:50pm

Yes. It is likely you do not have enough memory on your system. In my experience it is not unusual for training to require anywhere from 24 to 64 GB of RAM. If you do not have access to a machine with more memory, you can try alternative means of classification through the feature-classifier plugin (e.g. classify-consensus-vsearch or classify-consensus-blast), which do not require a trained classifier.
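
To check whether memory is the likely culprit before retrying, something like the following can help. It is a minimal sketch that assumes a Linux system exposing /proc/meminfo; the function name total_ram_gib and the 24 GiB threshold (the low end of the range above) are illustrative choices, not anything QIIME 2 defines.

```shell
# Report total RAM in whole GiB, read from /proc/meminfo (Linux only).
total_ram_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Assumed threshold: low end of the 24 to 64 GB range mentioned above.
min_gib=24

ram_gib=$(total_ram_gib)
ram_gib=${ram_gib:-0}   # fall back to 0 if /proc/meminfo could not be read

if [ "$ram_gib" -lt "$min_gib" ]; then
    echo "Only ${ram_gib} GiB of RAM: classifier training may be killed"
else
    echo "${ram_gib} GiB of RAM: this may be enough, depending on the reference size"
fi

# On many Linux systems the kernel log also records out-of-memory kills
# (may need root):  dmesg | grep -i 'killed process'
```

If the reported value is well below the threshold, reducing the reference database as described below is probably the more practical route.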

You can shrink the memory footprint in a couple of ways:

- Remove / filter taxonomic groups you are not interested in (e.g. Eukaryota) prior to making your classifier. You can use qiime taxa filter-seqs for this.
- Do not include the organism name as the species label (i.e. do not use the --p-include-species-labels flag when constructing your SILVA reference database). These labels may not be reliable anyway, as SILVA only curates down to genus level.

pasare · January 7, 2021, 4:35pm

Thank you for your quick suggestions.

Regarding the removal of Eukaryota and also Archaea: can I use the filter-seqs-length-by-taxon action from RESCRIPt?

For example:
qiime rescript filter-seqs-length-by-taxon \
    --i-sequences silva-138-ssu-nr99-seqs-cleaned.qza \
    --i-taxonomy silva-138-ssu-nr99-tax.qza \
    --p-labels Bacteria \
    --p-min-lens 1200 \
    --o-filtered-seqs silva-138-ssu-nr99-seqs-filt.qza \
    --o-discarded-seqs silva-138-ssu-nr99-seqs-discard.qza

SoilRotifer · January 7, 2021, 5:59pm

This action is intended to filter reference sequences by length according to the taxonomy provided, and to leave the rest of the data as is. That is, your command would only filter Bacteria based on the length you specified; all other data would remain in your final output file. This is why I recommended qiime taxa filter-seqs.

If you would like to use qiime rescript filter-seqs-length-by-taxon, then you could use the following command by just setting impossibly high sequence lengths for the other taxonomic groups. But this would end up being much slower than the command I recommended as it will have to check the taxonomy of each entry, and then check the sequence length before it makes the decision to filter.

qiime rescript filter-seqs-length-by-taxon \
    --i-sequences silva-138-ssu-nr99-seqs-cleaned.qza \
    --i-taxonomy silva-138-ssu-nr99-tax.qza \
    --p-labels Bacteria Archaea Eukaryota \
    --p-min-lens 1200 9999 9999 \
    --o-filtered-seqs silva-138-ssu-nr99-seqs-filt.qza \
    --o-discarded-seqs silva-138-ssu-nr99-seqs-discard.qza

pasare · January 7, 2021, 7:05pm

Thanks again for your comment.

I would like to use the qiime taxa filter-seqs command; however, I am not too familiar with it.

Could you please provide me with some tips?

pasare · January 7, 2021, 7:05pm

Also, at which step should I run the qiime taxa filter-seqs command?

1. qiime rescript parse-silva-taxonomy \
    --i-taxonomy-tree taxtree-silva-138-nr99.qza \
    --i-taxonomy-map taxmap-silva-138-ssu-nr99.qza \
    --i-taxonomy-ranks taxranks-silva-138-ssu-nr99.qza \
    --o-taxonomy silva-138-ssu-nr99-tax.qza

2. qiime rescript cull-seqs \
    --i-sequences silva-138-ssu-nr99-seqs.qza \
    --o-clean-sequences silva-138-ssu-nr99-seqs-cleaned.qza

3. qiime rescript filter-seqs-length-by-taxon \
    --i-sequences silva-138-ssu-nr99-seqs-cleaned.qza \
    --i-taxonomy silva-138-ssu-nr99-tax.qza \
    --p-labels Archaea Bacteria Eukaryota \
    --p-min-lens 900 1200 1400 \
    --o-filtered-seqs silva-138-ssu-nr99-seqs-filt.qza \
    --o-discarded-seqs silva-138-ssu-nr99-seqs-discard.qza

4. qiime rescript dereplicate \
    --i-sequences silva-138-ssu-nr99-seqs-filt.qza \
    --i-taxa silva-138-ssu-nr99-tax.qza \
    --p-rank-handles 'silva' \
    --p-mode 'uniq' \
    --o-dereplicated-sequences silva-138-ssu-nr99-seqs-derep-uniq.qza \
    --o-dereplicated-taxa silva-138-ssu-nr99-tax-derep-uniq.qza

5. qiime feature-classifier extract-reads \
    --i-sequences silva-138-ssu-nr99-seqs-derep-uniq.qza \
    --p-f-primer ACWCCTACGGGWGGCAGCAG \
    --p-r-primer ATTACCGCGGCTGCTGG \
    --p-n-jobs 2 \
    --p-read-orientation 'forward' \
    --o-reads silva-138-ssu-nr99-seqs-388f-518r.qza

6. qiime rescript dereplicate \
    --i-sequences silva-138-ssu-nr99-seqs-388f-518r.qza \
    --i-taxa silva-138-ssu-nr99-tax-derep-uniq.qza \
    --p-rank-handles 'silva' \
    --p-mode 'uniq' \
    --o-dereplicated-sequences silva-138-ssu-nr99-seqs-388f-518r-uniq.qza \
    --o-dereplicated-taxa silva-138-ssu-nr99-tax-388f-518r-derep-uniq.qza

7. qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads silva-138-ssu-nr99-seqs-388f-518r-uniq.qza \
    --i-reference-taxonomy silva-138-ssu-nr99-tax-388f-518r-derep-uniq.qza \
    --o-classifier silva-138-ssu-nr99-388f-518r-classifier.qza

SoilRotifer · January 7, 2021, 7:13pm

The easiest approach, as I suggested earlier, would be to do this just prior to making your classifier. This way you do not have to re-run several steps. So, do this between steps 6 and 7, i.e. after the second rescript dereplicate and before fit-classifier-naive-bayes.
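
As a rough sketch of what that extra step could look like, the command below wraps qiime taxa filter-seqs in a small shell function and applies it to the dereplicated sequences and taxonomy produced just before training. The function name, the output file name, and the d__ domain prefixes in the usage line are assumptions (check how the domains are actually written in your taxonomy file), and it assumes the taxonomy artifact can be passed to training unchanged. Run qiime taxa filter-seqs --help to confirm the options in your QIIME 2 version.

```shell
# Sketch: drop unwanted lineages from the dereplicated V3 reads before training.
# drop_unwanted_taxa and the "-bact" output name are hypothetical.
# $1 is a comma-separated list of search terms to exclude.
drop_unwanted_taxa() {
    qiime taxa filter-seqs \
        --i-sequences silva-138-ssu-nr99-seqs-388f-518r-uniq.qza \
        --i-taxonomy silva-138-ssu-nr99-tax-388f-518r-derep-uniq.qza \
        --p-exclude "$1" \
        --p-mode contains \
        --o-filtered-sequences silva-138-ssu-nr99-seqs-388f-518r-uniq-bact.qza
}

# Usage (assumes domains are labelled d__Eukaryota and d__Archaea):
#   drop_unwanted_taxa 'd__Eukaryota,d__Archaea'
```

The filtered file would then be given to fit-classifier-naive-bayes as --i-reference-reads in place of the unfiltered one.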

SoilRotifer · January 7, 2021, 7:18pm

Additionally, you can check out the QIIME 2 documentation, and work through the tutorials.

You can view a list of all available QIIME 2 plugins and actions. Here is the information for the qiime taxa filter-seqs command. Do not forget you can add --help to any of the qiime commands to read the help text.

pasare · January 12, 2021, 8:32am

Thank you very much for your suggestions. I got it to work.
