Broken pipe error with feature classifier

cecibe · March 22, 2018, 12:29pm

I tried to run classify-sklearn. I wrote:
qiime feature-classifier classify-sklearn --i-classifier silva-119-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --o-classification Silvataxonomy.qza
After waiting a long time, this appeared:
Plugin error from feature-classifier:
Debug info has been saved to /tmp/qiime2-q2cli-err-u89rmbtf.log

After that, this appeared:
[Parent 1819, Gecko_IOThread] WARNING: pipe error: Tubería rota: file /build/firefox-CQifnS/firefox-59.0.1+build1/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 709

Can someone explain what my mistake was? Thanks.

ebolyen · March 22, 2018, 9:12pm

Hi @cecibe,

It's possible you ran out of memory (or the process was killed for other reasons) since I don't see any message between:

Plugin error from feature-classifier:

and

Debug info has been saved to /tmp/qiime2-q2cli-err-u89rmbtf.log

(There's supposed to be something there.)

If you still have this file: /tmp/qiime2-q2cli-err-u89rmbtf.log that would be helpful to look at, but I don't expect to see anything in it really.

Are you running this on VirtualBox? If so, try increasing the memory available to the machine.

And then as far as the Gecko_IOThread "pipe error" warning goes:

You can ignore that; it's just Firefox complaining about something unimportant. The reason you're seeing it at all is probably that you used qiime tools view at some point. That tool launches the default browser, and when browsers are launched from a terminal they tend to write all of their complaints to it.

Nothing to worry about though.

cecibe · March 23, 2018, 7:33pm

Thanks for your help. I am not using VirtualBox… I have tried again but the same message appeared. My problem is that I expected to find archaea, but when I ran the Greengenes database I did not find any, which is why I tried SILVA. Is there some way to run a classifier specific for archaea? My seqs were amplified with 341F and 805R. Thanks.

Nicholas_Bokulich · March 23, 2018, 7:53pm

Even without VirtualBox, this is probably still a memory error. Some users report needing 16-32 GB of memory to train or classify with the SILVA reference database. You can set --p-reads-per-batch to a lower value (e.g., 2000?) to reduce this memory load at the expense of time; that's what I do to use SILVA on an 8 GB laptop.
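For illustration, here is a sketch of the command from the first post with a smaller batch size added. The file names come from that post and 2000 is just the tentative value suggested above. The READS_PER_BATCH, CMD and RUN variables are conveniences of this sketch, not part of qiime2. By default it only prints the command, so it can be checked before starting a long job.

```shell
# Sketch: rerun the classifier with smaller batches to lower peak memory.
# File names are copied from the original post; 2000 is only a starting guess.
READS_PER_BATCH=2000

# Assemble the command in the positional parameters so it can be printed first.
set -- qiime feature-classifier classify-sklearn \
  --i-classifier silva-119-99-515-806-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --p-reads-per-batch "$READS_PER_BATCH" \
  --o-classification Silvataxonomy.qza

CMD="$*"
echo "$CMD"

# Dry run by default; set RUN=1 (with the qiime2 environment active) to execute.
if [ "${RUN:-0}" = 1 ]; then
  "$@"
fi
```

If the job still dies without an error message, lowering READS_PER_BATCH further is the obvious next thing to try, at the cost of a longer run.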

Greengenes does contain archaea, 4942 entries to be exact:

grep 'Archaea' gg_13_8_otus/taxonomy/99_otu_taxonomy.txt | wc -l
    4942

(that's not to imply that greengenes or classifiers are infallible! just that you may not need SILVA, and you might not have any archaea)

Yes, you could train a classifier specific for Archaea. You would:

1. filter out all non-archaea sequences from your database of choice (SILVA?)
2. filter out all non-archaea taxonomies from your database of choice (SILVA?)
3. import both to qiime2
4. train a feature classifier in qiime2, or use one of the alignment-based classifiers. The latter might be better with a constrained reference database like this because you can set a similarity threshold: anything that is less than X% similar to the reference sequences (i.e., bacteria; I don't know what is a good % to use for separating out bacteria) will be unclassified (classify-sklearn should also leave these unclassified, but you just don't have control over a specified similarity like this)
5. classify your sequences. Anything unclassified is probably bacteria, and you can filter these out and classify separately.
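The two filtering steps can be sketched with standard shell tools before anything is imported into qiime2. This is only a minimal sketch under some assumptions: the taxonomy file is tab-separated with the sequence ID in column 1 and the lineage in column 2 (as in the Greengenes file grepped above), the FASTA headers start with that same ID, and the file names and the three tiny records are made up for illustration. Importing and training are not shown.

```shell
# Tiny made-up reference files standing in for the real database (illustration only).
printf 'otu1\tk__Bacteria; p__Firmicutes\notu2\tk__Archaea; p__Euryarchaeota\notu3\tk__Archaea; p__Crenarchaeota\n' > ref_taxonomy.txt
printf '>otu1\nACGTACGT\n>otu2\nGGCCGGCC\nTTAA\n>otu3\nCCGGAATT\n' > ref_seqs.fasta

# 1. Keep only taxonomy lines whose lineage (column 2) mentions Archaea.
awk -F'\t' '$2 ~ /Archaea/' ref_taxonomy.txt > archaea_taxonomy.txt

# 2. Keep only the FASTA records whose ID appears in the filtered taxonomy.
awk -F'\t' '
  FILENAME == ARGV[1] { keep[$1] = 1; next }   # first file: remember archaeal IDs
  /^>/ {                                       # FASTA header: decide whether to keep the record
    id = substr($1, 2)
    sub(/ .*/, "", id)                         # drop any description after the ID
    show = (id in keep)
  }
  show                                         # print header and sequence lines of kept records
' archaea_taxonomy.txt ref_seqs.fasta > archaea_seqs.fasta

# Both counts should agree before the files are imported.
N_TAX=$(wc -l < archaea_taxonomy.txt | tr -d ' ')
N_SEQ=$(grep -c '^>' archaea_seqs.fasta)
echo "taxonomy entries: $N_TAX, sequences: $N_SEQ"
```

The /Archaea/ pattern is deliberately loose so that it matches both Greengenes-style (k__Archaea) and SILVA-style lineage strings. Check it against the actual taxonomy file before relying on it.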

Good luck!
