P1: on dereplication and removing features
qiime vsearch dereplicate-sequences \
  --i-sequences total.qza \
  --p-min-seq-length 50 \
  --p-min-unique-size 10 \
  --o-dereplicated-table table.qza \
  --o-dereplicated-sequences rep-seqs.qza

Plugin error from vsearch:
Mapping not provided for observation identifier: D1_10122. If this identifier should not be updated, pass strict=False.
Debug info has been saved to /tmp/qiime2-q2cli-err-62rjm8jn.log
Dereplication is used to reduce data size before clustering.
Typically, this is a lossless process for features; all unique reads become unique features in the output table.
In this example, it's a lossy process: features are discarded if they are too short (<50 bp) or too rare (total count <10). That is what causes the error about the missing feature identifier (D1_10122 above).
Keeping all features, by leaving out --p-min-seq-length and --p-min-unique-size, fixes the error.
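If you still want to drop rare features, one option is to dereplicate without the filters and then filter the table afterwards. Here's a rough sketch, assuming the defaults of dereplicate-sequences keep every feature. The output names table-min10.qza and rep-seqs-min10.qza are placeholders I made up, and the derep_then_filter wrapper is only there so the snippet does nothing when QIIME 2 isn't on your PATH.

```shell
# Sketch only: dereplicate with no length/abundance filters, then filter afterwards.
# File names ending in -min10.qza are hypothetical placeholders.
derep_then_filter() {
  # Step 1: dereplicate, keeping every unique read as a feature
  qiime vsearch dereplicate-sequences \
    --i-sequences total.qza \
    --o-dereplicated-table table.qza \
    --o-dereplicated-sequences rep-seqs.qza || return 1

  # Step 2: drop features whose total count is below 10
  qiime feature-table filter-features \
    --i-table table.qza \
    --p-min-frequency 10 \
    --o-filtered-table table-min10.qza || return 1

  # Step 3: keep only the sequences still present in the filtered table
  qiime feature-table filter-seqs \
    --i-data rep-seqs.qza \
    --i-table table-min10.qza \
    --o-filtered-data rep-seqs-min10.qza
}

# Run the steps only when the qiime command is available
if command -v qiime >/dev/null 2>&1; then
  derep_then_filter
else
  echo "qiime not found; activate your QIIME 2 environment first" >&2
fi
```

Filtering this way keeps the table and the sequences in sync, because both are cut down from the same complete set of features. This sketch only covers the count filter, not the 50 bp length cutoff.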
P2: pickling??
This can happen when the device is out of space. Databases are pretty big, so make sure you have plenty of space on the computer or worker node!
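A quick way to check free space is below. This is a sketch that assumes temporary files go to $TMPDIR, falling back to /tmp (which is where your debug log was written). The variable names tmp_dir and avail_kb are just for illustration.

```shell
# Assumption: temp files are written to $TMPDIR, or /tmp when TMPDIR is unset
tmp_dir="${TMPDIR:-/tmp}"

# Human-readable free space for the temp directory and the current working directory
df -h "$tmp_dir" .

# Available space on the temp filesystem in kilobytes (-P forces one line per filesystem)
avail_kb=$(df -Pk "$tmp_dir" | awk 'NR==2 {print $4}')
echo "Available in $tmp_dir: ${avail_kb} KB"
```

If the temp filesystem turns out to be the small one, pointing TMPDIR at a larger disk before rerunning may help.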
To investigate more, can you post that log file? It's the one named in the "Debug info has been saved to..." line.