I’m running QIIME 2 v2019.10 on a cluster and following the "Alternative methods of read-joining in QIIME 2" tutorial with my own data, and I am stuck at the deblur denoising/sub-OTU picking step. When I ran the denoising command, I got the following error:
Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-449>", line 2, in denoise_16S
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in callable_executor
    output_views = self._callable(**view_args)
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_deblur/_denoise.py", line 100, in denoise_16S
    hashed_feature_ids=hashed_feature_ids)
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_deblur/_denoise.py", line 150, in denoise_helper
    ids_with_underscores = df[df.index.str.contains('_')].index.tolist()
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/accessor.py", line 175, in __get__
    accessor_obj = self._accessor(obj)
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/strings.py", line 1917, in __init__
    self._inferred_dtype = self._validate(data)
  File "/opt/conda/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/strings.py", line 1967, in _validate
    raise AttributeError("Can only use .str accessor with string " "values!")
AttributeError: Can only use .str accessor with string values!

Plugin error from deblur: Can only use .str accessor with string values!
The test dataset in the tutorial runs fine, so the problem is with my file rather than the QIIME 2 setup. The only thing I can think of is that the script is looking for samples that no longer exist because they were filtered out at the previous step (it is an old labeling experiment, so some samples had very few reads to start with). If that is the case, how do I filter these fastq files out of the .qza file? Or do you have any other ideas about what might have gone wrong?
It seems like there is an issue with the sample IDs. Can you share what your sample IDs look like?
For example, the sample IDs for the tutorial data all have the format XXXYYYY.Y[.Y], where X is a letter, Y is a digit, and [] marks an optional part. E.g., BAQ1552.1.1, BAQ2420.2, and YUN3856.1.3 are some of the sample IDs in the file.
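As a rough illustration of that pattern, here is a small Python check. It is only a sketch: the function name is made up for this example, and the regular expression is my literal reading of XXXYYYY.Y[.Y] (three letters, four digits, then one or two single-digit dotted suffixes), not something QIIME 2 itself enforces.

```python
import re

# Literal reading of XXXYYYY.Y[.Y]: 3 letters, 4 digits, ".digit", optional ".digit".
# This pattern is an assumption based on the description above, not a QIIME 2 rule.
TUTORIAL_ID_PATTERN = re.compile(r"[A-Za-z]{3}\d{4}\.\d(\.\d)?")


def is_tutorial_style_id(sample_id):
    """Return True if sample_id follows the tutorial's XXXYYYY.Y[.Y] layout."""
    return TUTORIAL_ID_PATTERN.fullmatch(str(sample_id)) is not None


for candidate in ["BAQ1552.1.1", "BAQ2420.2", "YUN3856.1.3", "17"]:
    print(candidate, is_tutorial_style_id(candidate))
```

The three tutorial IDs pass the check, while a purely numeric ID such as 17 does not.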
Should the sample IDs here match the sample IDs in the fastq headers of the corresponding files?
In case it affects how you interpret the output: I ran the commands above in the QIIME 2 VirtualBox image installed on my laptop, but I normally run QIIME 2 on a cluster. My laptop isn’t powerful enough for some of the steps in QIIME 2, and on the cluster QIIME 2 isn’t installed in a way that lets me access it like a Python package for the commands you suggested above.
I did upload the data as a fastq file and used the --type EMPPairedEndSequences option, rather than making a manifest file. The sequencing was a Golay-barcoded, multiplexed 2×150 nt Illumina run using the original EMP primers from Caporaso. Is the --type EMPPairedEndSequences option only for the newer EMP protocol? I’m not sure whether QIIME 2 would make a manifest file upon uploading the data and, if so, how I would access it. Sorry!
So the reason you are getting the error above is that the entries in the #SampleID column are integers, and the deblur step applies a string operation to the sample IDs.
Are you sure these match the #SampleID values in your sample metadata?
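To see why integer IDs trigger this, the following sketch mimics the df.index.str.contains call that appears in the traceback. It is not the deblur plugin's code: the helper name and the toy table are made up for illustration, and only the pandas behaviour is the point.

```python
import pandas as pd


def str_accessor_works(ids):
    """Return True if pandas allows .str operations on an index built from ids."""
    # Toy stand-in for a per-sample table; the "count" column is a placeholder.
    table = pd.DataFrame({"count": [0] * len(ids)}, index=ids)
    try:
        # Same kind of call as in the traceback: a string test on the index.
        table.index.str.contains("_")
        return True
    except AttributeError:
        # pandas refuses the .str accessor on a non-string (e.g. integer) index.
        return False


print(str_accessor_works([1, 2, 3]))                 # integer sample IDs
print(str_accessor_works(["sample-1", "sample-2"]))  # string sample IDs
```

With integer IDs the index gets a numeric dtype and the .str accessor raises AttributeError, whereas IDs containing non-numeric characters are kept as strings.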
If so, then I think you can work around the error by creating a new mapping file with each #SampleID remapped to something containing non-numeric characters, e.g., ['sample-1', 'sample-2', 'sample-3', ...]. You will need to make the same change in both the mapping file and the sample metadata file so that the IDs still match.
If these do not match your sample metadata, you may need to do some more investigating to determine how these sample IDs correspond to the ones in the fastq headers you showed above.
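One possible way to do the renaming is sketched below in plain Python. It assumes a tab-separated file whose first column holds the sample ID and whose header or comment lines start with "#" (such as "#SampleID"). The function name, the "sample-" prefix, and the barcodes in the example are placeholders, not values from your data.

```python
def prefix_sample_ids(tsv_text, prefix="sample-"):
    """Prepend prefix to the first column of every data row in a TSV string."""
    out_lines = []
    for line in tsv_text.splitlines():
        # Leave blank lines and header/comment lines (starting with '#') alone.
        if line and not line.startswith("#"):
            sample_id, sep, rest = line.partition("\t")
            line = prefix + sample_id + sep + rest
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"


# Made-up two-sample mapping file; the barcodes are placeholders.
example = "#SampleID\tBarcodeSequence\n1\tAAAAAAAAAAAA\n2\tCCCCCCCCCCCC\n"
print(prefix_sample_ids(example), end="")
```

Run the same function with the same prefix over both the barcode mapping file and the sample metadata file, then re-run the demultiplexing step with the renamed mapping file so the IDs stay consistent downstream.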