Hello everyone! I've already read a previous forum post from a user with the same error.
I used the ZymoBIOMICS Gut Microbiome Standard as the mock community.
In that post, @Nicholas_Bokulich said that the error comes from the expected and observed taxonomies not matching.
Based on this, I copied the taxonomy strings for the expected features from the taxa_barplot.qzv (433.9 KB) that I generated from this data into the expected taxonomy file, so in theory the observed and expected taxonomies should match exactly. I've done this both at the species level and at the deepest level the SILVA database can resolve (genus for most taxa). Both result in the same error.
This is the expected taxonomy file down to species level: zymo-taxonomy.tsv (2.0 KB).
This is the expected taxonomy file at the genus/species level: zymo-taxonomy-genus.tsv (1.6 KB).
And this is the observed frequency table:
zymo-freq-table.qza (133.7 KB)
I can't figure out where the taxonomy file is not matching. I'm sure there's something I'm missing.
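In case it helps with diagnosing this, here is a minimal sketch of how the feature IDs of the two tables could be compared line by line once both are available as TSV (for example after `qiime tools export` followed by `biom convert --to-tsv`). The helper names `feature_ids` and `compare_ids` are hypothetical and not part of QIIME 2, and the sketch assumes tab-separated files in which header and comment lines start with `#` and the feature ID is the first column.

```shell
# Hypothetical helpers (not part of QIIME 2): compare the first-column IDs of two TSVs.
# Assumption: header/comment lines start with '#'; a header without '#' would be
# treated as an ordinary ID.

# Print the sorted, unique first column of a TSV, skipping '#' lines and blank IDs.
feature_ids() {
    awk -F '\t' '!/^#/ && $1 != "" { print $1 }' "$1" | LC_ALL=C sort -u
}

# Usage: compare_ids expected.tsv observed.tsv
compare_ids() {
    tmp_exp=$(mktemp) && tmp_obs=$(mktemp) || return 1
    feature_ids "$1" > "$tmp_exp"
    feature_ids "$2" > "$tmp_obs"
    echo "shared:"
    LC_ALL=C comm -12 "$tmp_exp" "$tmp_obs"   # IDs present in both tables
    echo "only in expected:"
    LC_ALL=C comm -23 "$tmp_exp" "$tmp_obs"   # IDs missing from the observed table
    echo "only in observed:"
    LC_ALL=C comm -13 "$tmp_exp" "$tmp_obs"   # IDs missing from the expected table
    rm -f "$tmp_exp" "$tmp_obs"
}
```

If the "shared:" section comes back empty, the two tables have no feature IDs in common, which would be consistent with the mismatch described in the earlier post. The comparison is an exact string match, so differences in whitespace or rank prefixes would also show up as mismatches.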
Below is the error message I received:
Traceback (most recent call last):
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2cli/commands.py", line 468, in __call__
results = action(**arguments)
File "<decorator-gen-626>", line 2, in evaluate_composition
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
outputs = self._callable_executor_(
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 558, in _callable_executor_
ret_val = self._callable(output_dir=temp_dir, **view_args)
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_quality_control/quality_control.py", line 77, in evaluate_composition
results = _evaluate_composition(
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_quality_control/_utilities.py", line 127, in _evaluate_composition
score_plot = _pointplot_multiple_y(
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_quality_control/_utilities.py", line 281, in _pointplot_multiple_y
sns.pointplot(data=results, x=xval, y=score, ax=axes, color=color)
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/seaborn/categorical.py", line 2839, in pointplot
plotter = _PointPlotter(x, y, hue, data, order, hue_order,
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/seaborn/categorical.py", line 1603, in __init__
self.establish_colors(color, palette, 1)
File "/hpc/home/klt75/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/seaborn/categorical.py", line 707, in establish_colors
lum = min(light_vals) * .6
ValueError: min() arg is an empty sequence
These are the commands I used:
# ----------------- SCRIPT START -------------------- #
# load parameters (config.sh is expected to define WKPATH, tableQZA and MAPname)
dos2unix ./config.sh
source ./config.sh
# setting input/output variables
echo -e "setting input/output variables"
inputDir="${WKPATH}/output/04-classify/qza"
# directory containing the .qza artifacts from the classify step (e.g. rep-seqs.qza)
outputDir="${WKPATH}/output/05-zymoQC"
outputDirQZA="${outputDir}/qza"
outputDirQZV="${outputDir}/qzv"
# directories to place results from script
# if previous output folder exists, delete it
echo -e "checking for old folders, will remove to rerun analysis"
if [ -d "$outputDir" ]
then
echo -e "Previous output folder exists, deleting now..."
rm -Rfv -- "$outputDir"
fi
# -R deletes recursively, -f ignores nonexistent files, -v is verbose
# '--' : no more flags for the rm command
# making new output folders
echo -e "creating new output folders"
mkdir -p "${outputDir}"/{qza,qzv}
# -p ; make parent directories if needed
echo -e "Input directory is...$inputDir"
echo -e "Output directories are...
main folder: $outputDir
qza: $outputDirQZA
qzv: $outputDirQZV"
echo -e "setting up zymo reference variables..."
echo -e "$(date)"
ZYMOrefseq="/hpc/group/kimlab/Qiime2/reference/zymo-refs/zymo-seqs.fasta"
ZYMOtax="/hpc/group/kimlab/Qiime2/reference/zymo-refs/zymo-taxonomy-genus.tsv"
echo -e "Zymo references are...
zymo ref seqs: $ZYMOrefseq
zymo taxonomy: $ZYMOtax"
echo -e "finished setting up folders and variables"
echo -e "$(date)"
# import expected zymo relative abundances (taxonomy table) into qiime2 format
echo -e "importing expected taxonomy into qiime2 format"
echo -e "$(date)"
biom convert \
-i "$ZYMOtax" \
-o "$outputDir"/expected-taxonomy.biom \
--table-type="OTU table" \
--to-json
## convert tsv into biom
qiime tools import \
--type "FeatureTable[RelativeFrequency]" \
--input-path "$outputDir"/expected-taxonomy.biom \
--input-format BIOMV100Format \
--output-path "$outputDirQZA"/expected-taxonomy.qza
## import biom into rel.freq feature table
# import expected zymo sequences into qiime2 format
echo -e "importing expected sequences into qiime2 format"
echo -e "$(date)"
qiime tools import \
--input-path "$ZYMOrefseq" \
--output-path "$outputDirQZA"/expected-seqs.qza \
--type 'FeatureData[Sequence]'
## import fasta file into qza format
# filter ASV table + rep-seqs down to only the zymo controls
echo -e "filtering ASV and rep-seqs table to only include zymo controls"
echo -e "$(date)"
qiime feature-table filter-samples \
--i-table "$tableQZA" \
--m-metadata-file "$MAPname" \
--p-where '[control]="zymo"' \
--o-filtered-table "$outputDir"/zymo-table.qza
## filter table to only contain zymo controls
qiime feature-table filter-seqs \
--i-table "$outputDir"/zymo-table.qza \
--i-data "$inputDir/rep-seqs.qza" \
--o-filtered-data "$outputDirQZA"/zymo-rep-seqs.qza
# turn ASV table into a relative abundance table
echo -e "creating relative abundance table"
echo -e "$(date)"
qiime feature-table relative-frequency \
--i-table "$outputDir"/zymo-table.qza \
--o-relative-frequency-table "$outputDirQZA"/zymo-freq-table.qza
# compare expected vs actual frequencies
echo -e "compare expected vs actual relative abundances and sequences"
echo -e "$(date)"
qiime quality-control evaluate-composition \
--i-expected-features "$outputDirQZA"/expected-taxonomy.qza \
--i-observed-features "$outputDirQZA"/zymo-freq-table.qza \
--o-visualization "$outputDirQZV"/eval-mock-freq-test.qzv
# compare expected sequences to actual sequences
qiime quality-control evaluate-seqs \
--i-query-sequences "$outputDirQZA"/zymo-rep-seqs.qza \
--i-reference-sequences "$outputDirQZA"/expected-seqs.qza \
--o-visualization "$outputDirQZV"/eval-mock-seqs-test.qzv
echo -e "finished zymo QC"
echo -e "$(date)"