Hello, I'm new to QIIME 2 as well as microbiome bioinformatics. I want to make sure I'm importing a sample properly, and to figure out why the imported sample contains so few sequences and why its read pairs can't be joined.
When I extract and examine my fastq.gz files, the sequence records match the Casava 1.8 format exactly, but the filenames I was given (possibly renamed to retain anonymity) do not follow the naming convention needed for easy importing. I tried renaming them according to their lane number, but then I noticed that the sample contains reads from multiple lanes. It may be worth noting that I was given only one sample (2 fastq.gz files: R1 and R2) to work with.
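For reference, this is a minimal sketch of how the lanes in one file could be tallied. It assumes the headers follow the Casava 1.8 layout (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`) and that every record spans exactly four lines. The function name `count_lanes` and the example filename are made up for illustration.

```python
import gzip
from collections import Counter


def count_lanes(fastq_gz_path):
    """Tally reads per lane, taking the lane from the 4th colon-separated header field."""
    lanes = Counter()
    with gzip.open(fastq_gz_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Assumption: 4 lines per record, so headers sit on lines 0, 4, 8, ...
            if line_number % 4 != 0:
                continue
            # Drop the leading "@", keep the part before the first space,
            # then split it into its colon-separated fields.
            fields = line.strip()[1:].split(" ")[0].split(":")
            if len(fields) >= 4:
                lanes[fields[3]] += 1
    return lanes


# Hypothetical usage (the filename is a placeholder):
# print(count_lanes("sample_R1.fastq.gz"))
```

If more than one lane shows up, renaming the files after a single lane number would not describe the contents accurately.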
When I import using a manifest, it seems like a lot of sequences are missing and there isn't enough overlap to join pairs.
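To check whether sequences are really missing, a sketch like the following could count the records in R1 and R2 before import and confirm that the read IDs line up. It assumes well-formed four-line FASTQ records. The names `read_ids` and `compare_pairs` and the filenames are hypothetical.

```python
import gzip


def read_ids(fastq_gz_path):
    """Return the read IDs (header text before the first whitespace), in file order."""
    ids = []
    with gzip.open(fastq_gz_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Assumption: 4 lines per record; skip any blank trailing lines.
            if line_number % 4 == 0 and line.strip():
                ids.append(line[1:].split()[0])
    return ids


def compare_pairs(forward_path, reverse_path):
    """Summarize read counts and whether R1 and R2 list the same IDs in the same order."""
    forward_ids = read_ids(forward_path)
    reverse_ids = read_ids(reverse_path)
    return {
        "forward_reads": len(forward_ids),
        "reverse_reads": len(reverse_ids),
        "ids_match_in_order": forward_ids == reverse_ids,
    }


# Hypothetical usage (filenames are placeholders):
# print(compare_pairs("sample_R1.fastq.gz", "sample_R2.fastq.gz"))
```

Comparing these counts with the numbers shown in demux.qzv would tell whether reads are lost at import or were never in the files.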
This is the command I run, which completes without issue:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path pe-33-manifest.tsv \
  --output-path paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2
This is the manifest file: pe-33-manifest.tsv (131 Bytes)
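For anyone who cannot open the attachment, a manifest for the `PairedEndFastqManifestPhred33V2` format is a tab-separated file with the columns `sample-id`, `forward-absolute-filepath` and `reverse-absolute-filepath`. The sketch below shows one way such a file could be generated. The helper `write_manifest`, the sample ID and the filenames are placeholders, not the contents of my actual file.

```python
import os


def write_manifest(manifest_path, samples):
    """Write a paired-end V2 manifest; `samples` maps sample ID -> (forward, reverse) paths."""
    with open(manifest_path, "w") as out:
        out.write("sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n")
        for sample_id, (forward, reverse) in samples.items():
            # The manifest expects absolute paths, so resolve relative ones here.
            out.write(
                f"{sample_id}\t{os.path.abspath(forward)}\t{os.path.abspath(reverse)}\n"
            )


# Hypothetical usage (sample ID and filenames are placeholders):
# write_manifest(
#     "pe-33-manifest.tsv",
#     {"sample-1": ("sample_R1.fastq.gz", "sample_R2.fastq.gz")},
# )
```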
And here is the .qza and .qzv:
paired-end-demux.qza (2.9 MB)
demux.qzv (269.1 KB)
I then run:
qiime vsearch join-pairs \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --o-joined-sequences demux-joined.qza
demux-joined.qza (10.6 KB)
and then:
qiime demux summarize \
  --i-data demux-joined.qza \
  --o-visualization demux-joined.qzv \
  --verbose
which outputs the following:
Plugin error from demux:
Cannot describe a DataFrame without columns
Debug info has been saved to /var/folders/5l/48xfw0c53419vl06526g32kr0000gs/T/qiime2-q2cli-err-9oa7xejf.log
(qiime2-2019.10) ~/Desktop/fastq $ qiime demux summarize --i-data demux-joined.qza --o-visualization demux-joined.qzv --verbose
Traceback (most recent call last):
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
results = action(**arguments)
File "</opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-440>", line 2, in summarize
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
output_types, provenance)
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 445, in callable_executor
ret_val = self._callable(output_dir=temp_dir, **view_args)
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_demux/_summarize/_visualizer.py", line 165, in summarize
forward_stats = _compute_stats_of_df(forward_scores)
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_demux/_summarize/_visualizer.py", line 92, in _compute_stats_of_df
percentiles=[0.02, 0.09, 0.25, 0.5, 0.75, 0.91, 0.98])
File "/opt/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/generic.py", line 10178, in describe
raise ValueError("Cannot describe a DataFrame without columns")
ValueError: Cannot describe a DataFrame without columns
Plugin error from demux:
Cannot describe a DataFrame without columns
See above for debug info.
I'm still new to all of this, but judging from the demux summary visualization, it looks like there just isn't enough overlap between the forward and reverse reads. Or is there something else I'm doing wrong? Is it not possible to join pairs when there is only one sample?
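My rough understanding of the overlap question is the following arithmetic: for an amplicon longer than either read, the expected overlap is the forward read length plus the reverse read length minus the amplicon length. The numbers in this sketch are hypothetical, since I don't know the target region or amplicon length of my data, and the minimum overlap is left as a parameter rather than assumed.

```python
def expected_overlap(forward_len, reverse_len, amplicon_len):
    """Bases shared by the two reads, assuming the amplicon is longer than either read."""
    return forward_len + reverse_len - amplicon_len


def can_join(forward_len, reverse_len, amplicon_len, min_overlap):
    """True when the expected overlap meets the minimum required by the joining step."""
    return expected_overlap(forward_len, reverse_len, amplicon_len) >= min_overlap


# Hypothetical numbers: 2 x 250 bp reads over a 460 bp amplicon.
print(expected_overlap(250, 250, 460))  # 250 + 250 - 460 = 40
# Truncating both reads to 200 bp would leave a 60 bp gap instead of an overlap.
print(expected_overlap(200, 200, 460))  # 200 + 200 - 460 = -60
```

A negative or very small result would mean the pairs cannot be joined no matter how many samples are present.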
Thanks.
Additional information
Mac / Mojave / 16GB RAM / 2.6 GHz Intel Core i7 / 500 GB HDD