Greetings QIIME community, I typically do my analysis in R after classification, tree building, and BIOM export, but I am giving QIIME 2’s other features a go.
I am starting with alpha and beta diversity metrics but have run into the following error:
All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=1): 429b5b12899efa00c5d61dc11c424c7f
I searched my table.qza for this feature ID but could not find it.
I am not sure how to remedy this. My next step was to check tree.nwk, which is upstream in the pipeline of the midpoint-rooted tree (rooted-tree-filtered.qza) used in the diversity commands. I opened the .nwk in a text editor and searched there, but still didn't find this missing tip/feature ID.
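A less manual version of that search is to compare the tip labels in the exported tree against the feature IDs in a tab-separated table. The sketch below is only an illustration. `missing_tips` is a hypothetical helper, not a QIIME 2 command, and it assumes unquoted tip labels in the Newick file and feature IDs in the first column of a TSV with one header row.

```shell
# Hypothetical helper: print feature IDs found in a TSV table but absent
# from the tips of a Newick tree.
# Assumptions: tip labels are unquoted; feature IDs are in the first TSV
# column; the first TSV line is a header; "#" lines are comments.
missing_tips() {
    tree_file="$1"
    table_file="$2"
    tips_tmp=$(mktemp)
    ids_tmp=$(mktemp)

    # Tip labels follow "(" or ","; internal node labels (such as support
    # values) follow ")" and are therefore not matched.
    grep -oE '[(,][^(),:;]+' "$tree_file" | tr -d '(,' | sort -u > "$tips_tmp"

    # First column of the table, without the header and comment lines.
    awk -F'\t' 'NR > 1 && $1 !~ /^#/ { print $1 }' "$table_file" | sort -u > "$ids_tmp"

    # Lines unique to the ID list, i.e. IDs with no matching tip.
    comm -23 "$ids_tmp" "$tips_tmp"

    rm -f "$tips_tmp" "$ids_tmp"
}
```

Called as `missing_tips tree.nwk feature-table.tsv`, it prints one line per ID that is in the table but is not a tip. The comparison is only meaningful if the TSV is exported from the same table that is passed to the diversity commands.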
Here are the scripts I’m running on my university’s cluster.
Import:
# ----------------Load Modules--------------------
module load qiime2/2018.4
# ----------------Housekeeping---------------------
#rm -r demux*.q*
cd data
# ----------------Commands------------------------
#Import Data in qiime2 artifact
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path /ufrc/strauss/emerickl/WCT/data/raw_data \
--source-format CasavaOneEightSingleLanePerSampleDirFmt \
--output-path demux-paired-end.qza
qiime demux summarize \
--i-data demux-paired-end.qza \
--o-visualization demux-paired-end.qzv
DADA
cp data/demux-paired-end.qza features/dada2input.qza
qiime dada2 denoise-paired \
--i-demultiplexed-seqs dada2input.qza \
--output-dir output \
--p-n-threads 14 \
--o-table table.qza \
--o-representative-sequences rep-seqs.qza \
--p-trunc-len-f 251 --p-trunc-len-r 250
Feature table
qiime feature-table summarize \
--i-table table.qza \
--o-visualization table.qzv \
--m-sample-metadata-file #PATH TO VALIDATED MAPPING FILE
qiime feature-table tabulate-seqs \
--i-data rep-seqs.qza \
--o-visualization rep-seqs.qzv
qiime diversity alpha-rarefaction \
--i-table table.qza \
--o-visualization alpha-rarefaction.qzv \
--p-max-depth 8200 \
--p-metrics chao1 \
--p-metrics simpson \
--p-metrics shannon \
--m-metadata-file #PATH TO VALIDATED MAPPING FILE
Taxonomy (classifier trainer at bottom)
qiime feature-classifier classify-sklearn \
--i-reads rep-seqs.qza \
--o-classification taxonomy.qza \
--i-classifier /ufrc/strauss/emerickl/SILVA_nb_99_V3-V4.qza
qiime metadata tabulate \
--m-input-file taxonomy.qza \
--o-visualization taxonomy.qzv
qiime taxa barplot \
--i-table table.qza \
--i-taxonomy taxonomy.qza \
--o-visualization taxa-bar-plots.qzv \
--m-metadata-file WCTmetaData.tsv
BIOM export and other stuff
qiime tools export \
table.qza \
--output-dir ../biom
qiime tools export \
taxonomy.qza \
--output-dir ../biom
module load qiime/1.9.1
cd ../biom
biom convert \
-i feature-table.biom \
-o feature-json.biom \
--table-type="OTU table" \
--to-json
sed -i -e 's/Taxon/taxonomy/' -e 's/Feature ID/FeatureID/' taxonomy.tsv
biom add-metadata \
-i feature-json.biom \
-o feature_w_tax.biom \
--observation-metadata-fp taxonomy.tsv \
--observation-header FeatureID,taxonomy,Confidence \
--sc-separated taxonomy --float-fields Confidence
filter_samples_from_otu_table.py \
-i feature_w_tax.biom \
-o filtered-table.biom \
-n 5000 #Pay attention to this number, change it according to the table visualization
filter_taxa_from_otu_table.py \
-i filtered-table.biom \
-o table_wo_chl_mit.biom \
-n D_2__Chloroplast,D_4__Mitochondria
normalize_table.py \
-i table_wo_chl_mit.biom \
-a DESeq2 \
--DESeq_negatives_to_zero \
-o DESeq2_table.biom
biom add-metadata \
-i DESeq2_table.biom \
-o DESeq2_w_tax.biom \
--observation-metadata-fp taxonomy.tsv \
--observation-header FeatureID,taxonomy,Confidence \
--sc-separated taxonomy --float-fields Confidence
normalize_table.py \
-i table_wo_chl_mit.biom \
-a CSS \
-o CSS_table.biom
biom convert \
-i table_wo_chl_mit.biom \
-o feature-table.tsv \
--to-tsv \
--table-type "OTU table"
sed -i s/"#OTU ID"/FeatureID/ feature-table.tsv
sed -i '1d' feature-table.tsv
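The feature-table.tsv written here is plain text, unlike a .qza, which is a zip archive and may not show an ID in a plain text search. That makes it a convenient place to look up a single feature ID, such as the one named in the error. A minimal sketch follows. `list_feature_ids` and `has_feature_id` are hypothetical helper names, and they assume the file has one header row followed by one feature per line.

```shell
# Hypothetical helper: print the feature IDs (first column) of a
# tab-separated table, skipping the header row and any "#" comment lines.
list_feature_ids() {
    awk -F'\t' 'NR > 1 && $1 !~ /^#/ { print $1 }' "$1"
}

# Hypothetical helper: succeed only if the exact ID appears in the first
# column (whole-line, fixed-string match, so partial hashes do not count).
has_feature_id() {
    list_feature_ids "$2" | grep -qxF -- "$1"
}
```

For example, `has_feature_id 429b5b12899efa00c5d61dc11c424c7f feature-table.tsv && echo present || echo absent` reports whether that ID survived the filtering steps above.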
Build Trees
qiime feature-table filter-seqs \
--i-data ../features/rep-seqs.qza \
--m-metadata-file feature-table.tsv \
--p-no-exclude-ids \
--o-filtered-data rep-seqs-filtered.qza
qiime alignment mafft \
--i-sequences rep-seqs-filtered.qza \
--p-n-threads 12 \
--o-alignment aligned-rep-seqs-filtered.qza
qiime alignment mask \
--i-alignment aligned-rep-seqs-filtered.qza \
--o-masked-alignment masked-aligned-rep-seqs-filtered.qza
qiime phylogeny fasttree \
--i-alignment masked-aligned-rep-seqs-filtered.qza \
--o-tree unrooted-tree-filtered.qza
qiime phylogeny midpoint-root \
--i-tree unrooted-tree-filtered.qza \
--o-rooted-tree rooted-tree-filtered.qza
qiime tools export \
rooted-tree-filtered.qza \
--output-dir .
SciKit train
qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fa \
--output-path SILVA_132_99_otus.qza
qiime tools import \
--type 'FeatureData[Taxonomy]' \
--source-format HeaderlessTSVTaxonomyFormat \
--input-path SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt \
--output-path SILVA_132_99_tax.qza
qiime feature-classifier extract-reads \
--i-sequences SILVA_132_99_otus.qza \
--p-f-primer GTGYCAGCMGCCGCGGTAA \
--p-r-primer GGACTACNVGGGTWTCTAAT \
--p-trunc-len 300 \
--o-reads SILVA_132_99_otus_515-926.qza
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads SILVA_132_99_otus_515-926.qza \
--i-reference-taxonomy SILVA_132_99_tax.qza \
--o-classifier SILVA_nb_99_V3-V4.qza