Problem Obtaining Proper Bar Graphs

Hello!

I'm extremely new to this, and kind of fumbling my way through.

I'm dealing with 16S sequences from water samples, looking for potential pathogens. I've gotten all the way to the end, where I visualize the .qzv bar plot files.

Essentially, the screenshot I attached is the problem.


The one with only green bars is not properly showing the different genera in the bar plot,
while the second screenshot comes from using another student's metadata (after being trimmed and denoised).

I believe the problem is when I'm trimming my own sequences?

Below is the order of inputs I use.

Activate QIIME with the following command (in inverted commas).

“conda activate qiime2-2023.5”

4). Import the manifest file to load the sequences and create a demux file (.qza). Use the following command.

qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path 16S-manifest.txt \
--output-path S16S-qiime-demux.qza \
--input-format PairedEndFastqManifestPhred33V2
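
Before importing, it can help to confirm that the manifest really matches what PairedEndFastqManifestPhred33V2 expects: a tab-separated file whose header columns are sample-id, forward-absolute-filepath and reverse-absolute-filepath. The function below is only a rough sketch; `check_manifest` is a made-up helper name and not part of QIIME 2.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QIIME 2): checks that a paired-end V2
# manifest is tab-separated, has the three expected header columns, and that
# every data row has exactly three fields.
check_manifest() {
    awk -F'\t' '
        NR == 1 {
            if ($1 != "sample-id" ||
                $2 != "forward-absolute-filepath" ||
                $3 != "reverse-absolute-filepath") {
                print "bad header"; bad = 1; exit
            }
            next
        }
        /^#/ || NF == 0 { next }   # skip comment lines and empty lines
        NF != 3 { print "line " NR ": expected 3 tab-separated fields"; bad = 1 }
        END { if (!bad) print "ok"; exit bad }
    ' "$1"
}

# Usage with the manifest from the command above:
# check_manifest 16S-manifest.txt
```

A manifest saved with commas or with Windows line endings would be reported as "bad header" by this check, which is a common reason for the import step to fail.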

5). To visualize the data, use this command to convert the .qza file into a .qzv file, which can be used to inspect the quality of the F&R reads. Select the interactive quality plot to see the read qualities. The output will be a visualization file for quality assessment.

qiime demux summarize \
--i-data S16S-qiime-demux.qza \
--o-visualization S16S-qiime-demux_sum.qzv

6). Use this link to view the quality control visualization on the website: “https://view.qiime2.org/”

7). To denoise the sequences and merge the F&R reads, use the following command.

qiime dada2 denoise-paired \
--i-demultiplexed-seqs S16S-qiime-demux.qza \
--p-trim-left-f 20 \
--p-trim-left-r 20 \
--p-trunc-len-f 270 \
--p-trunc-len-r 250 \
--o-table table-S16S-qiime.qza \
--o-representative-sequences rep-seqs-S16S-qiime-demux.qza \
--o-denoising-stats denoising-stats-S16S-qiime-demux.qza

This will create three files: i) a table file (frequency or feature table), ii) denoising stats (DADA2 stats), and iii) rep sequences (feature sequences).
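
Since trimming is the suspected step, a quick arithmetic check on the truncation lengths may help. Reads shorter than the truncation length are discarded by DADA2, and the truncated forward and reverse reads still need to overlap to be merged (DADA2's default minimum overlap is 12 bases). The sketch below estimates that overlap; `estimate_overlap` is a made-up helper, and the 410 bp amplicon length is an assumption for illustration, not a value measured from this dataset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimated overlap (in bases) between the truncated
# forward and reverse reads for an amplicon of a given length.
# trim-left removes bases from the outer ends of the amplicon, so it does not
# change the overlap in the middle.
estimate_overlap() {
    local trunc_f=$1 trunc_r=$2 amplicon_len=$3
    echo $(( trunc_f + trunc_r - amplicon_len ))
}

# Truncation lengths from the command above; 410 is an ASSUMED amplicon
# length, replace it with the expected length for your primer pair.
overlap=$(estimate_overlap 270 250 410)
echo "estimated overlap: ${overlap} bases"
```

If the estimate comes out close to or below 12, most read pairs would fail to merge and the feature table would end up nearly empty.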

8). This will convert the denoising stats .qza file into a .qzv file.

qiime metadata tabulate \
--m-input-file denoising-stats-S16S-qiime-demux.qza \
--o-visualization denoising-stats-S16S-qiime-demux.qzv

This gives a table that can be downloaded. It shows the number of sequences that passed each step of the filtering process.
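
The same numbers can also be checked on the command line. The sketch below assumes the denoising stats have been exported with `qiime tools export` and that the resulting tab-separated file has a header row with columns named "input" and "non-chimeric"; `retained_fraction` is a made-up helper name, and the export folder and file name in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print, per sample, the percentage of input reads that
# survived through to the "non-chimeric" column of a DADA2 stats TSV.
# Assumes a tab-separated file whose header row contains columns named
# "input" and "non-chimeric"; lines starting with "#" are skipped.
retained_fraction() {
    awk -F'\t' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "input") icol = i
                if ($i == "non-chimeric") ocol = i
            }
            next
        }
        /^#/ { next }
        icol && ocol && $icol > 0 {
            printf "%s\t%.1f%%\n", $1, 100 * $ocol / $icol
        }
    ' "$1"
}

# Usage (output folder and file name are assumptions):
# qiime tools export \
#   --input-path denoising-stats-S16S-qiime-demux.qza \
#   --output-path dada2-stats
# retained_fraction dada2-stats/stats.tsv
```

Very low percentages across all samples would point to the trimming and truncation settings rather than to the taxonomy steps.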

9). Create a metadata file as a TSV, which will hold all the sample names and all other sampling details.

10). This command will generate a .qzv table that can be visualized to assess the distribution of sequences across different samples (sand, nodule, rhizosphere).

qiime feature-table summarize \
--i-table table-S16S-qiime.qza \
--o-visualization table-S16S-qiime.qzv \
--m-sample-metadata-file Sequiota_metadata.tsv

11). This script will create a qzv file for rep sequences, number of sequences, length, and frequency

qiime feature-table tabulate-seqs \
--i-data rep-seqs-S16S-qiime-demux.qza \
--o-visualization rep-seqs-rep-seqs-S16S-qiime-demux.qzv

12). Build a phylogeny with QIIME using this command.

qiime phylogeny align-to-tree-mafft-fasttree \
--i-sequences rep-seqs-S16S-qiime-demux.qza \
--o-alignment aligned-rep-seqs-S16S.qza \
--o-masked-alignment masked-aligned-rep-seqs-S16S.qza \
--o-tree unrooted-tree.qza \
--o-rooted-tree rooted-tree.qza

This will create the rooted and unrooted tree .qza files.

13). Xxx

14). Clustering the ASV table: this command clusters the DADA2 features de novo, so the result is an OTU table at the chosen identity. Two output files will be created: a clustered table and a clustered rep seqs file. You can specify the cutoff level here, e.g. 97% identity or 99% identity.

qiime vsearch cluster-features-de-novo \
--i-table table-S16S-qiime.qza \
--i-sequences rep-seqs-S16S-qiime-demux.qza \
--p-perc-identity 0.97 \
--o-clustered-table table-16s-dn-97.qza \
--o-clustered-sequences rep-seqs-16s-dn-97.qza

15). This command will export the clustered table as “feature-table.biom” into the output folder.

qiime tools export \
--input-path table-16s-dn-97.qza \
--output-path asv-table_97

16). This command will convert feature-table.biom into a .tsv so it can be viewed. Run it, then copy the result into Excel. The export above writes the .biom file into a separate folder, so bring it into the current working directory first.

biom convert -i feature-table.biom -o 16s-ASV_97.txt --to-tsv

Taxonomy Inputs

The first thing to do is download a reference 16S rRNA database as 85_otus.fasta, and the taxonomy file that goes with it. Go to the “Moving Pictures” tutorial (QIIME 2 2020.8.0 documentation), go to the taxonomy section, and click the link “Training feature classifiers with q2-feature-classifier”. To download the taxonomy, select the data resources page (download URL: https://docs.qiime2.org/2020.8/data-resources/) and download the most recent Greengenes data, “Greengenes (16S rRNA) 13_8 (most recent)”. I used only the 85% OTU and 85% taxonomy files for this analysis.

18). This script will convert fasta ref file into .qza file.

qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path 85_otus.fasta \
--output-path 85_otus.qza

19). Do the same for taxonomy file.

qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path 85_otu_taxonomy.txt \
--output-path ref-taxonomy.qza

This will create a ref taxonomy file.
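
One thing worth checking at this point is how much genus-level information the chosen reference actually carries, since a reference clustered at a low identity keeps far fewer representative sequences. The sketch below counts how many entries in a Greengenes-style taxonomy file have a non-empty genus label. `count_genus_labels` is a made-up helper, and the 99% file name in the usage comment is an assumption about how the alternative file would be named.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<entries with a genus label> <total entries>"
# for a headerless, tab-separated taxonomy file (ID <tab> taxonomy string).
# Assumes Greengenes-style strings such as
# "k__Bacteria; ...; g__Escherichia; s__coli", where an unassigned genus
# appears as a bare "g__".
count_genus_labels() {
    awk -F'\t' '
        NF >= 2 {
            total++
            if ($2 ~ /g__[^; ]/) with_genus++   # "g__" followed by a name
        }
        END { printf "%d %d\n", with_genus, total }
    ' "$1"
}

# Usage: compare the 85% file with a higher-identity one
# (the second file name is an assumption):
# count_genus_labels 85_otu_taxonomy.txt
# count_genus_labels 99_otu_taxonomy.txt
```

A small total, or a small share of entries with a genus label, would limit how many distinct genera the classifier can ever report, whatever the quality of the reads.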

20). This command will trim the reference database to the region bounded by the primer sequences for the 16S rRNA gene. For 16S reads:

qiime feature-classifier extract-reads \
--i-sequences 85_otus.qza \
--p-f-primer GTGCCAGCMGCCGCGG \
--p-r-primer CCGTCAATTCMTTTRAGTTT \
--p-trunc-len 0 \
--p-min-length 340 \
--p-max-length 400 \
--o-reads ref-seqs-truncated.qza

20). Train classifier: the Naïve Bayes classifier basically links the sequences with their classification.

qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads ref-seqs-truncated.qza \
--i-reference-taxonomy ref-taxonomy.qza \
--o-classifier classifier.qza

This will create a classifier file.

21). Test classifier: Finally, we verify that the classifier works by classifying the representative sequences.

qiime feature-classifier classify-sklearn \
--i-classifier classifier.qza \
--i-reads rep-seqs-16s-dn-97.qza \
--o-classification taxonomy.qza

This will create a taxonomy.qza file.

22). Convert taxonomy file to a visualization file.

qiime metadata tabulate \
--m-input-file taxonomy.qza \
--o-visualization taxonomy.qzv

This can be viewed in an interactive mode.
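
A bar plot with a single colour usually means that almost every feature received the same taxon label, and this can be confirmed from the taxonomy itself. The sketch below assumes taxonomy.qza has been exported to a tab-separated file with a header row containing a "Taxon" column; `taxon_counts` is a made-up helper name, and the export folder in the usage comment is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many features were assigned to each distinct
# taxon string, most frequent first.
# Assumes a tab-separated file whose header row contains a column named
# "Taxon"; lines starting with "#" are skipped.
taxon_counts() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Taxon") col = i; next }
        /^#/ || !col { next }
        { n[$col]++ }
        END { for (t in n) printf "%d\t%s\n", n[t], t }
    ' "$1" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Usage (output folder and file name are assumptions):
# qiime tools export --input-path taxonomy.qza --output-path taxonomy-export
# taxon_counts taxonomy-export/taxonomy.tsv | head
```

If one or two shallow labels account for nearly all features, the problem lies in the reference or the classifier rather than in the bar plot step.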

23). Link the metadata file with the classification file to rearrange the plots based on the treatment. (Make sure to use the right table.)

qiime taxa barplot \
--i-table table-16s-dn-97.qza \
--i-taxonomy taxonomy.qza \
--m-metadata-file SP_metadata3.tsv \
--o-visualization 16S-taxa-bar-plots.qzv

This will create a taxa barplot figure that can be visualized.

Hello @ColdJello,

As a first troubleshooting step, would you feel comfortable sharing your demux qzv and dada2 denoising stats artifact?

Also, could you elaborate on the similarities and differences between the two taxa bar plots you shared? Are they the same sequencing data, but different analysis steps? Same environment but different sequencing runs? etc.

I appreciate the response! I ended up figuring it out: I was using the 85% similarity reference when I should have been using the 99% one. Everything worked once I switched OTU files.
Thanks everyone!