Create a FeatureID table with count frequency data for each sample

hdoris · August 14, 2024, 5:37pm

Hey Team-

I have searched through the forums and found a few posts that are close to what I need, but not exactly what I am looking for. I would like to create a table that lists the FeatureIDs along with their corresponding frequencies in each sample. I would want it to look something like this (FeatureID_count_table.txt).
FeatureID_count_table.txt (669 Bytes)
I would then want a corresponding list of all those same FeatureIDs and their fasta sequences. Something that looks like this (FeatureID_fasta_seq.txt).
FeatureID_fasta_seq.txt (1.5 KB)
So far I have been able to create the fasta seq file. I am attaching the file I have created (8_denoise_dada2_seq_visfile.qzv).
8_denoise_dada2_seq_visfile.qzv (508.0 KB)
I am also able to create a FeatureID table that has frequency data for each of the FeatureIDs, but not for the individual samples (FeatureID_frequency.txt).
If I classify the sequences, I am able to create a taxonomy bar chart that has frequencies for each individual sample (18_Unfiltered_taxa_bar_plot.qzv).
18_Unfiltered_taxa_bar_plot.qzv (615.6 KB)
But when I try to work backwards and match it to the FeatureID taxonomy classification (16_Unfiltered_ASV_taxonomy_visfile.qzv), there are more FeatureIDs listed there than there are entries in the bar chart, so the two do not correspond exactly.
16_Unfiltered_ASV_taxonomy_visfile.qzv (1.4 MB)

So what I am trying to do is just create a list of the FeatureIDs and a count table for each of the samples listed.

I am currently using QIIME 2 version 2024.5.

The commands I am running are:

qiime tools import --type EMPPairedEndSequences --input-path directory_name --output-path 1_output_seqs.qza

qiime demux emp-paired --i-seqs 1_output_seqs.qza --m-barcodes-file metadata.tsv --m-barcodes-column BarcodeSequence --p-rev-comp-barcodes --p-rev-comp-mapping-barcodes --o-per-sample-sequences 2_demux_output_file.qza --o-error-correction-details 3_correction_output_file.qza

qiime demux summarize --i-data 2_demux_output_file.qza --o-visualization 4_demux_output_visfile.qzv

qiime dada2 denoise-paired --i-demultiplexed-seqs 2_demux_output_file.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 151 --p-trunc-len-r 151 --o-representative-sequences 5_denoise_dada2_seq_file.qza --o-table 6_denoise_dada2_table.qza --o-denoising-stats 7_denoise_stats_dada2_file.qza

qiime feature-table tabulate-seqs --i-data 5_denoise_dada2_seq_file.qza --o-visualization 8_denoise_dada2_seq_visfile.qzv

qiime feature-classifier classify-sklearn --i-reads 5_denoise_dada2_seq_file.qza --i-classifier ../../../Classifier_files/Classifiers_Silva138.1/16S/silva-138.1-ssu-nr99-515f-806r-classifier.qza --o-classification 14_Unfiltered_ASV_taxonomy.qza

qiime metadata tabulate --m-input-file 14_Unfiltered_ASV_taxonomy.qza --o-visualization 16_Unfiltered_ASV_taxonomy_visfile.qzv

qiime taxa barplot --i-table 6_denoise_dada2_table.qza --i-taxonomy 14_Unfiltered_ASV_taxonomy.qza --m-metadata-file metadata.tsv --o-visualization 18_Unfiltered_taxa_bar_plot.qzv

I am guessing this is an easy command that I am missing. Any advice would be super helpful. Thanks!

SoilRotifer · August 14, 2024, 6:21pm

Hi @hdoris,

You can try the approach of this post:

Then simply add --m-input-file 5_denoise_dada2_seq_file.qza to that command.

This should give you a tab-delimited feature table with the feature ID, taxonomy, and sequence. You can then tabulate it (make a QZV) and click on the export button, which will provide a text file that you can open in a spreadsheet.
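
For reference, here is a rough sketch of how that could look on the command line. It assumes the linked approach means transposing the feature table so that FeatureIDs (rather than sample IDs) become the row IDs, and it reuses the artifact names from the question. The output names (transposed_table.qza, FeatureID_counts_taxonomy_seqs.qzv, exported_table) and the two function names are made up for illustration. The second function is an alternative that is not mentioned above: exporting the table and converting the BIOM file to TSV.

```shell
#!/usr/bin/env bash
# Sketch only: run inside an activated QIIME 2 environment, in the directory
# that holds the artifacts from the question. Output names are hypothetical.

# Approach 1: per-sample counts, taxonomy, and sequence in a single QZV.
tabulate_counts_with_seqs() {
    # Transpose so that FeatureIDs (not sample IDs) become the row IDs.
    qiime feature-table transpose \
        --i-table 6_denoise_dada2_table.qza \
        --o-transposed-feature-table transposed_table.qza || return 1

    # Merge counts, taxonomy, and sequences on FeatureID.
    qiime metadata tabulate \
        --m-input-file transposed_table.qza \
        --m-input-file 14_Unfiltered_ASV_taxonomy.qza \
        --m-input-file 5_denoise_dada2_seq_file.qza \
        --o-visualization FeatureID_counts_taxonomy_seqs.qzv
}

# Approach 2 (alternative, an assumption on my part): plain TSV of counts
# per sample, with one row per FeatureID and one column per sample.
export_counts_tsv() {
    # Unpack the artifact; the table is written as feature-table.biom.
    qiime tools export \
        --input-path 6_denoise_dada2_table.qza \
        --output-path exported_table || return 1

    # Convert the BIOM file to a tab-separated text file.
    biom convert \
        -i exported_table/feature-table.biom \
        -o exported_table/feature-table.tsv \
        --to-tsv
}
```

After sourcing this, calling tabulate_counts_with_seqs should produce a QZV with one row per FeatureID and one column per sample, next to the taxonomy and sequence columns. Calling export_counts_tsv should leave the counts in exported_table/feature-table.tsv.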

Is this what you are looking for?

-Mike
