Frequency of specific taxa according to metadata category

fstudart · March 14, 2018, 5:04pm

Hi Everyone,
I collapsed one of my feature tables at the genus level. The resulting summary shows how many times each genus was found across all samples, and in how many samples it was found.

In my data, the genus Streptococcus was found in 93 out of 98 samples. I'd like to know, if possible, how many of these 93 samples belong to each group in my metadata file. For example, out of these 93 samples, this genus was found in xx COPD samples and the remainder were observed in control samples.
I thought about filtering my feature table (according to my metadata file, e.g., selecting only COPD patients) to figure this out. But is there another way to do this?

Thanks very much,
FS

colinbrislawn · March 15, 2018, 2:28am

Good evening Fernando,

One option would be to export your feature table, then convert that table into a .tsv file and open it up in Excel.

qiime tools export feature-table.qza \
  --output-dir exported-feature-table

biom convert -i exported-feature-table/name_of_file.biom \
  -o exported-feature-table/feature_table.tsv \
  --to-tsv --header-key taxonomy

The output will have each sample in a different column, and you can see which 93 samples have these taxa.
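If Excel gets unwieldy, the same lookup can be sketched in plain Python. This is only a minimal sketch, not a QIIME 2 feature: the inline table and metadata below are made-up stand-ins for the exported feature_table.tsv and your metadata file, and the sample IDs, the `disease` column name, and the helper function names are all assumptions for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for exported-feature-table/feature_table.tsv
TABLE_TSV = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\tS3\tS4\n"
    "k__Bacteria;g__Streptococcus\t12.0\t0.0\t5.0\t3.0\n"
    "k__Bacteria;g__Prevotella\t0.0\t7.0\t0.0\t1.0\n"
)

# Hypothetical stand-in for the sample metadata file
METADATA_TSV = (
    "#SampleID\tdisease\n"
    "S1\tCOPD\n"
    "S2\tCOPD\n"
    "S3\tcontrol\n"
    "S4\tCOPD\n"
)


def samples_with_taxon(table_text, taxon):
    """Return the set of sample IDs whose count is > 0 for a matching row."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    rows = list(csv.reader(lines, delimiter="\t"))
    # The column header is the row that starts with "#OTU ID".
    header_idx = next(i for i, r in enumerate(rows) if r[0] == "#OTU ID")
    header = rows[header_idx]
    found = set()
    for row in rows[header_idx + 1:]:
        if taxon not in row[0]:
            continue
        for sample_id, value in zip(header[1:], row[1:]):
            if sample_id == "taxonomy":  # optional trailing metadata column
                continue
            if float(value) > 0:
                found.add(sample_id)
    return found


def group_of_each_sample(metadata_text, column):
    """Map sample ID -> value of the chosen metadata column."""
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    id_field = reader.fieldnames[0]
    return {
        row[id_field]: row[column]
        for row in reader
        if not row[id_field].startswith("#q2:")  # skip a types row if present
    }


def count_by_group(sample_ids, groups):
    """Count how many of the given samples fall in each metadata group."""
    return Counter(groups[s] for s in sample_ids)


hits = samples_with_taxon(TABLE_TSV, "g__Streptococcus")
counts = count_by_group(hits, group_of_each_sample(METADATA_TSV, "disease"))
print(sorted(counts.items()))  # [('COPD', 2), ('control', 1)]
```

To run it on real files, replace the two inline strings with the contents of your own exported table and metadata file, and adjust the column name to whatever your metadata uses.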

Let's see what the qiime devs recommend.

Colin

Nicholas_Bokulich · March 15, 2018, 1:21pm

I'm not sure that there is a straightforward way to do this in QIIME2. Filtering your feature table by metadata groups and then summarizing may just be the best way to do this.

@colinbrislawn's suggestion to export the table and open it, e.g., in Excel would be the only way to really pick it apart sample-by-sample if that's what you want to do (unless you are comfortable with Python programming and could access/summarize this file via QIIME2's artifact API). You could convert to a presence-absence feature table, which would make counting samples in Excel easier (then you just sum each row to get the number of samples in which that taxon was found).
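To make the presence-absence idea concrete, here is a rough Python sketch of the arithmetic: turn every count into 0 or 1, then sum per metadata group instead of across the whole row. The counts, sample IDs, group labels, and function names are all hypothetical, and this is plain Python rather than anything built into QIIME2.

```python
from collections import defaultdict

# Hypothetical genus-level counts: taxon -> {sample ID: count}
COUNTS = {
    "g__Streptococcus": {"S1": 12, "S2": 0, "S3": 5, "S4": 3},
    "g__Prevotella": {"S1": 0, "S2": 7, "S3": 0, "S4": 1},
}

# Hypothetical metadata: sample ID -> group
GROUPS = {"S1": "COPD", "S2": "COPD", "S3": "control", "S4": "COPD"}


def to_presence_absence(counts):
    """Replace every count with 1 (observed) or 0 (not observed)."""
    return {
        taxon: {sample: int(count > 0) for sample, count in per_sample.items()}
        for taxon, per_sample in counts.items()
    }


def prevalence_by_group(presence, groups):
    """Sum the 0/1 values per group: samples in each group with the taxon."""
    result = {}
    for taxon, per_sample in presence.items():
        totals = defaultdict(int)
        for sample, present in per_sample.items():
            totals[groups[sample]] += present
        result[taxon] = dict(totals)
    return result


presence = to_presence_absence(COUNTS)
by_group = prevalence_by_group(presence, GROUPS)
print(by_group["g__Streptococcus"])  # {'COPD': 2, 'control': 1}
```

If you go the artifact API route, I believe `qiime2.Artifact.load(...)` followed by `.view(pandas.DataFrame)` gives you the table with samples as rows and features as columns, which you could feed into the same kind of calculation; treat that as a pointer to check against the QIIME2 docs rather than a tested recipe.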

Also check out this thread. It sounds like your idea is a bit different, but maybe not — having multiple users with the same need would help prioritize a new feature.

fstudart · March 15, 2018, 9:06pm

Hi Dr. Colin,

Thanks for your reply and the suggestion. I'm definitely gonna try this.

Thanks,
FS

fstudart · March 15, 2018, 9:11pm

Hi Dr. Bokulich,

Thanks for your reply. I will also try to create a presence-absence feature table, as you suggested.

Thanks,
FS

system · April 16, 2018, 3:11am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.