How do I create a tree from a subset of samples?

Lu_Yang · November 17, 2017, 4:08pm

Hi, @thermokarst,
Sorry, I have another question about the tree.
I used QIIME 2 and DADA2 to process the 2 batches of samples separately, then merged the rep-seqs.qza files from both batches and assigned taxonomy on the merged set.
Now I want to draw a tree for a subset of samples: 10 samples from the 1st batch and 10 from the 2nd. How can I get a tree for just this subset?
Thanks in advance.
Best.

thermokarst · November 17, 2017, 9:48pm

Hi @Lu_Yang, I split this post off into a new topic. We generally try to keep each topic thread limited to one discussion point, which makes searching the forum easier for other users!

You will want to filter your representative sequences twice, once per batch group. This will create two new subsets of representative sequences, and you can then create trees for those subsets. Let me know if you need more specific guidance on how to make this happen, but I think if you read the references linked above you should be set!

Lu_Yang · November 20, 2017, 3:27am

Hi, @thermokarst,

I ran the command below:
qiime feature-table filter-seqs --i-data rep-seqs.qza --m-metadata-file test.txt --o-filtered-data formate.qza
Here test.txt has the same format as sample-metadata.txt, but it contains only the SampleIDs I want to retain.
But I got the error below:
Plugin error from feature-table:
All features were filtered out of the data.
Debug info has been saved to /var/folders/1l/6kmg2wlx57sc8kdq8n5mwl5r0000gn/T/qiime2-q2cli-err-oglum6sa.log.

Then I tried the command below:
qiime feature-table filter-seqs --i-data rep-seqs.qza --m-metadata-file mapping.txt --p-where test.txt --o-filtered-data formate.qza
Here mapping.txt contains all the SampleIDs, and test.txt contains the SampleIDs I want to retain. Then I got the error below:
Plugin error from feature-table:
Selection of IDs failed with query:
SELECT "#SampleID" FROM metadata WHERE test.txt GROUP BY "#SampleID" ORDER BY "#SampleID";
Debug info has been saved to /var/folders/1l/6kmg2wlx57sc8kdq8n5mwl5r0000gn/T/qiime2-q2cli-err-5dzqfayd.log

Could you help me with this? Thanks in advance.

thermokarst · November 20, 2017, 11:47pm

Hi @Lu_Yang! The first command is failing because your rep-seqs.qza has Feature IDs, but you are filtering with Sample IDs. You will need to filter your feature table based on the Sample IDs, then grab the Feature IDs from a feature-table summarize visualization, then use those IDs to filter the rep seqs. Please see this forum post for more detailed step-by-step instructions.

This workflow is a pain right now, and we have an open issue to streamline this a bit. Let me know if you get stuck!
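
The second command fails for a different reason: as the query in the error message shows, the --p-where value is placed directly into a SQL WHERE clause, so it has to be an expression over the metadata columns rather than a file name. A minimal sketch of building such an expression from a list of IDs follows. It assumes a hypothetical ids.txt with one sample ID per line (the three IDs are only examples) and takes the "#SampleID" column name from the query in the error message.

```shell
# ids.txt is a hypothetical file with one sample ID per line.
printf '%s\n' D1.1.27.15 D2.1.27.15 D3.1.27.15 > ids.txt

# Wrap each ID in double quotes, then join the lines with commas.
id_list=$(sed 's/.*/"&"/' ids.txt | paste -sd, -)

# "#SampleID" is the ID column name shown in the failed query.
where="\"#SampleID\" IN ($id_list)"
echo "$where"
# prints: "#SampleID" IN ("D1.1.27.15","D2.1.27.15","D3.1.27.15")
```

The resulting string can then be passed to qiime feature-table filter-samples as --p-where "$where", together with --m-metadata-file mapping.txt.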

Lu_Yang · November 21, 2017, 6:09pm

Hi, @thermokarst,

Thanks for your help. I followed that post and used the commands below:
qiime feature-table filter-samples --i-table table.qza --m-metadata-file mapping.txt --p-where '"#SampleID" IN ("D1.1.27.15", "D2.1.27.15","D3.1.27.15","D1.2.7.15","D2.2.7.15","D3.2.7.15","D1.2.14.15","D2.2.14.15","D3.2.14.15","D1.2.22.15","D2.2.22.15","D3.2.22.15","D1.3.2.15","D2.3.2.15","D3.3.2.15")' --o-filtered-table table-formate.qza
qiime feature-table summarize --i-table table-formate.qza --o-visualization table-formate.qzv
qiime tools view table-formate.qzv
Then I downloaded the csv file and added the 'Feature ID' and 'Frequency' header. The features-to-filter.tsv looks like this:
Feature ID Frequency
D3.3.2.15 117193
Then I ran 'qiime feature-table filter-seqs --i-data rep-seqs.qza --m-metadata-file features-to-filter.tsv --o-filtered-data formate.qza'
and got this error:
Plugin error from feature-table: All features were filtered out of the data.
Debug info has been saved to /tmp/qiime2-q2cli-err-9ht3hmxh.log.
Is there any way to solve this?
Have a nice day!

thermokarst · November 22, 2017, 2:27pm

Hi @Lu_Yang!

The error indicates that "All features were filtered out of the data." That means that the file features-to-filter.tsv includes every single one of the features in your rep-seqs.qza file. Given what you are ultimately trying to accomplish here (making several trees of sample subsets), I think that means you can skip that filtering step and use just your rep-seqs.qza file when creating the tree for that set of subsamples. Does that make sense?

Keep us posted on your progress!

Lu_Yang · November 22, 2017, 4:52pm

Hi, @thermokarst,

Now it works! Awesome! I found my problem. There are two tables, feature-frequency-details.csv and sample-frequency-details.csv, and I had used the wrong one. Now I use feature-frequency-details.csv, and I was able to create a tree. Great!
Thanks for your help!
Wish you a great Thanksgiving Day!
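
Because filter-seqs matches on Feature IDs, the file given to --m-metadata-file needs feature IDs in its first column; the D3.3.2.15 row shown earlier in the thread is a sample ID, which is why nothing matched. A rough sketch of building features-to-filter.tsv from the feature frequency download is below. It assumes the CSV has no header and one "id,frequency" pair per line, and it uses the file name as written in this thread. The two IDs are made-up placeholders, not real feature IDs.

```shell
# Stand-in for the downloaded CSV (assumed layout: no header, "id,frequency").
# feature-a and feature-b are placeholders for real feature IDs.
printf '%s\n' 'feature-a,120' 'feature-b,45' > feature-frequency-details.csv

# Prepend the header used in the thread and convert commas to tabs.
{
  printf 'Feature ID\tFrequency\n'
  tr ',' '\t' < feature-frequency-details.csv
} > features-to-filter.tsv

# Sanity check: the first column should list feature IDs, not sample IDs.
cut -f1 features-to-filter.tsv
```

If the first column printed by the last command shows sample names such as D3.3.2.15, the wrong CSV was downloaded.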

ebolyen · December 22, 2017, 6:04pm

In the new QIIME 2 2017.12 release you can pass a feature table to the feature-table filter-seqs command to remove sequences that do not have corresponding IDs in the table!
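
With that option, the subset workflow might be sketched as below. This is an illustration under assumptions, not a command sequence given in this thread: subset_tree and its output file names are hypothetical, and the tree-building steps (mafft, mask, fasttree, midpoint-root) follow the usual tutorial pipeline from releases of that period. Setting QIIME=echo prints the commands instead of running them.

```shell
# QIIME defaults to the real CLI; set QIIME=echo for a dry run.
QIIME="${QIIME:-qiime}"

# Hypothetical helper.
#   $1 = merged rep seqs, $2 = sample-filtered table, $3 = output prefix
subset_tree() {
  # Keep only the sequences whose Feature IDs appear in the filtered table.
  "$QIIME" feature-table filter-seqs \
      --i-data "$1" --i-table "$2" \
      --o-filtered-data "$3-rep-seqs.qza" &&
  # Align, mask, build the tree, then root it at the midpoint.
  "$QIIME" alignment mafft \
      --i-sequences "$3-rep-seqs.qza" \
      --o-alignment "$3-aligned.qza" &&
  "$QIIME" alignment mask \
      --i-alignment "$3-aligned.qza" \
      --o-masked-alignment "$3-masked.qza" &&
  "$QIIME" phylogeny fasttree \
      --i-alignment "$3-masked.qza" \
      --o-tree "$3-unrooted-tree.qza" &&
  "$QIIME" phylogeny midpoint-root \
      --i-tree "$3-unrooted-tree.qza" \
      --o-rooted-tree "$3-rooted-tree.qza"
}
```

For the files in this thread, the call would look like subset_tree rep-seqs.qza table-formate.qza subset, where table-formate.qza is the output of the filter-samples step above.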