I have a quick question. I previously rarefied table.qza and exported it to work in RStudio, but before working in R I realized I had to assign guilds to the OTUs, since I only need the ectomycorrhizal species for my study. I rarefied my data based on the frequency of all the sequences in my data, which included saprotrophs, pathogens, etc. Could I have missed valuable information by doing it this way? If so, should I export the complete table.qza without rarefying, assign guilds, and then rarefy the data?
I really appreciate your help.
(Matthew Ryan Dillon)
Assigning guilds sounds like an extension of taxonomy assignment (I assume that is how guilds are being assigned). So I think the answer really depends on your application.
Yes, you could be losing information in the sense that you will only be assigning guilds to features that are retained following rarefaction. But if you will ultimately rarefy, those features may be lost anyway (since rarefaction selects features at random, there could be some variation in which features/taxa/guilds are retained, but with a large enough sample size it should not matter that much). So it may be “6 of one, half dozen of the other”.
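To make the "selects features at random" point concrete, here is a minimal sketch of rarefying a single sample by subsampling reads without replacement. It is an illustration only, not how QIIME 2 implements rarefaction; the feature names and counts are made up.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample.

    `counts` maps feature ID -> read count (hypothetical data).
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the sample's total read count")
    # Expand to one entry per read, labelled by its feature ID.
    reads = [fid for fid, n in counts.items() for _ in range(n)]
    kept = Counter(random.Random(seed).sample(reads, depth))
    # Report every original feature, including those that dropped to zero.
    return {fid: kept.get(fid, 0) for fid in counts}

# Made-up sample: two abundant features and a few rare ones.
sample = {"otu1": 500, "otu2": 300, "otu3": 150, "otu4": 40, "otu5": 8, "otu6": 2}
rarefied = rarefy(sample, depth=100)
print(rarefied)
```

Changing `seed` changes which rare features survive, which is the variation described above; abundant features are retained almost regardless of the seed.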
I think I would personally assign guilds before rarefying, since that information could be used in other ways (e.g., construct taxa bar plots or run ANCOM on non-rarefied tables). But it all really depends on your own experimental goals.
That's what I was thinking. So I went ahead and exported table.qza and assigned guilds; out of 2125 OTUs, 998 were assigned to a guild. My question now is, how can I import this information back into qiime? I can use the normal table.qza in qiime, but how do I import the table containing the guilds? Can I import it as metadata and use it to filter out the samples that are ectomycorrhizal and then rarefy? Or what is the best way of doing this? I have searched the internet quite a bit and I am slightly lost, because the new table I created has the taxonomy assigned, while the table.qza in qiime has an ID instead.
Using this guild information as metadata would probably be the most straightforward thing to do.
It looks like the sample ID is effectively the taxonomy assignment. What’s not clear is whether these IDs are unique taxonomy assignments or if they are replicated (in which case they probably correspond to the sample IDs in the input table, in the same order).
If the latter, it should be fairly easy to just relabel with the correct sample ID and use this file as metadata, e.g., in the qiime feature-table filter command.
If these are unique IDs, you can either:
Use qiime taxa collapse to collapse/relabel your feature table by taxonomy, then filter with qiime feature-table filter-features (since the sample IDs in both files should be the same taxonomic assignments).
Get a list of taxonomy assignments that you want to filter out. E.g., you could do something like this (untested but should work):
That file will contain all taxa that are EMF, and you can use that list to filter those taxa as shown here. Make sure to use --p-mode exact if you go this route.
There are a few different ways to do it, but these seem like they might be the most straightforward. If EMF fungi fall into a single clade you can just filter out that clade as shown in the filtering tutorial, which is probably even more straightforward.
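As a rough illustration of building such a list of EMF taxa, the sketch below reads a tab-separated guild table and keeps the taxonomy strings whose guild mentions "Ectomycorrhizal". The column names (`taxonomy`, `Guild`) and the example rows are assumptions for the sake of the example; check them against the actual FUNGuild output before relying on this.

```python
import csv
import io

def emf_taxa(handle, taxon_col="taxonomy", guild_col="Guild",
             keyword="ectomycorrhizal"):
    """Return taxonomy strings whose guild contains `keyword`.

    Column names are assumed, not taken from the thread.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[taxon_col] for row in reader
            if keyword in row[guild_col].lower()]

# Made-up rows standing in for an exported guild table.
example = io.StringIO(
    "taxonomy\tGuild\n"
    "k__Fungi;p__Basidiomycota;g__Russula\tEctomycorrhizal\n"
    "k__Fungi;p__Ascomycota;g__Fusarium\tPlant Pathogen\n"
    "k__Fungi;p__Basidiomycota;g__Lactarius\tEctomycorrhizal\n"
)
taxa = emf_taxa(example)
print(",".join(taxa))  # comma-separated, ready to use as filtering terms
```

The match is a case-insensitive substring test, so mixed guild labels that include the keyword are kept as well; tighten the test if only pure EMF assignments are wanted.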
The IDs in the table.qza picture I attached above are the ones created in qiime. I am not sure if the output from FUNGuild is in the same order as I put them in, so I am unsure how to properly add the IDs to them. I did create the OTU table in qiime by using qiime taxa collapse, so technically, can I use the collapsed table to run all my analyses, instead of using the FeatureTable[Frequency] (table.qza)?
If so, I can easily define which ones are ectomycorrhizal, since the collapsed table was the one that was used in FUNGuild.
Yes, that would be fine, and the collapsed table is still a FeatureTable[Frequency] artifact. The only possible issue is that your features would now be taxa, not sequence variants, which changes the analysis and interpretation a little bit (e.g., in alpha diversity you would not be looking at the number of unique sequences, but rather the number of unique taxa detected in each sample). But if that’s your goal then that’s not a problem.
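The difference in interpretation can be seen in a small sketch. It sums counts for features that share a taxonomy label and then counts the non-zero features before and after. The feature IDs, counts, and labels are invented, and this only mimics the idea of collapsing, not the q2-taxa implementation.

```python
from collections import defaultdict

def collapse(counts, taxonomy):
    """Sum counts of all features that share the same taxonomy label."""
    collapsed = defaultdict(int)
    for fid, n in counts.items():
        collapsed[taxonomy[fid]] += n
    return dict(collapsed)

def observed(counts):
    """Number of features with a non-zero count in the sample."""
    return sum(1 for n in counts.values() if n > 0)

# One hypothetical sample with four sequence variants.
sample = {"asv1": 10, "asv2": 5, "asv3": 7, "asv4": 0}
taxonomy = {"asv1": "g__Russula", "asv2": "g__Russula",
            "asv3": "g__Lactarius", "asv4": "g__Russula"}

by_taxon = collapse(sample, taxonomy)
print(observed(sample), "sequence variants observed")
print(observed(by_taxon), "taxa observed")
```

Here two variants of the same genus count once after collapsing, so richness computed on the collapsed table is richness of taxa rather than of sequences.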
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2018.6/lib/python3.5/site-packages/q2cli/commands.py", line 274, in call
results = action(**arguments)
File "", line 2, in filter_table
File "/home/qiime2/miniconda/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 232, in bound_callable
File "/home/qiime2/miniconda/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 367, in callable_executor
output_views = self._callable(**view_args)
File "/home/qiime2/miniconda/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_taxa/_method.py", line 90, in filter_table
File "/home/qiime2/miniconda/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_taxa/_method.py", line 35, in _ids_to_keep_from_taxonomy
raise ValueError("At least one filtering term must be provided.")
ValueError: At least one filtering term must be provided.
Plugin error from taxa:
** At least one filtering term must be provided.** See above for debug info
You're almost there! The issue is that you need to use either --p-include or --p-exclude; the contents of EcM-data.txt are what you want to pass to one of these parameters. You do not want to use the query-delimiter parameter. That parameter only tells the command whether your taxa to include/exclude are separated by commas or by another character. You are using commas (the default delimiter), so you do not need to specify anything different.
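As a sketch of what passing the file contents to --p-include could look like, the helper below turns a comma-separated list of taxa into the argument list for qiime taxa filter-table. The artifact file names and the example taxa are placeholders, and the function only builds and prints the command rather than running it.

```python
import shlex

def build_filter_command(terms_text, table="table.qza",
                         taxonomy="taxonomy.qza", output="ecm-table.qza"):
    """Build the argument list for `qiime taxa filter-table`.

    File names are placeholders; `terms_text` is the comma-separated
    content of a file such as EcM-data.txt.
    """
    terms = [t.strip() for t in terms_text.split(",") if t.strip()]
    if not terms:
        raise ValueError("no taxa to include")
    return [
        "qiime", "taxa", "filter-table",
        "--i-table", table,
        "--i-taxonomy", taxonomy,
        "--p-include", ",".join(terms),  # comma is the default delimiter
        "--p-mode", "exact",
        "--o-filtered-table", output,
    ]

cmd = build_filter_command("k__Fungi;g__Russula, k__Fungi;g__Lactarius")
# Quote each argument so the semicolons survive a copy-paste into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Because the taxa strings contain semicolons, the value given to --p-include has to be quoted on the command line; otherwise the shell treats each semicolon as the end of the command.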
It appears that all the taxonomy entries that contain those types of symbols in the metadata (;; etc.) are not present in the taxonomy file, and if they are, they do not contain the symbols. Could this be the problem?
Bingo. The taxonomy only contains as many levels as could be confidently classified. Hence, you see ASVs classified to family level without the “;;” at the end. When you use taxa collapse or taxa barplot, you collapse at a specific level and the empty levels are appended to the end to “fill in” missing levels, so you get things like “…Mycosphaerellaceae;;”.
So you are correct — just delete those empty levels from the end of your taxa names in EcM-data.txt and it should work.
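Rather than editing the names by hand, the trimming could be scripted. This is a minimal sketch that assumes an empty level is either an empty string or a bare "__" placeholder; the taxonomy strings are made up, and any other placeholder style would need its own rule.

```python
def strip_empty_levels(taxon):
    """Remove trailing empty levels ('' or '__') from a taxonomy string."""
    levels = taxon.split(";")
    # A level is treated as empty if nothing remains after removing
    # surrounding whitespace and underscores.
    while levels and levels[-1].strip().strip("_") == "":
        levels.pop()
    return ";".join(levels)

# Made-up examples of padded names from a collapsed table.
names = [
    "k__Fungi;p__Ascomycota;f__Mycosphaerellaceae;;",
    "k__Fungi;p__Ascomycota;f__Mycosphaerellaceae;__;__",
    "k__Fungi;p__Basidiomycota;g__Russula",
]
cleaned = [strip_empty_levels(n) for n in names]
print(cleaned)
```

Only trailing levels are removed, so an empty level in the middle of a name is left alone, and fully classified names pass through unchanged.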
Sorry to bug you again. I did as you said, removed all the _; from all the taxa names in EcM-data.txt, and I still get the same error. I have attached the EcM-data.txt file in case you want to take a look at it.