I am using QIIME 2 with the BOLD database for my sequenced COI amplicons. I was not able to detect a spiked-in species in one of my samples with dada2, although I can detect it with a BLAST search or when I map my trimmed reads. Now I would like to inspect the ASVs identified by dada2, but I could not find a way to extract a fasta of representative sequences for each sample. Instead I can only get a combined one for all samples, which does not tell me which sequence stems from which sample.
Is there a way to see which sequence was found in which sample? I already tried to run a single sample through dada2, but it gives me an error.
The feature table is what maps the individual sequences to each sample. It sounds like what you probably want to do is use qiime feature-table filter-seqs to grab sequences that are only found in that one sample.
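As a rough illustration of that mapping, here is a minimal Python sketch that works on exported data, outside QIIME 2. It assumes the feature table has been exported to a tab-separated text file with feature IDs as rows and sample IDs as columns (the layout that `biom convert --to-tsv` typically produces). The toy table, the feature IDs, and the function names `read_table` and `features_in_sample` are made up for illustration and are not part of any QIIME 2 API.

```python
import io

# Toy stand-in for an exported feature table (hypothetical IDs and counts).
TABLE_TSV = """# Constructed from biom file
#OTU ID\t1S\t1L\t2L
asv_a\t12.0\t0.0\t5.0
asv_b\t0.0\t7.0\t0.0
asv_c\t3.0\t0.0\t0.0
"""


def read_table(handle):
    """Return {feature_id: {sample_id: count}} from a TSV feature table."""
    samples, table = [], {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#"):
            # Only the tab-separated comment line carries the sample IDs.
            if len(fields) > 1:
                samples = fields[1:]
            continue
        if not line.strip():
            continue
        table[fields[0]] = {s: float(v) for s, v in zip(samples, fields[1:])}
    return table


def features_in_sample(table, sample_id):
    """List the feature IDs with a non-zero count in the given sample."""
    return sorted(
        fid for fid, counts in table.items() if counts.get(sample_id, 0) > 0
    )


table = read_table(io.StringIO(TABLE_TSV))
print(features_in_sample(table, "1S"))
```

The feature IDs returned for a sample are the same IDs used in the headers of the combined representative-sequence fasta, so they can be used to look up the matching sequences.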
Thank you for your fast reply!
I’m not sure if that is what I asked for. I would like to get a fasta file with the sequences for each sample instead of a feature table. With this command, the identifier in the fasta header does not tell me which sample a sequence is derived from.
Do you want a fasta for each individual sample? The command I listed above will allow you to grab all sequences found in a single sample, e.g., the spike-in sample you used. So run that, and every sequence in the output will implicitly be one found in that sample. Since it sounds like you are looking for the seqs in that specific sample, this should be a good solution… It would be cumbersome if you want to see which seqs are found in a whole set of samples, but the command could in principle be looped to filter out the seqs belonging to each of a bunch of different samples.
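The per-sample loop could look roughly like the following Python sketch. It is not the QIIME 2 command itself: it assumes the feature table and the representative sequences have already been exported and loaded, and the toy data, the header layout (`sample=` and `count=` fields), and the function names `parse_fasta` and `per_sample_fasta` are hypothetical.

```python
# Hypothetical toy data: per-sample counts and the combined rep-seqs fasta.
TABLE = {
    "asv_a": {"1S": 12, "1L": 0},
    "asv_b": {"1S": 0, "1L": 7},
    "asv_c": {"1S": 3, "1L": 4},
}
REP_SEQS_FASTA = """>asv_a
ACGTACGT
>asv_b
TTGACCAA
>asv_c
GGCATGCA
"""


def parse_fasta(text):
    """Return {feature_id: sequence} from fasta-formatted text."""
    seqs, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the feature ID.
            current = line[1:].split()[0]
            seqs[current] = ""
        elif current is not None:
            seqs[current] += line
    return seqs


def per_sample_fasta(table, seqs):
    """Return {sample_id: fasta_text} with only the features seen in each sample."""
    sample_ids = sorted({s for counts in table.values() for s in counts})
    result = {}
    for sample_id in sample_ids:
        records = [
            f">{fid} sample={sample_id} count={counts[sample_id]}\n{seqs[fid]}\n"
            for fid, counts in sorted(table.items())
            if counts.get(sample_id, 0) > 0
        ]
        result[sample_id] = "".join(records)
    return result


fastas = per_sample_fasta(TABLE, parse_fasta(REP_SEQS_FASTA))
for sample_id, fasta in fastas.items():
    print(f"--- {sample_id}.fasta ---")
    print(fasta, end="")
```

Because a sequence shared between samples appears in every matching per-sample file, the sample ID in each header is unambiguous within that file.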
Well… I’m not sure that’s what you are looking for either. In QIIME 1 the rep_set fasta file did contain sample IDs in the header, but those rep seqs could be found in any sample and only one sample would be listed in the header, so it would not be a reliable method for determining which sequences are found in any given sample.
Thank you! This is what I was looking for. However, so far I have not succeeded with qiime feature-table filter-seqs. Could you give me an example? I already tried --m-metadata-file, which gave me the error (All features were filtered out of the data.), and --p-exclude-ids 1L,2L, which gave me a value error (ValueError: Could not coerce value based on expression provided).
My samples are named like 1S,2S,3S,4S,5S,6S,7S,8S,9S,10S,1L,2L,3L,4L,5L,6L,7L,8L,9L,10L
Hi @Josephine! @Nicholas_Bokulich is out of the office right now, so I can lend a hand. You’re very close, but a few things need to be adjusted. I would suggest checking out the Filtering tutorial for more help.
First, transpose the table (this will flip the sample and feature axes, which is necessary for the next step). Then, use the transposed table as feature metadata, and keep only the features found in samples 1L or 2L: