Working with "hits" sequences


I have 10 samples in fastq format. I imported them individually with the following command:

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path sample_info.csv --output-path data.qza --input-format SingleEndFastqManifestPhred33

The sample_info.csv file contains the path to each FASTQ file.

This generated 10 qza files. I then ran the demux and denoise steps on each one and obtained the "representative" sequences and a feature table.

Next, I ran the qiime quality-control exclude-seqs command to obtain the hits and misses sequences.

Now I want to use the "hit" sequences for clustering, taxonomy assignment, etc.

For clustering and taxonomy, I need feature table and feature sequence files, which I do not have at this stage. How can I create them? These files are created during the "denoising" step, but I have already run that step on each individual file so that I could run the exclude-seqs command.

How can I obtain the feature sequences and the feature table from the "hits.qza" file?

Many thanks,

Hi @mars,

That's an interesting path to get to a feature table. I have some concerns that I can circle back to at the end.

When you describe the "hits" file, are you looking for a set of sequences that hit a reference database when you cluster or blast? (If so, do you have a reference database you're using?) Are you hoping to get taxonomic assignments for the sequences? It's not terminology I've used before, and so it's easier to direct you if there's a bit more clarity.

However, I'm also a little bit concerned about how you handled your denoising. DADA2 specifically (which I'm assuming you used, based on your pipeline) is designed to work across multiple samples. Running the samples individually can potentially lead to lower quality denoising, more lost samples, and poor performance. Running all your sequences together would be a good way to get a feature table and representative sequences.
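To make "running all your sequences together" a bit more concrete, here is a rough sketch rather than a tested recipe. It assumes one single-end FASTQ file per sample, named <sample-id>.fastq, all in one directory. The helper build_manifest, the file reference-seqs.qza, and the output names are hypothetical, and the --p-trunc-len value is only a placeholder that you would pick from your own quality plots. It also assumes that by "hits" you mean the --o-sequence-hits output of exclude-seqs.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: write ONE manifest that lists every FASTQ file in a
# directory, so all samples are imported and denoised together.
# Assumption: files are named <sample-id>.fastq.
build_manifest() {
  local fastq_dir manifest abs_dir fq sample
  fastq_dir="$1"
  manifest="$2"
  abs_dir="$(cd "$fastq_dir" && pwd)"
  echo "sample-id,absolute-filepath,direction" > "$manifest"
  for fq in "$abs_dir"/*.fastq; do
    [ -e "$fq" ] || continue            # skip if the glob matched nothing
    sample="$(basename "$fq" .fastq)"
    printf '%s,%s,forward\n' "$sample" "$fq" >> "$manifest"
  done
}

# The QIIME 2 steps only run when a FASTQ directory is passed as the first
# argument and qiime is available on PATH.
if [ "$#" -ge 1 ] && command -v qiime >/dev/null 2>&1; then
  build_manifest "$1" sample_info.csv

  # One import for all samples, producing a single data.qza.
  qiime tools import \
    --type 'SampleData[SequencesWithQuality]' \
    --input-path sample_info.csv \
    --output-path data.qza \
    --input-format SingleEndFastqManifestPhred33

  # Denoise all samples in one run. --p-trunc-len 0 disables truncation and
  # is a placeholder only; choose a value from your quality plots.
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs data.qza \
    --p-trunc-len 0 \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats denoising-stats.qza

  # Assumption: reference-seqs.qza is whatever reference you used before.
  if [ -f reference-seqs.qza ]; then
    # Split the joint representative sequences into hits and misses.
    qiime quality-control exclude-seqs \
      --i-query-sequences rep-seqs.qza \
      --i-reference-sequences reference-seqs.qza \
      --o-sequence-hits hits.qza \
      --o-sequence-misses misses.qza

    # Keep only the features whose IDs appear in hits.qza.
    qiime feature-table filter-features \
      --i-table table.qza \
      --m-metadata-file hits.qza \
      --o-filtered-table hits-table.qza
  fi
fi
```

If that matches what you are after, hits.qza and hits-table.qza would be the feature sequences and feature table to carry into clustering and taxonomy.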


