Regarding custom database


I am building a custom database for the functional gene arsM. For this, I have downloaded 71,000 arsM gene sequences from the FunGene database.

To extract the reference gene sequences, I am using the command below:

qiime feature-classifier extract-reads \
  --i-sequences arsM_otus.qza \
  --p-trunc-len 0 \
  --p-min-length 0 \
  --p-max-length 350 \
  --o-reads ref-seqs.qza

How can I confirm that all of these sequences contain the arsM primer sequences and were extracted according to them?

Is it possible to remove the sequences that do not contain the primer sequences?

Thanks a lot for all your previous help.

Kind Regards

extract-reads only outputs trimmed sequences from reads that contain the primers. So there is no need to confirm that the primers were in those sequences: they were there, but were trimmed off, and only the internal region (the simulated amplicon) is output. Sequences in which the primers are not found are dropped from the output.
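To make the idea concrete, here is a rough Python sketch of what "simulated amplicon" means. It is not how extract-reads is implemented: as far as I know the real command aligns the primers and tolerates some mismatches, whereas this toy version only does exact matching with IUPAC degenerate bases. The function names and the primers in the usage note are made up for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a DNA string (IUPAC codes allowed)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a possibly degenerate primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_amplicon(seq, fwd_primer, rev_primer):
    """Return the region between the primers, or None if either is missing.

    Hypothetical helper: exact matching only, forward strand only.
    """
    seq = seq.upper()
    fwd = re.search(primer_regex(fwd_primer), seq)
    if fwd is None:
        return None
    downstream = seq[fwd.end():]
    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    rev = re.search(primer_regex(reverse_complement(rev_primer)), downstream)
    if rev is None:
        return None
    # Both primers are trimmed off; only the internal region is kept.
    return downstream[:rev.start()]
```

With toy primers, `extract_amplicon("GGACGTCCCTTTAAATGCAAGG", "ACGY", "TTGCA")` returns the internal region only, and a sequence without the primers gives `None`, which corresponds to a sequence being left out of ref-seqs.qza.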

Hi @Nicholas_Bokulich,

Thanks for your quick help. Do I also need to remove the taxonomy entries for gene IDs that are not present in ref-seqs.qza? And how can I find out how many sequences ref-seqs.qza contains?



Try qiime feature-table tabulate-seqs. I am not sure, but it might list the number of sequences.
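If the visualization does not show a count, another option is to count the records directly. A .qza file is a zip archive, and for sequence artifacts the FASTA is normally stored as data/dna-sequences.fasta under a top-level folder named after the artifact's UUID. The sketch below relies on that layout; the function name read_fasta_ids is just something I made up for this example.

```python
import zipfile


def read_fasta_ids(qza_path):
    """Return the sequence IDs from the FASTA file inside a .qza archive.

    Assumes the archive holds a file ending in data/dna-sequences.fasta.
    """
    ids = []
    with zipfile.ZipFile(qza_path) as archive:
        fasta_names = [name for name in archive.namelist()
                       if name.endswith("data/dna-sequences.fasta")]
        if not fasta_names:
            raise ValueError("no data/dna-sequences.fasta found in archive")
        with archive.open(fasta_names[0]) as handle:
            for raw in handle:
                line = raw.decode("utf-8")
                if line.startswith(">"):
                    # The ID is the first whitespace-delimited token
                    # after the ">" of the header line.
                    parts = line[1:].split()
                    ids.append(parts[0] if parts else "")
    return ids
```

`len(read_fasta_ids("ref-seqs.qza"))` then gives the number of sequences, and the returned list shows which IDs survived extraction if you want to compare them against the IDs in your taxonomy file.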

Hi @Nicholas_Bokulich

Thank you for all your help.

