Regarding custom database

Hi,

I am building a custom database for the functional gene arsM. For this I have downloaded 71,000 arsM gene sequences from the FunGene database.

To extract the reference gene sequences I am using the command below:

qiime feature-classifier extract-reads \
  --i-sequences arsM_otus.qza \
  --p-f-primer TCYCTCGGCTGCGGCAAYCCVAC \
  --p-r-primer CGWCCGCCWGGCTTWAGYACCCG \
  --p-trunc-len 0 \
  --p-min-length 0 \
  --p-max-length 350 \
  --o-reads ref-seqs.qza

How can I confirm that all of these sequences contain the arsM primer sequences above, and that ref.seq.read.fa was extracted according to those primers?

Is it possible to remove sequences that do not contain the primer sequences above?

Thanks a lot for all your previous help.

Kind Regards
Yogesh

extract-reads only outputs trimmed sequences from reads that contain the primers, so there is no need to confirm that the primers were present in those sequences. They were there, but were then trimmed off, and only the internal region (the simulated amplicon) is output.
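
If you still want to check for yourself, here is a rough sketch. It has to be run on the input sequences (for example a FASTA file exported from arsM_otus.qza), not on ref-seqs.qza, because the primers are already trimmed from the output. The function names are made up for illustration and only standard shell tools are assumed. It requires an exact match to the degenerate primers on the forward strand only, which I believe is stricter than extract-reads, since that tolerates some mismatches through its identity setting, so the counts may not agree exactly.

```shell
# Convert an IUPAC (degenerate) primer into an extended regular expression.
iupac_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Reverse-complement a primer, keeping IUPAC codes consistent.
revcomp() {
  printf '%s\n' "$1" |
    awk '{ out = ""; for (i = length($0); i >= 1; i--) out = out substr($0, i, 1); print out }' |
    tr 'ACGTRYSWKMBDHVN' 'TGCAYRSWMKVHDBN'
}

# Print the IDs of FASTA records that contain the forward primer followed
# by the reverse complement of the reverse primer (forward strand only).
# Usage: ids_with_both_primers FWD_PRIMER REV_PRIMER FASTA_FILE
# The body runs in a subshell so the helper variables stay local.
ids_with_both_primers() (
  fwd=$(iupac_to_regex "$1")
  rev_rc=$(iupac_to_regex "$(revcomp "$2")")
  awk -v pat="${fwd}.*${rev_rc}" '
    function report() { if (id != "" && seq ~ pat) print id }
    /^>/ { report(); id = substr($1, 2); seq = ""; next }
    { seq = seq toupper($0) }
    END { report() }
  ' "$3"
)
```

With the primers from this thread, something like `ids_with_both_primers TCYCTCGGCTGCGGCAAYCCVAC CGWCCGCCWGGCTTWAGYACCCG input.fasta | wc -l` would count the matching records, where input.fasta is a placeholder for your exported sequences.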

Hi @Nicholas_Bokulich,

Thanks for your quick help. Do I also need to remove the taxonomy entries for gene IDs that are not present in ref.seq.qza? And how can I find out how many sequences ref.seq.qza contains?

Thanks
Yogesh

No, you do not need to remove them.

Try feature-table tabulate-seqs. I am not sure, but it might list the number of sequences.
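
Another option that should work is to export the artifact to FASTA and count the header lines. A minimal sketch is below. The function names and the output directory are made up for illustration, and I am assuming the exported file is called dna-sequences.fasta, which is what `qiime tools export` normally writes for sequence artifacts.

```shell
# Export first (not run here; the directory name is a placeholder):
#   qiime tools export --input-path ref-seqs.qza --output-path exported-ref-seqs
# The sequences should then be in exported-ref-seqs/dna-sequences.fasta.

# Count FASTA records by counting header lines.
count_fasta_records() {
  grep -c '^>' "$1"
}

# List IDs that are in the first FASTA file but missing from the second,
# e.g. input sequences that did not make it into the extracted reads.
# Usage: dropped_ids ORIGINAL_FASTA EXTRACTED_FASTA
dropped_ids() {
  awk '
    /^>/ {
      id = substr($1, 2)
      if (NR == FNR) kept[id] = 1          # first file read: extracted set
      else if (!(id in kept)) print id     # second file read: original set
    }
  ' "$2" "$1"
}
```

For example, `count_fasta_records exported-ref-seqs/dna-sequences.fasta` prints the number of sequences, and `dropped_ids` gives the gene IDs whose taxonomy entries have no matching sequence.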

Hi @Nicholas_Bokulich

Thank you for all your help.

Regards
Yogesh

