This code printed the following error:
"Invalid value for '--i-reference-sequences': Expected an artifact of at
least type FeatureData[Sequence]. An artifact of type SeppReferenceDatabase
was provided."
I read "OTU picking strategies".
If I use non-overlapping amplicons, like V2 and V4 of the rRNA gene, I should use closed-reference clustering.
But I don't know what a reference sequence is or how to get one. I searched the forum and Google but couldn't find this information.
Please help me! Thank you.
Hi @svbreqwaiu01, the sepp-refs-silva-128.qza file is a SeppReferenceDatabase artifact intended for fragment insertion, which is why it is rejected where a FeatureData[Sequence] artifact is expected. If you'd like to perform closed-reference OTU picking, follow the instructions here. The reference files to use (Greengenes and SILVA sequence files) can be found on the Data resources page.
As explained in the linked resources, this is exactly what closed-reference clustering means: you retain only those sequences that match a sequence in a given reference database at or above a defined percent similarity, and discard the rest. The following article does a great job comparing some of these approaches:
Callahan, Benjamin J., Paul J. McMurdie, and Susan P. Holmes. 2017. "Exact Sequence Variants Should Replace Operational Taxonomic Units in Marker-Gene Data Analysis." The ISME Journal 11 (12): 2639–2643. http://dx.doi.org/10.1038/ismej.2017.119
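
To make the idea of closed-reference clustering concrete, here is a minimal toy sketch in Python. It is not how QIIME 2 or vsearch works internally: the function names (`identity`, `closed_reference_cluster`) and the example sequences are made up for illustration, and the identity measure is a simple position-by-position comparison of equal-length strings rather than a real sequence alignment. It only shows the logic described above: each query is assigned to its best-matching reference if the match meets the similarity threshold, and is otherwise set aside as unmatched.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Toy stand-in for alignment-based percent identity (an assumption for
    illustration only); sequences of different length score 0.0.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference_cluster(queries, references, perc_identity=0.97):
    """Assign each query to its best reference, or mark it as unmatched.

    queries, references: dicts mapping sequence ID -> sequence string.
    Returns (clusters, unmatched), where clusters maps a reference ID to
    the list of query IDs assigned to it.
    """
    clusters = {}
    unmatched = []
    for qid, qseq in queries.items():
        best_ref, best_score = None, 0.0
        for rid, rseq in references.items():
            score = identity(qseq, rseq)
            if score > best_score:
                best_ref, best_score = rid, score
        if best_ref is not None and best_score >= perc_identity:
            clusters.setdefault(best_ref, []).append(qid)
        else:
            # No reference is similar enough, so the query is not retained.
            unmatched.append(qid)
    return clusters, unmatched


# Hypothetical example data, not taken from any real database.
references = {"ref1": "ACGTACGTAC", "ref2": "TTTTGGGGCC"}
queries = {"q1": "ACGTACGTAC", "q2": "ACGTACGTAA", "q3": "GGGGGGGGGG"}

clusters, unmatched = closed_reference_cluster(queries, references, perc_identity=0.9)
print(clusters)
print(unmatched)
```

In this toy data, q1 and q2 are close enough to ref1 to be kept, while q3 matches neither reference well and would be dropped. That is also why this approach suits non-overlapping amplicons: reads from different regions are never compared to each other, only to full-length reference sequences.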