RESCRIPt p-query issue from NCBI: Plugin error from rescript: taxonomy format requires at least one row of data

Hi @bkramer ,

It looks like the issue is that you are trying to download raw reads from a research study, stored in SRA. There are evidently not any taxonomic annotations associated with these.

By contrast, the bioproject in the tutorial links to annotated reference sequences contained in the nucleotide archive (more specifically, it is a RefSeq Targeted Loci project with curated annotations).

So the short explanation is that there are sequences to download in project 418634, but there are not any associated taxonomy annotations, so the action fails. This is also basically what the error message is saying when it reports "taxonomy format requires at least one row of data": it could not retrieve the expected annotations.

This action is specifically meant for getting annotated sequences. It is not equipped to handle sample metadata, so downloading study data is not an intended use (and will fail, as it did here).

We are working on a QIIME 2 action that will allow downloading study data, which should be released in the coming months.

RESCRIPt could be used for this purpose, but you should use a keyword query (unless you find an appropriate bioproject query, e.g., if there is a nifH RefSeq Targeted Loci project). Something like nifH[title] NOT uncultured[title] (note: I have not tested that query!)

Give that a try and let us know what you find... I recommend testing out a query directly on GenBank so that you can refine it further (e.g., to see a summary of how many sequences are retrieved, and the different taxonomic groups that are retrieved). Then download with RESCRIPt once you have found a query that you like!
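
If you would rather check the hit count from a script than from the GenBank website, here is a minimal sketch that asks the NCBI E-utilities ESearch endpoint how many nucleotide records match a query. The function names are hypothetical (they are not part of RESCRIPt or QIIME 2), and I am assuming the nuccore database is the one you want to search. Like the query above, treat it as untested against your data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query, db="nuccore"):
    """Build an ESearch URL that requests only the hit count (no IDs).

    `db="nuccore"` is an assumption: it targets the nucleotide database.
    """
    params = {"db": db, "term": query, "retmode": "json", "retmax": 0}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_hit_count(response_text):
    """Extract the total number of matching records from an ESearch JSON reply."""
    return int(json.loads(response_text)["esearchresult"]["count"])


def count_hits(query, db="nuccore"):
    """Query NCBI (requires network access) and return the number of hits."""
    with urlopen(build_esearch_url(query, db), timeout=30) as response:
        return parse_hit_count(response.read().decode("utf-8"))
```

Calling count_hits("nifH[title] NOT uncultured[title]") should return the number of matching records, which gives you a quick way to compare query variants before handing the final one to RESCRIPt. It does not replace browsing the results on GenBank, since the count says nothing about which taxonomic groups are represented.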

Good luck!
