How to find sample IDs from publicly available fastq.gz

Sam_Degregori · August 4, 2022, 4:59pm

Hello qiime people,

I have a question that I can't seem to find an answer to. I have a large fish gut microbiome dataset and I want to print out a list of unique feature IDs for each combination of 2 categorical metadata variables. There are 20 levels of factor 1 and 3 levels of factor 2, resulting in 60 lists that I need. My end goal is to make 20 Venn diagrams based on factor 2... I apologize in advance if this is confusing, as I myself am struggling to describe this in plain English.

But long story short... is there a way to print out feature IDs in the terminal based on a table.qza? Even better, would I be able to print out a list based on one or more metadata variables? I am selfishly trying to avoid doing this in R or manually by downloading all the rep-seqs TSV files...
Any help would be appreciated.
Thanks!
Sam

Sam_Degregori · August 25, 2022, 5:34pm

I ended up manually downloading all the table.qzv’s. It would be really nice to be able to make ASV and taxonomy tables directly in the terminal with qiime. This would open the door to some cool scripts you could write very quickly, instead of having to take things into R, which can be a pain. If I am missing a tool that makes this possible, please let me know!

Keegan-Evans · August 25, 2022, 7:14pm

@Sam_Degregori,

It looks like you may want to check out feature-table tabulate-seqs and metadata tabulate (and here is an example of it being used in one of our tutorials). This would at least keep you out of R.
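If you still want the per-group lists of feature IDs as plain text, here is a minimal sketch in plain Python (not a QIIME 2 tool). It assumes the table has already been exported to TSV, for example with `qiime tools export` followed by `biom convert --to-tsv`, and that the metadata is a tab-separated file whose first column holds the sample IDs. The column names `species` and `diet`, the sample IDs, and the tiny inline tables are hypothetical stand-ins for your two factors.

```python
import csv
import io
from collections import defaultdict


def read_biom_tsv(handle):
    """Parse a TSV table into {feature_id: {sample_id: count}}.

    Assumes the layout written by `biom convert --to-tsv`: an optional
    '# Constructed from biom file' comment, then a '#OTU ID' header row.
    """
    lines = [ln for ln in handle if ln.strip() and not ln.startswith("# ")]
    reader = csv.reader(lines, delimiter="\t")
    sample_ids = next(reader)[1:]
    return {
        row[0]: {s: float(c) for s, c in zip(sample_ids, row[1:])}
        for row in reader
    }


def read_metadata(handle, columns):
    """Map each sample ID to a tuple of its values in `columns`."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_col = reader.fieldnames[0]
    groups = {}
    for row in reader:
        # Skip directive rows such as '#q2:types'.
        if row[id_col].startswith("#"):
            continue
        groups[row[id_col]] = tuple(row[c] for c in columns)
    return groups


def feature_ids_by_group(table, sample_groups):
    """Collect the features observed (count > 0) in each metadata group."""
    out = defaultdict(set)
    for feature_id, counts in table.items():
        for sample_id, count in counts.items():
            if count > 0 and sample_id in sample_groups:
                out[sample_groups[sample_id]].add(feature_id)
    return dict(out)


# Hypothetical example data; replace with open("table.tsv") etc.
TABLE_TSV = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\tS3\n"
    "asv1\t5.0\t0.0\t2.0\n"
    "asv2\t0.0\t3.0\t0.0\n"
    "asv3\t1.0\t1.0\t0.0\n"
)
METADATA_TSV = (
    "sample-id\tspecies\tdiet\n"
    "#q2:types\tcategorical\tcategorical\n"
    "S1\ttrout\therbivore\n"
    "S2\ttrout\tcarnivore\n"
    "S3\tbass\therbivore\n"
)

table = read_biom_tsv(io.StringIO(TABLE_TSV))
sample_groups = read_metadata(io.StringIO(METADATA_TSV), ["species", "diet"])
groups = feature_ids_by_group(table, sample_groups)

for group in sorted(groups):
    print("\t".join(group), ",".join(sorted(groups[group])), sep="\t")
```

Because each group is a Python set, the regions of a Venn diagram fall out of set operations, e.g. `groups[("trout", "herbivore")] & groups[("trout", "carnivore")]` for the features shared by two levels of the second factor within one level of the first.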

Sam_Degregori · August 26, 2022, 9:32pm

Hi @Keegan-Evans,

This is exactly what I was looking for, I think. And I now painfully regret the weekend I took off to handle 60 qzv files. No pain, no gain though, I guess.
Thanks!