Greetings QIIME2 community!
It's been a while. I did a lot of QIIME analyses around 2011-2013, with everything set up on my work desktop during my postdoc. However, I ended up taking a position at a school where there was little time for the biocomputing I had enjoyed. What research I could do was limited, and we relied on summary output from sequencing facilities for our results. Now I am at a different institution, in a position that involves teaching but allows for more research, and I am trying to get back into the swing of things (although feeling WAY behind and overwhelmed in terms of the pipeline).
Currently, I have some sequencing data (both raw and some analysis output files) from a sequencing facility and was hoping to analyze it in QIIME2. The sequencing facility has demultiplexed, removed primers, filtered, dereplicated, and denoised the data and generated a .fa file with the zOTUs. They also did a taxonomic analysis and produced files showing relative abundance data and counts at different taxonomic levels. If possible, I would like to use any of these output files rather than start from scratch (unless I really need to) and go a bit further in the analysis.
Specifically, I would like to do a few things:
- Generate some diversity metrics to use to compare among samples (I sequenced 4 replicate samples from each of 4 species)
- Create a phylogenetic tree or do PCoA to visualize differences/similarities among the samples
- Do an ANOSIM to test for significant differences in community composition among the different samples
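In case it helps clarify what I am after, here is a rough sketch in plain Python of the kind of numbers I mean for the first two goals: a Shannon index per sample and a Bray-Curtis dissimilarity between pairs of samples (the sort of distance a PCoA or ANOSIM would be built on). The sample names and counts are made up for illustration and are not my real data, and I assume I would actually run this through QIIME2 rather than by hand.

```python
import math

def shannon(counts):
    """Shannon diversity (natural log) for one sample's zOTU counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' count vectors."""
    shared_diff = sum(abs(x - y) for x, y in zip(a, b))
    return shared_diff / (sum(a) + sum(b))

# Hypothetical counts: rows are samples, columns are zOTUs (not real data).
samples = {
    "sp1_rep1": [10, 10, 0, 0],
    "sp1_rep2": [8, 12, 0, 0],
    "sp2_rep1": [0, 0, 5, 15],
}

for name, counts in samples.items():
    print(name, round(shannon(counts), 3))

# Replicates of one species should be closer than samples of different species.
within = bray_curtis(samples["sp1_rep1"], samples["sp1_rep2"])
between = bray_curtis(samples["sp1_rep1"], samples["sp2_rep1"])
print("within:", within, "between:", between)
```

With 4 replicates from each of 4 species, I would expect the within-species distances to be smaller than the between-species ones if community composition really differs.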
I have created an account on the Galaxy server and was hoping to run my analyses there, since my current institution does not allow remote access. I found the Alpha and Beta Diversity tutorial and was going to follow it, but I am having a hard time getting oriented. It uses a table.qza file and a rooted-tree.qza file that were generated in the "Moving Pictures" tutorial, but I am not sure how these relate to the files I have or how to generate them from my files. None of my files are .qza (I am not really sure what program generates that file type). I tried to find out how to generate the feature table (I think this is the table.qza file mentioned), but it looks like that is produced during denoising, and although my data are denoised, I am not sure I have it.
This is a listing of the pipeline files provided by the sequencing facility, with their descriptions:
Within the pipeline folder are 6 different files, which are generated through various stages of our analysis pipeline.
- -pr.fastq: Demultiplexed dataset stripped of forward and reverse primer sequences in fastq format.
- -filtered.fa: Reads are filtered based on Q score and expected error probability, and any read with more than 1.0 expected errors is discarded.
- -uniques.fa: Dereplicated quality filtered reads
- -zmap.txt: zOTU read map
- -zotus.fa: Denoised unique sequences; reads with sequencing or PCR errors are removed followed by the removal of chimeras
- -zotutab.txt: A zOTU table created with the number of reads assigned to each zOTU in each sample. All reads, pre- and post-filtering, are considered for zOTU table construction.
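My guess is that the zotutab.txt file is the closest thing I have to a feature table. Below is a minimal sketch of how I picture its contents, assuming it is tab-separated with a first column of zOTU IDs (header "#OTU ID") followed by one column of read counts per sample. That layout, the function name read_zotu_table, and the example values are all assumptions for illustration, not something I have confirmed against my files.

```python
import csv
import io

def read_zotu_table(handle):
    """Parse a tab-separated zOTU table into {sample: {zotu_id: count}}.

    Assumes the header row is an ID label followed by sample names,
    and each later row is a zOTU ID followed by integer read counts.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sample_ids = header[1:]
    table = {sample: {} for sample in sample_ids}
    for row in reader:
        if not row:
            continue  # skip blank lines
        zotu_id, values = row[0], row[1:]
        for sample, value in zip(sample_ids, values):
            table[sample][zotu_id] = int(value)
    return table

# Hypothetical file contents (made-up sample names and counts).
example = "#OTU ID\tsampleA\tsampleB\nZotu1\t12\t0\nZotu2\t3\t7\n"
table = read_zotu_table(io.StringIO(example))

# Total reads per sample, a quick sanity check before any diversity analysis.
depths = {sample: sum(counts.values()) for sample, counts in table.items()}
print(depths)
```

If that picture is right, then what I need is the correct way to get a table like this (plus the zOTU sequences in the -zotus.fa file) into .qza form.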
Can someone please point me in the right direction as to how to take the pipeline files I have and begin doing the diversity analyses I am interested in? I believe once I can get my mind around just integrating my files into the pipeline that I will be okay.
Apologies for my deficiencies. Since I first used QIIME, I have had 2 kids (one of whom is only 2yo) and survived a stressful enrollment decline and ultimately a job loss at the teaching institution. I am trying my best to pick myself up, with whatever little energy I have left after a 3hr round-trip commute each day, and get back in the groove of things.
Thank you QIIME2 community in advance!
MicroMo