Our experiment looks at differences between the microbiomes of soil treated with two separate compounds at 0, 2, 6, and 12 weeks post-treatment. Each timepoint was processed and sequenced at a different time (not ideal, I know, but we did what we could!). Up until now, we have worked with each timepoint separately. For each timepoint, we have run alpha/beta diversity analyses, built phylogenetic trees, and assigned taxonomy. We then filtered each feature table to remove “unassigned” features as well as mitochondria/chloroplast taxa using the “qiime taxa filter-table” command, which creates a new filtered-feature table. Again, we have done this for each timepoint.
Now we want to do some over-time analyses, and we couldn’t find much information about the correct order of steps. We think we have a tentative pipeline (below), but we were hoping for some confirmation/advice on the best order!
We thought we’d first merge our filtered-feature tables (one per timepoint) using the feature-table plugin method “qiime feature-table merge” and then do the same with the rep-seqs using “qiime feature-table merge-seqs.” We’d also create a new metadata file by copy/pasting all of the information from each individual timepoint metadata file into one gigantic sheet, and validate it with Keemei. Then we’d have a combined filtered-feature table, combined rep-seqs, and a new metadata file. From there, we could build a new rooted phylogenetic tree so that we could run new alpha/beta diversity analyses to look at overall treatment trends. Lastly, we’d run ANCOM to identify the features that vary most between treatments and then look at those features with q2-longitudinal to see if there are treatment and feature differences over time.
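For the metadata step, we were wondering whether a small script might be safer than copy/pasting by hand. Below is a rough Python sketch of what we have in mind, not something we have run on our real files. It assumes every timepoint file is tab-separated, has the sample ID in its first column, and has no extra comment rows; the column name "weeks" and the sample IDs are made up for illustration. It also stops if the same sample ID shows up at two timepoints, since as far as we understand “qiime feature-table merge” refuses overlapping sample IDs by default.

```python
import csv
import io


def combine_metadata(tables):
    """Stack per-timepoint metadata tables into one set of rows.

    `tables` maps a timepoint label (weeks post-treatment) to the text of
    that timepoint's tab-separated metadata file.
    """
    columns = []      # union of column names, in first-seen order
    combined = []     # one dict per sample
    seen_ids = set()
    for weeks, text in tables.items():
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        id_column = reader.fieldnames[0]  # assumption: first column is the sample ID
        for name in reader.fieldnames + ["weeks"]:
            if name not in columns:
                columns.append(name)
        for row in reader:
            sample_id = row[id_column]
            if sample_id in seen_ids:
                raise ValueError(f"duplicate sample ID: {sample_id}")
            seen_ids.add(sample_id)
            row["weeks"] = weeks  # hypothetical column recording the timepoint
            combined.append(row)
    return columns, combined


def write_metadata(columns, rows):
    """Return the combined table as tab-separated text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Tiny made-up example standing in for two of our timepoint files.
week0 = "sample-id\ttreatment\nw0-A1\tcompoundA\nw0-B1\tcompoundB\n"
week2 = "sample-id\ttreatment\nw2-A1\tcompoundA\nw2-B1\tcompoundB\n"

columns, rows = combine_metadata({0: week0, 2: week2})
merged_text = write_metadata(columns, rows)
print(merged_text)
```

We would still validate the resulting file with Keemei before using it.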
Since we’ll also export the relative abundance data as a BIOM table, we can do further analysis later in Excel, combining whichever timepoints we need.
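To be clear about what we mean by relative abundance: each feature's count divided by the total count for that sample, so every sample sums to 1 regardless of which timepoint it came from. Here is a minimal Python sketch of that arithmetic. The nested dictionary is just a stand-in for an exported table, and the sample and feature IDs are made up.

```python
def relative_abundance(counts):
    """Convert per-sample feature counts to per-sample fractions.

    `counts` has the shape {sample_id: {feature_id: count}}.
    """
    result = {}
    for sample_id, features in counts.items():
        total = sum(features.values())
        if total == 0:
            raise ValueError(f"sample {sample_id} has no reads")
        result[sample_id] = {
            feature_id: count / total for feature_id, count in features.items()
        }
    return result


# Made-up counts for one sample at week 0 and one at week 2.
counts = {
    "w0-A1": {"featX": 30, "featY": 70},
    "w2-A1": {"featX": 10, "featY": 30},
}

fractions = relative_abundance(counts)
for sample_id, features in fractions.items():
    print(sample_id, features)
```

Because the normalization is done within each sample, samples from different timepoints can sit side by side in the same sheet afterwards.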
Does that order of steps sound about right? We’d appreciate any advice or suggestions!