We recently analyzed a gut microbiome dataset (16S) using QIIME 1.9.1 to demultiplex pooled Illumina sequence reads (Earth Microbiome Project protocol) and split reads per sample (split_libraries_fastq.py; split_sequence_file_on_sample_ids.py), rarefy our OTU table (single_rarefaction.py), and calculate UniFrac distances (beta_diversity.py). We will be moving to QIIME 2 for future analyses, but we are now wondering whether we need to go back and reanalyze this dataset using QIIME 2.
Is this necessary? Are there any differences between QIIME 1.9.1 and QIIME 2 we need to be aware of in these areas?
We also have a replication dataset yet to be analyzed, so we want to use QIIME 1.9.1 on this dataset as well for consistency with the previously analyzed dataset.
The big difference in the steps that you describe will be in the sequence quality control. In QIIME 1, that happened in split_libraries_fastq.py, and in QIIME 2 this is achieved through denoising with either DADA2 or Deblur (see the QIIME 2 Moving Pictures Tutorial, which illustrates how to use both of these).
The approaches for quality control in QIIME 2 are much better than what was available in QIIME 1.9.1. This is especially obvious when you look at community richness data generated with the two pipelines: in QIIME 1 you will have much higher “observed OTUs” counts (for example) than you will with QIIME 2 on the same data set, and most of the difference is probably sequencing error that is being mistaken for real biological diversity in QIIME 1. That said, the patterns that you observe in your data, such as relative alpha diversities between samples, are generally the same in QIIME 1 and QIIME 2 (based on my observations and those of others who have compared the two), so your QIIME 1 results are not wrong; they could probably just be improved. I do not think that you have to reanalyze the data with QIIME 2.
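If you want to check the "same relative patterns" point on your own data, one simple option is a rank correlation of per-sample richness between the two pipelines. The sketch below is only an illustration: the sample IDs and richness values are made up, and the helper functions are hypothetical stand-ins rather than anything from QIIME 1 or QIIME 2. A Spearman coefficient close to 1 would suggest that the samples are ordered the same way by both pipelines even though the absolute counts differ.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample richness (not real data): QIIME 1 "observed OTUs"
# versus QIIME 2 observed features after denoising.
qiime1_richness = {"S1": 850, "S2": 1200, "S3": 640, "S4": 990}
qiime2_richness = {"S1": 210, "S2": 340, "S3": 150, "S4": 260}

# Compare the same samples in the same order.
samples = sorted(set(qiime1_richness) & set(qiime2_richness))
rho = spearman(
    [qiime1_richness[s] for s in samples],
    [qiime2_richness[s] for s in samples],
)
print(f"Spearman rho across {len(samples)} samples: {rho:.3f}")
```

In practice you would fill the two dictionaries from the alpha diversity outputs of each pipeline, computed at a comparable rarefaction depth.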
If it were me, and I had additional data to analyze (you mention you have another dataset still to be analyzed), I would probably re-analyze the first with QIIME 2, and then analyze the additional dataset with QIIME 2 as well. Then I could be using the latest-and-greatest methods for my new data, and I could compare the QIIME 1 versus QIIME 2 results on the original data myself. This is the approach that I’m taking for a study that I’m currently working on.
We recently analyzed our new dataset using both QIIME 1.9.1 and QIIME 2. We found that QIIME 2, with denoising by either DADA2 or Deblur, performed much better than QIIME 1.9.1 in terms of the accuracy of taxonomic assignment, as evidenced by its far better estimate of the mock community composition. The beta diversity results, however, did not differ: we saw similar clustering of samples, and both pipelines led to the same conclusion.
Now, with QIIME 2 2017.11, you can easily run both the QIIME 1.9 and QIIME 2 pipelines and compare your results.
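For anyone who wants to put a number on "similar clustering", one common approach is a Mantel-style comparison of the two distance matrices. What follows is a minimal sketch under stated assumptions, not what we actually ran: the two 4x4 matrices are invented, the function names are hypothetical, and the samples are assumed to be in the same order in both matrices. It correlates the upper triangles of the matrices and uses a permutation test for a one-sided p-value.

```python
import random


def upper_triangle(matrix, order):
    """Pairwise distances above the diagonal, with samples taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def mantel(dm1, dm2, permutations=999, seed=0):
    """Mantel-style test: correlation of two distance matrices plus a
    one-sided permutation p-value. Rows and columns of dm2 are permuted
    together, so each permutation relabels the samples."""
    identity = list(range(len(dm1)))
    x = upper_triangle(dm1, identity)
    observed = pearson(x, upper_triangle(dm2, identity))
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(x, upper_triangle(dm2, perm)) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical UniFrac-like distances for the same four samples (not real data).
dm_qiime1 = [
    [0.00, 0.20, 0.70, 0.80],
    [0.20, 0.00, 0.60, 0.75],
    [0.70, 0.60, 0.00, 0.30],
    [0.80, 0.75, 0.30, 0.00],
]
dm_qiime2 = [
    [0.00, 0.25, 0.65, 0.85],
    [0.25, 0.00, 0.70, 0.70],
    [0.65, 0.70, 0.00, 0.20],
    [0.85, 0.70, 0.20, 0.00],
]

r, p = mantel(dm_qiime1, dm_qiime2)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

With only four samples there are just 24 possible relabelings, so the p-value here is not informative; the sketch is meant to show the mechanics. A high correlation on a real dataset would be consistent with both pipelines giving the same sample clustering.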