qiime2R: samples with 0 reads become duplicates filled with data from other samples in read_qza function?

Hi all and @jbisanz,

As noted in the title, I noticed some very odd behavior in how qiime2R handles samples with zero counts. I was hoping to get some insight into why this is happening.

Package versions: Qiime 2019.10, R version 3.6.1, qiime2R 0.99.11.

I am characterizing microbial isolates, so my data is much, much simpler (and lower coverage) than complex communities. I used Qiime2 to process and classify my data, and then used qiime2R to import it into R, where I do downstream filtering/analysis in phyloseq. However, in my finalized data I noticed a large number of samples with identical read counts and compositions. Here's a snippet of that dataset:

Since this seemed unlikely to be real, I manually checked these samples on Qiime2 View and in an exported BIOM file. I found that most of these "duplicate" samples actually had 0 total reads in the original file. The rest do appear to be genuine duplicates in the imported data. Take P7-D9 and P7-E1 (highlighted in yellow), for example: one of them is the real sample with that composition, and the other is a 0-count sample that was somehow filled with that data. Even stranger are the samples in red. None of them are real; they are all 0-count samples. After poking around in my data, my best guess is that each is a partial copy of another sample, given their read counts and ASV classifications.

I suspect that the conversion of 0-read samples into false samples takes place in the qiime2R function read_qza, because the false read counts are already present in the $data section of my read_qza object. The nonzero samples all look fine, as far as I can tell.

I saw in a couple of threads that there is an issue with qiime2R and samples with 0 total reads, such as "Qiime2r file read issue". But unlike in that thread, read_qza did not throw an error or warning for me, so I didn't realize there was a problem until I was looking at my output data line by line.
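
For anyone who wants to automate the line-by-line check rather than eyeball it, here is a minimal sketch in Python. It assumes the exported BIOM table and the imported `$data` matrix have each been dumped to a plain dictionary of sample ID to per-feature counts. The function names and the toy tables are hypothetical and are not my real data.

```python
def sample_totals(table):
    """Return {sample ID: total read count} for a {sample: {feature: count}} table."""
    return {sample: sum(counts.values()) for sample, counts in table.items()}


def mismatched_samples(original, imported):
    """List samples whose total read count differs between the two tables.

    A sample missing from `imported` is treated as having a total of 0.
    """
    original_totals = sample_totals(original)
    imported_totals = sample_totals(imported)
    return sorted(
        sample
        for sample, total in original_totals.items()
        if total != imported_totals.get(sample, 0)
    )


# Hypothetical toy tables: "S2" has 0 reads in the export,
# but shows up filled with the counts of "S1" after import.
exported_table = {
    "S1": {"ASV1": 10, "ASV2": 5},
    "S2": {"ASV1": 0, "ASV2": 0},
}
imported_table = {
    "S1": {"ASV1": 10, "ASV2": 5},
    "S2": {"ASV1": 10, "ASV2": 5},
}

suspect = mismatched_samples(exported_table, imported_table)
print(suspect)
```

Any sample this reports would be worth checking by hand, since a correct import should preserve every per-sample total.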

Is it supposed to do this, or am I making a mistake somewhere? My current plan is to simply remove 0-count samples in Qiime2 before exporting them to R. If this is the expected behavior of the function, could the qiime2R tutorial state explicitly that 0-count samples must be removed before exporting?
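
To make the workaround concrete, here is a rough sketch in Python of what "remove 0-count samples" means for a feature table. It is only a stand-in for doing the filtering in Qiime2 itself; the function name, the `min_total` parameter, and the toy table are hypothetical.

```python
def drop_empty_samples(table, min_total=1):
    """Keep only samples whose total read count is at least `min_total`.

    `table` maps sample ID -> {feature ID: count}. The input is not modified.
    """
    return {
        sample: dict(counts)
        for sample, counts in table.items()
        if sum(counts.values()) >= min_total
    }


# Hypothetical toy table: "B" and "C" carry no reads and should be dropped.
raw_table = {
    "A": {"ASV1": 12, "ASV2": 0},
    "B": {"ASV1": 0, "ASV2": 0},
    "C": {},
}

filtered_table = drop_empty_samples(raw_table)
print(sorted(filtered_table))
```

The downside of this approach is that the 0-count samples disappear from the table entirely, so they need to be tracked separately if they matter for the analysis.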


Hi Caroline, this is an issue with biomformat, but it has not been patched. A more robust method to catch this case is needed, or perhaps I will try to cook something up myself.


Hi Caroline,

I just changed the method of biom import, and in testing with my own files it is faithfully importing qza/biom files with 0-count samples. Please test this new version (v0.99.3) and let me know if it works for you!



Hi Jordan,

I just tried the new version, and it worked! My zero-count samples now have zero counts.

Thanks so much for the speedy reply and fix, I really appreciate it.

