I am working with q2-sidle in QIIME 2 version 2021.4, and every step runs without errors. However, I have noticed that some samples disappear from the reconstructed table (the output of the qiime sidle reconstruct-counts command).
My first thought was to check the DADA2 table and the DADA2 table trimmed to a length of x nt, but all the initial samples were there. Maybe these samples disappeared during the alignment step? But in all regions? That didn't seem right to me. I also tried another database, and even more samples disappeared.
I had run this step previously with the same database and parameters, and all the samples were kept in the final reconstructed table.
Thank you all for keeping this amazing forum going to resolve doubts like mine!
There is a minimum sample size (min_counts) parameter which removes samples with fewer than 1000 counts in the reconstructed features. You could try lowering this parameter and see if you retain more samples.
However, if this is happening, you should be getting a warning that says something like,
There are 3 samples with fewer than 1000 total reads. These samples will be discarded.
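To make the effect of that cutoff concrete, here is a minimal sketch of a per-sample total-count filter. It is not q2-sidle's actual code: the function name split_by_min_counts, the dictionary layout, and the sample and feature IDs are all hypothetical, and the only things taken from this thread are the min_counts name and the 1000-read threshold.

```python
# Illustrative only: a plain-Python stand-in for a reconstructed table,
# mapping sample ID -> {feature ID: count}. All IDs and counts are made up.
def split_by_min_counts(table, min_counts=1000):
    """Split samples into kept/discarded by their total count.

    A sample is discarded when its total is strictly below min_counts,
    mirroring the "fewer than 1000 total reads" wording of the warning.
    """
    totals = {sample: sum(counts.values()) for sample, counts in table.items()}
    kept = {s: t for s, t in totals.items() if t >= min_counts}
    discarded = {s: t for s, t in totals.items() if t < min_counts}
    return kept, discarded


example = {
    "sample-a": {"feat-1": 900, "feat-2": 400},  # total 1300
    "sample-b": {"feat-1": 120},                 # total 120
    "sample-c": {},                              # nothing aligned
}

kept, discarded = split_by_min_counts(example, min_counts=1000)
print("kept:", sorted(kept))
print("discarded:", sorted(discarded))
```

Lowering the threshold keeps more samples, but a sample whose total is zero (nothing aligned at all) is dropped at any positive cutoff, which is worth ruling out before changing the parameter.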
In fact, I had run this command with the same number of samples before and didn't have this problem. Also, I always add the verbose flag because I like to follow the progress. As for the warning message, it didn't show up... I am a bit confused.
If I find the reason behind this problem, I will share it!
I also ran it in verbose mode and got the following warning:
Database map summarized
/home/data/anaconda3/envs/q2-sidle/lib/python3.8/site-packages/distributed/worker.py:4325: UserWarning: Large object of size 808.06 MiB detected in task graph:
([['AAQK01003909.1492.2988', 'AAQK01003909.1492.29 ... 672.1.1491']],)
Consider scattering large objects ahead of time
with client.scatter to reduce scheduler burden and
keep data on workers
future = client.submit(func, big_data) # bad
big_future = client.scatter(big_data) # good
future = client.submit(func, big_future) # good
Alignment map constructed
/home/data/anaconda3/envs/q2-sidle/lib/python3.8/site-packages/q2_sidle/_reconstruct.py:180: UserWarning: There are 854 samples with fewer than 1000 total reads. These samples will be discarded.
warnings.warn("There are %i samples with fewer than %i total"
Relative abundance calculated.
So I guess this is why I lose so many.
Any suggestions as to why this is happening? When I analyse the two regions independently, I have no quality problems at all, and I rarefied at >10,000 reads.
I will work on adding that as an update. I'm sorry the warning isn't showing up! I've opened an issue on GitHub, and I'll see if I can add it to the pull request that does sample accounting.
As far as your specific issue goes @nandreani, you only had 1 ASV align in the V13 region and none in the V34. (You can see the sequences that aligned to each region using qiime metadata tabulate. Most of the alignment/kmer map/reconstruction database files can be coerced to look like metadata.) I might try primer trimming with cutadapt before you generate your ASV table.
So, I think this is a symptom of an earlier problem. I've got a pull request to look at counts retained after alignment, and I'll keep you updated about when I can get it merged. (Sorry for the slow updates; sidle is a side project for a lot of the development team.)
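Until that accounting is available, the same check can be done by hand once per-sample totals have been exported from the table before alignment and from the aligned features. The sketch below is only an assumption about what such accounting might look like, not the code in the pull request; the function name retention_report, the input dictionaries, and all numbers are hypothetical.

```python
# Illustrative only: compare per-sample read totals at two stages
# (for example, the regional ASV table vs. the counts that aligned).
def retention_report(before, after, min_counts=1000):
    """Return, per sample, the reads before/after and whether it would be dropped.

    before, after: dicts mapping sample ID -> total reads at each stage.
    Samples missing from `after` are treated as having 0 reads retained.
    """
    report = {}
    for sample, n_before in before.items():
        n_after = after.get(sample, 0)
        report[sample] = {
            "before": n_before,
            "after": n_after,
            "retained": n_after / n_before if n_before else 0.0,
            "discarded": n_after < min_counts,
        }
    return report


# Made-up totals: plenty of reads per sample before alignment, few after.
before_alignment = {"sample-a": 12000, "sample-b": 15000}
after_alignment = {"sample-a": 11000, "sample-b": 300}

for sample, row in retention_report(before_alignment, after_alignment).items():
    print(sample, row)
```

A sample with a high count before alignment and a low retained fraction afterwards points at the alignment step (for instance, untrimmed primers) rather than at sequencing depth.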