I am working with q2-sidle in QIIME 2 version 2021.4 and every step runs perfectly. However, I have noticed that some samples disappear in the reconstructed table (the output of the qiime sidle reconstruct-counts command).
My first thought was to look into the DADA2 table and the DADA2 table trimmed to x nt, but all the initial samples were there. Maybe these samples disappeared during the alignment step? But in all regions? That didn't seem right to me. I also tried with another database, and even more samples disappeared.
I had run this step previously with the same database and parameters, and I kept all the samples in the final reconstructed table.
Any suggestions?
Thank you all for keeping up this amazing forum to resolve doubts like mine!!
There is a minimum sample size (min_counts) parameter which removes samples with fewer than 1000 counts in the reconstructed features. You could try lowering this parameter and see if you retain more samples.
However, if this is happening, you should be getting a warning that says something like,
There are 3 samples with fewer than 1000 total reads. These samples will be discarded.
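For reference, lowering the threshold just means passing a smaller value for --p-min-counts in your existing call. A rough sketch (the "..." stands for the rest of your usual inputs, which I'm leaving out since the exact input flags vary between sidle releases; check qiime sidle reconstruct-counts --help for your install):

```shell
# Illustrative fragment only: keep the rest of your existing
# reconstruct-counts call unchanged; --p-min-counts is the knob
# that controls which samples are retained (default 1000).
qiime sidle reconstruct-counts \
  ... \
  --p-min-counts 500 \
  --o-reconstructed-table reconstructed-table.qza
```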
In fact, I had run this command with the same number of samples before and didn't have this problem. Also, I always add the verbose flag because I like to watch the progress. As for the warning message, it didn't show up... I feel a bit confused.
If I find the reason behind this problem, I will share it!
Which version of sidle are you using? Have you checked that the samples survive denoising? I am working (slowly) on code to do some accounting, but it's still very much a work in progress.
Hello,
I am having the same problem.
As I had already analysed the 2 datasets independently, I used the rep-seqs and tables. I start with 480 samples (x2) and end up with 112.
Do you have any idea on why this is happening?
I could of course start over from the raw data, but I am not sure that would work either.
The dada2 trim posthoc step doesn't care about depth; the filtering happens during reconstruction. If you merge the tables and summarize them, what do the counts look like?
I'm sorry you're having issues! It's hard to troubleshoot without more details about your process, and possibly the data.
Could you describe the process (how do you get the tables going in) and check their depth as well? Are you using the version from the main branch on github?
Or, you could try changing the --p-min-counts parameter.
Everything runs smoothly until the qiime sidle reconstruct-counts command. The tables (also the 110 nt ones) have 480 samples, but the reconstructed one has only 112.
I also ran it in verbose mode and got the following warning:
Database map summarized
/home/data/anaconda3/envs/q2-sidle/lib/python3.8/site-packages/distributed/worker.py:4325: UserWarning: Large object of size 808.06 MiB detected in task graph:
([['AAQK01003909.1492.2988', 'AAQK01003909.1492.29 ... 672.1.1491']],)
Consider scattering large objects ahead of time
with client.scatter to reduce scheduler burden and
keep data on workers
future = client.submit(func, big_data) # bad
big_future = client.scatter(big_data) # good
future = client.submit(func, big_future) # good
warnings.warn(
Alignment map constructed
/home/data/anaconda3/envs/q2-sidle/lib/python3.8/site-packages/q2_sidle/_reconstruct.py:180: UserWarning: There are 854 samples with fewer than 1000 total reads. These samples will be discarded.
warnings.warn("There are %i samples with fewer than %i total"
counts loaded
Relative abundance calculated.
So I guess this is why I lose so many.
Any suggestions on why this is happening? When analysing the 2 regions independently I had no quality problems at all, and I rarefied at >10,000 reads.
I will work on adding that as an update. I'm sorry the error isn't showing up! I've opened an issue on GitHub, and I'll see if I can add it to the pull request that does sample accounting.
As far as your specific issue goes @nandreani, you only had 1 ASV align in the V13 region and none in the V34. (You can check the sequences that aligned with each region using qiime metadata tabulate; most of the alignment/kmer map/reconstruction database files can be coerced to look like metadata.) I might try primer trimming with cutadapt before you generate your ASV table.
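Concretely, the two checks above might look something like this (artifact names and the primer sequences are placeholders you'd replace with your own):

```shell
# Inspect which ASVs aligned in a region: the regional alignment
# artifact can be viewed as metadata.
qiime metadata tabulate \
  --m-input-file v13-alignment.qza \
  --o-visualization v13-alignment.qzv

# Trim primers before denoising; --p-discard-untrimmed drops reads
# where the primer wasn't found. Substitute your actual primers.
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux.qza \
  --p-front-f FORWARD_PRIMER_SEQUENCE \
  --p-front-r REVERSE_PRIMER_SEQUENCE \
  --p-discard-untrimmed \
  --o-trimmed-sequences trimmed-demux.qza
```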
So, I think this is a symptom of an earlier problem. I've got a pull request to look at counts retained after alignment, I'll keep you updated about when I can get it merged. (Sorry for the slow updates, sidle is a side project for a lot of the development team.)
Thank you very much, Justine. I am trying to start over with 4 samples only, to see if I can solve the issue. It doesn't seem that I have adapters in my sequences.