Subsampling issue

Jeongsu_Kim · September 6, 2018, 10:07am

Hi guys,

I’m new to QIIME 2.

I’m working with MiSeq paired-end FASTQ data from 400 samples.

Running DADA2 took so long that I switched to the “DADA2 Workflow for Big Data” (http://benjjneb.github.io/dada2/bigdata_paired.html).

My question is: how can I import the final output files, which appear to be .rds files, into QIIME 2 for downstream analyses?

Or is there any way to subsample with QIIME 2?

Thank you,
Sue

ebolyen · September 6, 2018, 5:26pm

Hey @Jeongsu_Kim,

I haven't looked in a little while, but the QIIME 2 DADA2 wrapper script does generally follow the per-sample denoising strategy in your link. The reason it's so terribly slow is that it's using a "universal" build which can't make as much use of your specific CPU architecture as a compiled version can.

As far as importing what you have, you won't be able to use .rds files directly, as those aren't really a file format so much as a memory dump.

But you should be able to load them in R and then write out a FASTA file and a roughly compatible OTU table.
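
As a rough sketch of that conversion, assuming you have first written the DADA2 sequence table out of R as a CSV (for example with `write.csv(seqtab, "seqtab.csv")`; the object and file names are hypothetical), with samples as rows and sequences as column headers, something like the following Python would produce FASTA text and a tab-separated feature table. The function name and the MD5-based feature IDs are my own choices for illustration, not something QIIME 2 requires.

```python
import csv
import hashlib
import io


def seqtab_to_fasta_and_table(csv_text):
    """Turn a samples-by-sequences CSV into FASTA text and a features-by-samples TSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    sequences = header[1:]  # first header cell is the (empty) row-name column
    sample_ids = [row[0] for row in data]

    # Assumption: use the MD5 hex digest of each sequence as a short feature ID.
    feature_ids = [hashlib.md5(seq.encode("ascii")).hexdigest() for seq in sequences]

    # One FASTA record per sequence.
    fasta = "".join(f">{fid}\n{seq}\n" for fid, seq in zip(feature_ids, sequences))

    # Transpose the counts so that features are rows and samples are columns.
    lines = ["#OTU ID\t" + "\t".join(sample_ids)]
    for col, fid in enumerate(feature_ids, start=1):
        counts = [row[col] for row in data]
        lines.append(fid + "\t" + "\t".join(counts))
    table = "\n".join(lines) + "\n"
    return fasta, table


# Tiny made-up example in the shape that write.csv would produce.
example = '"","ACGT","TTGA"\n"S1",10,0\n"S2",3,7\n'
fasta_text, table_text = seqtab_to_fasta_and_table(example)
print(fasta_text)
print(table_text)
```

The TSV would still need to be converted to BIOM (for example with the `biom convert` command) before it can be imported into QIIME 2 as a feature table, and the FASTA file imported separately as the representative sequences.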

This tutorial:

by @ChristianEdwardson has some step-by-step instructions.

Jeongsu_Kim · September 10, 2018, 10:31am

Thank you for your answer.

I will try the tutorial you suggested, but I also wonder whether there is a tutorial for subsampling using QIIME 2.

ebolyen · September 11, 2018, 5:05pm

We don't have any good tooling for subsampling/partitioning raw reads (although we really should).
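
One option is to subsample the reads outside of QIIME 2 before importing. Below is a minimal sketch using reservoir sampling, assuming well-formed four-line FASTQ records and forward and reverse files that list reads in the same order. The function names are made up for illustration; this is not a QIIME 2 feature.

```python
import random


def read_fastq(lines):
    """Yield FASTQ records as 4-tuples of lines (header, sequence, plus, quality)."""
    it = iter(lines)
    # zip over the same iterator groups consecutive lines four at a time.
    return zip(it, it, it, it)


def subsample_pairs(forward_lines, reverse_lines, n, seed=0):
    """Reservoir-sample up to n read pairs, keeping forward and reverse reads matched."""
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(read_fastq(forward_lines), read_fastq(reverse_lines))
    for i, pair in enumerate(pairs):
        if i < n:
            # Fill the reservoir with the first n pairs.
            reservoir.append(pair)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = pair
    return reservoir


# Made-up in-memory example with ten read pairs.
fwd, rev = [], []
for i in range(10):
    fwd += [f"@read{i}\n", "ACGT\n", "+\n", "IIII\n"]
    rev += [f"@read{i}\n", "TGCA\n", "+\n", "IIII\n"]

sampled = subsample_pairs(fwd, rev, n=3, seed=42)
for f_rec, r_rec in sampled:
    print(f_rec[0].strip(), r_rec[0].strip())
```

With real data you would pass open file handles instead of lists (for example from `gzip.open(path, "rt")`) and write the sampled records back out to new forward and reverse files. Sampling pairs together, rather than each file independently, is what keeps the two files in sync.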

system · October 12, 2018, 11:10pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.