Merge multiple q2 artifacts prior to denoising

Connor_Geraghty · May 23, 2019, 6:03pm

Hello Forum!

I have a relatively simple question, but I haven't been able to find a solution quite yet. Here is what I am trying to do: I have a set of samples all sequenced together, but four different spacers/adapters were used, creating four subsets of samples. The samples are demultiplexed but still contain the spacer sequence. I imported the four subsets of samples into four separate artifacts of type SampleData[PairedEndSequencesWithQuality]. To remove the unique spacers, I will have to run cutadapt on each of the four subsets, with the four output artifacts still being of type SampleData[PairedEndSequencesWithQuality].

My question: is there a way to merge these four SampleData[PairedEndSequencesWithQuality] artifacts into one SampleData[PairedEndSequencesWithQuality] artifact? That way I could continue my pipeline straight into DADA2 denoising with the single artifact. Or will I have to run denoising on the four artifacts separately and merge the resulting feature tables?

I guess my other option is to extract the .fastq.gz files from each artifact and re-import them into qiime2 as a single artifact. I would like to stay away from this if possible, but it could be the best option.

Details on my setup: using python3.6 on a server, with qiime2-2018.11 (only option on the server currently without setting up a docker container).

Thanks for your help!
Connor

jwdebelius · May 23, 2019, 7:49pm

Hi @Connor_Geraghty,

First, welcome to the forum!

I think integration of multiple regions is a holy grail in marker gene analysis right now. The only thing I know that - in theory - does this is SMURF. It is umm... there's an implementation (?) in Matlab that's basically a series of in-house scripts strung together and called a day. Pretty dense, hard to use, and not implemented in QIIME. I've run their example and it took me maybe... half an hour to profile the sample (with a substantial sacrifice to my swear jar when the FPM reached heights only previously matched when I tried to read SAS).

But, you may also find this discussion on a similar issue interesting?

Best,
Justine

Connor_Geraghty · May 23, 2019, 8:57pm

Hi Justine,

Thank you for your response! I should clarify these samples all had the same primer sets (V4, 515F/806R), but the way our sequencing core operates, they use four unique adapter/spacer pairs. I have to use cutadapt to trim each set separately, thus creating four artifacts that each contain 1/4 of the sample data. I am looking for a way to merge these four SampleData artifacts into a single artifact for downstream use.

Connor

jwdebelius · May 23, 2019, 9:12pm

Ahhh. Sorry for the confusion!

In this case, I would run Cutadapt on all of them, and then I think I'd denoise them separately to generate a feature table and rep set for each subset. You can then just merge those with the feature-table plugin. There are a lot of answers here about merging if you search the forum, and check out the FMT Tutorial.
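As a rough sketch of the merge step described above, the snippet below assembles the two feature-table commands (`qiime feature-table merge` for the tables and `qiime feature-table merge-seqs` for the rep sets). It only builds and prints the command lines; it does not run QIIME 2. The subset names and `.qza` file names are hypothetical placeholders, and the helper `build_merge_commands` is illustrative rather than part of any QIIME 2 API. It also assumes each subset was denoised with identical DADA2 trimming and truncation parameters, so that identical sequences end up with the same feature ID across subsets.

```python
import shlex


def build_merge_commands(tables, rep_seqs,
                         merged_table="merged-table.qza",
                         merged_rep_seqs="merged-rep-seqs.qza"):
    """Return the two QIIME 2 CLI commands (as argument lists) that
    merge per-subset DADA2 feature tables and representative sequences.

    All file names are hypothetical placeholders.
    """
    if len(tables) != len(rep_seqs):
        raise ValueError("expected one rep-seqs artifact per feature table")

    # qiime feature-table merge takes --i-tables once per input table.
    table_cmd = ["qiime", "feature-table", "merge"]
    for path in tables:
        table_cmd += ["--i-tables", path]
    table_cmd += ["--o-merged-table", merged_table]

    # qiime feature-table merge-seqs takes --i-data once per input rep set.
    seqs_cmd = ["qiime", "feature-table", "merge-seqs"]
    for path in rep_seqs:
        seqs_cmd += ["--i-data", path]
    seqs_cmd += ["--o-merged-data", merged_rep_seqs]

    return [table_cmd, seqs_cmd]


# Hypothetical names for the four spacer subsets.
subsets = ["spacer1", "spacer2", "spacer3", "spacer4"]
commands = build_merge_commands(
    tables=[f"{name}-table.qza" for name in subsets],
    rep_seqs=[f"{name}-rep-seqs.qza" for name in subsets],
)

# Print shell-ready command lines for review before running them.
for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually execute the commands, each list could be passed to `subprocess.run(cmd, check=True)` from an environment where QIIME 2 is activated.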

Best,
Justine
