Merging data from multiple artifacts from Qiita

hey!
I want to merge data from multiple artifacts. I have several subdirectories, each containing the following files:

forward.fastq.gz
reverse.fastq.gz
barcodes.fastq.gz
metadata.tsv

I want to merge all the data, and I want to do so effectively.
Currently I am thinking about creating a .qza file for each subdirectory using the following command:

qiime tools import \
  --type EMPPairedEndSequences \
  --input-path /path/to/your/input-directory/ \
  --output-path emp-paired-end-sequences.qza

and then using this file with the metadata file for further analysis:

qiime demux emp-paired \
  --i-seqs emp-paired-end-sequences.qza \
  --m-barcodes-file /path/to/metadata.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences demux-paired-end.qza \
  --p-rev-comp-mapping-barcodes

and then merging the demux qza file using the following command:

qiime feature-table merge-seqs \
  --i-data demux-directory_1.qza \
  --i-data demux-directory_2.qza \
  --o-merged-data merged-sequences.qza

Of course, the file names would match the .qza files actually created, and there would be one --i-data entry for each of my .qza files, not only 2.

After the merge I would continue the analysis on a single .qza file, which would be easier.
Is this a valid way to do the analysis?
Is there a more effective way to merge the data?

Any insight will be appreciated.
Thank you very much,
Nadav

Hi @nadavlisha,
I think that this tutorial from our user docs should be helpful to you, check it out!
--Hannah


@nadavlisha, I want to draw out one point from the link that @jphagen shared. Generally speaking, you should merge after performing quality control (i.e., at the feature table stage), not before (i.e., at the demultiplexed sequence stage). For some quality control (QC) methods this is critical, because they assume that all sequences came from the same sequencing run. It is good practice regardless, as it lets you perform QC separately on smaller data sets, one per run, which tends to be easier to manage than running one massive QC job.
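
To make the order of operations concrete, here is a rough dry-run sketch for any number of runs. It only prints the commands and does not call QIIME 2. It assumes DADA2 as the QC step, which is one option among several. The directory names (run_1, run_2), the output file names, and the truncation lengths are hypothetical placeholders, not values taken from your data. My understanding is that the denoising parameters are normally kept the same across runs so that the resulting sequences are comparable, but please check that against the tutorial.

```shell
#!/bin/sh
# Dry-run sketch: prints the QIIME 2 commands instead of running them.
# Assumed (hypothetical) layout: each run directory already contains
# demux-paired-end.qza produced by `qiime demux emp-paired`.

# Placeholder truncation lengths; choose real values from your quality plots.
TRUNC_F=150
TRUNC_R=150

plan_commands() {
  tables=""
  seqs=""
  for run in "$@"; do
    # Step 1: denoise each sequencing run separately.
    echo "qiime dada2 denoise-paired --i-demultiplexed-seqs $run/demux-paired-end.qza --p-trunc-len-f $TRUNC_F --p-trunc-len-r $TRUNC_R --o-table $run/table.qza --o-representative-sequences $run/rep-seqs.qza --o-denoising-stats $run/stats.qza"
    # Collect one input flag per run for the two merge commands.
    tables="$tables --i-tables $run/table.qza"
    seqs="$seqs --i-data $run/rep-seqs.qza"
  done
  # Step 2: merge only after every run has been denoised.
  echo "qiime feature-table merge$tables --o-merged-table merged-table.qza"
  echo "qiime feature-table merge-seqs$seqs --o-merged-data merged-rep-seqs.qza"
}

plan=$(plan_commands run_1 run_2)
echo "$plan"
```

Note that merge-seqs is applied here to the representative sequences that come out of denoising, not to the demultiplexed reads. Once the printed commands look right for your directories, they can be run one by one, and the rest of the analysis then works from merged-table.qza and merged-rep-seqs.qza.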
