We have a multi-sample dataset (34 samples) of paired-end reads.
I have demultiplexed and cleaned all reads using a pipeline of my own, ending with 34 fasta files (1 per sample).
I would like to import all these cleaned reads into Qiime for downstream analyses. To do so, I could combine all these sequences into 1 fasta file and import it using the ‘qiime tools import’ command.
In that case, how would Qiime know the origin of each sequence (i.e. which sample it comes from)? I assume this information should be contained in the sequence headers and in a metadata file, but I couldn’t find any clear description of how to proceed in the Qiime documentation.
No need to combine! QIIME 2 handles many formats of demultiplexed data perfectly well.
The one problem is that it sounds like you have fasta data, not fastq. This means:
You can't import using one of the demultiplexed data formats (e.g., this).
You will not be able to denoise your data with DADA2 or Deblur, since both rely on the quality scores that fasta files lack. You will be forced to use OTU picking.
So if you really do have fasta data and can't tack the quality scores back on to make fastq, you will need to:
Concatenate your fasta files. And yes, sample information should be included in the headers, following this format.
Use this tutorial to import and cluster your data. At the end of that tutorial you will have a feature table and representative sequences, which you can use as described in any of the other tutorials.
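If it helps, here is a minimal Python sketch of the concatenation step. It is not an official QIIME 2 tool, and it rests on a few assumptions you should check against the format linked above: that each header should look like `>sample-id_N` (sample ID, underscore, running sequence number), that the sample ID can be taken from the file name (so `sampleA.fasta` becomes `sampleA`), and that the original read names can be discarded. The function names and the paths in the usage comment are hypothetical.

```python
from pathlib import Path


def relabel_fasta(lines, sample_id):
    """Yield FASTA lines with every header rewritten as '>sample-id_N'.

    Assumption: the header convention is '<sample-id>_<sequence-number>'.
    The original read name (and anything after it) is dropped.
    """
    n = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n += 1
            yield f">{sample_id}_{n}"
        else:
            yield line  # sequence line, copied unchanged


def combine_fastas(fasta_paths, out_path):
    """Concatenate per-sample FASTA files into one relabelled file.

    The sample ID is taken from each file name without its extension.
    Returns a dict mapping sample ID to the number of sequences written.
    """
    counts = {}
    with open(out_path, "w") as out:
        for path in sorted(Path(p) for p in fasta_paths):
            sample_id = path.stem
            count = 0
            with open(path) as handle:
                for line in relabel_fasta(handle, sample_id):
                    if line.startswith(">"):
                        count += 1
                    out.write(line + "\n")
            counts[sample_id] = count
    return counts


# Hypothetical usage, assuming the 34 files live in a 'cleaned/' directory:
#   counts = combine_fastas(Path("cleaned").glob("*.fasta"), "combined.fna")
#   print(counts)
```

The combined file is then what you would hand to `qiime tools import`; the tutorial gives the exact type and options to use. It is worth checking that the sample IDs produced this way match the IDs in your metadata file.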