Importing the data and metadata

Dear All,

I want to start some work with QIIME 2, but it seems a bit complicated, especially importing files and metadata into artifacts.
I have around 450 samples that were sequenced on Illumina (paired-end) and are already demultiplexed.
First, let’s start with the metadata file. I think that if the sequences are demultiplexed, I don’t have to include the BarcodeSequence and LinkerPrimerSequence columns. Is that right?
Also, the important thing: how can I link the sample IDs to the file names?
For example, one of my samples is named A001, and the FASTQ files for this sample are J188b_S1_L001_R1_001.fastq.gz and J188b_S1_L001_R2_001.fastq.gz.
How can I make sure that QIIME will associate those files with this specific sample ID?

Another question regarding QIIME 2: has the method of alignment or OTU picking changed since v1.9.1? I’m asking because I don’t know whether the new version changes the working methodology enough to justify replacing the whole pipeline that we currently use.


Hey @Faisal,

Sorry for the delayed response!

That is correct. Also, in QIIME 2 there are no “required” metadata columns, so if you did have multiplexed data, your BarcodeSequence column could actually be named anything.
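
To make that concrete, here is a minimal sketch of writing a metadata file for demultiplexed data: a tab-separated file with an identifier column (`sample-id` is one of the headers QIIME 2 accepts for it) and no barcode or primer columns. The `treatment-group` column and its value are hypothetical placeholders, not something from your study.

```python
import csv
import io

# Hypothetical rows: only the identifier column matters to QIIME 2 here;
# "treatment-group" is an illustrative, made-up study column.
rows = [
    {"sample-id": "J188b", "treatment-group": "control"},
]


def write_metadata(rows, handle):
    """Write rows as tab-separated metadata with a single header line."""
    fieldnames = list(rows[0].keys())
    writer = csv.DictWriter(
        handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_metadata(rows, buffer)
metadata_text = buffer.getvalue()
print(metadata_text)
```

To write to disk instead, pass an open file handle (for example `open("metadata.tsv", "w", newline="")`) in place of the `StringIO` buffer.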

Looks like you can just import with the Casava format!

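What will happen is QIIME 2 will look at the filename of each .fastq.gz and work backwards through each underscore. First it will see the Illumina 001 segment, then the read direction R1, then the lane number L001, then the barcode ID S1; whatever remains is your sample ID. For J188b_S1_L001_R1_001.fastq.gz, that leaves J188b as the sample ID, so that is the ID your metadata would need to use for this sample.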
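
The right-to-left parsing described above can be sketched in a few lines of Python. This is an illustration of the idea, not QIIME 2’s own code; the names `CasavaName` and `parse_casava_filename` are made up for this example.

```python
from typing import NamedTuple


class CasavaName(NamedTuple):
    sample_id: str
    barcode_id: str
    lane: str
    read: str
    set_number: str


def parse_casava_filename(filename: str) -> CasavaName:
    """Split a Casava-style filename into its underscore-separated fields."""
    suffix = ".fastq.gz"
    if not filename.endswith(suffix):
        raise ValueError(f"not a .fastq.gz file: {filename}")
    stem = filename[: -len(suffix)]
    # Split from the right, at most four times, so that any underscores
    # inside the sample ID itself stay part of the sample ID.
    parts = stem.rsplit("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected filename layout: {filename}")
    sample_id, barcode_id, lane, read, set_number = parts
    return CasavaName(sample_id, barcode_id, lane, read, set_number)


forward = parse_casava_filename("J188b_S1_L001_R1_001.fastq.gz")
reverse = parse_casava_filename("J188b_S1_L001_R2_001.fastq.gz")
print(forward.sample_id, forward.read, reverse.read)
```

Both files resolve to the same sample ID, which is how the forward and reverse reads end up paired under one sample.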

I’m not sure what you mean by alignment. Do you mean read merging, reference picking, or something else?
As far as the general methodology is concerned, we’ve moved away from OTU clustering and now use what are called Amplicon Sequence Variants (ASVs). The general impact is that most people see fewer features, and those features are of much higher quality than you’d generally see with 97% clustering or similar. This means the diversity metrics in QIIME 2 generally give you more reasonable results as well.

We’re currently working on implementing closed-reference and de novo OTU picking, in case you need OTUs, but those steps aren’t quite ready yet.

QIIME 2 is much more agnostic about the specific techniques or methods and is designed to be easily extended. So while it might seem like there’s only one way to do things right now (because we’re still implementing a lot of functionality), ideally in the future, there won’t be any “QIIME pipeline”. Instead it’ll be much more of a “choose your own adventure”.
