If your goal is just to practice processing data in QIIME 2, you might want to consider a different dataset (e.g., one with quality scores in the barcode reads) that should be easier to process. However, I recognize you may want a PacBio dataset specifically…
It looks like the metagenome data at that link actually contains both a forward read and an index (barcode) read. Those will be the fastq sequence files that you want to import into QIIME 2 as described below.
Check out this tutorial to learn about importing different data types. This is the specific example you want.
However, note that the barcodes need to have quality scores to import with this command. This is why I suggest using a dataset with quality scores in the barcodes for practice purposes. Otherwise, you need to generate fake quality scores for your barcodes (this would need to be done outside of QIIME 2, before attempting the import).
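To give a rough idea of what "generating fake quality scores" could look like, here is a minimal Python sketch. It assumes the barcode reads are available as FASTA (no qualities) and that a constant placeholder quality character is acceptable; the function name `fasta_to_fastq` and the choice of `I` (Phred 40 in Phred+33 encoding) are my own assumptions, not something QIIME 2 prescribes.

```python
import io


def write_fastq_record(handle, header, seq, quality_char):
    """Write one FASTQ record with a constant placeholder quality string."""
    handle.write(f"@{header}\n{seq}\n+\n{quality_char * len(seq)}\n")


def fasta_to_fastq(fasta_handle, fastq_handle, quality_char="I"):
    """Convert FASTA records to FASTQ, inventing one quality value per base.

    The qualities are placeholders only; they carry no real information.
    """
    header, seq_parts = None, []
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: flush the previous one, if any.
            if header is not None:
                write_fastq_record(fastq_handle, header, "".join(seq_parts), quality_char)
            header, seq_parts = line[1:], []
        else:
            seq_parts.append(line)
    if header is not None:
        write_fastq_record(fastq_handle, header, "".join(seq_parts), quality_char)


# Tiny in-memory demonstration with made-up reads.
demo_in = io.StringIO(">read1\nACGT\n>read2\nGGCC\nAA\n")
demo_out = io.StringIO()
fasta_to_fastq(demo_in, demo_out)
```

With real data you would pass open file handles instead of the `io.StringIO` objects, and keep in mind that the placeholder qualities should not be used for any quality-based filtering.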
This file is actually a list of the barcodes used for each sample, not the barcode sequence reads. Add these barcodes to your sample metadata file (see this file for an example format); do not try to import this file as barcode sequences.
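If it helps, copying the barcodes into a tab-separated metadata file could be sketched like this in Python. The sample IDs and barcodes below are made up, and the column headers `sample-id` and `barcode-sequence` are assumptions on my part; check them against the example metadata file before importing.

```python
import csv
import io


def write_metadata(sample_barcodes, handle,
                   id_column="sample-id", barcode_column="barcode-sequence"):
    """Write a two-column, tab-separated sample metadata table.

    sample_barcodes maps each sample ID to its barcode sequence.
    The default column names are assumptions; adjust them to match
    the example metadata file.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([id_column, barcode_column])
    for sample_id, barcode in sample_barcodes.items():
        writer.writerow([sample_id, barcode])


# Hypothetical sample IDs and barcodes, for illustration only.
example = {"sampleA": "ACGTACGT", "sampleB": "TTGGCCAA"}
metadata_out = io.StringIO()
write_metadata(example, metadata_out)
```

Any further per-sample columns (site, treatment, and so on) can be added the same way.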
Thanks for the reply, it’s very helpful!
So the MS Word file containing the barcodes in this paper is like a mapping/barcode file (.tsv)? And the real barcode sequence data can be separated from the ERR1447468 data by split_libraries.py?
Exactly! It is not in the same format as the mapping files used in QIIME 2 (see the link above for an example), but you can copy the barcodes out of this file and place them in your metadata mapping file to get started.
Yes, the actual barcode for each sequence will be contained within that sequence file. Whether the barcodes are within each read or in a separate read, I do not know (I am not familiar with PacBio data), so you may want to get in touch with the study authors to be sure. If the barcodes are in a second sequence file, you should be able to import those files directly into QIIME 2 for demultiplexing. If the barcodes are in-line in each sequence, you will need to use a method outside of QIIME 2 to extract those barcodes into a separate fastq file (e.g., the QIIME 1 script extract_barcodes.py) and then import the resulting fastqs into QIIME 2. That functionality is planned in QIIME 2 for the next release (end of this month) but is not available in the current release.
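For the in-line case, the idea behind extracting barcodes into a separate fastq can be sketched in plain Python. This is not the extract_barcodes.py implementation; it assumes each barcode is a fixed-length prefix of its read, and `barcode_length` is a hypothetical value you would need to confirm with the study authors.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        chunk = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(chunk) < 4:
            return  # end of file (or a truncated final record)
        yield chunk[0], chunk[1], chunk[3]


def extract_inline_barcodes(reads_in, reads_out, barcodes_out, barcode_length):
    """Split each read into a barcode prefix and the remaining sequence.

    Assumes the barcode is the first barcode_length bases of every read.
    Both outputs keep the original header so the records stay paired.
    """
    for header, seq, qual in read_fastq(reads_in):
        barcodes_out.write(
            f"{header}\n{seq[:barcode_length]}\n+\n{qual[:barcode_length]}\n"
        )
        reads_out.write(
            f"{header}\n{seq[barcode_length:]}\n+\n{qual[barcode_length:]}\n"
        )


# In-memory demonstration with a made-up read and a 4-base barcode.
demo_reads = io.StringIO("@r1\nAACCGGTT\n+\nIIIIHHHH\n")
trimmed, barcodes = io.StringIO(), io.StringIO()
extract_inline_barcodes(demo_reads, trimmed, barcodes, barcode_length=4)
```

The two resulting files (trimmed reads and barcode reads) are what you would then import for demultiplexing. If the barcodes sit at the end of the reads or vary in length, this simple prefix split would not apply.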