Hello, I'm having some trouble importing my data. My dataset has more than 100 samples, which I sequenced in two runs, and some samples in different runs share the same barcode, e.g. Sample 1 in Run I may have barcode 67 while Sample 100 in Run II may also have barcode 67. I would now like to analyze them as one dataset. How should I import the data? Which data format should I use?
Are your samples already demultiplexed? If so, then the barcodes don’t matter: they are only needed for demultiplexing multiplexed reads. One thing to keep in mind, though: if you plan on running your reads through q2-dada2, you will want to import and denoise each of these runs individually, because DADA2’s error model assumes that reads are processed on a per-run basis. See the FMT Tutorial for an example of processing multiple runs in QIIME 2.
If your data isn’t demultiplexed, you might be in a difficult place. QIIME 2 isn’t able to account for runs as part of the demultiplexing process, so you would need to either find a tool that can do that (I personally don’t know of anything that can handle this kind of data) or write a custom script.
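If you did end up writing a custom script, the core idea would be to look samples up by the (run, barcode) pair instead of by barcode alone. Here is a minimal toy sketch of that idea in plain Python. It is not QIIME 2 code, and the sample sheet, run labels, read tuples, and function name are all made up for illustration:

```python
# Toy sketch: demultiplexing keyed on (run, barcode) rather than barcode alone.
# Every name and value below is hypothetical and only illustrates the idea.

# Hypothetical sample sheet: barcode "67" is reused across the two runs.
SAMPLE_SHEET = {
    ("run1", "67"): "Sample1",
    ("run2", "67"): "Sample100",
    ("run1", "12"): "Sample2",
}


def demultiplex(reads):
    """Group sequences by sample ID.

    `reads` is an iterable of (run, barcode, sequence) tuples.
    Reads whose (run, barcode) pair is not in the sample sheet are skipped.
    """
    per_sample = {}
    for run, barcode, sequence in reads:
        sample_id = SAMPLE_SHEET.get((run, barcode))
        if sample_id is None:
            continue  # barcode not expected in this run
        per_sample.setdefault(sample_id, []).append(sequence)
    return per_sample


reads = [
    ("run1", "67", "ACGT"),
    ("run2", "67", "TTGA"),
    ("run1", "12", "GGCC"),
    ("run2", "99", "AAAA"),  # not in the sheet for run2, so it is dropped
]
demuxed = demultiplex(reads)
```

Because the run is part of the lookup key, the two samples that share barcode 67 end up under separate sample IDs.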
I think @zhenhaoluo would be able to just import and demultiplex each run separately. Then the overlapping barcodes don’t matter, because each run is demultiplexed against its own barcode mapping, and the feature tables can be merged later on downstream (like in the FMT tutorial).
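To make the "merge later" step concrete, here is a rough toy sketch in plain Python of what merging per-run feature tables amounts to once every sample has a unique sample ID. It is not the QIIME 2 implementation. The table layout (sample ID mapped to feature counts), the feature names, and the `merge_tables` helper are assumptions for illustration only:

```python
# Toy sketch: merging per-run feature tables after separate demultiplexing.
# The dict-of-dicts layout and all names are hypothetical.


def merge_tables(*tables):
    """Combine tables of the form {sample_id: {feature_id: count}}.

    Assumes sample IDs are unique across runs and raises if they are not.
    """
    merged = {}
    for table in tables:
        for sample_id, counts in table.items():
            if sample_id in merged:
                raise ValueError(f"duplicate sample ID: {sample_id}")
            merged[sample_id] = dict(counts)  # copy so inputs stay untouched
    return merged


# Both samples used barcode 67, but by this point only sample IDs remain.
run1_table = {"Sample1": {"featA": 10, "featB": 3}}
run2_table = {"Sample100": {"featA": 4, "featC": 7}}

merged = merge_tables(run1_table, run2_table)
```

The barcodes never appear at this stage, which is why the overlap between runs stops mattering once each run has been demultiplexed on its own.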