I am trying to import 80 pairs of paired-end reads (i.e. 160 FASTQ files in total) on a server that is usually fast. The import has been running for 4 days and is still going. There is no error message whatsoever.
Here's the code. Is the input-format correct? Thanks.
I believe it could be because I omitted the "V2" at the end of "PairedEndFastqManifestPhred33". However, adding the V2 gave me so many errors that I have no idea how to fix them. They all seem to be about the format of the manifest file.
Below is an example of my manifest.csv file
But it didn't work: the error says the IDs are duplicated (forward and reverse). When I made the IDs unique by adding _1 and _2 suffixes, it told me that the reads are not paired. So how exactly should I format the manifest file? Should it be comma-separated or tab-separated? I have tweaked it so many times and still can't get it to work.
@wei_wei,
I am not sure, but I think you might be on the right track here. Here is the documentation for importing using the manifest format. You will want to have the columns sample-id, forward-absolute-filepath, and reverse-absolute-filepath in your manifest. If you store the manifest inside the folder with your data, you can shorten the file paths a bit by using the $PWD environment variable, as shown in the tutorial. Hope this helps!
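To make the column layout concrete, here is a minimal Python sketch that writes a tab-separated manifest with one row per sample, so the forward and reverse files share a single sample-id instead of getting duplicated or suffixed IDs. The function name build_manifest and the file-name suffixes _1.fastq.gz / _2.fastq.gz are assumptions for illustration only; adjust them to match how your files are actually named.

```python
import csv
from pathlib import Path


def build_manifest(fastq_dir, manifest_path,
                   fwd_suffix="_1.fastq.gz", rev_suffix="_2.fastq.gz"):
    """Write a tab-separated paired-end manifest, one row per sample.

    The suffixes are hypothetical defaults; the sample-id is simply the
    forward file name with the suffix removed.
    """
    fastq_dir = Path(fastq_dir).resolve()  # the manifest needs absolute paths
    rows = []
    for fwd in sorted(fastq_dir.glob(f"*{fwd_suffix}")):
        sample_id = fwd.name[: -len(fwd_suffix)]
        rev = fastq_dir / f"{sample_id}{rev_suffix}"
        if not rev.exists():
            raise FileNotFoundError(
                f"no reverse read file found for sample {sample_id!r}")
        rows.append((sample_id, str(fwd), str(rev)))

    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id",
                         "forward-absolute-filepath",
                         "reverse-absolute-filepath"])
        writer.writerows(rows)
    return rows
```

Calling build_manifest("/path/to/reads", "manifest.tsv") (both paths are placeholders) should give a file with 81 lines for 80 samples: the header plus one line per pair.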
Unfortunately, sometimes importing is just slow. Can you run ls -alh in the directory you are importing from and post the result here, so that we can get a better idea of whether your import is being unreasonably slow?
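If the ls listing is too long to post, a total is often enough to judge the scale. The following is a rough Python sketch that sums the sizes of the FASTQ files in a directory; the function names and the *.fastq.gz pattern are assumptions, not anything taken from the original command.

```python
from pathlib import Path


def summarize_fastq_sizes(directory, pattern="*.fastq.gz"):
    """Return ({file name: size in bytes}, total size in bytes)."""
    files = sorted(Path(directory).glob(pattern))
    sizes = {f.name: f.stat().st_size for f in files}
    return sizes, sum(sizes.values())


def human_readable(num_bytes):
    """Format a byte count with binary units, similar in spirit to ls -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024
```

For example, sizes, total = summarize_fastq_sizes(".") followed by print(len(sizes), human_readable(total)) reports the number of files and their combined size.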
@wei_wei,
Wow, alright, it is unsurprising that it is taking that long. That is a lot of data! Just to check, are there multiple MiSeq runs or multiple HiSeq lanes in your dataset? If so, each run or lane should be imported and denoised (along with any other QC steps) separately, and the results merged once you have your feature tables.
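If you are not sure whether the files come from more than one run or lane, one way to check is to look at the first read header in each file. The sketch below assumes the standard Illumina header layout (@instrument:run:flowcell:lane:tile:x:y ...), which may not hold if the files were renamed or re-headered, for example after download from a public archive. The function names are hypothetical.

```python
import gzip
from collections import defaultdict
from pathlib import Path


def run_and_lane(fastq_path):
    """Return (instrument, run number, flowcell, lane) from the first header.

    Assumes Illumina-style headers; raises ValueError otherwise.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        header = handle.readline().strip()
    # Keep only the part before the first space, then split on colons.
    fields = header.lstrip("@").split(" ")[0].split(":")
    if len(fields) < 4:
        raise ValueError(f"unexpected header in {fastq_path}: {header!r}")
    return tuple(fields[:4])


def group_by_run(directory, pattern="*.fastq.gz"):
    """Group file names by the run/lane found in their first read header."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        groups[run_and_lane(path)].append(path.name)
    return dict(groups)
```

If group_by_run(".") returns more than one key, the files likely span several runs or lanes and would be candidates for separate import and denoising.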