qiime tools import or export altering fastq names?

Hi fellow qiime2 users,

So, I'm having a small issue: the names of my FASTQ files come out different from what is in the manifest file. For example:

Manifest file (I replaced part of the path with ... to make it a bit more readable):

sample-id,absolute-filepath,direction
2020.10.Rotavirus.S0001,/hpc/.../demultiplexed_renamed/2020.10.Rotavirus.S0001_S1_L001_R1_001.fastq.gz,forward
2020.10.Rotavirus.S0001,/hpc/.../demultiplexed_renamed/2020.10.Rotavirus.S0001_S1_L001_R2_001.fastq.gz,reverse
2020.10.Rotavirus.S0002,/hpc/.../demultiplexed_renamed/2020.10.Rotavirus.S0002_S2_L001_R1_001.fastq.gz,forward
2020.10.Rotavirus.S0002,/hpc/.../demultiplexed_renamed/2020.10.Rotavirus.S0002_S2_L001_R2_001.fastq.gz,reverse

File names after qiime tools export:

2020.10.Rotavirus.S0001_0_L001_R1_001.fastq.gz
2020.10.Rotavirus.S0001_1_L001_R2_001.fastq.gz
2020.10.Rotavirus.S0002_2_L001_R1_001.fastq.gz
2020.10.Rotavirus.S0002_3_L001_R2_001.fastq.gz

So, basically the 'S' number is replaced by a counter that starts at 0 for the first file and goes up by one for every exported file. Now, I wouldn't really mind if this number were the same for the R1 and R2 files of a particular sample, for example if the R2 file of sample S0001 also got '0'. But it doesn't seem right for the forward and reverse FASTQ files of the same sample to have different labels.

Now, I see in your documents here that you use a slightly different manifest format. On the other hand the MANIFEST that is generated by Qiime2 after exporting has the same format as my manifest-file:

sample-id,filename,direction
2020.10.Rotavirus.S0001,2020.10.Rotavirus.S0001_0_L001_R1_001.fastq.gz,forward
2020.10.Rotavirus.S0001,2020.10.Rotavirus.S0001_1_L001_R2_001.fastq.gz,reverse
2020.10.Rotavirus.S0002,2020.10.Rotavirus.S0002_2_L001_R1_001.fastq.gz,forward
2020.10.Rotavirus.S0002,2020.10.Rotavirus.S0002_3_L001_R2_001.fastq.gz,reverse

Is the difference in manifest format the reason for this "renaming"? I don't think that's supposed to be it, since I thought this other format would be only compatible with PairedEndFastqManifestPhred33V2 and I'm using PairedEndFastqManifestPhred33 (the non-V2 parameter). I'm also not entirely sure if it's going wrong during import or export. On the bright side, it doesn't seem to have any negative effect on the rest of the analyses, which I thought it might at first. So, it's not a huge issue, but still a bit annoying.

@M_R,
This is happening during the import process. The value that is being changed is the barcode field, which does not matter as long as it is unique. Check out the import transformer if you want to see where this is happening. This is similar to how Casava handles file names as well.
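
To make the effect concrete, here is a minimal Python sketch that reproduces the file names seen in this thread by putting a running per-file counter in the second filename field. It only illustrates the observed behavior and is not the actual QIIME 2 transformer code; the function names and the fixed `L001` and `001` fields are assumptions made for the example.

```python
def casava_style_name(sample_id, index, direction):
    """Build a Casava-like name: <sample>_<index>_L001_R<1|2>_001.fastq.gz.

    Hypothetical helper for illustration; not part of QIIME 2.
    """
    read = {"forward": 1, "reverse": 2}[direction]
    return f"{sample_id}_{index}_L001_R{read}_001.fastq.gz"


def name_manifest_rows(rows):
    """Give every file its own running index (per file, not per sample)."""
    return [
        casava_style_name(sample_id, i, direction)
        for i, (sample_id, direction) in enumerate(rows)
    ]


# (sample-id, direction) pairs in the same order as the manifest above.
rows = [
    ("2020.10.Rotavirus.S0001", "forward"),
    ("2020.10.Rotavirus.S0001", "reverse"),
    ("2020.10.Rotavirus.S0002", "forward"),
    ("2020.10.Rotavirus.S0002", "reverse"),
]
names = name_manifest_rows(rows)
for name in names:
    print(name)
```

Because the counter advances once per file, the forward and reverse reads of one sample always end up with different values in that field, which matches the exported names listed earlier.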

Thanks for the reply!

So, there's no "option/parameter" to fix this, right? I mean, it's indeed not that serious, but I would have preferred the R1 and R2 files of a sample to differ only in the R1/R2 part, with the rest of the sample label identical.
I have also found that when I import the reads directly, so not through a manifest file, it actually gives me the results that I'm looking for. So, for some reason, this issue only arises when a manifest file is involved.

But like I said, it's a minor annoyance, so I can live with it.

Unfortunately not. It is not something we have been asked for before, and it is important that these values are unique. Glad there is another import method that is doing exactly what you want!
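
For anyone who still wants matching labels after a manifest import, one possible workaround is to rename the files after `qiime tools export`. The sketch below only computes the new names, assuming exported names follow the `<sample>_<index>_<lane>_<read>_001.fastq.gz` pattern shown in this thread. The helper `shared_index_names` is hypothetical and not part of QIIME 2.

```python
def shared_index_names(filenames):
    """Map each exported name to a name whose index is shared per sample.

    Splitting from the right keeps sample IDs that contain underscores intact.
    """
    sample_index = {}  # sample ID -> index, assigned in order of first appearance
    renames = {}
    for name in filenames:
        sample, _old_index, lane, read, tail = name.rsplit("_", 4)
        idx = sample_index.setdefault(sample, len(sample_index))
        renames[name] = "_".join([sample, str(idx), lane, read, tail])
    return renames


exported = [
    "2020.10.Rotavirus.S0001_0_L001_R1_001.fastq.gz",
    "2020.10.Rotavirus.S0001_1_L001_R2_001.fastq.gz",
    "2020.10.Rotavirus.S0002_2_L001_R1_001.fastq.gz",
    "2020.10.Rotavirus.S0002_3_L001_R2_001.fastq.gz",
]
renames = shared_index_names(exported)
for old, new in renames.items():
    print(old, "->", new)
```

The actual renaming on disk (for example with `os.rename`) and the matching update of the exported MANIFEST file are left out on purpose. Apply it only to an exported copy, since the names inside the artifact itself are set at import time.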
