File extension in QIIME2

I would like to ask about file extensions in QIIME2.
For importing FASTQ sequences into QIIME2, the user documentation indicates that the recommended manifest format is tab-separated (i.e., TSV). Could you kindly confirm whether it is still acceptable to import data from a tab-delimited ".txt" file, as shown below? It seems Excel can't export a ".tsv" file, but it can export a tab-delimited ".txt" file.

% qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest.txt \
  --output-path demux.qza \
  --input-format PairedEndFastqManifestPhred33V2

In the user documentation, the following explanation was provided:
"QIIME 2 metadata is most commonly stored in a [TSV] (i.e. tab-separated values) file. These files typically have a .tsv or .txt file extension, though it doesn’t matter to QIIME 2 what file extension is used."

Is this still the case when importing sequences?
I also wonder how QIIME2 treats file extensions during analysis. Is it the format that matters, rather than the extension, throughout the QIIME2 workflow?

Thank you very much.

Hi there @microbiome_25 ,

Short answer: Yes, you can use that file.

Long answer:

File format and file extension are different things. A tabular file is a tabular file, regardless of its extension.

Yes, this is still the case when importing sequences. From the Importing data tutorial:

"The manifest file is a tab-separated (i.e., .tsv) text file. The first column defines the Sample ID, while the second (and optional third) column defines the absolute filepath to the forward (and optional reverse) reads. All of the rules and behavior of this format are inherited from the QIIME 2 Metadata format." (emphasis added by me)
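If Excel is awkward for producing the manifest, it can also be written with a few lines of Python. This is only a minimal sketch: the helper name `write_manifest`, the sample IDs, and the file paths are made up for illustration, and the header names are the ones used by the paired-end V2 manifest format.

```python
import csv
import tempfile
from pathlib import Path


def write_manifest(path, samples):
    """Write a paired-end manifest as tab-separated text.

    `samples` maps a sample ID to a (forward, reverse) pair of absolute paths.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # Header row of the paired-end V2 manifest format.
        writer.writerow(
            ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
        )
        for sample_id, (forward, reverse) in samples.items():
            writer.writerow([sample_id, forward, reverse])


# Hypothetical sample IDs and paths, for illustration only.
samples = {
    "sample-1": ("/data/sample-1_R1.fastq.gz", "/data/sample-1_R2.fastq.gz"),
    "sample-2": ("/data/sample-2_R1.fastq.gz", "/data/sample-2_R2.fastq.gz"),
}

# A temporary directory keeps the sketch from touching real files.
manifest_path = Path(tempfile.mkdtemp()) / "manifest.txt"
write_manifest(manifest_path, samples)
print(manifest_path.read_text())
```

The file is named manifest.txt here, but any other name would work just as well, since only the tab-separated contents matter.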

QIIME 2 does not check if the filename ends in .tsv or .txt before importing. It only checks that the format is tabular.

File extension is simply a part of the filename commonly used to imply information about the way data might be stored in that file. It is normally delimited from the rest of the filename with a ".", e.g. file.tsv. Windows systems actively rely on file extensions for suggesting programs to open files, and they even hide file extensions by default in the file explorer (which I think is the main reason why people don't really understand what file extensions are).

Don't think of a file extension as something "special": it is simply part of the filename. You can give your tabular (TSV) file the name data.pdf and your machine won't complain (it will probably suggest opening it with a PDF viewer, but you can still open it with any program you want). You can also name it data, or data.tsv.txt, or data.extension.i.just.made.up, and the file will still be a tabular file that QIIME 2 will accept.
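To see this for yourself, here is a small sketch that writes the same tab-separated text under several filenames and parses each one with Python's standard csv module. It does not use QIIME 2 at all; the table contents and the `read_tsv` helper are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical tab-separated contents; identical for every filename below.
CONTENT = "sample-id\tgroup\nsample-1\tcontrol\nsample-2\ttreated\n"
NAMES = [
    "data.tsv",
    "data.txt",
    "data.pdf",
    "data",
    "data.tsv.txt",
    "data.extension.i.just.made.up",
]


def read_tsv(path):
    """Parse a file as tab-separated text without looking at its name."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


workdir = Path(tempfile.mkdtemp())
parsed = {}
for name in NAMES:
    path = workdir / name
    path.write_text(CONTENT)
    parsed[name] = read_tsv(path)

# Every file parses to the same rows, whatever it is called.
reference = parsed["data.tsv"]
for name, rows in parsed.items():
    print(f"{name}: identical = {rows == reference}")
```

The parser never looks at the filename, which is the same idea as QIIME 2 caring about the contents rather than the extension.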

You can feed that file to QIIME 2 directly as long as its contents are tabular. If you would rather have a .tsv extension, I'm not sure about Excel, but you should be able to type the full filename (extension included) when saving. If that isn't possible, you can rename the file afterwards. On Windows, you will probably have to go to the file explorer and enable View -> Show -> File extensions to see them, then change them with right-click -> Rename.
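If you prefer to rename the file from a script instead of the file explorer, something like the following would do it. This is a sketch with a made-up manifest.txt created in a temporary directory; renaming changes only the name, and the bytes inside the file stay exactly the same.

```python
import tempfile
from pathlib import Path

# Create a stand-in manifest.txt in a temporary directory (illustrative only).
workdir = Path(tempfile.mkdtemp())
original = workdir / "manifest.txt"
original.write_text(
    "sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n"
)
before = original.read_bytes()

# Swap the .txt suffix for .tsv; the contents are not touched.
target = original.with_suffix(".tsv")
original.rename(target)

print(target.name)
print(target.read_bytes() == before)
```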

Cheers!

Sergio

Thank you for your detailed explanation.
I now understand the file extension better.
I will continue using the .txt file for my research analysis.
I appreciate your support.
