I am a beginner using QIIME 2 for the first time.
I am trying to analyze the GEO dataset GSE162844.
I want to do OTU clustering on FASTA files. What should I do?
As a first step, I'm trying to import a FASTA file as a .qza artifact.
The error indicates that the file contains invalid lowercase characters. You could convert these to uppercase to continue, but there could be other issues with the file if you are trying to use data that have already been processed.
The raw data appear to be available from the SRA as well (see the BioProject entry), so you could download and automatically format the data from there using the QIIME 2 plugin q2-fondue.
This might be an easier approach, as you could then also follow the QIIME 2 tutorials from the start instead of figuring out the entry point for starting with FASTA data.
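If you do want to try the uppercase workaround mentioned above, here is a minimal Python sketch. It assumes a plain-text FASTA file in which header lines start with `>`; the function names and file names are made up for illustration. It only changes the case of the sequence lines and does not fix any other problems the file might have.

```python
def uppercase_fasta_lines(lines):
    """Yield FASTA lines with sequence lines upper-cased and headers unchanged."""
    for line in lines:
        if line.startswith(">"):
            # Header line: keep as-is so sequence IDs are not altered.
            yield line
        else:
            # Sequence line: convert lowercase bases (e.g. "acgt") to uppercase.
            yield line.upper()


def uppercase_fasta(src_path, dst_path):
    """Write an upper-cased copy of the FASTA file at src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(uppercase_fasta_lines(src))
```

For example, `uppercase_fasta("sequences.fasta", "sequences_upper.fasta")` (hypothetical file names) would write a new file and leave the original untouched.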
When using 'fondue', can I enter the GSE number as the 'NCBIAccessionIDs' input?
And in the tutorial code, what should I use in place of 'metadata_file_runs.tsv'? The only files I can get from GSE162844 are the taxonomy file and the FASTA file.
I'm just starting out with bioinformatics analysis, so this is probably a very basic question, but I would appreciate a reply.
No, you cannot use the GSE number. You must use an SRA accession number (see the BioProject entry that I shared; this should work).
Download and open the file in the tutorial to see how the contents are formatted. This is basically just a list of the project IDs that you want to download (in this case only one ID).
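As a rough illustration of what such a file could look like, the Python sketch below writes a one-column TSV. Two things here are assumptions: the `id` header (check it against the tutorial file before using it) and `PRJNA000000`, which is only a placeholder and not the real accession for this dataset. The function names are made up for illustration.

```python
def accession_tsv(accession_ids, header="id"):
    """Return the text of a one-column TSV: a header line, then one ID per line."""
    # Assumption: the header name should match the one in the tutorial file.
    rows = [header] + list(accession_ids)
    return "\n".join(rows) + "\n"


def write_accession_tsv(path, accession_ids, header="id"):
    """Write the accession ID list to the file at path."""
    with open(path, "w") as handle:
        handle.write(accession_tsv(accession_ids, header))
```

Calling `write_accession_tsv("metadata_file_runs.tsv", ["PRJNA000000"])` with the placeholder replaced by the BioProject ID would produce a file with one header line and one ID line, which you can then use in place of the tutorial file.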