Good afternoon,
I need to run a PICRUSt2 analysis (I should add that my analysis is based on OTUs, not ASVs). Since I have already run into one issue, I would like to check whether I am doing this correctly.
I am starting from a FASTA file (.fasta) produced by dereplicating sequences:
```
# Dereplicate the joined, filtered reads
joined_import_filter_derep:
	export HDF5_USE_FILE_LOCKING='FALSE'; \
	$(CONDA_ACTIVATE) Miqiime2-2021.8; \
	qiime vsearch dereplicate-sequences \
		--i-sequences fil_joined.qza \
		--o-dereplicated-table table.qza \
		--o-dereplicated-sequences rep-seqs.qza

# Unpack the artifact to get at the FASTA file
joined_import_filter_derep_seq_unzip:
	unzip rep-seqs.qza -d rep-seqs
```
The resulting rep-seqs look something like this:
>6a3eea7fb1f9e169b134fc27428e50177e2e9c5f A.join.fastq.gz_12564
CCTACGGGTGGCAGCAGTAGGGAATCTTCCACAATGGGCGAAAGCCTGATGGAGCAACGCCGCGTGGGTGAAGAAGGTCTTCGGATCGTAAAACCCTGTTGTTAGAGAAGAAAGTGCGTGAGAGTAACTGTTCACGTTTCGACGGTATCTAACCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGATTTATTGGGCGTAAAGGGAACGCAGGCGGTCTTTTAAGTCTGATGTGAAAGCCTTCGGCTTAACCGGAGTAGTGCATTGGAAACTGGGAGACTTGAGTGCAGAAGAGGAGAGTGGAACTCCATGTGTAGCGGTGAAATGCGTAGATATATGGAAGAACACCAGTGGCGAAAGCGGCTCTCTGGTCTGTAACTGACGCTGAGGTTCGAAAGCGTGGGTAGCAAACAGGATTAGATACCCCAGTAGTC
>24ce437aa9000774104de794963f862c598c7a42 B.fastq.gz_7568
The spaces in the headers cause PICRUSt2 to fail with an error.
However, I am not sure whether doing something like
awk '{print $1 }' FASTA_IN > FASTA_OUT
as suggested here:
picrust_input
is correct in my case. My concern is this: my FASTA file contains sequences from all samples, so if I remove the sample names from the headers, will the analysis still be correct?
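To make the question concrete, here is a minimal sketch of the header clean-up I have in mind. It assumes that header lines start with `>` and that the first token (the hash) is the feature ID that `table.qza` also refers to. The helper names `strip_fasta_headers` and `count_duplicate_ids`, the file names, and the two example records are made up for illustration only.

```shell
#!/bin/sh
# Sketch only: keep the first whitespace-delimited token of each FASTA header.
# Assumption: header lines start with ">" and the first token is the feature ID.

# strip_fasta_headers IN OUT  (hypothetical helper name)
strip_fasta_headers() {
    # Header lines: print only field 1. Sequence lines: print unchanged.
    awk '/^>/ { print $1; next } { print }' "$1" > "$2"
}

# count_duplicate_ids FASTA  (hypothetical helper name)
# Prints how many distinct header IDs occur more than once.
count_duplicate_ids() {
    grep '^>' "$1" | sort | uniq -d | wc -l | tr -d ' '
}

# Tiny made-up input, not real data
printf '>id1 A.join.fastq.gz_1\nACGT\n>id2 B.fastq.gz_2\nGGCC\n' > example_in.fasta

strip_fasta_headers example_in.fasta example_out.fasta
cat example_out.fasta
count_duplicate_ids example_out.fasta
```

If the duplicate count printed at the end is 0, every sequence still has a unique ID after the sample names are dropped from the headers.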
Thank you very much,
Michela