Hi @Byron_C_Crump,
The “easy” way to do this would be to go back and re-process your own data with deblur, adding the --p-no-hashed-feature-ids parameter, which leaves the feature IDs unhashed (i.e., each sequence becomes its own feature ID and matches the QIITA data you have).
But I will offer another solution that I hope will save some time. Deblur has caused you enough headaches, so I slapped this together in the hope that it helps! There are more elegant ways to do this, but a short bash loop is probably easiest for you to use right now.
- Use qiime metadata tabulate and download your sequences as metadata. The file should look something like this:
Feature ID Sequence
#q2:types categorical
ACTGATCGATCG ACTGATCGATCG
ACTGATCTTTCG ACTGATCTTTCG
ACTGGGGGCTCG ACTGGGGGCTCG
- Remove the first two lines of that file.
- Run the following command in your terminal (altering the file paths as needed):
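while read -r id _
do
  # Hash the sequence without a trailing newline; echo would append one and change the hash.
  # md5 is the macOS tool; on Linux use: md5sum | cut -d ' ' -f 1
  newid=$(printf '%s' "$id" | md5)
  # Write sequence and hash as two tab-separated columns.
  printf '%s\t%s\n' "$id" "$newid" >> feature_id_map.tsv
done < input_sequences_as_metadata.tsv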
This will create a file that maps your sequences to their md5 hashes. Something like this:
ACTGATCGATCG 46c363d67c1b8ced9e320081ad09914f
ACTGATCTTTCG 867a92b54ad55292e5e88660238ac920
ACTGGGGGCTCG 69a84eec85419cea96eece49ae926ea5
- Run the following command:
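# Assumes feature_id_map.tsv has a header line whose second column is named hashed_id (a placeholder name).
qiime feature-table group \
--i-table table.qza \
--m-metadata-file feature_id_map.tsv \
--m-metadata-column hashed_id \
--p-mode sum \
--p-axis feature \
--o-grouped-table grouped-table.qza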
I think that should do the trick of relabeling the feature IDs in your feature table, which should then be mergeable with your other feature table. There will probably be some kinks to iron out, though; e.g., you will probably need to add a header line to your feature_id_map.tsv file. I have not tested this all the way through.
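In case it helps, the two file-editing steps (dropping the first two lines of the downloaded metadata and adding a header line to feature_id_map.tsv) could be sketched as small shell helpers. This is an untested-in-QIIME sketch: the function names, the metadata.tsv filename, and the hashed_id column name are placeholders I made up, so rename them to whatever suits your files.

```shell
# Hypothetical helpers; names and the "hashed_id" column are placeholders.

# Drop the first two lines (the column header and the "#q2:types" line)
# of the downloaded metadata file: $1 = input file, $2 = output file.
strip_metadata_header() {
    tail -n +3 "$1" > "$2"
}

# Prepend a tab-separated header line to the sequence-to-hash map:
# $1 = map without header, $2 = output file with header.
add_map_header() {
    { printf 'Feature ID\thashed_id\n'; cat "$1"; } > "$2"
}

# Example usage (file names are assumptions):
#   strip_metadata_header metadata.tsv input_sequences_as_metadata.tsv
#   add_map_header feature_id_map.tsv with_header.tsv && mv with_header.tsv feature_id_map.tsv
```

Writing the header to a separate file first and then moving it into place avoids reading and overwriting feature_id_map.tsv at the same time.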
Let us know if you get stuck and we’ll give you a hand!