Importing MRDNA fasta + qual files with merged R and F reads into qiime2

Greetings!

Please help me clarify the following.

My demux interactive quality plot looks like this

Here is my workflow.

MRDNA sent me one full.fasta, one full.qual, and the mapping file in txt format.

I used Qiime 1 to convert full.fasta and full.qual into a fastq file,
and then used Qiime 1 to extract the barcode file.
Imported the files as EMPSingleEndSequences in Qiime 1.
Demultiplexed in Qiime 2.

demux.qzv (290.9 KB)

Now the plot looks like the above screenshot

I am unsure where the error lies. It may be one of the following:

  1. The original raw data may be paired-end sequences. But how can I get forward and reverse reads from the data given by the service provider?

  2. I switched between Qiime 1 and Qiime 2 and imported the data using Qiime 1. Is that a problem?

  3. However, the sequence counts in the "Overview" window look good.

  4. If the sequences are paired-end, is the workflow wrong? If so, how can I convert the full.fasta and full.qual files into paired-end reads and a barcode file, in qiime2 or any other way?

Hi @srini,
There isn’t an easy way to unpack your reads in Qiime2 when the qual file is separate and forward/reverse reads are in the same file. But there may be something in Qiime1, in which case it is totally fine to import the result into qiime2. But before we go down this rabbit hole, can you describe exactly what you did in qiime1? And can you describe what the mapping file looks like?
By far the easiest option would be to ask MrDNA to give you raw fastq files without any processing of your reads. I don’t know why that company always makes things so much more complicated…
I should also mention that your quality scores look very artificial, as if something has been done to them. I don’t know of any sequencing platform that gives such perfect reads all the way through… It would be good to know exactly what has been done as well, because that can certainly affect downstream analysis.
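
One rough way to check whether the scores really are flat is to tabulate the values in the .qual file directly. The following is a minimal Python sketch, not a QIIME command. It assumes the usual .qual layout (a `>` header line per read followed by whitespace-separated integer Phred scores), and the function name `qual_value_counts` is made up for illustration.

```python
from collections import Counter
from typing import Iterable


def qual_value_counts(lines: Iterable[str]) -> Counter:
    """Count how often each Phred score occurs in .qual-formatted text.

    Assumes '>' header lines followed by whitespace-separated integers.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and per-read headers
        counts.update(int(token) for token in line.split())
    return counts
```

Passing an open file handle for the .qual file to `qual_value_counts` and printing the sorted items gives a quick histogram. If nearly every base carries the same one or two scores, that would be consistent with the quality values having been rewritten somewhere upstream.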

Thank you.

Please find attached the mapping file 072319EBillcus515F-mapping.txt (2.8 KB)

Please find below my workflow in qiime

  1. Converted the full.fasta and full.qual into fastq
    convert_fastaqual_fastq.py -f 072319EBillcus515F-full.fasta -q 072319EBillcus515F-full.qual -o fastq_files/

  2. Extracted the barcodes from the fastq file generated above
    extract_barcodes.py -f 072319EBillcus515F-full.fastq -c barcode_single_end --bc1_len 12 -o processed_seqs

  3. Renamed the fastq file to "emp-single-end-sequences.fastq" and zipped it

  4. Imported through Qiime1 using
    qiime tools import --type EMPSingleEndSequences --input-path emp-single-end-sequences --output-path emp-single-end-sequences.qza

  5. But demuxed through Qiime2 2019.7
    qiime demux emp-single --i-seqs emp-single-end-sequences.qza --m-barcodes-file 072319EBillcus515F-mapping.txt --m-barcodes-column BarcodeSequence --o-per-sample-sequences demux.qza --o-error-correction-details demux-details.qza --p-no-golay-error-correction
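
For readers following along, steps 2 and 3 can be sketched in plain Python. This is an illustration under assumptions, not the QIIME 1 implementation: it assumes the barcode is the first 12 bases of every read (taken from `--bc1_len 12` above, which the thread does not confirm for MRDNA data) and strict four-line FASTQ records. The helper names are hypothetical. The output file names follow the directory layout the QIIME 2 EMP single-end import expects, `sequences.fastq.gz` plus `barcodes.fastq.gz`.

```python
import gzip
from pathlib import Path
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header without "@", sequence, quality string)


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from strict four-line FASTQ text (held in memory)."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if len(rows) % 4:
        raise ValueError("FASTQ input is not a multiple of four lines")
    for i in range(0, len(rows), 4):
        header, seq, _plus, qual = rows[i:i + 4]
        yield header[1:], seq, qual


def split_barcode(record: FastqRecord, bc_len: int = 12) -> Tuple[FastqRecord, FastqRecord]:
    """Split a read into (barcode record, remaining read), assuming a leading barcode."""
    header, seq, qual = record
    if len(seq) <= bc_len:
        raise ValueError(f"read {header!r} is too short for a {bc_len}-nt barcode")
    return (header, seq[:bc_len], qual[:bc_len]), (header, seq[bc_len:], qual[bc_len:])


def write_emp_dir(records: Iterable[FastqRecord], out_dir: str, bc_len: int = 12) -> int:
    """Write sequences.fastq.gz and barcodes.fastq.gz into out_dir; return the read count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    n = 0
    with gzip.open(out / "sequences.fastq.gz", "wt") as seqs, \
            gzip.open(out / "barcodes.fastq.gz", "wt") as bcs:
        for record in records:
            barcode, insert = split_barcode(record, bc_len)
            for handle, (header, seq, qual) in ((bcs, barcode), (seqs, insert)):
                handle.write(f"@{header}\n{seq}\n+\n{qual}\n")
            n += 1
    return n
```

The resulting directory is what would be passed to `--input-path` in step 4. Note that this only separates the barcode; if each read is a forward and reverse read joined together, the sketch does nothing to pull them apart.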

Since I am confused about the full FASTA file MRDNA provided, I just selected the plausible qiime steps without worrying about single-end versus paired-end reads.

I made some attempts to get the fastq file from them, but they keep saying that this is the data they can provide. They also sent a readme file, which is confusing as well. Please find below a page from it.

Please help me get past this issue.

Hi @srini,
Thanks for sharing those.
By far the easiest thing to do is to follow the link and use the fastq processor they mention in the document to get individual fastq files. Please give that a go, as the process you are using in qiime1 right now has a lot of complications. They are not giving you the individual fastq files because that is a paid service for them ($100), but they do say you can do it yourself for free. So let’s try that before we do anything else!

Thank you again.

But the problem is that they gave me only full.fasta and qual.fasta, whereas the MRDNA FASTQ Processor requires R1 and R2 files and the barcode file. The mapping file can act as the barcode file, so that is OK, but I do not have R1 and R2 FASTQ files. Please find below a screenshot of the files they provided.

I asked them to give me the FASTQ files, but again they keep telling me to read the instructions for generating the files Qiime requires using their software.

There is MRDNA software that produces one FASTQ file from the full.fasta and qual.fasta files.

But with a single fastq that includes the barcode, forward read, and reverse read, my approach above is what fits, and the demux plot looks like that.

If you have any ideas, please let me know.

Thank you

Hi @srini - we are not affiliated with Mr. DNA in any way, so you will need to work with them on getting your data. QIIME 2 does not have a way to handle fasta+qual files; you will need to use another tool to convert them to fastq, or work with your sequencing provider to prepare fastq files. Please come see us once you have fastq files, and we can help you get them loaded into QIIME 2 and get going on your analysis!
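
For reference, the fasta+qual to fastq conversion itself is simple enough to sketch in Python. This is only an illustration with hypothetical function names, not a QIIME tool. It assumes both files use `>` headers with matching read IDs, that the .qual file holds integer Phred scores, and that Phred+33 encoding is wanted in the output.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_blocks(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (read id, body lines) from '>'-headed FASTA-style text."""
    name, body = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            # Keep only the ID; any description after whitespace is dropped.
            name, body = line[1:].split()[0], []
        else:
            body.append(line)
    if name is not None:
        yield name, body


def fasta_qual_to_fastq(fasta_lines: Iterable[str], qual_lines: Iterable[str]) -> Iterator[str]:
    """Yield one four-line FASTQ record (Phred+33) per FASTA entry."""
    # All scores are held in memory; a KeyError means a read has no .qual entry.
    quals = {
        name: [int(q) for row in body for q in row.split()]
        for name, body in parse_blocks(qual_lines)
    }
    for name, body in parse_blocks(fasta_lines):
        seq = "".join(body)
        scores = quals[name]
        if len(scores) != len(seq):
            raise ValueError(f"{name}: {len(seq)} bases but {len(scores)} scores")
        qual = "".join(chr(q + 33) for q in scores)
        yield f"@{name}\n{seq}\n+\n{qual}\n"
```

Writing the yielded strings to a file gives a single fastq. It still contains whatever the provider put in each read (barcode, forward, and reverse portions), so it does not replace separate R1/R2 files.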

Yes, I understand. Thank you for your reply. This thread may be closed.

Hi @srini,
When contacting MrDNA, what you really need to tell them is that you are looking for the forward and reverse reads to be separate, not merged together. I’m not actually sure whether that merging was done intentionally; it is a very bizarre way of delivering raw data. But it sounds like they are aware of qiime2 format needs, so they must have a way of delivering the data as they said. qual + fasta = fastq using their tools, sure, that’s good. But you need F and R separate, and they should honor that request.

Hi Qiime2 team,

Sorry for reopening this thread. I’m having a similar problem. I was able to convert the fasta and qual files to a fastq file, but it still contains full merged reads. Is there a way to import merged reads, or do I have to wait for the raw data with forward and reverse reads separate?
Thank you so much.
PS: I’m watching the workshop live stream and it looks good. So excited about this week!

Hi @Hui_Yang,

please see this thread:

You should be able to find the command for importing the sequences in your case!
Cheers
Luca
