Can anyone explain the 16S R1.fastq.gz data to me?

khaknasheen · October 26, 2018, 1:05am

Dear Sir,

I received two files from the company, R1.fastq.gz and R2.fastq.gz. Both files were compressed, and I unzipped R1.fastq.gz. The screenshot is attached to this message. I saw some characters that I could not understand. Are there any barcodes or primers in these 16S sequences? What type of data is this? Is it clean data or raw data? Can anyone explain?

Thanks

mouldinator · October 26, 2018, 11:26am

Unfortunately, FASTQ is not an easy format to read by eye, which is why you see all those characters. I believe they encode the per-base quality scores, one character per base, which essentially means more data in a smaller file size. To work out whether there are any barcodes or primers in your sequences, the best way would probably be to use cutadapt; see "demux-paired: Demultiplex paired-end sequence data with barcodes in-sequence" in the QIIME 2 2018.8.0 documentation.
This assumes you have the sequences for your barcodes and adapters. If you don't, you need to work out who the files came from and ask them.
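If you do know a primer sequence, a quick sanity check before running cutadapt is to count how many reads begin with it. The following is only a rough sketch in plain Python, not part of QIIME 2 or cutadapt: the function names are made up for illustration, it assumes standard four-line FASTQ records in a gzipped file, and the primer you pass in must come from your sequencing provider.

```python
import gzip
import re

# IUPAC nucleotide codes mapped to regular-expression character classes,
# so degenerate primer positions (e.g. Y = C or T) are matched correctly.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a primer (possibly with IUPAC codes) into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_starting_with_primer(path, primer, max_reads=10000):
    """Return the fraction of the first `max_reads` reads that start with `primer`.

    Hypothetical helper: assumes a gzipped FASTQ file with exactly four
    lines per record (header, sequence, '+', quality).
    """
    pattern = primer_to_regex(primer)
    total = 0
    hits = 0
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 1:  # only the second line of each record is sequence
                continue
            total += 1
            if pattern.match(line.strip()):  # match() anchors at the start of the read
                hits += 1
            if total >= max_reads:
                break
    return hits / total if total else 0.0
```

A fraction close to 1 would suggest the primer is still attached to the reads; a fraction near 0 suggests it was already removed or that the primer is a different one. Barcodes inside the reads would need a similar check with the barcode list from the provider.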
Hope that helps!

Good luck!

Nicholas_Bokulich · October 26, 2018, 12:57pm

This is raw sequence data containing sequences and quality scores. Wikipedia can tell you more.
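To make the "sequences and quality scores" part concrete, here is a minimal Python sketch that reads the records and turns the odd-looking characters into numbers. It assumes four-line records and the Phred+33 quality encoding used by recent Illumina instruments; both are assumptions about your files, and the function names are only illustrative.

```python
import gzip


def decode_phred33(quality_string):
    """Convert a FASTQ quality string to a list of Phred scores (assumes Phred+33)."""
    return [ord(char) - 33 for char in quality_string]


def read_fastq(path, limit=None):
    """Yield (read_id, sequence, quality_scores) from a plain or gzipped FASTQ file.

    Assumes each record is exactly four lines:
    '@' header, sequence, '+' separator, quality string.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        count = 0
        while True:
            header = handle.readline().rstrip("\n")
            if not header:  # end of file
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, decode_phred33(quality)
            count += 1
            if limit is not None and count >= limit:
                break
```

For example, `for read_id, seq, quals in read_fastq("R1.fastq.gz", limit=5): print(read_id, seq[:20], quals[:5])` would show the first few reads without unzipping the file by hand.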

Good luck!

khaknasheen · October 29, 2018, 4:37pm

Thank you all for your comments. I have a few more questions. As Mr. Nicholas Bokulich said, this is raw data. Can I use QIIME 2 for further analysis of this raw data? Do I have to clean the data? If yes, how can I clean the data using QIIME 2? Please guide me further. Thanks

Nicholas_Bokulich · October 29, 2018, 9:25pm

Please read the tutorials; the top two on that page should answer your questions.

khaknasheen · October 31, 2018, 1:04pm

Thank you very much, Sir.

system · December 1, 2018, 7:15pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.