Can anyone explain my 16S R1.fastq.gz data?

Dear Sir,

I have received two files from the company: R1.fastq.gz and R2.fastq.gz. Both files were compressed, and I unzipped R1.fastq.gz. A screenshot is attached to this message. I saw some characters that I could not understand. Are there any barcodes or primers in these 16S sequences? What type of data is this? Is it clean data or raw data? Can anyone explain?

Thanks

FASTQ is a plain-text format, but unfortunately it is not easy to read, which is why you see all those characters. I believe they are per-base quality scores, each encoded as a single character, which keeps the file size small. To work out whether there are any barcodes or primers in your sequences, the best way would probably be to use cutadapt: https://docs.qiime2.org/2018.8/plugins/available/cutadapt/demux-paired/
This assumes you have the sequences of your barcodes and adapters. If you don't, you need to find out who the files came from and ask them.
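Before running cutadapt, a quick sanity check can show whether reads still begin with a known primer. The sketch below is only an illustration: the function names are made up for this example, and the primer shown (515F, GTGYCAGCMGCCGCGGTAA) is an assumption, since the thread does not say which primers the company used. It only checks the very start of each read, so it will miss primers that sit behind a barcode or spacer.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes,
# so degenerate primer bases (e.g. Y = C or T) match real reads.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer with IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_starting_with(sequences, primer):
    """Return the fraction of sequences that begin with the primer."""
    pattern = primer_regex(primer)
    sequences = list(sequences)
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if pattern.match(seq))
    return hits / len(sequences)


# Assumed primer for illustration only; replace with the one actually used.
ASSUMED_PRIMER = "GTGYCAGCMGCCGCGGTAA"

# Two made-up reads: the first starts with the primer, the second does not.
reads = ["GTGCCAGCAGCCGCGGTAATACG", "ACGTACGTACGTACGTACGTACG"]
fraction = fraction_starting_with(reads, ASSUMED_PRIMER)
print(fraction)
```

If most reads start with the primer, it has probably not been trimmed yet. If almost none do, the primers may already have been removed, or a different primer was used.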
Hope that helps!

Good luck!

This is raw sequence data containing sequences and quality scores. Wikipedia can tell you more.
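To make the "sequences and quality scores" concrete, here is a minimal sketch of how one FASTQ record is laid out and how the odd-looking characters turn into numbers. It assumes the common four-lines-per-record layout and the Phred+33 encoding used by current Illumina instruments; the function name and the example read are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality_scores) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line carries no data we need
        quality = handle.readline().rstrip("\n")
        # Assumed Phred+33 encoding: score = ASCII code of the character - 33.
        scores = [ord(char) - 33 for char in quality]
        yield header, sequence, scores


# A made-up record: header line, bases, separator, one quality character per base.
example = "@read1\nACGT\n+\nI5!#\n"
records = list(read_fastq(io.StringIO(example)))
print(records)
```

For a real file, the same function can read straight from the compressed data without unzipping it first, for example with `gzip.open("R1.fastq.gz", "rt")` from Python's standard library.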

Good luck!

Thank you all for your comments. I have a few more questions. As Mr. Nicholas Bokulich said, this is raw data. Can I use QIIME 2 for further analysis of this raw data? Do I have to clean the data? If yes, how can I clean it using QIIME 2? Please guide me further. Thanks

Please read the tutorials; the top two on that page should answer your questions.

Thank you very much, Sir.

