I received two files from the company, R1.fastq.gz and R2.fastq.gz. Both files were compressed, and I unzipped R1.fastq.gz. The screenshot is attached to this message. I saw some characters that I could not understand. Are there any barcodes or primers in these 16 sequences? What type of data is this? Is it clean data or raw data? Can anyone explain?
FASTQ looks unreadable at first, but it is actually a plain-text format: each read takes four lines, and the fourth line stores one quality score per base, encoded as a single ASCII character. That is why there are all those characters, and it keeps the file size small. To work out whether there are any barcodes or primers in your sequences, the best way would probably be to use cutadapt; see "demux-paired: Demultiplex paired-end sequence data with barcodes in-sequence" in the QIIME 2 2018.8.0 documentation.
This assumes you have the sequences of your barcodes and adapters; if you don't, you need to find out who produced the files and ask them.
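If you want to see what those characters mean before running anything in QIIME 2, here is a rough Python sketch that reads FASTQ records and turns the quality line into numbers. It assumes the common Phred+33 encoding (your files may differ, so check with the sequencing provider), and the function names and the primer string are made up purely for illustration:

```python
import io
from itertools import islice

def read_fastq(handle):
    """Yield (header, sequence, quality) from an open text handle.

    Assumes the usual four-lines-per-read layout with no wrapped lines.
    """
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            break
        header, sequence, _plus, quality = record
        yield header, sequence, quality

def phred_scores(quality, offset=33):
    """Decode a quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]

def count_reads_starting_with(records, primer):
    """Count reads whose sequence begins with the given primer (exact match)."""
    return sum(1 for _header, seq, _qual in records if seq.startswith(primer))

# A tiny in-memory example; for a real file you could instead use
# gzip.open("R1.fastq.gz", "rt") from the standard gzip module.
example = io.StringIO("@read1\nACGTACGT\n+\nIIII5555\n")
records = list(read_fastq(example))

for header, sequence, quality in records:
    print(header, sequence, phred_scores(quality))

# "ACGT" is a hypothetical primer, used here only to show the idea.
print(count_reads_starting_with(records, "ACGT"))
```

If most reads start with the same known primer sequence, that suggests the primers have not been trimmed yet. This only checks exact matches at the start of the read, so it is a quick sanity check rather than a replacement for cutadapt.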
Hope that helps!
Thank you all for your comments. I have a few more questions. As Mr. Nicholas Bokulich said, the data is raw data. Can I use QIIME2 for further analysis of this raw data? Do I have to clean the data first? If yes, how can I clean the data using QIIME2? Please guide me further. Thanks