I did try a variety of import types but I have not found one that works with the files I have. In reading through the forum and tutorials I think I need to use the manifest option to import data but I am having trouble understanding what to include in a .txt or .tsv file. I will keep looking through the forum but if you have any advice as to how to format the document or what needs to be in it, I’d appreciate it! I think it is different from the metadata, but in the import tutorial/page it notes some metadata info.
Yes, my assumption would be manifest. You can prepare this in Excel like a regular table with just 3 columns: sample-id, forward-absolute-filepath and reverse-absolute-filepath. In each row, you map whatever name you want to call the sample in the end to the forward (probably contains an R1) and reverse (same name with an R2) files. You need to use the absolute path, which you can probably get by adding $PWD to your path from the folder where you build your manifest.
Once you have constructed it in Excel, go to File > Save As and then select “text” from the drop-down menu. This will give you a tab-separated manifest.
I like to keep mine separate from my metadata because I find it easier to troubleshoot that way, but you need to make sure the file lines up with whatever ids are in your metadata.
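If you would rather not type the table by hand, here is a minimal shell sketch of the three-column manifest described above. It assumes the forward files contain _R1, the reverse files have the same name with _R2, and everything before _R1 is the sample id. The function name build_manifest is made up for illustration and is not a QIIME 2 command.

```shell
# Sketch: print a paired-end manifest for every *_R1*.fastq.gz in a directory.
# Assumptions (adjust to your files): reverse reads use the forward file name
# with _R1 replaced by _R2, and the sample id is the text before "_R1".
build_manifest() {
    reads_dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$reads_dir"/*_R1*.fastq.gz; do
        [ -e "$fwd" ] || continue             # glob matched nothing
        base=$(basename "$fwd")
        sample=${base%%_R1*}
        rev="$reads_dir/$(printf '%s' "$base" | sed 's/_R1/_R2/')"
        if [ -e "$rev" ]; then
            printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
        else
            echo "warning: no reverse read found for $base" >&2
        fi
    done
}

# Usage: build_manifest /path/to/fastq_dir > manifest.tsv
```

Samples without a matching reverse file are skipped with a warning rather than written as a half-empty row, so the manifest stays importable.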
I created a manifest file with sample-id and absolute-filepath. I think I am using the correct import type and format because I don't get an error regarding that, but I am now getting this other error (attached).
I don't know what it means by 'No transformation'. Does this mean there is a problem with my code/the absolute file path or is there a problem with the format my data is in or something else?
I really appreciate the help, thanks so much!
Christina
You are using SingleEndFastqManifestPhred33V2 as your format, but your type is SampleData[PairedEndSequencesWithQuality]. If you have single-end data, then you need to use SampleData[SequencesWithQuality].
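To make the pairing explicit, here is a small shell sketch that maps each V2 manifest format to the semantic type it goes with. The helper name expected_type is hypothetical; only the format and type names come from QIIME 2.

```shell
# Sketch: print the --type that matches a given manifest --input-format.
# expected_type is an illustrative helper, not part of QIIME 2.
expected_type() {
    case "$1" in
        SingleEndFastqManifestPhred33V2|SingleEndFastqManifestPhred64V2)
            echo 'SampleData[SequencesWithQuality]' ;;
        PairedEndFastqManifestPhred33V2|PairedEndFastqManifestPhred64V2)
            echo 'SampleData[PairedEndSequencesWithQuality]' ;;
        *)
            echo "unknown manifest format: $1" >&2
            return 1 ;;
    esac
}

# Usage: expected_type PairedEndFastqManifestPhred33V2
```

If the type you pass to the import command differs from what this prints for your format, that mismatch is the likely source of a 'No transformation' error like the one above.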
Thank you so much for the fast replies! I have paired-end sequences so I fixed my input code.
code:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /mnt/home/ernakovich/cal1037/devries_files/manifest_file_devries_4.txt \
  --output-path /mnt/home/ernakovich/cal1037/devries_files/devries-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2
I have run the code on one line and on multiple lines, but either way I keep getting this error
I am still having trouble importing the data. I thought it was paired-end data so I was using the paired-end format, but it appears this is not correct... I am confused about the error though because my manifest file does have the header absolute-filepath, as seen here (the top few lines of my .txt file).
I am not sure if I have done something wrong setting up my manifest file, through reading on the forum/tutorials I thought I set it up correctly. Do you all see a problem with how it is put together?
I know my data is paired end (according to NCBI where I downloaded the data from) but the ...Phred64V2 also does not work for importing the data.
Is there another data type/format that my data might fit under?
If you have paired end data, you need to use the paired end manifest format. You have a single end manifest format. Please go back to my previous post or the tutorial on paired end sequences.
I am not sure how the set up of my manifest file is incorrect from looking at other posts and the errors I get. If I am understanding the error correctly, it wants me to use the "absolute-filepath" column which I have - am I misinterpreting the error (below)?
Based on the file run code I don't know how to tell the R1 from R2 (forward from reverse), does this mean it is not possible to import the data? This is a link of one of the runs/sequences: Run Browser : Browse : Sequence Read Archive : NCBI/NLM/NIH
Am I using the wrong link to get the data?
You’ve got the error flipped. It says that you have a column called absolute-filepath and it’s looking for two columns called forward-absolute-filepath and reverse-absolute-filepath.
Again, this is outlined pretty clearly in the manifest tutorial. Please read that closely.
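A quick way to see this for yourself is to compare the header row against the three paired-end column names. The sketch below does that in plain shell. check_paired_header is a made-up helper name, and it strips carriage returns on the assumption that the file may have been saved from Excel.

```shell
# Sketch: report which paired-end manifest columns are missing from a header.
# check_paired_header is an illustrative helper, not a QIIME 2 command.
check_paired_header() {
    # First line only; drop any Windows carriage return left by Excel.
    header=$(head -n 1 "$1" | tr -d '\r')
    missing=0
    for col in sample-id forward-absolute-filepath reverse-absolute-filepath; do
        # Split the header on tabs and look for an exact column-name match.
        if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qxF -- "$col"; then
            echo "missing column: $col"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_paired_header manifest.tsv
```

A manifest whose header is sample-id and absolute-filepath would be reported as missing both the forward and the reverse column, which is what the import error is saying.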
So I think my problem is that I have paired-end reads (according to all the info on NCBI where I am getting the samples from) but I only have one link for forward/reverse sequences. For example, this is one of the runs I have taken from NCBI: ERR2654632, which was imported as a fastq file. This has been the only run style I’ve been able to upload to my directory.
Is there some way I should be renaming the files to make them be in the correct format of R1 and R2? Do I need to have these files in my directory before I import them to qiime or can qiime import them directly from NCBI?
I reimported my data so I have the R1 and R2 (forward and reverse reads). I am now working on importing these data to qiime. Could you let me know if you see something wrong with my code or manifest file?
ls /mnt/home/ernakovich/cal1037/devries_files/ | grep ERR264394
I’d like to check the path to that file. And if I’ve mistyped because I’m trying to copy off your images, please use the correct address, which is the first part of the manifest path.
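If you want to check every file in the manifest at once rather than one at a time, something like the sketch below should work. It assumes a tab-separated manifest with the sample id in the first column and file paths in the next one or two columns; check_manifest_paths is a hypothetical helper name.

```shell
# Sketch: confirm that every path listed in a manifest exists on disk.
# check_manifest_paths is an illustrative helper, not a QIIME 2 command.
check_manifest_paths() {
    tab=$(printf '\t')
    # Skip the header row and drop carriage returns before reading rows.
    tail -n +2 "$1" | tr -d '\r' | (
        bad=0
        # The "|| [ -n ... ]" keeps a final row that lacks a trailing newline.
        while IFS=$tab read -r sample fwd rev || [ -n "$sample" ]; do
            for path in "$fwd" "$rev"; do
                if [ -n "$path" ] && [ ! -f "$path" ]; then
                    echo "$sample: file not found: $path"
                    bad=1
                fi
            done
        done
        exit "$bad"   # exits only this subshell; becomes the function status
    )
}

# Usage: check_manifest_paths manifest.tsv
```

Any line it prints points at a row whose path does not match what is actually in the directory, which is usually a typo in the folder name or the file name.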