data import type and directory format

Hi @jwdebelius,

I did try a variety of import types but I have not found one that works with the files I have. In reading through the forum and tutorials I think I need to use the manifest option to import data but I am having trouble understanding what to include in a .txt or .tsv file. I will keep looking through the forum but if you have any advice as to how to format the document or what needs to be in it, I’d appreciate it! I think it is different from the metadata, but in the import tutorial/page it notes some metadata info.

Thanks,
Christina

Hi @cal1037,

Yes, my assumption would be manifest. You can prepare this in Excel like a regular table with three columns: sample-id, forward-absolute-filepath, and reverse-absolute-filepath. In each row, you map whatever name you want to call the sample in the end to the forward (probably contains an R1) and reverse (same name with an R2) files. You need to use the absolute path, which you can probably get by adding $PWD to your path from the folder where you build your manifest.

Once you have constructed it in Excel, go to File > Save As and then select “text” from the drop-down menu. This will give you a tab-separated manifest.

I like to keep mine separate from my metadata because I find it easier to troubleshoot that way, but you need to make sure the sample IDs in the file line up with whatever IDs are in your metadata.
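If you would rather build the manifest from the command line than in Excel, something like the sketch below might work. It assumes all reads sit in one folder and are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz; that naming pattern and the function name make_manifest are illustrative assumptions, not anything QIIME 2 requires.

```shell
# Sketch: write a paired-end manifest for every <sample>_R1.fastq.gz in a folder.
# ASSUMPTION: reads are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
make_manifest() {
    # $1: absolute path of the folder holding the reads
    # $2: manifest file to write
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "$2"
    for fwd in "$1"/*_R1.fastq.gz; do
        [ -e "$fwd" ] || continue                  # nothing matched the pattern
        rev="${fwd%_R1.fastq.gz}_R2.fastq.gz"      # swap the R1 suffix for R2
        sample="$(basename "$fwd" _R1.fastq.gz)"   # sample name = file name minus suffix
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev" >> "$2"
    done
}
```

From inside the folder with the reads, you could call it as make_manifest "$PWD" "$PWD/manifest.tsv"; $PWD expands to the absolute path of the current folder, which is what makes the file paths absolute.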

Best,
Justine


Hi @jwdebelius,

I created a manifest file with sample-id and absolute-filepath. I think I am using the correct import type and format because I don't get an error regarding that, but I am now getting this other error (attached).

I don't know what it means by 'No transformation'. Does this mean there is a problem with my code/the absolute file path or is there a problem with the format my data is in or something else?

I really appreciate the help, thanks so much!
Christina

Hi @cal1037,

You are using SingleEndFastqManifestPhred33V2 as your format, but your type is SampleData[PairedEndSequencesWithQuality]. If you have single-end data, then you need to use SampleData[SequencesWithQuality].
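To make the pairing concrete, here is a rough sketch that reads the manifest header and prints the --type and --input-format options that go together. The function name suggest_import is made up for illustration, and the sketch assumes Phred33 quality scores.

```shell
# Sketch: suggest a matching --type / --input-format pair from a manifest header.
# ASSUMPTION: quality scores are Phred33.
suggest_import() {
    # $1: path to the manifest file
    case "$(head -n 1 "$1")" in
        *forward-absolute-filepath*reverse-absolute-filepath*)
            # Forward and reverse columns: paired-end type with paired-end format
            echo "--type 'SampleData[PairedEndSequencesWithQuality]' --input-format PairedEndFastqManifestPhred33V2" ;;
        *absolute-filepath*)
            # A single filepath column: single-end type with single-end format
            echo "--type 'SampleData[SequencesWithQuality]' --input-format SingleEndFastqManifestPhred33V2" ;;
        *)
            echo "unrecognized manifest header" >&2
            return 1 ;;
    esac
}
```

The point is only that the type and the format have to agree: a paired-end type with a single-end manifest format is what leads to the 'No transformation' message.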

Best,
Justine

Hi @jwdebelius,

Thank you so much for the fast replies! I have paired-end sequences, so I fixed my input code.
code:
qiime tools import \ --type 'SampleData[PairedEndSequencesWithQuality]' \ --input-path /mnt/home/ernakovich/cal1037/devries_files/manifest_file_devries_4.txt \ --output-path /mnt/home/ernakovich/cal1037/devries_files/devries-demux.qza \ --input-format PairedEndFastqManifestPhred33V2

I have run the code on one line and on multiple lines, but either way I keep getting this error

Do you see something wrong with my code which is leading to this error?

Hi @cal1037,

There’s something off with either the spacing or the quotes. I think it’s probably that you’re using a curly ‘ quote instead of the straight ' quote. So, maybe try

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /mnt/home/ernakovich/cal1037/devries_files/manifest_file_devries_4.txt \
  --output-path /mnt/home/ernakovich/cal1037/devries_files/devries-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2

If that doesn’t work, could you post a picture of the command in your terminal?

Best,
Justine


Hi @jwdebelius,

I am still getting this error: [image]

Here is the code I am using: [image]
It doesn't work when it is on different lines, which is why I have it all on one line.

When I run the exact code you sent I get this error: [image]

When the only change I make to your code is putting it onto one line I get this error: [image]

I am not sure what is wrong with the code I am typing, or why the code you sent doesn't recognize the type, output path, input path, etc.

I really appreciate the help! Thank you so much for taking the time to work with me!
Best,
Christina

Hi @cal1037,
You don’t need to type "\" when you enter a command on a single line; it just marks the spot where the command is broken onto a new line with "Enter".

Good luck


Hi @cal1037,

As @Iris says, you only need the “\” if you’re doing multi-line wrapping. So, you can either try without it or copy the command line by line.
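Here is a small illustration of how the backslash behaves, with echo standing in for qiime tools import purely for the demo (the variable names are made up). The backslash has to be the very last character on the line. If a space follows it, the shell ends the command there and tries to run the next line as a separate command.

```shell
# Multi-line form: each backslash is immediately followed by the line break.
multi="$(echo --type paired \
    --input-path manifest.txt \
    --output-path demux.qza)"

# Single-line form: the same command with no backslashes at all.
single="$(echo --type paired --input-path manifest.txt --output-path demux.qza)"

# Both forms hand the shell exactly the same arguments.
[ "$multi" = "$single" ] && echo "same command"
```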

Best,
Justine

Thanks for the help @Iris and @jwdebelius.

I am still having trouble importing the data. I thought it was paired-end data so I was using [image], but it appears this is not correct... I am confused about the error, though, because my manifest file does have the header absolute-filepath, as seen here (the top few lines of my .txt file): [image]

I am not sure if I have done something wrong setting up my manifest file, through reading on the forum/tutorials I thought I set it up correctly. Do you all see a problem with how it is put together?

I know my data is paired end (according to NCBI where I downloaded the data from) but the ...Phred64V2 also does not work for importing the data.

Is there another data type/format that my data might fit under?

I really appreciate the help!
Christina

Hi @cal1037,

If you have paired end data, you need to use the paired end manifest format. You have a single end manifest format. Please go back to my previous post or the tutorial on paired end sequences.

Best,
Justine


Hi @jwdebelius,

I am not sure how the setup of my manifest file is incorrect, from looking at other posts and the errors I get. If I am understanding the error correctly, it wants me to use the "absolute-filepath" column, which I have. Am I misinterpreting the error (below)? [image]

Based on the file run code I don't know how to tell the R1 from R2 (forward from reverse), does this mean it is not possible to import the data? This is a link of one of the runs/sequences: Run Browser : Browse : Sequence Read Archive : NCBI/NLM/NIH
Am I using the wrong link to get the data?

Thank you for the help!
Christina

Hi @cal1037,

You’ve got the error flipped. It says that you have a column called absolute-filepath and it’s looking for two columns called forward-absolute-filepath and reverse-absolute-filepath.

Again, this is outlined pretty clearly in the manifest tutorial. Please read that closely.
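If it helps to see exactly which column names the paired-end format would not find, a quick check along these lines could be run against the manifest. This is only a sketch; the function name missing_paired_columns is hypothetical, and the step that strips Windows line endings is a guess at what an Excel export might leave behind.

```shell
# Sketch: print each required paired-end column that the manifest header lacks.
missing_paired_columns() {
    # $1: path to the manifest; strip a possible Windows carriage return
    header="$(head -n 1 "$1" | tr -d '\r')"
    for col in sample-id forward-absolute-filepath reverse-absolute-filepath; do
        # Wrap both sides in tabs so that "absolute-filepath" alone
        # cannot pass for "forward-absolute-filepath".
        case "$(printf '\t%s\t' "$header")" in
            *"$(printf '\t%s\t' "$col")"*) ;;   # column present
            *) echo "$col" ;;                   # column missing
        esac
    done
}
```

No output means all three columns are there. A single-end manifest with sample-id and absolute-filepath would list both the forward and reverse column names as missing.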

Best,
Justine


Hi @jwdebelius,

So I think my problem is that I have paired-end reads (according to all the info on NCBI, where I am getting the samples from) but I only have one link for the forward/reverse sequences. For example, this is one of the runs I have taken from NCBI: ERR2654632, which was imported as a fastq file. This has been the only run style I’ve been able to upload to my directory.

I have been unable to figure out how to get these links (which appear to be forward/reverse reads) http://ftp.sra.ebi.ac.uk/vol1/run/ERR265/ERR2654632/H11C_S287_L001_R1_001.fastq.gz and http://ftp.sra.ebi.ac.uk/vol1/run/ERR265/ERR2654632/H11C_S287_L001_R2_001.fastq.gz to import into my directory/work with the manifest format.

Is there some way I should be renaming the files to make them be in the correct format of R1 and R2? Do I need to have these files in my directory before I import them to qiime or can qiime import them directly from NCBI?

Thanks,
Christina

Hi @cal1037,

Right now, the files do need to be in your directory. So, you need to download them first.

Best,
Justine

Hi @jwdebelius,

I reimported my data so I have the R1 and R2 (forward and reverse reads). I am now working on importing these data to qiime. Could you let me know if you see something wrong with my code or manifest file?

This image shows my code and error:

This image shows my manifest file.

Thanks for the help!
Christina

Hi @cal1037,

Could you please show the print out of

ls /mnt/home/ernakovich/cal1037/devries_files/ | grep ERR264394

I’d like to check the path to that file. And if I’ve mistyped because I’m trying to copy off your images, please use the correct address that’s the first part of the manifest path.

Best,
Justine

Hi @jwdebelius,

Here is the path file for the ERR264394 which are the reads before they were split into forward (_1) and reverse (_2):

This is the file path I used in the code I sent this morning which is the split reads

Should I be calling upon the non-split reads (image 1) to pull the files into qiime? That is what I was doing previously but it did not work.

Thanks for the help!
Christina

Hi @cal1037,

I think you need a “/” in the manifest path at the spot where you currently have a space.

So, for line 1, the forward path should be
/mnt/home/ernakovich/cal1037/devries_files/import_devries_split/ERR264394_1.fastq.
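One way to catch this kind of problem before running the import is to test every path in the manifest. The sketch below assumes a tab-separated manifest with a header row; the function name check_manifest_paths is made up for illustration.

```shell
# Sketch: report every file path in a manifest that does not exist on disk.
check_manifest_paths() {
    # $1: path to the manifest. Skip the header row, then split each row on tabs only,
    # so a stray space inside a path stays part of that path and shows up as missing.
    tail -n +2 "$1" | tr -d '\r' | while IFS="$(printf '\t')" read -r sample fwd rev; do
        for path in "$fwd" "$rev"; do
            [ -z "$path" ] && continue          # single-end manifests have no third column
            [ -f "$path" ] || echo "missing for $sample: $path"
        done
    done
}
```

If it prints nothing, every listed file was found. Any line it does print shows the path exactly as written in the manifest, which makes a space where a “/” should be easier to spot.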

Best,
Justine
