dada2 removePrimers PacBio CCS issue

Just hoping to get some advice on using dada2 for PacBio CCS data! This is the first time I have worked with PacBio CCS. My original intention was to use the QIIME 2 pipeline, but I read on here that the version of dada2 within QIIME 2 does not yet support long-read data. So I am using this guide (DADA2 + PacBio: Fecal Samples) to process the data in R.

I am getting an error message when trying to run the removePrimers command. This is the latest code I have attempted, based on the guidance here (removePrimers: Removes primers and orients reads in a consistent direction. in dada2: Accurate, high-resolution sample inference from amplicon sequencing data):

F1 <- "~Documents/16SCSS Jun21/R Dada2/F1-CSS.fastq"
f1n <- file.path(F1, "F1-CCS.fastq")
path.out <- "Figures/"
path.rds <- "RDS/"
F27 <- "AGAGTTTGATCMTGGCTCAG"
R1492 <- "TACGGYTACCTTGTTAGGACTT"
rc <- dada2:::rc
theme_set(theme_bw())

primF1 <- removePrimers(f1n, path.out, primer.fwd = F27, primer.rev = R1492, trim.fwd = TRUE, trim.rev = TRUE, orient = TRUE, verbose = TRUE)

When I try to run this I get the following error message:
Error in removePrimers(f1n, path.out, primer.fwd = F27, primer.rev = R1492, :
Some input files do not exist.

I am not really sure what to make of this, as I have defined the path to the fastq file. Does anyone know where I should start with troubleshooting?

Thank you for any advice you can give!

Hi @Sam_Prudence,
I have not used dada2 for PacBio data yet, but the error seems to suggest there is a typo in the path you set. In particular, my guess is that it should be:
"~/Documents/16SCSS Jun21/R Dada2/F1-CSS.fastq"

As an aside, I would remove the spaces from the path; it is never a good idea to use them.
If you are in doubt, open a new terminal and use the 'ls' command to see if the path exists (the part with spaces has to be quoted):
ls ~/"Documents/16SCSS Jun21/R Dada2/F1-CSS.fastq"

(That command assumes you are using a Unix machine.)
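
The same check can also be done from the R console, which works on Windows too. This is only a minimal sketch in base R: the two paths are the ones discussed above, and `check_input` is a hypothetical helper name, not a dada2 function.

```r
# Hypothetical helper: report whether each path exists after "~" expansion.
check_input <- function(paths) {
  expanded <- path.expand(paths)  # turns a leading "~/" into the home directory
  data.frame(path = expanded,
             exists = file.exists(expanded),
             stringsAsFactors = FALSE)
}

# "~Documents" (no slash) is not the same location as "~/Documents".
check_input(c("~Documents/16SCSS Jun21/R Dada2/F1-CSS.fastq",
              "~/Documents/16SCSS Jun21/R Dada2/F1-CSS.fastq"))
```

Any row with `exists` equal to FALSE is a path that removePrimers would also fail to find.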
Hope it helps,
Luca

Thanks for your help! I have made the modifications you suggested (removed the spaces and added / to the beginning of the file path), but I am unfortunately getting the same error message.

I am running this through RStudio, and the file path does exist when I navigate the Files tab in RStudio.

My working directory is the "R_Dada2" folder, which contains the fastq file F1-CSS.fastq. I noticed that when I run these commands, a new folder is created within the working directory; it holds no files, only a folder called "no primers". I also tried changing the first line to F1 <- "F1-CSS.fastq". Could this be indicative of something I am doing wrong?

Hi @Sam_Prudence,

I guess dada2 is creating the new output folder ("Figures" in your case?) but is unable to populate it because of the error.

If you are within the 'R_dada2' folder, could you run the 'ls' command and attach the result, please?
Can you also copy the latest command you used and the error it produces, please?
Cheers,
Luca

Hi @llenzi, I see! Sure, no problem. Please see below the ls output and the full commands plus errors.

primF1 <- removePrimers(f1n, path.out, primer.fwd = F27, primer.rev = R1492, trim.fwd = TRUE, trim.rev = TRUE, orient = TRUE, verbose = TRUE)
Error in removePrimers(f1n, path.out, primer.fwd = F27, primer.rev = R1492, :
Some input files do not exist.

primF1 <- removePrimers(f1n, nopsF1, primer.fwd=F27, primer.rev=dada2:::rc(R1492), orient=TRUE)
Creating output directory: F1-CSS.fastq/noprimers
Error in removePrimers(f1n, nopsF1, primer.fwd = F27, primer.rev = dada2:::rc(R1492), :
Some input files do not exist.

ls
function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE,
pattern, sorted = TRUE)
{
if (!missing(name)) {
pos <- tryCatch(name, error = function(e) e)
if (inherits(pos, "error")) {
name <- substitute(name)
if (!is.character(name))
name <- deparse(name)
warning(gettextf("%s converted to character string",
sQuote(name)), domain = NA)
pos <- name
}
}
all.names <- .Internal(ls(envir, all.names, sorted))
if (!missing(pattern)) {
if ((ll <- length(grep("[", pattern, fixed = TRUE))) &&
ll != length(grep("]", pattern, fixed = TRUE))) {
if (pattern == "[") {
pattern <- "\["
warning("replaced regular expression pattern '[' by '\\['")
}
else if (length(grep("[^\\]\[<-", pattern))) {
pattern <- sub("\[<-", "\\\[<-",
pattern)
warning("replaced '[<-' by '\\[<-' in regular expression pattern")
}
}
grep(pattern, all.names, value = TRUE)
}
else all.names
}
<bytecode: 0x0000025e8aaae740>
<environment: namespace:base>

Hi @Sam_Prudence, sorry, my bad.
I meant for the 'ls' command to be run in a new terminal session in the "R_dada2" folder!
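
If a terminal is not handy, base R can list the folder contents as well. Typing `ls` without parentheses only prints the source of the function, and `ls()` lists R objects rather than files, so `list.files()` is the closer equivalent. A small sketch, assuming the working directory is the folder that holds the reads:

```r
getwd()                                    # confirm which folder R is working in
list.files(pattern = "\\.fastq(\\.gz)?$")  # FASTQ files (plain or gzipped) in that folder
```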

Cheers
Luca

Hi @llenzi, ah, apologies, my bad! I have now managed to get this working. It turns out I had not properly processed my files prior to starting the pipeline! The sequencing service I used provides a fastq processor which generates a .gz file; after generating this and making sure the file paths were all correct, this first step now appears to be running.
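
For anyone hitting the same error, here is a rough sketch of how the input path could be built and checked before calling removePrimers. In the code from the first post, F1 already pointed at a file, so file.path(F1, "F1-CCS.fastq") appended a second file name to it (spelled CCS rather than CSS), which may well have contributed to the "do not exist" error. The helper name `build_input` and the file name `F1-CSS.fastq.gz` below are placeholders for illustration, not the actual names from the run.

```r
# Hypothetical helper: join a folder and a file name, and stop early if the file is missing.
build_input <- function(dir, filename) {
  fn <- file.path(dir, filename)
  if (!file.exists(fn)) {
    stop("Input file not found: ", fn)
  }
  fn
}

# Placeholder names; adjust to the real folder and file.
# These lines are left commented because they need dada2 and the real reads.
# f1n    <- build_input(".", "F1-CSS.fastq.gz")
# nopsF1 <- file.path("noprimers", basename(f1n))   # output file path, not a folder
# primF1 <- dada2::removePrimers(f1n, nopsF1, primer.fwd = F27,
#                                primer.rev = dada2:::rc(R1492), orient = TRUE)
```

Failing early with the full path in the message makes it easier to spot a stray file name or a missing slash than the generic error from removePrimers.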

Thanks for your help!

Sam

Hi @Sam_Prudence,
Well done!
No worries!
Best wishes
Luca