I am configuring BLAST+ on my Mac (macOS Sierra) and am having trouble configuring the nr and nt databases that I downloaded locally. I am trying to follow NCBI's instructions here, and am getting hung up on the Configuration and Example Execution steps.
They say to change my .bash_profile so that it says:
export PATH=$PATH:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/ncbi-blast-2.6.0+/bin
That works fine, and they say to configure a path for BLASTDB "similarly", but pointing to the directory where my DB will be, so I have done this:
export BLASTDB=$BLASTDB:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/blastdb/nt.00
which specifies the exact folder that I got when I unzipped the nt tar file from their FTP. With this path, if I run the command...
blastn -query test_query.fa -db nt.00 -task blastn -outfmt "7 qseqid sseqid evalue bitscore" -max_target_seqs 5
then it runs successfully and I get results, but I am worried that the query is only being checked against the nt.00 volume rather than the entire nt database, especially because if I run my test_query.fa sequence on the Web BLAST, I get different results.
Also, their instructions say that the path only needs to point to the folder that contains the database folder nt.00 from the tar I unzipped, not to nt.00 itself. In my case that would just be "blastdb/" (as opposed to "blastdb/nt.00/", which then contains nt.00.nhd, nt.00.nal, etc.). That makes sense, because when I am working I want to be able to run blastn on the nt database but also blastp on the nr one, etc., by changing the -db flag on my command, and there shouldn't be a problem with having them all in this folder, right? But if I must specify the path for BLASTDB with the nt.00 DB added to the end, how could I ever use nr.00 in the same folder (blastdb/)? Essentially, I want to do as the instructions say, and just have this:
export BLASTDB=$BLASTDB:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/blastdb/
And then depending on what database I want to use I could just say so after the -db flag on my command. But when I make the path like that above, it gives me this error:
BLAST Database error: No alias or index file found for nucleotide database [nt] in search path [/Users/LJStout::/Users/LJStout/Documents/Luke/Research/Pedulla 17-18/blast/blastdb:]
I have tried running that same blastn command with both "nt" and "nt.00" after the -db flag, and have tried these commands with the path for BLASTDB ending in "blastdb/", in "blastdb/nt", and of course in "blastdb/nt.00", which is the only one that runs without errors.
Here's an example of another thread I read where the OP is worried about his executions not checking the entire nt.00 folder; that was different from my problem, however.
Thanks for your help!
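As a side note on the search path printed in that error: the doubled colon may come from $BLASTDB being empty when the export line is evaluated, which leaves a leading colon in the value. Below is a minimal sketch of an export that avoids this. DB_DIR is a hypothetical location used only for illustration, and this is a tidiness fix, not the cause of the error itself.

```shell
# Hypothetical path for illustration; substitute the folder that holds your database files.
DB_DIR="$HOME/blast/blastdb"

# Start from an unset variable so the example behaves the same on every run.
unset BLASTDB

# ${BLASTDB:+$BLASTDB:} expands to "<old value>:" only when BLASTDB is already
# set and non-empty, so no stray leading colon ends up in the value.
# The double quotes also make backslash-escaping of spaces unnecessary.
export BLASTDB="${BLASTDB:+$BLASTDB:}$DB_DIR"
echo "$BLASTDB"
```

Quoting the whole value this way should also work for a path containing spaces, such as the "Pedulla 17-18" folder above.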
This whole problem came down to having the nt.00 and nr.00 folders (the original folders that result from unzipping their respective .tar.gz files) in the same parent folder, when it is their contents that should be in the same parent folder. I simply copied the contents over to my new, single parent folder and deleted the folders they came in. I was kind of misled by the instructions; it was a simple mistake. Now, I have one folder,
blastdb/
that contains all of the contents of every database I plan on using, including nt, nr, and refseq.
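For anyone who lands in the same spot, here is a rough sketch of the flattening step. It builds a throwaway demo layout in a temporary directory with empty placeholder files (the file names are only stand-ins, not a full listing of what the archives contain), so DB_DIR and everything under it are assumptions. Point DB_DIR at your real blastdb/ folder and skip the demo setup to use it for real.

```shell
# Demo layout in a temp dir (hypothetical); replace DB_DIR with your real blastdb/ path.
DB_DIR="$(mktemp -d)/blastdb"
mkdir -p "$DB_DIR/nt.00" "$DB_DIR/nr.00"
# Empty placeholder files standing in for the unpacked database files.
touch "$DB_DIR/nt.00/nt.00.nhr" "$DB_DIR/nt.00/nt.nal"
touch "$DB_DIR/nr.00/nr.00.phr" "$DB_DIR/nr.00/nr.pal"

# Move the contents of each per-archive subfolder up into blastdb/,
# then remove the now-empty subfolder.
for sub in "$DB_DIR"/*/; do
    mv "$sub"* "$DB_DIR"/
    rmdir "$sub"
done

# Point BLASTDB at the single parent folder only, not at any one database.
export BLASTDB="$DB_DIR"
ls "$BLASTDB"
```

With the files side by side like this, the database is chosen with the -db flag alone (for example -db nt or -db nr), and BLASTDB never needs to change.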