The Marine Biological Laboratory
JBPC Databases

Databases maintained at the JBPC fall into two groups. The first consists of databases produced by JBPC scientists, such as GiardiaDB. Some of these are in the public domain, while others contain unpublished data protected by the MBL firewall and/or password access. The second consists of local copies of external data, such as the GenBank non-redundant nucleotide and protein BLAST databases.

Databases Produced by JBPC Scientists

  • GiardiaDB: The Giardia lamblia Genome and Gene Expression Database.

  • AntonosporaDB: The Antonospora locustae Genome Project

  • GenProtEC: E. coli Genome and Proteome Database.

  • Spraguea lophii Genome Survey

  • ICOMM: The International Census of Marine Microbes

  • Micro*Scope: A database of microbial diversity

  • GMOD.MBL.EDU is our cluster of Advanced Genome Browsers, used to create online resources that combine and analyze genome, gene expression, and annotation data for genome-scale sequencing projects. It is the primary bioinformatics engine for the third phase of the GiardiaDB project, incorporating genome assembly, genome annotation, gene expression, and high-throughput phylogenetic information into the Generic Model Organism Database (GMOD) paradigm. The GMOD server can support any ongoing genome-level analysis in the JBPC and is currently used for Antonospora, Blochmannia, Trypanosoma, EST, rotifer, and many other projects. See the GMOD Home Page for the complete list of projects and databases.

Local Copies of NCBI, Pfam, and Other External Databases

The JBPC automatically updates its local databases monthly, so BLAST, HMMER, and other analyses can be run reliably on a number of our computers, including the Beowulf Clusters. See the Server and Beowulf Cluster details for which computers hold copies of these databases. Instructions for maintaining these databases on your personal computer are provided below.

Database                                                    Local Path
MITOP (not updated every 30 days)                           /blastdb/mitop
E. coli peptides (via NCBI)                                 /blastdb/ecoli.aa
GenBank non-redundant peptides*                             /blastdb/nr (excluding environmental)
                                                            /blastdb/nr_plus_env (including environmental)
                                                            /blastdb/env_nr (environmental only)
GenBank RefSeq peptides                                     /blastdb/refseq_protein
GenBank non-redundant nucleotides*                          /blastdb/nt (excluding environmental)
                                                            /blastdb/nt_plus_env (including environmental)
                                                            /blastdb/env_nt (environmental only)
Pfam HMM library, local alignment models                    /blastdb/Pfam_fs
Pfam HMM library, global alignment models                   /blastdb/Pfam_ls
SwissProt (via NCBI)                                        /blastdb/swissprot
GMOD databases (BLAST databases associated with our         Varies according to available GMOD data. View contents
  various GMOD projects)                                      of /blastdb or contact gmod@lists.mbl.edu for details.
EukDB, the predicted proteins from 34 diverse eukaryotes    /blastdb/EukDB
  for which genome projects have been performed
ProkDB, the predicted proteins from 26 prokaryotes for      /blastdb/ProkDB
  which genome projects have been performed
SsuDB, a subset of the NCBI environmental and nt            /blastdb/SsuDB
  databases that was generated by using BLAST to pull
  out matches to 16S and 18S rRNA genes
RefEuks, a collection of peptides from reference genomes    /blastdb/RefEuks.fa
  for high-throughput eukaryotic phylogeny investigations
InterPro databases                                          (used by InterProScan)
PROSITE databases                                           /blastdb/prosite.dat
NCBI Taxonomy data                                          (used by SEALS)
SEALS GI data                                               (used by SEALS)
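
As a rough illustration of how the local paths above might be used, the sketch below prints command lines for a protein BLAST search and a Pfam scan. It assumes the legacy NCBI blastall program and the HMMER 2 hmmpfam program are the installed tools, which this page does not state. The file name query.fa, the output file name, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes legacy NCBI BLAST (blastall) and HMMER 2 (hmmpfam).
# The tools actually installed on a given JBPC machine may differ.

BLASTDB_DIR="${BLASTDB_DIR:-/blastdb}"   # local database directory listed above

# Print a blastp command for a protein query against a local database.
# $1 = query FASTA file (hypothetical name), $2 = database name, e.g. nr
blastp_cmd() {
    printf 'blastall -p blastp -d %s/%s -i %s -o %s.blastp.out\n' \
        "$BLASTDB_DIR" "$2" "$1" "$1"
}

# Print an hmmpfam command that scans a query against a Pfam HMM library.
# $1 = query FASTA file, $2 = library name: Pfam_ls (global) or Pfam_fs (local)
pfam_cmd() {
    printf 'hmmpfam %s/%s %s\n' "$BLASTDB_DIR" "$2" "$1"
}

# Only print the commands; nothing is executed against the databases here.
blastp_cmd query.fa nr
pfam_cmd query.fa Pfam_ls
```

Printing the commands first makes it easy to check the database path before starting a long search; pipe the output to sh to run them.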

Maintaining Copies of these Databases on your Personal Computer

Any computer within the MBL firewall can obtain copies of these databases by using rsync to connect to WINTER.MBL.EDU. To list the databases available on WINTER.MBL.EDU, enter the following at the command line:

rsync winter.mbl.edu::

The output should look similar to this:

yourprompt% rsync winter.mbl.edu::
mitop           mitop fasta file and blast ready database (964 KB)
pfam            Pfam_ls & Pfam_fs HMMS (compressed)(~200MB)
swissprot       SwissProt blastdb (compressed)(~150kb)
ecoli           NCBI E.coli proteins blastdb (compressed) (~1 MB)
nt              NCBI non-redundant nucleotide (nt) blastdb (compressed)(~2.5 GB)
nr              NCBI non-redundant protein (nr) blastdb (compressed) (~750 MB)
BioDB           nr,nt,swissprot,Pfam,etc (~3.5GB compressed)
linuxDB         linux formated ecoli,nr,nt,swissprot ready for blasting (~6.4GB)
seals           necessary seals databases (gi,taxonomy,etc) (~120MB compressed)
interpro        interpro database (~500MB compressed)

Each line lists a download module, its description, and its size. Some modules are compressed and will occupy more disk space than listed once uncompressed.
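
If the module names are needed in a script, the first column of the listing can be pulled out with awk. The following is a minimal sketch: the function name module_names is hypothetical, and the sample input is copied from the listing above so that the example runs without access to the server. A live server may print additional lines, such as a message of the day, that would need to be filtered out.

```shell
#!/bin/sh
# Sketch only: extracts module names (the first column) from an rsync
# module listing such as the one shown above.

# Read a module listing on stdin and print one module name per line.
module_names() {
    awk 'NF > 0 { print $1 }'
}

# Sample input copied from the listing above. Inside the MBL firewall the
# live equivalent would be: rsync winter.mbl.edu:: | module_names
sample_listing='mitop           mitop fasta file and blast ready database (964 KB)
pfam            Pfam_ls & Pfam_fs HMMS (compressed)(~200MB)
nr              NCBI non-redundant protein (nr) blastdb (compressed) (~750 MB)'

printf '%s\n' "$sample_listing" | module_names
```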

To download a module, rsync to WINTER.MBL.EDU as follows:

rsync -avuz winter.mbl.edu::[modulename] [destination on your machine]

Here is an example that downloads GenBank NR to your /blast directory:

rsync -avuz winter.mbl.edu::nr /blast
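
To keep several modules current, the same command can be wrapped in a small script. The sketch below is one possible approach, not a JBPC-provided tool: the function names, the DEST and DRY_RUN variables, and the choice of modules are all hypothetical. The --dry-run option is a standard rsync flag that reports what would be transferred without copying anything.

```shell
#!/bin/sh
# Sketch only: refresh several modules in one pass.
# DEST and the module names passed in are hypothetical; adjust to your machine.

RSYNC_HOST="winter.mbl.edu"
DEST="${DEST:-/blast}"

# Print the rsync command for one module, using the flags shown above.
# $1 = module name; $2 = optional extra flag such as --dry-run
sync_cmd() {
    printf 'rsync -avuz %s%s::%s %s\n' "${2:+$2 }" "$RSYNC_HOST" "$1" "$DEST"
}

# Run rsync for each named module; set DRY_RUN=1 to preview the transfers.
sync_modules() {
    for module in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            rsync -avuz --dry-run "${RSYNC_HOST}::${module}" "$DEST"
        else
            rsync -avuz "${RSYNC_HOST}::${module}" "$DEST"
        fi
    done
}

# Show the commands for two modules without contacting the server.
sync_cmd nr
sync_cmd swissprot --dry-run
```

From a machine inside the MBL firewall, calling sync_modules nr swissprot would perform the actual transfers. Running it on a schedule would mirror the monthly updates the JBPC applies to its own copies.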

Information about rsync can be obtained at the rsync website or by using its man page.

Supported by NIH, NSF, NASA, The Josephine Bay Paul and C. Michael Paul Foundation, W.M. Keck Foundation, G. Unger Vetlesen Foundation, and Ellison Medical Foundation.
Unless otherwise stated, all material © 2004 Bay Paul Center, MBL.