The Marine Biological Laboratory

Software packages currently installed on bioware machines:

Name Version Notes
Alidot 2.05 The program alidot is designed to detect conserved RNA secondary structures in small data sets of related RNA sequences.
AlignIR
LI-COR's AlignIR software provides sequence alignment and assembly functions.
Apollo 1.4.4 Apollo is a genome annotation viewer and editor.
Arachne
Arachne is a tool for assembling genome sequences from whole genome shotgun reads, mostly in forward-reverse pairs obtained by sequencing clone ends. Use of Arachne on HPSRV.MBL.EDU is restricted to large-scale sequence assembly projects.
Arb
The ARB software is a graphically oriented package comprising various tools for sequence database handling and data analysis. Arb may be installed on a number of computers, but the important Arb databases are only maintained on GREENGENES.MBL.EDU, a computer specifically configured for Arb.
Artemis 5 Artemis is a free genome viewer and annotation tool that allows visualization of sequence features and the results of analyses within the context of the sequence, and its six-frame translation.
ATV 1.92 ATV (A Tree Viewer) - a phylogenetic tree display tool.
BAMBUS 2.33 BAMBUS is the first publicly available genome assembly scaffolding program.
BLAST 2.2.6 BLAST 2.0 (Basic Local Alignment Search Tool) provides a method for rapid searching of nucleotide and protein databases. High-throughput, parallel use of the blastall program is provided on the Beowulf Clusters - please submit your large jobs to the Beowulf Clusters. We maintain updated, local copies of several BLAST databases.
CAP3
The CAP3 sequence assembler makes use of a large number of forward-reverse constraints to locate and correct errors in layout of sequence reads.
Clustalw 1.82 Clustal W is a general purpose multiple sequence alignment program for DNA or proteins.
Clustalx 1.82 Clustal X is a new windows interface for the ClustalW multiple sequence alignment program.
Cluster 1.24 Cluster and TreeView are programs that provide a computational and graphical environment for analyzing data from DNA microarray experiments, SAGE, or other gene expression datasets. The command line version is invoked by hcluster.
COILS 2.2 COILS is a program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation.
Combiner 2.12 The Combiner is a program that predicts gene models using the output from other annotation software.
Consed 12 Consed is a program for editing sequence assemblies created with Phil Green's PHRAP assembly program, Arachne, or other sequence assemblers supporting the ACE format. Use of Consed on HPSRV.MBL.EDU is restricted to large-scale sequence assembly projects.
CONSEL v0.1f CONSEL is a program for assessing the confidence of phylogenetic tree selection.
Critica 1.0.5b CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a program for identifying the protein coding regions in DNA based on a combination of evidence from dicodon usage frequencies and a comparative analysis of the relative conservation of DNA sequence and their potential protein products. We highly recommend you submit large jobs to the Beowulf Clusters to take advantage of job parallelization.
Cross_match 0.990329 Cross-match is a general-purpose utility, based on SWAT, for comparing any two sets of (long or short) DNA sequences. It is most commonly used for vector screening.
DOTUR 1.53 Distance Based OTU and Richness; DOTUR is a computer program that takes a distance matrix describing the genetic distance between DNA sequence data and assigns sequences to operational taxonomic units (OTUs) using either the furthest, average, or nearest neighbor algorithms for all possible distances that can be described using the distance matrix. Using the OTU composition data, DOTUR constructs collector's and rarefaction curves for sampling intensity, richness estimators, and diversity indices.
ELPH 0.1.4 ELPH is a general-purpose Gibbs sampler for finding motifs in a set of DNA or protein sequences.
Emboss 2.8.0 EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology user community.
FASTA 34 Many options for local alignment searches of sequence databases. We maintain updated, local copies of several blast databases that can be read by FASTA.
FastDNAml 1.2.2 fastDNAml is a program for estimating maximum likelihood phylogenetic trees from nucleotide sequences.
FastME
FastME is a distance based phylogeny reconstruction algorithm.
Fischer perl scripts
A number of small sequence manipulation programs, including file conversion and translation.
Forester 1.92 FORESTER is being developed as a framework for (automated) sequence function prediction based on phylogenetic information, e.g. phylogenomics.
GCcalc.pl
GCCalc calculates GC content at all codon positions, as well as 1st, 2nd, and 3rd codon positions alone.
GCG 10.3 Molecular biologists worldwide use the GCG Wisconsin Package as their software of choice for comprehensive sequence analysis. This is licensed software that only runs on GCG.MBL.EDU. The GCG databases are updated as they are provided from the vendor.
GDE
An integrated Linux environment for bioinformatics and evolutionary analysis based on the Genetic Data Environment.
Genesplicer
A fast, flexible system for detecting splice sites in the genomic DNA of various eukaryotes.
GLIMMER 2.13 A system for finding genes in microbial DNA, especially the genomes of bacteria and archaea.
GLIMMERM 2.5.1 A gene finder derived from Glimmer, but developed specifically for eukaryotes.
GMEP 1.0.1 Combined expression data and sequence analysis for prediction of regulatory motifs from expression profiles.
GraphViz 1.1 Graphviz - open source graph drawing software
Hmmer 2.3.2 HMMER is a freely distributable implementation of profile HMM software for protein sequence analysis. High-throughput, parallel use of several of the HMMER programs is provided on the Beowulf Clusters - please submit your large jobs to the Beowulf Clusters. We maintain updated, local copies of several Pfam databases.
Infernal 0.1 Infernal - inference of RNA secondary structure alignments
InterPro Scan 3.3 InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
Jalview 1.7.5b Jalview - a java multiple alignment editor
Jnet
Jnet: A neural network protein secondary structure prediction method.
JOT
Random number generator.
Lamarc 1.1.1 LAMARC is a package of programs for computing population parameters, such as population size, population growth rate and migration rates by using likelihoods for samples of data (sequences, microsatellites, and electrophoretic polymorphisms) from populations.
Ldhat
A package for coalescent analysis of patterns of linkage disequilibrium and estimation of the population recombination rate.
LUCY 1.18p A sequence clean-up program (vector and quality screening).
McArthur perl scripts
A variety of simple sequence and file manipulation programs, including processing of SAGE tags.
Meme / Mast 3.0.4 Motif discovery and search. Note, to avoid GCG's version of MEME use memenogcg.
Mesquite
Mesquite is software for evolutionary biology, designed to help biologists analyze comparative data about organisms.
Migrate 1.7.3 Migrate estimates the effective population sizes and migration rates of n constant populations using nonrecombining sequence, microsatellite or enzyme electrophoretic data.
Modeller 7v7 A program for comparative protein structure modelling by satisfaction of spatial restraints.
Modeltest 3.7 Modeltest helps a user to choose the model of DNA substitution that best fits the data, among 56 possible models. Model selection is performed using many processors on the Beowulf Clusters.
Molphy 2.3b3 MOLPHY is a program package for MOLecular PHYlogenetics.
MrBayes 3.1.1 MrBayes is a program for the Bayesian estimation of phylogeny. We are experimenting with high-speed MrBayes analyses - parallel versions on the Beowulf Clusters may or may not be available. MACBLASTER.MBL.EDU (an Apple G5) performs very well. If you have many analyses to perform, we request you restrict your analysis to the Beowulf Clusters, even if they are running in single-processor mode at the moment. Intense MrBayes users should contact biocomp@lists.mbl.edu for training on use of the HP Alpha HPSRV.MBL.EDU.
MUMmer 3.11 A system for aligning whole genome sequences.
MUSCLE developer version Advanced multiple protein sequence alignment software.
NAMOT 2 Nucleic acid molecule modeling tool.
Paml 3.13d Phylogenetic Analysis by Maximum Likelihood (PAML)
Paup 4.0b10 PAUP is one of the most widely used software packages for the inference of evolutionary trees. PAUP may be installed on a number of computers, but due to the computational intensity of these searches, large PAUP analyses are only tolerated on the Beowulf Clusters, where several types of analyses can be performed using multiple processors.
Phrap 0.990329 Phrap is a program for assembling shotgun DNA sequence data. A number of pipelines are available for running the complete PHRED, CROSS_MATCH, and PHRAP family of programs. Contact biocomp@lists.mbl.edu for details.
Phred
Phred reads DNA sequencer trace data, calls bases, assigns quality values to the bases, and writes the base calls and quality values to output files.
Phylip 3.5c PHYLIP is a free package of programs for inferring phylogenies. Large analyses are only tolerated on the Beowulf Clusters, where use of multiple processors is in development.
Polyphred 4.22 PolyPhred is a program that compares fluorescence-based sequences across traces obtained from different individuals to identify heterozygous sites for single nucleotide substitutions.
Primer3
Primer3 picks primers for PCR reactions.
Probcons 1.1 PROBCONS is an efficient protein multiple sequence alignment program, which has demonstrated a statistically significant improvement in accuracy compared to several leading alignment tools.
Pseq-gen 1.1 Protein Sequence-Generator: An application for the Monte Carlo simulation of protein sequence evolution along phylogenetic trees.
Psi-Phi
Psi-Phi is a suite of programs designed to identify pseudogenes (attributable both to misannotation and to non-recognition) through comparative analyses of related genomes.
PSORT 2 2.0 Prediction of subcellular localization of proteins.
PUZZLE

Puzzleboot 1.03 Application of the Roger and Holder puzzleboot routine to the Beowulf Clusters, with use of multiple processors.
Pymol 0.88 PyMOL is a user-sponsored molecular modeling system with an open-source foundation.
R 1.8.1 R is a language and environment for statistical computing and graphics.
R8s 1.6 This is a program for estimating absolute rates of molecular evolution and divergence times on a phylogenetic tree.
RBSfinder
RBSfinder is a Perl script that implements an algorithm to find ribosome binding sites for genes in bacterial and archaeal genomes.
Readseq
Readseq takes on the job of guessing what your input biosequence data format is and converting it to what your software knows how to handle.
ren
Whereas mv can rename (as opposed to move) only one file at a time, ren can rename many files according to search and replacement patterns.
RepeatFinder
RepeatFinder is a computational system for analysis of repetitive structure of genomic sequences.
Reputer
The REPuter program family provides state of the art software solutions to compute and visualize repeats in whole genomes or chromosomes.
Rrtree 1.1 RRTree is a program for relative-rate tests. It compares substitution rates between DNA or protein sequences grouped in phylogenetically defined lineages: relative-rate tests with a tree.
SAM 3.4 SAM (Sequence Alignment and Modeling) utilizes hidden Markov models (HMMs) to perform sequence similarity searches and multiple sequence alignments.
SEALS 0.824 SEALS (A System for Easy Analysis of Lots of Sequences) is a software package of handy sequence and blast report manipulation tools. SEALS is no longer supported by the author - not all functions may continue to work.
Seg
A program for filtering low complexity regions in amino acid sequences.
Seq-gen 1.2.7 An application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees.
Sequencher 4.2 An OSX desktop application for sequence fragment viewing, editing, and assembling. The JBPC has one keyserver-based license. Email biocomp@lists.mbl.edu for instructions on obtaining a client and connecting your client to the keyserver.
SignalP 3.0 SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks and hidden Markov models. Current limits: 6000 amino acid maximum sequence length, maximum of 2000 sequences per run, maximum of 200,000 total number of residues per run.
Sim4
Alignment of expressed DNA sequence with a genomic sequence, allowing for introns.
SOAP 1.1 Analysis and cleaning of multiple sequence alignments.
SONS 1.0 Shared OTUs and Similarity; SONS uses non-parametric estimators to estimate similarity between communities based on membership and structure. Because SONS is directly compatible with output files from DOTUR, it is possible to quickly determine the fraction of OTUs shared by two communities for any desired distance level.
T-Coffee 1.37 T-Coffee is a multiple sequence alignment package.
Ta2ace 1.4 Ta2ace provides a simple way to convert the output of TIGR assembler to the new ACE format used by Consed, one of the most widely used assembly editors.
TargetP 1.1 TargetP predicts the subcellular location of eukaryotic protein sequences. The subcellular location assignment is based on the predicted presence of any of the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP), or secretory pathway signal peptide (SP). Current limits: 4000 amino acid maximum sequence length, maximum of 2000 sequences per run, maximum of 200,000 total number of residues per run.
TIGR Assembler 2 The TIGR Assembler is the classic assembly tool developed by TIGR to build a consensus sequence from smaller sequence fragments.
TIGR Gene Indices Software Tools
This package automates clustering and assembly of a large EST/mRNA dataset. The clustering is performed by a slightly modified version of NCBI's megablast, and the resulting clusters are then assembled using the CAP3 assembly program. TGICL starts with a large multi-FASTA file (and an optional peer quality values file) and outputs the assembly files as produced by CAP3. Vector should be removed by LUCY, not CROSSMATCH, and input sequence should be in uppercase.
TmHmm 2.0c Prediction of transmembrane helices in proteins
tnimage 3.3.15a A scientific image analysis program that allows you to create, edit, analyze, and produce color prints of images. It is particularly useful for analyzing images of SDS and agarose gels and X-ray or MRI images.
Traceviewer 3.0.2 TraceViewer is a Java program that allows you to see, print, and edit DNA sequencing traces.
TransTerm
TransTerm is a program that finds rho-independent transcription terminators in bacterial genomes.
TreeClimber 1.0 TreeClimber uses a parsimony-based test statistic to determine whether the observed differences between the phylogenies of multiple communities are due to an accumulation of evolutionary variation or some perturbation.
Tree-Puzzle (formerly PUZZLE) 5.1 Maximum likelihood analysis for nucleotide, amino acid, and two-state data.
Treeview (Michael Eisen)

Unveil 1 Unveil is an ab initio gene prediction program based loosely on the VEIL design.
Vector NTI 7.2 Desktop sequence analysis and molecular biology data management software.
Vienna RNA package 1.4 RNA secondary structure prediction and comparison.
Weka 3.4.1 Data mining software in Java.
Wise 2.2.0 Wise2 compares a protein sequence to an EST or genomic DNA sequence, allowing for introns and frameshifting errors. High-throughput, parallel use of several of the WISE programs is provided on the Beowulf Clusters - please submit your large jobs to the Beowulf Clusters. We maintain updated, local copies of several Pfam databases.
Wu-blast 2 Washington University BLAST (WU-BLAST) version 2.0 is a powerful software package for gene and protein identification, using sensitive, selective and rapid similarity searches of protein and nucleotide sequence databases. We maintain updated, local copies of several BLAST databases that can be read by WU-BLAST.
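The SignalP and TargetP entries above list per-run limits on sequence length, sequence count, and total residues. The Python sketch below shows one way to check a protein FASTA file against those limits before submitting a job. It is an illustration only, not part of SignalP, TargetP, or the bioware installation; the function names parse_fasta and check_limits are hypothetical. The defaults are the SignalP limits quoted above, and passing max_len=4000 gives the TargetP limits.

```python
from typing import Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA-formatted lines."""
    records: List[List[str]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; the header is everything after '>'.
            records.append([line[1:], ""])
        elif records:
            # Sequence lines are appended to the most recent record.
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def check_limits(
    records: List[Tuple[str, str]],
    max_len: int = 6000,       # SignalP limit quoted above; TargetP uses 4000
    max_seqs: int = 2000,      # maximum sequences per run
    max_total: int = 200_000,  # maximum total residues per run
) -> List[str]:
    """Return a list of human-readable problems; empty if within limits."""
    problems: List[str] = []
    if len(records) > max_seqs:
        problems.append(
            f"{len(records)} sequences exceeds the limit of {max_seqs}"
        )
    total = sum(len(seq) for _, seq in records)
    if total > max_total:
        problems.append(
            f"{total} total residues exceeds the limit of {max_total}"
        )
    for name, seq in records:
        if len(seq) > max_len:
            problems.append(
                f"{name}: {len(seq)} residues exceeds the limit of {max_len}"
            )
    return problems
```

A typical use would be to open the FASTA file, pass the file object to parse_fasta, and split the input into smaller runs if check_limits returns any problems.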
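The GCcalc.pl entry above describes GC content computed over all codon positions and over the 1st, 2nd, and 3rd positions separately. The Python sketch below illustrates that calculation. It is not the GCcalc.pl script itself; the function name gc_by_codon_position is hypothetical, and the sketch assumes the input is a single in-frame coding sequence that starts at the first base of a codon.

```python
from typing import Dict


def gc_by_codon_position(cds: str) -> Dict[str, float]:
    """GC fraction over all positions and at codon positions 1, 2, and 3.

    Assumes `cds` is in frame, starting at the first base of a codon.
    Trailing bases that do not complete a codon are ignored. Ambiguity
    codes such as N count toward the length but not toward G or C.
    """
    seq = cds.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # keep whole codons only

    def gc_fraction(bases: str) -> float:
        if not bases:
            return 0.0
        return (bases.count("G") + bases.count("C")) / len(bases)

    return {
        "all": gc_fraction(seq),
        "pos1": gc_fraction(seq[0::3]),  # every 1st codon position
        "pos2": gc_fraction(seq[1::3]),  # every 2nd codon position
        "pos3": gc_fraction(seq[2::3]),  # every 3rd codon position
    }
```

For the two-codon sequence ATGGCC, the first positions are A and G, the second positions are T and C, and the third positions are G and C, so the function reports 0.5, 0.5, and 1.0 for positions 1 to 3.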



 
     
Supported by NIH, NSF, NASA, The Josephine Bay Paul and C. Michael Paul Foundation, W.M. Keck Foundation, G. Unger Vetlesen Foundation, and Ellison Medical Foundation.
Unless otherwise stated, all material © 2004 Bay Paul Center, MBL.