The core of Cistematic is a Python package with a rich set of APIs that simplify the collection and analysis of candidate cis-regulatory elements from a number of different motif-finding programs, such as Meme and the built-in cisGreedy. Cistematic assesses the significance of each motif by comparing it to its prevalence genome-wide.
One of the more useful APIs in Cistematic wraps genes, annotations, and genomic sequences that are stored in a sqlite database. It is on top of this foundation that we have built additional platforms, such as ChIPSeqMini and ERANGE.
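To give a feel for what wrapping annotations in a sqlite database can look like, here is a minimal sketch using only python's standard sqlite3 module. It is not Cistematic's API or schema: the table name gene_annotation, its columns, the toy rows, and both function names are assumptions made up for illustration.

```python
import sqlite3


def build_toy_annotation_db():
    """Create an in-memory database with a made-up gene_annotation table."""
    db = sqlite3.connect(":memory:")
    # Hypothetical schema, NOT the one Cistematic uses.
    db.execute(
        "CREATE TABLE gene_annotation "
        "(gene_id TEXT, chrom TEXT, start INTEGER, stop INTEGER)"
    )
    toy_rows = [
        ("geneA", "chr1", 100, 500),
        ("geneB", "chr1", 800, 1200),
        ("geneC", "chr2", 100, 500),
    ]
    db.executemany("INSERT INTO gene_annotation VALUES (?, ?, ?, ?)", toy_rows)
    db.commit()
    return db


def genes_in_region(db, chrom, start, stop):
    """Return ids of genes overlapping [start, stop] on chrom, ordered by start."""
    cursor = db.execute(
        "SELECT gene_id FROM gene_annotation "
        "WHERE chrom = ? AND start <= ? AND stop >= ? ORDER BY start",
        (chrom, stop, start),
    )
    return [row[0] for row in cursor]


toy_db = build_toy_annotation_db()
overlapping = genes_in_region(toy_db, "chr1", 400, 900)
```

Keeping the SQL behind small functions like these means calling code asks for genes in a region and never touches the database directly.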
This page has the updated version of the Cistematic code underlying the following papers:
Cistematic currently runs on Linux and Macintosh; it also runs under Cygwin in Microsoft Windows. In addition to python, the current version of Cistematic is heavily dependent on sqlite and its python interface, pysqlite (which is now part of python 2.5+). You will therefore need:
Note that earlier versions of python did not include pysqlite, which we rely on heavily. You might be able to get Cistematic to run with older versions of python if you separately install sqlite/pysqlite, but we do not support it any longer.
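The python and sqlite requirements above can be checked before installing anything else. The following is a minimal sketch, not part of Cistematic; the function name check_prerequisites is hypothetical. It relies only on the fact that the sqlite interface ships with python 2.5+ as the standard sqlite3 module.

```python
import sys


def check_prerequisites(version_info=sys.version_info):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # The sqlite interface has been bundled with python since version 2.5.
    if tuple(version_info[:2]) < (2, 5):
        problems.append("python 2.5 or later is needed for the bundled sqlite3 module")
    try:
        import sqlite3  # noqa: F401 (imported only to test availability)
    except ImportError:
        problems.append("the sqlite3 module could not be imported")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Run it with the python interpreter you intend to use for Cistematic; it prints nothing when both checks pass.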
In addition to the requirements listed above, three additional packages will allow you to get the most out of Cistematic.
psyco, which only runs on Intel 32-bit CPUs and on ALL Macintosh Intel platforms, will give you an approximately 9-fold speed-up when running Cistematic code and is highly recommended if it is available for your platform.
Matplotlib was used to generate the figures in the papers and is hence also recommended.
Weblogo is used to visualize the PSFMs in ERANGE. It is not necessary if you are only interested in RNA-seq, for example.
The actual version numbers are:
You will need to download the following packages:
as well as a set of motif-finding binaries (not required for RNA-seq, but
definitely required for ChIP-seq):
For motif-finding with ChIP-seq data, you can now use the bundled cisGreedy program, a python implementation of Consensus (Hertz, 1999), which will work well for short motifs, but isn't optimized for speed. You may also want to download Meme and modify the parameter memePath in $CISTEMATIC_ROOT/cistematic/programs/meme.py to the appropriate path for the meme top-level script.
To install, create a directory (for example /proj/genome), cd into it, and unpack each file using tar xzvf.
You will need to add the directory in which you installed the Cistematic python code to your PYTHONPATH environment variable.
If you use a root directory different from /proj/genome, you will need to tell Cistematic where to find it by setting the environment variable CISTEMATIC_ROOT.
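The two environment settings described above can be checked from python with a sketch like the one below. It is an illustration only, not how Cistematic itself reads these settings. The function names resolve_root and on_python_path are hypothetical, and the fallback to /proj/genome simply mirrors the default root named in these instructions.

```python
import os
import sys

DEFAULT_ROOT = "/proj/genome"  # default root directory named in the instructions


def resolve_root(environ=None):
    """Return CISTEMATIC_ROOT if it is set, otherwise the default root."""
    if environ is None:
        environ = os.environ
    return environ.get("CISTEMATIC_ROOT", DEFAULT_ROOT)


def on_python_path(code_dir, search_path=None):
    """Report whether code_dir is among the directories python searches."""
    if search_path is None:
        search_path = sys.path
    target = os.path.normpath(code_dir)
    return any(os.path.normpath(entry) == target for entry in search_path if entry)


if __name__ == "__main__":
    root = resolve_root()
    print("Cistematic root: " + root)
    print("on python path: " + str(on_python_path(root)))
```

If the second line reports False for the directory holding the Cistematic code, PYTHONPATH has not been picked up by the interpreter you are running.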
Some of the code in cistematic.core.motif, used for PSFM and Markov1 motif
scanning, is slow when run across mammalian genomes in pure python mode.
Therefore some of the key code is optionally compilable into a python
C-extension. The default distribution contains the source code as well as
pre-compiled binaries for Linux (64 bit), which is the default, and for
MacOS 10.5. Mac users should copy the precompiled Mac binary into place for
the extension to be used:
cp $CISTEMATIC_ROOT/cistematic/core/_motif.so.mac $CISTEMATIC_ROOT/cistematic/core/_motif.so
python setup.py build
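To illustrate the kind of inner loop that makes PSFM scanning costly in pure python, here is a minimal sketch that scores every window of a sequence against a small frequency matrix. It is not the code in cistematic.core.motif or its C-extension. The function name scan_psfm, the list-of-dicts matrix layout, the product-of-frequencies score, and the threshold are all assumptions chosen for illustration.

```python
def scan_psfm(psfm, sequence, threshold):
    """Return (position, score) for each window whose score meets threshold.

    psfm is a list of dicts, one per motif position, mapping each base to
    its frequency. A window's score is the product of those frequencies.
    """
    width = len(psfm)
    sequence = sequence.upper()
    hits = []
    # One pass per window, one multiplication per motif position: the work
    # grows with sequence length times motif width.
    for start in range(len(sequence) - width + 1):
        score = 1.0
        for offset in range(width):
            score *= psfm[offset].get(sequence[start + offset], 0.0)
            if score == 0.0:
                break  # no later position can raise a zero product
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy three-position matrix matching ACT or GCT (illustrative values only).
toy_psfm = [
    {"A": 0.5, "C": 0.0, "G": 0.5, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
]
hits = scan_psfm(toy_psfm, "TTACTGGCTA", 0.25)
```

The nested loop is cheap on a ten-base toy sequence, but repeating it over every position of a mammalian genome is where an interpreted implementation falls behind a compiled one.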
To make full use of Cistematic, you will need to use genomes that are
either installed from scratch, or that are installed from the following
packages:
which are unpacked in the same manner as the other Cistematic files.
Earlier versions of Cistematic are now obsolete, but files for the earlier release of Cistematic are available here.