DNA-arrays (or DNA-chips or microarrays) are flat slabs of glass, silicon or plastic onto which thousands of short single-stranded (ss) DNA sequences (the probes, corresponding to small regions of a genome) have been attached. After performing an mRNA extraction in induced and non-induced cells, the mRNA is again reverse transcribed, but here the reaction is tweaked, so that the emerging cDNA contains nucleotides marked with different fluorophores for control and experiment. The labeled cDNAs (the targets) are applied to the array, where they will hybridize by base-pairing with those probes that resemble them the most. The array can then be excited by a laser and scanned for fluorescence at two different wavelengths (control and induced). The ratio or log-ratio between the two fluorescence intensities at each probe corresponds to the induction level of the matching gene.
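As a rough illustration of the log-ratio mentioned above, the following sketch computes a base-2 log-ratio per spot. The function name, the optional pseudocount and the intensity values are assumptions made for the example; real microarray pipelines also apply background correction and normalization, which are not shown here.

```python
import math

def log_ratio(induced, control, pseudocount=1.0):
    """Return log2(induced / control) for one spot.

    The pseudocount (an assumption of this sketch) is added to both
    intensities to avoid dividing by zero or taking the log of zero.
    """
    return math.log2((induced + pseudocount) / (control + pseudocount))

# Hypothetical (induced, control) intensities for three probes; not real data.
spots = {
    "geneA": (1600.0, 400.0),
    "geneB": (500.0, 500.0),
    "geneC": (100.0, 800.0),
}

for name, (induced, control) in spots.items():
    # Positive values suggest induction, negative values repression.
    print(name, log_ratio(induced, control, pseudocount=0.0))
```

With a base-2 logarithm, a value of 2 means a four-fold increase in signal and a value of -3 an eight-fold decrease, which makes up- and down-regulation symmetric around zero.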
The DNase foot-printing method starts by focusing on a given region of interest (e.g. a promoter region) and amplifying it by PCR to obtain lots of sample. The TF is then added, followed by the DNase. The mix is left to stir for a short time and then gel electrophoresis is run to compare the pattern of fragments in a control (no TF) and in the sample. If the TF has bound the sample, it will have protected a stretch of DNA (encompassing some fragments of the control) and thus those fragments will not appear in the sample gel. The fragments can then be cut out from the gel, purified and sequenced to obtain the sequence of the protected region. This is often used to identify the binding motif of a TF for the first time. The foot-printing will typically resolve the protected region down to 50-100 bp, and the sequence can then be examined for possible TF-binding sites either by eye or using a computer search.
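The computer search mentioned above can take many forms. A minimal sketch, assuming a candidate consensus sequence is already at hand, is to slide it along the protected region on both strands and report windows within a few mismatches. The function names, the example region and the consensus below are all hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def find_sites(region, consensus, max_mismatches=1):
    """List (position, strand, window, mismatches) for near-matches to consensus.

    Positions are 0-based and refer to the forward strand of the region.
    """
    hits = []
    k = len(consensus)
    targets = (("+", consensus), ("-", reverse_complement(consensus)))
    for strand, target in targets:
        for i in range(len(region) - k + 1):
            window = region[i:i + k]
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append((i, strand, window, mismatches))
    return hits

# Hypothetical protected region and consensus, for illustration only.
protected_region = "GGCTGACGTAAT"
print(find_sites(protected_region, "TGACGT", max_mismatches=0))
```

Scanning the reverse complement matters because a TF can bind its site in either orientation. Raising max_mismatches makes the search more permissive at the cost of more spurious hits.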
Electrophoretic mobility shift assays (EMSA, or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The non-specific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left over from the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out on a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run on at least one of the fragments to ascertain specificity. In the most basic of these, unlabeled specific competitor (the fragment of interest or a known positive control) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, showing that the binding is indeed specific.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery include MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
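To make the overrepresentation idea concrete, here is a deliberately naive sketch that looks for the exact k-mer shared by the most input sequences. This is not how MEME, Gibbs Motif Sampler or CONSENSUS work (they model degenerate motifs with EM, sampling or greedy search); it only illustrates the underlying assumption. The function name and the promoter sequences are invented for the example.

```python
from collections import Counter

def most_shared_kmer(sequences, k):
    """Return (kmer, count) for the k-mer found in the most sequences."""
    counts = Counter()
    for seq in sequences:
        # Count each k-mer once per sequence, so that repeats within a
        # single promoter do not dominate the result.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    # Ties are broken alphabetically to keep the result deterministic.
    best = min(counts, key=lambda kmer: (-counts[kmer], kmer))
    return best, counts[best]

# Hypothetical promoter fragments, each carrying the made-up site "TGACGT".
promoters = ["AATGACGTCC", "GGTGACGTAA", "CTGACGTTTG"]
print(most_shared_kmer(promoters, 6))
```

Exact matching breaks down as soon as sites differ by a base or two, which is the norm for real TF-binding sites. That is why the tools named above work with probabilistic motif models rather than fixed words.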
Regulated genes for each binding site are displayed below. Gene regulation diagrams show binding sites, positively and negatively regulated genes, and genes with an unspecified type of regulation. For each individual site, the experimental techniques used to determine the site are also given.