For the selected transcription factor and species, the list of curated binding sites in the database is displayed below. Gene regulation diagrams show binding sites, positively regulated genes, negatively regulated genes, genes that are both positively and negatively regulated, and genes with an unspecified type of regulation.
ChIP-chip (and to a lesser degree ChIP-Seq) results are often validated with ChIP-PCR, in which a PCR with specific primers is performed on the pulled-down DNA. As in the case of RNA-Seq, there are many variations of these main techniques.
ChIP-Seq is identical to ChIP-chip up to the last step. In ChIP-Seq, the immunoprecipitated DNA fragments are prepared for sequencing and fed into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq generates multiple short reads within any given 500 bp region, thereby pinning down the location of a TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling arrays. The downside of ChIP-Seq is that sensitivity comes at a cost, because it increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-Seq experiments often use the "input" as a control. This is DNA sequenced through the same pipeline as the ChIP-Seq experiment, but omitting the immunoprecipitation step. It should therefore have the same accessibility and sequencing biases as the experimental data.
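One common way to use the input control is to compare read counts in the ChIP sample against the input, region by region. The sketch below is only an illustration under stated assumptions: reads have already been counted in fixed genomic bins, the input is rescaled to the sequencing depth of the ChIP library, and a pseudocount avoids division by zero. The function name `fold_enrichment` and its parameters are hypothetical and not taken from any particular peak caller.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ratio of ChIP reads to depth-scaled input reads.

    Assumes both lists cover the same bins and the input is not empty.
    """
    # Scale the input so both libraries have the same total depth.
    scale = sum(chip_counts) / sum(input_counts)
    return [
        (chip + pseudocount) / (inp * scale + pseudocount)
        for chip, inp in zip(chip_counts, input_counts)
    ]


# Hypothetical read counts in four consecutive bins.
chip = [10, 12, 90, 8]
control = [10, 10, 10, 10]
enrichment = fold_enrichment(chip, control)
# The bin with the highest ratio is the best candidate for a binding region.
best_bin = max(range(len(enrichment)), key=enrichment.__getitem__)
```

Real pipelines model the background statistically rather than using a plain ratio, so this should be read as a conceptual aid only.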
Electrophoretic mobility shift assays (EMSAs, or gel retardation assays) are a standard way of assessing TF binding. A DNA fragment of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things), and then a gel is run with the incubated sample and a control (a sample that has not been in contact with the TF). If the TF has bound the fragment, the complex will migrate more slowly through the gel than unbound DNA, and this retarded band can be used as evidence of binding. The non-specific DNA ensures that the binding is specific to the fragment of interest, and that any non-specific DNA-binding proteins left over from the TF purification bind there instead of on the fragment of interest. EMSAs are typically carried out on several fragments, shown as multiple paired (control + experiment) lanes in a wide picture. Certain additional controls are run on at least one of the fragments to ascertain specificity. In the most basic of these, unlabelled specific competitor (the fragment of interest or a known positive control) is added to the reaction. This should sequester the TF and hence make the retarded band disappear, showing that the binding is indeed specific.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a group of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences of the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
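To make the idea of searching for an overrepresented pattern concrete, here is a minimal textbook-style sketch of a greedy motif search. It is not a reimplementation of CONSENSUS or any other tool named above. It assumes a fixed motif width `k`, exactly one site per sequence, sequences over the alphabet ACGT, and a pseudocount of 1; all function names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def profile(motifs, pseudocount=1.0):
    """Per-column base probabilities, smoothed with a pseudocount."""
    total = len(motifs) + 4 * pseudocount
    return [
        {b: (sum(m[i] == b for m in motifs) + pseudocount) / total for b in BASES}
        for i in range(len(motifs[0]))
    ]


def most_probable_kmer(seq, k, prof):
    """Return the k-mer of seq with the highest probability under prof."""
    best, best_p = seq[:k], -1.0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        p = 1.0
        for j, base in enumerate(kmer):
            p *= prof[j][base]
        if p > best_p:
            best, best_p = kmer, p
    return best


def mismatch_score(motifs):
    """Number of bases that differ from the majority base of their column."""
    n = len(motifs)
    return sum(
        n - Counter(m[i] for m in motifs).most_common(1)[0][1]
        for i in range(len(motifs[0]))
    )


def greedy_motif_search(seqs, k):
    """Try every k-mer of the first sequence as a seed and extend it greedily."""
    best, best_score = None, float("inf")
    for i in range(len(seqs[0]) - k + 1):
        motifs = [seqs[0][i:i + k]]
        for seq in seqs[1:]:
            # Pick the site that best fits the motif built so far.
            motifs.append(most_probable_kmer(seq, k, profile(motifs)))
        score = mismatch_score(motifs)
        if score < best_score:
            best, best_score = motifs, score
    return best


# Toy promoters that all contain the same planted 6-mer.
promoters = ["GGCGTTGACAGC", "ATTGACACCGGA", "CCATCGTTGACA"]
sites = greedy_motif_search(promoters, 6)
```

The greedy strategy depends on the order of the sequences and can miss weak motifs, which is one reason sampling and EM-based methods are also used.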
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in search of putative TF-binding sites. This is useful, for instance, when trying to identify TF-binding sites in ChIP-chip data. Searching for TF-binding sites can be done in numerous ways. The most basic method is consensus search, in which sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves regular expressions, which make it possible to search for more loosely defined motifs [e.g. C(C/G)AT]. Common tools for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, and some word processors support it as well. Finally, the mainstream way of conducting TF-binding site search is through position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
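The three search strategies described above can be sketched in a few lines each. This is a minimal illustration, not the scoring scheme of any of the tools listed: it assumes a single strand, a uniform background of 0.25 per base, a pseudocount of 1, and log2-odds scores for the matrix. The function names and the example sites are hypothetical.

```python
import math
import re


def consensus_search(seq, consensus, max_mismatches=1):
    """Positions whose window differs from the consensus in few bases."""
    k = len(consensus)
    return [
        i
        for i in range(len(seq) - k + 1)
        if sum(a != b for a, b in zip(seq[i:i + k], consensus)) <= max_mismatches
    ]


def regex_search(seq, pattern):
    """Start positions of non-overlapping regular expression matches."""
    return [m.start() for m in re.finditer(pattern, seq)]


def build_pssm(sites, background=0.25, pseudocount=1.0):
    """Log2-odds matrix built from aligned sites of equal length."""
    total = len(sites) + 4 * pseudocount
    return [
        {
            b: math.log2((sum(s[i] == b for s in sites) + pseudocount) / total / background)
            for b in "ACGT"
        }
        for i in range(len(sites[0]))
    ]


def pssm_scan(seq, pssm, threshold):
    """(position, score) for every window scoring at or above threshold."""
    k = len(pssm)
    hits = []
    for i in range(len(seq) - k + 1):
        score = sum(pssm[j][seq[i + j]] for j in range(k))
        if score >= threshold:
            hits.append((i, score))
    return hits


sequence = "TTCCATGGCGATAACTAT"
# The motif C(C/G)AT from the text, written as a regular expression.
regex_hits = regex_search(sequence, r"C[CG]AT")
consensus_hits = consensus_search(sequence, "CCAT", max_mismatches=1)
# Hypothetical aligned sites used to build the matrix.
pssm = build_pssm(["CCAT", "CGAT", "CCAT", "CGAT"])
pssm_hits = pssm_scan(sequence, pssm, threshold=4.0)
```

Unlike the mismatch count, the matrix weighs each position by how conserved it is, so a mismatch at a variable position costs less than one at an invariant position. The threshold is a free parameter that has to be chosen for each motif.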
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
Electro-mobility shift-assays (or gel retardation assays) are a standard way of assessing TF-binding. A fragment of DNA of interest is amplified and labeled with a fluorophore. The fragment is left to incubate in a solution containing abundant TF and non-specific DNA (e.g. randomly cleaved DNA from salmon sperm, of all things) and then a gel is run with the incubated sample and a control (sample that has not been in contact with the TF). If the TF has bound the sample, the complex will migrate more slowly than unbound DNA through the gel, and this retarded band can be used as evidence of binding. The unspecific DNA ensures that the binding is specific to the fragment of interest and that any non-specific DNA-binding proteins left-over in the TF purification will bind there, instead of on the fragment of interest. EMSAs are typically carried out in a bunch of fragments, shown as multiple double (control+experiment) lanes in a wide picture. Certain additional controls are run in at least one of the fragments to ascertain specificity. In the most basic of these, specific competitor (the fragment of interest or a known positive control, unlabelled) is added to the reaction. This should sequester the TF and hence make the retardation band disappear, proving that the binding is indeed specific
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches rely on expectation maximization (EM), Gibbs sampling or greedy search. Popular programs for motif discovery include MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular programs include FootPrinter and PhyloGibbs.
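To make the Gibbs sampling strategy concrete, here is a minimal Python sketch of a site sampler that assumes exactly one site of known width per sequence. It is a teaching example, not the algorithm implemented by Gibbs Motif Sampler or any other program named above. The function name, the pseudocount, the uniform background of 0.25, the iteration count and the fixed seed are all assumptions, and sequences are assumed to contain only A, C, G and T.

```python
import random


def gibbs_motif_search(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Return one sampled site start position per sequence."""
    rng = random.Random(seed)
    # Start from one randomly placed candidate site in each sequence.
    positions = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build a frequency profile from the sites of all other sequences.
            others = [seqs[k][positions[k]:positions[k] + width]
                      for k in range(len(seqs)) if k != i]
            total = len(others) + 4 * pseudocount
            profile = []
            for j in range(width):
                column = [site[j] for site in others]
                profile.append({b: (column.count(b) + pseudocount) / total
                                for b in "ACGT"})
            # Weight every window of the held-out sequence by its
            # likelihood ratio (profile versus uniform background).
            weights = []
            for start in range(len(seq) - width + 1):
                weight = 1.0
                for j in range(width):
                    weight *= profile[j][seq[start + j]] / 0.25
                weights.append(weight)
            # Sample the new site position in proportion to those weights.
            positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions


if __name__ == "__main__":
    # Made-up promoter fragments, each containing the word TGACTCA.
    promoters = [
        "ACGTTGACTCAGGCT",
        "GGTGACTCATTACGA",
        "CATTACGTGACTCAG",
        "TTGACTCACCGGATA",
    ]
    starts = gibbs_motif_search(promoters, width=7)
    for seq, start in zip(promoters, starts):
        print(start, seq[start:start + 7])
```

Because the procedure is stochastic, a single run may settle on a shifted or spurious pattern. Practical tools typically run several restarts and rank the resulting motifs by a score such as information content.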
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this; the standard approaches are expectation maximization (EM), Gibbs sampling and greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences of the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
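To make the idea of an overrepresented motif concrete, the following Python sketch implements a simplified greedy search. It illustrates the general greedy strategy only and is not the procedure used by CONSENSUS, MEME or the Gibbs Motif Sampler. The function names, the pseudocount of 1, the fixed motif width, the assumption of one site per sequence and the toy promoter sequences are all hypothetical.

```python
def profile_with_pseudocounts(motifs, pseudocount=1.0):
    """Estimate per-position base probabilities from the motifs chosen so far."""
    total = len(motifs) + 4 * pseudocount
    return [
        {base: (column.count(base) + pseudocount) / total for base in "ACGT"}
        for column in zip(*motifs)
    ]


def most_probable_window(sequence, width, profile):
    """Return the start of the window with the highest probability under the profile."""
    best_start, best_prob = 0, -1.0
    for start in range(len(sequence) - width + 1):
        prob = 1.0
        for pos, base in enumerate(sequence[start:start + width]):
            prob *= profile[pos][base]
        if prob > best_prob:
            best_start, best_prob = start, prob
    return best_start


def consensus_and_score(motifs):
    """Return the majority-base consensus and the total mismatches against it."""
    consensus = "".join(max("ACGT", key=column.count) for column in zip(*motifs))
    score = sum(a != b for motif in motifs for a, b in zip(motif, consensus))
    return consensus, score


def greedy_motif_search(sequences, width):
    """Try each window of the first sequence as a seed and extend it greedily."""
    best = None
    first = sequences[0]
    for seed_start in range(len(first) - width + 1):
        starts = [seed_start]
        motifs = [first[seed_start:seed_start + width]]
        for seq in sequences[1:]:
            # Pick the window in the next sequence that best fits the current profile.
            profile = profile_with_pseudocounts(motifs)
            start = most_probable_window(seq, width, profile)
            starts.append(start)
            motifs.append(seq[start:start + width])
        consensus, score = consensus_and_score(motifs)
        # Lower score means the chosen sites agree more closely with each other.
        if best is None or score < best[2]:
            best = (starts, consensus, score)
    return best


# Hypothetical promoter fragments, each carrying one copy of the same 6 bp site.
promoters = [
    "GCGCTTGACAGGCC",
    "AATTGACACCTAGT",
    "CATGGTTTGACATA",
    "TTGACAGTCATCGA",
]
starts, consensus, score = greedy_motif_search(promoters, width=6)
print(starts, consensus, score)
```

A greedy search of this kind is deterministic but depends on the order of the input sequences. EM and Gibbs sampling instead refine an initial guess iteratively, which is one reason the different families of tools can return different motifs on the same data.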
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches rely on expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME (EM), Gibbs Motif Sampler (Gibbs sampling) and CONSENSUS (greedy search). More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
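The idea of finding an overrepresented pattern can be illustrated with a simplified greedy search in Python. This is a sketch only, not the algorithm implemented by CONSENSUS or any other tool named above. The function names, the example sequences, the fixed motif width and the uniform background of 0.25 are assumptions. The sketch takes each window of the first sequence as a seed, adds from every other sequence the window that keeps the alignment most conserved, and scores alignments by their information content.

```python
import math
from collections import Counter


def information_content(sites, background=0.25):
    """Total information content (bits) of aligned sites against a uniform background."""
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        for count in Counter(column).values():
            p = count / n
            total += p * math.log2(p / background)
    return total


def kmers(sequence, width):
    """All windows of the given width, in order of position."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def greedy_motif_search(sequences, width):
    """Return (sites, score): one site per sequence, chosen greedily.

    Each window of the first sequence is tried as a seed; from every other
    sequence the window that maximizes the information content of the growing
    alignment is added. The best-scoring alignment over all seeds is kept.
    """
    best_sites, best_score = None, float("-inf")
    for seed in kmers(sequences[0], width):
        sites = [seed]
        for sequence in sequences[1:]:
            best_kmer = max(
                kmers(sequence, width),
                key=lambda kmer: information_content(sites + [kmer]),
            )
            sites.append(best_kmer)
        score = information_content(sites)
        if score > best_score:
            best_sites, best_score = sites, score
    return best_sites, best_score


# Hypothetical promoter fragments, each containing one copy of TGACGTCA.
example_promoters = [
    "ACGTATGACGTCAGGCTT",
    "GCTGACGTCAATTACGGA",
    "TTAGGTGACGTCACCGAT",
]

if __name__ == "__main__":
    sites, score = greedy_motif_search(example_promoters, width=8)
    print(sites, score)
```

Real data rarely contain identical copies of a site, and the motif width is usually unknown, which is why the tools above rely on probabilistic models and on statistical significance estimates rather than on a raw conservation score.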
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
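To make the idea of overrepresentation concrete, here is a minimal greedy search sketch in Python. It is loosely inspired by the greedy strategy mentioned above but is far simpler than CONSENSUS or any other published tool. It assumes exactly one site per sequence, a known motif width k and a uniform background. The function names and the toy sequences (with the planted site TGACTCA) are hypothetical.

```python
import math
from collections import Counter


def information_content(sites):
    """Information content (bits) of aligned sites against a uniform background."""
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        for count in Counter(column).values():
            freq = count / n
            total += freq * math.log2(freq / 0.25)
    return total


def greedy_motif_search(sequences, k):
    """Try each k-mer of the first sequence as a seed, then greedily add the
    k-mer from each remaining sequence that maximizes information content."""
    best_score, best_positions = -math.inf, None
    first = sequences[0]
    for start in range(len(first) - k + 1):
        sites = [first[start:start + k]]
        positions = [start]
        for seq in sequences[1:]:
            pos = max(
                range(len(seq) - k + 1),
                key=lambda p: information_content(sites + [seq[p:p + k]]),
            )
            sites.append(seq[pos:pos + k])
            positions.append(pos)
        score = information_content(sites)
        if score > best_score:
            best_score, best_positions = score, positions
    return best_positions, best_score


def consensus_of(sequences, positions, k):
    """Most frequent base in each column of the chosen sites."""
    sites = [s[p:p + k] for s, p in zip(sequences, positions)]
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


# Hypothetical promoter fragments, each containing one planted copy of TGACTCA.
promoters = [
    "ACGTTGACTCAGGCTA",
    "TGACTCACCGATAGG",
    "GGATCCATGACTCAT",
    "CATGTGACTCAAGTC",
]

positions, score = greedy_motif_search(promoters, k=7)
print(positions, score, consensus_of(promoters, positions, 7))
```

Real motif discovery tools also have to handle unknown motif width, sequences without any site, sites on either strand and non-uniform background composition, none of which this sketch attempts.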
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence "pattern" bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches rely on expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery include MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
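To make the idea of searching for an overrepresented pattern concrete, here is a minimal Python sketch of a greedy motif search. It is a simplified textbook-style heuristic and not the algorithm implemented by CONSENSUS, MEME or any other program named above. The function names, the pseudocount, and the mismatch-based score are assumptions made for illustration; the motif width k must be supplied, and one site per sequence is assumed.

```python
from collections import Counter


def profile(sites, pseudocount=1.0):
    """Per-position base frequencies of the aligned sites (with pseudocounts)."""
    total = len(sites) + 4 * pseudocount
    prof = []
    for column in zip(*sites):
        counts = Counter(column)
        prof.append({b: (counts[b] + pseudocount) / total for b in "ACGT"})
    return prof


def most_probable_kmer(seq, k, prof):
    """Return the k-mer of seq with the highest probability under the profile."""
    best, best_p = None, -1.0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        p = 1.0
        for j, base in enumerate(kmer):
            p *= prof[j][base]
        if p > best_p:
            best, best_p = kmer, p
    return best


def mismatch_score(sites):
    """Number of bases that differ from the most common base in their column."""
    return sum(len(col) - Counter(col).most_common(1)[0][1]
               for col in zip(*sites))


def greedy_motif_search(sequences, k):
    """Try every k-mer of the first sequence as a seed and extend it greedily."""
    best_sites = [s[:k] for s in sequences]
    best_score = mismatch_score(best_sites)
    first = sequences[0]
    for i in range(len(first) - k + 1):
        sites = [first[i:i + k]]
        for seq in sequences[1:]:
            # Pick the best-matching k-mer given the sites collected so far.
            sites.append(most_probable_kmer(seq, k, profile(sites)))
        score = mismatch_score(sites)
        if score < best_score:
            best_sites, best_score = sites, score
    return best_sites


# Hypothetical promoter fragments sharing one planted 8 bp site.
promoters = ["CCATGACGTCAAGT", "GTCTGACGTCACTA", "ATGTGACGTCAGAC"]
sites = greedy_motif_search(promoters, 8)
```

The sites returned can then be aligned to build a scoring matrix for scanning other sequences. A greedy search of this kind depends on the order of the input sequences and can miss weak motifs, which is one reason why EM and Gibbs sampling approaches are widely used.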
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches are expectation maximization (EM), Gibbs sampling and greedy search. Popular algorithms for motif discovery are MEME (EM), Gibbs Motif Sampler (Gibbs sampling) and CONSENSUS (greedy search). More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
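As an illustration of the greedy strategy, the following Python sketch looks for an overrepresented pattern of known width in a set of sequences. It is a simplified toy, not the algorithm implemented by CONSENSUS or any other tool named above. It assumes exactly one site per sequence, a known motif width `k`, a uniform background and a pseudocount of 0.5; the function names and the example sequences are hypothetical.

```python
import math

BASES = "ACGT"

def information_content(sites, pseudocount=0.5, background=0.25):
    """Information content (in bits) of a set of aligned sites, summed over
    columns, relative to a uniform background (an assumption of this sketch)."""
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        for base in BASES:
            p = (column.count(base) + pseudocount) / (n + 4 * pseudocount)
            total += p * math.log2(p / background)
    return total

def greedy_motif_search(seqs, k):
    """Greedy search: seed with each k-mer of the first sequence, then add,
    for every further sequence, the k-mer that maximizes the information
    content of the growing alignment. Return the best alignment found."""
    best_sites, best_score = None, -math.inf
    first, rest = seqs[0], seqs[1:]
    for i in range(len(first) - k + 1):
        sites = [first[i:i + k]]
        for seq in rest:
            candidates = [seq[j:j + k] for j in range(len(seq) - k + 1)]
            sites.append(
                max(candidates, key=lambda c: information_content(sites + [c]))
            )
        score = information_content(sites)
        if score > best_score:
            best_sites, best_score = sites, score
    return best_sites, best_score

# Made-up promoter fragments, each carrying one copy of TGACTCA.
promoters = [
    "ACGTTGACTCAGGA",
    "TGACTCATTTACGC",
    "CCGATATGACTCAG",
    "GGTGACTCACATAT",
]
sites, score = greedy_motif_search(promoters, 7)
print(sites)  # ['TGACTCA', 'TGACTCA', 'TGACTCA', 'TGACTCA']
```

Because each choice is made one sequence at a time and never revisited, a greedy search like this can miss weak motifs. EM and Gibbs sampling instead refine the alignment iteratively.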
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
ChIP-Seq is equivalent to ChIP-chip down to the last step. In ChIP-Seq, immunoprecipiated DNA fragments are prepared for sequencing and funneled into a massively parallel sequencer that produces short reads. Even though the sonication step is the same as in ChIP-chip, ChIP-Seq will generate multiple short-reads within any given 500 bp region, thereby pinning down the location of TFBS to within 50-100 bp. A similar result can be obtained with ChIP-chip using high-density tiling-arrays. The downside of ChIP-Seq is that sensitivity is proportional to cost, as sensitivity increases with the number of (expensive) parallel sequencing runs. To control for biases, ChIP-seq experiments often use the "input" as a control. This is DNA sequence resulting from the same pipeline as the ChIP-seq experiment, but omitting the immunoprecipitation step. It therefore should have the same accessibility and sequencing biases as the experiment data.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences of the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
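The overrepresentation assumption can be made concrete with a deliberately simple sketch. It does not implement EM, Gibbs sampling or the greedy strategy used by the algorithms named above. Instead it enumerates all words of a fixed length k and ranks them by how many input sequences contain them, which is a much cruder stand-in that only captures exact, ungapped motifs. The function name, the toy promoter sequences and the choice k = 6 are assumptions made for illustration.

```python
from collections import Counter

def most_shared_kmers(sequences, k, top=3):
    """Rank k-mers by the number of input sequences containing them.

    Each k-mer is counted at most once per sequence, so a word repeated
    many times inside a single sequence does not dominate the ranking.
    Ties are broken alphabetically to keep the output deterministic.
    """
    support = Counter()
    for seq in sequences:
        kmers_in_seq = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(kmers_in_seq)
    return sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))[:top]

# Toy promoter fragments (hypothetical) that all share the word TTGACA.
PROMOTERS = [
    "AAGTTGACATCC",
    "CCTTGACAGGTA",
    "GGATTGACATTT",
    "TACGCGTTGACA",
]

for kmer, n_sequences in most_shared_kmers(PROMOTERS, k=6):
    print(kmer, n_sequences)
```

Real binding sites are degenerate, so exact word counting misses most of them. Probabilistic approaches such as EM and Gibbs sampling address this by estimating a position-specific model of the motif rather than a single word.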