Mireille Régnier ; Alain Denise - Rare Events and Conditional Events on Random Strings

dmtcs:310 - Discrete Mathematics & Theoretical Computer Science, January 1, 2004, Vol. 6 no. 2 - https://doi.org/10.46298/dmtcs.310
Rare Events and Conditional Events on Random Strings

Authors: Mireille Régnier 1; Alain Denise 2

  • 1 Algorithms
  • 2 Laboratoire de Recherche en Informatique

Strings (the texts) are assumed to be randomly generated according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over- or under-representation of a word or a set of words. The aim of this paper is twofold. First, given a single word, one studies the tail distribution of its number of occurrences and derives sharp large deviation estimates. Second, assuming that a given word is overrepresented, one studies the distribution of a second word and derives formulae for its expectation and variance. In both cases, the formulae are accurate and effectively computable. These results have applications in computational biology, where a genome is viewed as a text.
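
To make the setting concrete, here is a minimal sketch of the Bernoulli case, assuming letters are drawn independently with fixed probabilities. It is not the paper's method: the paper derives closed-form large deviation estimates, whereas this sketch only computes the elementary expectation (n - m + 1) · P(word) and a brute-force Monte Carlo estimate of the tail P(N ≥ k). The function names (`count_occurrences`, `expected_occurrences`, `empirical_tail`) and all parameters are hypothetical and chosen for illustration only.

```python
import random
from math import prod


def count_occurrences(text, word):
    """Count occurrences of `word` in `text`, overlaps included."""
    m = len(word)
    return sum(1 for i in range(len(text) - m + 1) if text[i:i + m] == word)


def expected_occurrences(word, probs, n):
    """Expected count in an i.i.d. text of length n: (n - m + 1) * P(word)."""
    m = len(word)
    if n < m:
        return 0.0
    return (n - m + 1) * prod(probs[c] for c in word)


def empirical_tail(word, probs, n, k, trials=2000, seed=0):
    """Monte Carlo estimate of P(N >= k), N being the count of `word`
    in a random text of length n (assumption: i.i.d. letters)."""
    rng = random.Random(seed)
    letters = list(probs)
    weights = [probs[c] for c in letters]
    hits = 0
    for _ in range(trials):
        text = "".join(rng.choices(letters, weights=weights, k=n))
        if count_occurrences(text, word) >= k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical uniform DNA-like alphabet, used only as an example.
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    print(expected_occurrences("ACGT", uniform, 1000))
    print(empirical_tail("ACGT", uniform, 1000, k=10))
```

Simulation of this kind becomes impractical for genuinely rare events, since almost no trial hits the tail; that limitation is the practical motivation for computable analytic estimates such as those the abstract announces.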


Volume: Vol. 6 no. 2
Published on: January 1, 2004
Imported on: March 26, 2015
Keywords: computable closed formulae, large deviations, combinatorics, generating functions, words, genome, [INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM]

Linked publications - datasets - softwares

Source : ScholeXplorer IsRelatedTo ARXIV 1204.1571
Source : ScholeXplorer IsRelatedTo DOI 10.1371/journal.pone.0080511
Source : ScholeXplorer IsRelatedTo DOI 10.48550/arxiv.1204.1571
Source : ScholeXplorer IsRelatedTo PMC PMC3855595
Source : ScholeXplorer IsRelatedTo PMID 24324603
Bayesian centroid estimation for motif discovery
