Markov property
In mathematics, the term Markov property or Markov-type property can refer to either of two closely related things.
In the narrowest sense, a stochastic process has the Markov property if the conditional probability distribution of future states of the process, given the present state and a fixed number of past states, depends only upon the present state and those given past states, not on any earlier states; i.e., it is conditionally independent of the older states. Such a process is called Markovian or a Markov process. The articles Markov chain and continuous-time Markov process explore this property in greater detail.
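Stated concretely for a discrete-time process (a standard formulation; the time index $n$ and state values $x_i$ are introduced here for illustration), the first-order Markov property reads

$$\Pr(X_{n+1} = x \mid X_n = x_n, X_{n-1} = x_{n-1}, \ldots, X_0 = x_0) = \Pr(X_{n+1} = x \mid X_n = x_n),$$

and the variant with a fixed number $m$ of past states (a Markov process of order $m$) conditions on the last $m$ states instead:

$$\Pr(X_{n+1} = x \mid X_n, \ldots, X_0) = \Pr(X_{n+1} = x \mid X_n, \ldots, X_{n-m+1}).$$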
In a broader sense, if the joint probability distribution of a collection of random variables can be factored in such a way that a Markov property is obtained, then that collection is said to have the Markov-type property; this is defined in detail below. Useful in applied research, members of such classes, defined by their mathematics or area of application, are referred to as Markov random fields, and occur in a number of situations, such as the Ising model. The Markov property is named after Andrey Markov.[1]
Definition
If one has a system composed of a set of random variables $\{X_k\}$, then, in general, the probability of a given random variable $X_j$ being in a state $x_j$ is written as

$$\Pr(X_j = x_j \mid \{X_k = x_k\}_{k \neq j}).$$
That is, in general, the probability of $X_j$ being in a state $x_j$ depends on the values of all of the other random variables $\{X_k\}$. If, instead, this probability depends only on some, but not all, of these, then one says that the collection has the Markov property.[2] Letting $N_j$ denote the subset of $\{X_k\}$ on which $X_j$ depends, one then writes this limited dependence as

$$\Pr(X_j = x_j \mid \{X_k = x_k\}_{k \neq j}) = \Pr(X_j = x_j \mid \{X_k = x_k\}_{k \in N_j}).$$
Any collection of random variables having this property is referred to as a Markov network. The set $N_j$ is sometimes referred to as the neighbors of $X_j$; alternatively, it is the Markov blanket of $X_j$.
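For instance (an illustrative case, not taken from the article itself): if the variables form a chain, each interior variable depends on the rest only through its two immediate neighbors, so that

$$N_j = \{X_{j-1}, X_{j+1}\} \quad\text{and}\quad \Pr(X_j = x_j \mid \{X_k = x_k\}_{k \neq j}) = \Pr(X_j = x_j \mid X_{j-1} = x_{j-1}, X_{j+1} = x_{j+1}).$$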
The probability distribution of a Markov network (provided it is strictly positive) can always be written as a Gibbs distribution, that is, as

$$\Pr(X = x) = \frac{1}{Z} \exp\bigl(-E(x)\bigr)$$

for an appropriate energy function $E$ whose terms depend only on the neighborhoods $N_j$. The normalizing constant $Z$ is known as the partition function.
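The Ising model mentioned above is a standard concrete instance of this form; the coupling constant $J$ and spins $x_i \in \{-1, +1\}$ below are the usual Ising notation, introduced here for illustration. The energy of a spin configuration $x$ and the resulting Gibbs distribution are

$$E(x) = -J \sum_{\langle i,k \rangle} x_i x_k, \qquad \Pr(X = x) = \frac{1}{Z} \exp\bigl(-E(x)\bigr), \qquad Z = \sum_{x'} \exp\bigl(-E(x')\bigr),$$

where the sum runs over pairs of neighboring lattice sites, so the neighborhoods $N_j$ are exactly the lattice neighbors.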
Markov networks are commonly seen in maximum entropy methods, since the Gibbs measure is also the unique probability measure that maximizes the entropy for a given expected energy.
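A brief sketch of why this holds (a standard Lagrange-multiplier argument, not spelled out in the article): maximizing the entropy $-\sum_x p(x)\ln p(x)$ subject to normalization $\sum_x p(x) = 1$ and a fixed expected energy $\sum_x p(x)E(x) = \bar{E}$ gives the stationarity condition

$$-\ln p(x) - 1 - \lambda E(x) - \mu = 0 \;\Longrightarrow\; p(x) = \frac{1}{Z}\,e^{-\lambda E(x)},$$

with $Z$ fixed by normalization; the multiplier $\lambda$ plays the role of an inverse temperature.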
Notes
- ^ A. A. Markov (1954). Theory of Algorithms [Teoriya algorifmov]. Translated by Jacques J. Schorr-Kon and PST staff. Moscow: Academy of Sciences of the USSR; Jerusalem: Israel Program for Scientific Translations, 1961. Translation of Works of the Mathematical Institute, Academy of Sciences of the USSR, v. 42.
- ^ For a more advanced approach, cf. William Feller, An Introduction to Probability Theory and Its Applications, Vol. II (2nd edition), Ch. X, § 8 ("Markov Processes and Semi-groups"), Wiley, 1971, ISBN 0-471-25709-5.