Maximum Entropy




The principle of maximum entropy was first expounded by E. T. Jaynes in 1957, as an interpretation of the Gibbs algorithm of statistical mechanics. He suggested that thermodynamics, and in particular thermodynamic entropy, should be seen as just one application of a general tool of inference and information theory. (See Maximum Entropy Thermodynamics.)

The maximum entropy principle is like other Bayesian methods in that it makes explicit use of prior information. This is an alternative to the methods of inference of classical statistics.


TESTABLE INFORMATION


The principle of maximum entropy is useful only when applied to ''testable information''. A piece of information is testable if it can be determined whether a given distribution is consistent with it. For example, the statements

:The expectation of the variable ''x'' is 2.87
and
:<math>p_2 + p_3 > 0.6</math>


are statements of testable information.
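Testability can be made concrete with a small check. The following is a minimal sketch, assuming a discrete variable with four hypothetical outcomes; the helper names and the two candidate distributions are illustrative and not part of the original text. It tests whether a given distribution is consistent with the statement that the expectation of ''x'' is 2.87.

```python
def expectation(probs, values):
    """Expected value of a discrete variable with the given probabilities."""
    return sum(p * x for p, x in zip(probs, values))


def is_distribution(probs, tol=1e-9):
    """Check that the probabilities are non-negative and sum to one."""
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) <= tol


def satisfies_mean(probs, values, target, tol=1e-9):
    """Check the testable statement 'the expectation of x equals target'."""
    return abs(expectation(probs, values) - target) <= tol


# Hypothetical outcomes and candidate distributions (assumptions for illustration).
values = [1, 2, 3, 4]
candidate_a = [0.25, 0.25, 0.25, 0.25]  # uniform, mean 2.5
candidate_b = [0.13, 0.25, 0.24, 0.38]  # mean 2.87

consistent_a = satisfies_mean(candidate_a, values, 2.87)
consistent_b = satisfies_mean(candidate_b, values, 2.87)
```

Because the check returns a definite yes or no for any candidate distribution, the statement qualifies as testable information in the sense used here.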

Given testable information, the maximum entropy procedure consists of seeking the probability distribution that maximizes information entropy, subject to the constraints of the information. This constrained optimization problem is typically solved using the method of Lagrange multipliers.

Entropy maximization with no testable information takes place under a single constraint: the sum of the probabilities must be one. Under this constraint, the maximum entropy probability distribution is the uniform distribution,

:<math>p_i=\frac{1}{n}\ {\rm for\ all}\ i\in\{\,1,\dots,n\,\}.</math>

The principle of maximum entropy can thus be seen as a generalization of the classical principle of indifference, also known as the principle of insufficient reason.
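The claim that the uniform distribution maximizes entropy when normalization is the only constraint can be checked numerically. The sketch below is an illustration rather than a proof: it assumes ''n'' = 6 outcomes and compares the entropy of the uniform distribution with that of randomly generated distributions. The function names, the seed, and the number of trials are arbitrary choices made for this example.

```python
import math
import random


def entropy(probs):
    """Shannon entropy in nats; outcomes with p == 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def random_distribution(n, rng):
    """Draw n positive weights and normalize them so they sum to one."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


n = 6
uniform = [1.0 / n] * n
rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [random_distribution(n, rng) for _ in range(1000)]

uniform_entropy = entropy(uniform)  # equals log(n) up to rounding
max_random_entropy = max(entropy(p) for p in trials)

# No randomly drawn distribution should beat the uniform one.
uniform_is_best = max_random_entropy <= uniform_entropy + 1e-12
```

A random search of this kind cannot establish the result in general; it only shows that none of the sampled distributions exceeds the entropy log ''n'' attained by the uniform distribution.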


GENERAL SOLUTION FOR THE MAXIMUM ENTROPY DISTRIBUTION WITH LINEAR CONSTRAINTS



Discrete case


We have some testable information ''I'' about a quantity <math>x \in \{x_1, x_2, \ldots, x_n\}</math>. We express this information as ''m'' constraints on the expectations of the functions <math>f_k</math>; that is, we require our epistemic probability distribution to satisfy

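:<math>\sum_{i=1}^n \Pr(x_i|I) f_k(x_i) = F_k \qquad k = 1, \cdots, m.</math>

Furthermore, the probabilities must sum to one, giving the constraint

:<math>\sum_{i=1}^n \Pr(x_i|I) = 1.</math>

The probability distribution with maximum information entropy subject to these constraints is

:<math>\Pr(x_i|I) = \frac{1}{Z(\lambda_1,\cdots, \lambda_m)} \exp\left[\lambda_1 f_1(x_i) + \cdots + \lambda_m f_m(x_i)\right],</math>

where the normalization constant

:<math>Z(\lambda_1,\cdots, \lambda_m) = \sum_{i=1}^n \exp\left[\lambda_1 f_1(x_i) + \cdots + \lambda_m f_m(x_i)\right]</math>

is the partition function, and the <math>\lambda_k</math> are Lagrange multipliers whose values are fixed by the constraints.


Continuous case


For a quantity ''x'' that takes values in an interval of the real numbers, the sums are replaced by integrals. The constraints become

:<math>\int p(x|I) f_k(x)\,dx = F_k \qquad k = 1, \cdots,m</math>
:<math>\int p(x|I)\,dx = 1</math>

and the maximum entropy distribution is

:<math>p(x|I) = \frac{1}{Z(\lambda_1,\cdots, \lambda_m)} m(x)\exp\left[\lambda_1 f_1(x) + \cdots + \lambda_m f_m(x)\right],</math>

where <math>m(x)</math> is the invariant measure with respect to which the continuous entropy is defined, and

:<math>Z(\lambda_1,\cdots, \lambda_m) = \int m(x)\exp\left[\lambda_1 f_1(x) + \cdots + \lambda_m f_m(x)\right]dx.</math>

In the special case where the only information is that ''x'' lies in the interval (''a'', ''b''), the maximum entropy distribution is

:<math>p(x|I) = A \cdot m(x), \qquad a < x < b</math>

where ''A'' is a normalization constant.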
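As an illustration of the discrete case, consider a single constraint function ''f''(''x'') = ''x'', so that the maximum entropy distribution is proportional to exp(''λ'' ''x<sub>i</sub>''). The sketch below assumes a six-sided die whose mean is required to be 4.5 and finds the multiplier ''λ'' by bisection. The function names, the bracket [−50, 50], and the choice of bisection are assumptions of this example; any one-dimensional root finder would serve.

```python
import math


def gibbs_distribution(values, lam):
    """Return p_i = exp(lam * x_i) / Z for the single constraint function f(x) = x."""
    shift = max(lam * x for x in values)  # subtract the largest exponent for stability
    weights = [math.exp(lam * x - shift) for x in values]
    z = sum(weights)  # partition function, up to the common factor exp(shift)
    return [w / z for w in weights]


def mean(probs, values):
    """Expectation of x under the given probabilities."""
    return sum(p * x for p, x in zip(probs, values))


def solve_multiplier(values, target, lo=-50.0, hi=50.0, iters=200):
    """Find lam by bisection; the mean of the Gibbs form increases with lam."""
    if not min(values) < target < max(values):
        raise ValueError("target mean must lie strictly inside the range of values")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(gibbs_distribution(values, mid), values) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


faces = [1, 2, 3, 4, 5, 6]
lam = solve_multiplier(faces, 4.5)      # constraint: expected face value is 4.5
probs = gibbs_distribution(faces, lam)  # maximum entropy distribution for that mean
```

Bisection is applicable because the derivative of the mean with respect to ''λ'' is the variance of ''x'' under the distribution, which is positive. For a target mean of 3.5 the multiplier vanishes and the uniform distribution is recovered.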