Cumulative distribution networks: Graphical models for cumulative distribution networks

Traditional graphical models express a decomposition of a joint distribution as a product of functions of subsets of variables. However, in some situations such as when learning to rank documents in information retrieval or ranking regulatory elements in computational systems biology, it is more natural to think of each network function as providing evidence about possible orderings of its variables. We formalize this by introducing a new type of graphical model called a `cumulative distribution network' (CDN), which uses a product of functions of subsets of variables to describe the joint cumulative distribution (CDF). The conditional independence properties of CDNs are quite different from other graphical models. Unlike Bayesian networks, Markov networks and factor graphs, in a CDN the marginal distribution of any variable can be computed in constant time whilst allowing for complex conditional dependencies.


An example of a CDN defined over 3 variables X,Y,Z. The product of the local CDN functions yields the joint CDF over X,Y,Z.

To compute conditional cumulative distributions in the network, we have developed the derivative-sum-product algorithm in which messages correspond to local conditional CDFs obtained from computing derivatives of the joint cumulative distribution. Due to the unique statistical independence properties of the CDN, these messages do not in general have a one-to-one correspondence with messages exchanged in standard algorithms, such as belief propagation.
Messages passed in the derivative-sum-product (DSP) algorithm for inference in CDNs. The DSP messages correspond to derivatives of the joint CDF with respect to local sets of variables, just as messages in standard belief propagation correspond to summations/integrals of the joint probability function of sets of variables.


Structured ranking learning using cumulative distribution networks

The joint CDF of many variables arises in problems of learning to rank objects, such as in document retrieval and in discovering regulatory sequences in computational systems biology. The problem of learning to rank from examples has recently gained significant interest in the machine learning community, with importance similarities and differences with the canonical problems of regression and classification. Unlike standard regression or classification in which we predict outputs independently, in ranking we must predict structured outputs so that misranking one object affects whether we correctly rank the other objects. In practice, the problem of ranking involves a large number of objects to be ranked and either approximate structured prediction methods are required, or assumptions of independence between object scores must be made in order to make the problem tractable. CDNs provide a natural and flexible class of loss functionals for learning to rank which allow for a compact representation of the dependencies between pairwise object preferences involved in ranking. By viewing loss functionals as CDNs, we can unify many previous methods for learning to rank, which were all developed in different fields for different applications. The underlying probabilstic models can be viewed as particular instances of CDNs, so that CDNs allow one to explore of a broad class of flexible structured loss functionals for learning to rank.

The structured ranking learning framework using cumulative distribution networks. Given a set of observations consisting of order graphs over nodes to be ranked, we wish to learn a ranking function which assigns a score to any given node in an order graph. Preferences between nodes can be represented as random variables in a CDN, so that the joint CDF over the preferences is the probability of generating the observed orderings in the observations. Learning a ranking function under a structured loss functional specified by the CDN is then equivalent to maximizing the probability of generating observed orderings.

Applications of this novel framework include document retrieval for web search [2] and ranking regulatory sequences in computational biology [1]. In the latter case, the formulation of the problem of discovering regulatory sequences as one of learning to rank allows us to integrate multiple types of datasets and diverse computational predictions under a single model without making explicit assumptions about relationships between different biological features and their impacts on orderings over sequences. The use of the CDN framework provides a unified probabilistic view of learning to rank and a tractable representation which can be learned using stochastic optimization methods.

Converting an order graph to a cumulative distribution network. The order graph shown over nodes V1,V2,V3,V4 encodes the set of preferences V1 > V2, V1 > V3, V1 > V4, V2 > V4, V3 > V4. Each of these preferences is then assigned to a random variable in a CDN. The connectivity of the CDN then allows one to express statistical dependencies between preference variables which arise when learning to rank many objects.


A walkthrough of cumulative distribution networks and their applications

Click here for more info...




References:

J.C. Huang and B.J. Frey. (2009) STORMSeq: A method for ranking regulatory sequences by integrating experimental datasets with diverse computational predictions
Submitted.

J.C. Huang and B.J. Frey. (2009) Structured ranking learning using cumulative distribution networks.
Advances in Neural Information Processing Systems (NIPS) 21. Supplementary File

J.C. Huang and B.J. Frey. (2008) Cumulative distribution networks and the derivative-sum-product algorithm.
Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI).