Probabilistic Modelling
The core skill of any scientist is his or her ability to abstract conceptual models of phenomena of interest and to express them in a useful form. We are interested in extracting models that not only account for observations but are also able to predict. High-performance inference is only possible when the various uncertainties can be modelled and compensated for. Such uncertainties include:
measurement uncertainty - errors and noise in measured variables and signals
intrinsic uncertainty - from a random physical process or random human behaviour
model uncertainty - lack of knowledge of which model is correct
model vagueness - interpolation between a reduced set of models
The BI Group's probabilistic modelling skills enable it to define elaborate mathematical models that reflect both a system's deterministic behaviour and its uncertainties.
People: Daniel McMichael, Geoff Jarrad & Simon Williams
Projects:
All the Group's research projects; the newest types of probabilistic model created by the Group are for Situation & Threat Assessment:
A statistical approach to situation assessment (early publicly available paper)
Course of Action Analysis (two projects for DSTO)
The CRC for Sensor Signal and Information Processing
(Sir Ross and Sir Keith Smith Fund project on Situation Assessment)
The main purpose of modelling within the BI Group is to provide the basis for optimisation. For example, in our work for Polartechnics on the TruScan cervical cancer probe it was necessary to build a probabilistic model of the cancer detection process in order to provide equations that could be coded into the Polartechnics/CSIRO Darwin optimisation tool. The detection algorithms were then optimised using Darwin for their ability to detect cancer in new patients.
Commonly, one seeks the model that "best" fits the data. For this purpose it is necessary to define a likelihood function, P(data | structure, parameters), in which the "data" is all the relevant observations at your disposal, the "structure" is the functional form of the likelihood function and the "parameters" are its coefficients. The maximum likelihood method maximises this function either over the parameters only or over both the structure and the parameters. Useful likelihood function structures fit the data well and lead to fast optimisations. It is often useful to constrain the parameters using a prior distribution, P(parameters, structure). We then maximise the joint probability of the data, structure and parameters:
P(data, structure, parameters) = P(data | structure, parameters) * P(parameters, structure).
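The difference between maximising the likelihood alone and maximising the joint probability can be illustrated with a minimal sketch. It assumes a fixed structure (a Bernoulli model with a single parameter `theta`) and a Beta prior; the counts, the prior settings `a` and `b`, and the grid search are illustrative choices, not the Group's models or the Darwin tool.

```python
import math

def log_likelihood(theta, successes, failures):
    """log P(data | structure, parameters) for a Bernoulli structure."""
    return successes * math.log(theta) + failures * math.log(1.0 - theta)

def log_prior(theta, a, b):
    """Unnormalised log of a Beta(a, b) prior over theta (assumed prior)."""
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

def argmax_on_grid(objective, steps=9999):
    """Crude optimiser: evaluate the objective on an open grid in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=objective)

# Hypothetical data: 7 successes and 3 failures.
successes, failures = 7, 3
# Hypothetical prior that gently favours theta near 0.5.
a, b = 2.0, 2.0

# Maximum likelihood: maximise P(data | structure, parameters) only.
theta_ml = argmax_on_grid(lambda t: log_likelihood(t, successes, failures))

# Joint maximisation: add the log prior to the log likelihood.
theta_joint = argmax_on_grid(
    lambda t: log_likelihood(t, successes, failures) + log_prior(t, a, b)
)

print(f"maximum likelihood estimate: {theta_ml:.4f}")
print(f"joint (prior-constrained) estimate: {theta_joint:.4f}")
```

With so few observations the prior pulls the estimate towards 0.5; as more data arrive the two estimates converge.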
Much of the theoretical work in the Group is about how to create likelihood functions for new problem domains that yield fast and effective optimisation algorithms.
Projects: Statistical Algorithms for Large Models
Estimating Switched Gaussian Mixture Models
Clinical Instrumentation
The Darwin Statistical Optimisation Package
Natural language processing aims to provide mechanisms to understand and generate natural language. The BI Group's interest in natural language processing is text understanding. We have constructed a parser based on a combinatory categorial grammar, and are currently optimising its performance. It is being adapted for use in deep semantic extraction. A recent independent report has shown that the parser has reached a level of performance comparable with that of similar leading-edge parsers (email Daniel.McMichael@csiro.au for a copy).
The Group's research has led to algorithms for extracting stories from texts and corpora of texts. These stories are high-level content representations that can be searched and analysed quickly.
Projects: Parsing
Story Extraction
Question Answering
Classification problems occur when there is a collection of objects and for each object there is a set of observations (features). In classification, we seek to assign each object to a class on the basis of its features. While classification algorithms can be hard-wired using a fixed set of classification rules, they can also be learned from data. The data is normally a set of preclassified objects. The Group has experience with algorithms such as CART, OC1, the hierarchical mixture of experts classifier, and many others. It has pioneered the shared mixture classifier, which has excellent robustness properties and good resistance to overfitting.
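Learning a classifier from preclassified objects can be sketched as follows. This is a deliberately simple nearest-centroid classifier, chosen only for illustration; it is not the shared mixture classifier or any of the algorithms named above, and the feature vectors and class labels are made up.

```python
from collections import defaultdict

def fit_centroids(features, labels):
    """Learn one mean feature vector (centroid) per class from labelled data."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def classify(centroids, x):
    """Assign x to the class whose centroid is closest in squared distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Hypothetical preclassified objects: two features each, two classes.
train_x = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
train_y = ["A", "A", "B", "B"]

centroids = fit_centroids(train_x, train_y)
prediction = classify(centroids, [2.9, 3.0])
print(prediction)
```

A hard-wired classifier would fix the decision rule in advance; here the rule (the centroids) is derived entirely from the preclassified data.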
Projects: The shared mixture classifier
The hierarchical mixture of experts classifier
Clinical instrumentation
In the theory of probability a measure can be assigned to a set of variables. Such joint probability functions arise in very many practical applications. Commonly, they replace simple rule-based expert systems and give greater accuracy and robustness to uncertainty.
Bayesian networks are a form of complex joint probability function in which the nodes of the network are variables and the links indicate direct dependency between them. Our work on Bayesian networks has focussed on fast algorithms for optimising both the structure and the parameters of Bayesian networks. We have used Bernoulli mixture distributions to represent families of variables and have estimated the weights associated with each component. The components of the mixture correspond to different connectivities between a node and its possible parent nodes. We have applied such networks to complex games and to modelling information systems.
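How a Bayesian network defines a joint probability function can be shown with a small sketch. It assumes three binary variables with hypothetical names and made-up conditional probabilities, and uses brute-force enumeration rather than the fast algorithms or Bernoulli mixture representation described above.

```python
from itertools import product

# Each node maps to (parents, table), where the table gives
# P(node = True | parent values). All numbers are illustrative assumptions.
network = {
    "rain": ((), {(): 0.2}),
    "sprinkler": (("rain",), {(True,): 0.01, (False,): 0.4}),
    "wet": (
        ("sprinkler", "rain"),
        {
            (True, True): 0.99,
            (True, False): 0.9,
            (False, True): 0.8,
            (False, False): 0.0,
        },
    ),
}

def joint_probability(network, assignment):
    """Product over nodes of P(node | parents): the network's joint probability."""
    p = 1.0
    for node, (parents, table) in network.items():
        p_true = table[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def marginal(network, node):
    """P(node = True), summing the joint over every full assignment."""
    names = list(network)
    total = 0.0
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if assignment[node]:
            total += joint_probability(network, assignment)
    return total

print(marginal(network, "wet"))
```

Enumeration grows exponentially with the number of variables, which is one reason fast structure and parameter algorithms matter for networks of practical size.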
People: Geoff Jarrad & Daniel McMichael
Projects: Bernoulli Mixture Models
last updated December 18, 2003 08:55 AM
Geoff.Jarrad@csiro.au