The aim of this thesis is to describe how a statistically based neural network technology, the BCPNN (Bayesian Confidence Propagation Neural Network), can be used in two different applications: data mining in a huge database and modelling of an industrial process. BCPNN has previously been used successfully in classification tasks such as fault diagnosis, pattern recognition and hierarchical clustering analysis.
BCPNN is a neural network model somewhat reminiscent of the Bayesian decision trees used in artificial intelligence systems. As a neural network, the BCPNN differs considerably from backpropagation (BP) and other gradient methods. Learning in a BCPNN is based on calculating probabilities and dependencies, which is often a more or less straightforward process compared with the usually time-consuming iterative gradient methods. The weight values in a BCPNN are also rather easy to interpret compared with the weight values of a network trained by gradient methods.
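To make the contrast with iterative gradient training concrete, the following minimal Python sketch estimates biases and weights in a single counting pass over the data. It assumes the commonly used BCPNN form in which a unit's bias is the logarithm of its estimated probability and a weight is the logarithm of the ratio between a joint probability and the product of the marginals. The function name `bcpnn_weights`, the toy data and the additive smoothing constants are illustrative assumptions, not the implementation used in this thesis.

```python
import math
from collections import Counter
from itertools import combinations


def bcpnn_weights(samples):
    """Estimate biases and pairwise weights from co-occurrence counts.

    samples: list of sets, each holding the names of the active units
    in one observation. No iterative optimisation is involved.
    """
    n = len(samples)
    unit_count = Counter()
    pair_count = Counter()
    for active in samples:
        units = sorted(active)
        unit_count.update(units)
        pair_count.update(combinations(units, 2))

    # Assumption: simple additive smoothing so that zero counts do not
    # produce log(0); the constants 0.5 and 0.25 are illustrative only.
    def p_unit(i):
        return (unit_count[i] + 0.5) / (n + 1)

    def p_pair(i, j):
        return (pair_count[(i, j)] + 0.25) / (n + 1)

    all_units = sorted(unit_count)
    bias = {i: math.log(p_unit(i)) for i in all_units}
    weights = {
        (i, j): math.log(p_pair(i, j) / (p_unit(i) * p_unit(j)))
        for i, j in combinations(all_units, 2)
    }
    return bias, weights


# Toy data: "a" and "b" always occur together, "c" never occurs with them.
toy = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
bias, weights = bcpnn_weights(toy)
```

In this sketch a positive weight means two units co-occur more often than independence would predict and a negative weight means less often, which is what makes the weights directly readable.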
By process modelling we here mean function approximation. A function in the general sense may be considered a spatio-temporal outcome of a spatio-temporal input. Function approximation in this sense is somewhat more complex than the modelling done in this thesis, since time is not dealt with in the papers that discuss process modelling. To give a glimpse of how the BCPNN can also deal with time, two papers are included that address some temporal aspects of BCPNN.
The most important results of this thesis can be
summarized as follows:
We show how a Bayesian neural network can be extended to model
the uncertainties in the collected statistics and to produce outcomes
as distributions, from two different aspects: uncertainties induced
by sampling, which is useful for data mining, and uncertainties
due to input data distributions, which is useful for process modelling.
We show how complex dependencies can be found within large
data sets while still avoiding combinatorial explosion.
We show how these techniques have been turned into a useful tool
for real-world applications, in particular within the drug safety area.
We compare some results of the BCPNN technique with the
well-established nonlinear regression technique, backpropagation (BP)
networks, for process modelling, showing that the
BCPNN performs at least equally well and also provides extra
information about the uncertainties of the produced outcomes.
We present a simple but working method for doing automatic
temporal segmentation of data sequences.
We indicate some aspects of temporal tasks for which
a predictive Bayesian neural network may be useful, and show how
the connection matrix can be reduced by exploiting
regularities in the data.