Jasper Slingsby
Bayes’ Rule:
\[ \underbrace{p(\theta|D)}_\text{posterior} \; \propto \; \underbrace{p(D|\theta)}_\text{likelihood} \;\; \underbrace{p(\theta)}_\text{prior} \; \]
The posterior is proportional to the likelihood times the prior.
The posterior is the conditional probability of the parameters given the data, \(p(\theta|D)\), and provides a probability distribution for the values each parameter can take.
This allows us to represent uncertainty in the model and its forecasts as probabilities, which is powerful because it lets us state how probable it is that our forecast is correct.
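To make the rule concrete, here is a minimal sketch in Python (a hypothetical survival example with made-up numbers, not the practical's model) that builds a posterior by multiplying the likelihood by the prior over a grid of candidate parameter values:

```python
# Hypothetical example: estimate a survival probability theta after observing
# 7 survivors out of 10 marked plants, using a grid approximation.
import numpy as np
from scipy import stats

theta = np.linspace(0, 1, 1001)              # candidate parameter values
prior = stats.beta.pdf(theta, 2, 2)          # p(theta): weak prior centered on 0.5
likelihood = stats.binom.pmf(7, 10, theta)   # p(D|theta): 7 successes in 10 trials

unnormalized = likelihood * prior            # posterior is proportional to this product
d_theta = theta[1] - theta[0]
posterior = unnormalized / (unnormalized.sum() * d_theta)  # rescale to a proper density

# Because the posterior is a full probability distribution over theta,
# uncertainty can be reported directly, e.g. as a 95% credible interval.
cdf = np.cumsum(posterior) * d_theta
lower = theta[np.searchsorted(cdf, 0.025)]
upper = theta[np.searchsorted(cdf, 0.975)]
print(f"posterior mean: {(theta * posterior).sum() * d_theta:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```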
The likelihood, \(p(D|\theta)\), represents the probability of the data \(D\) given the model with parameter values \(\theta\), and is used in likelihood-based analyses to construct likelihood profiles for the parameters.
Maximum Likelihood Estimation works with this term alone: the best estimates are the parameter values that maximize the probability of the observed data under the model.
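A minimal sketch of what "choosing the parameters that maximize the probability of the data" looks like in code, assuming some simulated normal observations (hypothetical data, not from the practical):

```python
# Hypothetical data: 100 observations drawn from a normal distribution.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=100)

def negative_log_likelihood(params):
    mu, log_sigma = params                   # optimize log(sigma) so sigma stays positive
    return -np.sum(stats.norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)))

# Minimizing the negative log-likelihood is the same as maximizing p(D | theta).
fit = optimize.minimize(negative_log_likelihood, x0=[0.0, 0.0])
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(f"MLE estimates: mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
print(f"sample mean and sd for comparison: {data.mean():.3f}, {data.std():.3f}")
```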
The prior is the marginal probability of the parameters, \(p(\theta)\).
It represents the credibility of the parameter values, \(\theta\), before seeing the data, and is specified from our prior knowledge or belief about what the parameters should be. This gives the scientific method a formal probabilistic footing: new evidence must be considered in the context of previous knowledge, providing the opportunity to update our beliefs.
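A small illustration of this updating, using a conjugate Beta prior on a proportion with made-up counts, where yesterday's posterior becomes today's prior:

```python
# Hypothetical counts: a Beta(2, 2) prior on a proportion, updated twice.
from scipy import stats

a, b = 2, 2                        # prior belief before any data
a, b = a + 7, b + 3                # update with 7 successes and 3 failures
print("posterior mean after study 1:", stats.beta.mean(a, b))

a, b = a + 12, b + 8               # the first posterior acts as the prior for new data
print("posterior mean after study 2:", stats.beta.mean(a, b))
```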
Data can enter (or be fused with) a model in a variety of ways. Here we’ll discuss these and then give an example of the Fynbos postfire recovery model used in the practical.
The opportunities for data fusion are linked to model structure, so we’ll revisit how some aspects of model structure change as we move from Least Squares to Maximum Likelihood Estimation to “single-level” Bayes to Hierarchical Bayes, and the data fusion opportunities each provides.
Conceptually (and perhaps over-simplistically), one can think of the changes in model structure as the addition of model layers, each of which provides more opportunities for data fusion.
Least Squares makes no distinction between the process model and the data model.
- the process model describes the drivers determining the pattern observed (i.e. it is the model equation you will be familiar with, such as a linear model)
- the data model describes the observation error or data observation process, i.e. the factors that may cause mismatch between the process model and the data
- in least squares the data model is implicitly a normal (also called Gaussian) distribution with constant variance, because minimizing the sums of squares is equivalent to maximizing a Gaussian likelihood under homogeneity of variance (as sketched after this list)
- the only opportunity to add data to a least squares model is via the process model
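A minimal sketch (with simulated data) of the equivalence noted above: fitting a straight line by least squares gives the same parameter estimates as maximizing a Gaussian likelihood with constant variance, which is why the data model in least squares is locked to the normal distribution:

```python
# Simulated data: a linear process model plus Gaussian observation noise.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 1.5 + 0.8 * x + rng.normal(scale=1.0, size=x.size)

# Least squares: minimize the sum of squared residuals directly.
slope_ls, intercept_ls = np.polyfit(x, y, deg=1)

# Maximum likelihood with a normal data model of constant variance.
def negative_log_likelihood(params):
    intercept, slope, log_sigma = params
    mu = intercept + slope * x               # the process model
    return -np.sum(stats.norm.logpdf(y, loc=mu, scale=np.exp(log_sigma)))

fit = optimize.minimize(negative_log_likelihood, x0=[0.0, 0.0, 0.0])
print("least squares: intercept, slope =", intercept_ls, slope_ls)
print("Gaussian MLE:  intercept, slope =", fit.x[0], fit.x[1])  # agree to optimizer tolerance
```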