Notes from Tom Loredo's introductory talk on statistics.

Inference: deductive/inductive

Statistical inference: quantify inductive inference -> probability

Consider a set of models M_i with parameters P_i.

Parameter estimation - given model M_i, what can we say about its parameters P_i?

Model uncertainty - which model is better? Is M_0 adequate?

Hybrid uncertainty - the models share some common parameters: what can we say about them?

Frequentist (F): devise a procedure to choose among hypotheses H_i using data D. Apply it to D_obs. Report long-run performance.

Bayesian (B): calculate the probability of hypotheses given D_obs and the modelling premises, using the rules of probability theory.

Frequency -> Probability - Bernoulli's law of large numbers

Probability -> Frequency - Bayes' original paper
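The frequency-to-probability direction can be seen in a toy simulation (my illustration, not from the talk): under Bernoulli's law of large numbers, the empirical frequency of "successes" converges to the underlying probability as the number of trials grows. The probability `p` and trial counts below are arbitrary choices.

```python
import random

random.seed(1)
p = 0.3  # assumed true probability of success

# Watch the empirical frequency approach p as n grows.
for n in (100, 10_000, 1_000_000):
    flips = [random.random() < p for _ in range(n)]
    freq = sum(flips) / n
    print(n, round(freq, 4))
```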

B vs F

B is more general: it can in principle contemplate the probability of anything.

B is narrower: the data enter only through the likelihood function.

F can always base a procedure on a B calculation.

Decision theory: a rule a(D) chooses a particular action when data D are observed.

Risk : R(o) = Sum_D p(D|o) L(a(D),o)

where L is the loss function. Seek rules with small risk.
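The risk sum above can be made concrete with a toy problem (my sketch; the model, rule, and loss are illustrative choices, not from the talk): let D ~ Binomial(n, o), take the rule a(D) = D/n, and use squared-error loss. The enumeration should reproduce the closed form o(1-o)/n.

```python
from math import comb

def risk(o, n):
    """R(o) = sum_D p(D|o) L(a(D), o) for D ~ Binomial(n, o)."""
    total = 0.0
    for D in range(n + 1):
        p_D = comb(n, D) * o**D * (1 - o)**(n - D)  # sampling distribution p(D|o)
        a = D / n                                   # action: estimate of o
        loss = (a - o) ** 2                         # squared-error loss L(a, o)
        total += p_D * loss
    return total

print(risk(0.3, 20))  # closed form: 0.3 * 0.7 / 20 = 0.0105
```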

Inference: F calibration - the long-run average actual accuracy should match the long-run average reported accuracy. Decision theory is then used to choose among calibrated rules.

B decision theory: average the loss over outcomes (hypotheses), weighted by their posterior probabilities.

Wald's Complete Class Theorem: admissible F decision rules are B rules. Less useful than it appears, because sometimes inadmissible rules are better than admissible ones.

Model uncertainty: the B method requires an alternative model.

Comparison of B & F: B credible regions are usually close to F confidence regions, but often better. Decision results are very different.

Counting experiment with known background - B approach (Helene 1983), F approach (Roe & Woodroofe 1999).
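A minimal sketch in the spirit of the Bayesian approach (my implementation, not Helene's code): observe n counts with known expected background b; with a flat prior on the signal s >= 0, the posterior is p(s|n) proportional to exp(-(s+b)) * (s+b)^n. A credible upper limit follows by numerical integration. The grid size and step are illustrative choices.

```python
from math import exp, log

def upper_limit(n, b, cl=0.95, s_max=60.0, steps=600_000):
    """Bayesian upper limit on signal s, flat prior on s >= 0."""
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]
    # unnormalised posterior: exp(-(s+b)) * (s+b)^n, computed in log space
    w = [exp(-(s + b) + n * log(s + b)) for s in grid]
    total = sum(w)
    cum = 0.0
    for s, wi in zip(grid, w):
        cum += wi
        if cum >= cl * total:
            return s  # smallest s containing fraction cl of the posterior
    return s_max

print(upper_limit(3, 1.0))
```

As a sanity check, for n = 0 the posterior reduces to exp(-s) regardless of b, so the 95% limit should be close to -ln(0.05) ≈ 3.0.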

Nuisance parameters: profile likelihood - maximise over the nuisance parameters Q:

L_p(P) = max_Q L(P,Q)

This can be biased, and the resulting confidence intervals can be too small. The modern F approach is an asymptotic adjustment:

L_p(P) x |I_QQ(P)|^-1/2

where I_QQ is the information matrix for Q. Alternatively, evaluate the F properties of the B marginal solution.

Hypothesis testing :

No one uses Neyman-Pearson testing as prescribed - you must pick alpha ahead of time and report only that level, e.g. quote a 2-sigma detection even if it is actually 10 sigma.

Fisher proposed p-values to get around this. A p-value depends on the observed data and appears easy to interpret, but p-values do not accurately measure how often the null will be wrongly rejected.

Recent work on conditional testing solves some of these problems; Berger has compared it with B methods.

Multiple tests - False Discovery Rate. An active research topic.
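A standard FDR-controlling method for multiple tests is the Benjamini-Hochberg step-up procedure (my sketch; the p-values below are made up for illustration): sort the m p-values, find the largest rank k with p_(k) <= (k/m) * q, and reject the k hypotheses with the smallest p-values.

```python
def benjamini_hochberg(pvals, q=0.05):
    """Return indices of rejected nulls under the BH step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by p-value
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:  # step-up comparison p_(k) <= k*q/m
            k_max = rank
    return sorted(order[:k_max])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals, q=0.05))
```

Note the step-up character: a p-value above its own threshold can still be rejected if some larger-ranked p-value clears its threshold.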

The non-parametric situation is much more controversial, and the F and B approaches have not converged.

Keywords: statistics

## Thursday, May 19, 2005
