45 - Elements of Bayesian Statistics

Basic Concepts

Let $X_1,\ldots,X_n$ be independent and identically distributed with $$X_i\sim f_{\vartheta}(x),$$ where $f_{\vartheta}$ is a density (or probability mass function) from a parametric family of distributions $\mathcal{F}=\{f_{\vartheta}:\vartheta\in\Theta\}$, and $\Theta\subseteq\mathbb{R}^k$ denotes the parameter space.

Decision Function

A decision function $\delta$ is a statistic $\delta:\mathbb{R}^n\rightarrow\mathcal{A}$, where $\mathcal{A}$ denotes the action space. If $\boldsymbol{x}=(x_1,\ldots,x_n)$ is observed, the decision $\delta(x_1,\ldots,x_n)$ is taken. $\mathcal{D}$ denotes the set of possible decision functions.

Loss Function

A non-negative function $L:\Theta\times\mathcal{A}\rightarrow[0,\infty)$ is called a loss or loss function.

In the case $\mathcal{A}=\Theta$, $L(\vartheta,a)=(\vartheta-a)^2$ is called the quadratic loss function.

Risk Function

The expected loss of the decision function $\delta(X)$ at the point $\vartheta$ defines the risk function $R:\Theta\times\mathcal{D}\rightarrow\mathbb{R}$, $$R(\vartheta,\delta)=E_{\vartheta}L(\vartheta,\delta(X)).$$

Minimax Principle

$\delta^{\ast}$ is called a minimax rule if $$\max_{\vartheta\in\Theta}R(\vartheta,\delta^{\ast})\le\max_{\vartheta\in\Theta}R(\vartheta,\delta) \quad \text{for all } \delta\in\mathcal{D}.$$
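
The minimax comparison can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text above: a single observation $X\sim\text{Bin}(n,p)$, quadratic loss, and two hypothetical estimators (`sample_mean` and `shrunk`). The maximum over $\Theta=[0,1]$ is only approximated on a finite grid.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """Probability mass function of Bin(n, p) at x."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def risk(p, delta, n):
    """Exact risk under quadratic loss: sum_x (p - delta(x))^2 * f(x | p)."""
    return sum((p - delta(x))**2 * binom_pmf(x, n, p) for x in range(n + 1))

def max_risk(delta, n, grid_size=1001):
    """Largest risk over an equally spaced grid on [0, 1] (approximates the max)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(risk(p, delta, n) for p in grid)

n = 10  # illustrative sample size (assumption)

def sample_mean(x):
    """Relative frequency x / n."""
    return x / n

def shrunk(x):
    """Estimator pulled towards 1/2; its risk does not depend on p."""
    return (x + sqrt(n) / 2) / (n + sqrt(n))

print(max_risk(sample_mean, n))  # worst case is attained at p = 0.5
print(max_risk(shrunk, n))       # smaller worst-case risk than sample_mean
```

In this setting the second estimator has the smaller maximal risk, so the minimax principle prefers it, even though the relative frequency has lower risk for $p$ close to 0 or 1.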

Bayes Principle

It is assumed that $\vartheta\sim\pi(\vartheta)$. $\pi(\vartheta)$ is called the a priori distribution or prior.

Instead of $f_{\vartheta}(x)$ one now writes $f(x|\vartheta)$. The joint density of $X$ and $\vartheta$ satisfies $$f(x,\vartheta)=f(x|\vartheta)\pi(\vartheta),$$ and furthermore $$f(x)=\int f(x,\vartheta)\,d\vartheta \quad \text{or} \quad f(x)=\sum_{\vartheta}f(x,\vartheta)$$ and $$f(\vartheta|x)=\frac{f(x,\vartheta)}{f(x)}.$$

$f(\vartheta|x)$ is called the a posteriori distribution (posterior distribution) of $\vartheta$.
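
The three formulas above translate directly into a few lines of code when the prior is discrete. The sketch below assumes a binomial model $f(x|\vartheta)$ and a hypothetical three-point prior; the function name `posterior` and all numbers are chosen for illustration only.

```python
from math import comb

def posterior(x, n, prior):
    """Posterior f(theta | x) for X ~ Bin(n, theta) and a discrete prior.

    prior maps each candidate theta to its prior probability pi(theta).
    """
    # Joint: f(x, theta) = f(x | theta) * pi(theta)
    joint = {t: comb(n, x) * t**x * (1 - t)**(n - x) * w for t, w in prior.items()}
    # Marginal: f(x) = sum over theta of f(x, theta)
    marginal = sum(joint.values())
    # Posterior: f(theta | x) = f(x, theta) / f(x)
    return {t: j / marginal for t, j in joint.items()}

prior = {0.2: 0.5, 0.5: 0.3, 0.8: 0.2}  # illustrative prior (assumption)
post = posterior(x=7, n=10, prior=prior)
for theta, prob in post.items():
    print(theta, round(prob, 4))
```

After observing 7 successes in 10 trials, the posterior shifts most of its mass to the larger values of $\vartheta$, although the prior favoured $\vartheta=0.2$.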

The risk function is written as $R(\vartheta,\delta)=E(L(\vartheta,\delta(X))|\vartheta)$. If $X$ is continuous, then $$R(\vartheta,\delta)=\int L(\vartheta,\delta(x))f(x|\vartheta)\,dx,$$ and if $X$ is discrete, $$R(\vartheta,\delta)=\sum_xL(\vartheta,\delta(x))f(x|\vartheta).$$

Bayes Risk

The mean of the conditional risk $R(\vartheta,\delta)$ over $\vartheta$, $$R(\pi,\delta)=E_{\pi}R(\vartheta,\delta),$$ is called the Bayes risk of $\delta$.

If $\pi(\vartheta)$ is a density, then $$R(\pi,\delta)=\int R(\vartheta,\delta)\pi(\vartheta)\,d\vartheta,$$ and for a discrete prior $$R(\pi,\delta)=\sum_{\vartheta}R(\vartheta,\delta)\pi(\vartheta).$$
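
For a discrete prior the Bayes risk is a finite weighted sum and can be evaluated exactly. The following sketch assumes a binomial model, quadratic loss, a hypothetical three-point prior, and two illustrative estimators; none of these choices come from the text above.

```python
from math import comb

def binom_pmf(x, n, theta):
    """Probability mass function of Bin(n, theta) at x."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def risk(theta, delta, n):
    """Conditional risk R(theta, delta) under quadratic loss (discrete X)."""
    return sum((theta - delta(x))**2 * binom_pmf(x, n, theta) for x in range(n + 1))

def bayes_risk(prior, delta, n):
    """Bayes risk R(pi, delta) = sum over theta of R(theta, delta) * pi(theta)."""
    return sum(risk(theta, delta, n) * weight for theta, weight in prior.items())

n = 10                                   # illustrative sample size (assumption)
prior = {0.2: 0.5, 0.5: 0.3, 0.8: 0.2}   # illustrative prior (assumption)

r_mean = bayes_risk(prior, lambda x: x / n, n)              # relative frequency
r_shrunk = bayes_risk(prior, lambda x: (x + 1) / (n + 2), n)  # shrunk towards 1/2
print(r_mean, r_shrunk)
```

Unlike the risk function, the Bayes risk is a single number per decision function, so two rules can always be ranked once a prior is fixed.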

Bayes Rule

A decision function $\delta^{\ast}\in\mathcal{D}$ is called a Bayes rule if $$R(\pi,\delta^{\ast})=\min_{\delta\in\mathcal{D}}R(\pi,\delta).$$

Under quadratic loss, the Bayes rule (Bayes estimator) is the expected value of the posterior distribution, $\delta^{\ast}(x)=E(\vartheta|x)$.
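
A short derivation sketch, assuming quadratic loss $L(\vartheta,a)=(\vartheta-a)^2$ and that the order of integration may be exchanged, indicates why the posterior mean appears here. Writing the Bayes risk in terms of the posterior, $$R(\pi,\delta)=\int\left[\int L(\vartheta,\delta(x))f(\vartheta|x)\,d\vartheta\right]f(x)\,dx,$$ it suffices to minimize the inner integral for each fixed $x$. For quadratic loss, $$E\left((\vartheta-a)^2|x\right)=\operatorname{Var}(\vartheta|x)+\left(E(\vartheta|x)-a\right)^2,$$ which is smallest at $a=E(\vartheta|x)$, so that $\delta^{\ast}(x)=E(\vartheta|x)$.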

Conjugate Prior Family

$\pi(\vartheta), \vartheta\in\Theta,$ is called a conjugate prior family (or conjugate prior) for a conditional distribution model $f(x|\vartheta)$ if the a posteriori distribution is again an element of the prior family.

$\boldsymbol{f(x|\vartheta)}$	$\boldsymbol{\pi(\vartheta)}$	$\boldsymbol{f(\vartheta|x)}$
$N(\vartheta,\sigma^2)$	$N(\mu,\tau^2)$	$N\left(\frac{\sigma^2\mu+x\tau^2}{\sigma^2+\tau^2},\frac{\sigma^2\tau^2}{\sigma^2+\tau^2}\right)$
$\Gamma(\nu,\vartheta)$	$\Gamma(\alpha,\beta)$	$\Gamma(\alpha+\nu,\beta+x)$
$\text{Bin}(n,\vartheta)$	$\text{Beta}(\alpha,\beta)$	$\text{Beta}(\alpha+x,\beta+n-x)$
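
The parameter updates in the table can be written as small helper functions. This is a minimal sketch for a single observation $x$; the function names are hypothetical. It reads the Gamma row as a likelihood with known shape $\nu$ and unknown rate $\vartheta$, and the binomial row with $\vartheta$ as the success probability.

```python
def normal_update(x, sigma2, mu, tau2):
    """N(theta, sigma2) likelihood, N(mu, tau2) prior -> posterior (mean, variance)."""
    mean = (sigma2 * mu + x * tau2) / (sigma2 + tau2)
    var = sigma2 * tau2 / (sigma2 + tau2)
    return mean, var

def gamma_update(x, nu, alpha, beta):
    """Gamma likelihood (known shape nu), Gamma(alpha, beta) prior on the rate."""
    return alpha + nu, beta + x

def beta_update(x, n, alpha, beta):
    """Bin(n, theta) likelihood, Beta(alpha, beta) prior -> posterior (alpha, beta)."""
    return alpha + x, beta + n - x

# Illustrative values (assumptions, not taken from the table)
print(normal_update(x=2.0, sigma2=1.0, mu=0.0, tau2=1.0))  # → (1.0, 0.5)
print(gamma_update(x=3.0, nu=2.0, alpha=1.0, beta=1.0))    # → (3.0, 4.0)
print(beta_update(x=7, n=10, alpha=1, beta=1))             # → (8, 4)
```

In each case the posterior stays in the prior's family, so repeated observations can be processed by feeding the returned parameters back in as the next prior.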
