statistical risk minimization problem

Data

subject:: Data Science Methods for Large Scale Graphs
parent:: Graph Signals and Graph Signal Processing
theme:: math notes

Statistical Risk Minimization Problem

Suppose $x, y$ are related by some (known) statistical model $p(x, y)$, and we want to predict $y$ from $x$ using a model $\hat{y} = f(x)$, where $f$ is a member of our hypothesis class $F$.

Let $\ell(y, \hat{y})$ be our loss function. Then the statistical risk minimization problem is to minimize the expected loss over the distribution $p(x, y)$:

$$f^{*} = \arg\min_{f \in F} \mathbb{E}_{p(x, y)} \left[ \ell(y, f(x)) \right]$$

The optimal estimator $f^{*}$ is the function with minimal expected loss among all functions $f \in F$.
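As a rough numerical illustration of this definition, the sketch below approximates the expected loss by Monte Carlo sampling and picks the minimizer from a small hypothesis class. Everything specific here is an assumption made for the example and does not come from the lecture: the joint model (`y = 2x + noise` with Gaussian `x` and noise), the squared loss, the class $F = \{f_a(x) = a x\}$ over a grid of slopes, and the helper names `sample_xy` and `estimate_risk`.

```python
import random


def sample_xy(rng):
    """Draw one (x, y) pair from an assumed joint model p(x, y): y = 2x + noise."""
    x = rng.gauss(0.0, 1.0)
    y = 2.0 * x + rng.gauss(0.0, 0.5)
    return x, y


def squared_loss(y, y_hat):
    """Assumed loss function: l(y, y_hat) = (y - y_hat)^2."""
    return (y - y_hat) ** 2


def estimate_risk(a, samples):
    """Monte Carlo estimate of E_{p(x,y)}[l(y, f_a(x))] for f_a(x) = a * x."""
    return sum(squared_loss(y, a * x) for x, y in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_xy(rng) for _ in range(20000)]

# Hypothetical hypothesis class F: slopes a in {0.0, 0.1, ..., 4.0}.
candidates = [i / 10 for i in range(41)]
risks = {a: estimate_risk(a, samples) for a in candidates}

# f* is the member of F with the smallest estimated expected loss.
a_star = min(risks, key=risks.get)
print("estimated minimizer slope:", a_star)
print("estimated risk at minimizer:", risks[a_star])
```

Under these assumptions the risk of $f_a$ is $(2 - a)^2 + 0.25$, so the selected slope should sit at or next to $a = 2$. Note that the search is restricted to $F$: a slope outside the grid can never be chosen, even if it had lower risk.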

Note

Typically, we are interested in either

Predicting $y$ from $x$ with the conditional distribution $y \sim p(y \mid x)$.
ex: stochastic outputs (VAEs, diffusion models, etc.)
Predicting $y$ from $x$ with the conditional expectation $y = \mathbb{E}[y \mid x]$.
ex: deterministic outputs, classical regime/supervised learning
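The two options above can be compared numerically. The sketch below is a minimal illustration under assumptions that are not in the original notes: a conditional model $y \mid x \sim \mathcal{N}(2x, 0.5^2)$, a squared loss, and the hypothetical helper names `predict_mean` and `predict_sample`. It estimates the expected squared loss of a deterministic predictor that returns the conditional expectation and of a stochastic predictor that draws a fresh sample from the conditional distribution.

```python
import random

SIGMA = 0.5  # assumed noise standard deviation of y given x


def cond_mean(x):
    """Assumed conditional expectation E[y | x] = 2x."""
    return 2.0 * x


def predict_mean(x, rng):
    """Deterministic output: the conditional expectation."""
    return cond_mean(x)


def predict_sample(x, rng):
    """Stochastic output: one draw from the assumed p(y | x)."""
    return rng.gauss(cond_mean(x), SIGMA)


def estimate_risk(predict, rng, n=20000):
    """Monte Carlo estimate of the expected squared loss of a predictor."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(cond_mean(x), SIGMA)  # true target drawn from p(y | x)
        total += (y - predict(x, rng)) ** 2
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
risk_mean = estimate_risk(predict_mean, rng)
risk_sample = estimate_risk(predict_sample, rng)
print("risk of conditional-mean predictor:", risk_mean)
print("risk of sampling predictor:", risk_sample)
```

Under these assumptions the conditional-mean predictor has risk $\sigma^2 = 0.25$, while the sampling predictor has risk $2\sigma^2 = 0.5$, because the prediction and the target are independent draws given $x$. The sampling predictor is not worse in general: it is simply optimized for a different goal (matching the distribution of $y$) than minimizing squared error.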

Mentions

File
2025-02-03 graphs lecture 4
2025-02-05 graphs lecture 5