Statistical Thinking

References:
*All of Statistics*, *Principles of Statistical Inference*, *Computer Age Statistical Inference*

History:
1. Classical statistics
2. Statistical machine learning
3. Data learning

Key figures: Fisher, Neyman

Estimating the distribution function

Plug-in principle

Linear statistical functionals

A short monograph:
*The Jackknife, the Bootstrap and Other Resampling Plans*

simulation

bootstrap

By the law of large numbers,

Hence, we can use the sample mean and sample variance of the simulated values to approximate E(Y) and V(Y).
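A minimal Monte Carlo sketch of this idea in Python, using the made-up target $Y=X^2$ with $X\sim \text{Uniform}(0,1)$ (so $E(Y)=1/3$ and $V(Y)=4/45$; the target is an assumption for illustration, not from the notes):

```python
import random

# Monte Carlo sketch: approximate E(Y) and V(Y) by the sample mean and
# sample variance of simulated draws.  Hypothetical target: Y = X^2,
# X ~ Uniform(0,1), so E(Y) = 1/3 and V(Y) = 4/45.
random.seed(0)
n = 100_000
ys = [random.random() ** 2 for _ in range(n)]

mean_hat = sum(ys) / n                                     # approximates E(Y)
var_hat = sum((y - mean_hat) ** 2 for y in ys) / (n - 1)   # approximates V(Y)
```

Both estimates converge as n grows, which is exactly the law-of-large-numbers argument above.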

SGD (stochastic gradient descent)

Importance Sampling

$X_1,\cdots,X_n\sim F$, with CDF $F$ and pdf $f$.

$$M=\int h(x)f(x)\,dx=\int h \frac{f}{g}\,g\, dx$$

Here we may choose $g$ to be a well-behaved density such as the normal, so this can be viewed as sampling from $g$.

$$\int h \frac{f}{g}\,g\, dx \approx\frac{1}{n}\sum h(x_i)\frac{f(x_i)}{g(x_i)}$$
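A sketch of this estimator, with an illustrative target not taken from the notes: $h(x)=x^2$, $f$ the $N(0,1)$ density, and proposal $g = N(0,2^2)$ (heavier-tailed than $f$, keeping the weights $f/g$ bounded):

```python
import math
import random

# Importance sampling: M = ∫ h(x) f(x) dx ≈ (1/n) Σ h(x_i) f(x_i)/g(x_i),
# with x_i drawn from g.  Here h(x) = x², f = N(0,1), g = N(0,4),
# so the true value is M = E[X²] = 1 under N(0,1).
def norm_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

random.seed(0)
n = 200_000
draws = [random.gauss(0, 2) for _ in range(n)]  # x_i ~ g
M_hat = sum(x * x * norm_pdf(x, 0, 1) / norm_pdf(x, 0, 2) for x in draws) / n
```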

Sampling Importance Resampling (SIR)

  • Sample candidates $Y_1,Y_2,\cdots, Y_n \sim g$
  • Calculate the importance weights $w(y_i)$
  • Resample $X_1,X_2,\cdots, X_n$ from $Y_1,Y_2,\cdots, Y_n$

$$P(X\in A\mid Y_1,Y_2, \cdots, Y_n)=\sum_i I_{Y_i \in A}\frac{w^*_i}{\sum_j w^*_j}$$

By the strong law of large numbers, the expression above tends to

$$\int_A w^*(y)\, g(y)\,dy=\int_A f(y)\,dy$$

The argument above shows that, when the sample size is large enough, the $X_i$ are approximately drawn from $f$.
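The three SIR steps might look like this sketch, with a hypothetical target $f = N(2,1)$ and proposal $g = N(0,3^2)$ (both chosen for illustration):

```python
import math
import random

# SIR: sample candidates from g, weight by w = f/g, then resample with
# probability proportional to the weights; the resampled points are
# approximately distributed according to f.
def norm_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

random.seed(0)
n = 50_000
ys = [random.gauss(0, 3) for _ in range(n)]                # candidates Y_i ~ g
ws = [norm_pdf(y, 2, 1) / norm_pdf(y, 0, 3) for y in ys]   # importance weights
xs = random.choices(ys, weights=ws, k=n)                   # resampled X_i

mean_hat = sum(xs) / n  # should approach E_f[X] = 2
```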

bootstrapping

We have $\theta=T(F)$ and $\hat{\theta}=T(\hat{F})$.
We wish to estimate

$$R(x,F)=\frac{T(\hat{F})-T(F)}{se(T(\hat{F}))}$$

The $x$ in $R$ comes from $F$; we simulate $R(x^*,\hat{F})$ to approximate $R(x,F)$.

  • Step 1: Estimate $R(x,F)$ with $R(x^*,\hat{F})$.
    Usually it is difficult to calculate $R(x^*,\hat{F})$ analytically, so we have step 2.
  • Step 2: Approximate $R(x^*,\hat{F})$ by simulation.

Example: $n=3$, $\{x_1,x_2,x_3\}=\{1,2,6\}\sim$ i.i.d. $F$; estimate the mean $\theta$.
The vector $x^*$ has $3^3=27$ possible values.

| $x^*$ | $\hat{\theta}^*$ | $p^*(\hat{\theta}^*\mid\hat{F})$ |
| --- | --- | --- |
| 111 | 1 | $\frac{1}{27}$ |
| 112 | $\frac{4}{3}$ | $\frac{3}{27}$ |

When $n$ is large, enumeration like this is impractical; instead we simply draw resamples at random.
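The $\{1,2,6\}$ example can be checked directly: enumerate all $3^3=27$ resamples for the exact bootstrap distribution of the mean, then compare with the random-resampling approach used when $n$ is large (a sketch):

```python
import itertools
import random

data = [1, 2, 6]
n = len(data)

# Exact bootstrap distribution: all 27 equally likely resamples.
exact = [sum(r) / n for r in itertools.product(data, repeat=n)]
exact_mean = sum(exact) / len(exact)   # equals the sample mean, 3

# Monte Carlo approximation by random resampling.
random.seed(0)
B = 20_000
sim = [sum(random.choices(data, k=n)) / n for _ in range(B)]
sim_mean = sum(sim) / B
```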

Bootstrap Variance Estimation

Simulate $V_F(T_n)$ with $V_{\hat{F}}(T_n)$.
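A sketch with made-up data, taking $T_n$ to be the sample mean; the bootstrap variance is just the sample variance of $T$ over $B$ resamples drawn from $\hat{F}$:

```python
import random

random.seed(0)
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # hypothetical sample
n, B = len(data), 5_000

# Draw B resamples from the empirical distribution and recompute T_n = mean.
reps = [sum(random.choices(data, k=n)) / n for _ in range(B)]
rep_mean = sum(reps) / B
v_boot = sum((t - rep_mean) ** 2 for t in reps) / B   # ≈ V_F(T_n)
```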



March 9


Given data $X \sim F$, we want to study the statistic $R(x,F)$; we use $R(x^*,\hat{F})$ to estimate $R(x,F)$, where

$$X^*=(X_1^*,\cdots,X_n^*)$$

Parametric bootstrap

$$X=(X_1,\cdots,X_n)\sim F(x,\theta)$$

What we do is

$$X=(X_1,\cdots,X_n)\rightarrow\hat{\theta}\rightarrow X^*\sim F(x,\hat{\theta})$$
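A sketch of this chain under an assumed Exponential(rate $\theta$) model with made-up data (the model choice is illustrative only):

```python
import random

random.seed(0)
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.7, 0.6, 1.1]  # hypothetical sample
n, B = len(data), 3_000

theta_hat = n / sum(data)  # MLE of the exponential rate

# Simulate X* from the fitted model F(x, θ̂) and refit θ* each time.
boot = []
for _ in range(B):
    x_star = [random.expovariate(theta_hat) for _ in range(n)]
    boot.append(n / sum(x_star))

boot_mean = sum(boot) / B
se_theta = (sum((t - boot_mean) ** 2 for t in boot) / B) ** 0.5
```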

bootstrapping regression

Yi=XiTβ+ϵi,i=1,,n Y_i=X_i^T\beta+\epsilon_i,i=1,\cdots,n

The $\epsilon_i$ are assumed to be i.i.d. with mean zero and constant variance.

$$(X_i,Y_i)\rightarrow\hat{\beta}\rightarrow \hat{Y_i} \rightarrow \hat{\epsilon}_i=Y_i-\hat{Y_i}\rightarrow\epsilon^*\rightarrow Y^*\rightarrow\beta^*$$
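The residual-bootstrap chain above, sketched for simple linear regression with simulated data (the data and the `ols` helper are made up for illustration):

```python
import random

random.seed(0)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + random.gauss(0, 1) for x in xs]  # hypothetical data
n = len(xs)

def ols(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b1 * xbar, b1

b0, b1 = ols(xs, ys)
fitted = [b0 + b1 * x for x in xs]
resid = [y - f for y, f in zip(ys, fitted)]   # residuals ε̂_i = Y_i − Ŷ_i

# Resample residuals, rebuild Y*, refit β* — one bootstrap replication each.
boot_b1 = []
for _ in range(1_000):
    eps_star = random.choices(resid, k=n)
    y_star = [f + e for f, e in zip(fitted, eps_star)]
    boot_b1.append(ols(xs, y_star)[1])

m = sum(boot_b1) / len(boot_b1)
se_b1 = (sum((b - m) ** 2 for b in boot_b1) / len(boot_b1)) ** 0.5
```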

Bootstrap Confidence Interval

Method 1. The Normal Intervals

$$v_{boot}=\frac{1}{B}\sum_{b=1}^{B}\big(T^*_{n,b}-\bar{T^*_{n}}\big)^2$$

Normal interval: $(T_n-z_{\frac{\alpha}{2}}\,\hat{se}_{boot},\; T_n+z_{\frac{\alpha}{2}}\,\hat{se}_{boot})$, where $\hat{se}_{boot}=\sqrt{v_{boot}}$.

Method 2

Let $\theta=T(F)$ and $\hat{\theta}_n=T(\hat{F})$.
Pivot: $R_n = \hat{\theta}_n-\theta$.
Let $\hat{\theta}_{n,1},\cdots,\hat{\theta}_{n,B}$ be the bootstrap replications of $\hat{\theta}$.

Let $\hat{\theta}_{\beta}$ denote the $\beta$ sample quantile of $\hat{\theta}_{n,1},\cdots,\hat{\theta}_{n,B}$.

Then the $1-\alpha$ bootstrap pivotal confidence interval is

$$C_n=\big(2\hat{\theta}_n-\hat{\theta}_{1-\frac{\alpha}{2}},\; 2\hat{\theta}_n-\hat{\theta}_{\frac{\alpha}{2}}\big)$$
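A sketch of the pivotal interval, taking $T$ to be the median of a made-up sample (both the statistic and the data are illustrative assumptions):

```python
import random

def median(xs):
    s = sorted(xs)
    m = len(s) // 2
    return s[m] if len(s) % 2 else (s[m - 1] + s[m]) / 2

random.seed(0)
data = [1.2, 2.9, 3.1, 4.5, 5.0, 5.3, 6.8, 7.7, 9.4, 11.2]
n, B, alpha = len(data), 4_000, 0.05

theta_hat = median(data)
reps = sorted(median(random.choices(data, k=n)) for _ in range(B))
q_lo = reps[int((alpha / 2) * B)]           # bootstrap quantile at α/2
q_hi = reps[int((1 - alpha / 2) * B) - 1]   # bootstrap quantile at 1−α/2

# Pivotal interval: (2θ̂ − quantile at 1−α/2, 2θ̂ − quantile at α/2)
c_lo, c_hi = 2 * theta_hat - q_hi, 2 * theta_hat - q_lo
```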

(Proof attached as images.)

More generally, we could define the pivot as

$$R_n = \frac{\phi(\hat{\theta}_n)-\phi(\theta)}{1+a\phi(\theta)}+b$$

Method 3 Percentile Interval

$$C_n=\big(\theta^*_{\frac{\alpha}{2}},\,\theta^*_{1-\frac{\alpha}{2}}\big)$$
assuming there exists a $\phi$, continuous and strictly increasing, whose distribution function $H$ is symmetric.

Purpose: transform the distribution $F$ into $G$.

$$1-\alpha = P^*\Big(h_{\frac{\alpha}{2}}\le \phi(\hat{\theta}^*)-\phi(\hat{\theta})\le h_{1-\frac{\alpha}{2}}\Big) = P^*\Big(\phi^{-1}\big(h_{\frac{\alpha}{2}}+\phi(\hat{\theta})\big)\le \hat{\theta}^* \le \phi^{-1}\big(h_{1-\frac{\alpha}{2}}+\phi(\hat{\theta})\big)\Big)$$
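In contrast to Method 2, the percentile interval just reads off the bootstrap quantiles directly; a sketch with made-up data, taking $\theta$ to be the mean:

```python
import random

random.seed(0)
data = [1.2, 2.9, 3.1, 4.5, 5.0, 5.3, 6.8, 7.7, 9.4, 11.2]  # hypothetical
n, B, alpha = len(data), 4_000, 0.05

reps = sorted(sum(random.choices(data, k=n)) / n for _ in range(B))
c_lo = reps[int((alpha / 2) * B)]            # bootstrap quantile at α/2
c_hi = reps[int((1 - alpha / 2) * B) - 1]    # bootstrap quantile at 1−α/2

theta_hat = sum(data) / n
```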

Jackknife

$T_n=T(X_1,\cdots,X_n)$, and let $T_{(-i)}$ denote the statistic with the $i$-th observation removed. Let

$$\bar{T_n}=\frac{1}{n}\sum_{i=1}^{n} T_{(-i)}$$

then

$$Var(T_n)\approx\frac{n-1}{n}\sum_{i=1}^{n}\big(T_{(-i)}-\bar{T_n}\big)^2$$
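The jackknife formula is deterministic, so it can be verified exactly; for the sample mean it reduces to $s^2/n$. A sketch with made-up data:

```python
data = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical sample
n = len(data)

# Leave-one-out replications T_{(-i)} of the mean.
loo = [(sum(data) - x) / (n - 1) for x in data]
loo_bar = sum(loo) / n
var_jack = (n - 1) / n * sum((t - loo_bar) ** 2 for t in loo)
# Here var_jack == 2.0, matching s²/n = 10/5 for this sample.
```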

Exponential Families

Applications: statistical physics, information geometry.
Definition:

Let $M$ be a measure on $R^n$, let $h:R^n\rightarrow R$ be a nonnegative function, and let $T_i,\ i=1,\cdots,s$ be measurable functions $R^n\rightarrow R$. For $\eta \in R^s$, define

$$A(\eta)=\log \int \exp\Big(\sum_{i=1}^{s}\eta_iT_i(x)\Big)\,h(x)\,dM(x)$$
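As a standard sanity check on this definition (a textbook example, not from the notes), the $N(\mu,\sigma^2)$ family can be written in this form with $s=2$:

```latex
% N(\mu,\sigma^2) as a two-parameter exponential family,
% with base measure M = Lebesgue measure on R and h(x) = 1/\sqrt{2\pi}:
f(x;\eta) = h(x)\exp\big(\eta_1 T_1(x) + \eta_2 T_2(x) - A(\eta)\big),\quad
T_1(x) = x,\quad T_2(x) = x^2,
\eta_1 = \frac{\mu}{\sigma^2},\quad \eta_2 = -\frac{1}{2\sigma^2},\quad
A(\eta) = -\frac{\eta_1^2}{4\eta_2} - \frac{1}{2}\log(-2\eta_2)
         = \frac{\mu^2}{2\sigma^2} + \log\sigma.
```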