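Statistical Thinking
Reference books:
All of Statistics, Principles of Statistical Inference, Computer Age Statistical Inference
history:
1. Classical statistics
2. Statistical machine learning
3. Data learning
heroes: Fisher, Neyman
Estimating the distribution function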

Plug-in
Linear statistical functionals

A short monograph:
The Jackknife, the Bootstrap and Other Resampling Plans
simulation
bootstrap
By the law of large numbers,

Hence, we can use the sample mean and the sample variance of the simulated values to approximate E(Y) and V(Y).
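As a minimal sketch of this idea, the simulated values can be summarized as below. The helper name `simulate_moments` and the example $Y=U^2$ with $U\sim\text{Uniform}(0,1)$ are illustrative assumptions, not taken from the notes.

```python
import random
import statistics

def simulate_moments(draw, n_sims=20_000, seed=0):
    """Approximate E(Y) and V(Y) from n_sims simulated values of Y."""
    rng = random.Random(seed)
    ys = [draw(rng) for _ in range(n_sims)]
    # Sample mean approximates E(Y); sample variance approximates V(Y).
    return statistics.fmean(ys), statistics.variance(ys)

# Hypothetical example: Y = U^2 with U ~ Uniform(0, 1),
# for which E(Y) = 1/3 and V(Y) = 1/5 - 1/9 = 4/45.
mean_hat, var_hat = simulate_moments(lambda rng: rng.random() ** 2)
```

Both approximations improve as `n_sims` grows, which is the law of large numbers at work.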
SGD (stochastic gradient descent)
Importance Sampling
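$X_1,\cdots,X_n\sim F$
$F$ (CDF), $f$ (pdf)
$$M=\int h(x)f(x)\,dx=\int h(x)\frac{f(x)}{g(x)}g(x)\,dx$$
Here $g$ can be chosen as a convenient density, such as a normal density, so that the integral can be viewed as an average over samples $x_i$ drawn from $g$:
$$\int h(x)\frac{f(x)}{g(x)}g(x)\,dx\approx\frac{1}{n}\sum_{i=1}^{n}h(x_i)\frac{f(x_i)}{g(x_i)}$$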
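The approximation above can be sketched in a few lines. The example is an assumption chosen for illustration: $f$ is the standard normal density, $h(x)=I(x>3)$, and $g$ is a normal density centered at 4, so that $M=P(Z>3)$. The function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def importance_sampling(h, f, g_pdf, g_draw, n=50_000, seed=0):
    """Estimate M = integral of h(x) f(x) dx by averaging h(x_i) f(x_i) / g(x_i), x_i ~ g."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = g_draw(rng)                      # sample from the proposal g
        total += h(x) * f(x) / g_pdf(x)      # weight by f / g
    return total / n

# Illustrative target: M = P(Z > 3) for Z ~ N(0, 1), proposal g = N(4, 1).
estimate = importance_sampling(
    h=lambda x: 1.0 if x > 3 else 0.0,
    f=normal_pdf,
    g_pdf=lambda x: normal_pdf(x, mu=4.0),
    g_draw=lambda rng: rng.gauss(4.0, 1.0),
)
exact = 0.5 * math.erfc(3 / math.sqrt(2))    # closed form for comparison
```

Sampling from $g$ puts most draws in the region $x>3$, where sampling directly from $f$ would almost never land.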
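The Sampling Importance Resampling (SIR)
- Sample candidates $Y_1,Y_2,\cdots,Y_n \sim g$
- Calculate the standardized importance weights $w(Y_i)=w_i^*/\sum_j w_j^*$, where $w_i^*=w^*(Y_i)=f(Y_i)/g(Y_i)$
- Resample $X_1,X_2,\cdots,X_n$ from $Y_1,Y_2,\cdots,Y_n$ with probabilities $w(Y_i)$
$$P(X\in A\mid Y_1,Y_2,\cdots,Y_n)=\sum_i I_{\{Y_i\in A\}}\frac{w_i^*}{\sum_j w_j^*}$$
By the strong law of large numbers, and since $\frac{1}{n}\sum_j w_j^*\to\int w^*(y)g(y)\,dy=1$, the expression above converges to
$$\int_A w^*(y)g(y)\,dy=\int_A f(y)\,dy$$
This argument shows that, when the sample size is sufficiently large, the $X_i$ are approximately drawn from $f$.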
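The three SIR steps can be sketched as follows. The target $f=N(0,1)$ and the wider proposal $g=N(0,2^2)$ are assumptions picked for illustration, and `sir` is a hypothetical helper name.

```python
import math
import random
import statistics

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def sir(f, g_pdf, g_draw, n_candidates=20_000, n_resample=5_000, seed=0):
    """Sampling importance resampling: draw from g, weight by f/g, resample."""
    rng = random.Random(seed)
    ys = [g_draw(rng) for _ in range(n_candidates)]        # step 1: candidates
    raw = [f(y) / g_pdf(y) for y in ys]                    # unstandardized weights
    total = sum(raw)
    probs = [w / total for w in raw]                       # step 2: standardized weights
    return rng.choices(ys, weights=probs, k=n_resample)    # step 3: resample with replacement

draws = sir(
    f=normal_pdf,
    g_pdf=lambda y: normal_pdf(y, sigma=2.0),
    g_draw=lambda rng: rng.gauss(0.0, 2.0),
)
sir_mean = statistics.fmean(draws)
sir_var = statistics.pvariance(draws)
```

With enough candidates, `sir_mean` and `sir_var` should sit close to 0 and 1, the moments of the target.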
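bootstrapping
We have $\theta=T(F)$ and $\hat\theta=T(\hat F)$.
We wish to estimate
$$R(x,F)=\frac{T(\hat F)-T(F)}{se\,T(\hat F)}$$
The $x$ in $R$ is generated from $F$; we simulate $R(x^*,\hat F)$ to approximate $R(x,F)$.
- Step 1: Estimate $R(x,F)$ with $R(x^*,\hat F)$
Usually it is difficult to calculate $R(x^*,\hat F)$ exactly, so we have step two.
- Step 2: Approximate $R(x^*,\hat F)$ by simulation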
Example: $n=3$, $x_1,x_2,x_3=1,2,6$ i.i.d. from $F$; estimate the mean $\theta$.
The vector $x^*$ can take $3^3=27$ values.
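| $x^*$ | $\hat\theta^*$ | $p^*(\hat\theta^*,\hat F)$ |
| --- | --- | --- |
| $(1,1,1)$ | $1$ | $\frac{1}{27}$ |
| $(1,1,2)$ | $\frac{4}{3}$ | $\frac{3}{27}$ |
| … | … | … |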
When $n$ is large, enumerating every resample as above is impractical; in that case it is enough to draw resamples at random.
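For the three-point example, both routes are small enough to run side by side. The sketch below enumerates all $3^3$ resamples exactly and then approximates the same distribution by random resampling; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
import random

data = [1, 2, 6]
n = len(data)

# Exact bootstrap distribution of the mean: all n**n resamples are equally likely.
exact = Counter()
for resample in product(data, repeat=n):
    exact[Fraction(sum(resample), n)] += Fraction(1, n ** n)

# Monte Carlo approximation: draw B resamples with replacement instead of enumerating.
rng = random.Random(0)
B = 20_000
counts = Counter()
for _ in range(B):
    resample = rng.choices(data, k=n)
    counts[Fraction(sum(resample), n)] += 1
approx = {value: count / B for value, count in counts.items()}
```

Here `exact[Fraction(1)]` is $1/27$ and `exact[Fraction(4, 3)]` is $3/27$, matching the enumeration, while `approx` only comes close to those values.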
Bootstrap Variance Estimation
Estimate $V_F(T_n)$ with $V_{\hat F}(T_n)$, which is in turn approximated by simulation.
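A minimal sketch of bootstrap variance estimation, assuming the statistic is any function of a list of observations. The helper name `bootstrap_variance` is hypothetical, and the data reuse the three-point example so that the answer can be checked against the exact value $14/9$ for the mean.

```python
import random
import statistics

def bootstrap_variance(data, statistic, B=2_000, seed=0):
    """Approximate V_{F-hat}(T_n) by the variance of B bootstrap replications."""
    rng = random.Random(seed)
    n = len(data)
    reps = [statistic(rng.choices(data, k=n)) for _ in range(B)]
    # pvariance divides by B, matching the 1/B convention for v_boot.
    return statistics.pvariance(reps)

# For the mean of [1, 2, 6], the exact bootstrap variance is (14/3) / 3 = 14/9.
v_boot = bootstrap_variance([1, 2, 6], statistics.fmean, B=20_000)
```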


March 9
Given data $X$ drawn from $F$, we want to study the distribution of $R(x,F)$. We use $R(x^*,\hat F)$ to estimate $R(x,F)$, where
$X^*=(X_1^*,\cdots,X_n^*)$
Parametric bootstrap
$X^*=(X_1^*,\cdots,X_n^*)$ is drawn from $F(x,\hat\theta)$
What we do is
$X=(X_1,\cdots,X_n)\to\hat\theta\to X^*$
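As a hedged illustration of this chain, the sketch below assumes an exponential model with rate $\theta$, whose maximum likelihood estimate is one over the sample mean. The model choice and the function name are assumptions made for the example only.

```python
import random
import statistics

def parametric_bootstrap_exponential(data, B=2_000, seed=0):
    """X -> theta-hat -> X*: fit the rate, then resample from the fitted model."""
    rng = random.Random(seed)
    n = len(data)
    rate_hat = 1.0 / statistics.fmean(data)           # MLE of the exponential rate
    reps = []
    for _ in range(B):
        resample = [rng.expovariate(rate_hat) for _ in range(n)]  # X* ~ F(x, theta-hat)
        reps.append(1.0 / statistics.fmean(resample))             # theta-hat* from X*
    return rate_hat, reps

# Simulated data with true rate 2.0 (an arbitrary choice for the demonstration).
data_rng = random.Random(1)
sample = [data_rng.expovariate(2.0) for _ in range(200)]
rate_hat, reps = parametric_bootstrap_exponential(sample)
se_boot = statistics.stdev(reps)
```

Unlike the nonparametric bootstrap, the resamples here come from the fitted distribution rather than from the observed values themselves.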
bootstrapping regression
$Y_i=X_i^T\beta+\epsilon_i,\ i=1,\cdots,n$
The $\epsilon_i$ are assumed to be i.i.d. with mean zero and constant variance.
$(X_i,Y_i)\to\hat\beta\to\hat Y_i\to\hat\epsilon_i=Y_i-\hat Y_i\to\epsilon^*\to Y^*\to\beta^*$
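The residual-resampling chain can be sketched for the simplest case, a single covariate with an intercept. The helpers `ols` and `residual_bootstrap` and the simulated data are illustrative assumptions.

```python
import random
import statistics

def ols(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def residual_bootstrap(xs, ys, B=1_000, seed=0):
    rng = random.Random(seed)
    b0, b1 = ols(xs, ys)                                   # beta-hat
    fitted = [b0 + b1 * x for x in xs]                     # Y-hat
    residuals = [y - f for y, f in zip(ys, fitted)]        # epsilon-hat
    reps = []
    for _ in range(B):
        eps_star = rng.choices(residuals, k=len(xs))       # epsilon*: resampled residuals
        y_star = [f + e for f, e in zip(fitted, eps_star)] # Y* = Y-hat + epsilon*
        reps.append(ols(xs, y_star))                       # beta*
    return (b0, b1), reps

# Simulated data: intercept 1, slope 2, noise sd 0.5 (arbitrary choices).
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]
(b0_hat, b1_hat), reps = residual_bootstrap(xs, ys)
slope_se = statistics.stdev([b1 for _, b1 in reps])
```

The covariates stay fixed across resamples; only the residuals are shuffled, which is what the i.i.d. error assumption justifies.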
Bootstrap Confidence Interval
Method 1. The Normal Intervals
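$$v_{boot}=\frac{1}{B}\sum_{b=1}^{B}\left(T_{n,b}^*-\bar T_n^*\right)^2,\qquad \bar T_n^*=\frac{1}{B}\sum_{b=1}^{B}T_{n,b}^*$$
Normal interval: $\left(T_n-z_{\frac{\alpha}{2}}\,\widehat{se}_{boot},\ T_n+z_{\frac{\alpha}{2}}\,\widehat{se}_{boot}\right)$, where $\widehat{se}_{boot}=\sqrt{v_{boot}}$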
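A minimal sketch of the normal interval, assuming the usual form $T_n\pm z_{\alpha/2}\,\widehat{se}_{boot}$ with the bootstrap standard error taken as the square root of $v_{boot}$. The function name and the simulated sample are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def bootstrap_normal_interval(data, statistic, alpha=0.05, B=2_000, seed=0):
    rng = random.Random(seed)
    n = len(data)
    t_n = statistic(data)
    reps = [statistic(rng.choices(data, k=n)) for _ in range(B)]
    se_boot = statistics.pstdev(reps)               # sqrt(v_boot), with the 1/B convention
    z = NormalDist().inv_cdf(1 - alpha / 2)         # upper alpha/2 normal quantile
    return t_n - z * se_boot, t_n + z * se_boot

# Illustrative data: 100 draws from N(0, 1); the statistic is the sample mean.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
lo, hi = bootstrap_normal_interval(sample, statistics.fmean)
```

This interval is only trustworthy when the distribution of $T_n$ is close to normal.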
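Method 2. Pivotal Intervals
Let $\theta=T(F)$ and $\hat\theta_n=T(\hat F)$.
Pivot: $R_n=\hat\theta_n-\theta$
Let $\hat\theta_{n,1}^*,\cdots,\hat\theta_{n,B}^*$ be the bootstrap replications of $\hat\theta_n$.
Let $\theta_\beta^*$ denote the $\beta$ sample quantile of $\hat\theta_{n,1}^*,\cdots,\hat\theta_{n,B}^*$.
Then the $1-\alpha$ bootstrap pivotal confidence interval is
$$C_n=\left(2\hat\theta_n-\theta^*_{1-\frac{\alpha}{2}},\ 2\hat\theta_n-\theta^*_{\frac{\alpha}{2}}\right)$$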
[Figure: proof of the pivotal interval]
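The proof itself is not reproduced in these notes. One standard argument, sketched here under the assumption that the distribution function $H$ of the pivot $R_n$ is continuous and invertible, runs roughly as follows. The symbols $H$, $\hat H$, $a$, $b$, $r^*_\beta$ and the replication index $j$ are introduced only for this sketch.

$$H(r)=P_F(R_n\le r),\qquad a=\hat\theta_n-H^{-1}\left(1-\tfrac{\alpha}{2}\right),\qquad b=\hat\theta_n-H^{-1}\left(\tfrac{\alpha}{2}\right)$$

$$P(a\le\theta\le b)=P\left(\hat\theta_n-b\le R_n\le\hat\theta_n-a\right)=H\left(H^{-1}\left(1-\tfrac{\alpha}{2}\right)\right)-H\left(H^{-1}\left(\tfrac{\alpha}{2}\right)\right)=1-\alpha$$

Since $H$ is unknown, it is replaced by its bootstrap estimate:

$$\hat H(r)=\frac{1}{B}\sum_{j=1}^{B}I\left(R^*_{n,j}\le r\right),\qquad R^*_{n,j}=\hat\theta^*_{n,j}-\hat\theta_n$$

The $\beta$ sample quantile of the $R^*_{n,j}$ is $r^*_\beta=\theta^*_\beta-\hat\theta_n$, so

$$\hat a=\hat\theta_n-r^*_{1-\frac{\alpha}{2}}=2\hat\theta_n-\theta^*_{1-\frac{\alpha}{2}},\qquad \hat b=\hat\theta_n-r^*_{\frac{\alpha}{2}}=2\hat\theta_n-\theta^*_{\frac{\alpha}{2}}$$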


More generally, we could define the pivot as
$$R_n=\frac{\phi(\hat\theta_n)-\phi(\theta)}{1+a\phi(\theta)}+b$$
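Method 3. Percentile Interval
$C_n=\left(\theta^*_{\frac{\alpha}{2}},\ \theta^*_{1-\frac{\alpha}{2}}\right)$, where $\phi$ is continuous and strictly increasing, and the distribution function $H$ of $\phi(\hat\theta)-\phi(\theta)$, with quantiles $h_\beta$, is symmetric.
Purpose: transform the distribution $F$ into $G$.
$$1-\alpha=P^*\left(h_{\frac{\alpha}{2}}\le\phi(\hat\theta^*)-\phi(\hat\theta)\le h_{1-\frac{\alpha}{2}}\right)=P^*\left(\phi^{-1}\left(h_{\frac{\alpha}{2}}+\phi(\hat\theta)\right)\le\hat\theta^*\le\phi^{-1}\left(h_{1-\frac{\alpha}{2}}+\phi(\hat\theta)\right)\right)$$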
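The percentile and pivotal intervals use the same two bootstrap quantiles, only arranged differently. The sketch below computes both from one set of replications; the simple order-statistic quantile rule, the function names, and the exponential sample are assumptions made for illustration.

```python
import random
import statistics

def quantile(sorted_values, beta):
    """Empirical beta-quantile via a simple order-statistic rule (one of several conventions)."""
    idx = min(len(sorted_values) - 1, max(0, int(beta * len(sorted_values))))
    return sorted_values[idx]

def bootstrap_intervals(data, statistic, alpha=0.05, B=4_000, seed=0):
    rng = random.Random(seed)
    n = len(data)
    theta_hat = statistic(data)
    reps = sorted(statistic(rng.choices(data, k=n)) for _ in range(B))
    q_lo = quantile(reps, alpha / 2)         # theta*_{alpha/2}
    q_hi = quantile(reps, 1 - alpha / 2)     # theta*_{1 - alpha/2}
    percentile = (q_lo, q_hi)
    pivotal = (2 * theta_hat - q_hi, 2 * theta_hat - q_lo)
    return theta_hat, percentile, pivotal

# Illustrative skewed data: 50 exponential draws; the statistic is the median.
data_rng = random.Random(1)
sample = [data_rng.expovariate(1.0) for _ in range(50)]
theta_hat, percentile_ci, pivotal_ci = bootstrap_intervals(sample, statistics.median)
```

The two intervals always have the same width; the pivotal interval is the percentile interval reflected about $\hat\theta$, so they differ only when the bootstrap distribution is asymmetric around $\hat\theta$.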
Jackknife
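Let $T_n=T(X_1,\cdots,X_n)$ and let $T_{(-i)}$ denote the statistic with the $i$-th observation removed. Let
$$\bar T_n=\frac{1}{n}\sum_{i=1}^{n}T_{(-i)}$$
Then
$$Var(T_n)\approx\frac{n-1}{n}\sum_{i=1}^{n}\left(T_{(-i)}-\bar T_n\right)^2$$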
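The jackknife formula translates almost directly into code. In this sketch the helper name is hypothetical and the data are the three points 1, 2, 6; for the sample mean the jackknife variance equals $s^2/n$ exactly, which gives a convenient check.

```python
import statistics

def jackknife_variance(data, statistic):
    """Jackknife estimate of Var(T_n) from the n leave-one-out statistics."""
    n = len(data)
    # T_(-i): the statistic recomputed with the i-th observation removed.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    t_bar = statistics.fmean(leave_one_out)
    return (n - 1) / n * sum((t - t_bar) ** 2 for t in leave_one_out)

# For the mean of [1, 2, 6]: s^2 = 7, so the jackknife variance is 7/3.
v_jack = jackknife_variance([1, 2, 6], statistics.fmean)
```

Unlike the bootstrap, this needs exactly $n$ recomputations and no random number generator.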
Exponential Families
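Settings: statistical physics, information geometry
Def:
Let $M$ be a measure on $\mathbb{R}^n$, let $h:\mathbb{R}^n\to\mathbb{R}$ be a nonnegative function, and let $T_i,\ i=1,\cdots,s$ be measurable functions $\mathbb{R}^n\to\mathbb{R}$. For $\eta\in\mathbb{R}^s$, define
$$A(\eta)=\log\int\exp\left(\sum_{i=1}^{s}\eta_iT_i(x)\right)h(x)\,dM(x)$$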
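To make the definition concrete, here is a numerical sketch under stated assumptions: the integral is taken against $h(x)\,dM(x)$ (the usual convention), $M$ is Lebesgue measure on $\mathbb{R}$, $h$ is the standard normal density, $s=1$ and $T_1(x)=x$. In that case $A(\eta)=\eta^2/2$ in closed form. The function name and the integration limits are illustrative choices.

```python
import math

def log_partition(eta, T, h, lower=-12.0, upper=12.0, steps=20_000):
    """Midpoint-rule approximation of A(eta) = log of the integral of exp(sum_i eta_i T_i(x)) h(x) dx."""
    dx = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * dx
        exponent = sum(e * t(x) for e, t in zip(eta, T))
        total += math.exp(exponent) * h(x) * dx
    return math.log(total)

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

# With h the standard normal density and T(x) = x, A(eta) = eta^2 / 2.
a_numeric = log_partition([0.7], [lambda x: x], std_normal_pdf)
a_closed_form = 0.7 ** 2 / 2
```

The truncation to $[-12, 12]$ is harmless here because the integrand is negligible outside that range for moderate $\eta$; other choices of $h$ or $T$ may need different limits.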