
一方の笔记本

The only source of knowledge is experience.

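This post summarizes integral computations that arise in probability theory.

Probability density function of \(Z=g(X,Y)\)

Let \((X,Y)\) be a two-dimensional random variable with joint probability density function \(f(x,y)\). Find the probability density function of \(Z=g(X,Y)\).

Probability density function of \(Z=X+Y\)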

\(\displaystyle F_{Z}(z)=P\{Z \leq z\} = P\{X+Y \leq z\} = \iint_{x+y\leq z} f(x,y)dxdy = \int_{-\infty}^{+\infty}dx \int_{-\infty}^{z-x}f(x,y)dy\)

\(\displaystyle\xlongequal{y+x=u}\int_{-\infty}^{+\infty}dx\int_{-\infty}^{z}f(x,u-x)du=\int_{-\infty}^{z}du\int_{-\infty}^{+\infty}f(x,u-x)dx\)

In fact, \(\displaystyle \int_{-\infty}^{+\infty}f(x,u-x)dx\) is a function of \(u\) alone, so differentiating with respect to \(z\) gives:

\[\displaystyle f_{Z}(z) = [F_{Z}(z)]_{z}'=\int_{-\infty}^{+\infty}f(x,z-x)dx=\int_{-\infty}^{+\infty}f(z-y,y)dy\]

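Moreover, the convolution \(f * g\) of two functions \(f(t)\) and \(g(t)\) is defined as \(\displaystyle(f * g)(t) = \int_{-\infty}^{+\infty}f(t - \tau)g(\tau)d\tau\). When \(X\) and \(Y\) are independent, \(f(x,y)=f_X(x)f_Y(y)\), so \(f_Z = f_X * f_Y\).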
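As a rough numerical illustration of the convolution formula, the sketch below approximates \(f_Z(z)\) for \(Z=X+Y\) with the midpoint rule. It assumes, purely for the example, that \(X\) and \(Y\) are independent \(U(0,1)\) variables, so the known answer is the triangular density. The helper names `uniform_pdf`, `sum_density`, and `triangular_pdf` are hypothetical and not part of the original post.

```python
def uniform_pdf(x):
    """Density of U(0, 1) (assumed example distribution)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def sum_density(z, f_x=uniform_pdf, f_y=uniform_pdf, lo=-1.0, hi=2.0, steps=30000):
    """Approximate f_Z(z) = integral of f_X(x) * f_Y(z - x) dx by the midpoint rule.

    Independence of X and Y is assumed, so f(x, z - x) = f_X(x) * f_Y(z - x).
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f_x(x) * f_y(z - x)
    return total * h


def triangular_pdf(z):
    """Known density of the sum of two independent U(0, 1) variables."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0


if __name__ == "__main__":
    for z in (0.25, 0.5, 1.0, 1.5):
        print(z, sum_density(z), triangular_pdf(z))
```

The two printed columns should agree up to the discretization error of the midpoint rule.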

Probability density function of \(Z=X-Y\)

By convention, when \(g(X,Y)\) is not symmetric in \(X\) and \(Y\), integrate with respect to \(x\) first:

\(\displaystyle F_{Z}(z)=P\{Z \leq z\} = P\{X-Y \leq z\} = \iint_{x-y\leq z} f(x,y)dxdy = \int_{-\infty}^{+\infty}dy\int_{-\infty}^{y+z}f(x,y)dx\)

\(\displaystyle\xlongequal{x-y=u}\int_{-\infty}^{+\infty}dy\int_{-\infty}^{z}f(u+y,y)du=\int_{-\infty}^{z}du\int_{-\infty}^{+\infty}f(u+y,y)dy\)

Similarly:

\[\displaystyle f_{Z}(z) = [F_{Z}(z)]_{z}'= \int_{-\infty}^{+\infty}f(z+y,y)dy\]

Probability density function of \(Z=XY\)

\[\displaystyle F_{Z}(z)=P\{Z \leq z\} = P\{XY \leq z\} = \iint_{xy\leq z} f(x,y)dxdy\]

The integral above requires a case analysis. Consider the case \(z>0\) first:

\(\displaystyle F_{Z}(z) = \int_{-\infty}^{0}dx\int_{\frac{z}{x}}^{+\infty}f(x,y)dy+\int_{0}^{+\infty}dx\int_{-\infty}^{\frac{z}{x}}f(x,y)dy\)

\(\displaystyle=\int_{-\infty}^{0}dx\int_{z}^{-\infty}f(x,\frac{u}{x})\frac{1}{x}du + \int_{0}^{+\infty}dx\int_{-\infty}^{z}f(x,\frac{u}{x})\frac{1}{x}du=\int_{-\infty}^{z}du\int_{-\infty}^{+\infty}f(x,\frac{u}{x})\frac{1}{|x|}dx\)

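When substituting \(u=xy\) in the integral above, pay close attention to the signs of the infinite limits of integration. The case \(z<0\) is handled in the same way, and the final result is: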

\[f_{Z}(z) = [F_{Z}(z)]_{z}'= \displaystyle \int_{-\infty}^{+\infty}f(x,\frac{z}{x})\frac{1}{|x|}dx=\int_{-\infty}^{+\infty}f(\frac{z}{y},y)\frac{1}{|y|}dy\]
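The product formula can be sanity-checked numerically. The sketch below assumes, only for illustration, that \(X\) and \(Y\) are independent \(U(0,1)\) variables, in which case \(f_Z(z)=-\ln z\) for \(0<z<1\). The names `joint_pdf` and `product_density` are hypothetical.

```python
import math


def joint_pdf(x, y):
    """Joint density of two independent U(0, 1) variables (assumed example)."""
    return 1.0 if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0


def product_density(z, steps=100000):
    """Midpoint-rule approximation of the integral of f(x, z/x) / |x| dx.

    The joint density vanishes outside 0 < x < 1, so only that interval is scanned.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += joint_pdf(x, z / x) / abs(x)
    return total * h


if __name__ == "__main__":
    for z in (0.1, 0.5, 0.9):
        print(z, product_density(z), -math.log(z))
```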

Probability density function of \(Z=\displaystyle\frac{X}{Y}\)

\(\displaystyle F_{Z}(z)=P\{Z \leq z\} = P\{\frac{X}{Y} \leq z\} = \iint_{\frac{x}{y}\leq z} f(x,y)dxdy\)

The integral above requires a case analysis. Consider the case \(z>0\) first:

\(\displaystyle F_{Z}(z) = \int_{-\infty}^{0}dy\int_{yz}^{+\infty}f(x,y)dx+\int_{0}^{+\infty}dy\int_{-\infty}^{yz}f(x,y)dx\)

\(\displaystyle = \int_{-\infty}^{0}dy\int_{z}^{-\infty}f(uy,y)ydu+\int_{0}^{+\infty}dy\int_{-\infty}^{z}f(uy,y)ydu=\int_{-\infty}^{z}du\int_{-\infty}^{+\infty}f(uy,y)|y|dy\)

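When substituting \(\displaystyle u=\frac{x}{y}\) in the integral above, pay close attention to the signs of the infinite limits of integration. The case \(z<0\) is handled in the same way, and the final result is: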

\[f_{Z}(z) = [F_{Z}(z)]_{z}'= \int_{-\infty}^{+\infty}f(yz,y)|y|dy\]

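General method

Following the reference 17分钟搞定卷积公式, the density can also be found by a change of variables in the double integral.

Rewrite \(z = g(x,y)\) equivalently as \(x = x(y,z)\) and apply the transformation \(\left\{\begin{matrix} x = x(y,z) \\ y = y \end{matrix}\right.\). The region \(D\) in the \(xOy\) plane is then mapped to a region \(D'\) in the \(yOz\) plane, and: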

\[\iint_{D} f(x,y) dxdy = \iint_{D'}f(x(y,z),y)|J|dydz\]

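Here \(J = \left|\begin{matrix} \displaystyle \frac{\partial x}{\partial y} & \displaystyle \frac{\partial x}{\partial z} \\ \displaystyle \frac{\partial y}{\partial y} & \displaystyle \frac{\partial y}{\partial z} \end{matrix}\right| = -\displaystyle\frac{\partial x}{\partial z}\) is the Jacobian determinant, so \(|J| = |\frac{\partial x}{\partial z}|\).

Taking \(D=R^2\) gives: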

\[\iint_{D'}f(x(y,z),y)|\frac{\partial x}{\partial z}|dydz=1\]

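Hence \(\displaystyle f(x(y,z),y)|\frac{\partial x}{\partial z}|\) is the joint probability density of the two-dimensional random variable \((Y,Z)\), and integrating out \(y\) gives the marginal density: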

\[\displaystyle f_{Z}(z) = \int_{-\infty}^{+\infty}f(x(y,z),y)|\frac{\partial x}{\partial z}|dy\]
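As a quick consistency check (a worked illustration added here, not part of the referenced material), the general formula reproduces the three earlier results. For \(Z=X-Y\), solving for \(x\) gives \(x=z+y\) and \(\frac{\partial x}{\partial z}=1\). For \(Z=XY\), \(x=\frac{z}{y}\) and \(\frac{\partial x}{\partial z}=\frac{1}{y}\) (for \(y\neq 0\)). For \(Z=\frac{X}{Y}\), \(x=yz\) and \(\frac{\partial x}{\partial z}=y\). Substituting each into the formula:

\[\displaystyle f_{X-Y}(z)=\int_{-\infty}^{+\infty}f(z+y,y)dy,\quad f_{XY}(z)=\int_{-\infty}^{+\infty}f(\frac{z}{y},y)\frac{1}{|y|}dy,\quad f_{X/Y}(z)=\int_{-\infty}^{+\infty}f(yz,y)|y|dy\]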

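Probability mass/density functions, expectations, variances, and relationships of common univariate distributions

Derivations of the expectations and variances of the univariate distributions in the table above

When computing, make good use of standard integral values, term splitting, and the variance identity \(D(X) = E(X^2)-[E(X)]^2\).

Bernoulli distribution

The results follow directly from the definition: for \(X\sim B(1,p)\), \(E(X)=p\) and \(D(X)=p(1-p)\).

Binomial distribution

Let \(X\sim B(n,p)\); then: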

\(E(X) = \displaystyle \sum_{i}x_i P\{X = x_i\}=\sum_{i=0}^{n}i C_{n}^{i}p^i(1-p)^{n-i} = \sum_{i=1}^{n}i \frac{n!}{i!(n-i)!} p^i(1-p)^{n-i}\)

\(\displaystyle= np\sum_{i=1}^{n}\frac{(n-1)!}{(i-1)!(n-i)!}p^{i-1}(1-p)^{n-i} \xlongequal{j = i - 1} np\sum_{j=0}^{n-1}C_{n-1}^{j}p^j(1-p)^{n-1-j}=np\)

To compute \(E(X^2)\), use the identity \(i^2=i(i-1) + i\):

\(E(X^2) = \displaystyle \sum_{i}x_i^2 P\{X = x_i\}=\sum_{i=0}^{n}i^2 C_{n}^{i}p^i(1-p)^{n-i} = \sum_{i=1}^{n}i C_{n}^{i}p^{i}(1-p)^{n-i} + \sum_{i=2}^{n}\frac{i(i-1)n!}{i!(n-i)!}p^i(1-p)^{n-i}\)

\(= E(X) + n(n-1)p^2\)

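Therefore \(D(X) = E(X^2)-[E(X)]^2 = np(1-p)\).

There is in fact another approach. Define the random variables \(X_i=\left\{\begin{matrix}1,\ \text{the } i\text{-th trial succeeds} \\ 0,\ \text{otherwise} \end{matrix}\right. ,i \in \{1,2,\cdots,n\}\). The \(X_i\) are independent and identically distributed with \(X_1 \sim B(1,p)\), and \(X=\sum_{i=1}^{n}X_i\). By the Bernoulli results, the linearity of expectation, and the additivity of variance for independent variables, \(E(X) = np\) and \(D(X)=np(1-p)\).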
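The binomial results can be checked exactly by summing the probability mass function with rational arithmetic. This is a minimal sketch; the function name `binomial_moments` is hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import comb


def binomial_moments(n, p):
    """Return (E(X), D(X)) for X ~ B(n, p), summed exactly from the pmf."""
    p = Fraction(p)
    q = 1 - p
    pmf = [comb(n, i) * p**i * q**(n - i) for i in range(n + 1)]
    mean = sum(i * w for i, w in enumerate(pmf))
    second = sum(i * i * w for i, w in enumerate(pmf))
    # Variance identity D(X) = E(X^2) - [E(X)]^2.
    return mean, second - mean**2


if __name__ == "__main__":
    n, p = 10, Fraction(3, 10)
    print(binomial_moments(n, p), (n * p, n * p * (1 - p)))
```

Because `Fraction` is exact, the computed pair matches \(np\) and \(np(1-p)\) with no rounding error.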

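Pascal distribution and geometric distribution

Since the geometric distribution is a special case of the Pascal distribution, it suffices to find the expectation and variance of the Pascal distribution. Let \(X\sim NB(r,p)\); then, using \(kC_{k-1}^{r-1}=rC_{k}^{r}\) and shifting the summation index by one:

\[E(X) = \displaystyle \sum_{k=r}^{\infty}kC_{k-1}^{r-1}p^r(1-p)^{k-r}=\frac{r}{p}\sum_{k=r+1}^{\infty}C_{k-1}^{(r+1)-1}p^{r+1}(1-p)^{k-(r+1)}=\frac{r}{p}\]

\(E(X^2)\) can be found in the same way: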

\(E(X^2) = \displaystyle \sum_{k=r}^{\infty}k^2C_{k-1}^{r-1}p^r(1-p)^{k-r}=\sum_{k=r}^{\infty}(k+1)kC_{k-1}^{r-1}p^r(1-p)^{k-r} - E(X)\)

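\(\displaystyle =\frac{r(r+1)}{p^2} \sum_{k=r+2}^{\infty} C_{k-1}^{(r+2)-1}p^{r+2}(1-p)^{k-(r+2)} - E(X) =\frac{r(r+1)}{p^2}-\frac{r}{p}=\frac{r^2 + (1-p)r}{p^2} \Rightarrow D(X) = \frac{r(1-p)}{p^2}\)

Alternatively, the same results can be obtained using power series.

Setting \(r=1\) shows that the geometric distribution has expectation \(\displaystyle \frac{1}{p}\) and variance \(\displaystyle \frac{1-p}{p^2}\).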
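A truncated series gives a quick numerical check of the Pascal results. The sketch below uses floating-point arithmetic and a fixed number of terms, which is an assumption that works for moderate \(r\) and \(p\) not too close to 0. The name `pascal_moments` is hypothetical.

```python
from math import comb


def pascal_moments(r, p, terms=2000):
    """Approximate (E(X), D(X)) for X ~ NB(r, p) by truncating the series.

    X counts the trials needed for the r-th success, so k starts at r.
    """
    q = 1.0 - p
    mean = 0.0
    second = 0.0
    for k in range(r, r + terms):
        w = comb(k - 1, r - 1) * p**r * q**(k - r)  # P{X = k}
        mean += k * w
        second += k * k * w
    return mean, second - mean**2


if __name__ == "__main__":
    r, p = 3, 0.4
    print(pascal_moments(r, p), (r / p, r * (1 - p) / p**2))
    # r = 1 is the geometric distribution.
    print(pascal_moments(1, 0.25), (1 / 0.25, 0.75 / 0.25**2))
```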

Poisson distribution

Let \(X\sim P(\lambda)\); then:

\[E(X) = \displaystyle\sum_{n=0}^{\infty}n\frac{\lambda^n}{n!}e^{-\lambda}=\lambda \sum_{n-1=0}^{\infty}\frac{\lambda^{n-1}}{(n-1)!}e^{-\lambda}=\lambda\]

\[E(X^2)=\displaystyle\sum_{n=0}^{\infty}n^2\frac{\lambda^n}{n!}e^{-\lambda}=\sum_{n-2=0}^{\infty}\frac{\lambda^n}{(n-2)!}e^{-\lambda}+E(X)=\lambda^2+\lambda \Rightarrow D(X) = \lambda\]
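The Poisson sums converge quickly, so a short truncated series is enough to check \(E(X)=D(X)=\lambda\) numerically. This is only a sketch; `poisson_moments` is a hypothetical helper, and the cutoff of 200 terms is an assumption suited to small \(\lambda\).

```python
import math


def poisson_moments(lam, terms=200):
    """Approximate (E(X), D(X)) for X ~ P(lam) from a truncated series."""
    w = math.exp(-lam)  # P{X = 0}
    mean = 0.0
    second = 0.0
    for n in range(1, terms):
        w *= lam / n  # P{X = n} obtained from P{X = n - 1}
        mean += n * w
        second += n * n * w
    return mean, second - mean**2


if __name__ == "__main__":
    print(poisson_moments(3.5))
```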

Uniform distribution

Let \(X\sim U(a,b)\); then:

\[E(X) = \displaystyle \frac{1}{b-a}\int_{a}^{b}xdx=\frac{a+b}{2}\]

\[E(X^2) = \displaystyle \frac{1}{b-a} \int_{a}^{b}x^2dx=\frac{a^2+ab+b^2}{3} \]

\[D(X) = E(X^2)-[E(X)]^2=\displaystyle \frac{(b-a)^2}{12}\]

Normal distribution

Let \(X\sim N(\mu,\sigma^2)\); then:

\[\displaystyle E(X) = \int_{-\infty}^{+\infty}xf_{X}(x)dx=\int_{-\infty}^{+\infty}\frac{x}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}dx\xlongequal{t=\frac{x-\mu}{\sigma}}\int_{-\infty}^{+\infty}\frac{\mu+\sigma t}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt=\mu\]

Similarly:

\[\displaystyle E(X^2) = \int_{-\infty}^{+\infty}x^2f_{X}(x)dx=\int_{-\infty}^{+\infty}\frac{x^2}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}dx\xlongequal{t=\frac{x-\mu}{\sigma}}\int_{-\infty}^{+\infty}\frac{(\mu+\sigma t)^2}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt=\mu^2+\sigma^2\]

In summary: \(E(X)=\mu,D(X)=\sigma^2\)

The following standard results are useful when evaluating integrals related to the normal distribution.

\(\displaystyle \int_{0}^{+\infty}x^ne^{-\lambda x}dx = \frac{n!}{\lambda^{n+1}}, n\in \text{N}, \lambda > 0\)

\(\displaystyle \int_{-\infty}^{+\infty}\frac{x^n}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx=\left\{\begin{matrix}(n-1)!!,\ n\text{ is even} \\ 0,\ n\text{ is odd} \end{matrix}\right.\)
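The double-factorial formula for the moments of the standard normal distribution can be checked numerically. The sketch below truncates the integral to \([-12, 12]\) and uses the midpoint rule; both choices are assumptions made for the example, and the helper names are hypothetical.

```python
import math


def double_factorial(n):
    """n!! for n >= -1, with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def normal_moment(n, half_width=12.0, steps=24000):
    """Midpoint-rule approximation of E(X^n) for X ~ N(0, 1) on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += x**n * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for n in range(7):
        expected = double_factorial(n - 1) if n % 2 == 0 else 0
        print(n, normal_moment(n), expected)
```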

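Exponential distribution

Let \(X\sim E(\lambda)\). Using the properties of improper integrals involving \(e^{-x}\), it follows easily that \(\displaystyle E(X) = \frac{1}{\lambda}, E(X^2)=\frac{2}{\lambda^2}\Rightarrow D(X) = \frac{1}{\lambda^2}\).

Rayleigh distribution

Let \(X\sim R(\sigma)\), with density \(\displaystyle f(x)=\frac{x}{\sigma^2}e^{-\frac{x^2}{2\sigma^2}}\) for \(x>0\). Using the \(n\)-th raw moments of the standard normal distribution: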

\[E(X) = \displaystyle \sqrt{2\pi} \sigma \int_{0}^{+\infty}(\frac{x}{\sigma})^2\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x}{\sigma})^2}d(\frac{x}{\sigma}) = \sqrt{\frac{\pi}{2}}\sigma\]

Then, using the properties of improper integrals involving \(e^{-x}\):

\[E(X^2)=\displaystyle \int_{0}^{+\infty}\frac{x^2}{\sigma^2}e^{-\frac{x^2}{2\sigma^2}}d(\frac{x^2}{2}) \xlongequal{t=x^2}\frac{1}{2\sigma^2}\int_{0}^{+\infty}t e^{-\frac{t}{2\sigma^2}}dt=2\sigma^2\]

Finally, \(D(X) = E(X^2)-[E(X)]^2 = \displaystyle \frac{4-\pi}{2}\sigma^2\).
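The Rayleigh moments can be confirmed by direct numerical integration of the density \(\frac{x}{\sigma^2}e^{-\frac{x^2}{2\sigma^2}}\). In the sketch below, the upper cutoff \(12\sigma\) and the midpoint rule are assumptions made for the example, and `rayleigh_moments` is a hypothetical name.

```python
import math


def rayleigh_moments(sigma, steps=100000):
    """Approximate (E(X), D(X)) for the Rayleigh density on [0, 12 * sigma]."""
    upper = 12.0 * sigma
    h = upper / steps
    mean = 0.0
    second = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        pdf = x / sigma**2 * math.exp(-x * x / (2.0 * sigma**2))
        mean += x * pdf
        second += x * x * pdf
    mean *= h
    second *= h
    return mean, second - mean**2


if __name__ == "__main__":
    sigma = 2.0
    print(rayleigh_moments(sigma))
    print(math.sqrt(math.pi / 2.0) * sigma, (4.0 - math.pi) / 2.0 * sigma**2)
```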

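Relationships between univariate distributions

Clearly, the Bernoulli distribution is a special case of the binomial distribution (\(n=1\)), and the geometric distribution is a special case of the Pascal distribution (\(r=1\)).

If two independent and identically distributed random variables \(X,Y\) both follow \(N(0,\sigma^2)\), then \(U = \displaystyle \frac{X}{Y}\) follows a Cauchy distribution and \(V=\displaystyle \sqrt{X^2+Y^2}\) follows a Rayleigh distribution. The proof follows.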

\[f_{U}(u) = \displaystyle \int_{-\infty}^{+\infty}f(uy,y)|y|dy=\int_{-\infty}^{+\infty}\frac{1}{2\pi \sigma^2} e^{-\frac{(1+u^2)y^2}{2\sigma^2}} |y|dy \xlongequal{t=\frac{y}{\frac{\sigma}{\sqrt{1+u^2}}}}\frac{1}{\pi} \frac{1}{1+u^2}\frac{1}{2}\int_{-\infty}^{+\infty}|t|e^{-\frac{t^2}{2}}dt=\frac{1}{\pi} \frac{1}{1+u^2}\]

The expression above shows that \(U\sim C(1,0)\).

\(\displaystyle F_{V}(v) = P\{V \leq v\} = P\{\sqrt{X^2+Y^2} \leq v\} = \iint_{\sqrt{x^2+y^2} \leq v}f(x,y)dxdy\)

Clearly \(F_{V}(v) = 0\) when \(v\leq 0\). When \(v > 0\), switching to polar coordinates gives:

\(\displaystyle F_{V}(v) = \frac{1}{2\pi \sigma^2} \int_{0}^{2\pi}d\theta \int_{0}^{v} e^{-\frac{\rho^2}{2\sigma^2}} \rho d\rho=\frac{1}{\sigma^2}\int_{0}^{v}\rho e^{-\frac{\rho^2}{2\sigma^2}} d\rho \Rightarrow f_{V}(v) = \frac{v}{\sigma^2}e^{-\frac{v^2}{2\sigma^2}}\)

Hence \(V\sim R(\sigma)\).
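A small Monte Carlo experiment can illustrate both claims. Integrating the two densities derived above gives \(P\{U\leq 1\}=\frac{1}{2}+\frac{\arctan 1}{\pi}=0.75\) and \(P\{V\leq\sigma\}=1-e^{-\frac{1}{2}}\). The sketch below estimates both probabilities by simulation. The sample size, seed, and function name `simulate_ratio_and_norm` are assumptions chosen for the example.

```python
import math
import random


def simulate_ratio_and_norm(sigma=1.5, n=200000, seed=0):
    """Estimate P(U <= 1) and P(V <= sigma) for U = X/Y and V = sqrt(X^2 + Y^2).

    X and Y are drawn independently from N(0, sigma^2).
    """
    rng = random.Random(seed)
    ratio_hits = 0
    norm_hits = 0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        y = rng.gauss(0.0, sigma)
        # Skip the (practically impossible) case y == 0 to avoid division by zero.
        if y != 0.0 and x / y <= 1.0:
            ratio_hits += 1
        if math.hypot(x, y) <= sigma:
            norm_hits += 1
    return ratio_hits / n, norm_hits / n


if __name__ == "__main__":
    print(simulate_ratio_and_norm())
    print(0.75, 1.0 - math.exp(-0.5))
```

The estimates fluctuate with the seed, so agreement is only expected up to Monte Carlo error.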

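Derivation of properties of the normal distribution

Properties of the one-dimensional normal distribution

Let \(X\sim N(\mu,\sigma^2)\); then \(Y=aX+b\sim N(a\mu + b, (a\sigma)^2)\) for \(a \neq 0\). This property is easy to prove. When \(a>0\):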

\[F_{Y}(y) = P\{Y \leq y\} = P\{aX + b \leq y\} = P\{\displaystyle X \leq \frac{y-b}{a} \}=F_{X}(\frac{y-b}{a})\]

Then:

\[f_{Y}(y) = [F_{Y}(y)]'_y = \frac{1}{a}f_{X}(\frac{y-b}{a}) = \displaystyle \frac{1}{\sqrt{2\pi}a\sigma}e^{-\frac{[y-(a\mu+b)]^2}{2(a\sigma)^2}}\]

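Clearly \(Y\sim N(a\mu + b, a^2\sigma^2)\). The case \(a < 0\) is similar.

Properties of the two-dimensional normal distribution

Let the two-dimensional random variable \((X,Y)\sim N(\mu_1,\mu_2;\sigma_1^2,\sigma_2^2,\rho)\); then:

  • \(X\sim N(\mu_1, \sigma_1^2)\) and \(Y \sim N(\mu_2, \sigma_2^2)\);
  • \(X\) and \(Y\) are uncorrelated \(\Leftrightarrow X\) and \(Y\) are independent;
  • \(\rho_{XY}=\rho\);
  • if \(Z = aX+bY+c\), then \(Z\sim N(a\mu_1+b\mu_2+c, a^2\sigma_1^2 + 2\rho ab\sigma_1\sigma_2 + b^2\sigma_2^2)\).

The probability density function of a two-dimensional normal random variable is shown below.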

\[f(x,y)=\displaystyle \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp{\{-\frac{1}{2(1-\rho^2)}[\frac{(x-\mu_1)^2}{\sigma_1^2}-\frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}+\frac{(y-\mu_2)^2}{\sigma_2^2}]\}}\]

First, find the probability density function of the random variable \(X\) by completing the square in \(y\):

\[\displaystyle \int_{-\infty}^{+\infty} f(x,y)dy=\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi(1-\rho^2)}\sigma_2}\exp{\displaystyle\{-\frac{1}{2(1-\rho^2)}[(\frac{y-\mu_2}{\sigma_2}-\rho\frac{x-\mu_1}{\sigma_1})^2]\}}dy\]

Let \(\displaystyle\frac{y-\mu_2}{\sigma_2}-\rho\frac{x-\mu_1}{\sigma_1}=t\), so that \(dy=\sigma_2dt\); then:

\[\displaystyle \int_{-\infty}^{+\infty} f(x,y)dy=\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}}e^{-\frac{t^2}{2(\sqrt{1-\rho^2})^2}}dt=\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\]

Thus the probability density function of \(X\) is \(f_{X}(x)=\displaystyle \frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\), and similarly \(\displaystyle f_{Y}(y)=\frac{1}{\sqrt{2\pi}\sigma_2}e^{-\frac{(y-\mu_2)^2}{2\sigma_2^2}}\). It follows that \(f_X(x)\cdot f_{Y}(y)=f(x,y) \Leftrightarrow\rho =0\).

It is worth examining \(\rho\) more closely: \(\rho\) is in fact the correlation coefficient of \(X\) and \(Y\). For general random variables, the correlation coefficient is \(\rho_{XY}=\displaystyle\frac{\text{Cov}(X,Y)}{\sqrt{D(X)D(Y)}}\), and since \(\text{Cov}(X,Y)=E(XY)-E(X)E(Y)\), it can be computed directly.

First prove that when \((X,Y)\sim N(\mu_1,\mu_2;\sigma_1^2,\sigma_2^2,\rho)\), \(E(X)=\mu_1,D(X)=\sigma_1^2\).

\(E(X)=\displaystyle\int_{R^2}xf(x,y)dxdy=\int_{-\infty}^{+\infty}xdx\int_{-\infty}^{+\infty}f(x,y)dy=\int_{-\infty}^{+\infty}xf_{X}(x)dx=\int_{-\infty}^{+\infty}x\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}dx\)

\(\xlongequal{t=\frac{x-\mu_1}{\sigma_1}}\displaystyle \int_{-\infty}^{+\infty}\frac{\sigma_1t+\mu_1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt=\sigma_1\int_{-\infty}^{+\infty} t\frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt+\mu_1\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt=\mu_1\)

\(E(X^2)=\displaystyle\int_{R^2}x^2f(x,y)dxdy=\int_{-\infty}^{+\infty}x^2dx\int_{-\infty}^{+\infty}f(x,y)dy=\int_{-\infty}^{+\infty}x^2f_{X}(x)dx\)

\(\displaystyle=\int_{-\infty}^{+\infty}x^2\frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}dx\xlongequal{t=\frac{x-\mu_1}{\sigma_1}}\displaystyle \int_{-\infty}^{+\infty}\frac{\sigma_1^2t^2+2\sigma_1\mu_1t+\mu_1^2}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt\)

\(\displaystyle=\mu_1^2\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt + 2\sigma_1\mu_1 \int_{-\infty}^{+\infty} t \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt +\sigma_1^2\int_{-\infty}^{+\infty} t^2 \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt\)

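At this point it is convenient to compute the \(n\)-th raw moments of the standard normal distribution. For even orders, integrate by parts: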

\(I_{2n}=\displaystyle \int_{-\infty}^{+\infty} t^{2n} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} dt=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}t^{2n-1}d(-e^{-\frac{t^2}{2}})\)

\(\displaystyle=\frac{1}{\sqrt{2\pi}}[-e^{-\frac{t^2}{2}}t^{2n-1}]_{-\infty}^{+\infty}+(2n-1)\cdot\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}t^{2n-2}e^{-\frac{t^2}{2}}dt=(2n-1)I_{2n-2}\)

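Since \(I_0=1\), it follows that \(I_{2n}=(2n-1)!!\Rightarrow I_2=1\). The odd moments vanish because the integrand is odd. This gives the following result:

If \(X\sim N(0,1)\), then \(E(X^n)= \left\{\begin{matrix} 0,\ n \text{ is odd} \\ (n-1)!!,\ n \text{ is even} \end{matrix}\right.\)

Therefore \(E(X^2)=\mu_1^2+\sigma_1^2\Rightarrow D(X)=\sigma_1^2\).

Next, compute \(E(XY)\). The calculation follows the same approach used for the marginal density.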

\(E(XY)=\displaystyle\int_{R^2}xyf(x,y)dxdy=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty} xy f(x,y)dy\)

First compute \(\displaystyle \int_{-\infty}^{+\infty} xy f(x,y)dy\):

\(\displaystyle \int_{-\infty}^{+\infty} xy f(x,y)dy=\frac{x}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\int_{-\infty}^{+\infty} \frac{y}{\sqrt{2\pi(1-\rho^2)}\sigma_2}\exp{\displaystyle\{-\frac{1}{2(1-\rho^2)}[(\frac{y-\mu_2}{\sigma_2}-\rho\frac{x-\mu_1}{\sigma_1})^2]\}}dy\)

Let \(\displaystyle\frac{y-\mu_2}{\sigma_2}-\rho\frac{x-\mu_1}{\sigma_1}=t\), so that \(dy=\sigma_2dt\) and \(y=\sigma_2\displaystyle t+\sigma_2\rho\frac{x-\mu_1}{\sigma_1}+\mu_2\); then:

\(\displaystyle \int_{-\infty}^{+\infty} xy f(x,y)dy=\frac{x}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\int_{-\infty}^{+\infty}\frac{\sigma_2\displaystyle t+\sigma_2\rho\frac{x-\mu_1}{\sigma_1}+\mu_2}{\sqrt{2\pi}\sqrt{1-\rho^2}}e^{-\frac{t^2}{2(\sqrt{1-\rho^2})^2}}dt\)

Next let \(\displaystyle u=\frac{t}{\sqrt{1-\rho^2}}\); then:

\(\displaystyle \int_{-\infty}^{+\infty} xy f(x,y)dy=\frac{x}{\sqrt{2\pi}\sigma_1}e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\int_{-\infty}^{+\infty}\frac{\sigma_2\sqrt{1-\rho^2} u+\sigma_2\rho\frac{x-\mu_1}{\sigma_1}+\mu_2}{\sqrt{2\pi}}e^{-\frac{u^2}{2}}du\)

\(\displaystyle=\frac{1}{\sqrt{2\pi}\sigma_1}[\rho\frac{\sigma_2}{\sigma_1}x^2+(\mu_2-\rho\mu_1\frac{\sigma_2}{\sigma_1})x]e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}\)

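Let \(v=\displaystyle\frac{x-\mu_1}{\sigma_1}\), so that \(x=\sigma_1v+\mu_1\) and \(dx=\sigma_1dv\); then:

\(E(XY)=\displaystyle\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}[\rho\frac{\sigma_2}{\sigma_1}(\sigma_1v+\mu_1)^2+(\mu_2-\rho\mu_1\frac{\sigma_2}{\sigma_1})(\sigma_1v+\mu_1)]e^{-\frac{v^2}{2}}dv=\rho\sigma_1\sigma_2+\mu_1\mu_2\)

Evaluating this integral by brute force is tedious, but symmetry and the linearity of the integral give the answer quickly. Substituting the values of \(E(XY)\), \(E(X)\), \(E(Y)\), \(D(X)\), and \(D(Y)\) into the formula for the correlation coefficient gives \(\rho_{XY}=\rho\), which completes the proof.

Let \((X,Y)\sim N(\mu_1,\mu_2;\sigma_1^2,\sigma_2^2,\rho)\); then \(Z=aX+bY+c\sim N(a\mu_1+b\mu_2+c,a^2\sigma_1^2+b^2\sigma_2^2+2\rho ab\sigma_1\sigma_2)\), where \(\rho\) is the correlation coefficient of \(X\) and \(Y\). The idea of the proof is straightforward, but the computation is heavy. The derivation below assumes \(a \neq 0\).

First apply the following transformation: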

\[\displaystyle f_{Z}(z) = \int_{-\infty}^{+\infty}f(x(y,z),y)|\frac{\partial x}{\partial z}|dy=\frac{1}{|a|}\int_{-\infty}^{+\infty}f(\frac{z-by-c}{a},y)dy\]

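The most complicated part of the expression above is the integrand \(\displaystyle f(\frac{z-by-c}{a},y)\), in which the integration variable appears in both arguments. By analogy with the derivation of the marginal density of the two-dimensional normal distribution, the exponent of the integrand has the following form: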

\[\displaystyle -\frac{1}{2(1-\rho^2)}T(y,z)\]

Here \(T(y,z)=\displaystyle\frac{[z-by-(c+a\mu_1)]^2}{(a\sigma_1)^2}-\frac{2\rho}{a\sigma_1\sigma_2}[z-by-(c+a\mu_1)](y-\mu_2)+\frac{(y-\mu_2)^2}{\sigma_2^2}\). The most tedious part of the proof follows: completing the square in the integration variable \(y\). Suppose \(T(y,z)\) has the following form:

\[T(y,z) = Az^2 + By^2 + Cz + Dy + Eyz + F\]

Completing the square formally in \(T(y,z)\), first in \(y\) and then in \(z\), gives:

\[T(y,z) = \displaystyle B(y+\frac{D+Ez}{2B})^2 + (A - \frac{E^2}{4B})(z+\frac{2BC - DE}{4AB-E^2})^2+F-\frac{(2BC-DE)^2}{4B(4AB-E^2)}-\frac{D^2}{4B}\]

Expanding \(T(y,z)\) gives the following table of coefficients:

| Coefficient | Value |
| --- | --- |
| \(A\) | \(\displaystyle \frac{1}{a^2\sigma_1^2}\) |
| \(B\) | \(\displaystyle \frac{b^2}{a^2\sigma_1^2}+\frac{1}{\sigma_2^2}+\frac{2\rho b}{a\sigma_1\sigma_2}\) |
| \(C\) | \(\displaystyle 2(\frac{\rho \mu_2}{a\sigma_1\sigma_2}-\frac{c+a\mu_1}{a^2\sigma_1^2})\) |
| \(D\) | \(\displaystyle 2[\frac{\rho(c+a\mu_1)}{a\sigma_1\sigma_2}+\frac{b(c+a\mu_1)}{a^2\sigma_1^2}-\frac{\mu_2}{\sigma_2^2}-\frac{\rho b\mu_2}{a\sigma_1\sigma_2}]\) |
| \(E\) | \(\displaystyle -2 [\frac{b}{a^2\sigma_1^2}+\frac{\rho}{a\sigma_1\sigma_2}]\) |
| \(F\) | \(\displaystyle \frac{(c+a\mu_1)^2}{a^2\sigma_1^2}+\frac{\mu_2^2}{\sigma_2^2}-\frac{2\rho\mu_2(c+a\mu_1)}{a\sigma_1\sigma_2}\) |

It remains to substitute these coefficients into the completed-square expression for \(T(y,z)\) and simplify. A second table of derived quantities helps:

| Quantity | Value |
| --- | --- |
| \(\displaystyle A-\frac{E^2}{4B}\) | \(\displaystyle \frac{1-\rho^2}{a^2\sigma_1^2+b^2\sigma_2^2+2\rho ab \sigma_1\sigma_2}\) |
| \(2BC - DE\) | \(\displaystyle -4\frac{(1-\rho^2)(a\mu_1+b\mu_2+c)}{a^2\sigma_1^2\sigma_2^2}\) |
| \(4AB-E^2\) | \(\displaystyle 4\frac{(1-\rho^2)}{a^2\sigma_1^2\sigma_2^2}\) |
| \(\displaystyle F-\frac{(2BC-DE)^2}{4B(4AB-E^2)}-\frac{D^2}{4B}\) | \(0\) |
| \(B\) | \(\displaystyle \frac{a^2\sigma_1^2+b^2\sigma_2^2+2\rho ab\sigma_1\sigma_2}{a^2\sigma_1^2\sigma_2^2}\) |

Write \(\sigma^2 = a^2\sigma_1^2+b^2\sigma_2^2+2\rho ab\sigma_1\sigma_2\) and \(\mu = a\mu_1+b\mu_2+c\). Then \(B=\displaystyle \frac{\sigma^2}{a^2\sigma_1^2\sigma_2^2}\), and:

\(\displaystyle f_{Z}(z) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(z-\mu)^2}{2\sigma^2}} \int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}\frac{|a|\sigma_1\sigma_2}{\sigma}\sqrt{1-\rho^2}}e^{-\frac{B(y+\frac{D+Ez}{2B})^2}{2(1-\rho^2)}}d(y+\frac{D+Ez}{2B})\)

\(\displaystyle\xlongequal{t=\frac{y+\frac{D+Ez}{2B}}{\sqrt{\frac{1-\rho^2}{B}}}}\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(z-\mu)^2}{2\sigma^2}}\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(z-\mu)^2}{2\sigma^2}}\)

This proves the claim.
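The mean and variance of \(Z=aX+bY+c\) can also be checked by simulation. The sketch below builds a correlated normal pair from two independent standard normals via \(Y=\mu_2+\sigma_2(\rho G_1+\sqrt{1-\rho^2}G_2)\). This construction is a standard technique and an assumption of the example, not something taken from the derivation above. The function name, parameter values, sample size, and seed are likewise hypothetical.

```python
import math
import random


def simulate_linear_combination(a, b, c, mu1, mu2, sigma1, sigma2, rho,
                                n=200000, seed=0):
    """Return the sample mean and variance of Z = a*X + b*Y + c.

    (X, Y) is bivariate normal with correlation rho, generated from two
    independent standard normal draws g1 and g2.
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        x = mu1 + sigma1 * g1
        y = mu2 + sigma2 * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        z = a * x + b * y + c
        total += z
        total_sq += z * z
    mean = total / n
    return mean, total_sq / n - mean * mean


if __name__ == "__main__":
    a, b, c = 2.0, -1.0, 3.0
    mu1, mu2, s1, s2, rho = 1.0, -2.0, 1.0, 2.0, 0.5
    print(simulate_linear_combination(a, b, c, mu1, mu2, s1, s2, rho))
    # Theoretical values from the property proved above.
    print(a * mu1 + b * mu2 + c,
          a * a * s1 * s1 + b * b * s2 * s2 + 2 * rho * a * b * s1 * s2)
```

For these parameters the theoretical mean is 7 and the theoretical variance is 4. The simulated values should match up to Monte Carlo error.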

Computation techniques

See 概率论相关计算技巧 for details.
