Describe real-world systems with models.
$$Y_i \mathop{\sim}^{\rr{iid*}} \rr{Bernoulli}(p)$$
*Independent, identically distributed repetition is what lets the likelihood factor into a product.
Given a statistical model, all information about the parameters is contained in the likelihood function.
$$\hat{\theta}_{\rr{mle}} \mathop{\rightarrow}^p \theta_0, \rr{ } \rr{Var}\left(\hat{\theta}_{\rr{mle}}\right) \approx I^{-1} / n$$
Conditions: the model is correctly specified and the usual regularity conditions hold.
The result is asymptotic (it holds as \\( n \to \infty \\)).
MLE \\( \hat{\theta}_{\rr{mle}} \\): $$ \hat{\theta}_{\rr{mle}} = \rr{argmax}_\theta \left[ \log L\left(\theta \given y \right) \right]$$
Expected Fisher Information $$I(\theta)=\operatorname{E}_y \left[\left. \left(\frac{\partial}{\partial\theta} \log L(\theta \given y)\right)^2\right|\theta \right]$$
Expected Fisher Information \\( I(\theta) \\): $$I(\theta)=-\operatorname{E}_y \left[\left. \frac{\partial^2}{\partial\theta^2} \log L(\theta \given y)\right|\theta \right]$$
Let's show this.
Observed Fisher Information \\( \mathcal{I}(\theta) \\): $$ \mathcal{I}(\theta)=-\frac{\partial^2}{\partial\theta^2} \log L(\theta \given y) $$
It is the negative second derivative (for vector \\( \theta \\), the negative Hessian) of the log-likelihood function.
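The equivalence of the two expected forms can be sketched by differentiating the identity \\( \int f_\theta(y)\,dy = 1 \\) twice (assuming we may exchange differentiation and integration):
$$\operatorname{E}_y\left[\left.\frac{\partial}{\partial\theta} \log L(\theta \given y)\right|\theta\right] = \int \frac{\partial f_\theta(y)}{\partial\theta}\,dy = 0$$
$$\frac{\partial^2}{\partial\theta^2} \log f_\theta(y) = \frac{\partial^2 f_\theta(y)/\partial\theta^2}{f_\theta(y)} - \left(\frac{\partial}{\partial\theta}\log f_\theta(y)\right)^2$$
Taking expectations, the first term on the right integrates to zero, so \\( -\operatorname{E}_y\left[\frac{\partial^2}{\partial\theta^2}\log L\right] = \operatorname{E}_y\left[\left(\frac{\partial}{\partial\theta}\log L\right)^2\right] \\).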
Let \\( Y_i \sim \rr{Expo} ( \lambda )\\) represent the time to failure of lightbulbs.
Likelihood $$L \left( \lambda \given y \right) = \prod_{i=1}^n \lambda e^{-\lambda y_i}$$
Score \\( s ( \lambda \given y) = \sum_{i=1}^n \left({ 1 \over \lambda} - y_i \right)\\).
Observed Fisher Information (= Expected!) $$\mathcal{I}(\lambda) = { n \over \lambda^2 }$$
\\( k \\) successes out of \\( n \\) trials: $$L(p \given y)=\frac{n!}{k!(n - k)!}p^k(1-p)^{n-k}$$
Score \\( s(p \given y)={\partial \over \partial p} \log L( p \given y) = \frac{k}{p}-\frac{n-k}{1-p}\\).
Expected Fisher Information $$\rr{Var}\left(s\right) =\operatorname{Var}(k)\left(\frac{1}{p}+\frac{1}{1-p}\right)^2 =\frac{n}{p(1-p)}$$
Let's obtain it via the observed Fisher Information: $$\mathcal{I}(p) = -\frac{\partial}{\partial p} s(p \given y) = \frac{k}{p^2}+\frac{n-k}{(1-p)^2}, \rr{ } \operatorname{E}_y\left[\mathcal{I}(p)\right] = \frac{np}{p^2}+\frac{n-np}{(1-p)^2} = \frac{n}{p(1-p)}$$
Optimality conditions for the MLE:
Define the log-likelihood, score, and Fisher information functions: $$l\left(\theta \given y \right) = \log f_\theta(y), \rr{ } S(\theta) = \frac{\partial}{\partial \theta}l\left( \theta \given y\right)$$
$$I\left( \theta \right) = E_y\left[-\frac{\partial^2}{\partial\theta^2}l(\theta \given y)\right]$$
There are implementations in R such as optim that perform this maximization numerically (equivalently, solve \\( S(\theta) = 0 \\)); the score \\( S \\) can be supplied as the gradient.
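As a hedged sketch for the exponential model (the seed, sample size, and true rate 2 are illustrative choices): `control = list(fnscale = -1)` turns optim's default minimization into maximization, and the score is passed as the gradient.

```r
set.seed(221)                                  # illustrative seed
y <- rexp(100, rate = 2)                       # hypothetical failure times
# log-likelihood and score for the exponential model
loglik <- function(lambda) sum(log(lambda) - lambda * y)
score  <- function(lambda) sum(1 / lambda - y)
fit <- optim(par = 1, fn = loglik, gr = score,
             method = "L-BFGS-B", lower = 1e-6,  # keep lambda positive
             control = list(fnscale = -1))       # maximize, not minimize
fit$par       # numerical MLE
1 / mean(y)   # closed form: n / sum(y)
```

The bound `lower = 1e-6` guards against the optimizer stepping into \\( \lambda \le 0 \\), where the log-likelihood is undefined.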
$$f_\lambda(y) = \lambda e^{-\lambda y}$$
$$l\left( \lambda \given y\right) = \log\lambda - \lambda y, \rr{ } S(\lambda) = \frac{1}{\lambda} - y, \rr{ } I(\lambda) = \frac{1}{\lambda^2}$$
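For example, Fisher scoring (Newton's method with the expected information) applied to this model iterates \\( \lambda \leftarrow \lambda + I_n(\lambda)^{-1} S_n(\lambda) \\), where \\( S_n(\lambda) = n/\lambda - \sum_i y_i \\) and \\( I_n(\lambda) = n/\lambda^2 \\). A sketch with simulated data (seed, rate, and iteration count are illustrative):

```r
set.seed(221)
y <- rexp(200, rate = 3)     # hypothetical data
n <- length(y)
lambda <- 1                  # starting value
for (iter in 1:25) {
  S <- n / lambda - sum(y)   # total score
  I <- n / lambda^2          # total expected Fisher information
  lambda <- lambda + S / I   # Fisher scoring update
}
lambda                       # converges to the MLE n / sum(y)
```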
Regardless of the procedure, verify its numeric properties:
Simulate from the model many times, fit the model on the generated data, calculate the 95% Confidence Interval around the estimator, and check how frequently it captures the true value of the parameter.
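That check might look like this for the exponential model (a sketch; the true \\( \lambda_0 = 2 \\), \\( n = 100 \\), and 2000 replications are illustrative choices). The 95% Wald interval is \\( \hat\lambda \pm 1.96\,\hat\lambda/\sqrt{n} \\), since \\( \rr{Var}(\hat\lambda) \approx \lambda^2/n \\):

```r
set.seed(221)
lambda0 <- 2; n <- 100; reps <- 2000
covered <- replicate(reps, {
  y <- rexp(n, rate = lambda0)
  lambda_hat <- 1 / mean(y)                  # MLE
  se <- lambda_hat / sqrt(n)                 # plug-in asymptotic standard error
  ci <- lambda_hat + c(-1, 1) * 1.96 * se    # 95% Wald interval
  ci[1] <= lambda0 && lambda0 <= ci[2]       # did it capture the truth?
})
mean(covered)                                # should be close to 0.95
```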
"Research Computing provides faculty and researchers with the tools they need to take on large-scale computing challenges. Odyssey, Harvard’s largest supercomputer, offers users over 2.5 Petabytes of raw storage, more than 17,000 processing cores, and numerous software modules and applications."
#!/usr/bin/env bash
# Stat 221
# Problem set 2 answer 1d
# bash job execution script (uses bash's += string append, so run with bash)
# answer prefix - please do not change
answer="answer1d"
pset="\"Problem set 2\""
# set the student name
student="\"First name Last name\""
# set working directory - important! use $HOME: a quoted "~" would not expand
wd="$HOME/stat221/pset2"
# record the email of the user
email="fasusername@fas.harvard.edu"
# define the queue we'd like to use
queue="short_serial"
# define other arguments
fn="answer1d-odyssey-student"
infile="data/answer1c-INDEX.Rdata"
niter=4
msg="Submitting the optim vs. Fisher scoring summarization script"
file="../bsub/$answer.bsub"
# write the LSF directives first, truncating any stale file from earlier runs,
# so that $queue and $email are actually passed to the scheduler
echo "#BSUB -q $queue" > ${file}
echo "#BSUB -u $email" >> ${file}
# very important to remember to list all your arguments here
line="R --no-save --args answer $answer wd $wd student $student "
line+="pset $pset infile $infile niter $niter "
line+="< $fn.R 1> $wd/test/jobreports/$fn.out "
line+="2> $wd/test/jobreports/$fn.err"
echo "$line" >> ${file}
# start the job
echo "$msg."
bsub < ${file}