STA 721: Lecture 12
Duke University
Bounded Influence and Posterior Mean
Shrinkage properties and nonconcave penalties
conditions for optimal shrinkage and selection . . .
Readings (see reading link)
Carvalho, Polson & Scott (2010) propose an alternative shrinkage prior
In the case
Half-Cauchy prior $\lambda_j \sim C^+(0, 1)$ induces a Beta(1/2, 1/2) distribution on the shrinkage factor $\kappa_j = 1/(1 + \lambda_j^2)$
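The Beta(1/2, 1/2) claim can be checked by simulation. The sketch below is illustrative only: it assumes a standard half-Cauchy scale $\lambda \sim C^+(0, 1)$ and the shrinkage factor $\kappa = 1/(1 + \lambda^2)$, and the function names are hypothetical. It compares the empirical CDF of $\kappa$ with the Beta(1/2, 1/2) (arcsine) CDF, $F(x) = (2/\pi) \arcsin(\sqrt{x})$.

```python
import math
import random


def sample_kappa(n, seed=0):
    """Draw kappa = 1 / (1 + lambda^2) with lambda ~ half-Cauchy(0, 1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Inverse-CDF draw: tan(pi * U / 2) with U ~ Uniform(0, 1) is half-Cauchy(0, 1).
        lam = math.tan(math.pi * rng.random() / 2.0)
        draws.append(1.0 / (1.0 + lam * lam))
    return draws


def beta_half_half_cdf(x):
    """CDF of Beta(1/2, 1/2), also known as the arcsine distribution."""
    return 2.0 / math.pi * math.asin(math.sqrt(x))


def max_cdf_gap(draws, grid=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Largest gap between the empirical CDF and the Beta(1/2, 1/2) CDF on a grid."""
    n = len(draws)
    return max(
        abs(sum(d <= x for d in draws) / n - beta_half_half_cdf(x)) for x in grid
    )


kappa = sample_kappa(50_000)
print("max CDF gap:", max_cdf_gap(kappa))
```

A small gap is consistent with the U-shaped Beta(1/2, 1/2) density, which piles prior mass near $\kappa = 0$ (no shrinkage) and $\kappa = 1$ (total shrinkage).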
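marginal prior for $\beta$ (after integrating out $\lambda$), with $m(y)$ the resulting marginal density of $y$
Posterior mean of $\beta$: $E[\beta \mid y] = y + \frac{d}{dy} \log m(y)$
Bounded Influence of the prior (in this setting) means that $\lim_{|y| \to \infty} \frac{d}{dy} \log m(y) = c$ for some constant $c$
For HS, $\lim_{|y| \to \infty} \frac{d}{dy} \log m(y) = 0$
unbiasedness for large $|y|$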
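A numerical sketch can make the bounded-influence behavior concrete. It assumes the scalar normal means setup $y \mid \beta \sim N(\beta, 1)$, $\beta \mid \lambda \sim N(0, \lambda^2)$, $\lambda \sim C^+(0, 1)$, which is an assumption made for illustration rather than the exact setting of the slides. Under it, $E[\beta \mid y] = (1 - E[\kappa \mid y])\, y$ with $p(\kappa \mid y) \propto (1 - \kappa)^{-1/2} \exp(-\kappa y^2 / 2)$. The function name is hypothetical.

```python
import math


def hs_posterior_mean(y, n=20_000):
    """E[beta | y] under the assumed horseshoe normal means model.

    Integrates over kappa = 1 / (1 + lambda^2) with the midpoint rule.
    The substitution kappa = 1 - u^2 cancels the (1 - kappa)^(-1/2)
    singularity, leaving the smooth weight exp(-kappa * y^2 / 2) in u.
    """
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        u = (i + 0.5) * h              # midpoint of the i-th cell in (0, 1)
        kappa = 1.0 - u * u
        w = math.exp(-0.5 * kappa * y * y)
        num += kappa * w
        den += w
    return (1.0 - num / den) * y       # (1 - E[kappa | y]) * y


for y in (0.5, 2.0, 5.0, 10.0, 20.0):
    est = hs_posterior_mean(y)
    print(f"y = {y:5.1f}   E[beta|y] = {est:7.3f}   y - E[beta|y] = {y - est:6.3f}")
```

Near zero the estimate is shrunk strongly toward 0, while the gap $y - E[\beta \mid y]$ shrinks as $|y|$ grows, which is the large-signal unbiasedness described above.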
Diabetes data (from the lars package)
64 predictors: 10 main effects, 2-way interactions and quadratic terms
sample size of 442
split into training and test sets
compare MSE for out-of-sample prediction using OLS, lasso and horseshoe priors
Root MSE for prediction for left out data based on 25 different random splits with 100 test cases
both Lasso and Horseshoe much better than OLS
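The evaluation protocol (repeated random splits, root MSE on the held-out cases) can be sketched as follows. This is only an outline of the bookkeeping: the data are synthetic stand-ins, `mean_baseline` is a placeholder model rather than OLS, lasso or the horseshoe, and all names are hypothetical.

```python
import math
import random


def rmse_over_splits(X, y, fit_predict, n_splits=25, n_test=100, seed=0):
    """Root MSE on held-out cases for each of n_splits random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        sse = sum((y[i] - p) ** 2 for i, p in zip(test, preds))
        scores.append(math.sqrt(sse / n_test))
    return scores


def mean_baseline(X_train, y_train, X_test):
    """Placeholder model: predict the training mean for every test case."""
    mu = sum(y_train) / len(y_train)
    return [mu] * len(X_test)


# Synthetic stand-in with 442 rows (NOT the diabetes data).
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0)] for _ in range(442)]
y = [2.0 * row[0] + rng.gauss(0.0, 1.0) for row in X]

scores = rmse_over_splits(X, y, mean_baseline)
print(len(scores), round(min(scores), 3), round(max(scores), 3))
```

Any fitting routine with the same `fit_predict(X_train, y_train, X_test)` shape could be passed in, so the same splits are reused when comparing methods by fixing `seed`.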
Model
Penalized Least Squares
Bayes posterior mode (conditional) with prior
Fan & Li (JASA 2001) discuss variable selection via nonconcave penalties and oracle properties in the context of penalized likelihoods in this setting
because the negative log prior plays the role of their penalty, the results extend to Bayesian modal estimates whenever the prior is a function of $|\beta_j|$
Requirements on penalty
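To find the optimal estimator, take the derivative of $\frac{1}{2}(\hat\beta_j - \beta_j)^2 + p_\lambda(|\beta_j|)$ with respect to $\beta_j$, where $\hat\beta_j$ is the MLE
Derivative is $\mathrm{sgn}(\beta_j)\{|\beta_j| + p'_\lambda(|\beta_j|)\} - \hat\beta_j$
setting derivative to zero gives $\hat\beta_j = \mathrm{sgn}(\beta_j)\{|\beta_j| + p'_\lambda(|\beta_j|)\}$
if $p'_\lambda(|\beta_j|) = 0$ for large $|\beta_j|$, the solution is $\beta_j = \hat\beta_j$
as MLE is unbiased, the optimal estimator is approximately unbiased for large $|\beta_j|$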
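A sufficient condition for a thresholding rule is $\min_{\beta_j \ne 0}\{|\beta_j| + p'_\lambda(|\beta_j|)\} > 0$, with $p'_\lambda$ the derivative of the penalty $p_\lambda$
if the MLE satisfies $|\hat\beta_j| < \min_{\beta_j \ne 0}\{|\beta_j| + p'_\lambda(|\beta_j|)\}$, the derivative is positive for $\beta_j > 0$ and negative for $\beta_j < 0$, so the estimate is exactly zero
if $|\hat\beta_j| > \min_{\beta_j \ne 0}\{|\beta_j| + p'_\lambda(|\beta_j|)\}$, the derivative can vanish at a nonzero $\beta_j$, so the estimate may be nonzero
a sufficient and necessary condition for continuity is that the minimum of $|\beta_j| + p'_\lambda(|\beta_j|)$ is attained at zero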
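The thresholding behavior can be illustrated numerically. The sketch below assumes the componentwise objective $\frac{1}{2}(\hat\beta_j - \beta_j)^2 + p_\lambda(|\beta_j|)$ with $\hat\beta_j$ the least-squares estimate, and minimizes it on a grid. The two penalties shown ($\lambda |\beta|$ and $\lambda \beta^2 / 2$) are chosen here for illustration, and the function names are hypothetical.

```python
import math


def penalized_estimate(beta_hat, penalty, half_width=10.0, n=40_001):
    """Grid minimizer of 0.5 * (beta_hat - beta)^2 + penalty(|beta|)."""
    best_beta, best_val = 0.0, float("inf")
    for i in range(n):
        beta = -half_width + 2.0 * half_width * i / (n - 1)
        val = 0.5 * (beta_hat - beta) ** 2 + penalty(abs(beta))
        if val < best_val:
            best_beta, best_val = beta, val
    return best_beta


def soft_threshold(beta_hat, lam):
    """Closed-form minimizer for the L1 penalty lam * |beta|."""
    return math.copysign(max(abs(beta_hat) - lam, 0.0), beta_hat)


lam = 1.0
l1 = lambda t: lam * t                 # min of |b| + p'(|b|) is lam > 0: thresholds
quad = lambda t: 0.5 * lam * t * t     # min of |b| + p'(|b|) is 0: no thresholding

for b in (0.5, 1.5, 3.0):
    print(b, penalized_estimate(b, l1), soft_threshold(b, lam),
          penalized_estimate(b, quad))
```

With the L1 penalty, least-squares estimates smaller than $\lambda$ in absolute value are set exactly to zero, while the quadratic penalty only rescales them. Neither penalty has $p'_\lambda(|\beta|) \to 0$, so both remain biased for large signals.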
Prior
Penalty:
Unbiasedness: for large
not a thresholding rule as
is continuous as minimum is at zero
Penalty:
Unbiasedness: for large
Is a thresholding rule as
is continuous as minimum is at
The Generalized Double Pareto of Armagan, Dunson & Lee (2013)
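has a prior density for $\beta_j$: $f(\beta_j \mid \xi, \alpha) = \frac{1}{2\xi}\left(1 + \frac{|\beta_j|}{\alpha \xi}\right)^{-(1 + \alpha)}$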
express as
Scale mixtures of Normals representation
is this a thresholding rule? unbiasedness? continuity?
for all parameters or are there restrictions?
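One way to explore these questions numerically is sketched below. It assumes the density $f(\beta \mid \xi, \alpha) = \frac{1}{2\xi}(1 + |\beta| / (\alpha \xi))^{-(1 + \alpha)}$ and takes the penalty to be the negative log density with unit error variance, $p(|\beta|) = (1 + \alpha) \log(1 + |\beta| / (\alpha \xi))$. Both the scaling and the function names are assumptions made for illustration.

```python
import math


def gdp_penalty_deriv(t, alpha, xi):
    """p'(t) for p(t) = (1 + alpha) * log(1 + t / (alpha * xi)), with t = |beta| >= 0."""
    return (1.0 + alpha) / (alpha * xi + t)


def check_conditions(alpha, xi, t_max=50.0, n=100_001):
    """Evaluate g(t) = t + p'(t) on a grid over [0, t_max] and report its minimum."""
    grid = [t_max * i / (n - 1) for i in range(n)]
    g = [t + gdp_penalty_deriv(t, alpha, xi) for t in grid]
    k = min(range(n), key=g.__getitem__)   # index of the first grid minimum
    return {
        "deriv_at_t_max": gdp_penalty_deriv(t_max, alpha, xi),  # unbiasedness check
        "min_value": g[k],                                      # > 0: thresholding
        "argmin": grid[k],                                      # == 0: continuity
    }


for alpha, xi in ((1.0, 1.0), (3.0, 1.0)):
    print(alpha, xi, check_conditions(alpha, xi))
```

Under these assumptions, calculus on $g(t) = t + (1 + \alpha) / (\alpha \xi + t)$ suggests the minimum sits at zero only when $\alpha \xi \ge \sqrt{1 + \alpha}$, so the grid output for the two settings should differ in `argmin`.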
The literature on shrinkage estimators (with or without selection) is extensive
For Bayes, choice of estimator
Prior/Posterior do not put any probability on the event $\beta_j = 0$
Uncertainty that the coefficient is zero?
Selection solved as a post-analysis decision problem
Selection part of model uncertainty