Examples
Ratio of two means
Consider a setting where independent pairs of random variables $(X_1, Y_1), \ldots, (X_n, Y_n)$ are observed, and suppose that interest is in the ratio of the mean of $Y_i$ to the mean of $X_i$, that is $\theta = \mu_Y / \mu_X$, with $\mu_X = E(X_i)$ and $\mu_Y = E(Y_i) \ne 0$ $(i = 1, \ldots, n)$.
Assuming that sampling is from an infinite population, one way of estimating $\theta$ without any further assumptions about the joint distribution of $(X_i, Y_i)$ is to solve the unbiased estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The resulting $M$-estimator is $\hat\theta = s_Y/s_X$, where $s_X = \sum_{i = 1}^n X_i$ and $s_Y = \sum_{i = 1}^n Y_i$.
The estimator $\hat\theta$ is generally biased, as can be shown, for example, by an application of Jensen's inequality assuming that $X_i$ is independent of $Y_i$, and its bias can be reduced using the empirically adjusted estimating functions approach of Kosmidis & Lunardon (2020).
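As a quick illustration of that bias (a minimal Monte Carlo sketch, not part of the GEEBRA workflow; the distribution below is an arbitrary choice), Jensen's inequality gives $E(1/s_X) > 1/E(s_X)$ for a non-degenerate positive $s_X$, which is exactly what drives $E(\hat\theta) \ne \theta$ when $Y_i$ is independent of $X_i$:

```julia
using Random, Statistics

# Assumed setup (ours, not from the example): X_i = 0.5 + U(0, 1),
# so X_i > 0 and E(X_i) = 1, hence E(s_X) = n.
Random.seed!(321)
n = 10
# Jensen: E(1/s_X) > 1/E(s_X) = 1/n, so s_Y/s_X overestimates
# mu_Y/mu_X on average when Y is independent of X.
inv_sums = [1 / sum(0.5 .+ rand(n)) for _ in 1:100_000]
mean(inv_sums) > 1 / n  # true: the strict inequality shows up clearly
```

With 100,000 replicates the Monte Carlo error is far smaller than the gap, so the inequality is visible despite the simulation noise.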
This example illustrates how GEEBRA can be used to calculate the $M$-estimator and its reduced-bias version.
julia> using GEEBRA, Random
Define a data type for ratio estimation problems
julia> struct ratio_data
y::Vector
x::Vector
end;
Write a function to compute the number of observations for objects of type ratio_data.
julia> function ratio_nobs(data::ratio_data)
nx = length(data.x)
ny = length(data.y)
if (nx != ny)
error("length of x is not equal to the length of y")
end
nx
end;
Generate some data to test things out
julia> Random.seed!(123);
julia> my_data = ratio_data(randn(10), rand(10));
julia> ratio_nobs(my_data)
10
The estimating function for the ratio $\theta$ is
$\sum_{i = 1}^n (Y_i - \theta X_i)$
So, the contribution of the $i$th observation to the estimating function can be implemented as
julia> function ratio_ef(theta::Vector,
data::ratio_data,
i::Int64)
data.y[i] .- theta * data.x[i]
end;
The estimating_function_template for the ratio estimation problem can now be set up using ratio_nobs and ratio_ef.
julia> ratio_template = estimating_function_template(ratio_nobs, ratio_ef);
We are now ready to use ratio_template and my_data to compute the $M$-estimator of $\theta$ by solving the estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The starting value for the nonlinear solver is set to 0.1.
julia> result_m = fit(ratio_template, my_data, [0.1])
M-estimation with estimating function contributions ratio_ef
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.07548 0.573615 1.87492 0.0608 -0.0487819 2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions: [-2.051803171809752e-12]
fit uses methods from the NLsolve package for solving the estimating equations. Arguments can be passed directly to NLsolve.nlsolve through keyword arguments to the fit method. For example,
julia> result_m = fit(ratio_template, my_data, [0.1], show_trace = true)
Iter f(x) inf-norm Step 2-norm
------ -------------- --------------
0 4.321212e+00 NaN
1 3.878230e+00 1.000000e-01
2 2.992266e+00 2.000000e-01
3 1.220339e+00 4.000000e-01
4 2.051803e-12 2.754829e-01
M-estimation with estimating function contributions ratio_ef
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.07548 0.573615 1.87492 0.0608 -0.0487819 2.19975
────────────────────────────────────────────────────────────────────────
Estimating functions: [-2.051803171809752e-12]
Bias reduction in general $M$-estimation can be achieved by solving the adjusted estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) + A(\theta) = 0$, where $A(\theta)$ is an empirical bias-reducing adjustment that depends on the first and second derivatives of the estimating function contributions. GEEBRA can use ratio_template and automatic differentiation (see the ForwardDiff package) to construct $A(\theta)$ and then solve the bias-reducing adjusted estimating equation. All this is done by
julia> result_br = fit(ratio_template, my_data, [0.1], estimation_method = "RBM")
RBM-estimation with estimating function contributions ratio_ef
Bias reduction method: implicit_trace
────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────
theta[1] 1.06754 0.573499 1.86146 0.0627 -0.0564928 2.19158
────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-1.3371248552829229e-14]
where RBM stands for reduced-bias $M$-estimation.
Kosmidis & Lunardon (2020) show that the reduced-bias estimator of $\theta$ is $\tilde\theta = (s_Y + s_{XY}/s_{X})/(s_X + s_{XX}/s_{X})$, where $s_{XX} = \sum_{i = 1}^n X_i^2$ and $s_{XY} = \sum_{i = 1}^n X_i Y_i$. The code chunk below tests that this is indeed the result GEEBRA returns.
julia> sx = sum(my_data.x);
julia> sxx = sum(my_data.x .* my_data.x);
julia> sy = sum(my_data.y);
julia> sxy = sum(my_data.x .* my_data.y);
julia> isapprox(sy/sx, result_m.theta[1])
true
julia> isapprox((sy + sxy/sx)/(sx + sxx/sx), result_br.theta[1])
true
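A Monte Carlo comparison of the two closed-form estimators above suggests how much the adjustment shrinks the bias. This is a sketch under an assumed data-generating process (ours, not from the text): $X_i$ and $Y_i$ independent normal with mean $2$ and unit variance, so $\theta = 1$.

```julia
using Random, Statistics

# Compare the M-estimator s_Y/s_X with the reduced-bias estimator
# (s_Y + s_XY/s_X) / (s_X + s_XX/s_X) over many simulated samples.
# Assumed setup: X_i, Y_i ~ N(2, 1) independently, so theta = 1.
Random.seed!(42)
n = 10
m_est = Float64[]
br_est = Float64[]
for _ in 1:50_000
    x = 2 .+ randn(n)
    y = 2 .+ randn(n)
    sx, sy = sum(x), sum(y)
    sxx, sxy = sum(x .* x), sum(x .* y)
    push!(m_est, sy / sx)
    push!(br_est, (sy + sxy / sx) / (sx + sxx / sx))
end
# The reduced-bias estimates should, on average, sit closer to theta = 1.
(mean(m_est) - 1, mean(br_est) - 1)
```

The $M$-estimator's bias here is of order $n^{-1}$ while the reduced-bias estimator's is of order $n^{-2}$, so with $n = 10$ the difference is clearly visible at this number of replicates.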
Logistic regression
Using objective_function_template
Here, we use GEEBRA's objective_function_template to estimate a logistic regression model using maximum likelihood and maximum penalized likelihood, with the empirical bias-reducing penalty in Kosmidis & Lunardon (2020).
julia> using GEEBRA
julia> using Random
julia> using Distributions
julia> using Optim
A data type for logistic regression models (consisting of a response vector y, a model matrix x, and a vector of weights m) is
julia> struct logistic_data
y::Vector
x::Array{Float64}
m::Vector
end
A function to compute the number of observations from logistic_data objects is
julia> function logistic_nobs(data::logistic_data)
nx = size(data.x)[1]
ny = length(data.y)
nm = length(data.m)
if (nx != ny)
error("number of rows of x is not equal to the length of y")
elseif (nx != nm)
error("number of rows of x is not equal to the length of m")
elseif (ny != nm)
error("length of y is not equal to the length of m")
end
nx
end
logistic_nobs (generic function with 1 method)
The logistic regression log-likelihood contribution at a parameter theta for the $i$th observation in data is
julia> function logistic_loglik(theta::Vector,
data::logistic_data,
i::Int64)
eta = sum(data.x[i, :] .* theta)
mu = exp.(eta)./(1 .+ exp.(eta))
data.y[i] .* log.(mu) + (data.m[i] - data.y[i]) .* log.(1 .- mu)
end
logistic_loglik (generic function with 1 method)
Let's simulate some logistic regression data with $p = 10$ parameters (an intercept and nine covariates)
julia> Random.seed!(123);
julia> n = 100;
julia> m = 1;
julia> p = 10
10
julia> x = Array{Float64}(undef, n, p);
julia> x[:, 1] .= 1.0;
julia> for j in 2:p
x[:, j] .= rand(n);
end
julia> true_betas = randn(p) * sqrt(p);
julia> y = rand.(Binomial.(m, cdf.(Logistic(), x * true_betas)));
julia> my_data = logistic_data(y, x, fill(m, n));
and set up an objective_function_template for logistic regression
julia> logistic_template = objective_function_template(logistic_nobs, logistic_loglik)
objective_function_template(Main.ex-2.logistic_nobs, Main.ex-2.logistic_loglik)
The maximum likelihood estimates starting at true_betas are
julia> o1_ml = fit(logistic_template, my_data, true_betas, optim_method = NelderMead())
M-estimation with objective contributions logistic_loglik
──────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
──────────────────────────────────────────────────────────────────────────
theta[1] 1.03786 2.46744 0.42062 0.6740 -3.79824 5.87395
theta[2] -6.33539 1.53213 -4.13502 <1e-4 -9.33832 -3.33247
theta[3] -6.76459 1.63753 -4.13096 <1e-4 -9.9741 -3.55508
theta[4] 0.994292 1.1077 0.89762 0.3694 -1.17676 3.16534
theta[5] 5.4302 1.6185 3.35508 0.0008 2.25799 8.6024
theta[6] 6.58354 1.92567 3.41884 0.0006 2.8093 10.3578
theta[7] 4.92474 1.39467 3.53111 0.0004 2.19123 7.65824
theta[8] 1.29824 1.25277 1.0363 0.3001 -1.15714 3.75362
theta[9] -3.00981 1.43218 -2.10156 0.0356 -5.81682 -0.202796
theta[10] -4.96494 1.85256 -2.68005 0.0074 -8.59589 -1.334
──────────────────────────────────────────────────────────────────────────
Maximum objective: -31.9044
Takeuchi information criterion: 83.6381
Akaike information criterion: 83.8088
fit uses methods from the Optim package internally. Here, we used the Optim.NelderMead method. Alternative optimization methods and options can be supplied directly through the keyword arguments optim_method and optim_options, respectively. For example,
julia> o2_ml = fit(logistic_template, my_data, true_betas, optim_method = LBFGS(), optim_options = Optim.Options(g_abstol = 1e-05))
M-estimation with objective contributions logistic_loglik
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 1.03787 2.46744 0.420628 0.6740 -3.79822 5.87397
theta[2] -6.33546 1.53214 -4.13505 <1e-4 -9.3384 -3.33253
theta[3] -6.76458 1.63752 -4.13098 <1e-4 -9.97407 -3.55509
theta[4] 0.994237 1.1077 0.897571 0.3694 -1.17681 3.16528
theta[5] 5.43023 1.61849 3.35512 0.0008 2.25805 8.60241
theta[6] 6.58354 1.92566 3.41885 0.0006 2.80932 10.3578
theta[7] 4.9247 1.39466 3.53111 0.0004 2.19122 7.65818
theta[8] 1.29827 1.25278 1.03632 0.3001 -1.15712 3.75367
theta[9] -3.00978 1.43216 -2.10156 0.0356 -5.81676 -0.202795
theta[10] -4.96494 1.85254 -2.68006 0.0074 -8.59585 -1.33402
───────────────────────────────────────────────────────────────────────────
Maximum objective: -31.9044
Takeuchi information criterion: 83.6381
Akaike information criterion: 83.8088
The reduced-bias estimates starting at the maximum likelihood ones are
julia> o1_br = fit(logistic_template, my_data, coef(o1_ml), estimation_method = "RBM")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: implicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.928169 2.20201 0.42151 0.6734 -3.38769 5.24402
theta[2] -5.4578 1.20924 -4.51341 <1e-5 -7.82787 -3.08773
theta[3] -5.81835 1.26797 -4.58872 <1e-5 -8.30352 -3.33318
theta[4] 0.81906 0.980035 0.835746 0.4033 -1.10177 2.73989
theta[5] 4.64568 1.30015 3.5732 0.0004 2.09744 7.19393
theta[6] 5.66444 1.54657 3.66258 0.0002 2.63321 8.69566
theta[7] 4.16222 1.11835 3.72174 0.0002 1.97028 6.35415
theta[8] 1.14271 1.13618 1.00575 0.3145 -1.08416 3.36957
theta[9] -2.58574 1.2083 -2.13999 0.0324 -4.95396 -0.217525
theta[10] -4.22949 1.49518 -2.82876 0.0047 -7.15999 -1.299
───────────────────────────────────────────────────────────────────────────
Maximum penalized objective: -36.5538
Takeuchi information criterion: 81.9311
Akaike information criterion: 84.284
Using estimating_function_template
The same results as above can be obtained using an estimating_function_template for logistic regression.
The contribution to the derivatives of the log-likelihood for logistic regression is
julia> function logistic_ef(theta::Vector,
data::logistic_data,
i::Int64)
eta = sum(data.x[i, :] .* theta)
mu = exp.(eta)./(1 .+ exp.(eta))
data.x[i, :] * (data.y[i] - data.m[i] * mu)
end
logistic_ef (generic function with 1 method)
Then, solving the bias-reducing adjusted estimating equations
julia> logistic_template_ef = estimating_function_template(logistic_nobs, logistic_ef);
julia> e1_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: implicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.928169 2.20201 0.42151 0.6734 -3.38769 5.24402
theta[2] -5.4578 1.20924 -4.51341 <1e-5 -7.82787 -3.08773
theta[3] -5.81835 1.26797 -4.58872 <1e-5 -8.30352 -3.33318
theta[4] 0.81906 0.980035 0.835746 0.4033 -1.10177 2.73989
theta[5] 4.64568 1.30015 3.5732 0.0004 2.09744 7.19393
theta[6] 5.66444 1.54657 3.66258 0.0002 2.63321 8.69566
theta[7] 4.16222 1.11835 3.72174 0.0002 1.97028 6.35415
theta[8] 1.14271 1.13618 1.00575 0.3145 -1.08416 3.36957
theta[9] -2.58574 1.2083 -2.13999 0.0324 -4.95396 -0.217525
theta[10] -4.22949 1.49518 -2.82876 0.0047 -7.15999 -1.299
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-5.189099150371135e-12; -7.72212849220466e-12; … ; -5.199951580436846e-12; -4.3689080131414926e-12]
returns the reduced-bias estimates from maximum penalized likelihood:
julia> isapprox(coef(o1_br), coef(e1_br))
true
Bias-reduction methods
GEEBRA currently implements two alternative bias-reduction methods, called implicit_trace and explicit_trace. implicit_trace adjusts the estimating functions or penalizes the objectives, as we have seen earlier. explicit_trace, on the other hand, forms an estimate of the bias of the $M$-estimator and subtracts it from the $M$-estimates. The default method is implicit_trace.
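In generic notation (ours, for intuition only; not the package's internals), writing $\Psi(\theta) = \sum_{i = 1}^n \psi_i(\theta)$ for the estimating function and $A(\theta)$ for the empirical adjustment, the two methods can be contrasted as

$\text{implicit\_trace:} \quad \Psi(\tilde\theta) + A(\tilde\theta) = 0 \qquad \text{explicit\_trace:} \quad \tilde\theta = \hat\theta - \widehat{B}(\hat\theta)$

where $\hat\theta$ solves $\Psi(\hat\theta) = 0$ and $\widehat{B}(\hat\theta)$ is a plug-in estimate of the leading term of the bias, built from the same first- and second-derivative quantities that enter $A(\theta)$.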
For example, for logistic regression via estimating functions
julia> e2_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with estimating function contributions logistic_ef
Bias reduction method: explicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.895301 2.10014 0.426305 0.6699 -3.2209 5.0115
theta[2] -5.11645 1.11115 -4.60463 <1e-5 -7.29427 -2.93863
theta[3] -5.46494 1.15431 -4.73437 <1e-5 -7.72736 -3.20253
theta[4] 0.74801 0.936884 0.798402 0.4246 -1.08825 2.58427
theta[5] 4.34256 1.20242 3.6115 0.0003 1.98585 6.69926
theta[6] 5.29117 1.42302 3.71828 0.0002 2.50211 8.08023
theta[7] 3.87517 1.0339 3.74812 0.0002 1.84877 5.90158
theta[8] 1.07407 1.09374 0.98202 0.3261 -1.06962 3.21776
theta[9] -2.42976 1.13518 -2.14042 0.0323 -4.65466 -0.204855
theta[10] -3.93176 1.37586 -2.85768 0.0043 -6.62839 -1.23512
───────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [0.04705609374706787; -0.006424930231943682; … ; -0.023832956574366418; -0.03494242367189526]
which gives slightly different estimates from those of the implicit_trace fit in e1_br.
The same can be done using objective functions, but numerical differentiation (using the FiniteDiff package) is then used to approximate the gradient of the bias-reducing penalty (i.e. $A(\theta)$).
julia> o2_br = fit(logistic_template, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with objective contributions logistic_loglik
Bias reduction method: explicit_trace
───────────────────────────────────────────────────────────────────────────
Estimate Std. Error z value Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1] 0.895301 2.10014 0.426305 0.6699 -3.2209 5.0115
theta[2] -5.11645 1.11115 -4.60463 <1e-5 -7.29427 -2.93863
theta[3] -5.46494 1.15431 -4.73437 <1e-5 -7.72736 -3.20253
theta[4] 0.74801 0.936884 0.798402 0.4246 -1.08825 2.58427
theta[5] 4.34256 1.20242 3.6115 0.0003 1.98585 6.69926
theta[6] 5.29117 1.42302 3.71828 0.0002 2.50211 8.08023
theta[7] 3.87517 1.0339 3.74812 0.0002 1.84877 5.90158
theta[8] 1.07407 1.09374 0.98202 0.3261 -1.06962 3.21776
theta[9] -2.42976 1.13518 -2.14042 0.0323 -4.65466 -0.204855
theta[10] -3.93176 1.37586 -2.85768 0.0043 -6.62839 -1.23512
───────────────────────────────────────────────────────────────────────────
Maximum penalized objective: -36.607
Takeuchi information criterion: 81.6646
Akaike information criterion: 84.7635
julia> isapprox(coef(e2_br), coef(o2_br))
true