Examples
Ratio of two means
Consider a setting where independent pairs of random variables $(X_1, Y_1), \ldots, (X_n, Y_n)$ are observed, and suppose that interest is in the ratio of the mean of $Y_i$ to the mean of $X_i$, that is $\theta = \mu_Y / \mu_X$, with $\mu_X = E(X_i) \ne 0$ and $\mu_Y = E(Y_i)$ $(i = 1, \ldots, n)$.
Assuming that sampling is from an infinite population, one way of estimating $\theta$ without any further assumptions about the joint distribution of $(X_i, Y_i)$ is to solve the unbiased estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The resulting $M$-estimator is $\hat\theta = s_Y/s_X$, where $s_X = \sum_{i = 1}^n X_i$ and $s_Y = \sum_{i = 1}^n Y_i$.
The estimator $\hat\theta$ is generally biased, as can be shown, for example, by an application of Jensen's inequality under the assumption that $X_i$ is independent of $Y_i$; its bias can be reduced using the empirically adjusted estimating functions approach of Kosmidis & Lunardon (2020).
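To sketch the Jensen argument, assume additionally that the $X_i$ are positive (so that $s_X > 0$) and that $\mu_Y > 0$. Then, since $x \mapsto 1/x$ is convex on the positive half-line,
$E(\hat\theta) = E(s_Y) E(1/s_X) \ge E(s_Y)/E(s_X) = (n \mu_Y)/(n \mu_X) = \theta ,$
with strict inequality unless $s_X$ is degenerate, so $\hat\theta$ tends to overestimate $\theta$ in that setting.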
This example illustrates how MEstimation can be used to calculate the $M$-estimator and its reduced-bias version.
julia> using MEstimation, Random
Define a data type for ratio estimation problems
julia> struct ratio_data
           y::Vector
           x::Vector
       end;
Write a function to compute the number of observations for objects of type ratio_data.
julia> function ratio_nobs(data::ratio_data)
           nx = length(data.x)
           ny = length(data.y)
           if (nx != ny)
               error("length of x is not equal to the length of y")
           end
           nx
       end;
Generate some data to test things out
julia> Random.seed!(123);
julia> my_data = ratio_data(randn(10), rand(10));
julia> ratio_nobs(my_data)
10
The estimating function for the ratio $\theta$ is
$\sum_{i = 1}^n (Y_i - \theta X_i)$
So, the contribution of the $i$th observation to the estimating function can be implemented as
julia> function ratio_ef(theta::Vector, data::ratio_data, i::Int64)
           data.y[i] .- theta * data.x[i]
       end;
The estimating_function_template for the ratio estimation problem can now be set up using ratio_nobs and ratio_ef.
julia> ratio_template = estimating_function_template(ratio_nobs, ratio_ef);
We are now ready to use ratio_template and my_data to compute the $M$-estimator of $\theta$ by solving the estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) = 0$. The starting value for the nonlinear solver is set to 0.1.
julia> result_m = fit(ratio_template, my_data, [0.1])
M-estimation with estimating function contributions `ratio_ef`
─────────────────────────────────────────────────────────────────────────
            Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
theta[1]   -0.616287    0.455728  -1.35231    0.1763    -1.5095   0.276923
─────────────────────────────────────────────────────────────────────────
Estimating functions: [-1.4759304889366831e-12]
Converged: true
fit uses methods from the NLsolve package for solving the estimating equations. Arguments can be passed directly to NLsolve.nlsolve through keyword arguments to the fit method. For example,
julia> result_m = fit(ratio_template, my_data, [0.1], show_trace = true)
Iter     f(x) inf-norm    Step 2-norm
------   --------------   --------------
     0     4.044470e+00              NaN
     1     3.479826e+00     1.000000e-01
     2     2.350539e+00     2.000000e-01
     3     9.196457e-02     4.000000e-01
     4     1.475930e-12     1.628719e-02
M-estimation with estimating function contributions `ratio_ef`
─────────────────────────────────────────────────────────────────────────
            Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
theta[1]   -0.616287    0.455728  -1.35231    0.1763    -1.5095   0.276923
─────────────────────────────────────────────────────────────────────────
Estimating functions: [-1.4759304889366831e-12]
Converged: true
Bias reduction in general $M$-estimation can be achieved by solving the adjusted estimating equation $\sum_{i = 1}^n (Y_i - \theta X_i) + A(\theta, Y, X) = 0$, where $A(\theta, Y, X)$ is an empirical bias-reducing adjustment that depends on the first and second derivatives of the estimating function contributions. MEstimation can use ratio_template and automatic differentiation (via the ForwardDiff package) to construct $A(\theta, Y, X)$ and then solve the bias-reducing adjusted estimating equation. All this is simply done by
julia> result_br = fit(ratio_template, my_data, [0.1], estimation_method = "RBM")
RBM-estimation with estimating function contributions `ratio_ef`
Bias reduction method: implicit_trace
─────────────────────────────────────────────────────────────────────────
            Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
theta[1]   -0.591874    0.454331  -1.30274    0.1927   -1.48235   0.298599
─────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-2.853828284798965e-13]
Converged: true
where RBM stands for reduced-bias $M$-estimation.
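To give a flavour of what the automatic differentiation step involves, the sketch below (an illustration only, not MEstimation's internal implementation; the helper names are made up) uses ForwardDiff to compute the first and second derivatives of a single contribution of ratio_ef, which are the quantities the empirical adjustment $A(\theta, Y, X)$ is built from.

using ForwardDiff

# Jacobian of the i-th estimating function contribution with respect to theta
ratio_ef_jacobian(theta, data, i) =
    ForwardDiff.jacobian(pars -> ratio_ef(pars, data, i), theta)

# second derivatives, obtained by differentiating the vectorized Jacobian again
ratio_ef_second_derivatives(theta, data, i) =
    ForwardDiff.jacobian(pars -> vec(ratio_ef_jacobian(pars, data, i)), theta)

# evaluate both at theta = 0.1 for the first observation in my_data
ratio_ef_jacobian([0.1], my_data, 1)
ratio_ef_second_derivatives([0.1], my_data, 1)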
Kosmidis & Lunardon (2020) show that the reduced-bias estimator of $\theta$ is $\tilde\theta = (s_Y + s_{XY}/s_{X})/(s_X + s_{XX}/s_{X})$, where $s_{XX} = \sum_{i = 1}^n X_i^2$ and $s_{XY} = \sum_{i = 1}^n X_i Y_i$. The code below tests that this is indeed the result MEstimation returns.
julia> sx = sum(my_data.x);
julia> sxx = sum(my_data.x .* my_data.x);
julia> sy = sum(my_data.y);
julia> sxy = sum(my_data.x .* my_data.y);
julia> isapprox(sy/sx, result_m.theta[1])
true
julia> isapprox((sy + sxy/sx)/(sx + sxx/sx), result_br.theta[1])
true
Logistic regression
Using objective_function_template
Here, we use MEstimation's objective_function_template to estimate a logistic regression model using maximum likelihood and maximum penalized likelihood, with the empirical bias-reducing penalty in Kosmidis & Lunardon (2020).
julia> using MEstimation
julia> using Random
julia> using Distributions
julia> using Optim
A data type for logistic regression models (consisting of a response vector y, a model matrix x, and a vector of weights m) is
julia> struct logistic_data
           y::Vector
           x::Array{Float64}
           m::Vector
       end
A function to compute the number of observations from logistic_data objects is
julia> function logistic_nobs(data::logistic_data)
           nx = size(data.x)[1]
           ny = length(data.y)
           nm = length(data.m)
           if (nx != ny)
               error("number of rows of x is not equal to the length of y")
           elseif (nx != nm)
               error("number of rows of x is not equal to the length of m")
           elseif (ny != nm)
               error("length of y is not equal to the length of m")
           end
           nx
       end
logistic_nobs (generic function with 1 method)
The logistic regression log-likelihood contribution at a parameter theta for the $i$th observation of the data is
julia> function logistic_loglik(theta::Vector, data::logistic_data, i::Int64)
           eta = sum(data.x[i, :] .* theta)
           mu = exp.(eta)./(1 .+ exp.(eta))
           data.y[i] .* log.(mu) + (data.m[i] - data.y[i]) .* log.(1 .- mu)
       end
logistic_loglik (generic function with 1 method)
Let's simulate some logistic regression data with $10$ covariates
julia> Random.seed!(123);
julia> n = 100;
julia> m = 1;
julia> p = 10
10
julia> x = Array{Float64}(undef, n, p);
julia> x[:, 1] .= 1.0;
julia> for j in 2:p x[:, j] .= rand(n); end
julia> true_betas = randn(p) * sqrt(p);
julia> y = rand.(Binomial.(m, cdf.(Logistic(), x * true_betas)));
julia> my_data = logistic_data(y, x, fill(m, n));
and set up an objective_function_template for logistic regression
julia> logistic_template = objective_function_template(logistic_nobs, logistic_loglik)
objective_function_template(Main.logistic_nobs, Main.logistic_loglik)
The maximum likelihood estimates starting at true_betas are
julia> o1_ml = fit(logistic_template, my_data, true_betas, optim_method = NelderMead())
M-estimation with objective contributions `logistic_loglik`
──────────────────────────────────────────────────────────────────────────────
              Estimate  Std. Error     z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────────
theta[1]     0.0642275     5.40987   0.0118723    0.9905   -10.5389    10.6674
theta[2]      -1.07189     2.61211   -0.410355    0.6815   -6.19154    4.04775
theta[3]      -10.8887     2.51036    -4.33751    <1e-04   -15.8089   -5.96849
theta[4]      -17.4507     5.02228    -3.47466    0.0005   -27.2942   -7.60723
theta[5]      -10.1597     3.76219    -2.70048    0.0069   -17.5335   -2.78595
theta[6]      -6.67685     1.98707    -3.36015    0.0008   -10.5714   -2.78226
theta[7]      -23.5455     3.61585    -6.51176    <1e-10   -30.6325   -16.4586
theta[8]      -30.1538     6.29408    -4.79082    <1e-05     -42.49   -17.8176
theta[9]      -0.11266     1.67223  -0.0673712    0.9463   -3.39017    3.16485
theta[10]    -0.219789     2.30336  -0.0954212    0.9240   -4.73429    4.29471
──────────────────────────────────────────────────────────────────────────────
Objective: -0.0
Takeuchi information criterion: 0.0
Akaike information criterion: 20.0
Converged: true
fit uses methods from the Optim package internally. Here, we used the Optim.NelderMead method. Alternative optimization methods and options can be supplied directly through the keyword arguments optim_method and optim_options, respectively. For example,
julia> o2_ml = fit(logistic_template, my_data, true_betas, optim_method = LBFGS(), optim_options = Optim.Options(g_abstol = 1e-05))
M-estimation with objective contributions `logistic_loglik`
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
theta[1]     -18.2474     3.85573  -4.73253    <1e-05   -25.8045   -10.6903
theta[2]     -9.26473     2.25983  -4.09974    <1e-04   -13.6939   -4.83554
theta[3]     -11.3257     2.12791  -5.32243    <1e-06   -15.4963   -7.15503
theta[4]     -8.88823     2.69686  -3.29577    0.0010    -14.174   -3.60249
theta[5]     -9.00095     2.85915  -3.14812    0.0016   -14.6048   -3.39712
theta[6]      -8.9545     1.49549  -5.98768    <1e-08   -11.8856    -6.0234
theta[7]     -11.8706     2.59587  -4.57288    <1e-05   -16.9584    -6.7828
theta[8]     -10.5582     1.87893  -5.61924    <1e-07   -14.2408   -6.87552
theta[9]     -11.9214     2.11974  -5.62401    <1e-07    -16.076   -7.76681
theta[10]    -7.91036     2.47638  -3.19433    0.0014    -12.764   -3.05675
──────────────────────────────────────────────────────────────────────────
Objective: 0.0
Takeuchi information criterion: 0.0
Akaike information criterion: 20.0
Converged: true
The reduced-bias estimates starting at the maximum likelihood ones are
julia> o1_br = fit(logistic_template, my_data, coef(o1_ml), estimation_method = "RBM")
RBM-estimation with objective contributions `logistic_loglik`
Bias reduction method: implicit_trace
──────────────────────────────────────────────────────────────────────────────
              Estimate  Std. Error     z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────────
theta[1]     0.0642275     5.40987   0.0118723    0.9905   -10.5389    10.6674
theta[2]      -1.07189     2.61211   -0.410355    0.6815   -6.19154    4.04775
theta[3]      -10.8887     2.51036    -4.33751    <1e-04   -15.8089   -5.96849
theta[4]      -17.4507     5.02228    -3.47466    0.0005   -27.2942   -7.60723
theta[5]      -10.1597     3.76219    -2.70048    0.0069   -17.5335   -2.78595
theta[6]      -6.67685     1.98707    -3.36015    0.0008   -10.5714   -2.78226
theta[7]      -23.5455     3.61585    -6.51176    <1e-10   -30.6325   -16.4586
theta[8]      -30.1538     6.29408    -4.79082    <1e-05     -42.49   -17.8176
theta[9]      -0.11266     1.67223  -0.0673712    0.9463   -3.39017    3.16485
theta[10]    -0.219789     2.30336  -0.0954212    0.9240   -4.73429    4.29471
──────────────────────────────────────────────────────────────────────────────
Penalized objetive: -0.0
Takeuchi information criterion: 0.0
Akaike information criterion: 20.0
Converged: true
Using estimating_function_template
The same results as above can be returned using an estimating_function_template for logistic regression.
The contribution of the $i$th observation to the derivatives of the log-likelihood for logistic regression is
julia> function logistic_ef(theta::Vector, data::logistic_data, i::Int64)
           eta = sum(data.x[i, :] .* theta)
           mu = exp.(eta)./(1 .+ exp.(eta))
           data.x[i, :] * (data.y[i] - data.m[i] * mu)
       end
logistic_ef (generic function with 1 method)
Then, solving the bias-reducing adjusted estimating equations
julia> logistic_template_ef = estimating_function_template(logistic_nobs, logistic_ef);
julia> e1_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM")
RBM-estimation with estimating function contributions `logistic_ef`
Bias reduction method: implicit_trace
──────────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────────
theta[1]     -17.5124       2.088  -8.38719    <1e-16   -21.6048     -13.42
theta[2]       1.2354     1.13902   1.08461    0.2781  -0.997051    3.46784
theta[3]      -3.8147     1.11491  -3.42154    0.0006   -5.99988   -1.62953
theta[4]      3.79134     1.05689   3.58725    0.0003    1.71987     5.8628
theta[5]     -3.69434     1.40476  -2.62988    0.0085   -6.44761  -0.941064
theta[6]     -2.55227     1.50629  -1.69441    0.0902   -5.50453   0.400002
theta[7]      -6.8218     1.88731  -3.61457    0.0003   -10.5208   -3.12274
theta[8]     -6.41405     1.33859  -4.79163    <1e-05   -9.03764   -3.79045
theta[9]    -0.422571    0.928138 -0.455288    0.6489   -2.24169    1.39655
theta[10]   -0.235556      1.2173 -0.193507    0.8466   -2.62141     2.1503
──────────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-7.760109876105032e-9, -4.6554751571883975e-9, -3.1595236190293042e-9, -5.667127973939888e-9, -2.2173365607324946e-9, -2.70530005859938e-9, -2.2060763223242113e-9, -1.7088852876682721e-9, -5.208711230336687e-9, -3.3428348025724895e-9]
Converged: true
returns the reduced-bias estimates from maximum penalized likelihood:
julia> isapprox(coef(o1_br), coef(e1_br))
false
Bias-reduction methods
MEstimation currently implements two alternative bias-reduction methods, called implicit_trace and explicit_trace. implicit_trace adjusts the estimating functions or penalizes the objective, as we have seen earlier. explicit_trace, on the other hand, forms an estimate of the bias of the $M$-estimator and subtracts it from the $M$-estimates. The default method is implicit_trace.
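Schematically, and in the notation used earlier, the explicit_trace fit returns an estimate of the form
$\tilde\theta_{\mathrm{explicit}} = \hat\theta - \hat b(\hat\theta) ,$
where $\hat b(\hat\theta)$ is an empirical estimate of the bias of the $M$-estimator, built from the same first and second derivatives of the estimating function contributions; see Kosmidis & Lunardon (2020) for the precise construction.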
For example, for logistic regression via estimating functions
julia> e2_br = fit(logistic_template_ef, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with estimating function contributions `logistic_ef`
Bias reduction method: explicit_trace
────────────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────────────
theta[1]     -18.7594     1.47295  -12.7359    <1e-36   -21.6464   -15.8725
theta[2]     0.980103    0.909501   1.07763    0.2812  -0.802485    2.76269
theta[3]     -3.11008    0.969043  -3.20944    0.0013   -5.00937   -1.21079
theta[4]      3.22469    0.914142   3.52756    0.0004    1.43301    5.01638
theta[5]     -2.92686     1.25522  -2.33175    0.0197   -5.38705  -0.466676
theta[6]     -2.12556     1.10169  -1.92937    0.0537   -4.28484  0.0337069
theta[7]     -5.22537     1.23829  -4.21982    <1e-04   -7.65238   -2.79836
theta[8]     -5.09395    0.937559  -5.43321    <1e-07   -6.93154   -3.25637
theta[9]     -0.38592     0.75559 -0.510754    0.6095   -1.86685    1.09501
theta[10]   -0.219085    0.976058 -0.224459    0.8224   -2.13212    1.69395
────────────────────────────────────────────────────────────────────────────────
Adjusted estimating functions: [-5.474460909514576e-9, -3.2215350169060783e-9, -2.340309274374181e-9, -3.879654537632486e-9, -1.6620069066073888e-9, -1.98093772445207e-9, -1.5709390430233212e-9, -1.30816713281371e-9, -3.4865208829177413e-9, -2.3720147831530654e-9]
Converged: true
which gives slightly different estimates than those from the implicit_trace fit in e1_br.
The same can be done using objective functions, but numerical differentiation (using the FiniteDiff package) is used to approximate the gradient of the bias-reducing penalty (i.e. $A(\theta)$).
julia> o2_br = fit(logistic_template, my_data, true_betas, estimation_method = "RBM", br_method = "explicit_trace")
RBM-estimation with objective contributions `logistic_loglik`
Bias reduction method: explicit_trace
──────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────────
theta[1]     -19.6877     3.81443  -5.16137    <1e-06   -27.1638   -12.2115
theta[2]     -8.82863     2.04132  -4.32497    <1e-04   -12.8295   -4.82772
theta[3]      -10.442     2.01928  -5.17117    <1e-06   -14.3997   -6.48431
theta[4]      -8.9447     2.52308  -3.54514    0.0004   -13.8899   -3.99954
theta[5]     -9.08359     2.86073  -3.17528    0.0015   -14.6905   -3.47667
theta[6]     -9.20455     1.50095  -6.13249    <1e-09   -12.1464   -6.26275
theta[7]     -11.4005     2.59277  -4.39704    <1e-04   -16.4822   -6.31878
theta[8]     -10.5007     1.80888  -5.80507    <1e-08    -14.046   -6.95534
theta[9]     -10.6194     1.75307  -6.05762    <1e-08   -14.0554   -7.18346
theta[10]    -6.96113     2.27231  -3.06346    0.0022   -11.4148   -2.50749
──────────────────────────────────────────────────────────────────────────
Penalized objetive: -0.0
Takeuchi information criterion: 0.0
Akaike information criterion: 20.0
Converged: true
julia> isapprox(coef(e2_br), coef(o2_br))
false
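For illustration of the kind of numerical differentiation FiniteDiff performs, the sketch below approximates the gradient of the total log-likelihood; the bias-reducing penalty itself is constructed internally during the fit, and total_loglik is a hypothetical helper defined only for this example.

using FiniteDiff

# total log-likelihood as a function of theta, summing the contributions defined earlier
total_loglik(theta) = sum(logistic_loglik(theta, my_data, i) for i in 1:logistic_nobs(my_data))

# finite-difference approximation to the gradient of total_loglik at true_betas
FiniteDiff.finite_difference_gradient(total_loglik, true_betas)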
Regularization
MEstimation allows arbitrary regularizers to be passed to either the objective or the estimating functions. Below, we illustrate that functionality by carrying out ridge logistic regression and maximum penalized likelihood with a Jeffreys-prior penalty.
Ridge logistic regression
The logistic_template that we defined earlier can be used for L2-regularized logistic regression (also known as ridge logistic regression); we only need to define a function that implements the L2 regularizer
julia> l2_penalty = (theta, data, λ) -> - λ * sum(theta.^2);
Then, the coefficient path can be computed as
julia> lambda = collect(0:0.5:10);
julia> deviance = similar(lambda);
julia> coefficients = Matrix{Float64}(undef, length(lambda), length(true_betas));
julia> coefficients[1, :] = coef(o1_ml);
julia> for j in 2:length(lambda)
           current_fit = fit(logistic_template, my_data, coefficients[j - 1, :],
                             regularizer = (theta, data) -> l2_penalty(theta, data, lambda[j]))
           deviance[j] = 2 * current_fit.results.minimum
           coefficients[j, :] = coef(current_fit)
       end
The coefficients against $\lambda$ and against the deviance can then be plotted with
julia> using Plots

julia> plot(lambda, coefficients);

julia> savefig("coef_path1.svg");

julia> plot(deviance, coefficients);

julia> savefig("coef_path2.svg");

(Plots is not a dependency of MEstimation; install it with import Pkg; Pkg.add("Plots") before running the code above.)
Another way to get the above is to define a new data type that has a field for $\lambda$ and then pass
julia> l2_penalty = (theta, data) -> - data.λ * sum(theta.^2)
#5 (generic function with 1 method)
to the regularizer argument when calling fit. Such a new data type, though, would require redefining logistic_loglik, logistic_nobs, and logistic_template.
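Here is a minimal sketch of that alternative. The names ridge_logistic_data, ridge_logistic_nobs and ridge_logistic_loglik are hypothetical, chosen only for illustration, and λ is fixed at an arbitrary value of 0.5.

# hypothetical data type that carries the regularization parameter λ alongside the data
struct ridge_logistic_data
    y::Vector
    x::Array{Float64}
    m::Vector
    λ::Float64
end

ridge_logistic_nobs(data::ridge_logistic_data) = length(data.y)

function ridge_logistic_loglik(theta::Vector, data::ridge_logistic_data, i::Int64)
    eta = sum(data.x[i, :] .* theta)
    mu = exp(eta) / (1 + exp(eta))
    data.y[i] * log(mu) + (data.m[i] - data.y[i]) * log(1 - mu)
end

ridge_template = objective_function_template(ridge_logistic_nobs, ridge_logistic_loglik)

# the regularizer now reads λ from the data object
fit(ridge_template, ridge_logistic_data(y, x, fill(m, n), 0.5), coef(o1_ml),
    regularizer = (theta, data) -> - data.λ * sum(theta.^2))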
Jeffreys-prior penalty for bias reduction
Firth (1993) showed that an alternative bias-reducing penalty for the logistic regression likelihood is the Jeffreys prior, which can readily be implemented and passed to fit through the regularizer interface that MEstimation provides. The logarithm of the Jeffreys prior for logistic regression is
julia> using LinearAlgebra
julia> function log_jeffreys_prior(theta, data)
           x = data.x
           probs = cdf.(Logistic(), x * theta)
           log(det((x .* (data.m .* probs .* (1 .- probs)))' * x)) / 2
       end
log_jeffreys_prior (generic function with 1 method)
Then, the reduced-bias estimates of Firth (1993) are
julia> o_jeffreys = fit(logistic_template, my_data, true_betas, regularizer = log_jeffreys_prior)
M-estimation with objective contributions `logistic_loglik` and user-supplied regularizer
───────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────
theta[1]     -2.72139    0.591575  -4.60025    <1e-05   -3.88086   -1.56193
theta[2]    -0.709242    0.413271  -1.71616    0.0861   -1.51924   0.100755
theta[3]     -0.15944    0.402685 -0.395942    0.6921  -0.948687   0.629808
theta[4]      0.29656    0.408597  0.725801    0.4680  -0.504275    1.09739
theta[5]     0.329316    0.410825  0.801597    0.4228  -0.475886    1.13452
theta[6]    -0.222871    0.359498   -0.61995   0.5353  -0.927473   0.481732
theta[7]     0.679226    0.408675   1.66202    0.0965  -0.121763    1.48021
theta[8]    -0.292296    0.392618 -0.744479    0.4566   -1.06181   0.477221
theta[9]    -0.159811    0.378344 -0.422395    0.6727   -0.90135   0.581729
theta[10]   -0.392374     0.39982 -0.981378    0.3264   -1.17601   0.391258
───────────────────────────────────────────────────────────────────────────
Objective: -4.6144
Takeuchi information criterion: 10.2889
Akaike information criterion: 29.2289
Converged: true
Note here that the regularizer is only used to obtain the estimates; all model quantities are then computed at those estimates based only on logistic_loglik (i.e. without adding the regularizer to it). Kosmidis & Firth (2020) provide a more specific procedure for computing the reduced-bias estimates from the penalization of the logistic regression likelihood by the Jeffreys prior, which uses repeated maximum likelihood fits on adjusted binomial data. Kosmidis & Firth (2020) also show that, for logistic regression, the reduced-bias estimates from Jeffreys-prior penalization are always finite and shrink towards zero relative to the maximum likelihood estimates.
Regularization is also available when fitting an estimating_function_template. For example, the gradient of the log_jeffreys_prior above is
julia> using ForwardDiff
julia> log_jeffreys_prior_grad = (theta, data) -> ForwardDiff.gradient(pars -> log_jeffreys_prior(pars, data), theta)
#7 (generic function with 1 method)
Then the same fit as o_jeffreys can be obtained using estimating functions
julia> e_jeffreys = fit(logistic_template_ef, my_data, true_betas, regularizer = log_jeffreys_prior_grad)
M-estimation with estimating function contributions `logistic_ef` and user-supplied regularizer
───────────────────────────────────────────────────────────────────────────────
             Estimate  Std. Error   z value  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────────────
theta[1]      -2.8426     3.53647 -0.803795    0.4215   -9.77395    4.08876
theta[2]      5.87135      4.9521   1.18563    0.2358    -3.8346    15.5773
theta[3]     -5.62689     2.10177  -2.67722    0.0074   -9.74628    -1.5075
theta[4]      9.55396     5.10385   1.87191    0.0612  -0.449412    19.5573
theta[5]     -20.6593     8.48364  -2.43519    0.0149   -37.2869   -4.03166
theta[6]      6.51817     3.74053   1.74258    0.0814  -0.813135    13.8495
theta[7]     -17.5571     8.11688  -2.16304    0.0305   -33.4659   -1.64833
theta[8]     -10.6909     4.57575  -2.33642    0.0195   -19.6592   -1.72258
theta[9]      3.21754     2.44352   1.31677    0.1879   -1.57167    8.00676
theta[10]   0.0920687     2.91743 0.0315582    0.9748   -5.62598    5.81012
───────────────────────────────────────────────────────────────────────────────
Estimating functions: [-2.5512459989740726, -1.2249989785610222, -1.5796081066190875, -1.623849502093225, -0.45862967327262183, -1.6308852892708967, -0.39014295134018717, -0.7182185110539234, -1.4771057875003222, -1.0369154074033433]
Converged: false
Note here that the value of the estimating functions shown in the output is that of the gradient of the log-likelihood, i.e.
julia> logistic_loglik_grad = estimating_function(coef(e_jeffreys), my_data, logistic_template_ef)
10-element Vector{Float64}:
 -2.5512459989740726
 -1.2249989785610222
 -1.5796081066190875
 -1.623849502093225
 -0.45862967327262183
 -1.6308852892708967
 -0.39014295134018717
 -0.7182185110539234
 -1.4771057875003222
 -1.0369154074033433
instead of the regularized estimating functions, which, as expected, are very close to zero at the estimates
julia> logistic_loglik_grad .+ log_jeffreys_prior_grad(coef(e_jeffreys), my_data)
10-element Vector{Float64}:
  0.0050076730923955814
  0.002448939922122806
  0.017770078868340144
 -0.08557776917297888
  0.08167023760982128
 -0.03574273043676168
  0.07032375509459765
  0.06767600430348009
  0.008041248591773797
  0.018981843568285584