First systematic audit pass following the 0.1.0 pre-release. Fourteen
HIGH-severity and seven MED-severity bugs were fixed across the R and
Stata code, plus a new data = argument and a multi-script
verification harness. A 2026-05-01 Windows verification pass uncovered
three further HIGH-severity bugs (Stata weights not
forwarded to bandwidth selection; R rdhte_lincom silently
applied multcomp’s single-step multiplicity adjustment; harness
orchestrator regex/CSV-staging gaps) which are included below.
A subsequent expert-review pass on the Stata code on 2026-05-01 PM
uncovered six further HIGH-severity issues, surfaced one previously
silent latent inference bug in the no-covs_hte path, and
bumped the Stata floor from 14 to 16 (frames). All are included in this
version.
Stata 16+ now required (was Stata 14+). Driven by
the migration from preserve to frames for the
in-program working sample.
data = argument added to rdhte() and
rdbwhte(). When supplied, y, x,
covs.hte, covs.eff, weights,
cluster, and subset may be given as bare
variable names referring to columns of data.
covs.hte additionally accepts a one-sided formula
(e.g. "~ z1 + z2") whose variables are looked up in
data first. Mirrors the rdrobust 4.0.0 pattern. Implemented
via the internal .rdhte_resolve_data() NSE helper in
R/helpers.R.W in no-covs.hte +
covs.eff branch. The internal regression formula
referenced W even though no heterogeneity variable was
supplied. rdhte() now uses Y ~ T*Xp + covs
(and Xq for the q-fit). Previously crashed at
lm().covs.hte
path. Three separate bugs were fixed: a missing brace pair that
left fml.txt undefined for ~-prefixed input;
the formula’s environment now uses parent.frame() so
caller-frame variables resolve; and the
data.frame(covs.hte = model.matrix(...)) step was rebuilt
so the downstream grep("^covs.hte", ...) finds the correct
columns (previously resulted in “subscript out of bounds” deeper in
lm()). The illustration’s
covs.hte = "w_left*w_strength" example was silently
affected by these and now produces correct output.is.null(h.r) & is.null(h.r) (twice the same
argument) is now is.null(h.l) & is.null(h.r).h.lev[l] in
three locations across rdhte() and rdbwhte():
the right-side effective-sample-size count was using the
left bandwidth whenever
bwselect = "msetwo" or per-group
h.l/h.r were supplied. Now uses
h.lev[l, 1] and h.lev[l, 2].Nh.lev always 0 in
continuous-covs.hte auto-bw branch because
h[1]/h[2] was NULL. Now uses
h.lev[1,1]/h.lev[1,2].out$h shape inconsistency between
branches (vector in no-covs.hte, matrix in
covs.hte). All branches now return a consistent
n.lev × 2 matrix.W.names. = W.names
(trailing dot) so summary.rdhte() could not find
x$W.names for the continuous case. Renamed to
W.names.length(h.l) > 1 & length(h.r) always evaluated
TRUE (bare length(h.r) coerces to
TRUE when positive). Now
length(h.l) > 1 | length(h.r) > 1 correctly errors on
mismatched lengths.do.call-unsafe NSE.
deparse(substitute(covs.hte)) returned multi-line output
when covs.hte was passed as a value rather than a symbol
(e.g. via do.call), causing paste() recycling
to inflate W.names to length-126 and the
names(tau.hat.se) <- W.names assignment to error. The
deparse output is now collapsed to one line and length-capped (>100
chars falls back to the placeholder "covs.hte").N.lev is now consistently c(N_l, N_r)
across all branches; the categorical-covs.hte branch
previously stored table(W.covs), which made
summary.rdhte() show wrong “Number of obs” for any group
count other than two.subset = accepts both logical and integer indices
(matches rdrobust 4.0.0); previously only logical worked.rdbwhte() gained weights = and
subset = arguments to match rdhte()’s feature
surface, plus the same string-formula covs.hte
handling.weights = is now forwarded to all internal
rdbwselect() calls (previously dropped silently in three
locations).rdbwhte() no-covs.hte branch’s
covs.cont <- FALSE flipped to TRUE so the
print path matches rdhte().rdbwhte()’s continuous-covs.hte branch now
returns Nh as a 1 x 2 matrix (consistent with
the factor case) rather than a length-2 vector.rdhte_lincom(): validate level is in
[1, 100) so the common fraction-vs-percentage mistake
(e.g. level = 0.95) errors clearly rather than silently
producing a tiny CI.rdhte_lincom(): column renamed t_stat
-> z_stat. The values were already z-statistics
(multcomp’s glht uses a Gaussian approximation by default);
the column label is now correct.rdhte_lincom(): removed dead ses variable
and a commented-out se column.rdhte_lincom() p-values were silently
multiplicity-adjusted. summary.glht() defaults to
test = adjusted("single-step") whenever the user passes
more than one hypothesis, so the p_value column in the
individual data frame was an FWER-adjusted p-value rather
than the per-hypothesis two-sided z-test the docstring promises (and
that Stata’s lincom reports). For >1 hypothesis the
adjusted values were ~2x the raw values; the single-hypothesis case was
already correct. Now passes test = adjusted(type = "none").
Point estimates and CI bounds are unchanged.checkup_rdhte_basic.R was emitting
r$N[1]/r$N[2] (full-sample left/right counts,
post-na.omit) for the N_l/N_r
cross-language diff fields, which mismatched Stata’s
e(tau_N) (effective N within bandwidth). Now emits
r$Nh[1,1]/r$Nh[1,2] so both sides compare the same
quantity.Nh shape inconsistency in the
continuous-covs.hte branch. At
rdhte.R:509 the continuous (no W.lev) path
returned Nh as a length-2 vector while every other branch
returned a 1 x 2 (or n.lev x 2) matrix.
Downstream code that indexed Nh[i, j] uniformly – including
the Akhtari/Moreira/Trucco replication script – broke on the continuous
path. Now matches the rest of the branches with
matrix(c(Nh.lev.l, Nh.lev.r), 1, 2). Values are unchanged;
positional access (Nh[1], Nh[2]) still
works.rdhte y x, vce(hc1) errored at
regress. Stata’s regress only accepts
vce(robust), not vce(hc1). The vce string is
now translated to vce(robust) for the internal regress
calls (display still shows “HC1”).rdbwhte ... vce(cluster c) produced a different
bandwidth from the auto-BW inside
rdhte ... vce(cluster c) because rdbwhte was
forwarding the user’s raw vce string (and forcing
vce_select = "hc0" internally) while rdhte built a proper
vce_rdbw local. Both now use the same vce_rdbw
construction.rdbwhte ereturned e(cmd) = "rdhte"; now
correctly "rdbwhte".rdhte.ado:712 no longer hard-code 1.96;
now use invnormal(1 - (1 - level/100)/2) so
level() is honored.i. prefix on a 0/1 variable now warns when the
variable is constant (all 0 or all 1) and the heterogeneity analysis
collapses to an overall RD ATE. Mirrored across rdhte.ado
and rdbwhte.ado.rdhte_lincom: lincom is now wrapped in
capture so a malformed hypothesis errors with a clear
message instead of silently picking up stale r() values
from the previous estimation.rdhte_lincom accepts a new level() option
(default 95) – the R version already had it; the Stata version was using
the global set level only.weights was not forwarded to bandwidth
selection. rdhte.ado’s two internal
rdbwselect calls (the joint-bandwidth path and the
per-level factor path) did not pass the user’s weights()
option, so rdhte ..., weights(w) bwselect(...) silently
chose a different bandwidth from the R-side equivalent. Same shape of
bug as the vce_rdbw cluster fix in this release. Now
forwards weights() to every internal
rdbwselect invocation.rdbwhte.ado was missing the
weights() option entirely. The R-side
rdbwhte() already accepted weights =; the
Stata syntax line did not declare
weights(string) at all, and the three internal
rdbwselect calls (joint-bandwidth,
no-covs_hte, per-factor-level) did not forward it either.
Both gaps fixed.checkup_rdhte_stress.do had five
if _rc == 0 { di "..."; local ++PASS } lines that
triggered Stata’s
r(198) program error: code follows on the same line as open brace
under batch mode. All five expanded to multi-line. The same script’s
NA-handling probe referenced e(N_h_l) + e(N_h_r) (neither
saved by rdhte.ado); now asserts only
_rc == 0. Note: rdhte.ado’s e(N)
is the pre-NA-drop input row count (line 158), not the post-drop count
R’s sum(r$N) reports – a documented LOW-severity asymmetry
left for a follow-up.Eight new top-level scripts under the project root provide a verification gate analogous to rdrobust’s:
checkup_rdhte_lm.R – 40 R-only sanity checks. Compares
each rdhte result to a manual reconstruction from coef.full
/ vcov.full.checkup_rdhte_basic.{R,do} – 14 specs of cross-language
baseline parity (factor / continuous / cluster / weights / kernel /
bwselect / p-q / manual h / bw.joint).checkup_rdhte_extended.{R,do} – 9 specs combining 3+
non-default options each, with per-(label, level) emission.checkup_rdhte_bw.{R,do} – 14 specs of
standalone-rdbwhte parity, including the cluster path that the
H10/H11/H12 fixes target directly.checkup_rdhte_lincom.{R,do} – 7 hypotheses across 3
specs comparing R’s multcomp::glht to Stata’s
lincom.checkup_rdhte_vs_rdrobust.R – 11 R-only checks
confirming the illustration’s claim that rdhte reproduces rdrobust under
matched settings (h fixed, rho=1, vce=“hc3”). Matches to ~1e-15 on
tau.checkup_rdhte_stress.R – 11 R-only edge cases.checkup_rdhte_stress.do – 9 Stata-portable stress
probes.checkup_rdhte_strformula.R – 6 R-only checks for the
string-formula covs.hte path with caller-frame
variables.run_rdhte_checks.py orchestrates all of the above
end-to-end and reports per-spec |R-Stata|. Marker on
success: RUN_RDHTE_CHECKS_OK.
Two orchestrator-level bugs found and fixed during the 2026-05-01 Windows verification pass:
(\w+)=(-?[0-9.eE+\-]+),
which did not tolerate the whitespace that Stata’s %15.10f
display format inserts between = and the
number. Every Stata float field (tau_cl=...,
h_l=..., etc.) silently parsed as nothing – only the
unpadded integer fields (N_l=240) got through. The
orchestrator was therefore comparing only N_l/N_r, which under the old
full-sample-vs-effective N semantic produced ~700-unit “diffs” that
masked the real silent-OK pattern. Fixed by adding \s*
after =.C:\Temp\rdhte_stata_*
scratch dir, but the .do files reference
_rdhte_data.csv by relative path. run_STATA()
now stages _rdhte_*.csv from the project root into the
scratch dir before invoking Stata.A second pass on the Stata code, severities HIGH / MEDIUM / LOW. All HIGH fixed; most MEDIUM applied; LOW partially done.
e(V) bug in the
no-covs_hte path (surfaced by H4).
rdhte y x (no heterogeneity) had been silently returning an
e(V) matrix populated with .s – the no-covs
branch never wrote tau_V[1,1]. The
cap ereturn post b V masked the resulting
r(504) matrix has missing values, leaving downstream
test/lincom calls with broken inference. Now
correct (matrix tau_V[1,1] = se^2); the cap is
gone.c(sysdir_plus)/r/rdrobust.ado with
file open and tokenized to extract the version. Replaced
with findfile rdrobust.ado (works for any adopath location:
PERSONAL, SITE, PLUS) plus a regex scan of the first
*!...vN.M.K line. Robust to future rdrobust file-layout
changes.vce() parser validation restored.
vce(robust foo), vce(hc1 var),
vce(cluster a b), etc., previously had their validation
commented out (and silently mis-parsed). Now error with
r(198) and a clear message; the legal 2-token forms are
exactly vce(cluster|hc2|hc3 var).rdhte.ado (tau_h,
tau_hat, tau_bc, tau_se,
tau_V, tau_t, tau_pv,
tau_N, tau_h_l/h_r, tau_ci_l/r,
CV, plus the b/V ereturn pair)
and 2 in rdbwhte.ado (tau_h,
tau_N) are now tempname’d. They no longer leak
into the caller’s namespace and cannot clobber a user matrix of the same
name.rdhte_countvars no longer
double-counts. The if-elseif chain wasn’t mutually exclusive:
any pure factor#factor term (e.g. i.W#i.Z) incremented BOTH
n_interaction_factor AND
n_interaction_continuous. Restructured so each term routes
to exactly one bucket. Affected the downstream quad flag
and the BW-routing decision for higher-order factor interactions.rdhte_fvexpand declared
rclass. Previously the program had no class
declaration and callers relied on the inner fvexpand call’s
r() values surviving through program return – accidental,
undocumented, and brittle to future Stata. Now rclass,
captures r(varlist) / r(fvops) inside
nobreak before any reset, and explicitly returns them.preserve -> frames
(M1). rdhte.ado and rdbwhte.ado now
do all work in an isolated temporary frame:
frame put * if touse, into(workframe); cwf workframe; ...; cwf orig_frame; frame drop workframe.
Avoids the O(N) preserve disk round-trip and only the
touse=1 subset is copied (typically a fraction of the
user’s dataset). The user’s data and frame state are never mutated, even
on error. Drives the Stata 16+ floor bump (frames were introduced in
Stata 16).e(N_h_l) and e(N_h_r) added
(M2). Per-side bandwidth-effective sample sizes (summed across
factor levels). Mirrors rdrobust’s convention. The deeper post-NA-drop
semantic for e(N) across covs_hte / weights / cluster
missingness is documented as a deferred follow-up.qui count consolidation (M3). Three
qui count passes collapsed to two (left + right; total via
addition). One fewer O(N) pass per call.version 14.0 -> 16.0
(M8). Required by the frames refactor. Bumped across all 5
ados; rdhte_countvars.ado also gained the missing
version directive (it had none before).rdhte.ado (check-it-exists
scaffolding for value labels, dead tau_V covariance
attempts, redundant local k = i inside nested
forvalues, etc.) and rdbwhte.ado (orphan
*/, value-label scaffolding).e(outcomevar) removed (L2). Was
advertised in the help file but the 2026-04-27 harmonization audit had
already removed it; line 855 had silently reintroduced it. Users get
e(depvar) per Stata convention.rdhte.sthlp Stored results section refreshed
(L5). Removed e(outcomevar); added
e(N_h_l) and e(N_h_r); e(N)
description now correctly says “sample size after listwise deletion on
(y, x)” instead of “original number of observations”.covs_hte(W) (no i. prefix) and
W has fewer than 10 unique values, rdhte now
emits a note: recommending i.W for
subgroup-CATE estimation rather than the (probably unintended)
slope-coefficient interpretation.length(subinstr(...))
hash-counting refactor): arithmetic is correct, just opaque.rdhte.ado; should pull into a
subprogram.e(N) post-NA-drop for covs_hte / weights /
cluster: needs careful covs_hte string parsing to markout
factor variables.Initial release.