This is an outline of a presentation prepared for the SEM Working Group Meeting 2026, in Warsaw, Poland, 15–17 April 2026. It was developed with colleagues from the University of Medicine, Pharmacy, Sciences and Technology of Târgu Mureș: Ioan-Bogdan Bacos, Manuela Rozalia Gabor, Laura Barcutean & Petru-Alexandru Curta.
The arguments are old [i] but less known: statistical modeling is not statistical testing. Whereas modeling is done more intuitively and graphically, in a structural way, statistical tests are just ‘hammers’ one uses for different nails… David Kenny showed [ii] almost four decades ago that models can be easily expressed like [iii]
independent variable -> dependent variable
He defined a model as “a formal representation of a set of relationships between variables” (there is also Model Theory [iv]).
As Jim Jaccard and Jacob Jacoby have shown [v] (see pic in footnotes), many statistical tests really tackle the same model, commonly some xcont -> ycont relation (xcont means x is continuous; x01 instead is a binary x; for more causal-focused discussions, go to Tinyurl.com/ONCAUSALITY).
To make this ‘visible’, we show how to ‘run’ several statistical tests, and that they necessarily have to reach the same conclusion in terms of the ‘p value’, that is, whether or not we decide that a relation is non-null. We share a link to the Copilot.AI chat that implements the technical parts of our illustration (one does not need to ‘know’ software coding in the age of AI…).
The data come from a simulation of the model
ivcont -> xcont -> mcont -> ycont [& xcont -> ycont]
which saves a csv file. We then dichotomized all variables in Excel, around their means, to create binary counterparts, and computed xbym as the product xcont*mcont. These data are then read into R and used for the demonstration to follow.
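The data-preparation steps just described can be sketched in a few lines of Python (standard library only). This is only an illustration, not the simulation the authors ran: the path coefficients, the sample size of 100, the seed, and the function names are all assumptions, and the product term is formed as xcont*mcont on the assumption that xbym abbreviates ‘x by m’.

```python
import csv
import io
import random
import statistics


def simulate(n=100, seed=2026):
    """Draw n cases from ivcont -> xcont -> mcont -> ycont [& xcont -> ycont].

    All path coefficients are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ivcont = rng.gauss(0, 1)
        xcont = 0.5 * ivcont + rng.gauss(0, 1)
        mcont = 0.5 * xcont + rng.gauss(0, 1)
        ycont = 0.5 * mcont + 0.2 * xcont + rng.gauss(0, 1)
        rows.append({"ivcont": ivcont, "xcont": xcont,
                     "mcont": mcont, "ycont": ycont})
    return rows


def add_derived(rows):
    """Add mean-split binary variables and the product term xbym."""
    for cont, binary in (("xcont", "x01"), ("mcont", "m01"), ("ycont", "y01")):
        mean = statistics.fmean(r[cont] for r in rows)
        for r in rows:
            r[binary] = 1 if r[cont] > mean else 0
    for r in rows:
        # Assumption: xbym is the x-by-m product.
        r["xbym"] = r["xcont"] * r["mcont"]
    return rows


def to_csv_text(rows):
    """Serialise the rows as csv text (save it to disk to read it into R)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


data = add_derived(simulate())
csv_text = to_csv_text(data)
print(csv_text.splitlines()[0])  # column names of the resulting file
```

Splitting at the mean guarantees that each binary variable has cases in both categories, which the tests that follow need.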
We show the model equivalence of the following statistical tests:
STATISTICAL TEST                 STRUCTURAL MODEL
(1) t-test for                   x01 -> y01 (x01 -> ycont similar)
(2) F-test for                   x01 -> y01 (x01 -> ycont similar)
(3) chi-squared test for         x01 <-> y01 (cannot run x01 -> ycont)
(4) simple regression for        x01 -> y01 (correlation x01 <-> y01, & xcont <-> ycont, should reach similar conclusion)
(5) a path model                 x01 -> y01 (x01 -> ycont similar)
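That tests (1)-(4) address the same model can be checked numerically. The sketch below is Python with hand-written helpers, not the authors' R code; the 2x2 cell counts are hypothetical. It assumes the equal-variance (Student) t-test and the Pearson chi-squared without continuity correction. Under those choices t squared equals F, the regression and correlation t statistics equal the t-test's, and chi-squared equals n times r squared.

```python
import math
import statistics

# Hypothetical 2x2 counts: (x01, y01) -> frequency.
counts = {(0, 0): 18, (0, 1): 12, (1, 0): 9, (1, 1): 21}
x01 = [x for (x, y), k in counts.items() for _ in range(k)]
y01 = [y for (x, y), k in counts.items() for _ in range(k)]
n = len(x01)


def split(x, y):
    """Return the y values of the x == 0 and x == 1 groups."""
    return ([b for a, b in zip(x, y) if a == 0],
            [b for a, b in zip(x, y) if a == 1])


def pooled_t(x, y):
    """Student (equal-variance) t statistic for mean(y | x=1) - mean(y | x=0)."""
    g0, g1 = split(x, y)
    n0, n1 = len(g0), len(g1)
    sp2 = ((n0 - 1) * statistics.variance(g0)
           + (n1 - 1) * statistics.variance(g1)) / (n0 + n1 - 2)
    diff = statistics.fmean(g1) - statistics.fmean(g0)
    return diff / math.sqrt(sp2 * (1 / n0 + 1 / n1))


def anova_f(x, y):
    """One-way ANOVA F statistic with two groups."""
    groups = split(x, y)
    grand = statistics.fmean(y)
    ss_b = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ss_w = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return ss_b / (ss_w / (len(y) - 2))


def slope_t(x, y):
    """t statistic of the OLS slope in y = a + b*x + e."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    sse = sum((yi - my - b * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    return b / math.sqrt(sse / (len(x) - 2) / sxx)


def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def chi2_2x2(x, y):
    """Pearson chi-squared for a 2x2 table, no continuity correction."""
    total = 0.0
    for i in (0, 1):
        for j in (0, 1):
            obs = sum(1 for a, b in zip(x, y) if a == i and b == j)
            exp = x.count(i) * y.count(j) / len(x)
            total += (obs - exp) ** 2 / exp
    return total


t = pooled_t(x01, y01)
F = anova_f(x01, y01)
t_reg = slope_t(x01, y01)
r = pearson_r(x01, y01)
t_cor = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
chi2 = chi2_2x2(x01, y01)
print(t ** 2, F, t_reg ** 2, t_cor ** 2)  # four routes to the same number
print(chi2, n * r ** 2)                   # chi-squared is n * r^2
```

R's t.test() uses the Welch (unequal-variance) version by default, so its t and df can differ slightly from the regression's, although the conclusion is the same.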
And then add a third variable and show that it can play several distinct roles:
(6) A mediator: xcont -> mcont -> ycont [& xcont -> ycont]
(7) An instrumental variable (IV) model: ivcont -> xcont -> ycont [no ivcont -> ycont path]
(8) Pearl’s mediating IV model: xcont -> mcont -> ycont [no xcont -> ycont]
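For a rough numerical feel for roles (6) and (7), the Python sketch below recovers the mediation and IV estimates directly from variances and covariances. It is not the lavaan code from the appendix: the simulated coefficients, the sample size, the seed, and the helper name cov are assumptions made for illustration.

```python
import random


def cov(u, v):
    """Sample covariance; cov(u, u) is the sample variance."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


# Hypothetical data from ivcont -> xcont -> mcont -> ycont [& xcont -> ycont].
rng = random.Random(7)
n = 5000
iv = [rng.gauss(0, 1) for _ in range(n)]
x = [0.5 * i + rng.gauss(0, 1) for i in iv]
m = [0.5 * a + rng.gauss(0, 1) for a in x]
y = [0.5 * b + 0.2 * a + rng.gauss(0, 1) for a, b in zip(x, m)]

# (6) Mediation: a is the x -> m path; b and c_direct are the m -> y and
# x -> y paths from regressing y on m and x together.
a = cov(x, m) / cov(x, x)
den = cov(x, x) * cov(m, m) - cov(x, m) ** 2
b = (cov(m, y) * cov(x, x) - cov(x, y) * cov(x, m)) / den
c_direct = (cov(x, y) * cov(m, m) - cov(m, y) * cov(x, m)) / den
indirect = a * b
total = cov(x, y) / cov(x, x)

# (7) IV: with no ivcont -> ycont path, the x -> y effect is a ratio of
# covariances with the instrument.
iv_effect = cov(iv, y) / cov(iv, x)

print("a:", round(a, 3), "b:", round(b, 3), "direct:", round(c_direct, 3))
print("indirect + direct:", round(indirect + c_direct, 3),
      "total:", round(total, 3))
print("IV estimate of x -> y:", round(iv_effect, 3))
```

In a linear model the total effect equals the direct effect plus the product a*b exactly, which is why the second printed line shows two identical numbers.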
Beyond this, adding an xcont*mcont interaction term opens up modeling options for ‘causal’ mediation too (an Mplus translation of Tyler VanderWeele’s SAS decomposition code is on SEMNET; AIs can now do this right away, and one for R exists already [vii]).
The results of simulation and analyses are:
(1) t-test: t = 0.39753, df = 95.447, p-value = 0.6919
(2) F-test: F value 0.158, p-value = 0.692
(3) Chi-squared test: X-squared = 0.16103, df = 1, p-value = 0.6882
(4) Simple regression: t value -0.398, Pr(>|t|) = 0.692
(Pearson correlation necessarily mirrors the regression findings: t = -0.39757, df = 98, p-value = 0.6918)
(5) Path model (with lavaan): z-value -0.402, P(>|z|) = 0.688
Their p-values align [viii]: we would conclude the same thing.
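The agreement is not a coincidence: the reported statistics can be converted into one another by simple arithmetic. The Python sketch below does this using only the numbers listed above. It assumes n = 100 (inferred from df = 98 = n - 2), a Pearson chi-squared without continuity correction, and a maximum-likelihood path model whose standard error uses n rather than n - 2. These are assumptions that happen to reproduce the reported values, not statements about the exact code that was run.

```python
import math

t_cor = -0.39757      # reported correlation / regression t
df = 98               # reported degrees of freedom
n = df + 2            # assumption: sample size 100

# (2) F-test: with two groups, F is the squared t statistic.
F_from_t = t_cor ** 2

# (3) Chi-squared: r^2 = t^2 / (t^2 + df), and chi-squared = n * r^2.
r2 = t_cor ** 2 / (t_cor ** 2 + df)
chi2_from_t = n * r2

# p-value of a chi-squared statistic with 1 df, via the normal tail.
p_chi2 = math.erfc(math.sqrt(0.16103 / 2))

# (5) Path model z: same estimate, standard error based on n instead of n - 2.
z_from_t = t_cor * math.sqrt(n / df)
p_z = math.erfc(abs(z_from_t) / math.sqrt(2))

print(round(F_from_t, 3), round(chi2_from_t, 5), round(p_chi2, 4))
print(round(z_from_t, 3), round(p_z, 3))
```

Each printed value can be compared with the corresponding entry in the results list; small differences in the last digit come from rounding in the reported t.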
All of them, however, can be replaced by a ‘walk through’ the path model “x01 -> y01”, using as ‘raw’ data the variances and covariance of the variables. This ‘tracing rule visual estimation’ will replicate the regression and path analysis results in terms of the actual estimate; the tracing rule does not run statistical significance tests, however.
The effects estimated in R were: Regression: -0.040 & Path analysis: -0.03982
The tracing rule simply leads to the solution
Effect x01 -> y01 = Covariance(x01, y01)/Variance(x01)
which yields the same result: [Tracing rule] -0.04025
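The tracing-rule formula is easy to try by hand. The Python sketch below uses hypothetical 2x2 counts (not the authors' data) and a hand-written covariance helper. It also shows that for a binary x01 the same number is the difference between the two group means of y01, which is what the t-test compares.

```python
# Hypothetical 2x2 counts: (x01, y01) -> frequency.
counts = {(0, 0): 30, (0, 1): 20, (1, 0): 22, (1, 1): 28}
x01 = [x for (x, y), k in counts.items() for _ in range(k)]
y01 = [y for (x, y), k in counts.items() for _ in range(k)]


def covariance(u, v):
    """Sample covariance; covariance(u, u) is the sample variance."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


# Tracing rule: Effect x01 -> y01 = Covariance(x01, y01) / Variance(x01)
effect = covariance(x01, y01) / covariance(x01, x01)

# Difference in group means of y01, for comparison.
g0 = [b for a, b in zip(x01, y01) if a == 0]
g1 = [b for a, b in zip(x01, y01) if a == 1]
mean_diff = sum(g1) / len(g1) - sum(g0) / len(g0)

print(round(effect, 4), round(mean_diff, 4))  # both 0.16 for these counts
```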
For (6)-(8), the codes are in the R appendix r_Poland.txt – all are easy to ‘grab’ with an AI assisting.
The file contains two more ‘free gifts’: dagitty and MIIVsem code to investigate which ‘statistical adjustments/controls’ have to be done, and not done, when focusing on specific causal effects of interest.
Of course, each test is better suited to some combinations of continuous and categorical variables than to others, e.g. the t-test, the F-test and the z-test in the regression model commonly use a continuous outcome (but they work with a binary one too).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PROMPT used in Copilot:
Using the notation ivcont, xcont, mcont, ycont, for 4 continuous variables, and x01, m01, y01 for 3 binary variables, generate R code to analyze some of them using the following tests:
(1) t-test for x01 -> y01
(2) F-test for x01 -> y01
(3) chi-squared test for x01 <-> y01
(4) simple regression for x01 -> y01
(5) a path model (with lavaan) x01 -> y01
(6) a mediation model (lavaan) xcont -> mcont -> ycont [& xcont -> ycont]
(7) a instrumental variable (IV) model (lavaan) ivcont -> xcont -> ycont [no ivcont -> ycont path]
(8) Pearl’s mediating IV model (lavaan) xcont -> mcont -> ycont [no xcont -> ycont]
[then asked for Pearson correlation for x01 <-> y01]
[i] Robin Beaumont showed this in great detail in 2017: SEM equivalent to basic statistical procedures
[ii] Kenny, D. A. (1987). Statistics for the social and behavioral sciences. Boston: Little, Brown. Posted by author at https://davidakenny.net/doc/statbook/kenny87.pdf
[iii] “Research in the behavioral and social sciences often involves testing statistical models.
What Is a Model?
A statistical model is a formal representation of a set of relationships between variables. Statistical models contain an outcome variable that is the focus of study. […]
A very simple model is one in which the dependent variable equals a constant plus the residual variable.
dependent variable = constant variable + residual variable
[…] In simple equation form the model is
dependent variable = effect of the independent variable + residual variable
Instead of expressing the model as an equation, the model could be just as easily specified by a diagram; arrows could be drawn from cause to effect, as follows:
independent variable -> dependent variable <- residual variable
A representation of a model that uses arrows is called a path diagram.” (Kenny, 1987, pp. 184-5)
[iv] Rizza, D. (2025). Model Theory: The Algebraic Basics. Springer.
[v] Jaccard, J., & Jacoby, J. (2009). Theory construction and model-building skills: A practical guide for social scientists. Guilford Press.
[vii] Software choice can of course be expanded at will, see e.g. Python and Stata


