This is an outline of a presentation prepared for the SEM Working Group Meeting 2026, in Warsaw, Poland, 15–17 April 2026. It was developed with colleagues from the University of Medicine, Pharmacy, Sciences and Technology of Târgu Mureș: Ioan-Bogdan Bacos, Manuela Rozalia Gabor, Laura Barcutean & Petru-Alexandru Curta. A video walking through the process is at Tinyurl.com/STATSCAUSAL2
The arguments are old [i] but less known: statistical modeling is not statistical testing, and whereas modeling is done more intuitively and graphically, in a structural way, statistical tests are just ‘hammers’ one uses for different nails… David Kenny showed [ii] almost four decades ago that models can easily be expressed as [iii]
independent variable -> dependent variable
He defined a model as “a formal representation of a set of relationships between variables” (there is also Model Theory [iv]). As Jim Jaccard and Jacob Jacoby have shown [v] (see the picture in the footnotes), many statistical tests really tackle the same model, commonly some xcont -> ycont relation (xcont means x is continuous; x01 instead is a binary x; for more causal-focused discussions, go to Tinyurl.com/ONCAUSALITY).
To make this ‘visible’, we show how to ‘run’ several statistical tests and that they necessarily reach the same conclusion in terms of the ‘p value’, i.e., the extent to which we decide whether or not a relation is non-null. We share a link to the Copilot.AI chat that implements the technical parts of our illustration (one does not need to ‘know’ software coding in the age of AI…).
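The core equivalence can be checked by hand; the toy numbers below are ours, not the presentation’s dataset (Python here only so the arithmetic is transparent). For a binary x01, the pooled two-sample t statistic and the t statistic of the slope in a simple regression of y on x01 are algebraically identical, because both test the same x -> y model:

```python
import math

# Hypothetical toy data (not the presentation's dataset): binary x, continuous y
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1.2, 0.8, 1.5, 1.1, 2.0, 1.7, 2.3, 1.9]
n = len(x)

# --- Two-sample t-test with pooled variance (equal-variances version) ---
g0 = [yi for xi, yi in zip(x, y) if xi == 0]
g1 = [yi for xi, yi in zip(x, y) if xi == 1]
m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
ss0 = sum((v - m0) ** 2 for v in g0)
ss1 = sum((v - m1) ** 2 for v in g1)
sp2 = (ss0 + ss1) / (n - 2)                      # pooled variance
t_test = (m1 - m0) / math.sqrt(sp2 * (1 / len(g0) + 1 / len(g1)))

# --- Simple regression y ~ x: t statistic of the OLS slope ---
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b = sxy / sxx                                    # OLS slope
a0 = my - b * mx                                 # OLS intercept
rss = sum((yi - a0 - b * xi) ** 2 for xi, yi in zip(x, y))
t_reg = b / math.sqrt((rss / (n - 2)) / sxx)

# Same underlying model x -> y, so the two statistics coincide exactly
print(t_test, t_reg)
```

The same holds for the F-test: for one predictor, F is simply t squared, so all three necessarily give the same p value.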
We first generated data in the very flexible and intuitive graphical modeling software Onyx[vi] using a data generating model
ivcont -> xcont -> mcont -> ycont [& xcont -> ycont]
which saves a csv file. We then dichotomized all variables in Excel, around their means, to create binary counterparts, and computed xbym as the product xcont*mcont; these data are then read into R and used for the demonstration that follows.
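For readers without Onyx, the data-generating step can be sketched in a few lines (Python here for transparency; the path coefficients below are made up, since the actual Onyx parameter values are not listed):

```python
import random

random.seed(1)
N = 100

# Hypothetical path coefficients; the Onyx model's actual values are not shown here.
# DGP: ivcont -> xcont -> mcont -> ycont, plus a direct xcont -> ycont path
rows = []
for _ in range(N):
    ivcont = random.gauss(0, 1)
    xcont = 0.5 * ivcont + random.gauss(0, 1)
    mcont = 0.5 * xcont + random.gauss(0, 1)
    ycont = 0.5 * mcont + 0.3 * xcont + random.gauss(0, 1)
    rows.append((ivcont, xcont, mcont, ycont))

# Dichotomize around the means (as done in Excel) and form the x-by-m product
means = [sum(r[i] for r in rows) / N for i in range(4)]
data = []
for ivcont, xcont, mcont, ycont in rows:
    x01 = int(xcont > means[1])
    m01 = int(mcont > means[2])
    y01 = int(ycont > means[3])
    xbym = xcont * mcont
    data.append((ivcont, xcont, mcont, ycont, xbym, x01, m01, y01))

print(len(data), data[0])
```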
We show the model equivalence of the following statistical tests:
STATISTICAL TEST STRUCTURAL MODEL
(1) t-test for x01 -> y01 (x01 -> ycont similar)
(2) F-test for x01 -> y01 (x01 -> ycont similar)
(3) chi-squared test for x01 <-> y01 (cannot run x01 -> ycont)
(4) simple regression; and for x01 -> y01 (correlation x01 <-> y01; & xcont <-> ycont should reach similar conclusion)
(5) a path model x01 -> y01 (x01 -> ycont similar)
We then add a third variable and show that it can play several distinct roles:
(6) A mediator xcont -> mcont -> ycont [& xcont -> ycont]
(7) An instrumental variable (IV) model ivcont -> xcont -> ycont [no ivcont -> ycont path]
(8) Pearl’s mediating IV model xcont -> mcont -> ycont [no xcont -> ycont]
Beyond this, adding an xcont*mcont interaction term opens up modeling options for ‘causal’ mediation too (an Mplus translation of Tyler VanderWeele’s SAS decomposition code is on SEMNET; AIs can do this right away now, and one for R exists already [vii]).
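To give a flavor of what such a decomposition does, here is a minimal numeric sketch of the standard two-way decomposition for a continuous mediator and outcome with an exposure-mediator interaction, assuming no confounding; all coefficients below are hypothetical, and VanderWeele’s code handles the general case:

```python
# Hypothetical regression coefficients (not estimated from the presentation's data):
# mediator model  E[M|X=x]  = b0 + b1*x
# outcome model   E[Y|x,m]  = t0 + t1*x + t2*m + t3*x*m   (t3 = x-by-m interaction)
b0, b1 = 0.1, 0.5
t0, t1, t2, t3 = 0.0, 0.3, 0.4, 0.2
x1, x0 = 1.0, 0.0   # contrast levels of the exposure

# Two-way decomposition (VanderWeele-style; no confounders assumed):
nde = (t1 + t3 * (b0 + b1 * x0)) * (x1 - x0)   # natural direct effect
nie = (t2 + t3 * x1) * b1 * (x1 - x0)          # natural indirect effect
total = nde + nie                              # total effect
print(nde, nie, total)
```

Without the interaction (t3 = 0), this collapses to the familiar c' and a*b of the lavaan mediation model below.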
(1) t-test t = 0.39753, df = 95.447, p-value = 0.6919
(2) F-test F value 0.158, p-value = 0.692
(3) Chi-squared test X-squared = 0.16103, df = 1, p-value = 0.6882
(4) Simple regression t value -0.398, Pr(>|t|) = 0.692
(Pearson correlation mirrors the regression findings necessarily: t = -0.39757, df = 98, p-value = 0.6918)
(5) Path model (with lavaan) z-value -0.402 P(>|z|) = 0.688
Their p-values align [viii]: we would conclude the same thing.
The effects estimated in R were: regression: -0.040 & path analysis: -0.03982.
The tracing rule leads directly to the solution.
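The tracing rule can be verified with simple arithmetic; the standardized coefficients below are illustrative, not the estimated ones:

```python
# Tracing rule for the standardized mediation model
# xcont -(a)-> mcont -(b)-> ycont, plus a direct path xcont -(c')-> ycont.
# Illustrative (hypothetical) standardized coefficients:
a, b, c_prime = 0.5, 0.4, 0.3

# Trace every path connecting xcont and ycont, multiplying coefficients along each:
#   direct trace:   xcont -> ycont           contributes c'
#   indirect trace: xcont -> mcont -> ycont  contributes a*b
implied_cov_xy = c_prime + a * b   # model-implied covariance = total effect
print(implied_cov_xy)
```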
For (6)-(8), the code is in the R appendix r_Poland.txt – all are easy to ‘grab’ with an AI assisting.
The file contains two more ‘free gifts’: dagitty and MIIVsem code to investigate which ‘statistical adjustments/controls’ have to be made, and NOT made, when focused on specific causal effects of interest.
Of course, each test is better suited to some continuous/categorical combination of the variable pair; e.g., the t-test, the F-test, and the z-test in the regression model commonly use a continuous outcome (but they work with a binary one too).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PROMPT used in Copilot:
Using the notation ivcont, xcont, mcont, ycont, for 4 continuous variables, and x01, m01, y01 for 3 binary variables, generate R code to analyze some of them using the following tests:
(1) t-test for x01 -> y01
(2) F-test for x01 -> y01
(3) chi-squared test for x01 <-> y01
(4) simple regression for x01 -> y01
(5) a path model (with lavaan) x01 -> y01
(6) a mediation model (lavaan) xcont -> mcont -> ycont [& xcont -> ycont]
(7) a instrumental variable (IV) model (lavaan) ivcont -> xcont -> ycont [no ivcont -> ycont path]
(8) Pearl’s mediating IV model (lavaan) xcont -> mcont -> ycont [no xcont -> ycont]
[then asked for Pearson correlation for x01 <-> y01]
[i] Robin Beaumont has shown this in great detail in 2017: SEM equivalent to basic statistical procedures.
[ii] Kenny, D. A. (1987). Statistics for the social and behavioral sciences. Boston: Little, Brown. Posted by the author at https://davidakenny.net/doc/statbook/kenny87.pdf
[iii] “Research in the behavioral and social sciences often involves testing statistical models. What Is a Model? A statistical model is a formal representation of a set of relationships between variables. Statistical models contain an outcome variable that is the focus of study. […] A very simple model is one in which the dependent variable equals a constant plus the residual variable.
dependent variable = constant variable + residual variable
[…] In simple equation form the model is
dependent variable = effect of the independent variable + residual variable
Instead of expressing the model as an equation, the model could be just as easily specified by a diagram; arrows could be drawn from cause to effect, as follows:
independent variable -> dependent variable <- residual variable
A representation of a model that uses arrows is called a path diagram.” (Kenny, 1987, pp. 184-185)
[iv] Rizza, D. (2025). Model theory: The algebraic basics. Springer.
[v] Jaccard, J., & Jacoby, J. (2009). Theory construction and model-building skills: A practical guide for social scientists. Guilford Press.
[vii] Software choice can of course be expanded at will; see e.g. Python and Stata.
[viii] That the t, z, F, and chi-squared tests are special cases of each other and can, under special conditions, be mathematically derived one from another, Gemini.AI confirmed to us (but you can verify it too).
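One such identity is easy to verify by hand: for a 2x2 table, the uncorrected chi-squared statistic equals N times the squared Pearson (phi) correlation of the two binaries, which is why tests (3) and (4) must agree. A quick check on made-up data:

```python
# For two binary variables, the (uncorrected) Pearson chi-squared statistic of the
# 2x2 table equals N times the squared Pearson correlation: X^2 = N * r^2.
x = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]   # made-up binary data
y = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
n = len(x)

# Pearson correlation (= phi coefficient for two binaries)
mx, my = sum(x) / n, sum(y) / n
num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
den = (sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y)) ** 0.5
r = num / den

# Chi-squared from the 2x2 contingency table, no continuity correction
counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
for xi, yi in zip(x, y):
    counts[(xi, yi)] += 1
chi2 = 0.0
for i in (0, 1):
    for j in (0, 1):
        row = counts[(i, 0)] + counts[(i, 1)]
        col = counts[(0, j)] + counts[(1, j)]
        exp = row * col / n
        chi2 += (counts[(i, j)] - exp) ** 2 / exp

print(chi2, n * r * r)   # the two quantities are identical
```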
# R CODE FOR ALL STEPS: first read data
cont01s <- read.csv("C:\\data\\4vars.iv.med.2.csv")
View(cont01s) ## view the data in a separate window
names(cont01s) ## view the variables in the data
# xcont mcont ycont ivcont xbym x01 m01 y01 iv01
# (1) t-test: x01 -> y01
# Compare mean of y01 across levels of x01 (both 0/1)
t.test(y01 ~ x01, data = cont01s)
# (x01 -> ycont similar: t.test(ycont ~ x01, data = cont01s))
# (2) F-test: x01 -> y01
# One-way ANOVA (equivalent to regression F-test for binary x01)
fit_aov <- aov(y01 ~ x01, data = cont01s)
summary(fit_aov)
# (x01 -> ycont similar)
# (3) chi-squared test: x01 <-> y01
# Treat both as categorical
tab_xy <- table(cont01s$x01, cont01s$y01)
chisq.test(tab_xy, correct = FALSE)
# (cannot run x01 -> ycont)
# (4) simple regression: x01 -> y01
fit_lm <- lm(y01 ~ x01, data = cont01s)
summary(fit_lm)
# (x01 -> ycont similar)
### # Pearson correlation for two binary variables
cor(cont01s$x01, cont01s$y01, method = "pearson")
#This will return the correlation, confidence interval, and p‑value
cor.test(cont01s$x01, cont01s$y01, method = "pearson")
# (x01 -> ycont similar)
# (5) path model (lavaan): x01 -> y01
install.packages("lavaan")
library(lavaan)
model_path <- '
y01 ~ x01
'
fit_path <- sem(model_path, data = cont01s)
summary(fit_path, standardized = FALSE, fit.measures = FALSE)
# (6) mediation model (lavaan):
# xcont -> mcont -> ycont, plus direct xcont -> ycont
model_med <- '
# Regressions
mcont ~ a * xcont
ycont ~ b * mcont + c_prime * xcont
# Indirect, direct, total effects
ind := a * b
direct := c_prime
total := ind + direct
'
fit_med <- sem(model_med, data = cont01s)
summary(fit_med, standardized = FALSE, fit.measures = FALSE)
# (7) IV model (lavaan):
# ivcont -> xcont -> ycont, no direct ivcont -> ycont
model_iv <- '
# First stage
xcont ~ a * ivcont
# Second stage
ycont ~ b * xcont
# (No ycont ~ ivcont path)
# Indirect effect of ivcont on ycont via xcont
iv_ind := a * b
'
fit_iv <- sem(model_iv, data = cont01s)
summary(fit_iv, standardized = FALSE, fit.measures = FALSE)
# (8) Pearl’s mediating IV-style model (lavaan):
# xcont -> mcont -> ycont, no direct xcont -> ycont
model_pearl <- '
# Regressions
mcont ~ a * xcont
ycont ~ b * mcont # no xcont -> ycont path
# Indirect effect only
ind := a * b
'
fit_pearl <- sem(model_pearl, data = cont01s)
summary(fit_pearl, standardized = FALSE, fit.measures = FALSE)
# (8.a) MIIVsem
install.packages("dagitty")
install.packages("MIIVsem")
library("dagitty")
library("MIIVsem")
m_iv2 <- '
# IV part
xcont ~ ivcont
# m part
mcont ~ xcont
# y part
ycont ~ xcont + mcont
'
# This lists for each Y<-X the IV(s) needed as Y X IV1 IV2 etc
# Uses a model-implied instrumental variable (MIIV) search
miivs(m_iv2)
#
miive(m_iv2 , cont01s)
# (8.b) MIIVsem mediation
m_med1 <- '
# m part
mcont ~ xcont
# y part
ycont ~ xcont + mcont
'
miivs(m_med1 )
# Interpretation
# LHS RHS MIIVs
# mcont xcont xcont
# For mcont<-xcont one would need to use as IV xcont
# ycont xcont, mcont mcont, xcont
# For ycont<-xcont one would need to use as IV mcont
# For ycont<-mcont one would need to use as IV xcont
# Estimates using two stage least squares (2SLS)
miive(m_med1, cont01s)
# (8.c) dagitty for mediation
xymed <- dagitty('dag {
mcont [pos="1.1,1"]
xcont [pos="1,1.1"]
ycont [pos="1.2,1.1"]
xcont -> ycont
mcont -> ycont
xcont -> mcont
}')
plot(xymed)
adjustmentSets( xymed, "xcont", "ycont", type="all" ) ## no adjustment should be needed
# adjusting for the mediator gives you the direct effect, not the total effect.
adjustmentSets( xymed, "xcont", "ycont", effect="direct" ) ## should be the mediator
# !! { mcont }
adjustmentSets( xymed, "xcont", "ycont", effect="total" ) ## should be {}
# (8.d) MIIVsem nonrecursive/feedback/cyclical
# Nonrecursive
m_nonrec <- '
ycont ~ xcont + mcont
xcont ~ ycont + ivcont
'
miivs(m_nonrec)
miive(m_nonrec , cont01s)
# (8.e) dagitty for the nonrecursive model
library("dagitty")
## dagitty can also generate data according to the model (simulateSEM), with chosen parameter values
xydag <- dagitty('dag {
ivcont [pos="1,1"]
mcont [pos="1,1.5"]
xcont [pos="1.5,1"]
ycont [pos="1.5,1.5"]
xcont -> ycont
mcont -> ycont
ycont -> xcont
ivcont -> xcont
}')
plot(xydag)
adjustmentSets( xydag, "xcont", "ycont", type="all" )
adjustmentSets( xydag, "ycont", "xcont", type="all" )
# Both calls return the empty adjustment set:
# {}
# {}








