Joint Modelling of Functional Regression and FPCA
2026-04-28
Source:vignettes/joint_FPCA_vignette.Rmd
Introduction
This vignette describes the joint FPCA option that is available across the refundBayes regression functions:

- `sofr_bayes()` – Bayesian Scalar-on-Function Regression,
- `fcox_bayes()` – Bayesian Functional Cox Regression,
- `fofr_bayes()` – Bayesian Function-on-Function Regression.
In every case, the option is exposed via the same `joint_FPCA` argument: a logical (TRUE / FALSE) vector of length equal to the number of functional predictors. When the $k$-th entry is TRUE, the observed functional predictor $W_k(s)$ is not used directly as a covariate; instead, the regression model is built on top of a functional principal component analysis (FPCA) sub-model for $W_k(s)$, and the FPC scores are sampled jointly with the regression coefficients.
The joint-FPCA option replaces each functional predictor $W_i(s)$ with a low-rank FPCA representation — fixed eigenfunctions $\phi_j(s)$ from `refund::fpca.sc()`, subject-specific scores $\xi_{ij}$ centered on their frequentist FPCA estimates — and samples those scores jointly with the regression coefficient $\beta(s)$, propagating predictor measurement-error uncertainty into the posterior of $\beta(s)$ and correcting the errors-in-variables attenuation that arises when the noisy $W_i(s)$ is plugged into the regression as if it were observed exactly. The methodology mirrors Section 4 of Jiang, Crainiceanu, and Cui (2025), Tutorial on Bayesian Functional Regression Using Stan, Statistics in Medicine 44(20–22), e70265.
The default, `joint_FPCA = NULL`, does not adopt joint modelling.
Why Joint FPCA?
In standard scalar-on-function or functional Cox regression, the integral $\int_0^1 W_i(s)\,\beta(s)\,ds$ is plugged into the linear predictor as if $W_i(s)$ were observed without error. In practice, the curves are sampled at finitely many points and are typically corrupted by measurement noise. Treating $W_i(s)$ as exact ignores this uncertainty, biases the regression coefficient toward zero (the classical errors-in-variables attenuation), and underestimates the posterior credible-interval width.
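The attenuation is easiest to see in a scalar analogue. A minimal base-R sketch (purely illustrative, not part of refundBayes): regressing the outcome on a noisy version of the true predictor shrinks the estimated slope toward zero by roughly the factor $\mathrm{var}(x)/(\mathrm{var}(x)+\mathrm{var}(\text{noise}))$.

```r
## Errors-in-variables attenuation in a scalar regression (illustrative only)
set.seed(1)
n <- 10000
x_true <- rnorm(n)                           # latent predictor
x_obs  <- x_true + rnorm(n, sd = 1)          # observed with measurement error
y      <- 1.0 * x_true + rnorm(n, sd = 0.5)  # true slope = 1

slope_true <- unname(coef(lm(y ~ x_true))[2])  # close to 1
slope_obs  <- unname(coef(lm(y ~ x_obs))[2])   # attenuated, close to 0.5
```

The same mechanism operates pointwise on $\beta(s)$ when a noisy functional predictor is plugged in.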
The joint FPCA approach replaces $W_i(s)$ by a low-rank FPCA representation $W_i(s) \approx \sum_{j=1}^{J} \xi_{ij}\,\phi_j(s)$ and treats the subject-specific scores $\xi_{ij}$ as parameters that are sampled together with the regression coefficients. The FPC eigenfunctions $\phi_j(s)$ are estimated once with `refund::fpca.sc()` and held fixed, and the initial frequentist FPCA scores $\hat{\xi}_{ij}$ are used as the prior mean for $\xi_{ij}$. This propagates the FPCA uncertainty into the posterior of $\beta(s)$, so the regression credible bands are correctly inflated.
The Joint Bayesian Model We Adopted

The model has two layers that share the FPC scores $\xi_{ij}$:

- an FPCA likelihood on the observed functional predictor; and
- the outcome regression (SoFR / FCox / FoFR), in which the scores enter the linear predictor.

Throughout this section we assume one functional predictor for clarity; the multi-predictor case is identical, with one set of FPCA quantities per functional predictor.
FPCA likelihood

For each subject $i = 1, \dots, N$ and each grid point $s_m$, $m = 1, \dots, M$,

$$
W_i(s_m) \;=\; \sum_{j=1}^{J} \xi_{ij}\,\phi_j(s_m) + \epsilon_{im}, \qquad \epsilon_{im} \sim N(0, \sigma_e^2),
$$

so that the log-likelihood contribution of the observed curves is

$$
-\,N M \log \sigma_e \;-\; \frac{1}{2\sigma_e^2}\sum_{i=1}^{N}\sum_{m=1}^{M}\Big(W_i(s_m) - \sum_{j=1}^{J}\xi_{ij}\,\phi_j(s_m)\Big)^2 .
$$
In Stan code, this corresponds to
target += - N_num * M_num_i * log(sigma_e_i)
          - sum((xi_i * Phi_mat_i - M_mat_i)^2) / (2 * sigma_e_i^2);

Here M_mat_i is the observed $N \times M$ matrix of $W_i(s_m)$ values, Phi_mat_i is the $J \times M$ matrix of fixed eigenfunctions, and xi_i is the $N \times J$ matrix of FPC score parameters.
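The Stan kernel above is exactly the Gaussian log-likelihood of the matrix model $W = \xi\,\Phi + \text{noise}$, with the additive constant dropped. A small base-R check (all objects are simulated stand-ins for Stan's `M_mat_i`, `Phi_mat_i`, `xi_i`):

```r
## Verify: Stan kernel = full Gaussian log-likelihood + N*M*log(sqrt(2*pi))
set.seed(2)
N <- 5; M <- 20; J <- 3
Phi <- matrix(rnorm(J * M), J, M)   # stand-in J x M "eigenfunction" matrix
xi  <- matrix(rnorm(N * J), N, J)   # N x J FPC scores
sigma_e <- 0.4
W <- xi %*% Phi + matrix(rnorm(N * M, sd = sigma_e), N, M)

kernel <- -N * M * log(sigma_e) - sum((xi %*% Phi - W)^2) / (2 * sigma_e^2)
full   <- sum(dnorm(W, mean = xi %*% Phi, sd = sigma_e, log = TRUE))
## kernel - full equals the dropped constant N * M * log(sqrt(2 * pi))
```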
FPC score prior

Each FPC score $\xi_{ij}$ is given an independent Gaussian prior centered on the initial frequentist FPCA score $\hat{\xi}_{ij}$, with an eigenvalue-determined scale:

$$
\xi_{ij} \sim N(\hat{\xi}_{ij}, \lambda_j^2), \qquad j = 1, \dots, J.
$$
In Stan code, the kernel is implemented as
for (nj in 1:J_num_i) {
target += - N_num * log(lambda_i[nj])
- sum((xi_i[, nj] - xi_hat_i[, nj])^2) / (2 * lambda_i[nj]^2);
target += inv_gamma_lpdf(lambda_i[nj]^2 | 0.001, 0.001);
}
target += inv_gamma_lpdf(sigma_e_i^2 | 0.001, 0.001);

The eigenvalue scales $\lambda_j$ and the residual scale $\sigma_e$ both receive weakly informative inverse-Gamma priors (on their squares). The prior mean $\hat{\xi}_{ij}$ is computed once via `refund::fpca.sc()` on the observed functional data.
Plugging the FPCA into the linear predictor

Because $W_i(s) \approx \sum_{j=1}^{J} \xi_{ij}\,\phi_j(s)$ in the FPCA representation, the functional integral that drives every regression model in refundBayes becomes

$$
\int_0^1 W_i(s)\,\beta(s)\,ds \;\approx\; \sum_{j=1}^{J} \xi_{ij} \int_0^1 \phi_j(s)\,\beta(s)\,ds .
$$

Expanding $\beta(s)$ in the spline basis $\{B_k(s)\}_{k=1}^{K}$, $\beta(s) = \sum_{k} B_k(s)\,b_k$, we have

$$
\int_0^1 W_i(s)\,\beta(s)\,ds \;\approx\; \sum_{j=1}^{J}\sum_{k=1}^{K} \xi_{ij}\,G_{jk}\,b_k ,
$$

where the FPCA-spline cross-product matrix $G$ is approximated by the Riemann sum

$$
G_{jk} \;=\; \sum_{m=1}^{M} \phi_j(s_m)\,B_k(s_m)\,\Delta_m ,
$$

using the same integration weights $\Delta_m$ as in the standard regression formulation. After the spectral reparameterization of the spline coefficients via the penalty matrix, the design matrix is split into a penalized random-effect block and an unpenalized fixed-effect block. The linear predictor for SoFR / FCox is then built from these two blocks, with the random-effect and fixed-effect coefficients assigned non-informative priors — exactly the same prior structure used for the regression coefficients in the no-joint-FPCA case.
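The cross-product matrix $G$ is just a weighted matrix product. A minimal base-R sketch, using stand-in eigenfunctions and a `splines::bs` basis (illustrative under these assumptions, not the refundBayes internals):

```r
library(splines)  # base-R recommended package

M <- 50; J <- 3; K <- 10
sgrid <- seq(0, 1, length.out = M)
delta <- rep(1 / M, M)                        # Riemann integration weights
## Stand-in J x M matrix of "eigenfunctions" evaluated on the grid
Phi  <- rbind(sin(2 * pi * sgrid), cos(2 * pi * sgrid), sin(4 * pi * sgrid))
Bmat <- bs(sgrid, df = K)                     # M x K spline basis

## G[j, k] = sum_m Phi[j, m] * Bmat[m, k] * delta[m]
G <- Phi %*% (Bmat * delta)                   # J x K cross-product matrix
```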
For the function-on-function regression model the integration over $s$ is combined with a response-domain expansion in a spline basis $\{B_l(t)\}$ of size $K_t$, so that the bivariate coefficient $\beta(s, t)$ leads to

$$
\int_0^1 W_i(s)\,\beta(s, t)\,ds \;\approx\; \sum_{j,k,l} \xi_{ij}\,G_{jk}\,b_{kl}\,B_l(t),
$$

where now $G$ is a $J \times K$ matrix and $b = (b_{kl})$ is a $K \times K_t$ matrix of bivariate coefficients. This is the only structural difference relative to the joint-FPCA SoFR / FCox model: the spline coefficients become matrices indexed by both the predictor-domain and the response-domain bases, and the same row-wise response-domain penalty as in the standard FoFR is applied.
Full Bayesian model

The complete joint model (one functional predictor) reads

$$
\begin{aligned}
y_i \mid \eta_i &\sim \text{outcome family (SoFR / FCox / FoFR)}, \\
W_i(s_m) &= \sum_{j=1}^{J} \xi_{ij}\,\phi_j(s_m) + \epsilon_{im}, \qquad \epsilon_{im} \sim N(0, \sigma_e^2), \\
\xi_{ij} &\sim N(\hat{\xi}_{ij},\, \lambda_j^2), \\
\lambda_j^2,\ \sigma_e^2 &\sim \text{Inv-Gamma}(0.001,\ 0.001).
\end{aligned}
$$

The four building blocks (outcome regression, FPCA likelihood, score prior, hyperpriors) are independent of the family; only the outcome distribution in the first line changes. This is precisely why the joint FPCA option is available across SoFR, FCox, and FoFR with the same `joint_FPCA` argument.
Note on the prior mean of $\xi_{ij}$

The prior is centered on the initial frequentist FPCA score $\hat{\xi}_{ij}$ returned by `refund::fpca.sc()`, not on $0$. This corresponds to using the standard FPCA fit as a working informative prior. As $\lambda_j$ becomes large the prior reverts to a diffuse Gaussian and the joint model reduces (in spirit) to plugging in the sample FPC scores; for moderate $\lambda_j$ the posterior trades off the FPCA fit and the regression likelihood. The observed-data matrix M_mat_i is passed to Stan in its original (uncentered) form, exactly as in Section 4 of the Tutorial supplement.
Identifying the functional data and the integration weights

The joint FPCA design matrix is built with the same Riemann-sum integration weights as the no-joint-FPCA design matrix. Specifically, the convention is:

- in `s(tindex, by = lmat * wmat, ...)`, the rightmost variable in the `by` product (here `wmat`) is treated as the observed functional data $W$;
- the product of the remaining variables (here `lmat`) is treated as the Riemann-sum integration weight matrix.

This matches the Tutorial supplementary code and means that you do not need to change your formulas when turning joint FPCA on or off; only the `joint_FPCA` argument changes.
How to enable joint FPCA
Every regression function in refundBayes accepts the
same argument:
joint_FPCA = c(TRUE, FALSE, ...)  # one entry per s(...) functional term

The default is `joint_FPCA = NULL`, which is treated as
rep(FALSE, n_func) and gives exactly the no-joint-FPCA
behavior of the respective regression vignettes. Below we walk through
one example for each of sofr_bayes(),
fcox_bayes(), and fofr_bayes().
Example 1: Joint FPCA in sofr_bayes()
We reuse the simulation setup from the SoFR vignette and switch on joint FPCA for the (single) functional predictor.
Simulate Data
set.seed(123)
n <- 100
M <- 50
tgrid <- seq(0, 1, length.out = M)
dt <- tgrid[2] - tgrid[1]
tmat <- matrix(rep(tgrid, each = n), nrow = n)
lmat <- matrix(dt, nrow = n, ncol = M)
# Smooth latent trajectory + measurement noise on the functional predictor
D_true <- t(apply(matrix(rnorm(n * M), n, M), 1, cumsum)) / sqrt(M)
wmat <- D_true + matrix(rnorm(n * M, sd = 0.3), nrow = n) ## noisy
beta_true <- sin(2 * pi * tgrid)
X1 <- rnorm(n)
eta <- 0.5 * X1 + D_true %*% (beta_true * dt) ## regression on D, not W
prob <- plogis(eta)
y <- rbinom(n, 1, prob)
data.SoFR <- data.frame(y = y, X1 = X1)
data.SoFR$tmat <- tmat
data.SoFR$lmat <- lmat
data.SoFR$wmat <- wmat

Notice that the outcome is generated from the latent trajectory $D_i(s)$, but only the noisy $W_i(s)$ (`wmat`) is observed. This is exactly the setting that motivates joint FPCA.
Fit the joint FPCA SoFR model
library(refundBayes)
fit_sofr_joint <- sofr_bayes(
formula = y ~ X1 + s(tmat, by = lmat * wmat, bs = "cc", k = 10),
data = data.SoFR,
family = binomial(),
joint_FPCA = c(TRUE), ## << turn joint FPCA on for the (only) functional term
niter = 1500,
nwarmup = 500,
nchain = 3,
ncores = 3
)

For comparison, the no-joint-FPCA fit uses the same call without the `joint_FPCA` argument:
fit_sofr_plain <- sofr_bayes(
formula = y ~ X1 + s(tmat, by = lmat * wmat, bs = "cc", k = 10),
data = data.SoFR,
family = binomial(),
niter = 1500, nwarmup = 500, nchain = 3, ncores = 3
)

The two posterior estimates of $\beta(s)$ can be plotted together. Joint-FPCA credible bands are typically wider in the regions where the measurement noise on the functional predictor is most influential.
Joint-FPCA quantities exposed in the Stan fit
When `joint_FPCA = TRUE` for the $i$-th functional predictor, the Stan fit additionally exposes:

| Stan parameter | Meaning |
|---|---|
| xi_i | $N \times J$ matrix of joint FPC scores |
| lambda_i | $J$-vector of FPC eigenvalue SDs |
| sigma_e_i | scalar, FPC residual SD |
These can be inspected with rstan::extract() for further
analysis (for example, plotting the posterior of the FPC scores or the
eigenvalue SDs).
Example 2: Joint FPCA in fcox_bayes()
The functional Cox setup is identical to the SoFR setup, with one
extra piece (the censoring vector). The joint FPCA model is wired in by
setting joint_FPCA = c(TRUE) on a model that has one
functional predictor:
library(refundBayes)
## Use the example dataset shipped with the FCox vignette
Func_Cox_Data <- readRDS("data/example_data_Cox.rds")
Func_Cox_Data$wmat <- Func_Cox_Data$MIMS
Func_Cox_Data$cens <- 1 - Func_Cox_Data$event
fit_cox_joint <- fcox_bayes(
formula = survtime ~ X1 + s(tmat, by = lmat * wmat, bs = "cc", k = 10),
data = Func_Cox_Data,
cens = Func_Cox_Data$cens,
joint_FPCA = c(TRUE),
niter = 5000,
nwarmup = 2000,
nchain = 1,
ncores = 1
)

The Stan code generated under the hood matches Section 4 of the Tutorial supplement: the linear predictor is plugged into the Cox log-likelihood with the same M-spline / I-spline baseline-hazard construction as in the no-joint case. The FPCA likelihood and the FPC-score prior are appended to the joint Stan target exactly as described in The Joint Bayesian Model We Adopted above.
Inspecting the generated Stan code
fcox_code <- fcox_bayes(
formula = survtime ~ X1 + s(tmat, by = lmat * wmat, bs = "cc", k = 10),
data = Func_Cox_Data,
cens = Func_Cox_Data$cens,
joint_FPCA = c(TRUE),
runStan = FALSE
)
cat(fcox_code$stancode)

You will see the FPCA-related declarations in the data
block (Phi_mat_1, xi_hat_1,
M_mat_1, J_num_1, M_num_1,
X_mat_r_1, X_mat_f_1), the FPCA-related
parameters in the parameters block (xi_1,
lambda_1, sigma_e_1), and the FPCA
log-likelihood + score prior contributions in the model
block.
Example 3: Joint FPCA in fofr_bayes()
For function-on-function regression, joint FPCA on the functional predictor is constructed in exactly the same way — only the contribution to the linear predictor now carries the response-domain basis $\{B_l(t)\}$:

$$
\int_0^1 W_i(s)\,\beta(s, t)\,ds \;\approx\; \sum_{j,k,l} \xi_{ij}\,G_{jk}\,b_{kl}\,B_l(t),
$$

with $G$ and $b$ as matrices of size $J \times K$ and $K \times K_t$, respectively. Both directions of smoothness (over $s$ via the random-effect reparameterization and over $t$ via the response-domain penalty) are imposed exactly as in the no-joint FoFR model.
Simulated FoFR with measurement error on the predictor
library(refundBayes)
set.seed(42)
n <- 200
L <- 30
M <- 30
sindex <- seq(0, 1, length.out = L)
tindex <- seq(0, 1, length.out = M)
# Smooth latent functional predictor + measurement noise
D_true <- matrix(0, nrow = n, ncol = L)
for (i in 1:n) {
D_true[i, ] <- rnorm(1) * sin(2 * pi * sindex) +
rnorm(1) * cos(2 * pi * sindex) +
rnorm(1) * sin(4 * pi * sindex)
}
X_func <- D_true + matrix(rnorm(n * L, sd = 0.3), nrow = n) ## noisy
age <- rnorm(n)
# True coefficient functions
beta_true <- outer(sin(2 * pi * sindex), cos(2 * pi * tindex))
alpha_true <- 0.5 * sin(pi * tindex)
# Generate response from the latent D_true (not from X_func!)
signal_scalar <- outer(age, alpha_true)
signal_func <- (D_true %*% beta_true) / L
epsilon <- matrix(rnorm(n * M, sd = 0.3), nrow = n)
Y_mat <- signal_scalar + signal_func + epsilon
dat <- data.frame(age = age)
dat$Y_mat <- Y_mat
dat$X_func <- X_func
dat$sindex <- matrix(rep(sindex, n), nrow = n, byrow = TRUE)

Fit the joint FPCA FoFR model
fit_fofr_joint <- fofr_bayes(
formula = Y_mat ~ age + s(sindex, by = X_func, bs = "cr", k = 10),
data = dat,
joint_FPCA = c(TRUE),
spline_type = "bs",
spline_df = 10,
niter = 2000,
nwarmup = 1000,
nchain = 3,
ncores = 3
)

The joint FPCA contributes the additional Stan parameters xi_1 (an $N \times J$ matrix), lambda_1 (a $J$-vector), and sigma_e_1 (scalar), and adds the FPCA log-likelihood and prior to the model block; the bivariate coefficient $\beta(s, t)$ is reconstructed from the spline coefficients exactly as in the no-joint FoFR case.
Visualize the bivariate coefficient
beta_est <- apply(fit_fofr_joint$bivar_func_coef[[1]], c(2, 3), mean)
par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
image(sindex, tindex, beta_true,
xlab = "s (predictor domain)", ylab = "t (response domain)",
main = expression("True " * beta(s,t)),
col = hcl.colors(64, "Blue-Red 3"))
image(sindex, tindex, beta_est,
xlab = "s (predictor domain)", ylab = "t (response domain)",
main = expression("Joint-FPCA " * hat(beta)(s,t)),
    col = hcl.colors(64, "Blue-Red 3"))

Practical Recommendations
- **When to use joint FPCA.** Joint FPCA is most useful when the observed functional predictor contains substantial measurement noise relative to the signal in the regression, or when the curves are sparse or irregularly observed. For dense, low-noise functional data the difference is small.
- **Number of FPC components $J$.** Controlled by the default variance-explained criterion in `refund::fpca.sc()`. A $J$ such that the cumulative variance explained exceeds 95 % is a reasonable default. A larger $J$ increases the number of joint FPC score parameters ($N \times J$) and may slow down sampling.
- **Convergence.** Joint FPCA introduces extra latent parameters. Multiple chains and the standard `rstan` diagnostics (traceplots, $\widehat{R}$, bulk / tail ESS) are recommended.
- **Multiple functional predictors.** The `joint_FPCA` argument is a vector with one entry per `s(...)` functional term. Setting some entries to TRUE and others to FALSE is supported: each functional predictor is treated independently.
- **Prior strength.** The default Inv-Gamma(0.001, 0.001) priors on $\lambda_j^2$ and $\sigma_e^2$ are weakly informative. If the posterior of $\xi_{ij}$ is shrunk too hard toward $\hat{\xi}_{ij}$, consider loosening the prior by editing the Stan code (set `runStan = FALSE`, modify the `inv_gamma_lpdf(...)` lines, and pass the resulting program to `rstan::stan()` directly).
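The variance-explained rule for choosing $J$ can be sketched in base R. The eigenvalues below are hypothetical stand-ins, not output of `refund::fpca.sc()`:

```r
## Choose the number of FPCs J so that cumulative variance explained >= 95%.
## 'evalues' is a hypothetical decreasing eigenvalue sequence.
evalues <- c(4.0, 2.0, 0.8, 0.15, 0.05)
pve <- cumsum(evalues) / sum(evalues)  # cumulative proportion of variance
J <- which(pve >= 0.95)[1]             # smallest J reaching the threshold
```

For this eigenvalue sequence the first two components explain about 86 % of the variance, so the rule selects $J = 3$.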
References
- Jiang, Z., Crainiceanu, C., and Cui, E. (2025). Tutorial on Bayesian Functional Regression Using Stan. Statistics in Medicine, 44(20–22), e70265. (See Section 4 for the joint FPCA formulation that this vignette implements.)
- Crainiceanu, C. M., Goldsmith, J., Leroux, A., and Cui, E. (2024). Functional Data Analysis with R. CRC Press.
- Goldsmith, J., Bobb, J., Crainiceanu, C. M., Caffo, B., and Reich, D. (2011). Penalized Functional Regression. Journal of Computational and Graphical Statistics, 20(4), 830–851.
- Yao, F., Mueller, H.-G., and Wang, J.-L. (2005). Functional Data Analysis for Sparse Longitudinal Data. Journal of the American Statistical Association, 100(470), 577–590.
- Goldsmith, J., Greven, S., and Crainiceanu, C. M. (2013). Corrected Confidence Bands for Functional Data Using Principal Components. Biometrics, 69(1), 41–51.
- Carpenter, B., Gelman, A., Hoffman, M. D., et al. (2017). Stan: A Probabilistic Programming Language. Journal of Statistical Software, 76(1), 1–32.