Make Fake Data
simulate_pdd_data.Rd
Prepares a minimal example of data in the right format for analysis via the demcrit package functions. It is compulsory to specify FAQ summary data (mean, variance and covariance, see below) and summary data for at least one additional global cognitive measure.
Arguments
- N
A numeric indicating the number of simulated patients. Defaults to 203.
- defaults
A logical indicating whether default values derived from the observed data should be used (TRUE, default) or the user will provide their own values instead (FALSE).
- Mu
Named numeric vector of means. Needs to contain one component denoting FAQ. If defaults = TRUE, it is set to values from the observed data.
- Sigma
Named matrix of variances and covariances. If defaults = TRUE, it is set to empirical values from the observed data.
- crits
A data.frame with named rows and four columns:
rownames indicate algorithm labels,
IADL indicates the variable from the data used for IADL deficit calculation,
IADL_thres indicates the threshold above which there is an IADL deficit,
cognition indicates the variable from the data used for cognitive deficit calculation,
cognition_thres indicates the threshold below which there is a cognitive deficit.
- cens
A matrix or data frame with two columns, the first indicating the lower bound and the second the upper bound of the scale named in the rowname. If defaults == TRUE, or param = NULL which does not censor data (and prints latent scores instead).
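The Mu, Sigma and cens formats described above can be illustrated with a short base R sketch. The numeric values are the illustrative ones from the Examples section, not the package defaults, and the column names "lower" and "upper" are an assumption made for readability (the documentation only fixes the column order).

```r
# Illustrative values only (taken from the Examples section, not package defaults)
vars <- c("FAQ", "MMSE", "MoCA")
Mu   <- c(FAQ = 4.05, MMSE = 26.69, MoCA = 24.07)
sds  <- c(FAQ = 4.89, MMSE = 2.22, MoCA = 3.48)

# Correlation matrix with matching row and column names
corrs <- matrix(
  c(1, 0,    0,
    0, 1,    0.63,
    0, 0.63, 1),
  nrow = 3, byrow = TRUE, dimnames = list(vars, vars)
)

# Covariance: Sigma[i, j] = corrs[i, j] * sds[i] * sds[j]
Sigma <- corrs * outer(sds, sds)

# Censoring bounds: one row per scale, lower bound first, upper bound second
# (column names are hypothetical labels)
cens <- matrix(
  c(0, 30,
    0, 30,
    0, 30),
  ncol = 2, byrow = TRUE, dimnames = list(vars, c("lower", "upper"))
)

# Basic consistency checks before passing the objects on
stopifnot(
  identical(names(Mu), rownames(Sigma)),
  isSymmetric(Sigma),
  all(cens[, 1] < cens[, 2])
)
```

Building Sigma from standard deviations and correlations this way gives the same result as MBESS::cor2cov() used in the Examples, without the extra dependency.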
Value
A tibble containing:
- id
Character; identifier of a simulated patient.
- type
Character; algorithm names specified by rownames of crits.
- PDD
Logical; PDD diagnosis for each patient (id) calculated via each algorithm (type).
- raw scores
Numeric; scores showing raw simulated data for each variable supplied to the data-generating process. Defaults to columns FAQ, MMSE, MoCA, sMoCA and FAQ9.
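A returned object of this shape could be summarised as in the sketch below. It uses a small hand-made data frame as a stand-in for the tibble and assumes a long layout with one row per patient/algorithm pair; all values are hypothetical and chosen only to show the column roles.

```r
# Hand-made stand-in for the returned tibble (hypothetical values, long layout assumed)
sim <- data.frame(
  id   = rep(c("p1", "p2", "p3"), times = 2),
  type = rep(c("A1", "A2"), each = 3),
  PDD  = c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE),
  FAQ  = rep(c(12, 3, 8), times = 2),
  MMSE = rep(c(22, 29, 27), times = 2)
)

# Proportion of patients classified as PDD by each algorithm
pdd_rate <- tapply(sim$PDD, sim$type, mean)

# Counts of PDD / non-PDD classifications per algorithm
pdd_counts <- table(algorithm = sim$type, PDD = sim$PDD)
```

Comparing pdd_rate across algorithms is one simple way to see how much the estimated PDD frequency depends on the operationalisation chosen in crits.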
Details
The function simulates the PDD status of N synthetic patients via the following steps:
- generate latent FAQ and cognitive scores via MASS::mvrnorm(),
- generate observed FAQ item 9 data via rbinom() following the tau-equivalence assumption applied to the FAQ questionnaire,
- censor the scores according to cens and round to the nearest integer to obtain observed data,
- apply criteria for PDD from crits to get probable PDD for each algorithm/patient pair.
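The steps above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's implementation: the parameter values are illustrative, FAQ and MMSE are treated as uncorrelated, and the FAQ item 9 step assumes an item scored 0 to 3 whose success probability is the bounded latent FAQ score divided by 30.

```r
set.seed(1)
N    <- 200
vars <- c("FAQ", "MMSE")
Mu    <- c(FAQ = 4.05, MMSE = 26.69)
Sigma <- matrix(c(4.89^2, 0, 0, 2.22^2), nrow = 2, dimnames = list(vars, vars))
cens  <- matrix(c(0, 0, 30, 30), nrow = 2,
                dimnames = list(vars, c("lower", "upper")))

# Step 1: latent (continuous, unbounded) scores
latent <- MASS::mvrnorm(N, mu = Mu, Sigma = Sigma)

# Step 2: FAQ item 9
# ASSUMPTION: item scored 0-3, probability = bounded latent FAQ / 30
p_item <- pmin(pmax(latent[, "FAQ"], 0), 30) / 30
FAQ9   <- rbinom(N, size = 3, prob = p_item)

# Step 3: censor each scale to its bounds, then round to integers
observed <- latent
for (v in vars) {
  bounded <- pmin(pmax(latent[, v], cens[v, "lower"]), cens[v, "upper"])
  observed[, v] <- round(bounded)
}

# Step 4: apply one criterion (IADL deficit above 7, cognitive deficit below 26)
PDD <- observed[, "FAQ"] > 7 & observed[, "MMSE"] < 26
```

In the function itself, step 4 is repeated for every row of crits, which yields one PDD value per algorithm/patient pair.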
See also
prepare_defaults() for the correct format of the simulation parameters.
Examples
if (FALSE) { # \dontrun{
# Simulate using defaults:
sim1 <- simulate_pdd_data()
sim2 <- simulate_pdd_data(defaults = TRUE) # the same data-generating process as sim1
# Generate using two cognitive measures not correlated with FAQ:
mu <- c(FAQ = 4.05, MMSE = 26.69, MoCA = 24.07)
sd <- c(FAQ = 4.89, MMSE = 2.22, MoCA = 3.48)
corrs <- matrix(
  c(1, 0, 0, 0, 1, 0.63, 0, 0.63, 1), nrow = 3,
  dimnames = lapply(seq_len(2), \(i) c("FAQ", "MMSE", "MoCA"))
)
cens <- matrix(
  c(rep(0, 3), rep(30, 3)), nrow = 3,
  dimnames = list(c("FAQ", "MMSE", "MoCA"))
)
sigma <- MBESS::cor2cov(corrs, sd)
crits <- data.frame(
  row.names = c("A1", "A2", "A3", "A4"),
  IADL = rep(c("FAQ", "FAQ9"), 2),
  IADL_thres = rep(c(7, 1), 2),
  cognition = c(rep("MMSE", 2), rep("MoCA", 2)),
  cognition_thres = c(rep(26, 2), rep(26, 2))
)
sim3 <- simulate_pdd_data(
  N = 2000,
  defaults = FALSE, # It is crucial to set this to FALSE
  Mu = mu, Sigma = sigma,
  crits = crits, cens = cens # pass the criteria and bounds defined above
)
} # }