| Title: | Flexible Bayesian Model Selection and Model Averaging |
|---|---|
| Description: | Implements the Mode Jumping Markov Chain Monte Carlo algorithm described in <doi:10.1016/j.csda.2018.05.020> and its Genetically Modified counterpart described in <doi:10.1613/jair.1.13047> as well as the sub-sampling versions described in <doi:10.1016/j.ijar.2022.08.018> for flexible Bayesian model selection and model averaging. |
| Authors: | Jon Lachmann [cre, aut], Aliaksandr Hubin [aut] |
| Maintainer: | Jon Lachmann <[email protected]> |
| License: | GPL-2 |
| Version: | 1.3 |
| Built: | 2026-05-18 09:20:24 UTC |
| Source: | https://github.com/jonlachmann/fbms |
Authors:
Jon Lachmann [email protected]
Aliaksandr Hubin [email protected]
Other contributors:
Florian Frommlet [email protected] [contributor]
Geir Storvik [email protected] [contributor]
Lachmann, J., Storvik, G., Frommlet, F., & Hubin, A. (2022). A subsampling approach for Bayesian model selection. International Journal of Approximate Reasoning, 151, 33-63. Elsevier.
Hubin, A., Storvik, G., & Frommlet, F. (2021). Flexible Bayesian Nonlinear Model Configuration. Journal of Artificial Intelligence Research, 72, 901-942.
Hubin, A., Frommlet, F., & Storvik, G. (2021). Reversible Genetically Modified MJMCMC. Under review in EYSM 2021.
Hubin, A., & Storvik, G. (2018). Mode jumping MCMC for Bayesian variable selection in GLMM. Computational Statistics & Data Analysis, 127, 281-297. Elsevier.
A data frame with 4177 observations on the following 9 variables.
Diameter Perpendicular to length, continuous
Height with meat in shell, continuous.
Longest shell measurement, continuous
Number of rings; adding 1.5 gives the age in years, integer
Sex of the abalone, F is female, M male, and I infant, categorical.
Grams after being dried, continuous.
Grams weight of meat, continuous.
Grams gut weight (after bleeding), continuous.
Grams whole abalone, continuous.
See the web page https://archive.ics.uci.edu/ml/datasets/Abalone for more information about the data set.
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository https://archive.ics.uci.edu/ml/. Irvine, CA: University of California, School of Information and Computer Science.
data(abalone)
str(abalone)
Dispatches to methods for extracting aggregated predictions from objects.
aggr(object, ...)
object |
An object. |
... |
Additional arguments passed to methods. |
Aggregated predictions (format depends on the object class).
Extracts the aggregated predictions (mean and quantiles) from an FBMS prediction object.
## S3 method for class 'fbms_predict'
aggr(object, ...)
object |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
List containing aggregated mean and quantiles.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
pred <- predict(model, exoplanet[51:60, -1])
aggr(pred)
Arcsinh Transform
arcsinh(x)
x |
The vector of values |
arcsinh(x)
arcsinh(2)
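For orientation, the inverse hyperbolic sine has the closed form log(x + sqrt(x^2 + 1)). The sketch below uses base R only. The helper name arcsinh_sketch is hypothetical, and it is an assumption that the package's arcsinh agrees with base R's asinh.

```r
# Hypothetical stand-in, not the package function: closed form of asinh.
arcsinh_sketch <- function(x) log(x + sqrt(x^2 + 1))

x <- c(-2, 0, 0.5, 2)

# Compare against base R's built-in inverse hyperbolic sine.
all.equal(arcsinh_sketch(x), asinh(x))
```

Unlike log(x), this transform is defined for zero and negative inputs.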
Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.
data(breastcancer)
A data frame with 569 rows and 32 variables
Separating plane described above was obtained using Multisurface Method-Tree (MSM-T) (K. P. Bennett, "Decision Tree Construction Via Linear Programming." Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97-101, 1992), a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1-4 features and 1-3 separating planes.
The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: (K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34).
The variables are as follows:
ID number
Diagnosis (1 = malignant, 0 = benign)
Ten real-valued features are computed for each cell nucleus
Dataset downloaded from the UCI Machine Learning Repository. http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
Creators:
Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg 'at' eagle.surgery.wisc.edu
W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street 'at' cs.wisc.edu 608-262-6619
Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi 'at' cs.wisc.edu
Donor: Nick Street
W.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.
Lichman, M. (2013). UCI Machine Learning Repository http://archive.ics.uci.edu/ml. Irvine, CA: University of California, School of Information and Computer Science.
Extracts coefficients from a BGNLM model.
## S3 method for class 'bgnlm_model'
coef(object, ...)
object |
Object of class "bgnlm_model". |
... |
Additional arguments (ignored). |
Vector of coefficients.
data(exoplanet)
model <- get.best.model(fbms(semimajoraxis ~ ., data = exoplanet, family = "gaussian"))
coef(model)
Extracts coefficients from the best GMJMCMC model found.
## S3 method for class 'gmjmcmc'
coef(object, ...)
object |
Object of class "gmjmcmc". |
... |
Additional arguments (ignored). |
Vector of coefficients from the best model found.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc", transforms = c("sigmoid"))
coef(model)
Extracts coefficients from the best GMJMCMC merged model.
## S3 method for class 'gmjmcmc_merged'
coef(object, ...)
object |
Object of class "gmjmcmc_merged". |
... |
Additional arguments (ignored). |
Vector of coefficients from the best model found.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc.parallel",
              transforms = c("sigmoid"), runs = 2, cores = 1)
coef(model)
Extracts coefficients from the best MJMCMC model.
## S3 method for class 'mjmcmc'
coef(object, ...)
object |
Object of class "mjmcmc". |
... |
Additional arguments (ignored). |
Vector of coefficients from the best model found.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
coef(model)
Extracts coefficients from the best MJMCMC parallel model.
## S3 method for class 'mjmcmc_parallel'
coef(object, ...)
object |
Object of class "mjmcmc_parallel". |
... |
Additional arguments (ignored). |
Vector of coefficients from the best model found.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc.parallel", cores = 1, runs = 2)
coef(model)
This function computes model averaged effects for specified covariates using a fitted model object. The effects are the expected changes in the BMA linear predictor when the corresponding covariate increases by one unit while the other covariates are fixed at 0. Users can provide custom labels and specify the quantiles for which the effects are computed.
compute_effects(object, labels, quantiles = c(0.025, 0.5, 0.975))
object |
A fitted model object, typically the result of a regression or predictive modeling. |
labels |
A vector of labels for which effects are to be computed. |
quantiles |
A numeric vector specifying the quantiles to be calculated. Default is c(0.025, 0.5, 0.975). |
A matrix of treatment effects for the specified labels, with rows corresponding to labels and columns to quantiles.
data <- data.frame(matrix(rnorm(600), 100))
result <- mjmcmc.parallel(runs = 2, cores = 1, y = matrix(rnorm(100), 100), x = data, loglik.pi = gaussian.loglik)
compute_effects(result, labels = names(data))
Cosine Function for Degrees
cos_deg(x)
x |
The vector of values in degrees |
The cosine of x
cos_deg(0)
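A cosine that takes degrees can be pictured as a unit conversion followed by base R's cos, which expects radians. The helper cos_deg_sketch below is hypothetical and only illustrates that conversion. It is not taken from the package source.

```r
# Hypothetical stand-in: convert degrees to radians, then apply cos().
cos_deg_sketch <- function(x) cos(x * pi / 180)

# Evaluate at a few familiar angles (results are subject to floating-point rounding).
cos_deg_sketch(c(0, 60, 180))
```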
Plots the convergence of summary statistics (e.g., median, mean) of log posteriors or marginal likelihoods over populations for a GMJMCMC or GMJMCMC merged result object, with confidence intervals.
diagn_plot(
  res,
  FUN = median,
  conf = 0.95,
  burnin = 0,
  window = 5,
  ylim = NULL,
  ...
)
res |
Object of class |
FUN |
Function to compute summary statistics (e.g., |
conf |
Numeric; confidence level for intervals (e.g., 0.95 for 95%). Default is 0.95. |
burnin |
Integer; number of initial populations to skip. Default is 0. |
window |
Integer; size of the sliding window for computing standard deviation. Default is 5. |
ylim |
Numeric vector; y-axis limits for the plot. If |
... |
Additional graphical parameters passed to |
Returns invisible(NULL). The function is called for its side effect of producing a plot.
data(exoplanet)
result <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc", transforms = c("sin"))
diagn_plot(result, FUN = median, conf = 0.95, main = "Convergence Plot")
Erf Function
erf(x)
x |
The vector of values |
2 * pnorm(x * sqrt(2)) - 1
erf(2)
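The formula 2 * pnorm(x * sqrt(2)) - 1 given above can be tried directly in base R. The sketch below wraps it in a hypothetical helper, erf_sketch, so it does not depend on the package being installed. It checks two well-known properties of the error function: erf(0) = 0 and odd symmetry.

```r
# Hypothetical stand-in that evaluates the documented formula with base R's pnorm().
erf_sketch <- function(x) 2 * pnorm(x * sqrt(2)) - 1

x <- c(-1, 0, 1, 2)
erf_sketch(x)

# The error function is odd: erf(-x) = -erf(x).
all.equal(erf_sketch(-x), -erf_sketch(x))
```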
Data fields include planet and host star attributes.
data(exoplanet)
A data frame with 223 rows and 11 variables
The variables are as follows:
semimajoraxis: Semi-major axis of the planetary object's orbit in astronomical units
mass: Mass of the planetary object in Jupiter masses
radius: Radius of the planetary object in Jupiter radii
period: Orbital period of the planetary object in days
eccentricity: Eccentricity of the planetary object's orbit
hoststar_mass: Mass of the host star in solar masses
hoststar_radius: Radius of the host star in solar radii
hoststar_metallicity: Metallicity of the host star
hoststar_temperature: Effective temperature of the host star in Kelvin
binaryflag: Flag indicating the type of planetary system
Dataset downloaded from the Open Exoplanet Catalogue Repository. https://github.com/OpenExoplanetCatalogue/oec_tables/
Creators:
Prof. Hanno Rein, Department for Physical and Environmental Sciences. University of Toronto at Scarborough Toronto, Ontario M1C 1A4 hanno.rein 'at' utoronto.ca
Double Exponential Function
exp_dbl(x)
x |
The vector of values |
e^(-abs(x))
exp_dbl(2)
This function fits a model using the relevant MCMC sampling. The user can specify the formula, family, data, transforms, and other parameters to customize the model.
fbms(
  formula = NULL,
  family = "gaussian",
  beta_prior = list(type = "g-prior"),
  model_prior = NULL,
  extra_params = NULL,
  data = NULL,
  impute = FALSE,
  loglik.pi = NULL,
  method = "mjmcmc",
  verbose = TRUE,
  ...
)
formula |
A formula object specifying the model structure. Default is NULL. |
family |
The distribution family of the response variable. Currently supports "gaussian", "binomial", "poisson", "gamma", and "custom". Default is "gaussian". |
beta_prior |
Type of prior as a string (default: "g-prior" with a = max(n, p^2)). Possible values include:
- "beta.prime": Beta-prime prior (GLM/Gaussian, no additional args)
- "CH": Compound Hypergeometric prior (GLM/Gaussian, requires
|
model_prior |
A list with the parameters of the model prior; by default, r should be provided. |
extra_params |
extra parameters to be passed to the loglik.pi function |
data |
A data frame or matrix containing the data to be used for model fitting. If the outcome variable is in the first column of the data frame, the formula argument in fbms can be omitted, provided that all other columns are intended to serve as input covariates. |
impute |
TRUE means imputation combined with adding a dummy column with indicators of imputed values, FALSE (default) means only full data is used. |
loglik.pi |
Custom function to compute the logarithm of the posterior mode based on the logarithm of the marginal likelihood and the logarithm of the prior (only needs to be specified, and is only used, if family = "custom")
method |
Which fitting algorithm to use. Currently implemented options are "gmjmcmc", "gmjmcmc.parallel", "mjmcmc" and "mjmcmc.parallel". The default is "mjmcmc", which means that only linear models will be estimated.
verbose |
If TRUE, print detailed progress information during the fitting process. Default is TRUE. |
... |
Additional parameters to be passed to the underlying method. |
An object containing the results of the fitted model and MCMC sampling.
mjmcmc, gmjmcmc, gmjmcmc.parallel
# Fit a Gaussian multivariate time series model
fbms_result <- fbms(
  X1 ~ .,
  family = "gaussian",
  method = "gmjmcmc.parallel",
  data = data.frame(matrix(rnorm(600), 100)),
  transforms = c("sin","cos"),
  P = 10,
  runs = 1,
  cores = 1
)
summary(fbms_result)
This function serves as a unified interface to compute the log marginal likelihood for different regression models and priors by calling specific log likelihood functions.
fbms.mlik.master(
  y,
  x,
  model,
  complex,
  mlpost_params = list(family = "gaussian", beta_prior = list(type = "g-prior"), r = NULL)
)
y |
A numeric vector containing the dependent variable. |
x |
A matrix containing the precalculated features (independent variables). |
model |
A logical vector indicating which variables to include in the model. |
complex |
A list of complexity measures for the features. |
mlpost_params |
A list of parameters controlling the model family, prior, and tuning parameters. Key elements include:
|
A list with elements:
crit |
Log marginal likelihood combined with the log prior. |
coefs |
Posterior mode of the coefficients. |
fbms.mlik.master(
  y = rnorm(100),
  x = matrix(rnorm(100)),
  c(TRUE, TRUE),
  list(oc = 1),
  mlpost_params = list(
    family = "gaussian",
    beta_prior = list(type = "g-prior", a = 2),
    r = exp(-0.5)
  )
)
Extracts the mean predictions from an FBMS prediction object (alias for mean).
## S3 method for class 'fbms_predict'
fitted(object, ...)
object |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
Vector of mean predictions.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
pred <- predict(model, exoplanet[51:60, -1])
fitted(pred)
Log Likelihood Function for Gaussian Regression with a Jeffreys Prior and BIC Approximation
gaussian.loglik(y, x, model, complex, mlpost_params)
y |
A vector containing the dependent variable |
x |
The matrix containing the precalculated features |
model |
The model to estimate as a logical vector |
complex |
A list of complexity measures for the features |
mlpost_params |
A list of parameters for the log likelihood, supplied by the user |
A list with the log marginal likelihood combined with the log prior (crit) and the posterior mode of the coefficients (coefs).
gaussian.loglik(rnorm(100), matrix(rnorm(100)), TRUE, list(oc = 1), NULL)
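The BIC approximation named in the title can be illustrated with base R alone. The sketch below is not the package's implementation: it fits an ordinary least-squares model with lm and approximates the log marginal likelihood by -BIC/2. The prior term that gaussian.loglik folds into crit is deliberately left out, and all variable names and simulated values are made up for illustration.

```r
set.seed(1)
n <- 100
x <- matrix(rnorm(n * 3), n)        # three simulated features
y <- 1 + 0.5 * x[, 1] + rnorm(n)    # only the first feature carries signal

model <- c(TRUE, FALSE, TRUE)       # hypothetical logical inclusion vector
fit <- lm(y ~ x[, model, drop = FALSE])

# BIC = -2 * logLik + k * log(n), so -BIC / 2 approximates the log marginal likelihood.
approx_log_mlik <- -BIC(fit) / 2
approx_log_mlik
```

Comparing this quantity across different inclusion vectors gives a rough ranking of candidate models under this approximation.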
GELU Function
gelu(x)
x |
The vector of values |
x*pnorm(x)
gelu(2)
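The value x*pnorm(x) stated above can be explored without loading the package. The following base R sketch uses a hypothetical helper, gelu_sketch, to show the typical behaviour: close to 0 for large negative inputs and close to x for large positive ones.

```r
# Hypothetical stand-in that evaluates the documented formula x * pnorm(x).
gelu_sketch <- function(x) x * pnorm(x)

x <- c(-5, -1, 0, 1, 5)
# Side-by-side view of inputs and transformed values.
cbind(x = x, gelu = gelu_sketch(x))
```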
This function generates the full list of parameters required for the Genetically Modified Mode Jumping Markov Chain Monte Carlo (GMJMCMC) algorithm, building upon the parameters from gen.params.mjmcmc. The generated parameter list includes feature generation settings, population control parameters, and optimization controls for the search process.
gen.params.gmjmcmc(ncov)
ncov |
The number of covariates in the dataset that will be used in the algorithm |
A list of parameters for controlling GMJMCMC behavior:
feat$D: Maximum feature depth, default 5. Limits the number of recursive feature transformations. For fractional polynomials, it is recommended to set D = 1.
feat$L: Maximum number of features per model, default 15. Increase for complex models.
feat$alpha: Strategy for generating alpha parameters in non-linear projections:
  "unit": (Default) Sets all components to 1.
  "deep": Optimizes alpha across all feature layers.
  "random": Samples alpha from the prior for a fully Bayesian approach.
feat$pop.max: Maximum feature population size per iteration. Defaults to min(100, as.integer(1.5 * p)), where p is the number of covariates.
feat$keep.org: Logical flag; if TRUE, original covariates remain in every population (default FALSE).
feat$prel.filter: Threshold for pre-filtering covariates before the first population generation. Default 0 disables filtering.
feat$prel.select: Indices of covariates to include initially. Default NULL includes all.
feat$keep.min: Minimum proportion of features to retain during population updates. Default 0.8.
feat$eps: Threshold for feature inclusion probability during generation. Default 0.05.
feat$check.col: Logical; if TRUE (default), checks for collinearity during feature generation.
feat$max.proj.size: Maximum number of existing features used to construct a new one. Default 15.
rescale.large: Logical flag for rescaling large data values for numerical stability. Default FALSE.
burn_in: The burn-in period for the MJMCMC algorithm, which is set to 100 iterations by default.
mh: A list containing parameters for the regular Metropolis-Hastings (MH) kernel:
  neigh.size: The size of the neighborhood for MH proposals with fixed proposal size, default set to 1.
  neigh.min: The minimum neighborhood size for random proposal size, default set to 1.
  neigh.max: The maximum neighborhood size for random proposal size, default set to 2.
large: A list containing parameters for the large jump kernel:
  neigh.size: The size of the neighborhood for large jump proposals with fixed neighborhood size, default set to the smaller of 0.35 * p and 35, where p is the number of covariates.
  neigh.min: The minimum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.25 * p and 25.
  neigh.max: The maximum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.45 * p and 45.
random: A list containing a parameter for the randomization kernel:
  prob: The small probability of changing the component around the mode, default set to 0.01.
sa: A list containing parameters for the simulated annealing kernel:
  probs: A numeric vector of length 6 specifying the probabilities for different types of proposals in the simulated annealing algorithm.
  neigh.size: The size of the neighborhood for the simulated annealing proposals, default set to 1.
  neigh.min: The minimum neighborhood size, default set to 1.
  neigh.max: The maximum neighborhood size, default set to 2.
  t.init: The initial temperature for simulated annealing, default set to 10.
  t.min: The minimum temperature for simulated annealing, default set to 0.0001.
  dt: The temperature decrement factor, default set to 3.
  M: The number of iterations in the simulated annealing process, default set to 12.
greedy: A list containing parameters for the greedy algorithm:
  probs: A numeric vector of length 6 specifying the probabilities for different types of proposals in the greedy algorithm.
  neigh.size: The size of the neighborhood for greedy algorithm proposals, set to 1.
  neigh.min: The minimum neighborhood size for greedy proposals, set to 1.
  neigh.max: The maximum neighborhood size for greedy proposals, set to 2.
  steps: The number of steps for the greedy algorithm, set to 20.
  tries: The number of tries for the greedy algorithm, set to 3.
loglik: A list to store log-likelihood values, which is by default empty.
data <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100))
params <- gen.params.gmjmcmc(ncol(data) - 1)
str(params)
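Several of the defaults listed above scale with the number of covariates p (the ncov argument). The base R sketch below evaluates only the formulas quoted in the text for two values of p. The helper default_sizes is hypothetical, and whether the package additionally rounds these values is not stated here, so none is applied.

```r
# Hypothetical helper: evaluates only the default formulas quoted in the text.
default_sizes <- function(p) {
  list(
    pop.max = min(100, as.integer(1.5 * p)),
    large = list(
      neigh.size = min(0.35 * p, 35),
      neigh.min  = min(0.25 * p, 25),
      neigh.max  = min(0.45 * p, 45)
    )
  )
}

str(default_sizes(20))    # small problem: sizes grow with p
str(default_sizes(200))   # large problem: the caps 100, 35, 25 and 45 apply
```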
Generate a Parameter List for MJMCMC (Mode Jumping MCMC)
gen.params.mjmcmc(ncov)
ncov |
The number of covariates in the dataset that will be used in the algorithm |
A list of parameters to use when running the mjmcmc function.
The list contains the following elements:
burn_in: The burn-in period for the MJMCMC algorithm, which is set to 100 iterations by default.
mh: A list containing parameters for the regular Metropolis-Hastings (MH) kernel:
  neigh.size: The size of the neighborhood for MH proposals with fixed proposal size, default set to 1.
  neigh.min: The minimum neighborhood size for random proposal size, default set to 1.
  neigh.max: The maximum neighborhood size for random proposal size, default set to 2.
large: A list containing parameters for the large jump kernel:
  neigh.size: The size of the neighborhood for large jump proposals with fixed neighborhood size, default set to the smaller of 0.35 * p and 35, where p is the number of covariates.
  neigh.min: The minimum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.25 * p and 25.
  neigh.max: The maximum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.45 * p and 45.
random: A list containing a parameter for the randomization kernel:
  prob: The small probability of changing the component around the mode, default set to 0.01.
sa: A list containing parameters for the simulated annealing kernel:
  probs: A numeric vector of length 6 specifying the probabilities for different types of proposals in the simulated annealing algorithm.
  neigh.size: The size of the neighborhood for the simulated annealing proposals, default set to 1.
  neigh.min: The minimum neighborhood size, default set to 1.
  neigh.max: The maximum neighborhood size, default set to 2.
  t.init: The initial temperature for simulated annealing, default set to 10.
  t.min: The minimum temperature for simulated annealing, default set to 0.0001.
  dt: The temperature decrement factor, default set to 3.
  M: The number of iterations in the simulated annealing process, default set to 12.
greedy: A list containing parameters for the greedy algorithm:
  probs: A numeric vector of length 6 specifying the probabilities for different types of proposals in the greedy algorithm.
  neigh.size: The size of the neighborhood for greedy algorithm proposals, set to 1.
  neigh.min: The minimum neighborhood size for greedy proposals, set to 1.
  neigh.max: The maximum neighborhood size for greedy proposals, set to 2.
  steps: The number of steps for the greedy algorithm, set to 20.
  tries: The number of tries for the greedy algorithm, set to 3.
loglik: A list to store log-likelihood values, which is by default empty.
Note that the $loglik item is an empty list, which is passed to the log likelihood function of the model,
intended to store parameters that the estimator function should use.
gen.params.mjmcmc(ncol(matrix(rnorm(600), 100)))
Generate a Probability List for GMJMCMC (Genetically Modified MJMCMC)
gen.probs.gmjmcmc(transforms)
transforms |
A list of the transformations used (to get the count). |
A named list with eight elements:
large: The probability of a large jump kernel in the MJMCMC algorithm. With this probability, a large jump proposal will be made; otherwise, a local Metropolis-Hastings proposal will be used. One needs to consider good mixing around and between modes when specifying this parameter.
large.kern: A numeric vector of length 4 specifying the probabilities for different types of large jump kernels. The four components correspond to:
  1. Random change with random neighborhood size
  2. Random change with fixed neighborhood size
  3. Swap with random neighborhood size
  4. Swap with fixed neighborhood size
  These probabilities will be automatically normalized if they do not sum to 1.
localopt.kern: A numeric vector of length 2 specifying the probabilities for different local optimization methods during large jumps. The first value represents the probability of using simulated annealing, while the second corresponds to the greedy optimizer. These probabilities will be normalized if needed.
random.kern: A numeric vector of length 2 specifying the probabilities of the first two randomization kernels applied after local optimization. These correspond to the same kernel types as in large.kern but are used for local proposals, where only types 1 and 2 are allowed.
mh: A numeric vector specifying the probabilities of different standard Metropolis-Hastings kernels, where the first four are the same as for the other kernels, while the fifth and sixth components are uniform addition/deletion of a covariate.
filter: A numeric value controlling the filtering of features with low posterior probabilities in the current population. Features with posterior probabilities below this threshold will be removed with a probability proportional to .
gen: A numeric vector of length 4 specifying the probabilities of different feature generation operators. These determine how new nonlinear features are introduced. The first entry gives the probability for an interaction, followed by modification, nonlinear projection, and a mutation operator, which reintroduces discarded features. If these probabilities do not sum to 1, they are automatically normalized.
trans: A numeric vector of length equal to the number of elements in transforms, specifying the probabilities of selecting each nonlinear transformation from . By default, a uniform distribution is assigned, but this can be modified by providing a specific transforms argument.
gen.probs.gmjmcmc(c("p0", "exp_dbl"))
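The remark that kernel probabilities are normalized when they do not sum to 1 can be pictured as follows. This is a base R sketch with made-up weights. It does not call the package, and the variable names are chosen only for illustration.

```r
# Unnormalised weights for the four large jump kernel types (illustrative values).
large.kern <- c(1, 1, 2, 4)

# Normalise so that the entries sum to 1.
large.kern.norm <- large.kern / sum(large.kern)
large.kern.norm

# Draw one kernel type (1 to 4) according to these probabilities.
set.seed(1)
kernel_type <- sample(seq_along(large.kern.norm), size = 1, prob = large.kern.norm)
kernel_type
```

Only the relative sizes of the entries matter, so c(1, 1, 2, 4) and c(0.125, 0.125, 0.25, 0.5) describe the same proposal mix.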
Generate a Probability List for MJMCMC (Mode Jumping MCMC)
gen.probs.mjmcmc()
A named list with five elements:
large: A numeric value representing the probability of making a large jump. If a large jump is not made, a local MH (Metropolis-Hastings) proposal is used instead.
large.kern: A numeric vector of length 4 specifying the probabilities for different types of large jump kernels. The four components correspond to:
  1. Random change with random neighborhood size
  2. Random change with fixed neighborhood size
  3. Swap with random neighborhood size
  4. Swap with fixed neighborhood size
  These probabilities will be automatically normalized if they do not sum to 1.
localopt.kern: A numeric vector of length 2 specifying the probabilities for different local optimization methods during large jumps. The first value represents the probability of using simulated annealing, while the second corresponds to the greedy optimizer. These probabilities will be normalized if needed.
random.kern: A numeric vector of length 2 specifying the probabilities of different randomization kernels applied after local optimization of type one or two. These correspond to the first two kernel types as in large.kern but are used for local proposals with different neighborhood sizes.
mh: A numeric vector specifying the probabilities of different standard Metropolis-Hastings kernels, where the first four are the same as for the other kernels, while the fifth and sixth components are uniform addition/deletion of a covariate.
gen.probs.mjmcmc()
Retrieves the best model from the results of MJMCMC, MJMCMC parallel, GMJMCMC, or GMJMCMC merged runs
based on the maximum criterion value (crit). The returned list includes the model probability,
selected features, criterion value, intercept parameter, and named coefficients.
get.best.model(result, labels = FALSE, ...)
result |
An object of class |
labels |
Logical; if |
... |
Additional arguments passed to methods. |
The function identifies the best model by selecting the one with the highest crit value.
Selection logic depends on the class of the result object:
mjmcmc: Selects the top model from a single MJMCMC run.
mjmcmc_parallel: Identifies the best chain, then selects the best model from that chain.
gmjmcmc: Selects the best population and model within that population.
gmjmcmc_merged: Finds the best chain and population before extracting the top model.
A list containing the details of the best model:
prob: A numeric value representing the model's probability.
model: A logical vector indicating which features are included in the best model.
crit: The criterion value used for model selection (e.g., marginal likelihood or posterior probability).
alpha: The intercept parameter of the best model.
coefs: A named numeric vector of model coefficients, including the intercept and selected features.
data(exoplanet)
result <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
get.best.model(result)
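The selection rule described above, picking the model with the highest crit value, can be mimicked on a toy list in base R. The structure below is a simplified, hypothetical stand-in for a result object. Real result objects contain more fields and, for parallel or GMJMCMC runs, additional nesting by chain and population.

```r
# Toy list of visited models; each has a logical inclusion vector and a crit value.
models <- list(
  list(model = c(TRUE, FALSE, FALSE), crit = -152.3),
  list(model = c(TRUE, TRUE, FALSE),  crit = -148.9),
  list(model = c(FALSE, TRUE, TRUE),  crit = -150.1)
)

# Extract the criterion values and keep the model with the largest one.
crits <- vapply(models, function(m) m$crit, numeric(1))
best <- models[[which.max(crits)]]
best$model
best$crit
```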
This function extracts the Median Probability Model (MPM) from a fitted model object. The MPM includes features with marginal posterior inclusion probabilities greater than 0.5. It constructs the corresponding model matrix and computes the model fit using the specified likelihood.
get.mpm.model(
  result,
  y,
  x,
  labels = F,
  family = "gaussian",
  loglik.pi = gaussian.loglik,
  params = NULL
)
result |
A fitted model object (e.g., from |
y |
A numeric vector of response values. For |
x |
A |
labels |
If specified, custom labels of covariates can be used. Default is |
family |
Character string specifying the model family. Supported options are:
If an unsupported family is provided, a warning is issued and the Gaussian likelihood is used by default. |
loglik.pi |
A function that computes the log-likelihood. Defaults to |
params |
Parameters of |
A bgnlm_model object containing:
prob: The log marginal likelihood of the MPM.
model: A logical vector indicating included features.
crit: Criterion label set to "MPM".
coefs: A named numeric vector of model coefficients, including the intercept.
## Not run:
# Simulate data
set.seed(42)
x <- data.frame(
  PlanetaryMassJpt = rnorm(100),
  RadiusJpt = rnorm(100),
  PeriodDays = rnorm(100)
)
y <- 1 + 0.5 * x$PlanetaryMassJpt - 0.3 * x$RadiusJpt + rnorm(100)
# Assume 'result' is a fitted object from gmjmcmc or mjmcmc
result <- mjmcmc(x = x, y = y)
# Get the MPM
mpm_model <- get.mpm.model(result, y, x, family = "gaussian")
# Access coefficients
mpm_model$coefs
## End(Not run)
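The median probability rule itself can be sketched without the package. The example below assumes a hypothetical vector of marginal posterior inclusion probabilities and uses an ordinary least-squares refit (`lm.fit`) as a stand-in for the likelihood-based fit that `get.mpm.model` performs, so the numbers are illustrative only.

```r
# Hypothetical marginal posterior inclusion probabilities for four features.
marg_probs <- c(x1 = 0.91, x2 = 0.48, x3 = 0.50, x4 = 0.73)

# The MPM keeps features whose inclusion probability is strictly above 0.5.
mpm <- marg_probs > 0.5

# Simulated data with the same column names as the probabilities.
set.seed(1)
x <- matrix(rnorm(200 * 4), 200, 4, dimnames = list(NULL, names(marg_probs)))
y <- 1 + 0.5 * x[, "x1"] - 0.3 * x[, "x4"] + rnorm(200)

# Build the model matrix (intercept plus selected features) and refit.
# Assumption: least squares replaces the package's own likelihood here.
fit <- lm.fit(cbind(Intercept = 1, x[, mpm, drop = FALSE]), y)
coefs <- fit$coefficients
coefs
```

Note that a feature with probability exactly 0.5 (here `x3`) is excluded under the "greater than 0.5" rule.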
Main Algorithm for GMJMCMC (Genetically Modified MJMCMC)
gmjmcmc( x, y, transforms, P = 10, N = 100, N.final = NULL, probs = NULL, params = NULL, loglik.pi = NULL, loglik.alpha = gaussian.loglik.alpha, mlpost_params = list(family = "gaussian", beta_prior = list(type = "g-prior")), intercept = TRUE, fixed = 0, sub = FALSE, verbose = TRUE )
x |
matrix containing the design matrix with data to use in the algorithm |
y |
response variable |
transforms |
A character vector including the names of the non-linear functions to be used by the modification and the projection operator. |
P |
The number of population iterations for GMJMCMC. The default value is P = 10, which was used in our initial example for illustrative purposes. However, a larger value, such as P = 50, is typically more appropriate for most practical applications. |
N |
The number of MJMCMC iterations per population. The default value is N = 100; however, for real applications, a larger value such as N = 1000 or higher is often preferable. |
N.final |
The number of MJMCMC iterations performed for the final population. By default, N.final = N, but for practical applications a much larger value (e.g., N.final = 1000) is recommended. Increasing N.final is particularly important if predictions and inferences are based solely on the last population. |
probs |
A list of various probability vectors used by GMJMCMC, generated by |
params |
A list of various parameter vectors used by GMJMCMC, generated by |
loglik.pi |
A function specifying the marginal log-posterior of the model up to a constant, including the logarithm of the model prior:
|
loglik.alpha |
Relevant only if the non-linear projection features depend on parameters |
mlpost_params |
All parameters for the estimator function loglik.pi |
intercept |
Logical. Whether to include an intercept in the design matrix. Default is |
fixed |
Integer specifying the number of leading columns in the design matrix to always include in the model. Default is 0. |
sub |
Logical. If |
verbose |
Logical. Whether to print messages during execution. Default is |
A list containing the following elements:
models |
All models per population. |
mc.models |
All models accepted by mjmcmc per population. |
populations |
All features per population. |
marg.probs |
Marginal feature probabilities per population. |
model.probs |
Marginal probabilities of all visited models per population. |
model.probs.idx |
Indices of unique visited models per population. |
best.margs |
Best marginal model probability per population. |
accept |
Acceptance rate per population. |
accept.tot |
Overall acceptance rate. |
best |
Best marginal model probability throughout the run, represented as the maximum value in |
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
summary(result)
plot(result)
Run Multiple GMJMCMC (Genetically Modified MJMCMC) Runs in Parallel.
gmjmcmc.parallel( x, y, loglik.pi = NULL, mlpost_params = list(family = "gaussian", beta_prior = list(type = "g-prior")), loglik.alpha = gaussian.loglik.alpha, transforms, runs = 2, cores = getOption("mc.cores", 2L), verbose = FALSE, merge.options = list(populations = "best", complex.measure = 2, tol = 1e-07), ... )
x |
matrix containing the design matrix with data to use in the algorithm |
y |
response variable |
loglik.pi |
The (log) density to explore |
mlpost_params |
parameters for the estimator function loglik.pi |
loglik.alpha |
The likelihood function to use for alpha calculation |
transforms |
A Character vector including the names of the non-linear functions to be used by the modification |
runs |
The number of runs to run |
cores |
The number of cores to run on |
verbose |
A logical denoting if messages should be printed |
merge.options |
A list of options to pass to the |
... |
Further parameters passed to gmjmcmc. |
Results from multiple gmjmcmc runs
result <- gmjmcmc.parallel(
  runs = 1, cores = 1, loglik.pi = NULL,
  y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100),
  transforms = c("p0", "exp_dbl")
)
summary(result)
plot(result)
Heaviside Function
hs(x)
x |
The vector of values |
as.integer(x>0)
hs(2)
Imputes missing values in the data using median imputation based on the data set.
impute_x(object, x)
object |
A fitted model object with an "imputed" attribute indicating columns to impute. |
x |
A matrix or data frame |
A matrix with imputed values and additional columns for missingness indicators.
set.seed(123)
x <- matrix(rnorm(60), 10, 6)
colnames(x) <- paste0("X", 1:6)
x[1:2, 1] <- NA  # Introduce missing values
model <- list(imputed = c(1))
attr(model, "imputed") <- c(1)
x_imputed <- impute_x(model, x)
dim(x_imputed)  # 10 rows, 7 columns (6 original + 1 missingness indicator)
any(is.na(x_imputed))  # FALSE, no missing values
Imputes missing values in the test data using median imputation based on the training set.
impute_x_pred(object, x_test, x_train)
object |
A fitted model object with an "imputed" attribute indicating columns to impute. |
x_test |
A matrix or data frame containing the test data. |
x_train |
A matrix or data frame containing the training data. |
A matrix with imputed values and additional columns for missingness indicators.
set.seed(123)
x_test <- matrix(rnorm(60), 10, 6)
colnames(x_test) <- paste0("X", 1:6)
x_test[1:2, 1] <- NA  # Introduce missing values
x_train <- matrix(rnorm(300), 50, 6)
colnames(x_train) <- paste0("X", 1:6)
model <- list(imputed = c(1))
attr(model, "imputed") <- c(1)
x_imputed <- impute_x_pred(model, x_test, x_train)
dim(x_imputed)  # 10 rows, 7 columns (6 original + 1 missingness indicator)
any(is.na(x_imputed))  # FALSE, no missing values
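The idea behind training-set median imputation with missingness indicators can be sketched in base R. The helper `impute_sketch` below is hypothetical and written only for illustration; the indicator column names and their placement after the original columns are assumptions, not the package's documented layout.

```r
# Hypothetical helper: fill NAs in the chosen test columns with the
# training-set median and append one 0/1 missingness indicator per column.
impute_sketch <- function(x_test, x_train, cols) {
  x_test <- as.matrix(x_test)
  x_train <- as.matrix(x_train)
  if (length(cols) == 0) return(x_test)
  indicators <- NULL
  for (j in cols) {
    miss <- is.na(x_test[, j])
    x_test[miss, j] <- median(x_train[, j], na.rm = TRUE)
    indicators <- cbind(indicators, as.numeric(miss))
  }
  colnames(indicators) <- paste0("mis_", cols)  # assumed naming scheme
  cbind(x_test, indicators)
}

set.seed(123)
x_train <- matrix(rnorm(300), 50, 6)
x_test <- matrix(rnorm(60), 10, 6)
x_test[1:2, 1] <- NA
x_imp <- impute_sketch(x_test, x_train, cols = 1)
dim(x_imp)
```

Using the training medians rather than the test medians keeps the test data from influencing the imputed values.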
Log Model Prior Function
log_prior(mlpost_params, complex)
mlpost_params |
list of passed parameters of the likelihood in GMJMCMC |
complex |
list of complexity measures of the features included into the model |
A numeric with the log model prior.
log_prior(mlpost_params = list(r=2), complex = list(oc = 2))
This function is created as an example of how to create an estimator that is used to calculate the marginal likelihood of a model.
logistic.loglik(y, x, model, complex, mlpost_params = list(r = exp(-0.5)))
y |
A vector containing the dependent variable |
x |
The matrix containing the precalculated features |
model |
The model to estimate as a logical vector |
complex |
A list of complexity measures for the features |
mlpost_params |
A list of parameters for the log likelihood, supplied by the user |
A list with the log marginal likelihood combined with the log prior (crit) and the posterior mode of the coefficients (coefs).
logistic.loglik(as.integer(rnorm(100) > 0), matrix(rnorm(100)), TRUE, list(oc = 1))
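To show what a user-supplied estimator with this argument list and return shape might look like, here is a minimal sketch. The function name `bic.logistic.sketch` is hypothetical. Two choices in it are assumptions rather than the package's definition: the marginal likelihood is approximated by -BIC/2 from `stats::glm.fit`, and the log model prior is taken as `log(r) * sum(complex$oc)`.

```r
# Hypothetical estimator with the documented argument list; returns a list
# with `crit` (approximate log marginal likelihood + log prior) and `coefs`.
bic.logistic.sketch <- function(y, x, model, complex,
                                mlpost_params = list(r = exp(-0.5))) {
  fit <- suppressWarnings(
    glm.fit(x[, model, drop = FALSE], y, family = binomial())
  )
  n <- length(y)
  # Assumption: BIC-based approximation to the log marginal likelihood.
  log_ml_approx <- -(fit$deviance + log(n) * fit$rank) / 2
  # Assumption: prior penalises the total operation count of the features.
  log_prior_approx <- log(mlpost_params$r) * sum(complex$oc)
  list(crit = log_ml_approx + log_prior_approx, coefs = fit$coefficients)
}

set.seed(1)
x <- cbind(1, matrix(rnorm(200), 100, 2))   # intercept + two covariates
y <- as.integer(x[, 2] + rnorm(100) > 0)
out <- bic.logistic.sketch(y, x, c(TRUE, TRUE, FALSE), list(oc = 1))
out$crit
```

A function of this form could in principle be passed as `loglik.pi`, provided it follows whatever additional conventions the package requires.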
Function for Calculating Marginal Inclusion Probabilities of Features Given a List of Models
marginal.probs(models)
models |
The list of models to use. |
A numeric vector of marginal feature inclusion probabilities based on relative frequencies of model visits in MCMC.
result <- gmjmcmc(x = matrix(rnorm(600), 100), y = matrix(rnorm(100), 100), P = 2, transforms = c("p0", "exp_dbl"))
marginal.probs(result$models[[1]])
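The relative-frequency calculation can be sketched in base R. The list `mc_models` below is a hypothetical stand-in for a list of visited models, where each entry carries a logical `model` vector; it is not output from the package.

```r
# Hypothetical list of models visited by the chain (one entry per visit).
mc_models <- list(
  list(model = c(TRUE, FALSE, TRUE)),
  list(model = c(TRUE, TRUE, FALSE)),
  list(model = c(TRUE, FALSE, FALSE)),
  list(model = c(FALSE, FALSE, TRUE))
)

# Stack the inclusion vectors row-wise; the column means are the share of
# visits in which each feature was included.
incl <- do.call(rbind, lapply(mc_models, function(m) m$model))
marg <- colMeans(incl)
marg  # 0.75 0.25 0.50
```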
This function will weight the features based on the best marginal posterior in that population and merge the results together, simplifying by merging equivalent features (having high correlation).
merge_results( results, populations = NULL, complex.measure = NULL, tol = NULL, data = NULL )
results |
A list containing multiple results from GMJMCMC (Genetically Modified MJMCMC). |
populations |
Which populations should be merged from the results, can be "all", "last" (default) or "best". |
complex.measure |
The complex measure to use when finding the simplest equivalent feature, 1=total width, 2=operation count and 3=depth. |
tol |
The tolerance to use for the correlation when finding equivalent features, default is 0.0000001 |
data |
Data to use when comparing features, default is NULL meaning that mock data will be generated, if data is supplied it should be of the same form as is required by gmjmcmc, i.e. with both x, y and an intercept. |
An object of class "gmjmcmc_merged" containing the following elements:
features |
The features where equivalent features are represented in their simplest form. |
marg.probs |
Importance of features. |
counts |
Counts of how many versions that were present of each feature. |
results |
Results as they were passed to the function. |
pop.best |
The population in the results which contained the model with the highest log marginal posterior. |
thread.best |
The thread in the results which contained the model with the highest log marginal posterior. |
crit.best |
The highest log marginal posterior for any model in the results. |
reported |
The highest log marginal likelihood for the reported populations as defined in the populations argument. |
rep.pop |
The index of the population which contains reported. |
best.log.posteriors |
A matrix where the first column contains the population indices and the second column contains the model with the highest log marginal posterior within that population. |
rep.thread |
The index of the thread which contains reported. |
result <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc.parallel", transforms = c("sigmoid"), runs = 2, cores = 1)
summary(result)
plot(result)
merge_results(result$results.raw)
Main Algorithm for MJMCMC (Mode Jumping MCMC)
mjmcmc( x, y, N = 1000, probs = NULL, params = NULL, loglik.pi = NULL, mlpost_params = list(family = "gaussian", beta_prior = list(type = "g-prior")), intercept = TRUE, fixed = 0, sub = FALSE, verbose = TRUE )
x |
matrix containing the design matrix with data to use in the algorithm, |
y |
response variable |
N |
The number of MJMCMC iterations to run for (default 1000) |
probs |
A list of various probability vectors used by MJMCMC, generated by |
params |
A list of various parameter vectors used by MJMCMC, generated by |
loglik.pi |
A function specifying the marginal log-posterior of the model up to a constant, including the logarithm of the model prior:
|
mlpost_params |
All parameters for the estimator function loglik.pi |
intercept |
Logical. Whether to include an intercept in the design matrix. Default is |
fixed |
Integer specifying the number of leading columns in the design matrix to always include in the model. Default is 0. |
sub |
Logical. If |
verbose |
Logical. Whether to print messages during execution. Default is |
A list containing the following elements:
models |
All visited models in both mjmcmc and local optimization. |
accept |
Average acceptance rate of the chain. |
mc.models |
All models visited during mjmcmc iterations. |
best.crit |
The highest log marginal probability of the visited models. |
marg.probs |
Marginal probabilities of the features. |
model.probs |
Marginal probabilities of all of the visited models. |
model.probs.idx |
Indices of unique visited models. |
populations |
The covariates represented as a list of features. |
result <- mjmcmc(
  y = matrix(rnorm(100), 100),
  x = matrix(rnorm(600), 100),
  loglik.pi = gaussian.loglik)
summary(result)
plot(result)
Run Multiple MJMCMC Runs in Parallel, Merging the Results Before Returning.
mjmcmc.parallel(runs = 2, cores = getOption("mc.cores", 2L), ...)
runs |
The number of runs to run |
cores |
The number of cores to run on |
... |
Further parameters passed to mjmcmc. |
Merged results from multiple mjmcmc runs
result <- mjmcmc.parallel(runs = 1, cores = 1, loglik.pi = FBMS::gaussian.loglik, y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100))
summary(result)
plot(result)
Function to Generate a Function String for a Model Consisting of Features
model.string(model, features, link = "I", round = 2)
model |
A logical vector indicating which features to include |
features |
The population of features |
link |
The link function to use, as a string |
round |
Rounding error for the features in the printed format |
A character representation of a model
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
summary(result)
plot(result)
model.string(c(TRUE, FALSE, TRUE, FALSE, TRUE), result$populations[[1]])
model.string(result$models[[1]][1][[1]]$model, result$populations[[1]])
Negative GELU Function
ngelu(x)
x |
The vector of values |
-x*pnorm(-x)
ngelu(2)
Negative Heaviside Function
nhs(x)
x |
The vector of values |
as.integer(x<0)
nhs(2)
Not x
not(x)
x |
The vector of binary values |
1-x
not(TRUE)
Negative ReLU Function
nrelu(x)
x |
The vector of values |
max(-x,0)
nrelu(2)
p0 Polynomial Term
p0(x)
x |
The vector of values |
log(abs(x) + .Machine$double.eps)
p0(2)
p05 Polynomial Term
p05(x)
x |
The vector of values |
(abs(x)+.Machine$double.eps)^(0.5)
p05(2)
p0p0 Polynomial Term
p0p0(x)
x |
The vector of values |
p0(x)*p0(x)
p0p0(2)
p0p05 Polynomial Term
p0p05(x)
x |
The vector of values |
p0(x)*(abs(x)+.Machine$double.eps)^(0.5)
p0p05(2)
p0p1 Polynomial Term
p0p1(x)
x |
The vector of values |
p0(x)*x
p0p1(2)
p0p2 Polynomial Term
p0p2(x)
x |
The vector of values |
p0(x)*x^(2)
p0p2(2)
p0p3 Polynomial Term
p0p3(x)
x |
The vector of values |
p0(x)*x^(3)
p0p3(2)
p0pm05 Polynomial Term
p0pm05(x)
x |
The vector of values |
p0(x)*sign(x)*(abs(x)+.Machine$double.eps)^(-0.5)
p0pm05(2)
p0pm1 Polynomial Term
p0pm1(x)
x |
The vector of values |
p0(x)*(x+.Machine$double.eps)^(-1)
p0pm1(2)
p0pm2 Polynomial Term
p0pm2(x)
x |
The vector of values |
p0(x)*sign(x)*(abs(x)+.Machine$double.eps)^(-2)
p0pm2(2)
p2 Polynomial Term
p2(x)
x |
The vector of values |
x^(2)
p2(2)
p3 Polynomial Term
p3(x)
x |
The vector of values |
x^(3)
p3(2)
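The transformation functions in this part of the manual are simple element-wise maps, and a few of them can be reproduced in base R directly from the expressions given above. The `_sketch` names are hypothetical local copies written for illustration; they are not the package's exported functions.

```r
eps <- .Machine$double.eps

# Local re-implementations following the documented expressions.
p0_sketch  <- function(x) log(abs(x) + eps)              # log term
p05_sketch <- function(x) (abs(x) + eps)^0.5             # square-root term
pm1_sketch <- function(x) sign(x) * (abs(x) + eps)^(-1)  # signed reciprocal
hs_sketch  <- function(x) as.integer(x > 0)              # Heaviside step

x <- c(-2, 0.5, 2)
cbind(x,
      p0  = p0_sketch(x),
      p05 = p05_sketch(x),
      pm1 = pm1_sketch(x),
      hs  = hs_sketch(x))
```

Adding `.Machine$double.eps` inside the logarithm and the negative powers keeps the results finite at `x = 0`, and `sign(x)` preserves the sign for odd negative powers.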
Plots the coefficients of a BGNLM model.
## S3 method for class 'bgnlm_model'
plot(x, ...)
x |
Object of class "bgnlm_model". |
... |
Additional arguments passed to barplot. |
The input object (invisibly).
data(exoplanet)
model <- get.best.model(fbms(semimajoraxis ~ ., data = exoplanet, family = "gaussian"))
plot(model)
Plots the mean predictions and quantile intervals from an FBMS prediction object, with quantiles in varying shades of grey.
## S3 method for class 'fbms_predict'
plot(x, ...)
x |
Object of class "fbms_predict". |
... |
Additional arguments passed to plot. |
Plots the predictions and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
pred <- predict(model, exoplanet[51:60, -1], quantiles = c(0.025, 0.1, 0.5, 0.9, 0.975))
plot(pred)
Function to Plot GMJMCMC Results and Merged Results from merge_results
## S3 method for class 'gmjmcmc'
plot(x, count = "all", pop = "best", tol = 1e-07, data = NULL, ...)
x |
The results to use |
count |
The number of features to plot, defaults to all |
pop |
The population to plot, defaults to "best" |
tol |
The tolerance to use for the correlation when finding equivalent features, default is 0.0000001 |
data |
Data to merge on, important if pre-filtering was used |
... |
Not used. |
No return value, just creates a plot
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
plot(result)
Plot a gmjmcmc_merged Run
## S3 method for class 'gmjmcmc_merged'
plot(x, count = "all", pop = NULL, tol = 1e-07, data = NULL, ...)
x |
The results to use |
count |
The number of features to plot, defaults to all |
pop |
The population to plot, defaults to last |
tol |
The tolerance to use for the correlation when finding equivalent features, default is 0.0000001 |
data |
Data to merge on, important if pre-filtering was used |
... |
Not used. |
No return value, just creates a plot
result <- gmjmcmc.parallel(
  runs = 1, cores = 1,
  y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100),
  P = 2, transforms = c("p0", "exp_dbl")
)
plot(result)
Function to Plot MJMCMC Results
## S3 method for class 'mjmcmc'
plot(x, count = "all", ...)
x |
The results to use |
count |
The number of features to plot, defaults to all |
... |
Not used. |
No return value, just creates a plot
result <- mjmcmc(
  y = matrix(rnorm(100), 100),
  x = matrix(rnorm(600), 100),
  loglik.pi = gaussian.loglik)
plot(result)
Plot an mjmcmc_parallel Run
## S3 method for class 'mjmcmc_parallel'
plot(x, count = "all", ...)
x |
The results to use |
count |
The number of features to plot, defaults to all |
... |
Not used. |
No return value, just creates a plot
result <- mjmcmc.parallel(runs = 1, cores = 1, y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), loglik.pi = gaussian.loglik)
plot(result)
pm05 Polynomial Term
pm05(x)
x |
The vector of values |
(abs(x)+.Machine$double.eps)^(-0.5)
pm05(2)
pm1 Polynomial Term
pm1(x)
x |
The vector of values |
sign(x)*(abs(x)+.Machine$double.eps)^(-1)
pm1(2)
pm2 Polynomial Term
pm2(x)
x |
The vector of values |
sign(x)*(abs(x)+.Machine$double.eps)^(-2)
pm2(2)
This function generates predictions from a fitted bgnlm_model object given a new dataset.
## S3 method for class 'bgnlm_model'
predict( object, x, link = function(x) { x }, x_train = NULL, ... )
object |
A fitted |
x |
A |
link |
A link function to apply to the linear predictor.
By default, it is the identity function |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Additional arguments to pass to prediction function. |
A numeric vector of predicted values for the given data x.
These predictions are calculated as $\hat{y} = \mathrm{link}(X\beta)$,
where $X$ is the design matrix and $\beta$ are the model coefficients.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
preds <- predict(get.best.model(model), exoplanet[,-1])
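The prediction step for a single model (a link function applied to the linear predictor) can be written out in base R. The coefficient vector and new design matrix below are hypothetical values chosen for illustration; the sketch does not construct any non-linear features.

```r
# Hypothetical coefficients: intercept followed by two features.
coefs <- c(Intercept = 1.0, x1 = 0.5, x2 = -0.3)

# New observations, with a leading column of ones for the intercept.
X_new <- cbind(1, c(0, 1, 2), c(1, 0, -1))

# Linear predictor, then the link applied element-wise.
lin_pred <- unname(drop(X_new %*% coefs))
pred_identity <- lin_pred          # identity link (the documented default)
pred_logit <- plogis(lin_pred)     # inverse-logit, e.g. for a binary response
pred_identity
```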
Predict Using a GMJMCMC Result Object
## S3 method for class 'gmjmcmc'
predict( object, x, link = function(x) x, quantiles = c(0.025, 0.5, 0.975), pop = NULL, tol = 1e-07, x_train = NULL, ... )
object |
The model to use. |
x |
The new data to use for the prediction, a matrix where each row is an observation. |
link |
The link function to use |
quantiles |
The quantiles to calculate credible intervals for the posterior modes (in model space). |
pop |
The population to use for prediction, defaults to last |
tol |
The tolerance to use for the correlation when finding equivalent features, default is 0.0000001 |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Not used. |
A list containing aggregated predictions and per model predictions.
aggr |
Aggregated predictions with mean and quantiles. |
preds |
A list of lists containing individual predictions per model per population in object. |
result <- gmjmcmc(
  x = matrix(rnorm(600), 100),
  y = matrix(rnorm(100), 100),
  P = 2,
  transforms = c("p0", "exp_dbl")
)
preds <- predict(result, matrix(rnorm(600), 100))
Predict Using a Merged GMJMCMC Result Object
## S3 method for class 'gmjmcmc_merged'
predict( object, x, link = function(x) x, quantiles = c(0.025, 0.5, 0.975), pop = NULL, tol = 1e-07, x_train = NULL, ... )
object |
The model to use. |
x |
The new data to use for the prediction, a matrix where each row is an observation. |
link |
The link function to use |
quantiles |
The quantiles to calculate credible intervals for the posterior modes (in model space). |
pop |
The population to use for prediction, defaults to last |
tol |
The tolerance to use for the correlation when finding equivalent features, default is 0.0000001 |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Not used. |
A list containing aggregated predictions and per model predictions.
aggr |
Aggregated predictions with mean and quantiles. |
preds |
A list of lists containing individual predictions per model per population in object. |
result <- gmjmcmc.parallel(
  runs = 1, cores = 1,
  x = matrix(rnorm(600), 100), y = matrix(rnorm(100), 100),
  P = 2, transforms = c("p0", "exp_dbl")
)
preds <- predict(result, matrix(rnorm(600), 100))
Predict Using a GMJMCMC Result Object from a Parallel Run
## S3 method for class 'gmjmcmc_parallel'
predict( object, x, link = function(x) x, quantiles = c(0.025, 0.5, 0.975), x_train = NULL, ... )
object |
The model to use. |
x |
The new data to use for the prediction, a matrix where each row is an observation. |
link |
The link function to use |
quantiles |
The quantiles to calculate credible intervals for the posterior modes (in model space). |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Additional arguments to pass to merge_results. |
A list containing aggregated predictions and per model predictions.
aggr |
Aggregated predictions with mean and quantiles. |
preds |
A list of lists containing individual predictions per model per population in object. |
result <- gmjmcmc.parallel(
  runs = 1, cores = 1,
  x = matrix(rnorm(600), 100), y = matrix(rnorm(100), 100),
  P = 2, transforms = c("p0", "exp_dbl")
)
preds <- predict(result, matrix(rnorm(600), 100))
Predict Using an MJMCMC Result Object
## S3 method for class 'mjmcmc'
predict( object, x, link = function(x) x, quantiles = c(0.025, 0.5, 0.975), x_train = NULL, ... )
object |
The model to use. |
x |
The new data to use for the prediction, a matrix where each row is an observation. |
link |
The link function to use |
quantiles |
The quantiles to calculate credible intervals for the posterior modes (in model space). |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Not used. |
A list containing aggregated predictions.
mean |
Mean of aggregated predictions. |
quantiles |
Quantiles of aggregated predictions. |
result <- mjmcmc(
  x = matrix(rnorm(600), 100),
  y = matrix(rnorm(100), 100),
  loglik.pi = gaussian.loglik)
preds <- predict(result, matrix(rnorm(600), 100))
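One way to picture the aggregated output (a mean and quantiles per observation) is as a weighted summary over per-model predictions. The sketch below is an illustration under assumptions: the per-model predictions and weights are hypothetical, and the weighted-quantile rule used here (smallest value whose cumulative weight reaches the target level) may differ from what the package does internally.

```r
# Hypothetical per-model predictions: rows = models, columns = observations.
preds <- rbind(c(1.0, 2.0),
               c(1.2, 1.8),
               c(0.8, 2.6))
w <- c(0.5, 0.3, 0.2)  # hypothetical posterior model weights, summing to 1

# Model-averaged mean: each row is scaled by its weight, then columns summed.
pred_mean <- colSums(preds * w)

# Weighted quantile: smallest value whose cumulative weight reaches p.
weighted_quantile <- function(v, w, probs) {
  o <- order(v)
  cw <- cumsum(w[o])
  vapply(probs, function(p) v[o][which(cw >= p)[1]], numeric(1))
}
pred_q <- apply(preds, 2, weighted_quantile,
                w = w, probs = c(0.025, 0.5, 0.975))
pred_mean
pred_q
```

Here `pred_q` has one row per requested quantile and one column per observation.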
Predict Using an MJMCMC Result Object from a Parallel Run
## S3 method for class 'mjmcmc_parallel'
predict( object, x, link = function(x) x, quantiles = c(0.025, 0.5, 0.975), x_train = NULL, ... )
object |
The model to use. |
x |
The new data to use for the prediction, a matrix where each row is an observation. |
link |
The link function to use |
quantiles |
The quantiles to calculate credible intervals for the posterior modes (in model space). |
x_train |
Training design matrix to be provided when imputations are to be made from them |
... |
Not used. |
A list containing aggregated predictions.
mean |
Mean of aggregated predictions. |
quantiles |
Quantiles of aggregated predictions. |
result <- mjmcmc.parallel(runs = 1, cores = 1, x = matrix(rnorm(600), 100), y = matrix(rnorm(100), 100), loglik.pi = gaussian.loglik)
preds <- predict(result, matrix(rnorm(600), 100))
Dispatches to methods for extracting mean predictions from objects.
predmean(object, ...)
object |
An object. |
... |
Additional arguments passed to methods. |
Posterior mean predictions (format depends on the object class).
Extracts the mean predictions from an FBMS prediction object.
## S3 method for class 'fbms_predict'
predmean(object, ...)
object |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
Vector of mean predictions.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
pred <- predict(model, exoplanet[51:60, -1])
predmean(pred)
Dispatches to methods for extracting quantile predictions from objects.
predquantiles(object, ...)
object |
An object. |
... |
Additional arguments passed to methods. |
Quantile predictions (format depends on the object class).
Extracts the quantile predictions from an FBMS prediction object.
## S3 method for class 'fbms_predict'
predquantiles(object, ...)
object |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
Matrix of quantile predictions, or NULL if not available.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
pred <- predict(model, exoplanet[51:60, -1])
predquantiles(pred)
Displays the coefficients of a BGNLM model object.
## S3 method for class 'bgnlm_model'
print(x, ...)
x |
Object of class "bgnlm_model". |
... |
Additional arguments (ignored). |
Prints a summary of the model and returns NULL.
data(exoplanet)
model <- get.best.model(fbms(semimajoraxis ~ ., data = exoplanet, family = "gaussian"))
print(model)
model <- get.mpm.model(fbms(semimajoraxis ~ ., data = exoplanet, family = "gaussian"), y = exoplanet[, 1], x = exoplanet[, -1])
print(model)
Displays a summary of an FBMS prediction object, including mean predictions and quantiles.
## S3 method for class 'fbms_predict'
print(x, ...)
x |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
Prints a summary and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
pred <- predict(model, exoplanet[51:60, -1])
print(pred)
Print Method for "feature" Class
## S3 method for class 'feature'
print(x, dataset = FALSE, fixed = 0, alphas = FALSE, labels = FALSE, round = FALSE, ...)
x |
An object of class "feature" |
dataset |
Set the regular covariates as columns in a dataset |
fixed |
How many of the first columns in dataset are fixed and do not contribute to variable selection |
alphas |
Print a "?" instead of actual alphas to prepare the output for alpha estimation |
labels |
Should the covariates be named, or just referred to as their place in the data.frame. |
round |
Should numbers be rounded when printing? Default is FALSE, otherwise it can be set to the number of decimal places. |
... |
Not used. |
String representation of a feature
result <- gmjmcmc(x = matrix(rnorm(600), 100), y = matrix(rnorm(100), 100), P = 2, transforms = c("p0", "exp_dbl"))
print(result$populations[[1]][1])
Displays a concise summary of a GMJMCMC model object.
## S3 method for class 'gmjmcmc'
print(x, ...)
x |
Object of class "gmjmcmc". |
... |
Additional arguments passed to summary method. |
Prints a summary of the model and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc", transforms = c("sigmoid"))
print(model)
Displays a concise summary of a GMJMCMC merged model object.
## S3 method for class 'gmjmcmc_merged'
print(x, ...)
x |
Object of class "gmjmcmc_merged". |
... |
Additional arguments passed to summary method. |
Prints a summary of the model and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc.parallel", cores = 1, runs = 2, transforms = c("sigmoid"))
print(model)
Displays a concise summary of an MJMCMC model object.
## S3 method for class 'mjmcmc'
print(x, ...)
x |
Object of class "mjmcmc". |
... |
Additional arguments passed to summary method. |
Prints a summary of the model and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
print(model)
Displays a concise summary of an MJMCMC parallel model object.
## S3 method for class 'mjmcmc_parallel'
print(x, ...)
x |
Object of class "mjmcmc_parallel". |
... |
Additional arguments passed to summary method. |
Prints a summary of the model and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc.parallel", cores = 1, runs = 2)
print(model)
ReLU Function
relu(x)
x |
The vector of values |
max(x,0)
relu(2)
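For reference, the rectified linear unit maps each input to the larger of that input and zero. The sketch below is an illustration in base R, not the package's implementation: the name `relu_sketch` is hypothetical, and it assumes element-wise behaviour on vectors, which is why it uses `pmax` rather than `max`.

```r
# Element-wise ReLU: negative entries become 0, others are unchanged.
relu_sketch <- function(x) pmax(x, 0)

relu_sketch(c(-2, 0, 3))
```

Note that `max(x, 0)` collapses a vector to a single number, so the two forms agree only for scalar input.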
Computes residuals as the difference between observed and predicted values.
## S3 method for class 'bgnlm_model'
residuals(object, y, x, ...)
object |
Object of class "bgnlm_model". |
y |
Response. |
x |
Covariates. |
... |
Additional arguments (ignored). |
Vector of residuals.
library(FBMS)
data(exoplanet)
model <- get.best.model(fbms(semimajoraxis ~ ., data = exoplanet, family = "gaussian"))
hist(residuals(model, exoplanet[, 1], exoplanet[, -1]))
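The residual definition used by these methods (observed minus predicted) can be illustrated without FBMS. The sketch below uses `lm` from base R purely as a stand-in model, with simulated data; it is an assumption-laden illustration of the arithmetic, not of how the package computes its predictions.

```r
set.seed(1)

# Simulated covariate and response (illustrative data only).
x <- rnorm(20)
y <- 2 * x + rnorm(20)

# Stand-in model: ordinary least squares instead of an FBMS model.
fit <- lm(y ~ x)

# Residuals as observed minus predicted values.
y_hat <- predict(fit, newdata = data.frame(x = x))
res <- y - y_hat

hist(res)
```

For this stand-in model the result matches `residuals(fit)` up to floating point error.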
Computes residuals as the difference between observed and predicted values.
## S3 method for class 'gmjmcmc'
residuals(object, y, x, ...)
object |
Object of class "gmjmcmc". |
y |
Response. |
x |
Covariates. |
... |
Additional arguments (ignored). |
Vector of residuals.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc", transforms = c("sigmoid"))
hist(residuals(model, exoplanet[, 1], exoplanet[, -1]))
Computes residuals as the difference between observed and predicted values.
## S3 method for class 'gmjmcmc_merged'
residuals(object, y, x, ...)
object |
Object of class "gmjmcmc_merged". |
y |
Response. |
x |
Covariates. |
... |
Additional arguments (ignored). |
Vector of residuals.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "gmjmcmc.parallel", transforms = c("sigmoid"), runs = 2, cores = 1)
hist(residuals(model, exoplanet[, 1], exoplanet[, -1]))
Computes residuals as the difference between observed and predicted values.
## S3 method for class 'mjmcmc'
residuals(object, y, x, ...)
object |
Object of class "mjmcmc". |
y |
Response. |
x |
Covariates. |
... |
Additional arguments (ignored). |
Vector of residuals.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc")
hist(residuals(model, exoplanet[, 1], exoplanet[, -1]))
Computes residuals as the difference between observed and predicted values.
## S3 method for class 'mjmcmc_parallel'
residuals(object, y, x, ...)
object |
Object of class "mjmcmc_parallel". |
y |
Response. |
x |
Covariates. |
... |
Additional arguments (ignored). |
Vector of residuals.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet, method = "mjmcmc.parallel", runs = 2, cores = 1)
hist(residuals(model, exoplanet[, 1], exoplanet[, -1]))
This function applies a function in parallel to a list or vector (X) using multiple cores.
On Linux/macOS, it uses mclapply, while on Windows it uses a hackish version of parallelism.
The Windows version is based on parLapply to mimic forking following Nathan VanHoudnos.
rmclapply(runs, args, fun, mc.cores = NULL)
runs |
The runs to run |
args |
The arguments to pass to fun |
fun |
The function to run |
mc.cores |
Number of cores to use for parallel processing. Defaults to |
A list of results, with one element for each element of X.
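The platform split described for this function (forking where available, a socket cluster on Windows) can be sketched with the base `parallel` package. This is a simplified illustration, not the package's code: `parallel_apply_sketch` and its arguments are hypothetical, and it does not reproduce how `rmclapply` handles `runs` and `args`.

```r
library(parallel)

# Apply `fun` to each element of `X`, choosing the backend by platform.
parallel_apply_sketch <- function(X, fun, cores = 2L) {
  if (.Platform$OS.type == "windows") {
    # No forking on Windows: start a socket cluster and stop it on exit.
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl))
    parLapply(cl, X, fun)
  } else {
    # Linux/macOS: fork worker processes.
    mclapply(X, fun, mc.cores = cores)
  }
}

squares <- parallel_apply_sketch(1:4, function(i) i^2)
```

Either branch returns a list with one element per element of `X`, in the same order as `lapply` would.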
A 210 times 3221 matrix with individuals along the rows and expression data along the columns
data(SangerData2)
A data frame with 210 rows and 3221 variables
The first column corresponds to column number 24266 (with name GI_6005726-S) in the original data. Column names give the names of the variables, and row names give the "name" of the individuals. This is a subset of SangerData where the 3220 last rows are selected among all original rows following the same pre-processing procedure as in (section 1.6.1). See also the file Read_sanger_data.R.
Dataset downloaded from https://ftp.sanger.ac.uk/pub/genevar/
References:
Stranger, BE et al (2007): Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science, 2007.
Bogdan et al (2020): Handbook of Multiple Comparisons, https://arxiv.org/pdf/2011.12154
This is also done when running the algorithm, but this function allows for it to be done manually.
set.transforms(transforms)
transforms |
The vector of non-linear transformations |
No return value, just sets the gmjmcmc-transformations option
set.transforms(c("p0", "p1"))
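Storing a vector of transformation names in a session-wide option can be sketched with base R's `options()` and `getOption()`. Whether the package uses exactly this mechanism is an assumption, and the option name `demo-transformations` below is hypothetical so that the sketch does not touch the package's own setting.

```r
# Set a hypothetical option and keep the previous value for restoring later.
old <- options("demo-transformations" = c("p0", "p1"))

# Read the option back, as code elsewhere in the session could.
current <- getOption("demo-transformations")
print(current)

# Restore the previous state of the option.
options(old)
```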
Sigmoid Function
sigmoid(x)
x |
The vector of values |
The sigmoid of x
sigmoid(2)
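As a point of reference, the standard logistic sigmoid is 1 / (1 + exp(-x)). The sketch below writes it out in base R under the assumption that this is the form intended here; `sigmoid_sketch` is a hypothetical name, not the package function.

```r
# Logistic sigmoid, applied element-wise.
sigmoid_sketch <- function(x) 1 / (1 + exp(-x))

sigmoid_sketch(c(-2, 0, 2))
```

Base R's `plogis` computes the same quantity and can serve as a cross-check.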
Sine Function for Degrees
sin_deg(x)
x |
The vector of values in degrees |
The sine of x
sin_deg(0)
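Base R's `sin` expects radians, so a sine taking degrees has to convert first. A minimal sketch of that conversion, with the hypothetical name `sin_deg_sketch` (not the package function):

```r
# Convert degrees to radians, then take the sine.
sin_deg_sketch <- function(x) sin(x * pi / 180)

sin_deg_sketch(c(0, 30, 90))
```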
Square Root Function
sqroot(x)
x |
The vector of values |
The square root of the absolute value of x
sqroot(4)
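Taking the absolute value first keeps the square root defined for negative inputs. A minimal base R sketch of the stated return value, with the hypothetical name `sqroot_sketch` (not the package function):

```r
# Square root of |x|, so negative inputs do not produce NaN.
sqroot_sketch <- function(x) sqrt(abs(x))

sqroot_sketch(c(4, -9))
```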
Function to Get a Character Representation of a List of Features
string.population(x, round = 2)
x |
A list of feature objects |
round |
Rounding precision for parameters of the features |
A matrix of character representations of the features of a model.
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
string.population(result$populations[[1]])
Function to Get a Character Representation of a List of Models
string.population.models(features, models, round = 2, link = "I")
features |
A list of feature objects on which the models are built |
models |
A list of model objects |
round |
Rounding precision for parameters of the features |
link |
The link function to use, as a string |
A matrix of character representations of a list of models.
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
string.population.models(result$populations[[2]], result$models[[2]])
Provides a detailed summary of an FBMS prediction object, including prediction ranges.
## S3 method for class 'fbms_predict'
summary(object, ...)
object |
Object of class "fbms_predict". |
... |
Additional arguments (ignored). |
Prints a summary and returns NULL.
data(exoplanet)
model <- fbms(semimajoraxis ~ ., data = exoplanet)
pred <- predict(model, exoplanet[51:60, -1])
summary(pred)
Function to Print a Quick Summary of the Results
## S3 method for class 'gmjmcmc'
summary(object, pop = "best", tol = 1e-04, labels = FALSE, effects = NULL, data = NULL, verbose = TRUE, ...)
object |
The results to use |
pop |
The population to print for, defaults to "best", other options are "last" and "all" |
tol |
The tolerance to use as a threshold when reporting the results. |
labels |
Should the covariates be named, or just referred to as their place in the data.frame. |
effects |
Quantiles for posterior modes of the effects across models to be reported. If either effects or labels is NULL, no effects are reported. |
data |
Data to merge on, important if pre-filtering was used |
verbose |
If the summary should be printed to the console or just returned, defaults to TRUE |
... |
Not used. |
A data frame containing the following columns:
feats.strings |
Character representation of the features ordered by marginal probabilities. |
marg.probs |
Marginal probabilities corresponding to the ordered feature strings. |
result <- gmjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
summary(result, pop = "best")
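The shape of the returned data frame, and the role of tol as a reporting threshold, can be illustrated with made-up numbers. The sketch below is an assumption about the general idea only: the feature strings and probabilities are invented, and the strict inequality and decreasing order are illustrative choices rather than confirmed package behaviour.

```r
# Invented feature strings and marginal probabilities (illustration only).
feats <- c("x1", "sigmoid(x2)", "x3", "x4")
probs <- c(0.62, 0.98, 0.00005, 0.10)
tol <- 1e-04

# Keep features whose marginal probability exceeds the tolerance.
keep <- probs > tol

# Order the retained features by decreasing marginal probability.
ord <- order(probs[keep], decreasing = TRUE)
out <- data.frame(
  feats.strings = feats[keep][ord],
  marg.probs = probs[keep][ord]
)
print(out)
```

Here the feature with probability below tol is dropped, leaving three rows.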
Function to Print a Quick Summary of the Results
## S3 method for class 'gmjmcmc_merged'
summary(object, tol = 1e-04, labels = FALSE, effects = NULL, pop = NULL, data = NULL, verbose = TRUE, ...)
object |
The results to use |
tol |
The tolerance to use as a threshold when reporting the results. |
labels |
Should the covariates be named, or just referred to as their place in the data.frame. |
effects |
Quantiles for posterior modes of the effects across models to be reported. If either effects or labels is NULL, no effects are reported. |
pop |
If NULL, the same as in merge.options used when running parallel gmjmcmc; otherwise results will be re-merged according to pop, which can be "all", "last", or "best". |
data |
Data to merge on, important if pre-filtering was used |
verbose |
If the summary should be printed to the console or just returned, defaults to TRUE |
... |
Not used. |
A data frame containing the following columns:
feats.strings |
Character representation of the features ordered by marginal probabilities. |
marg.probs |
Marginal probabilities corresponding to the ordered feature strings. |
result <- gmjmcmc.parallel(runs = 1, cores = 1, y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), P = 2, transforms = c("p0", "exp_dbl"))
summary(result)
Function to Print a Quick Summary of the Results
## S3 method for class 'mjmcmc'
summary(object, tol = 1e-04, labels = FALSE, effects = NULL, verbose = TRUE, ...)
object |
The results to use |
tol |
The tolerance to use as a threshold when reporting the results. |
labels |
Should the covariates be named, or just referred to as their place in the data.frame. |
effects |
Quantiles for posterior modes of the effects across models to be reported. If either effects or labels is NULL, no effects are reported. |
verbose |
If the summary should be printed to the console or just returned, defaults to TRUE |
... |
Not used. |
A data frame containing the following columns:
feats.strings |
Character representation of the covariates ordered by marginal probabilities. |
marg.probs |
Marginal probabilities corresponding to the ordered feature strings. |
result <- mjmcmc(y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), loglik.pi = gaussian.loglik)
summary(result)
Function to Print a Quick Summary of the Results
## S3 method for class 'mjmcmc_parallel'
summary(object, tol = 1e-04, labels = FALSE, effects = NULL, verbose = TRUE, ...)
object |
The results to use |
tol |
The tolerance to use as a threshold when reporting the results. |
labels |
Should the covariates be named, or just referred to as their place in the data.frame. |
effects |
Quantiles for posterior modes of the effects across models to be reported. If either effects or labels is NULL, no effects are reported. |
verbose |
If the summary should be printed to the console or just returned, defaults to TRUE |
... |
Not used. |
A data frame containing the following columns:
feats.strings |
Character representation of the covariates ordered by marginal probabilities. |
marg.probs |
Marginal probabilities corresponding to the ordered feature strings. |
result <- mjmcmc.parallel(runs = 1, cores = 1, y = matrix(rnorm(100), 100), x = matrix(rnorm(600), 100), loglik.pi = gaussian.loglik)
summary(result)
Cube Root Function
troot(x)
x |
The vector of values |
The cube root of x
troot(27)
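In base R, `x^(1/3)` returns NaN for negative x, so a real-valued cube root needs the sign handled separately. The sketch below shows one way to do that; `troot_sketch` is a hypothetical name, and how the package's `troot` treats negative inputs is not stated here, so this should not be read as its implementation.

```r
# Real cube root: take the root of |x| and restore the sign.
troot_sketch <- function(x) sign(x) * abs(x)^(1 / 3)

troot_sketch(c(27, -8))
```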