| Title: | Identify differential selection |
|---|---|
| Description: | This package implements a statistical method to detect differential selection of somatic mutations under different environments |
| Authors: | Siming Zhao [aut, cre], Jie Zhou [aut], Qirui Zhang [aut] |
| Maintainer: | Siming Zhao <[email protected]> |
| License: | MIT |
| Version: | 0.1.7 |
| Built: | 2026-06-27 07:24:02 UTC |
| Source: | https://github.com/szhaolab/diffdriver |
Adds an intercept. If functypecode is present, it is coded and moved to the front. Unlike driverMAPS, only functypecode values 7 or 8 are allowed when functypecode is included in selectvars.
ddmcode(matrixlist, selectvars, functypecodelevel = NULL)
This model is applied to data from a single gene. It infers effect sizes for both sample-level variables and position-level functional annotations. Parameters are inferred with an EM algorithm.
ddmodel(mut, e, mr, fe, label, ...)
mut |
a matrix of mutation status (0 or 1); rows are positions, columns are samples. |
e |
a vector, phenotype of each sample,
should match the columns of |
mr |
a matrix, mutation rate of each sample at each mutation (log scale) that does not depend on sample-level factors |
fe |
a vector, increased mutation rate at each position, depending on e (log scale),
should match the rows of |
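The argument descriptions above imply some shape constraints between the inputs. The following is a minimal sketch of toy inputs, not output from the package. It assumes that the truncated "should match the columns of" and "should match the rows of" phrases refer to `mut`, and that `mr` has the same dimensions as `mut`. All values are made up, and the `label` argument is left out because its meaning is not described here.

```r
# Toy inputs with the shapes described above (all values are made up).
set.seed(1)
n_pos  <- 5   # positions in the gene (rows of mut)
n_samp <- 8   # samples (columns of mut)

# mut: 0/1 mutation status, rows are positions, columns are samples
mut <- matrix(rbinom(n_pos * n_samp, size = 1, prob = 0.1),
              nrow = n_pos, ncol = n_samp)

# e: one phenotype value per sample (assumed: one per column of mut)
e <- rnorm(n_samp)

# mr: log-scale mutation rate for each position in each sample
# (assumed to have the same dimensions as mut)
mr <- matrix(log(1e-4), nrow = n_pos, ncol = n_samp)

# fe: log-scale increase in mutation rate (assumed: one per row of mut)
fe <- rep(log(2), n_pos)

# Shape checks implied by the argument descriptions
stopifnot(length(e) == ncol(mut),
          all(dim(mr) == dim(mut)),
          length(fe) == nrow(mut))
```

Checking shapes like this before fitting can catch transposed matrices early, since positions and samples are easy to swap.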
This function uses the same model as cmodel.frac but generalizes it to more than one functional category. The model is applied to data from a single gene.
ddmodel_binary(mut, e, bmr, fe)
mut |
a matrix of mutation status 0 or 1 |
e |
a vector, phenotype of each sample,
should match the columns of |
bmr |
a matrix, background mutation rate of each sample at each mutation (log scale) |
fe |
a vector, increased mutation rate at each mutation, due to functional effect (log scale),
should match the rows of |
This function uses the same model as cmodel.frac but generalizes it to more than one functional category. The model is applied to data from a single gene. It should give the same results as ddmodel_binary above.
ddmodel_binary_simple(mut, e, bmr, fe)
mut |
a matrix of mutation status 0 or 1 |
e |
a vector, phenotype of each sample,
should match the columns of |
bmr |
a matrix, background mutation rate of each sample at each mutation (log scale). Because bmr is assumed to be the same across samples, only the first column is used. |
fe |
a vector, increased mutation rate at each mutation, due to functional effect (log scale),
should match the rows of |
This function runs diffDriver.
diffdriver( gene, mut, pheno, anno_dir = ".", k = 6, totalnttype = 96, BMRmode = c("signature", "regular"), output_dir = ".", output_prefix = "diffdriver_results" )
gene |
A vector of genes to be included in the analysis. |
mut |
A data frame containing all somatic mutations from the cohort. The format is:
Example (columns Chromosome, Position, Ref, Alt, SampleID): 19, 55653236, C, T, TCGA-N6-A4VE-01A-11D-A28R-08 |
pheno |
A data frame containing sample phenotypes. The format is:
Example (columns SampleID, SmokingCessation, BMI): TCGA-N5-A4R8-01A-11D-A28R-08, 0.5319630, 20.0; TCGA-N5-A4RD-01A-11D-A28R-08, 0.0448991, 24.4 |
anno_dir |
The path to the directory containing all the annotation files, which can be downloaded from Zenodo. The default is the current directory. |
k |
The number of topics used in modeling background mutation rate. The default is 6. |
totalnttype |
Either 9 or 96. The function looks for annotation files anno9_ntypexxx_annodata.txt when totalnttype is 9, or anno96_ntypexxx_annodata when totalnttype is 96. |
BMRmode |
The mode used to run diffdriver. "signature" (the default) models individual-level BMR. "regular" assumes BMR is the same across individuals and models only position-level differences. |
output_dir |
The path to the output directory. |
output_prefix |
The prefix added to the output file names. |
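As a rough illustration of the `mut` and `pheno` inputs, the sketch below builds two data frames from the example rows given in the argument descriptions. The column types (for instance, whether Chromosome is stored as character or numeric) are assumptions, and the gene name and annotation path in the commented call are hypothetical placeholders. The call itself is left commented out because it needs the annotation files.

```r
# Mutation table, using the example row from the argument description.
# Storing Chromosome as character is an assumption.
mut <- data.frame(
  Chromosome = "19",
  Position   = 55653236,
  Ref        = "C",
  Alt        = "T",
  SampleID   = "TCGA-N6-A4VE-01A-11D-A28R-08",
  stringsAsFactors = FALSE
)

# Phenotype table, using the two example rows from the description.
pheno <- data.frame(
  SampleID = c("TCGA-N5-A4R8-01A-11D-A28R-08",
               "TCGA-N5-A4RD-01A-11D-A28R-08"),
  SmokingCessation = c(0.5319630, 0.0448991),
  BMI = c(20.0, 24.4),
  stringsAsFactors = FALSE
)

# Not run: requires the package and the annotation files.
# "GENE1" and "path/to/annotations" are hypothetical placeholders.
# diffdriver(gene = "GENE1", mut = mut, pheno = pheno,
#            anno_dir = "path/to/annotations", k = 6, totalnttype = 96,
#            BMRmode = "signature", output_dir = ".",
#            output_prefix = "diffdriver_results")
```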
Gene-level binomial test.
genebinom(mut, e)
Gene-level Fisher's exact test.
genefisher(mut, e)
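To make the idea of a gene-level Fisher's exact test concrete, here is a minimal base-R sketch. It is not the implementation of `genefisher`. It assumes that `e` is a binary phenotype and that a sample counts as mutated if it carries at least one mutation anywhere in the gene. The toy data are made up.

```r
# Toy data: 3 positions (rows) by 10 samples (columns), values made up.
mut <- matrix(0L, nrow = 3, ncol = 10)
mut[1, 1] <- 1L
mut[2, 6] <- 1L
mut[1, 7] <- 1L
mut[3, 8] <- 1L
mut[2, 9] <- 1L
mut[3, 9] <- 1L   # sample 9 carries two mutations

# Binary phenotype (assumption): first five samples 0, last five 1
e <- rep(c(0, 1), each = 5)

# Collapse to gene level: does the sample carry any mutation in the gene?
carrier <- as.integer(colSums(mut) > 0)

# 2 x 2 table of carrier status by phenotype group
tab <- table(carrier = factor(carrier, levels = c(0, 1)),
             e       = factor(e, levels = c(0, 1)))

ft <- fisher.test(tab)
ft$p.value
```

Collapsing to one value per sample discards positional information, which is the part the position-level models above are meant to use.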
Gene-level logistic regression.
genelr(mut, e, covariates = rep(1, length(e)))
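A gene-level logistic regression can be sketched in base R with `glm`. The example below is an illustration under stated assumptions, not the code behind `genelr`: it assumes the response is the per-sample gene-level mutation status and that `e` enters as a predictor alongside a covariate. The covariate name `batch` and all values are hypothetical.

```r
# Toy data for 20 samples (all values made up).
e <- seq(-1.9, 1.9, length.out = 20)     # phenotype, one value per sample
carrier <- c(0, 0, 0, 1, 0, 0, 0, 0, 1, 0,
             1, 0, 1, 1, 0, 1, 1, 0, 1, 1)  # gene-level mutation status
batch <- rep(c(0, 1), times = 10)        # hypothetical covariate

# Logistic regression of mutation status on phenotype and covariate
fit <- glm(carrier ~ e + batch, family = binomial())

# Wald p-value for the phenotype effect
p_value <- summary(fit)$coefficients["e", "Pr(>|z|)"]
p_value
```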
Gene-level multiple linear regression.
mlr(mut, e, covariates = rep(1, length(e)))
Gene-level multiple linear regression, correcting for the total number of mutations.
mlr.v2(mut, e, nmut, covariates = 1)
Like optim, but with fixed parameters.
optifix( par, fixed, fn, gr = NULL, ..., method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN"), lower = -Inf, upper = Inf, control = list(), hessian = FALSE )
Specify a second argument 'fixed', a vector of TRUE/FALSE values. If TRUE, the corresponding parameter in fn() is held fixed. Otherwise it is variable and optimised over.
The return value is the return value from optim() with two additional elements: a vector of all the parameters and a copy of the 'fixed' argument.
Written by Barry Rowlingson <[email protected]> October 2011
This file released under a CC By-SA license: http://creativecommons.org/licenses/by-sa/3.0/
and must retain the text: "Originally written by Barry Rowlingson" in comments.
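The idea of optimising over only some parameters can be sketched with a thin wrapper around `optim`. This is an independent minimal illustration, not the optifix source. The function name `optim_fixed_sketch` and the element names `fullpars` and `fixed` on the result are choices made for this example, and gradient, bounds and Hessian handling are omitted.

```r
# Minimal sketch: optimise fn over the parameters where fixed is FALSE,
# holding the others at their starting values in par.
optim_fixed_sketch <- function(par, fixed, fn, ...) {
  stopifnot(is.logical(fixed), length(par) == length(fixed))

  # Objective seen by optim(): takes only the free parameters and
  # re-inserts them into the full parameter vector before calling fn.
  free_fn <- function(free_par) {
    p <- par
    p[!fixed] <- free_par
    fn(p)
  }

  res <- optim(par[!fixed], free_fn, ...)

  # Add the full parameter vector and a copy of 'fixed' to the result.
  fullpars <- par
  fullpars[!fixed] <- res$par
  res$fullpars <- fullpars
  res$fixed <- fixed
  res
}

# Usage: quadratic with its unconstrained minimum at (1, 2, 3);
# the second parameter is held fixed at 0.
fn <- function(p) sum((p - c(1, 2, 3))^2)
res <- optim_fixed_sketch(c(0, 0, 0), fixed = c(FALSE, TRUE, FALSE),
                          fn = fn, method = "BFGS")
res$fullpars
```

Wrapping the objective keeps `fn` unchanged, so the same function can be reused with different sets of fixed parameters.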
Plots phenotype, mutation and annotation for a gene across samples.
plot_mut( gene_name, mut, pheno, totalnttype = 96, anno_dir = ".", output_prefix = "plot", output_dir = "." )