
fit_powerlaw_tail_optim() uses stats::optim() to find the A and alpha values that maximize the SFS area lying under the power-law curve and minimize the negative error, that is, the area where the curve lies above the real SFS. The sampled region of the SFS and the range of f values below the maximum SFS value do not count towards the rewarded area, and the sampled area does not count towards the negative error. The penalty for the negative error depends on the number of points with a negative error value: it is the sum of the error values raised to the power of x, where x is the length of the vector of negative error values. This allows the power-law curve to detach from the SFS for one or two bins, but beyond that the penalty rises steeply.
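The penalty rule described above can be illustrated with a minimal sketch. It is not the cevomod implementation: `powerlaw_penalty()` is a hypothetical helper, the curve is assumed to have the form A / f^alpha, and "the sum of error values to the power of x" is read here as (sum of errors)^x, which is only one possible interpretation.

```r
# Hypothetical helper, NOT part of cevomod: a sketch of the penalty rule only.
# Assumption: the power-law curve is A / f^alpha.
# Assumption: the penalty is (sum of negative errors) ^ (number of such bins).
powerlaw_penalty <- function(f, y, A, alpha) {
  curve <- A / f^alpha
  excess <- curve - y                # > 0 where the curve lies above the SFS
  neg_err <- excess[excess > 0]      # the "negative error" bins
  if (length(neg_err) == 0) {
    return(0)
  }
  sum(neg_err)^length(neg_err)
}

f <- c(0.125, 0.25, 0.5)             # toy bin positions (made-up values)

# With A = 1 and alpha = 2 the curve is 64, 16, 4.
powerlaw_penalty(f, y = c(70, 14, 5), A = 1, alpha = 2)  # one bin above:    2
powerlaw_penalty(f, y = c(70, 14, 3), A = 1, alpha = 2)  # two bins above:   9
powerlaw_penalty(f, y = c(60, 14, 3), A = 1, alpha = 2)  # three bins above: 343
```

Under these assumptions the exponent grows with every additional bin in which the curve overshoots the SFS, so a fit that detaches in one or two bins is penalized mildly and a fit that detaches in many bins is penalized heavily.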

Usage

fit_powerlaw_tail_optim(object, ...)

# S3 method for cevodata
fit_powerlaw_tail_optim(
  object,
  name = "powerlaw_optim",
  bootstraps = FALSE,
  allowed_zero_bins = 2,
  y_treshold = 1,
  y_threshold_pct = 0.01,
  av_filter = c(1/3, 1/3, 1/3),
  peak_detection_upper_limit = 0.3,
  reward_upper_limit = 0.4,
  control = list(maxit = 1000, ndeps = c(0.1, 0.01)),
  verbose = get_cevomod_verbosity(),
  ...
)

# S3 method for cevo_SFS_bootstraps
fit_powerlaw_tail_optim(
  object,
  name = "powerlaw_optim",
  allowed_zero_bins = 2,
  y_treshold = 1,
  y_threshold_pct = 0.01,
  av_filter = c(1/3, 1/3, 1/3),
  peak_detection_upper_limit = 0.3,
  reward_upper_limit = 0.4,
  control = list(maxit = 1000, ndeps = c(0.1, 0.01)),
  verbose = get_cevomod_verbosity(),
  ...
)

# S3 method for cevo_SFS_tbl
fit_powerlaw_tail_optim(
  object,
  name = "powerlaw_optim",
  allowed_zero_bins = 2,
  y_treshold = 1,
  y_threshold_pct = 0.01,
  av_filter = c(1/3, 1/3, 1/3),
  peak_detection_upper_limit = 0.3,
  reward_upper_limit = 0.4,
  control = list(maxit = 1000, ndeps = c(0.1, 0.01)),
  verbose = get_cevomod_verbosity(),
  ...
)

Arguments

object

cevodata object; methods are also provided for cevo_SFS_bootstraps and cevo_SFS_tbl objects

...

other arguments passed to stats::optim()

name

name under which the fitted models are stored in the models slot

bootstraps

Number of bootstrap samples, or FALSE to skip resampling. This option significantly extends the model fitting time!

allowed_zero_bins

number of allowed empty bins in the interval

y_treshold

bins with fewer mutations than this value will be considered empty

y_threshold_pct

bins with fewer mutations than this parameter times the height of the highest peak will be considered empty

av_filter

average filter values to be applied to f

peak_detection_upper_limit

Upper limit of f up to which the main peak is searched for

reward_upper_limit

Mutations under the curve up to this limit will be rewarded

control

control list passed to stats::optim()

verbose

controls how much progress information is printed; defaults to get_cevomod_verbosity()
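To make the filtering and empty-bin arguments more concrete, here is a small base-R sketch using the default values from the Usage section. It is only an illustration with made-up counts: which vector cevomod smooths with av_filter, and whether the two thresholds are combined with a logical OR as assumed below, is not stated on this page.

```r
# Default argument values copied from the Usage section
av_filter <- c(1/3, 1/3, 1/3)
y_treshold <- 1
y_threshold_pct <- 0.01

# A three-point averaging filter, applied here to an arbitrary toy vector
x <- c(3, 6, 9, 12, 15)
smoothed <- as.numeric(stats::filter(x, av_filter))
# interior values are the 3-point means (6, 9, 12); both ends are NA

# Toy mutation counts per bin (made-up numbers, not cevomod data)
y <- c(200, 50, 3, 1, 0)

# Assumption: a bin counts as empty if it fails either threshold
is_empty <- y < y_treshold | y < y_threshold_pct * max(y)
is_empty
# FALSE FALSE FALSE TRUE TRUE
```

With a highest peak of 200 mutations, y_threshold_pct = 0.01 marks every bin with fewer than 2 mutations as empty, which is stricter here than y_treshold = 1.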

Examples

data("tcga_brca_test")
cd <- tcga_brca_test |>
  dplyr::filter(sample_id %in% c("TCGA-AC-A23H-01","TCGA-AN-A046-01")) |>
  fit_powerlaw_tail_optim()
#> Fitting optimized power-law models...
#> Models fitted in 1.26474070549011 seconds
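The control argument is forwarded to stats::optim(). The following standalone sketch shows how a control list of the same shape as the default is consumed by stats::optim(). The objective `toy_objective()` is a hypothetical stand-in with a known minimum, not the cevomod objective, and the "BFGS" method is chosen only because it uses ndeps for its finite-difference gradient; the method cevomod uses is not stated on this page.

```r
# Hypothetical stand-in objective with its minimum at A = 3, alpha = 2
toy_objective <- function(par) {
  (par[[1]] - 3)^2 + (par[[2]] - 2)^2
}

fit <- stats::optim(
  par = c(A = 1, alpha = 1),          # starting values (arbitrary)
  fn = toy_objective,
  method = "BFGS",                    # assumption: a gradient-based method
  control = list(
    maxit = 1000,                     # maximum number of iterations
    ndeps = c(0.1, 0.01)              # finite-difference step for A and alpha
  )
)

fit$par          # estimated A and alpha, close to 3 and 2
fit$convergence  # 0 indicates successful convergence
```

ndeps takes one step size per parameter, which is why the default has two values: one for A and one for alpha.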