Skip to content

Commit

Permalink
New formula for proportion transmission (#99)
Browse files Browse the repository at this point in the history
* Create proportion_transmission2.R

* Update proportion_transmission2.R

* integrate proportion_transmission2 into proportion_transmission and add method argument

* add utility functions for proportion_transmission method = t_20

* add unit tests for proportion_transmission method = t_20

* add @dcadam as package author

* update WORDLIST

* linting documentation

* update functions, function documentation, tests and vignettes to use new {epiparameter} function and class names

---------

Co-authored-by: Joshua Lambert <[email protected]>
Co-authored-by: Dillon Adam <[email protected]>
  • Loading branch information
dcadam and joshwlambert authored Aug 28, 2024
1 parent 9dd23e0 commit f495577
Show file tree
Hide file tree
Showing 9 changed files with 328 additions and 13 deletions.
9 changes: 8 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,13 @@ Authors@R:
email = "[email protected]",
comment = c(ORCID = "0000-0001-8814-9421")
),
person(
given = "Dillon C.",
family = "Adam",
role = c("aut"),
email = "[email protected]",
comment = c(ORCID = "0000-0002-7485-9905")
),
person(
given = "Sebastian",
family = "Funk",
Expand Down Expand Up @@ -53,7 +60,7 @@ URL: https://github.com/epiverse-trace/superspreading, https://epiverse-trace.gi
BugReports: https://github.com/epiverse-trace/superspreading/issues
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
Imports:
checkmate,
stats,
Expand Down
109 changes: 100 additions & 9 deletions R/proportion_transmission.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,17 +9,62 @@
#' This can be calculated using `proportion_transmission()` at varying values of
#' \eqn{R} and for different values of percentage transmission.
#'
#' @details Multiple values of R and k can be supplied and a `<data.frame>` of
#' There are two methods for calculating the proportion of transmission,
#' \eqn{p_{80}} (default) and \eqn{t_{20}}, see `method` argument or details for
#' more information.
#'
#' @details
#' Calculates the expected proportion of transmission from a given
#' proportion of infectious cases. There are two methods to calculate this with
#' distinct formulations, \eqn{p_{80}} and \eqn{t_{20}} these can be specified
#' by the `method` argument.
#'
#' `method = p_80` calculates relative transmission heterogeneity
#' from the offspring distribution of secondary cases, \eqn{Z}, where the upper
#' proportion of the distribution comprise \eqn{x%} of total number of cases
#' given R0 and k, where \eqn{x} is typically defined as 0.8 or 80%. e.g. 80%
#' of all transmissions are generated by the upper 20% of cases, or
#' `p_80 = 0.2`, per the 80/20 rule. In this formulation, changes in R can
#' have a significant effect on the estimate of \eqn{p_80} even when k is
#' constant. Importantly, this formulation **does not** allow for true
#' homogeneity when `k = Inf` i.e. \eqn{p_{80} = 0.8}.
#'
#' `method = t_20` calculates a similar ratio, instead in terms of
#' the theoretical individual reproductive number and infectiousness given
#' R0 and k. The individual reproductive number, 'v', is described in
#' Lloyd-Smith JO et al. (2005), "as a random variable representing the
#' expected number of secondary cases caused by a particular infected
#' individual. Values for v are drawn from a continuous gamma probability
#' distribution with population mean R0 and dispersion parameter k, which
#' encodes all variation in infectious histories of individuals, including
#' properties of the host and pathogen and environmental circumstances." The
#' value of k corresponds to the shape parameters of the gamma distribution
#' which encodes the variation in the gamma-poisson mixture aka the negative
#' binomial
#'
#' For `method = t_20`, we define the upper proportion of infectiousness,
#' which is typically 0.2 i.e. the upper 20% most infectious
#' cases, again per the 80/20 rule. e.g. the most infectious 20% of cases,
#' are expected to produce 80% of all infections, or `t_20 = 0.8`. Unlike
#' `method = p_80`, changes in R have no effect on the estimate
#' of t_80 when k is constant, but R is still required for the underlying
#' calculation. This formulation **does** allow for true homogeneity when
#' k = Inf i.e. t_20 = 0.2, or t_80 = 0.8.
#'
#' Multiple values of R and k can be supplied and a `<data.frame>` of
#' every combination of these will be returned.
#'
#' The numerical calculation uses random number generation to simulate secondary
#' contacts so the answers may minimally vary between calls. The number of
#' simulation replicates is fixed to `1e5`.
#' The numerical calculation for `method = p_80` uses random number generation
#' to simulate secondary contacts so the answers may minimally vary between
#' calls. The number of simulation replicates is fixed to `1e5`.
#'
#' @inheritParams proportion_cluster_size
#' @inheritParams probability_epidemic
#' @param percent_transmission A `number` of the percentage transmission
#' for which a proportion of cases has produced.
#' @param method A `character` string defining which method is used to calculate
#' the proportion of transmission. Options are `"p_80"` (default) or `"t_20"`.
#' See details for more information on each of these methods.
#' @param simulate A `logical` whether the calculation should be done
#' numerically (i.e. simulate secondary contacts) or analytically. Default is
#' `FALSE` which uses the analytical calculation.
Expand All @@ -37,6 +82,20 @@
#' sizes outside China. Wellcome Open Research, 5.
#' \doi{10.12688/wellcomeopenres.15842.3}
#'
#' The \eqn{t_{20}} method follows the formula defined in section 2.2.5 of the
#' supplementary material for:
#'
#' Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the
#' effect of individual variation on disease emergence. Nature. 2005
#' Nov;438(7066):355–9. \doi{10.1038/nature04153}
#'
#' The original code for the \eqn{t_{20}} method is from ongoing work
#' originating from \url{https://github.com/dcadam/kt} and:
#'
#' Adam D, Gostic K, Tsang T, Wu P, Lim WW, Yeung A, et al.
#' Time-varying transmission heterogeneity of SARS and COVID-19 in Hong Kong.
#' 2022. \doi{10.21203/rs.3.rs-1407962/v1}
#'
#' @examples
#' # example of single values of R and k
#' percent_transmission <- 0.8 # 80% of transmission
Expand Down Expand Up @@ -65,6 +124,7 @@
#' )
proportion_transmission <- function(R, k,
percent_transmission,
method = c("p_80", "t_20"),
simulate = FALSE,
...,
offspring_dist,
Expand All @@ -76,6 +136,7 @@ proportion_transmission <- function(R, k,
call. = FALSE
)
}
method <- match.arg(method)

# check inputs
chkDots(...)
Expand All @@ -90,6 +151,13 @@ proportion_transmission <- function(R, k,
checkmate::assert_logical(simulate, any.missing = FALSE, len = 1)
checkmate::assert_logical(format_prop, any.missing = FALSE, len = 1)

if (simulate && method == "t_20") {
stop(
"The simulate argument must be FALSE when method = t_20.",
call. = FALSE
)
}

df <- expand.grid(R = R, k = k, NA_real_)
colnames(df) <- c("R", "k", paste0("prop_", percent_transmission * 100))

Expand All @@ -101,11 +169,19 @@ proportion_transmission <- function(R, k,
percent_transmission = percent_transmission
)
} else {
prop <- .prop_transmission_analytical(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
if (method == "p_80") {
prop <- .prop_transmission_analytical(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
} else {
prop <- .prop_transmission_t20(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
}
}

if (format_prop) {
Expand Down Expand Up @@ -161,3 +237,18 @@ proportion_transmission <- function(R, k,

return(out)
}

#' Calculates the expected proportion of transmission from the upper `prop`%
#' of most infectious cases
#'
#' @return A `numeric`
#' @keywords internal
#' @noRd
.prop_transmission_t20 <- function(R, k, percent_transmission) {
u <- solve_for_u(prop = percent_transmission, R = R, k = k)
integral_result <- stats::integrate(
function(u) u * fvx(u, R, k), lower = 0, upper = u
)
res <- integral_result$value / R
return(1 - res)
}
65 changes: 65 additions & 0 deletions R/utils.R
Original file line number Diff line number Diff line change
Expand Up @@ -193,3 +193,68 @@ ic_tbl <- function(..., sort_by = c("AIC", "BIC", "none")) {
}

`%||%` <- function(x, y) if (is.null(x)) y else x

#' Defines the gamma functions in terms of the mean reproductive number
#' (R) and the dispersion parameter (k)
#'
#' @description
#' * [dgammaRk()] for the gamma density function
#' * [pgammaRk()] for the gamma distribution function
#' * [fvx()] fore the gamma probability density function (pdf) describing the
#' individual reproductive number \eqn{\nu} given R0 and k
#'
#' @inheritParams stats::dgamma
#' @inheritParams probability_epidemic
#'
#' @keywords internal
#' @name gamma
dgammaRk <- function(x, R, k) {
out <- stats::dgamma(x, shape = k, scale = R / k)
return(out)
}

#' @name gamma
pgammaRk <- function(x, R, k) {
out <- stats::pgamma(x, shape = k, scale = R / k)
return(out)
}

#' @name gamma
fvx <- function(x, R, k) {
dgammaRk(x = x, R = R, k = k)
}


#' Generates a log scaled sequence of real numbers
#'
#' @inheritParams base::seq
#'
#' @inherit base::seq return
#' @keywords internal
lseq <- function(from, to, length.out) {
exp(seq(log(from), log(to), length.out = length.out))
}

#' Unitroot solver for the solution to u, or the value of x from the gamma CDF
#' given the desired proportion of transmission
#'
#' @return A `numeric` with the location of the root.
#' @keywords internal
#' @noRd
solve_for_u <- function(prop, R, k) {
f <- function(x, prop) {
res <- 1 - pgammaRk(x, R, k)
res - prop
}
# Initial interval for u
lower <- 0
upper <- 1
# Find the root using uniroot
root <- stats::uniroot(
f,
prop = prop,
interval = c(lower, upper),
extendInt = "yes"
)
return(root$root)
}
8 changes: 8 additions & 0 deletions inst/WORDLIST
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,11 @@ Fitdistrplus
frac
Getz
ggplot
Gostic
Hanne
Heleze
hline
Inf
infectees
infector
infectors
Expand All @@ -45,6 +47,7 @@ Koen
Kremer
Laure
Lifecycle
Lim
linetype
Loucoubar
lw
Expand Down Expand Up @@ -76,15 +79,20 @@ SARS
Schreiber
Selina
Sien
SJ
Soropogui
testthat
Torneri
Tsang
vars
Verdonschot
viridis
vline
Wellcome
WM
WW
xintercept
Yeung
yintercept
ymax
ymin
Expand Down
32 changes: 32 additions & 0 deletions man/gamma.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 29 additions & 0 deletions man/lseq.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit f495577

Please sign in to comment.