Hostname: page-component-788cddb947-xdx58 Total loading time: 0 Render date: 2024-10-18T04:35:44.741Z Has data issue: false hasContentIssue false

Asymptotic persistence time formulae for multitype birth–death processes

Published online by Cambridge University Press:  21 March 2023

Frank G Ball*
Affiliation:
University of Nottingham
Damian Clancy*
Affiliation:
Heriot-Watt University
*
*Postal address: School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK. Email address: frank.ball@nottingham.ac.uk
**Postal address: Department of Actuarial Mathematics and Statistics, Maxwell Institute for Mathematical Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK. Email address: d.clancy@hw.ac.uk
Rights & Permissions [Opens in a new window]

Abstract

We consider a class of processes describing a population consisting of k types of individuals. The process is almost surely absorbed at the origin within finite time, and we study the expected time taken for such extinction to occur. We derive simple and precise asymptotic estimates for this expected persistence time, starting either from a single individual or from a quasi-equilibrium state, in the limit as a system size parameter N tends to infinity. Our process need not be a Markov process on $ {\mathbb Z}_+^k$; we allow the possibility that individuals’ lifetimes may follow more general distributions than the exponential distribution.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

A fundamental issue in modelling biological populations is the risk of a population or species becoming extinct. The corresponding issue in modelling infectious spread is extinction of infection from a population. In either case, a random variable of particular interest is the persistence time until extinction occurs.

A substantial body of recent research addresses the issue via the semi-rigorous WKB (Wentzel–Kramers–Brillouin) approach; see the review papers [Reference Assaf and Meerson2, Reference Ovaskainen and Meerson20]. Such work focuses specifically on the supercitical case (defined precisely in Section 2 below), so that the expected persistence time grows exponentially with population size. For many naturally one-dimensional models (processes that can be approximated, in a sense made precise in Section 2.3, by the solution of a one-dimensional ordinary differential equation), it can be shown that in the supercritical case, as a system size parameter N tends to infinity, the expected persistence time $\tau$ starting from quasi-equilibrium satisfies

(1) \begin{equation}\tau \sim \frac{K}{\sqrt{N}} \exp (A N) , \end{equation}

where we use $\sim$ to denote that the ratio of the two sides converges to 1 as $N \to \infty$ . Here A and K are constants whose values do not depend upon N, with explicit formulae available for A and K in terms of parameters of the process.

For multidimensional models, it is usually only possible to establish results of the cruder form $\ln \tau \sim A N$ , and to evaluate the leading-order constant A via numerical solution of a system of ordinary differential equations. An exception is provided by [Reference Clancy9], where a relation of the form (1) and explicit formulae for A and K are obtained using the WKB approach for a susceptible–infectious–susceptible (SIS) infection model in a heterogeneous population, a naturally multidimensional process.

In a different vein, a rather more rigorous approach is taken in [Reference Chazottes, Collet and Méléard7]. For a general class of multitype birth–death processes, the existence of a quasi-stationary distribution is proved and a bound established for the total variation distance between the process conditioned to non-extinction before time t and the quasi-stationary distribution, As a by-product of this analysis, it is shown that there exist constants $d_1 \gt d_2 \gt 0$ such that for all sufficiently large N, the mean persistence time $\tau$ starting from quasi-stationarity satisfies $\exp ( d_2 N ) \le \tau \le \exp ( d_1 N )$ .

A limitation of both the WKB approach and the approach of [Reference Chazottes, Collet and Méléard7] is that the models considered generally assume that individuals’ lifetimes or infectious periods are exponentially distributed. This is not usually biologically realistic, the assumption being rather for reasons of mathematical tractability. In [Reference Ball, Britton and Neal5] the effects of relaxing this assumption were considered for a class of (one-dimensional) birth–death processes, via application of a network insensitivity result of [Reference Zachary23]. The subcritical and critical cases were considered in addition to the more widely studied supercritical case. In the critical case it was found that the expected persistence time $\tau_0$ , starting from one initial individual, depends logarithmically upon the system size parameter N as $N \to \infty$ , whereas in the subcritical case, $\tau_0$ was shown to converge to a finite limit as $N \to \infty$ . In [Reference Clancy9], the multitype version of the insensitivity result of [Reference Zachary23] was applied to allow for non-exponentially distributed infectious periods in the SIS infection model in a heterogeneous population, in the supercritical case.

In the current paper, we consider a particular class of multitype birth–death processes, defined in Section 2. We build on the approach of [Reference Ball, Britton and Neal5], so we do not need to assume exponentially distributed lifetimes. For this class of processes we are able to obtain unusually precise results, including a result of the form (1) in the supercritical case, with explicit formulae for the constants A and K. The only multidimensional process for which such a precise result has previously been available is the heterogeneous population SIS model of [Reference Clancy9]. Not only is the class of processes considered here considerably more general than in [Reference Clancy9], but the methods of proof are quite different, and in contrast to [Reference Clancy9] we consider the subcritical and critical cases as well as the supercritical case. The class of processes is more restricted than that of [Reference Chazottes, Collet and Méléard7], but our results, unlike those of [Reference Chazottes, Collet and Méléard7], do not require exponentially distributed lifetimes. In estimating the mean persistence time starting from a single individual, our approach is fully rigorous, unlike the WKB method of [Reference Clancy9]. In estimating the mean persistence time starting from quasi-stationarity, we present only a heuristic sketch proof, referring the reader to [Reference Ball, Britton and Neal4] for full details of the proof in a specific one-dimensional setting.

The remainder of the paper is structured as follows. In Section 2, we define our general modelling framework and give two illustrative examples, before describing two standard approximating processes that provide motivation and intuition for what follows. In Section 3, we define a restarted version of our general process and show how this restarted process may be used to analyse extinction times for the original process of interest. In Sections 4 and 5, we consider the subcritical and critical cases, respectively, obtaining limiting results (Theorems 1, 2, and 3) for the expected extinction time starting from a single initial individual. We move on in Section 6 to the supercritical case, for which we again obtaining a limiting result (Theorem 4) for the expected extinction time starting from a single initial individual, before going on to outline the corresponding result (the formula (49)) for the expected extinction time starting from quasi-equilibrium. Finally, in Section 7 we present some concluding discussion.

2. The model

We consider a sequence of multitype birth–death processes indexed by N, where N is an overall system size parameter, our interest being in the limiting behaviour as $N \to \infty$ . The state of the process at time $t \ge 0$ is denoted by

\begin{equation*}\boldsymbol{{X}}^{(N)} (t) = \left( X_1^{(N)} (t) , X_2^{(N)} (t) , \ldots , X_k^{(N)} (t) \right) \in {\mathbb Z}_+^k,\end{equation*}

where k is the number of types of individual, and for $i=1,2,\ldots,k$ , $X_i^{(N)} (t)$ is the number of type i individuals present at time t. We assume that the state space S is either the whole of ${\mathbb Z}_+^k$ , or the finite set $\left\{ \boldsymbol{{x}} = \left( x_1 , x_2 , \ldots , x_k \right) :\ 0 \le x_i \le N_i \mbox{ for } i=1,2,\ldots,k \right\}$ for some $\boldsymbol{{N}} = \left( N_1 , N_2 , \ldots , N_k \right)$ , and that the extinction state $\boldsymbol{{x}} = {\bf 0}$ is absorbing, while $C = S \setminus \{ {\bf 0} \}$ forms a single communicating class. In the case of finite state space, noting that $N_i$ is the upper bound for the number of individuals of type i, we take $N = N_1 + N_2 + \cdots + N_k$ , set $f_i = N_i / N$ for $i=1,2,\ldots,k$ , and assume that $f_1 , f_2 , \ldots , f_k$ are all strictly positive. For simplicity we assume that $f_1 , f_2 , \ldots , f_k$ do not vary with N, so when we write, for instance, $N \to \infty$ , it is implicit that N increases through a subsequence of integers such that $N f_i$ is always integer-valued for all i. To avoid overly cumbersome notation, we suppress the superscript N from now on.

We assume that all birth and death rates scale with N in the manner of the ‘density-dependent’ processes of Chapter 11 of [Reference Ethier and Kurtz13], so that when N is large the scaled process $\boldsymbol{{X}} (t) / N$ may be approximated by a deterministic process (see Section 2.3 for details). We allow the birth and death rates for type i individuals to depend upon (i) the (scaled) number of type i individuals present, and (ii) the (scaled) total number of individuals of all types. More precisely, if we denote by $\xi_1 , \xi_2 , \ldots , \xi_k$ a set of independent homogeneous Poisson processes of rate 1, then for $i=1,2,\ldots,k$ , new type i individuals are born at the points of the time-transformed process

(2) \begin{equation}\xi_i \left( N \int_{u=0}^t b_0 \left( \sum_{j=1}^k X_j (u) / N \right) b_i \left( X_i (u) / N \right) du \right) \end{equation}

for some functions $b_0 , b_1 , \ldots , b_k$ from ${\mathbb R}^+$ to ${\mathbb R}^+$ . Each newly born type i individual is assigned a ‘lifeforce’ distributed as a non-negative random variable $Q_i$ , independent of other individuals’ lifeforce variables and of the processes $\xi_1 , \xi_2 , \ldots , \xi_k$ , with $\mathbb{E} \left[ Q_i \right] = 1$ for $i=1,2,\ldots,k$ . Each type i individual’s lifeforce reduces at rate

(3) \begin{equation}N d_0 \left( \sum_{j=1}^k X_j (t) / N \right) d_i \left( X_i (t) / N \right) / X_i (t)\end{equation}

for some functions $d_0 , d_1 , \ldots , d_k$ from ${\mathbb R}^+$ to ${\mathbb R}^+$ , until it reaches zero, at which point the individual is removed from the population (dies). Note that if $d_0 (y) = 1$ and $d_i(y) = y / \alpha_i$ for $i=1,\ldots,k$ , for some constants $\alpha_1 , \alpha_2 , \ldots , \alpha_k \gt 0$ , then the time that a type i individual remains alive in the population is distributed as $\alpha_i Q_i$ , so the lifeforce random variables are simply scaled lifetimes. More generally, the functions $d_0 , d_1 , \ldots , d_k$ may be interpreted as competition effects, so that $d_0$ represents the effect on an individual’s lifespan of competition from the entire population, while for $i=1,2,\ldots,k$ , $d_i$ represents the effect on a type i individual’s lifespan of competition from individuals of its own type. Alternatively, these could be ‘safety in numbers’ effects, whereby the presence of others enhances an individual’s survival chances.

The model that we have defined may be regarded as a piecewise deterministic Markov process [Reference Davis11], in which the state of the process at time t is given by $\boldsymbol{{X}}(t)$ together with the remaining lifeforce values at time t for every individual alive in the population at that time. However, our approach avoids the need for the general machinery of piecewise deterministic Markov processes, so we will not pursue this. In the case that $Q_i$ is exponentially distributed for $i=1,2,\ldots,k$ , $\left\{ \boldsymbol{{X}} (t) :\ t \ge 0 \right\}$ is a continuous-time Markov process with transition rates as follows (where $\boldsymbol{{e}}_i$ denotes the unit vector with ith element equal to 1 and all other elements equal to 0):

\begin{align*}\boldsymbol{{x}} \to \boldsymbol{{x}} + \boldsymbol{{e}}_i & \quad\mbox{ at rate } \quad N b_0 \left( \left. \sum_{j=1}^k x_j \right/ N \right) b_i ( x_i / N ) , \\\boldsymbol{{x}} \to \boldsymbol{{x}} - \boldsymbol{{e}}_i & \quad\mbox{ at rate } \quad N d_0 \left( \left. \sum_{j=1}^k x_j \right/ N \right) d_i ( x_i / N ) .\end{align*}

To see this, we use an argument based on a construction of [Reference Sellke21]. Suppose that $\boldsymbol{{X}}(t) = \boldsymbol{{x}}$ for some $\boldsymbol{{x}} \in {\mathbb Z}_+^k$ . Since each $Q_i$ has constant hazard rate equal to 1, it follows from the expression (3) that the probability that a particular type i individual is removed in the time interval $[t , t + \delta t )$ is $N d_0 \left( \sum_{j=1}^k x_j / N \right) \left( d_i \left( x_i / N \right) / x_i \right) \delta t + o ( \delta t )$ . Independence of the lifeforce random variables of different individuals then implies that the probability that exactly one individual is removed in the time interval $[t , t + \delta t )$ , and they are of type i, is $N d_0 \left( \sum_{j=1}^k x_j / N \right) d_i \left( x_i / N \right) \delta t + o ( \delta t )$ . Similarly, the expression (2) together with independence of the processes $\xi_1 , \xi_2 , \ldots , \xi_k$ implies that the probability that exactly one individual is born in the time interval $[t , t + \delta t )$ , and they are of type i, is $N b_0 \left( \left. \sum_{j=1}^k x_j \right/ N \right) b_i ( x_i / N ) \delta t + o ( \delta t )$ . Independence between the lifeforce random variables and the processes $\xi_i$ then yields the required result.

2.1. Examples

Two examples of processes that illustrate the biological usefulness of the framework described above are as follows.

Example 1. The susceptible–infectious–susceptible (SIS) infection model of [Reference Clancy8, Reference Clancy9] with heterogeneous susceptibilities and infectious periods. This process has finite state space, with $N_i$ now representing the (constant) total number of type i individuals in the population, while $X_i (t)$ is the number of infectious type i individuals at time t. Rate functions are $b_0(y) = \beta y$ , $d_0 (y) = 1$ , and for $i=1,2,\ldots,k$ , $b_i (y) = \mu_i ( f_i - y )$ , $d_i (y) = y/\alpha_i$ . Here $\beta$ is an overall infection rate parameter, while for $i=1,2,\ldots,k$ , $\mu_i$ gives the level of susceptibility of type i individuals, $\alpha_i$ is the mean infectious period of type i individuals, and $f_i = N_i / N$ is the proportion of all individuals that are of type i. We assume that $\beta \gt 0$ and that $\alpha_i , \mu_i , f_i \gt 0$ for $i=1,2,\ldots, k$ , and without loss of generality we scale the $\mu_i$ values so that $\sum_{i=1}^k \mu_i f_i = 1$ .

Example 2. The multitype birth–death process with linear birth rates and quadratic death rates described in [Reference Chazottes, Collet and Méléard7, Section 2.2]. This process has state space ${\mathbb Z}_+^k$ , and may be obtained by setting $b_0(y) = \lambda y$ , $d_0(y) = \mu + \kappa y$ , and for $i=1,2,\ldots,k$ , $b_i (y) = 1$ , $d_i (y) = y$ , with parameters $\lambda , \mu , \kappa \gt 0$ . In [Reference Chazottes, Collet and Méléard7] it is additionally assumed that the lifeforce variables $Q_i$ are exponentially distributed.

2.2. Assumptions

In order to prove our results, we adopt the following assumptions throughout.

In the case that $S = {\mathbb Z}_+^k$ we set $\tilde C_0 = \tilde C_1 = \cdots = \tilde C_k = (0,\infty)$ , while in the case of finite state space we set $\tilde C_0 = (0,1)$ and $\tilde C_i = \left( 0 , f_i \right)$ for $i=1,2,\ldots,k$ . To ensure that ${\bf 0}$ is an absorbing state, that $C = S \setminus \{ {\bf 0} \}$ is a communicating class, and that the process cannot leave S, we require that

(4) \begin{equation} b_0 (0) = 0 , \ b_0 (y) \gt 0 \mbox{ for } y \in \tilde C_0 ,\ d_0 (y) \gt 0 \mbox{ for } y \in \tilde C_0 , \end{equation}

and for $i=1,2,\ldots,k$ ,

(5) \begin{equation} b_i (0) \gt 0 , \ b_i (y) \gt 0 \mbox{ for } y \in \tilde C_i , \ d_i (0) = 0 , \ d_i (y) \gt 0 \mbox{ for } y \in \tilde C_i . \end{equation}

In the case of finite state space, we additionally require

(6) \begin{equation} d_0 \left( 1 \right) \gt 0\mbox{ and for } i=1,2,\ldots,k, b_i \left( f_i \right) = 0 , \ d_i \left( f_i \right) \gt 0 . \end{equation}

We further assume that

(7) \begin{equation} b_0 , b_1 , \ldots , b_k, d_0 , d_1 , \ldots , d_k \mbox{ are differentiable throughout their domains},\end{equation}
(8) \begin{equation}d_0 (0) \gt 0, \mbox{ and } b_0^\prime (0) , d_1^\prime (0) , d_2^\prime (0) , \ldots , d_k^\prime (0) \gt 0.\qquad\qquad\qquad\qquad\qquad \end{equation}

Finally, define

(9) \begin{equation}a ( \boldsymbol{{x}} ) = \frac{1}{N d_0 (1/N)} \prod_{r=1}^{\sum_j x_j - 1} \left( \frac{b_0(r/N)}{d_0((r+1)/N)} \right) \prod_{i=1}^k \prod_{r=0}^{x_i-1} \left( \frac{b_i (r/N)}{d_i((r+1)/N)} \right) . \end{equation}

In Section 3 we will define a modified version of our process, restarted in a particular way whenever the original process is absorbed at the state $\boldsymbol{{x}} = {\bf 0}$ . The stationary distribution of this restarted process is found to be proportional to $\{ a ( \boldsymbol{{x}} ) : \boldsymbol{{x}} \in S \}$ ; see Section 3. We assume that for all sufficiently large N,

(10) \begin{equation}\qquad\qquad\qquad\qquad\qquad\qquad\quad\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) \lt \infty \end{equation}
(11) \begin{equation}\mbox{and } \sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) b_0 \left( \frac{\sum_{j=1}^k x_j}{N} \right) \sum_{i=1}^k b_i ( x_i / N ) \lt \infty . \end{equation}

The significance of the conditions (10) and (11) is discussed in Section 3. Note that in the case of a finite state space, these two conditions are trivially satisfied. For Example 2 of Section 2.1, we verify (10) as follows. Recalling the multinomial formula

\begin{equation*}k^l = (1 + 1 + \cdots + 1 )^l= \sum_{\left\{ \boldsymbol{{x}} \in \mathbb{Z}_{+}^k : \sum_i x_i = l \right\}} \frac{l!}{\prod_{i=1}^k x_i !},\end{equation*}

we have

\begin{align*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) & = \frac{1}{\mu + ( \kappa / N)} \sum_{l=1}^\infty \left( \frac{\lambda}{\kappa} \right)^{l-1} (l-1)! \prod_{r=1}^{l - 1} \left( \frac{1}{(\mu / \kappa ) + ((r+1)/N)} \right)\\ & \times \sum_{\left\{ \boldsymbol{{x}} \in {\mathbb Z}_+^k : \sum_i x_i = l \right\}} \prod_{i=1}^k \frac{1}{x_i !} \\&= \frac{k}{\mu + ( \kappa / N)} \sum_{l=1}^\infty \left( \frac{\lambda k}{\kappa} \right)^{l-1} \frac{1}{l} \prod_{r=1}^{l - 1} \left( \frac{1}{(\mu / \kappa ) + ((r+1)/N)} \right) \\& < \frac{k}{\mu + ( \kappa / N)} \sum_{l=1}^\infty \left( \frac{\lambda k N}{\kappa} \right)^{l-1} \left( \frac{1}{l} \right) \left( \frac{1}{l!} \right) \\& < \frac{\kappa}{\lambda N \left( \mu + ( \kappa / N) \right)} \left( \exp \left( \lambda k N / \kappa \right) - 1 \right)\;\;<\;\; \infty .\end{align*}

The condition (11) follows by similar arguments.

We assume the conditions (4)–(8), (10), (11) throughout. For the supercritical case, we require the further assumptions (35)–(39), set out in Section 6.

2.3. Deterministic and branching process approximations

We now describe two approximating processes, valid in different regimes, that are useful in motivating our methods and giving some intuition behind our results.

Firstly, when N is large and, for $i=1,2,\ldots,k$ , $X_i(t)$ is of comparable order to N, the scaled process $\boldsymbol{{X}}(t) / N$ may be approximated by the deterministic process $\boldsymbol{{y}}(t) = \left( y_1 (t) , y_2 (t) , \ldots , y_k (t) \right)$ defined by

(12) \begin{align}y_i (t) & = \int_{s=0}^\infty b_0 \left( \sum_{j=1}^k y_j (t-s) \right) b_i ( y_i (t-s) ) \nonumber\\ & \times\mathbb{P} \left( Q_i \gt \int_{u=0}^s \frac{d_0 \left( \sum_{j=1}^k y_j (t-u) \right) d_i ( y_i (t-u) )}{y_i (t-u)} \, du \right) ds \end{align}

for $i=1,2,\ldots,k$ . That is, in the deterministic process, the (scaled) number of individuals of type i at time t, $y_i (t)$ , consists of all those individuals born into group i at time $t-s$ who remain alive at time t, integrated over $s \ge 0$ . Individuals are born into group i at time $t-s$ at rate $b_0 \left( \sum_{j=1}^k y_j (t-s) \right) b_i ( y_i (t-s) )$ . The proportion of such individuals remaining alive at time t is equal to $\mathbb{P} \left( Q_i \gt \int_{u=0}^s \eta_i (t-u) \, du \right)$ , where $\eta_i (t-u)$ is the rate of reduction of each type i individual’s lifeforce at time $t-u$ , given by $\eta_i (t-u) = \left. d_0 \left( \sum_{j=1}^k y_j (t-u) \right) d_i ( y_i (t-u) ) \right/ y_i (t-u)$ .

When the $Q_i$ are exponentially distributed, (12) is equivalent to the system

(13) \begin{align}\frac{dy_i}{dt} &= b_0 \left( \sum_{j=1}^k y_j \right) b_i ( y_i ) - d_0 \left( \sum_{j=1}^k y_j \right) d_i ( y_i ) \mbox{ for } i=1,2,\ldots,k . \end{align}

The assumptions (4) and (5) ensure that the system (13) has an equilibrium point at $\boldsymbol{{y}} = {\bf 0}$ , corresponding to extinction. Assuming that $\boldsymbol{{X}}(0)/N \to \boldsymbol{{y}}_0$ as $N \to \infty$ , where $\boldsymbol{{y}}_0 \ne \textbf{0}$ , it follows immediately from [Reference Ethier and Kurtz13, Section 11.2] that if the $Q_i$ are exponentially distributed, then under certain conditions on the functions $b_0 , b_1 , \ldots , b_k$ and $d_0 , d_1 , \ldots , d_k$ , in the limit as $N \to \infty$ the scaled process $\boldsymbol{{X}}(t)/N$ converges in probability over finite time intervals to the process $\boldsymbol{{y}}(t)$ satisfying the ordinary differential equations (13) with initial condition $\boldsymbol{{y}}(0)=\boldsymbol{{y}}_0$ . We will not need to make use of this result; rather the approximating deterministic process is used merely to provide intuitive motivation.

Secondly, when N is large and, for $i=1,2,\ldots,k$ , $X_i(t)$ is small compared to N, the process $\boldsymbol{{X}}(t)$ may be approximated by a (linear) multitype branching process as follows. Suppose that $\boldsymbol{{X}} (t) = \boldsymbol{{x}}$ for some $\boldsymbol{{x}} \in {\mathbb Z}_+^k$ . Then, recalling the assumptions (5) and (7), the rate (3) at which each type i individual’s lifeforce reduces may be approximated by

\begin{align*}\lim_{\boldsymbol{{x}} / N \to {\bf 0}} d_0 \left( \sum_{j=1}^k x_j / N \right) \frac{d_i (x_i/N)}{x_i/N} = d_0(0) d_i^\prime (0) .\end{align*}

Hence each type i individual in the approximating branching process lives for a time distributed as $\left( d_0 (0) d_i^\prime (0) \right)^{-1} Q_i$ . During its lifetime, the rate at which each type i individual gives birth to type j individuals in the approximating branching process is given, from the expression (2), recalling the assumptions (4) and (7), by

\begin{align*}\lim_{\boldsymbol{{x}} / N \to {\bf 0}} \frac{b_0 \left( \sum_{l=1}^k x_l / N \right)}{\sum_{l=1}^k x_l/N} b_j (x_j/N) = b_0^\prime (0) b_j (0)\end{align*}

for $j=1,2,\ldots,k$ .

The mean number of type j offspring produced by a type i individual over its lifetime in this approximating process is

(14) \begin{align}m_{ij} & = \frac{b_0^\prime (0) b_j (0)}{d_0 (0) d_i^\prime (0)} \mbox{ for } i,j = 1,2,\ldots,k. \end{align}

Denoting by M the matrix with elements $m_{ij}$ , the assumptions (5) and (8) ensure that all elements of M are finite and strictly positive. Further, it follows from the expression (14) that M is of rank 1, with Perron–Frobenius eigenvalue $R_0$ given by its trace, viz.

(15) \begin{align}R_0 & = \frac{b_0^\prime (0)}{d_0 (0)} \sum_{j=1}^k \frac{b_j (0)}{d_j^\prime (0)} . \end{align}

This type of branching process approximation is made rigorous for certain epidemic models in, for example, [Reference Ball3, Reference Ball and O’Neill6]. We will not make use of such results, as we employ the approximating branching process only for intuitive motivation and in the heuristic argument of Section 6.2. In the context of epidemic modelling, the matrix M is referred to as the next-generation matrix and its Perron–Frobenius eigenvalue, $R_0$ , as the basic reproduction number. In general, the next-generation matrix M need not be of rank 1; this is a consequence of our model assumptions, in particular the fact that in the approximating branching process, the rate at which type i individuals give birth to type j individuals depends only upon j.

Denote by $\omega_i$ the probability that the multitype branching process initiated by a single individual of type i produces a finite number of progeny, and by $\phi_i ( \cdot )$ the moment generating function of the lifeforce random variable $Q_i$ ; then from standard theory of multitype branching processes, e.g. [Reference Mode19], the probabilities $\omega_i$ satisfy

(16) \begin{align}\omega_i & = \phi_i \left( - \, \frac{b_0^\prime (0)}{d_0 (0) d_i^\prime (0)} \sum_{j=1}^k ( 1 - \omega_j ) b_j (0) \right) \mbox{ for } i=1,2,\ldots,k. \end{align}

For $R_0 \le 1$ , we have $\omega_1 = \omega_2 = \cdots = \omega_k = 1$ , while for $R_0 \gt 1$ , $\boldsymbol{\omega} = \left( \omega_1 , \omega_2 , \ldots , \omega_k \right)$ is the unique solution of the equations (16) in $[0,1]^k \setminus \{ {\bf 1} \}$ , and $\omega_i \lt 1$ for $i=1,2,\ldots,k$ (see [Reference Mode19, Section 1.7]).

This approximating multitype branching process can alternatively be viewed as a single-type branching process, as follows. Each individual has lifetime distributed according to a mixture distribution, distributed as $\left( d_0(0) d_i^\prime(0) \right)^{-1} Q_i$ with probability $b_i(0) / \sum_j b_j(0)$ for $i=1,2,\ldots,k$ , and during its lifetime gives birth at rate $b_0 ^\prime(0) \sum_j b_j(0)$ . Denote by $\omega$ the probability that this single-type branching process produces a finite number of progeny. Then for $R_0 \le 1$ , we have $\omega = 1$ , while for $R_0 \gt 1$ , $\omega$ is the unique solution in [0,1) of

(17) \begin{align}\omega &= \frac{1}{\sum_{j=1}^k b_j (0)} \sum_{i=1}^k b_i (0) \phi_i \left( - \, \frac{b_0^\prime (0)}{d_0 (0) d_i^\prime (0)} \left( \sum_{j=1}^k b_j (0) \right) \left( 1 - \omega \right) \right) . \end{align}

This single-type branching process is equivalent to the preceding multitype branching process provided that the multitype process is initiated by a single individual which is of type i with probability $b_i(0) / \sum_j b_j (0)$ for $i=1,2,\ldots,k$ . Consequently, we have the relationship

(18) \begin{align}\omega &= \left. \left( \sum_{i=1}^k b_i (0) \omega_i \right) \right/ \sum_{j=1}^k b_j (0) . \end{align}

Note that combining the relationship (18) with the equations (16) gives

\begin{align*}\omega_i &= \phi_i \left( - \, \frac{b_0^\prime (0)}{d_0 (0) d_i^\prime (0)} \left( \sum_{j=1}^k b_j (0) \right) ( 1 - \omega ) \right) \mbox{ for } i=1,2,\ldots,k ,\end{align*}

so that each probability $\omega_i$ ( $i=1,2,\ldots,k$ ) can be expressed in terms of the single probability $\omega$ .

A link between the approximating deterministic process and the approximating branching process is as follows. Recalling the assumptions (4) and (5), with $\delta_{ij}$ denoting the Kronecker delta, the Jacobian $J ( {\bf 0} )$ of the deterministic system (13) at $\boldsymbol{{y}} = {\bf 0}$ has elements

(19) \begin{equation}J_{ij} ( {\bf 0} )= b_0^{\prime} (0) b_i (0) - d_0 (0) d_i^{\prime} (0) \delta_{ij} .\end{equation}

From [Reference Diekmann, Heesterbeek and Roberts12, Theorem A.1] or [Reference Van den Driessche and Watmough22, Theorem 2], all eigenvalues of $J ( {\bf 0} )$ have strictly negative real part precisely when $R_0 \lt 1$ , and conversely $J ( {\bf 0} )$ has at least one eigenvalue with strictly positive real part precisely when $R_0 \gt 1$ . That is, subcriticality of the approximating branching process implies local stability of the extinction equilibrium point $\boldsymbol{{y}} = {\bf 0}$ of the deterministic system (13), while supercriticality of the branching process implies local instability of the extinction point.

3. The restarted process

In order to analyse the long-term behaviour of our process, we follow [Reference Ball, Britton and Neal5, Reference Hernández-Suárez and Castillo-Chavez14] in considering first the stationary behaviour of a modified process without extinction. Specifically, we introduce a regeneration step as follows. Whenever the process reaches the state $\boldsymbol{{x}} = {\bf 0}$ , it remains there for an exponentially distributed time of mean 1, after which a birth occurs, being of type i with probability $\rho_i$ for some distribution $\boldsymbol{\rho} = \left( \rho_1 , \rho_2 , \ldots , \rho_k \right)$ , and the process then continues as before. Taking

(20) \begin{align}\rho_i &= \frac{b_i (0)}{\sum_{j=1}^k b_j (0)} \mbox{ for } i=1,2,\ldots , k \end{align}

ensures that for the case of exponentially distributed lifeforce variables $Q_i$ , the restarted process is reversible [Reference Kelly16]. We can then apply detailed balance conditions to find its stationary distribution $\boldsymbol{\pi} = \{ \pi ( \boldsymbol{{x}} ) :\ \boldsymbol{{x}} \in S \}$ . Recalling the definition (9) of $a ( \boldsymbol{{x}} )$ , we find that

(21) \begin{align}\pi ( \boldsymbol{{x}} ) &= \frac{a ( \boldsymbol{{x}} )}{\sum_{j=1}^k b_j (0)} \, \pi ( {\bf 0} ) \mbox{ for } \boldsymbol{{x}} \in C ,\end{align}

with

(22) \begin{eqnarray}\pi ( {\bf 0} ) \,&=&\, \left( 1 + \frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\sum_{j=1}^k b_j (0)} \right)^{-1} . \end{eqnarray}

The assumptions (10) and (11) imply, respectively, that $\boldsymbol{\pi}$ is a proper distribution and that the condition (11) of [Reference Zachary23] is satisfied. From [Reference Zachary23, Theorem 2] it now follows that $\boldsymbol{\pi}$ is the stationary distribution of numbers of individuals of each type in the system, regardless of the distributions of the $Q_i$ (recall that $\mathbb{E} [Q_i]=1$ for $i=1,2, \dots,k$ ).

The stationary distribution $\boldsymbol{\pi}$ of the restarted process, restricted to C, may be used to approximate the quasi-stationary distribution $\boldsymbol{{q}}$ of the original process (assuming that a unique quasi-stationary distribution exists). For the classic (single-type) SIS epidemic model, this gives the approximation of [Reference Kryscio and Lefèvre17, Section 3.1]. It has been shown [Reference Clancy and Pollett10, Reference Keilson and Ramaswamy15] that for general single-type birth–death processes (with exponentially distributed lifetimes), $\boldsymbol{\pi}$ restricted to C is a lower bound for $\boldsymbol{{q}}$ in the sense of likelihood ratio ordering of distributions.

The mean time between regenerations is $1 / \pi ( {\bf 0} )$ . Denoting by $\tau_i$ the mean time taken for the process to first hit state $\bf 0$ starting from a single type i individual (which is the same for the original process as for the restarted version) and recalling the form (20) of the probabilities $\rho_i$ and the formula (22), we thus have

(23) \begin{align}\frac{1}{\pi ( {\bf 0} )} & = 1 + \sum_{i=1}^k \rho_i \tau_i \nonumber \\\Rightarrow \quad\sum_{i=1}^k b_i(0) \tau_i & = \left( \frac{1}{\pi ( {\bf 0})} - 1 \right) \sum_{j=1}^k b_j (0) \nonumber \\& = \sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) . \end{align}

To investigate mean persistence time, we consider the asymptotic behaviour of the sum (23) as $N \to \infty$ .

4. The subcritical case

For the subcritical case $R_0 \lt 1$ , we have the following.

Theorem 1. Suppose that $R_0 \lt 1$ , that the assumptions (4)–(8) are satisfied, and that $b_0^\prime , b_1 , b_2 , \ldots , b_k$ are non-increasing functions, while $d_0 , d_1^\prime , d_2^\prime , \ldots , d_k^\prime$ are non-decreasing functions.

Then the conditions (10), (11) are automatically satisfied, and recalling that $\tau_i$ denotes the mean time to extinction starting from a single type i individual, we have

\begin{eqnarray*}\sum_{i=1}^k b_i(0) \tau_i &\to& - \frac{1}{b_0^\prime (0)} \log \left( 1 - R_0 \right) \mbox{ as } N \to \infty .\end{eqnarray*}

Proof. First, note that the conditions of the theorem imply that

\begin{align*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )& = \frac{1}{N d_0 (1/N)} \sum_{l=1}^\infty \prod_{r=1}^{l - 1} \left( \frac{b_0(r/N)}{d_0((r+1)/N)} \right) \\ & \times \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} \prod_{i=1}^k \prod_{r=0}^{x_i-1} \left( \frac{b_i (r/N)}{d_i((r+1)/N)} \right) \\&\le \frac{1}{d_0(0)} \sum_{l=1}^\infty (l-1)! \left( \frac{b_0^\prime(0)}{d_0 (0)} \right)^{l-1} \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} \prod_{i=1}^k \frac{1}{x_i !} \left( \frac{b_i(0)}{d_i^\prime (0)} \right)^{x_i} .\end{align*}

Now from the multinomial theorem, we have

\begin{align*}\left( \sum_{i=1}^k \frac{b_i(0)}{d_i^\prime (0)} \right)^l & = \sum_{\left\{ \boldsymbol{{x}} \in \mathbb{Z}_+^k : \sum_i x_i = l \right\}} \frac{l!}{\prod_{i=1}^k x_i !} \prod_{i=1}^k \left( \frac{b_i(0)}{d_i^\prime (0)} \right)^{x_i} \\ &\ge l! \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} \prod_{i=1}^k \frac{1}{x_i !} \left( \frac{b_i(0)}{d_i^\prime (0)} \right)^{x_i} \end{align*}

and so

\begin{align*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )&\le \frac{1}{b_0^\prime (0)} \sum_{l=1}^\infty \frac{1}{l} \left( \frac{b_0^\prime(0)}{d_0 (0)} \right)^l \left( \sum_{i=1}^k \frac{b_i(0)}{d_i^\prime (0)} \right)^l \\& = \frac{1}{b_0^\prime (0)} \sum_{l=1}^\infty \frac{R_0^l}{l} \\& = - \frac{1}{b_0^\prime (0)} \log \left( 1 - R_0 \right)\;\;<\;\; \infty ,\end{align*}

so that the condition (10) is satisfied, and from Equation (23) it follows that

\begin{eqnarray*}\limsup_{N \to \infty} \sum_{i=1}^k b_i (0) \tau_i &\le& - \frac{1}{b_0^\prime (0)} \log \left( 1 - R_0 \right) .\end{eqnarray*}

A similar argument shows that under the conditions of the theorem,

$$\matrix{ {\sum\limits_{{\boldsymbol{x}} \in C} a ({\boldsymbol{x}}){b_0}\left( {\sum\limits_{j = 1}^k {{x_j}} /N} \right)\sum\limits_{i = 1}^k {{b_i}} ({x_i}/N)} \hfill \cr { \le (1/N)\sum\limits_{i = 1}^k {{b_i}} (0){R_0}/\left( {1 - {R_0}} \right)\;\; < \;\;\infty ,} \hfill \cr } $$

so that the condition (11) is satisfied.

Next, for any $m \in \mathbb{N}$ , we have

\begin{eqnarray*}\sum_{i=1}^k b_i(0) \tau_i&\ge& \frac{1}{N d_0 (1/N)} \sum_{l=1}^m \prod_{r=1}^{l - 1} \left( \frac{b_0(r/N)}{d_0((r+1)/N)} \right) \\ && {} \times \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} \prod_{i=1}^k \prod_{r=0}^{x_i-1} \left( \frac{b_i (r/N)}{d_i((r+1)/N)} \right) . \end{eqnarray*}

Since m is fixed, then as $N \to \infty$ , for each $r \le m$ we have $N b_0 (r/N) \to r b_0^\prime(0)$ and $d_0 ((r+1)/N) \to d_0 (0)$ , and for $i=1,2,\ldots,k$ we have $b_i (r/N) \to b_i (0)$ and $N d_i((r+1)/N) \to (r+1) d_i^\prime(0)$ . Note also that for $l \le m$ , for all sufficiently large N, $\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\} = \left\{ \boldsymbol{{x}} \in {\mathbb Z}_+^k : \sum_i x_i = l \right\}$ . Hence

\begin{eqnarray*}\liminf_{N \to \infty} \sum_{i=1}^k b_i(0) \tau_i&\ge& \frac{1}{d_0 (0)} \sum_{l=1}^m\left( \frac{b_0^\prime(0)}{d_0(0)} \right)^{l-1} (l-1)!\\ && {} \times \sum_{\left\{ \boldsymbol{{x}} \in {\mathbb Z}_+^k : \sum_i x_i = l \right\}} \prod_{i=1}^k \frac{1}{x_i !} \left( \frac{b_i (0)}{d_i^\prime(0)} \right)^{x_i} \\\,&=&\, \frac{1}{b_0^\prime (0)} \sum_{l=1}^m \frac{1}{l} \left( \frac{b_0^\prime(0)}{d_0(0)} \right)^l\left( \sum_{i=1}^k \frac{b_i (0)}{d_i^\prime(0)} \right)^l \\ \,&=&\, \frac{1}{b_0^\prime (0)} \sum_{l=1}^m \frac{R_0^l}{l} ,\end{eqnarray*}

and letting $m \to \infty$ we obtain

\begin{eqnarray*}\liminf_{N \to \infty} \sum_{i=1}^k b_i(0) \tau_i &\ge& - \frac{1}{b_0^\prime (0)} \log \left( 1 - R_0 \right) .\end{eqnarray*}

The result follows.

Remark 1. Note that Examples 1 and 2 of Section 2.1 both satisfy the conditions of Theorem 1.

Remark 2. In the symmetric case where $b_1 = b_2 = \cdots = b_k$ and $d_1 = d_2 = \cdots = d_k$ , we have $\tau_1 = \tau_2 = \cdots = \tau_k$ , and so for $i=1,2,\ldots,k$ we have

\begin{eqnarray*}\tau_i &\to& - \frac{1}{k b_0^\prime (0) b_1 (0)} \log \left( 1 - R_0 \right) \mbox{ as } N \to \infty .\end{eqnarray*}

In particular, for Example 2 of Section 2.1, with $R_0 = \lambda k / \mu$ , we have that for $i=1,2,\ldots,k$ ,

\begin{eqnarray*}\tau_i &\to& - \frac{1}{\lambda k} \log \left( 1 - \frac{\lambda k}{\mu} \right) \mbox{ as } N \to \infty .\end{eqnarray*}

Remark 3. For Example 1 of Section 2.1, the heterogeneous SIS model, with $R_0 = \beta \sum_{i=1}^k \alpha_i \mu_i f_i$ , Theorem 1 yields that

(24) \begin{eqnarray}\sum_{i=1}^k \mu_i f_i \tau_i &\to& - \, \frac{\log \left( 1 - R_0 \right)}{\beta} \mbox{ as } N \to \infty. \end{eqnarray}

It seems intuitively clear that for this model, $\tau_i$ will depend upon i only through the mean infectious period $\alpha_i$ , and that larger $\alpha_i$ values will correspond to larger values of $\tau_i$ . Given the form of (24), a natural conjecture then is that $\tau_i \to - \left( \alpha_i / R_0 \right) \log \left( 1 - R_0 \right)$ as $N \to \infty$ . For more general processes satisfying the conditions of Theorem 1, we correspondingly conjecture that $\tau_i \to - \left( \log \left( 1 - R_0 \right) \right) / \left( d_0 (0) d_i^\prime (0) R_0 \right)$ as $N \to \infty$ .

5. The critical case

For the critical case $R_0 = 1$ , we present results for two models, Example 1 of Section 2.1 and an extension of Example 2. The algebra is rather more intricate and model-specific than in the subcritical case.

Theorem 2. For the SIS infection model with heterogeneous susceptibilities and infectious periods, Example 1 of Section 2.1, if $R_0=1$ then

\begin{eqnarray*}\sum_{i=1}^k \mu_i f_i \tau_i = \sum_{i=1}^k b_i(0) \tau_i &\sim& \frac{\log N}{2\beta} \mbox{ as } N \to \infty .\end{eqnarray*}

Proof. For $\boldsymbol{{x}} \in C$ and $\sum_{i=1}^k x_i=l$ ,

(25) \begin{eqnarray}a(\boldsymbol{{x}}) \,&=&\, \frac{1}{N}\prod_{r=1}^{l-1} \left(\beta \frac{r}{N}\right)\prod_{i=1}^k \prod_{r=0}^{x_i-1} \left(\frac{\alpha_i \mu_i\left(f_i-\frac{r}{N}\right)}{\frac{r+1}{N}}\right)\nonumber\\\,&=&\,\frac{\beta^{l-1}}{l} \frac{l!}{\prod_{i=1}^k x_i!}\prod_{i=1}^k \left[(\alpha_i \mu_i f_i)^{x_i} \prod_{r=0}^{x_i-1}\left(1-\frac{r}{f_i N}\right)\right].\end{eqnarray}

Now,

\begin{equation*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) = A_N+B_N,\end{equation*}

where

\begin{equation*}A_N=\sum_{l=1}^{\lfloor \sqrt{N} \rfloor}\sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} a ( \boldsymbol{{x}} )\qquad \text{and}\qquad B_N=\sum_{l=\lfloor \sqrt{N} \rfloor+1}^N \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} a ( \boldsymbol{{x}} ).\end{equation*}

(For $x \in \mathbb{R}$ , $\lfloor x \rfloor$ denotes the greatest integer $\le x$ and, for future reference, $\lceil x \rceil$ denotes the smallest integer $\ge x$ .) Using (25) and the multinomial theorem,

(26) \begin{equation}A_N \le \sum_{l=1}^ {\lfloor \sqrt{N} \rfloor} \frac{\beta^{l-1}}{l}\left(\sum_{i=1}^k \alpha_i \mu_i f_i\right)^l=\beta^{-1}\sum_{l=1}^ {\lfloor \sqrt{N} \rfloor}\frac{1}{l},\end{equation}

since $R_0=\beta \sum_{i=1}^k \alpha_i \mu_i f_i =1$ .

Turning to $B_N$ , note that by the AM/GM inequality, if $x_i>0$ then

\begin{equation*}\prod_{r=0}^{x_i-1}\left(1-\frac{r}{f_i N}\right) \le \left[\frac{1}{x_i}\sum_{r=0}^{x_i-1}\left(1-\frac{r}{f_i N}\right)\right]^{x_i}=\left(1-\frac{x_i-1}{2f_i N}\right)^{x_i},\end{equation*}

and that this inequality holds trivially if $x_i=0$ . Therefore,

(27) \begin{align}\sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}}\frac{l!}{\prod_{i=1}^k x_i!}&\prod_{i=1}^k\left[(\alpha_i \mu_i f_i)^{x_i} \prod_{r=0}^{x_i-1}\left(1-\frac{r}{f_i N}\right)\right]\nonumber\\&\le \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}}\frac{l!}{\prod_{i=1}^k x_i!}\prod_{i=1}^k(\alpha_i \mu_i f_i)^{x_i} \left(1-\frac{x_i-1}{2f_i N}\right)^{x_i}.\end{align}

Further, if $l \gt \lfloor \sqrt{N} \rfloor$ and $x_1+x_2+\dots+x_k=l$ , then at least one of $x_1, x_2, \dots, x_k$ is $\ge \sqrt{N}/k$ . Hence, for all sufficiently large N, for such l,

\begin{equation*}\prod_{i=1}^k(\alpha_i \mu_i f_i)^{x_i} \left(1-\frac{x_i-1}{2f_i N}\right)^{x_i}\le \sum_{i=1}^k \left[(\alpha_i \mu_i f_i)\left(1-\frac{(\sqrt{N}/k)-1}{2f_i N}\right)\right]^{x_i}\prod_{\substack{j=1\\ j\ne i}}^k (\alpha_j \mu_j f_j)^{x_j},\end{equation*}

and using the multinomial theorem, it follows from (27) that

\begin{align*}\sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}}\frac{l!}{\prod_{i=1}^k x_i!}\prod_{i=1}^k&\left[(\alpha_i \mu_i f_i)^{x_i} \prod_{r=0}^{x_i-1}\left(1-\frac{r}{f_i N}\right)\right]\\&\le \sum_{i=1}^k \left[\alpha_i \mu_i f_i\left(1-\frac{(\sqrt{N}/k)-1}{2f_i N}\right)+\sum_{\substack{j=1\\ j\ne i}}^k \alpha_j \mu_j f_j\right]^l\\&= \sum_{i=1}^k \left[\frac{1}{\beta}-\frac{[(\sqrt{N}/k)-1]\alpha_i\mu_i}{2N}\right]^l,\end{align*}

since $R_0=1$ . Thus,

\begin{align*}B_N &\le \sum_{l=\lfloor \sqrt{N} \rfloor+1}^N \frac{1}{\beta l} \sum_{i=1}^k \left(1-\frac{\beta\alpha_i\mu_i[(\sqrt{N}/k)-1]}{2N}\right)^l\\&\le \frac{1}{\beta\sqrt{N}}\sum_{i=1}^k\frac{2N}{\beta\alpha_i\mu_i[(\sqrt{N}/k)-1]}\\&\le \frac{4k}{\beta^2}\sum_{i=1}^k \frac{1}{\alpha_i\mu_i}\end{align*}

for all sufficiently large N, which together with (26) implies

(28) \begin{equation}\limsup_{N \to \infty}\frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\log N} \le \frac{1}{2\beta}.\end{equation}

Fix $V>0$ . Then it follows from (25) that for any $\boldsymbol{{x}} \in C$ with $\sum_{i=1}^k x_i=l$ , where $l \le \lfloor V\sqrt{N} \rfloor$ ,

\begin{equation*}a(\boldsymbol{{x}}) \ge \left[\prod_{i=1}^k \prod_{r=0}^{\lfloor V\sqrt{N} \rfloor}\left(1-\frac{r}{f_i N}\right)\right]\frac{\beta^{l-1}}{l} \frac{l!}{\prod_{i=1}^k x_i!}\prod_{i=1}^k(\alpha_i \mu_i f_i)^{x_i}.\end{equation*}

Thus, for $l \le \lfloor V\sqrt{N} \rfloor$ and $R_0=1$ , application of the multinomial theorem yields

\begin{equation*}\sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} a(\boldsymbol{{x}}) \ge \frac{1}{\beta l}\left[\prod_{i=1}^k \prod_{r=0}^{\lfloor V\sqrt{N} \rfloor}\left(1-\frac{r}{f_i N}\right)\right],\end{equation*}

whence

\begin{equation*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) \ge \sum_{l=1}^{\lfloor V\sqrt{N} \rfloor} \sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} a(\boldsymbol{{x}})\ge \left[\prod_{i=1}^k \prod_{r=0}^{\lfloor V\sqrt{N} \rfloor}\left(1-\frac{r}{f_i N}\right)\right] \frac{1}{\beta} \sum_{l=1}^{\lfloor V\sqrt{N} \rfloor}\frac{1}{l}.\end{equation*}

Straightforward analysis (see Appendix A) yields

(29) \begin{equation}\lim_{N \to \infty} \prod_{i=1}^k \prod_{r=0}^{\lfloor V\sqrt{N} \rfloor}\left(1-\frac{r}{f_i N}\right)=\exp\left(-\frac{V^2}{2}\sum_{i=1}^k f_i^{-1}\right).\end{equation}

Therefore,

\begin{equation*}\liminf_{N \to \infty}\frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\log N} \ge \frac{1}{2\beta}\exp\left(-\frac{V^2}{2}\sum_{i=1}^k f_i^{-1}\right),\end{equation*}

and letting $V \downarrow 0$ ,

(30) \begin{equation}\liminf_{N \to \infty}\frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\log N} \ge \frac{1}{2\beta}.\end{equation}

Recalling (23), the theorem follows from (28) and (30).

Remark 4. Theorem 2 for the case $k=1$ corresponds to the relation (3.7) of [Reference Ball, Britton and Neal5].

We consider now an extension of Example 2 of Section 2.1, in which $d_0(y)=\mu+\kappa y^{\eta}$ , where $\eta \gt 0$ . Note from (15) that $R_0=k \lambda/\mu$ .

Theorem 3. For the above extension of Example 2, if $R_0=1$ , then for $i=1,2,\ldots,k$ ,

\begin{eqnarray*}\tau_i & \sim & \frac{\eta \log N}{\mu (1+\eta)} \mbox{ as } N \to \infty .\end{eqnarray*}

Proof. For $\boldsymbol{{x}} \in C$ and $\sum_{i=1}^k x_i=l$ ,

\begin{align*}a(\boldsymbol{{x}}) &= \frac{1}{N\left(\mu +\kappa (1/N)^{\eta}\right)}\left(\prod_{r=1}^{l-1} \frac{\lambda(r/N)}{\mu+\kappa[(r+1)/N]^{\eta}}\right)\prod_{i=1}^k \prod_{r=0}^{x_i-1} \left(\frac{1}{(r+1)/N}\right)\nonumber\\&= \frac{1}{\mu} \left(\frac{\lambda}{\mu}\right)^{l-1} \frac{(l-1)!}{\prod_{i=1}^k x_i!}\frac{1}{\prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)},\end{align*}

where $\theta=\kappa/\mu$ . Thus, using the multinomial theorem,

\begin{eqnarray*}\sum_{\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\}} a(\boldsymbol{{x}}) \,&=&\, \frac{1}{\mu l} \left(\frac{\lambda}{\mu}\right)^{l-1} \frac{k^l}{\prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)}\end{eqnarray*}

and, since $R_0=1$ ,

\begin{eqnarray*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )\,&=&\,\frac{k}{\mu} \sum_{l=1}^{\infty} \frac{1}{l \prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)}.\end{eqnarray*}

Now

\begin{equation*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )= A_N+B_N,\end{equation*}

where

\begin{equation*}A_N=\frac{k}{\mu}\sum_{l=1}^{\lceil N^{\frac{\eta}{1+\eta}} \rceil} \frac{1}{l \prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)}\qquad \text{and}\qquad B_N=\frac{k}{\mu}\sum_{l=\lceil N^{\frac{\eta}{1+\eta}} \rceil+1}^{\infty} \frac{1}{l \prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)}.\end{equation*}

Clearly,

(31) \begin{equation}A_N \le \frac{k}{\mu} \sum_{l=1}^{\lceil N^{\frac{\eta}{1+\eta}} \rceil} \frac{1}{l}.\end{equation}

Also,

\begin{equation*}\prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right) \ge 1+\frac{\theta}{N^{\eta}}\sum_{i=1}^l i^{\eta}\ge 1+\frac{\theta}{N^{\eta}} \int_0^l u^{\eta}\, du\ge \frac{\theta l^{\eta+1}}{(\eta+1)N^{\eta}},\end{equation*}

so

\begin{align*}\sum_{l=\lceil N^{\frac{\eta}{1+\eta}} \rceil+1}^{\infty} \frac{1}{l \prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)}&\le \frac{(\eta+1)N^{\eta}}{\theta} \sum_{l=\lceil N^{\frac{\eta}{1+\eta}} \rceil+1}^{\infty} \frac{1}{l^{\eta+2}}\\&\le \frac{(\eta+1)N^{\eta}}{\theta} \int_{\lceil N^{\frac{\eta}{1+\eta}} \rceil}^{\infty} u^{-(\eta+2)}\, du \le \theta^{-1}.\end{align*}

Note that in conjunction with (31) this shows that the condition (10) is satisfied, and a similar argument shows that the condition (11) is satisfied. Further, it follows that

(32) \begin{equation}\limsup_{N \to \infty}\frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\log N} \le \frac{k\eta}{\mu(1+\eta)}.\end{equation}

Fix $L>0$ . Then

\begin{eqnarray*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) &\ge& \frac{k}{\mu}\sum_{l=1}^{\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor}\frac{1}{l \prod_{i=1}^l \left(1+\theta(i/N)^{\eta}\right)} \\&\ge& \frac{k}{\mu} \left(\sum_{l=1}^{\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor}\frac{1}{l}\right) \left(\frac{1}{\prod_{i=1}^{\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor} \left(1+\theta(i/N)^{\eta}\right)}\right).\end{eqnarray*}

Straightforward analysis (see Appendix A) yields that, for $\eta, L>0$ ,

(33) \begin{equation}\lim_{N \to \infty} \prod_{i=1}^{\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor} \left(1+\theta(i/N)^{\eta}\right)=\exp\left(\frac{\theta L^{\eta+1}}{\eta+1}\right),\end{equation}

so

(34) \begin{equation}\liminf_{N \to \infty}\frac{\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )}{\log N} \ge \frac{k\eta}{\mu(1+\eta)}\exp\left(-\frac{\theta L^{\eta+1}}{\eta+1}\right).\end{equation}

From (32) and letting $L \downarrow 0$ in (34), it follows that

\begin{equation*}\sum_{i=1}^k \tau_i = \sum_{i=1}^k b_i(0) \tau_i \sim \frac{k \eta \log N}{\mu (1+\eta)} \mbox{ as } N \to \infty .\end{equation*}

Recalling Remark 2, from the symmetry of the process we have $\tau_1 = \tau_2 = \cdots = \tau_k$ , and the result follows.

6. The supercritical case

Our assumptions of Section 2.2 ensure that the system (13) has an equilibrium point at $\boldsymbol{{y}} = {\bf 0}$ . We now consider the supercritical case $R_0 \gt 1$ under the following further assumptions:

  1. 1.

    (35)
  2. 2.

    (36)
  3. 3.

    (37)
  4. 4.

    (38)
  5. 5.

    (39)

We have already seen that when $R_0 \gt 1$ the Jacobian of the system (13) at $\boldsymbol{{y}} = {\bf 0}$ has at least one eigenvalue with strictly positive real part, so that the extinction equilibrium point is locally unstable; the condition (36) ensures that the equilibrium point $\boldsymbol{{y}}^*$ is locally stable.

6.1. Time to extinction starting from a single individual

Corresponding to Theorem 1 for the subcritical case, in the supercritical case we have the following.

Theorem 4. Suppose that $R_0 \gt 1$ , and that the assumptions (4)–(8), (10), (11) together with the assumptions (35)–(39) are all satisfied. With $\delta_{ij}$ denoting the Kronecker delta, denote by H the matrix with elements $h_{ij}$ given by

(40) \begin{eqnarray}h_{ij} \,&=&\,\frac{b_0^{\prime} \left( \sum_l y_l^* \right)}{b_0 \left( \sum_l y_l^* \right)} - \frac{d_0^{\prime} \left( \sum_l y_l^* \right)}{d_0 \left( \sum_l y_l^* \right)}+ \left( \frac{b_i^{\prime} \left( y_i^* \right)}{b_i \left( y_i^* \right)} - \frac{d_i^{\prime} \left( y_i^* \right)}{d_i \left( y_i^* \right)} \right) \delta_{ij} \end{eqnarray}

for $i,j = 1,2,\ldots,k$ . Recalling that $\tau_i$ denotes the mean time to extinction starting from a single individual of type i, as $N \to \infty$ we have

\begin{eqnarray*}\sum_{i=1}^k b_i(0) \tau_i &\sim& \frac{K_0}{\sqrt{N}} \exp ( A N ),\end{eqnarray*}

where

(41) \begin{eqnarray}A \,&=&\, \int_0^{\sum_j y^*_j} \log \left( \frac{b_0(u)}{d_0(u)} \right) du + \sum_{i=1}^k \int_0^{y^*_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du \end{eqnarray}

and

(42) \begin{eqnarray}K_0 \,&=&\, \sqrt{\frac{2\pi d_0 (0)}{b_0^\prime (0) b_0 \left( \sum_j y_j^* \right) d_0 \left( \sum_j y_j^* \right) \mbox{det} ( -H)} \prod_{i=1}^k \frac{b_i(0) d_i^\prime (0)}{b_i \left( y_i^* \right) d_i \left( y_i^* \right)}} , \end{eqnarray}

with

(43) \begin{eqnarray}\mbox{det} ( -H )\,&=&\,\prod_{i=1}^k \left( \frac{d_i^\prime \left( y_i^* \right)}{d_i \left( y_i^* \right)} - \frac{b_i^\prime \left( y_i^* \right)}{b_i \left( y_i^* \right)} \right) \nonumber\\ && {} + \left( \frac{d_0^\prime \left( \sum_l y_l^* \right)}{d_0 \left( \sum_l y_l^* \right)} - \frac{b_0^\prime \left( \sum_l y_l^* \right)}{b_0 \left( \sum_l y_l^* \right)} \right)\sum_{i=1}^k\prod_{j \ne i} \left( \frac{d_j^\prime \left( y_j^* \right)}{d_j \left( y_j^* \right)} - \frac{b_j^\prime \left( y_j^* \right)}{b_j \left( y_j^* \right)} \right) . \end{eqnarray}

Proof. Letting $\boldsymbol{{y}} = \boldsymbol{{x}} / N$ and $\tilde a ( \boldsymbol{{y}} ) = a ( N \boldsymbol{{y}} )$ , the formula (9) may be written as

\begin{eqnarray*}\tilde a ( \boldsymbol{{y}} ) \,&=&\, \frac{1}{N d_0 (1/N)} \prod_{r=1}^{N \sum_j y_j - 1} \left( \frac{b_0(r/N)}{d_0((r+1)/N)} \right) \prod_{i=1}^k \prod_{r=0}^{N y_i-1} \left( \frac{b_i (r/N)}{d_i((r+1)/N)} \right) ,\end{eqnarray*}

so that

(44) \begin{eqnarray}\log \left( \tilde a ( \boldsymbol{{y}} ) \right) \,&=&\,- \log \left( N d_0(1/N) \right)+ \sum_{r=1}^{N \sum_j y_j - 1} \log \left( \frac{b_0(r/N)}{d_0((r+1)/N)} \right) \nonumber\\ && {}+ \sum_{i=1}^k \sum_{r=0}^{N y_i-1} \log \left( \frac{b_i (r/N)}{d_i((r+1)/N)} \right) \nonumber\\\,&=&\, - \log \left( N d_0(1/N) \right) - \log \left( \frac{b_0 \left( \sum_j y_j \right)}{d_0 (1/N)} \right)+ \sum_{r=1}^{N \sum_j y_j} \log \left( \frac{b_0(r/N)}{d_0(r/N)} \right) \nonumber\\ && {}+ \sum_{i=1}^k\left( \log \left( \frac{b_i (0)}{b_i \left( y_i \right)} \right) +\sum_{r=1}^{N y_i} \log \left( \frac{b_i (r/N)}{d_i(r/N)} \right) \right) \nonumber\\\,&=&\, - \log \left( N b_0 \left( \sum_j y_j \right) \right)+ \sum_{i=1}^k\log \left( \frac{b_i (0)}{b_i \left( y_i \right)} \right) \nonumber\\ && {}+ \sum_{r=1}^{N \sum_j y_j} \log \left( \frac{b_0(r/N)}{d_0(r/N)} \right)+ \sum_{i=1}^k\left(\sum_{r=1}^{N y_i} \log \left( \frac{b_i (r/N)}{d_i(r/N)} \right) \right) . \end{eqnarray}

For some $i=1,2,\ldots,k$ , consider the sum $\sum_{r=1}^{N y_i} \log \left( \left. b_i ( r/N ) \right/ d_i (r/N) \right)$ for large N. We can approximate this sum by an integral via the trapezium rule, with the complication that $d_i (0) = 0$ , so that the integrand diverges at the lower end. We deal with this by regularizing the integrand as follows:

\begin{eqnarray*}\sum_{r=1}^{N y_i} \log \left( \frac{b_i ( r/N )}{d_i (r/N)} \right) \,&=&\,\sum_{r=1}^{N y_i} \log \left( \frac{(r/N) b_i (r/N)}{d_i (r/N)} \right) - \sum_{r=1}^{N y_i} \log ( r/N ) \\\,&=&\, \sum_{r=1}^{N y_i} \log \left( \frac{(r/N) b_i(r/N)}{d_i(r/N)} \right) - \log \left( \left( N y_i \right)! \right) + N y_i \log N \\\,&=&\, N \int_0^{y_i} \log \left( \frac{u b_i(u)}{d_i (u)} \right) du + \frac{1}{2} \log \left( \frac{y_i b_i \left( y_i \right) / d_i \left( y_i \right)}{b_i (0) \lim_{u \to 0} ( u / d_i (u) )} \right)\\ && {}- \log ( \left( N y_i \right)! ) + Ny_i \log N + O (1/N) ,\end{eqnarray*}

where we have made use of the assumption (39) to control the order of the approximation error in the trapezium rule. From the assumption (8) we have that $d_i^\prime (0) \gt 0$ , so applying l’Hôpital’s rule to the term $\lim_{u \to 0} ( u / d_i (u) )$ and Stirling’s formula to the factorial term yields

(45) \begin{eqnarray}&&{\sum_{r=1}^{N y_i} \log \left( \frac{b_i ( r/N )}{d_i (r/N)} \right)} \nonumber \\\,&=&\, N \int_0^{y_i} \log \left( \frac{u b_i (u)}{d_i (u)} \right) du + \frac{1}{2} \log \left( \frac{y_i b_i \left( y_i \right) / d_i \left( y_i \right)}{b_i (0) / d_i^\prime (0)} \right)\nonumber \\ && {}- N y_i \log \left( N y_i \right) + N y_i -(1/2) \log \left( 2 \pi N y_i \right) + N y_i \log N + o(1) \nonumber \\\,&=&\, N \left( \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i(u)} \right) du + \int_0^{y_i} \log u \, du \right)+ \frac{1}{2} \log \left( \frac{y_i b_i \left( y_i \right) d_i^\prime (0)}{b_i(0) d_i \left( y_i \right)} \right)\nonumber \\ && {}- N y_i \log y_i + N y_i - (1/2) \log \left( 2 \pi N y_i \right) + o(1) \nonumber \\\,&=&\, N \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du + N \left( y_i \log y_i - y_i \right)+ \frac{1}{2} \log \left( \frac{y_i b_i \left( y_i \right) d_i^\prime (0)}{b_i(0) d_i \left( y_i \right)} \right)\nonumber \\ && {}- N y_i \log y_i + N y_i - (1/2) \log \left( 2 \pi N y_i \right) + o(1) \nonumber \\\,&=&\, N \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du+ \frac{1}{2} \log \left( \frac{y_i b_i \left( y_i \right) d_i^\prime (0)}{b_i(0) d_i \left( y_i \right)} \right)\nonumber \\ && {}- (1/2) \log \left( 2 \pi N y_i \right) + o(1) \nonumber \\\,&=&\, N \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du + \frac{1}{2} \log \left( \frac{b_i \left( y_i \right) d_i^\prime (0)}{2 \pi N b_i (0) d_i \left( y_i \right)} \right) + o(1) . \end{eqnarray}

The term $\sum_{r=1}^{N \sum_j y_j} \log \left( b_0 (r/N) / d_0 (r/N) \right)$ in (44) may be treated similarly, noting that from the assumptions (4), (8) we have $b_0 (0) = 0$ , $d_0 (0) \gt 0$ , and $b_0^\prime (0) \gt 0$ , and making use of the assumption (38) in invoking the trapezium rule. Equation (44) now becomes

\begin{eqnarray*}\log \left( \tilde a ( \boldsymbol{{y}} ) \right)\,&=&\, N \int_0^{\sum_j y_j} \log \left( \frac{b_0(u)}{d_0 (u)} \right) du + \frac{1}{2} \log \left( \frac{2 \pi N b_0 \left( \sum_j y_j \right) d_0 (0)}{b_0^\prime (0) d_0 \left( \sum_j y_j \right)} \right)\\ && {}+ \sum_{i=1}^k \left( N \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du + \frac{1}{2} \log \left( \frac{b_i \left( y_i \right) d_i^\prime (0)}{2 \pi N b_i (0) d_i \left( y_i \right)} \right) \right)\\ && {}- \log \left( N b_0 \left( \sum_j y_j \right) \right) + \sum_{i=1}^k\log \left( \frac{b_i (0)}{b_i \left( y_i \right)} \right) + o(1) \\\,&=&\,N \left( \int_0^{\sum_j y_j} \log \left( \frac{b_0(u)}{d_0 (u)} \right) du+ \sum_{i=1}^k \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du \right)\\ && {}- \left( \frac{k+1}{2} \right) \log N - \left( \frac{k-1}{2} \right) \log ( 2 \pi )\\ && {}+ \frac{1}{2} \log \left( \frac{d_0 (0)}{b_0^\prime (0) b_0 \left( \sum_j y_j \right) d_0 \left( \sum_j y_j \right)} \right)\\ && {}+ \frac{1}{2} \sum_{i=1}^k \log \left( \frac{b_i(0) d_i^\prime (0)}{b_i \left( y_i \right) d_i \left( y_i \right)} \right)+ o(1) .\end{eqnarray*}

That is, as $N \to \infty$ ,

(46) \begin{eqnarray}&&{\log \left( \tilde a ( \boldsymbol{{y}} ) \right)} \nonumber \\\,&=&\, N f ( \boldsymbol{{y}} ) - \left( \frac{k+1}{2} \right) \log N - \left( \frac{k-1}{2} \right) \log ( 2 \pi ) + g ( \boldsymbol{{y}} ) + o(1) , \end{eqnarray}

where

\begin{eqnarray*}f ( \boldsymbol{{y}} ) \,&=&\, \int_0^{\sum_j y_j} \log \left( \frac{b_0(u)}{d_0(u)} \right) du + \sum_{i=1}^k \int_0^{y_i} \log \left( \frac{b_i (u)}{d_i (u)} \right) du\end{eqnarray*}

and

\begin{eqnarray*}g ( \boldsymbol{{y}} ) \,&=&\, \frac{1}{2} \log \left( \frac{d_0 (0)}{b_0^\prime (0) b_0 \left( \sum_j y_j \right) d_0 \left( \sum_j y_j \right)} \right)+ \frac{1}{2} \sum_{i=1}^k \log \left( \frac{b_i (0) d_i^\prime (0)}{b_i \left( y_i \right) d_i \left( y_i \right)} \right) .\end{eqnarray*}

It is clear from the form of (46) that for large N, contributions to the sum (23) will be dominated by terms $a ( N \boldsymbol{{y}} ) = \tilde a (\boldsymbol{{y}})$ where $f ( \boldsymbol{{y}} )$ achieves its maximal value. Recalling from (21) that the stationary distribution of the scaled restarted process is proportional to $\{ \tilde a(\boldsymbol{{y}}) :\ N \boldsymbol{{y}} \in S \}$ , one would intuitively expect this stationary distribution to be peaked around a unique mode that converges to the deterministic stable equilibrium point $\boldsymbol{{y}}^*$ as $N \to \infty$ , and to become increasingly strongly peaked as N increases. We will now show that $f ( \boldsymbol{{y}} )$ does indeed have a unique maximum point at $\boldsymbol{{y}}^*$ .

For $i=1,2,\ldots,k$ ,

\begin{eqnarray*}\frac{\partial f}{\partial y_i} \,&=&\, \log \left( \frac{b_0 \left( \sum_j y_j \right) b_i \left( y_i \right)}{d_0 \left( \sum_j y_j \right) d_i \left( y_i \right)} \right) ,\end{eqnarray*}

so the condition for a stationary point of $f ( \boldsymbol{{y}} )$ is precisely the condition for an equilibrium point of the deterministic system (13). We have assumed that the system (13) has exactly two equilibrium points, at $\bf 0$ and $\boldsymbol{{y}}^*$ . We now show that $\boldsymbol{{y}}^*$ is a local maximum of $f ( \boldsymbol{{y}} )$ , and hence the unique global maximum.

The elements $h_{ij} = \frac{\partial^2 f}{\partial y_i \partial y_j}$ of the Hessian matrix H of $f ( \boldsymbol{{y}} )$ at $\boldsymbol{{y}}^*$ are given by (40). Since all off-diagonal elements of H are equal, application of the matrix determinant lemma shows that the determinant $\mbox{det} (-H)$ is given by the formula (43). Noting that $\boldsymbol{{y}}^*$ is an equilibrium point of (13), for $i=1,2,\ldots,k$ we have $b_0 \left( \sum_l y_l^* \right) b_i ( y_i^* ) = d_0 \left( \sum_l y_l^* \right) d_i ( y_i^* )$ , and so

\begin{eqnarray}h_{ij}\,&=&\, \frac{1}{b_0 \left( \sum_l y_l^* \right) b_i \left( y_i^* \right)} \left(b_0^{\prime} \left( \sum_l y_l^* \right) b_i \left( y_i^* \right) - d_0^{\prime} \left( \sum_l y_l^* \right) d_i \left( y_i^* \right) \nonumber\right. \\ && {} \left.+ \left( b_0 \left( \sum_l y_l^* \right) b_i^{\prime} \left( y_i^* \right) - d_0 \left( \sum_l y_l^* \right) d_i^{\prime} \left( y_i^* \right) \right) \delta_{ij} \right) \nonumber\end{eqnarray}

for $i,j = 1,2,\ldots,k$ . That is, denoting by B the diagonal matrix with entries $b_{ii} = b_0 \left( \sum_l y_l^* \right) b_i \left( y_i^* \right)$ for $i=1,2,\ldots,k$ , we have $H = B^{-1} J ( \boldsymbol{{y}}^* )$ , where $J ( \boldsymbol{{y}}^* )$ is the Jacobian of the system (13) evaluated at $\boldsymbol{{y}}^*$ . Setting $\Sigma = - H^{-1}$ and noting that $\Sigma = \Sigma^T$ , it follows that $\Sigma$ satisfies the Lyapunov equation $J ( \boldsymbol{{y}}^* ) \Sigma + \Sigma J ( \boldsymbol{{y}}^* )^T = - 2 B$ . From [Reference Laub18, Theorem 13.21], this Lyapunov equation has a unique solution $\Sigma$ . Since all diagonal elements of B are strictly positive and, by the assumption (36), the eigenvalues of $J ( \boldsymbol{{y}}^* )$ all have negative real part, it follows from [Reference Laub18, Theorem 13.24] that $\Sigma$ is positive definite. Hence H is negative definite, and $\boldsymbol{{y}}^*$ is a local maximum point of $f ( \boldsymbol{{y}} )$ .

Writing $\tilde C = \tilde C_1 \times \tilde C_2 \times \cdots \times \tilde C_k$ , then as $N \to \infty$ , from (46) we obtain

(47) \begin{eqnarray}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} ) &\sim& N^k \int_{\tilde C} \tilde a ( \boldsymbol{{y}} ) \, d \boldsymbol{{y}} \nonumber \\\,&=&\, N^k \int_{\tilde C} \exp \left( N f ( \boldsymbol{{y}} ) - \left( \frac{k+1}{2} \right) \log N - \left( \frac{k-1}{2} \right) \log ( 2 \pi ) + g ( \boldsymbol{{y}} ) + o(1) \right) d \boldsymbol{{y}} \nonumber \\\,&=&\, \left( \frac{N}{2\pi} \right)^{(k-1)/2} \int_{\tilde C} \exp \left( N f ( \boldsymbol{{y}} ) + g ( \boldsymbol{{y}} ) + o(1) \right) d\boldsymbol{{y}} . \end{eqnarray}

Now the condition (10) implies that for sufficiently large N the integral on the right-hand side of (47) is convergent, and the conditions (38), (39) ensure that the o(1) term converges uniformly to zero within a neighbourhood of $\boldsymbol{{y}}^*$ . We can therefore apply the multivariate Laplace approximation to this integral, yielding

\begin{eqnarray*}\sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )&\sim& \sqrt{\frac{2 \pi}{N \mbox{det} ( -H )}} \, \exp \left( N f ( \boldsymbol{{y}}^* ) + g ( \boldsymbol{{y}}^* ) \right) .\end{eqnarray*}

So finally,

\begin{eqnarray*}\sum_{i=1}^k b_i(0) \tau_i \,&=&\, \sum_{\boldsymbol{{x}} \in C} a ( \boldsymbol{{x}} )\;\;\sim\;\; \frac{K_0}{\sqrt{N}} \exp (AN)\end{eqnarray*}

where

\begin{eqnarray*}A \,&=&\, f ( \boldsymbol{{y}}^* ) , \\K_0 \,&=&\, \sqrt{\frac{2 \pi}{\mbox{det}(-H)}} \, \exp (g ( \boldsymbol{{y}}^* )) ,\end{eqnarray*}

and the result follows.

6.2. Time to extinction starting from quasi-equilibrium

In the supercritical case, under the assumptions (35)–(36), as well as the mean time to extinction starting from a single individual, another quantity of interest is the mean time to extinction starting from close to the quasi-equilibrium state $N \boldsymbol{{y}}^*$ . Denoting by $\tau$ this mean extinction time starting close to quasi-equilibrium, we follow [Reference Ball, Britton and Neal5] in outlining a heuristic derivation of the asymptotic form of $\tau$ .

Recall from Section 2.3 that in its initial phase, the process initiated from a single individual may be approximated by a multitype branching process. In the supercritical case $R_0 \gt 1$ , this branching process either goes extinct quickly, or takes off to produce an infinite number of progeny. When the approximating branching process takes off, the original multitype birth–death process will move quickly towards the quasi-equilibrium state $N \boldsymbol{{y}}^*$ , and spend a long time fluctuating close to this state before eventually going extinct. This can be seen from the form of the equilibrium distribution $\pi ( \boldsymbol{{x}} )$ , since we have seen in Section 6.1 that for large N, the probability mass is concentrated close to $N \boldsymbol{{y}}^*$ .

For a process initiated by a single individual of type i ( $i=1,2,\ldots,k$ ), denote by $\zeta_i$ the expected time to extinction of the process conditional upon rapid extinction, and by $\sigma_i$ the expected time taken to first attain a small neighbourhood of $N \boldsymbol{{y}}^*$ conditional upon the process taking off, and recall that $\omega_i$ denotes the extinction probability of the approximating branching process. Then in the limit as $N \to \infty$ , we have

\begin{eqnarray*}\tau_i &\sim& \omega_i \zeta_i + \left( 1 - \omega_i \right) \left( \sigma_i + \tau \right) ,\end{eqnarray*}

where both $\zeta_i$ and $\sigma_i$ are negligible in comparison to $\tau$ , so

(48) \begin{eqnarray}\tau_i &\sim& \left( 1 - \omega_i \right) \tau. \end{eqnarray}

Recalling the relationship (18), from Theorem 4 we now obtain

(49) \begin{eqnarray}\tau &\sim& \frac{K}{\sqrt{N}} \, \exp (AN), \end{eqnarray}

where A is given by (41), and

(50) \begin{eqnarray}K \,&=&\, \frac{K_0}{(1-\omega) \sum_i b_i(0)} , \end{eqnarray}

with $K_0$ given by Equation (42) and $\omega$ being the solution to Equation (17).

The above heuristic argument is made rigorous for the case of the SIS infection model in $k=1$ dimension, with the additional condition that lifeforce random variables are of finite variance, in [Reference Ball, Britton and Neal4, Appendix B]. The proof is rather lengthy, even in this particular case. It is conjectured in [Reference Ball, Britton and Neal4] that the finite variance condition may not be necessary.

Example 1. For the heterogeneous-population SIS infection model (Example 1 of Section 2.1), (49) reduces, after some algebraic simplification, to the formula (6) of [Reference Clancy9].

Example 2. For Example 2 of Section 2.1, the multitype birth–death process with linear birth rates and quadratic death rates, we obtain

(51) \begin{equation}A = \frac{k \lambda - \mu}{\kappa} + \frac{\mu}{\kappa} \log \left( \frac{\mu}{k \lambda} \right) ,\end{equation}
(52) \begin{equation}K = \frac{\sqrt{2 \pi \mu \kappa}}{\left( 1 - \omega \right) k \lambda \left( k \lambda - \mu \right)} . \end{equation}

Note that for this process, if lifeforce variables are exponentially distributed, then the total number of individuals $\sum_{i=1}^k X_i (t)$ is a one-dimensional Markov process, and the formulae (51), (52) can alternatively be obtained from Equation (57) of [Reference Assaf and Meerson1].

6.3. Exponentially distributed lifeforces

In the case that the lifeforce random variables $Q_i$ are exponentially distributed, so that $\boldsymbol{{X}}(t)$ is a Markov process, the equations (16) become

(53) \begin{eqnarray}\omega_i \,&=&\, \frac{d_0 (0) d_i^\prime (0)}{d_0(0) d_i^\prime (0) + b_0^\prime (0) \sum_{j=1}^k ( 1 - \omega_j ) b_j (0)} \mbox{ for } i=1,2,\ldots,k . \end{eqnarray}

Setting $D = \left( \left. b_0^\prime (0) \right/ d_0 (0) \right) \sum_{j=1}^k ( 1 - \omega_j ) b_j (0)$ , we may write the equations (53) as

(54) \begin{eqnarray}\omega_i \,&=&\, \frac{d_i^\prime (0)}{d_i^\prime (0) + D} \mbox{ for } i=1,2,\ldots,k . \end{eqnarray}

Substituting from (54) back into both sides of (53) and rearranging, we find that for $R_0 \gt 1$ , D is the unique positive solution of

\begin{eqnarray*} \frac{b_0^\prime (0)}{d_0 (0)} \sum_{j=1}^k \left( \frac{b_j (0)}{d_j^\prime (0) + D} \right) \,&=&\, 1.\end{eqnarray*}

Recalling the relationship (18), it is then straightforward to show that

\begin{eqnarray*}\omega\,&=&\, 1 - \frac{D d_0(0)}{b_0^\prime (0) \sum_j b_j (0)} ,\end{eqnarray*}

so that (50) becomes

(55) \begin{eqnarray}K \,&=&\, \frac{K_0 b_0^\prime(0)}{D d_0(0)} . \end{eqnarray}

The formula (49) with K given by (55) can alternatively be obtained via the WKB approximation method, following the same steps as in [Reference Clancy9]. Although neither the argument of Section 6.2 nor the WKB method is entirely rigorous, the fact that the two methods are in agreement provides some reassurance.

7. Discussion

In this paper, we have greatly extended the class of multidimensional processes for which precise asymptotic formulae such as (1) are available. We have found that in the subcritical case, expected persistence time remains bounded as the system size parameter N grows to infinity; in the examples considered in the critical case, expected persistence time grows logarithmically with N; and in the supercritical case, expected persistence time grows exponentially with N. This is as one might expect. What is more surprising is that we were able to obtain such precise and explicit formulae as those provided by Theorem 4 and (49). The key to our approach is that the restarted process of Section 3 is time-reversible, although the details of our proofs are also dependent upon the particular form of the transition rate functions. The extension of our results to more general restart-reversible processes is the subject of ongoing work.

It is worth noting that while formulae such as (41) and (42) may appear formidable, their evaluation is essentially trivial. This contrasts with the standard characterization of mean persistence time from quasi-stationarity in terms of an eigenvalue of the transition rate matrix, in which the eigenvalue equation appears simple, but evaluation of the relevant eigenvalue can present a formidable computational challenge.

For supercritical processes, arguably of greatest interest is the mean extinction time from quasi-equilibrium, $\tau$ . For the mean extinction time $\tau_i$ starting from a single individual of type i, Theorems 1, 2, and 4 provide limiting and asymptotic formulae only for the linear combination $\sum_{i=1}^k b_i (0) \tau_i$ . In the subcritical case, we conjectured in Section 4 that $\tau_i \to - \left( \log \left( 1 - R_0 \right) \right) / \left( d_0 (0) d_i^\prime (0) R_0 \right)$ as $N \to \infty$ for $i=1,2,\ldots,k$ . In the supercritical case, the heuristic arguments of Section 6.2 imply that $\tau_i \sim ( 1 - \omega_i ) \tau$ as $N \to \infty$ (the relationship (48)), which together with (49) gives the asymptotic form of $\tau_i$ for $i=1,2,\ldots,k$ .

We have focused here upon processes with a single absorbing state at the origin. Another interesting class of population processes is those for which absorption occurs at the boundary $A_0 = \{ \boldsymbol{{x}} \in {\mathbb Z}_+^k :\ x_i = 0 \mbox{ for some } i \}$ . The study of such processes is a topic for future work.

Appendix A. Proofs of Equations (29) and (33)

Consider first Equation (29). Fix $V, \gamma>0$ and for $N \gt (V\gamma)^2$ let

\begin{equation*}S_N= \prod_{r=0}^{\lfloor V\sqrt{N} \rfloor}\left(1-\frac{r\gamma}{N}\right),\end{equation*}

so

\begin{equation*}\log S_N=\sum_{r=0}^{\lfloor V\sqrt{N} \rfloor} \log \left(1-\frac{r\gamma}{N}\right).\end{equation*}

By Taylor’s theorem with the Lagrange form of the remainder, for $x \in (0,1)$ ,

\begin{equation*}-x-\frac{x^2}{2(1-x)^2} \lt \log(1-x) <-x.\end{equation*}

Hence,

(56) \begin{equation}\limsup_{N \to \infty} \log S_N \le \limsup_{N \to \infty} \left(-\frac{\gamma}{N} \sum_{r=0}^{\lfloor V\sqrt{N} \rfloor}r\right)=-\frac{\gamma V^2}{2}.\end{equation}

Also, $1-\frac{r\gamma}{N} \ge 1-\frac{V \gamma}{N^{\frac{1}{2}}}$ for $r=0,1,\dots, \lfloor V\sqrt{N} \rfloor$ , so

\begin{equation*}\lim_{N \to \infty} \sum_{r=0}^{\lfloor V\sqrt{N} \rfloor} \left(\frac{\frac{r\gamma}{N}}{1-\frac{r\gamma}{N}}\right)^2\le\lim_{N \to \infty} \left( \frac{1}{1-\frac{V \gamma}{N^{\frac{1}{2}}}} \right)^2 \frac{\gamma^2}{N^2} \frac{\lfloor V\sqrt{N} \rfloor(\lfloor V\sqrt{N} \rfloor+1)(2\lfloor V\sqrt{N} \rfloor+1)}{6}=0.\end{equation*}

Thus,

\begin{equation*}\liminf_{N \to \infty} \log S_N \ge -\frac{\gamma V^2}{2},\end{equation*}

which together with (56) implies

\begin{equation*}\lim_{N \to \infty} S_N=\exp\left(-\frac{\gamma V^2}{2}\right),\end{equation*}

and (29) follows.

Turning to Equation (33), recall that $\eta, L>0$ , and let $c_N=\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor$ and

\begin{equation*}R_N=\prod_{i=1}^{\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor} \left(1+\theta(i/N)^{\eta}\right).\end{equation*}

For $x>0$ ,

\begin{equation*}x-\frac{x^2}{2} \lt \log (1+x) <x,\end{equation*}

so

\begin{equation*}d_N-\frac{e_N}{2} \lt \log R_N \lt d_N,\end{equation*}

where

\begin{equation*}d_N=\frac{\theta}{N^{\eta}}\sum_{i=1}^{c_N} i^{\eta}\qquad \text{and}\qquad e_N=\frac{\theta^2}{N^{2\eta}}\sum_{i=1}^{c_N} i^{2\eta}.\end{equation*}

Now,

\begin{equation*}\frac{c_N^{\eta+1}}{\eta+1}=\int_0^{c_N} x^{\eta} dx \lt \sum_{i=1}^{c_N} i^{\eta} \lt \int_0^{c_N+1} x^{\eta} dx =\frac{(c_N+1)^{\eta+1}}{\eta+1}.\end{equation*}

Also, $\lim_{N \to \infty}N^{-\eta}c_N^{\eta+1} = L^{\eta+1}$ , so $\lim_{N \to \infty} d_N=\frac{\theta L^{\eta+1}}{\eta+1}$ . Further,

\begin{equation*}0 \lt N^{-2\eta} c_N^{2\eta+1} \le L^{2\eta+1} N^{\frac{\eta(2\eta+1)}{\eta+1}-2 \eta}=L^{2\eta+1}N^{-\frac{\eta}{\eta+1}},\end{equation*}

so $\lim_{N \to \infty} e_N=0$ . Thus,

\begin{equation*}\lim_{N \to \infty} \log R_N=\frac{\theta L^{\eta+1}}{\eta+1}\end{equation*}

and (33) follows.

Acknowledgement

It is a pleasure to thank two anonymous reviewers for detailed remarks that improved the presentation of this paper.

Funding information

There are no funding bodies to thank in relation to the creation of this article.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Assaf, M. and Meerson, B. (2010). Extinction of metastable stochastic populations. Phys. Rev. E 81, article no. 021116.CrossRefGoogle ScholarPubMed
Assaf, M. and Meerson, B. (2017). WKB theory of large deviations in stochastic populations. J. Phys. A 50, article no. 263001.10.1088/1751-8121/aa669aCrossRefGoogle Scholar
Ball, F. G. (1983). The threshold behaviour of epidemic models. J. Appl. Prob. 20, 227241.10.2307/3213797CrossRefGoogle Scholar
Ball, F. G., Britton, T. and Neal, P. (2014). On expected durations of birth–death processes, with applications to branching processes and SIS epidemics. Preprint. Available at https://arxiv.org/abs/1408.0641.Google Scholar
Ball, F. G., Britton, T. and Neal, P. (2016). On expected durations of birth–death processes, with applications to branching processes and SIS epidemics. J. Appl. Prob. 53, 203215.10.1017/jpr.2015.19CrossRefGoogle Scholar
Ball, F. G. and O’Neill, P. D. (1994). Strong convergence of stochastic epidemics. Adv. Appl. Prob. 26, 629655.10.2307/1427812CrossRefGoogle Scholar
Chazottes, J. B., Collet, P. and Méléard, S. (2019). On time scales and quasi-stationary distributions for multitype birth-and-death processes. Ann. Inst. H. Poincaré Prob. Statist. 55, 22492294.10.1214/18-AIHP948CrossRefGoogle Scholar
Clancy, D. (2018). Persistence time of SIS infections in heterogeneous populations and networks. J. Math. Biol. 77, 545570.CrossRefGoogle ScholarPubMed
Clancy, D. (2018). Precise estimates of persistence time for SIS infections in heterogeneous populations. Bull. Math. Biol. 80, 28712896.10.1007/s11538-018-0491-6CrossRefGoogle ScholarPubMed
Clancy, D. and Pollett, P. K. (2003). A note on quasi-stationary distributions of birth–death processes and the SIS logistic epidemic. J. Appl. Prob. 40, 821825.10.1239/jap/1059060909CrossRefGoogle Scholar
Davis, M. H. A. (1993). Markov Models and Optimization. Chapman and Hall, London.10.1007/978-1-4899-4483-2CrossRefGoogle Scholar
Diekmann, O., Heesterbeek, J. A. P. and Roberts, M. G. (2010). The construction of next-generation matrices for compartmental epidemic models. J. R. Soc. Interface 7, 873885.10.1098/rsif.2009.0386CrossRefGoogle ScholarPubMed
Ethier, S. N. and Kurtz, T. G. (2005). Markov Processes: Characterization and Convergence. John Wiley, New York.Google Scholar
Hernández-Suárez, C. M. and Castillo-Chavez, C. (1999). A basic result on the integral for birth–death Markov processes. Math. Biosci. 161, 95104.10.1016/S0025-5564(99)00034-6CrossRefGoogle ScholarPubMed
Keilson, J. and Ramaswamy, R. (1984). Convergence of quasistationary distributions in birth–death processes. Stoch. Process. Appl. 18, 301312.10.1016/0304-4149(84)90302-8CrossRefGoogle Scholar
Kelly, F. P. (2011). Reversibility and Stochastic Networks. Cambridge University Press.Google Scholar
Kryscio, R. J. and Lefèvre, C. (1989). On the extinction of the S-I-S stochastic logistic epidemic. J. Appl. Prob. 26, 685694.10.2307/3214374CrossRefGoogle Scholar
Laub, A. J. (2005). Matrix Analysis for Scientists and Engineers. Society for Industrial and Applied Mathematics, Philadelphia.Google Scholar
Mode, C. J. (1971). Multitype Branching Processes: Theory and Applications. Elsevier, New York.Google Scholar
Ovaskainen, O. and Meerson, B. (2010). Stochastic models of population extinction. Trends Ecol. Evolution 25, 643652.10.1016/j.tree.2010.07.009CrossRefGoogle ScholarPubMed
Sellke, T. (1983). On the asymptotic distribution of the size of a stochastic epidemic. J. Appl. Prob. 20, 390394.CrossRefGoogle Scholar
Van den Driessche, P. and Watmough, J. (2002). Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math. Biosci. 180, 2948.10.1016/S0025-5564(02)00108-6CrossRefGoogle ScholarPubMed
Zachary, S. (2007). A note on insensitivity in stochastic networks. J. Appl. Prob. 44, 238248.CrossRefGoogle Scholar