Optimal management and valuation of a natural resource: the case of optimal harvesting

M'hamed Gaïgi; Idris Kharroubi; Thomas Lim

doi:10.1017/S0269964822000043

Published online by Cambridge University Press: 11 March 2022

M'hamed Gaïgi: Affiliation:
ENIT-LAMSIN, Université de Tunis El Manar, Tunis, Tunisia. E-mail: mhamed.gaigi@enit.utm.tn
Idris Kharroubi: Affiliation:
LPSM, CNRS, UMR 8001, Sorbonne Université, Paris, France. E-mail: idris.kharroubi@upmc.fr
Thomas Lim: Affiliation:
LaMME, CNRS UMR 8071, ENSIIE, Évry, France. E-mail: lim@ensiie.fr

Abstract

In this paper, we consider the problem of sustainable harvesting. We explain how the manager maximizes his/her profit according to the quantity of natural resource available in a harvesting area and under the constraint of penalties and fines when the quota is exceeded. We characterize the optimal values and some optimal strategies using a verification result. We then show by numerical examples that this optimal strategy is better than naive ones. Moreover, we define a level of fines which ensures the double objective of sustainable harvesting: a remaining quantity of available natural resource that ensures its sustainability and an acceptable income for the manager.

Keywords

Optimal control; Optimal harvesting; Renewable resource; States constraints; Verification theorem

Type: Research Article
Information: Probability in the Engineering and Informational Sciences, Volume 37, Issue 3, July 2023, pp. 674-694

DOI: https://doi.org/10.1017/S0269964822000043
Copyright: © The Author(s), 2022. Published by Cambridge University Press

1. Introduction

Natural resource management is a balance between harvesting and its ecological implications. It is important to harvest in such a way that a species remains sustainable and does not become endangered or extinct. For instance, according to the Food and Agriculture Organization of the United Nations, three quarters of the world's fish stocks are fully exploited or over-exploited, and the proportion of stocks that are too intensively exploited is growing. These statistics show that natural resources need to be managed with an effective and carefully defined objective in order to prevent over-harvesting and to allow depleted stocks to replenish. As a consequence, scarce resource management increasingly involves restoration and conservation objectives, along with the more conventional ecological and economic objectives, namely the identification of desirable levels of the natural resource and the profitability of harvesting. Recent examples are the restoration plans discussed and/or adopted by the European Commission for several collapsed stocks in E.U. waters, and the international commitment made by the countries present at the 2002 Johannesburg Summit on Sustainable Development to return fisheries to levels allowing their maximum sustainable yield by 2015. Likewise, the objective of the precautionary approach promoted by the International Council for the Exploration of the Sea is to maintain the spawning stock above a limit reference point $B_{\mathrm{lim}}$, while keeping fishing mortality below a limit $F_{\mathrm{lim}}$. A criticism of this approach is that it adopts a viewpoint which is too ichthyocentric, as it focuses only on the conservation of fish populations and stocks. Social and economic considerations are not included, and the question of an acceptable income for the manager is not considered.

The key idea of this paper is to maximize the harvest in a sustainable manner: we want the greatest catch in order to supply the demand for the natural resource, but we do not want to deplete the population, so as to preserve the diversity of resources and allow the population to be harvested in the future. The concept of sustainable harvesting refers to methods designed not to over-exploit resources, and leads to the definition of measures and rules, including fines imposed by authorities, to avoid over-harvesting. The theoretical problem is, therefore, to determine a cost rule for fines that allows the conservation of the natural resources exploited by humans from a sustainable perspective. In this respect, the amount of the fines and/or the prohibition on harvesting must ensure that the natural resource population does not fall below a certain threshold that guarantees its natural renewal. But it must also allow the manager to make profits, to prevent him/her from going bankrupt.

Efficiency in managing the exploitation of natural resources has been widely analyzed in the resource literature. In general terms, these studies use deterministic models in which an efficient policy consists of maintaining the exploitation levels of the harvesting ground at steady-state values. Clark and Reed [Reference Clark and Reed3,Reference Reed and Clark15] introduced price and growth uncertainty in a forest harvest model, modeling the price process as a geometric Brownian motion and assuming stock growth to be age or size dependent. More recently, some papers consider the case where the growth of a fish stock is stochastic, for example Danielsson [Reference Danielsson5], Weitzman [Reference Weitzman20] and Nostbakken [Reference Nostbakken12], and others consider a stochastic price, for example Murillas and Chamorro [Reference Murillas and Chamorro11] and Nostbakken [Reference Nostbakken12]. In these cases, optimal control theory has proven to be a suitable technique to design optimal harvesting strategies (see, e.g., [Reference Conrad and Clark4,Reference Kharroubi, Lim and Ly Vath8,Reference Reed and Heras16]).

The purpose of this study is to analyze how three features influence the optimal harvest of the natural resource: uncertainty in stock growth and price, a prohibition on harvesting when the quantity of natural resource available falls below a given level, and a tax. In our modeling, managers are controlled at fixed dates, while harvesting is continuous and depends on the quantity of natural resource available in the harvesting region (the natural resource population evolves according to a logistic stochastic differential equation). Managers are prohibited from harvesting if the quantity of natural resource is below a given level at a control date, and they must pay a fine if they exceed their harvesting quota at the maturity of the problem. They therefore seek to maximize their profit, that is, to choose the quantity of natural resource to harvest given the prohibition rule and the fine to be paid if their quota is exceeded.

Unlike the other articles dealing with this issue (see, e.g., [Reference Nostbakken, Thébaud and Sorensen13]), the selling price of the natural resource is not constant; it seems reasonable to assume that the price depends on the quantity of natural resource remaining in the harvesting region. Consequently, it is endogenous to the problem of sustainable harvesting. This is justified by the fact that the evolution of the price according to scarcity is a basic rule in economics. Given that, we will show how the resolution of this problem allows us, on the one hand, to explain the behavior of the manager according to the amount of the fines and, on the other hand, to set a pricing rule for the fines that guarantees sustainable harvesting.

The remainder of this article is structured as follows. Section 2 presents the problem formulation: the expected profit of the manager and its two value functions. Section 3 characterizes the value functions by a verification result involving Hamilton-Jacobi-Bellman (HJB in short) equations. Section 4 provides numerical results and interpretations which allow us to understand how the manager adapts his/her strategy w.r.t. the fines. This allows us to fix a level of fines that ensures the sustainability of the resource. Concluding remarks are offered in Section 5.

2. Problem formulation

2.1. The model

Let $(\Omega, {\mathcal {F}}, \mathbb {P})$ be a complete probability space. We assume that this space is equipped with two one-dimensional standard Brownian motions $B$ and $W$. We denote by $\mathbb {F} := ({\mathcal F} _t)_{0\leq t \leq T}$ the right continuous complete filtration generated by these two Brownian motions, where $T$ is a positive constant which corresponds to the maturity of the problem. We assume that the correlation between the two Brownian motions is given by $\langle B, W \rangle _t = \rho \; t$.

In the sequel, we consider a manager who can harvest in a harvesting area, and we denote by $X_t$ the quantity of natural resource available in this area at time $t$. Several articles, for example Schaefer [Reference Schaefer18] or Pella and Tomlinson [Reference Pella and Tomlinson14], proposed to use a logistic model to represent the growth of the natural resource in the absence of harvesting. This model is given by

$$dX_t = \eta X_t(\lambda - X_t)dt ,$$

where $\eta$ and $\lambda$ are positive constants, $\eta \lambda$ corresponds to the intrinsic rate of population growth and $\lambda$ is the carrying capacity of the environment, that is, the positive equilibrium of the equation. The model is interesting since it is well known that, for natural resource stocks, the growth rate is inversely related to the stock level because of natural constraints, and this is well represented by the logistic growth model (see, e.g., [Reference Conrad and Clark4]). However, since the evolution of the natural resource depends on perturbations due to environmental and other factors, we add a term driven by a Brownian motion to model these perturbations. The resulting equation is called the classical logistic stochastic differential equation (see, e.g., [Reference Sarkar17]) and is given by

$$dX_t=\eta X_t(\lambda - X_t)dt + \gamma X_t dB_t ,$$

where $\eta$, $\lambda$, and $\gamma$ are three positive constants. It is well known that the previous SDE admits a unique strong solution that reaches neither zero nor infinity in finite time. Furthermore, it has a closed-form formula (see, e.g., [Reference Skiadas19]). The product $\eta \lambda$ corresponds to the intrinsic rate of population growth and $\lambda$ is the carrying capacity of the environment. This model is, for example, used in Nostbakken [Reference Nostbakken12] or Kvamsdal et al. [Reference Kvamsdal, Poudel and Sandal10]. We assume that the manager can harvest this resource and we denote by $\alpha _t$ the harvest rate at time $t$. For a given strategy $\alpha =(\alpha _t)_{0 \leq t \leq T}$, $X^{\alpha }_t$ denotes the associated quantity of natural resource available at time $t$, which follows the stochastic differential equation

$$dX^{\alpha}_t = \eta X^{\alpha}_t(\lambda - X^{\alpha}_t)dt + \gamma X^{\alpha}_t dB_t - \alpha_t dt.$$
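To make the controlled dynamics concrete, the following is a minimal simulation sketch based on an Euler-Maruyama discretization. The function name `simulate_stock`, the feedback rule `harvest_rate`, the parameter values and the floor at zero are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_stock(x0, eta, lam, gamma, harvest_rate, horizon, n_steps, rng):
    """Euler-Maruyama path of dX = eta*X*(lam - X) dt + gamma*X dB - alpha dt.

    harvest_rate(t, x) is a hypothetical feedback rule returning alpha_t.
    """
    dt = horizon / n_steps
    x = x0
    path = [x]
    for k in range(n_steps):
        alpha = harvest_rate(k * dt, x)
        dB = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x += (eta * x * (lam - x) - alpha) * dt + gamma * x * dB
        x = max(x, 0.0)  # numerical floor only; not part of the model
        path.append(x)
    return path


# Illustrative parameters (assumed): constant harvest rate 0.05 on [0, 1].
rng = random.Random(0)
path = simulate_stock(0.5, 1.0, 1.0, 0.1, lambda t, x: 0.05, 1.0, 1000, rng)
```

With $\gamma = 0$ and no harvest, the discretized path should approach the equilibrium level $\lambda$, which gives a quick sanity check of the scheme.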

The manager sells the harvest on the market at time $t$ for the price $P_t$ per unit, where the price $P$ evolves according to the following stochastic differential equation

$$dP_t = P_t( \mu(X^{\alpha}_t) dt + \sigma dW_t) ,$$

where $\sigma$ is a positive constant and $\mu$ is a map from $\mathbb{R} _+$ to $\mathbb{R} _+$ which corresponds to the drift of the price. Some authors in the literature model the price by a geometric Brownian motion (see, e.g., [Reference Murillas and Chamorro11] or [Reference Nostbakken12]), which corresponds to a constant map $\mu$ in our setting. We propose to let $\mu$ depend on the quantity of the resource since we observe, in fish or wood markets for instance, that the price is high when the quantity of the resource is low. In Kvamsdal et al. [Reference Kvamsdal, Poudel and Sandal10], the authors assume that the price is mean-reverting and depends on the harvest rate. More precisely, a higher harvest makes the price lower. Here, we model a scarce resource management where the price depends more on the available quantity of resource than on the harvest.

$(\mathbf {H}\mu )$ $\mu : \mathbb{R} _+\rightarrow \mathbb{R} _+$ is nonincreasing and Lipschitz continuous: there exists a positive constant $L$ such that

$$| \mu(x) - \mu(x')| \leq L |x - x'| ,$$

for all $x, x' \in \mathbb{R} _+$.

In Assumption $(\mathbf {H}\mu )$, the monotonicity condition means that the greater the quantity of natural resource available, the lower the drift of the price of the natural resource.
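As an illustration of Assumption $(\mathbf {H}\mu )$ and of the correlation structure between $B$ and $W$, here is a small sketch. The specific drift $\mu(x) = \mu_0/(1+\kappa x)$, the helper names and the log-Euler price step are assumptions chosen for this example only; the paper does not prescribe them.

```python
import math
import random


def mu(x, mu0=0.2, kappa=1.0):
    """Illustrative drift: nonnegative, nonincreasing, Lipschitz with constant mu0*kappa."""
    return mu0 / (1.0 + kappa * x)


def correlated_increments(rho, dt, rng):
    """Return (dB, dW), each with variance dt, and with Cov(dB, dW) = rho*dt."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    dB = math.sqrt(dt) * z1
    dW = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return dB, dW


def price_step(p, x, sigma, dt, dW):
    """Log-Euler step for dP = P (mu(X) dt + sigma dW); keeps P positive."""
    return p * math.exp((mu(x) - 0.5 * sigma ** 2) * dt + sigma * dW)
```

The log-Euler step is used here because it preserves the positivity of the price for any step size, which a plain Euler step does not guarantee.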

We consider a positive increasing sequence $(T_i)_{1 \leq i \leq N}$, with $T_N=T$, where each $T_i$ represents a time at which the regulatory body checks the quantity of natural resource available $X^{\alpha }$. We assume that $T_i$ is a constant for any $i \in \{1, \ldots, N\}$. If $X^{\alpha }_{T_i} \gt \Gamma$, then the manager can continue to harvest; if $X^{\alpha }_{T_i} \leq \Gamma$, then the manager can no longer harvest until the next checking time. In the latter case, the first time at which the manager is permitted to resume harvesting can be represented mathematically as follows: $\tau ^{\alpha }_i := \inf \{T_k, \; k \geq i : X^{\alpha }_{T_k} \gt \Gamma \}$.
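The checking rule can be sketched as follows. The helper names `harvest_allowed` and `resume_time` are hypothetical, and the lists are indexed from zero, so entry `k` stands for the checking time $T_{k+1}$; returning `None` encodes the infimum of an empty set.

```python
def harvest_allowed(x_at_checks, gamma_level):
    """For each checking time, True if harvesting may continue (X > Gamma)."""
    return [x > gamma_level for x in x_at_checks]


def resume_time(check_times, x_at_checks, gamma_level, i):
    """First checking time T_k, k >= i, at which X_{T_k} > Gamma (None if there is none)."""
    for k in range(i, len(check_times)):
        if x_at_checks[k] > gamma_level:
            return check_times[k]
    return None


# Illustrative data (assumed): four checks and a threshold Gamma = 0.3.
check_times = [0.25, 0.5, 0.75, 1.0]
stock_at_checks = [0.4, 0.2, 0.25, 0.5]
permissions = harvest_allowed(stock_at_checks, 0.3)
```

In this example the stock is below the threshold at the second and third checks, so a manager stopped at the second check may only resume at the fourth one.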

We define the set ${\mathcal {A}}$ of admissible controls as the set of $\mathbb {F}$-adapted processes $\alpha$ valued in $[0, \bar a]$ such that $X^{\alpha }$ is nonnegative and $\alpha$ is null on $[T_i, \tau ^{\alpha }_i)$ for any $1 \leq i \leq N$. The harvest rate is thus bounded above by the constant $\bar a$. This last assumption is natural since the manager faces technical constraints and cannot harvest more than a given quantity, which depends, for example, on the fishing boat or on the volume of the truck's dump body when harvesting trees.

We make the standard assumption that the agent seeks to maximize the expected present value of net revenues from harvesting on the interval $[0,T]$, subject to the dynamic constraints. Thus, the objective of the manager consists in computing

(2.1)

\begin{equation} V_0(x,p) := \sup_{\alpha \in {\mathcal{A}}} \mathbb{E} \left[\int_0^{T} e^{-\beta t}( P_t\alpha_t-C(\alpha_t)) dt - e^{-\beta T} f((\Gamma - X^{\alpha}_T)^{+},P^{\alpha}_T) \right] , \end{equation}

and finding an optimal strategy $\alpha ^{*} \in {\mathcal {A}}$ such that

$$V_0(x,p) = \mathbb{E} \left[\int_0^{T} e^{-\beta t} (P_t\alpha^{*}_t-C(\alpha^{*}_t)) dt - e^{-\beta T} f((\Gamma - X^{\alpha^{*}}_T)^{+},P^{\alpha^{*}}_T) \right] ,$$

where $\beta$ is a positive constant corresponding to the discount rate, $(\cdot )^{+}$ denotes the positive part, $C$ is a positive increasing convex function representing the cost of harvesting, and $f$ is a map from $\mathbb{R} _+\times \mathbb{R} _+$ to $\mathbb{R} _+$ which corresponds to a tax that the manager must pay if at time $T$ the quantity of natural resource available $X^{\alpha }_T$ is lower than the level $\Gamma$. This tax depends on the quantity of natural resource available and also on the natural resource price at time $T$. Indeed, if the tax did not depend on the natural resource price, then the manager would be willing to pay it when the natural resource price is high, since the earnings from selling the harvest would offset the tax. On the contrary, when the natural resource price is low, these earnings would not offset the tax, and the manager would not accept to pay it.

$(\mathbf {H}f)$ $f: \mathbb{R} _+\times \mathbb{R} _+ \rightarrow \mathbb{R} _+$ is a nondecreasing and Lipschitz function w.r.t. both of its arguments: there exists a positive constant $L$ such that

$$| f(x,y) - f(x',y')| \leq L ( |x - x'|+ |y - y'|) ,$$

for all $(x,y), (x',y') \in \mathbb{R} _+\times \mathbb{R} _+$, and $f(0,y)=0$ for any $y \in \mathbb{R} _+$.
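To fix ideas, the sketch below evaluates a discretized version of the criterion in (2.1) for given sampled paths. The fine `fine(y, p) = c0*y + c1*min(y, y_cap)*min(p, p_cap)` is only one hypothetical function satisfying $(\mathbf {H}f)$, and the function names, caps and left-point Riemann sum are assumptions made for this example.

```python
import math


def fine(shortfall, price, c0=1.0, c1=1.0, y_cap=1.0, p_cap=10.0):
    """Illustrative f(y, p): zero at y = 0, nondecreasing and Lipschitz in (y, p)."""
    return c0 * shortfall + c1 * min(shortfall, y_cap) * min(price, p_cap)


def discounted_profit(prices, harvests, x_T, p_T, beta, horizon, level, cost):
    """Left-point Riemann sum of the running gain minus the discounted terminal fine."""
    n = len(harvests)
    dt = horizon / n
    running = sum(
        math.exp(-beta * k * dt) * (prices[k] * harvests[k] - cost(harvests[k])) * dt
        for k in range(n)
    )
    shortfall = max(level - x_T, 0.0)  # (Gamma - X_T)^+
    return running - math.exp(-beta * horizon) * fine(shortfall, p_T)


# Illustrative check (assumed data): constant price 2, constant rate 1, C(a) = a^2/2.
quadratic_cost = lambda a: 0.5 * a * a
no_fine = discounted_profit([2.0] * 100, [1.0] * 100, 0.5, 2.0, 0.0, 1.0, 0.3, quadratic_cost)
with_fine = discounted_profit([2.0] * 100, [1.0] * 100, 0.1, 2.0, 0.0, 1.0, 0.3, quadratic_cost)
```

Averaging `discounted_profit` over many simulated paths would give a Monte Carlo estimate of the expected profit of a fixed strategy, which is a lower bound for $V_0$.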

2.1.1. An example of explicit solution

We describe in this paragraph a case where an explicit solution to (2.1) can be computed.

Take $f\equiv 0$, $\gamma \equiv 0$, and $C(a)=a^{2}/2$ for $a\in [0,\bar a]$. Suppose that $X^{\alpha}$ and $P$ are given by

\begin{align*} X_t^{\alpha} & = x+\int_0^{t}\eta X_s^{\alpha} (\lambda - X_s^{\alpha})ds-\int_0^{t}\alpha_sds\\ P_t & = p+\int_0^{t}P_s\big(\mu ds+\sigma dB_s\big) \end{align*}

for $t\in [0,T]$, where $\mu$ and $\eta$ are constants and $\lambda$ and $\sigma$ are positive constants. We recall that the process $B$ is a standard one-dimensional Brownian motion. The value function is then given by

$$V_0(x,p) = \sup_{\alpha \in {\mathcal{A}}} \mathbb{E} \left[\int_{0}^{T} e^{-\beta t} \left( P_t\alpha_t-\frac{\alpha_t^{2}}{2} \right) dt \right].$$

Proposition 2.1 Suppose that

(2.2)

\begin{align} \frac{\eta \lambda^{2}}{4} & \gt \bar a \end{align}

(2.3)

\begin{align} x& \in (x_-,x_+) \end{align}

and

(2.4)

\begin{equation} x_-{+}\frac{x_+{-}x_-}{\frac{x-x_+}{x-x_-}e^{-\eta(x_+{-}x_-)T}-1} \geq \Gamma \end{equation}

where

$$x_{{\pm}} = \frac{1}{2}\left(\lambda\pm\sqrt{\lambda^{2}-4\frac{\bar a}{\eta}}\right)\;.$$

Then, an optimal strategy is given by

$$\alpha^{*}_t = P_t\wedge \bar a, \quad t\in[0,T],$$

and we have

\begin{align*} V_0(x,p) & =\mathbb{E} \left[\int_{0}^{T} e^{-\beta t}\left( P_t(P_t\wedge \bar a)-\frac{(P_t\wedge \bar a)^{2}}{2}\right)dt \right] \\ & = \int_{0}^{T} e^{-\beta t}\left( \frac{p^{2}}{2} e^{(2\mu+\sigma^{2})t}F \left(\frac{\log(\frac{\bar a}{p})-(\mu+\frac{3\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right)\right.\\ & \quad \left.+p\bar a e^{\mu t} F \left(\frac{\log(\frac{p}{\bar a})+(\mu+\frac{\sigma^{2}}{2})t} {\sigma\sqrt{t}}\right)-\frac{\bar a^{2}}{2} F \left(\frac{\log(\frac{p}{\bar a}) +(\mu-\frac{\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right) \right)dt, \end{align*}

where $F$ is the cumulative distribution function of ${{\mathcal {N}}(0,1)}$.

Proof. We proceed in four steps.

Step 1. We first notice that the value function can be rewritten as

$$V_0(x,p) =\sup_{\alpha \in \bar {\mathcal{A}}} \mathbb{E} \left[\sum_{i=0}^{N-1}\int_{T_i}^{T_{i+1}} e^{-\beta t} \left( P_t\alpha_t-\frac{\alpha_t^{2}}{2}\right)\mathbb{1}_{X^{\alpha}_{T_i}\geq \Gamma} dt \right] ,$$

where $\bar {\mathcal {A}}$ is the set of $\mathbb {F}$-adapted processes $\alpha$ valued in $[0, \bar a]$ such that $X^{\alpha }$ is nonnegative.

Step 2. Denote by $X^{\bar a}$ the solution to the ODE

$$X_t^{\bar a} = x+\int_0^{t}\eta X_s^{\bar a} (\lambda - X_s^{\bar a})ds-\int_0^{t}\bar a ds,\quad t\in[0,T].$$

Then $X^{\bar a}$ is uniquely defined as the solution to a locally Lipschitz ordinary differential equation and, by classical results on Riccati equations, we get

$$X_t^{\bar a} = x_-{+}\frac{x_+{-}x_-}{1-\frac{x-x_+}{x-x_-}e^{-\eta(x_+{-}x_-)t}} ,\quad t\in[0,T]\;.$$

From (2.3), we get that $X^{\bar a}$ is nondecreasing. Then, from (2.4), we get

(2.5)

\begin{equation} X^{\bar a}_t \geq \Gamma\gt0 \end{equation}

for all $t\in [0,T]$.
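The closed-form expression of $X^{\bar a}$ given in Step 2 can be checked numerically. The sketch below compares it with a fourth-order Runge-Kutta integration of the same equation; the parameter values and function names are illustrative assumptions, chosen so that (2.2) and (2.3) hold.

```python
import math


def roots(eta, lam, a_bar):
    """Equilibria x_- and x_+ of x' = eta*x*(lam - x) - a_bar."""
    disc = math.sqrt(lam * lam - 4.0 * a_bar / eta)
    return 0.5 * (lam - disc), 0.5 * (lam + disc)


def closed_form(t, x, eta, lam, a_bar):
    """Closed-form solution started from x at time 0."""
    x_minus, x_plus = roots(eta, lam, a_bar)
    gap = x_plus - x_minus
    ratio = (x - x_plus) / (x - x_minus)
    return x_minus + gap / (1.0 - ratio * math.exp(-eta * gap * t))


def rk4(x, eta, lam, a_bar, horizon, n_steps):
    """Classical Runge-Kutta integration of the same ODE up to the horizon."""
    drift = lambda y: eta * y * (lam - y) - a_bar
    dt = horizon / n_steps
    for _ in range(n_steps):
        k1 = drift(x)
        k2 = drift(x + 0.5 * dt * k1)
        k3 = drift(x + 0.5 * dt * k2)
        k4 = drift(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


# Assumed parameters: eta = lam = 1, a_bar = 0.1, x = 0.5 lies in (x_-, x_+).
error = abs(closed_form(2.0, 0.5, 1.0, 1.0, 0.1) - rk4(0.5, 1.0, 1.0, 0.1, 2.0, 2000))
```

For these values the two computations should agree up to the integration error, and the closed-form path is nondecreasing in $t$, as stated above.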

Step 3. We next have

$$X^{\alpha}_t \geq X^{\bar a}_t ,\quad t\in[0,T],$$

for any adapted process $\alpha$ valued in $[0,\bar a]$. Indeed, denote by $\delta X$ the process $X^{\alpha }-X^{\bar a}$. This process solves

$$\delta X_t = \int_0^{t}\delta X_s\Delta_s ds + \int_0^{t}(\bar a-\alpha_s)ds,\quad t\in[0,T],$$

with $\Delta _t= \eta (\lambda - X^{\alpha }_t-X^{\bar a}_t)$. Therefore, we get

$$\delta X_t = \int_0^{t}e^{\int_s^{t}\Delta_u du}(\bar a-\alpha_s)ds \geq 0$$

for all $t\in [0,T]$.
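The comparison of Step 3 can be illustrated numerically in the deterministic case $\gamma = 0$. In the sketch below, the oscillating control, the parameter values and the explicit Euler scheme are assumptions made for illustration; this is a sanity check on one example, not a proof.

```python
import math


def euler_paths(x, eta, lam, a_bar, control, horizon, n_steps):
    """Explicit Euler paths of X^alpha and X^{a_bar} started from the same stock."""
    dt = horizon / n_steps
    x_alpha, x_max = x, x
    out = [(x_alpha, x_max)]
    for k in range(n_steps):
        a = min(max(control(k * dt), 0.0), a_bar)  # keep the control in [0, a_bar]
        x_alpha += (eta * x_alpha * (lam - x_alpha) - a) * dt
        x_max += (eta * x_max * (lam - x_max) - a_bar) * dt
        out.append((x_alpha, x_max))
    return out


# Assumed control: alpha_t = 0.05 * (1 + sin(5 t)), which takes values in [0, 0.1].
paths = euler_paths(0.5, 1.0, 1.0, 0.1,
                    lambda t: 0.05 * (1.0 + math.sin(5.0 * t)), 2.0, 2000)
smallest_gap = min(xa - xm for xa, xm in paths)
```

Here `smallest_gap` is the minimum of $X^{\alpha}_t - X^{\bar a}_t$ over the time grid and is expected to be nonnegative, since harvesting at the maximal rate leaves the smallest stock.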

Step 4. We deduce from (2.5), Step 2 and Step 3 that

$$X^{\alpha}_{t} \geq X^{\bar a}_{t}\geq \Gamma \gt 0$$

for any $t\in [0,T]$ and any adapted process $\alpha$ valued in $[0,\bar a]$. Therefore, $\bar {\mathcal {A}}$ is the set of adapted processes $\alpha$ valued in $[0,\bar a]$ and we have

$$V_0(x,p) =\sup_{\alpha\in\bar {\mathcal{A}}}\mathbb{E} \left[\int_{0}^{T} e^{-\beta t}\left( P_t\alpha_t-\frac{\alpha_t^{2}}{2}\right)dt \right].$$

Maximizing the term inside the integral pointwise over $[0,\bar a]$, we get from the first-order condition

$$\alpha^{*}_t = P_t\wedge \bar a, \quad t\in[0,T],$$

and we have

$$V_0(x,p) = \mathbb{E} \left[\int_{0}^{T} e^{-\beta t} \left( P_t(P_t\wedge \bar a)-\frac{(P_t\wedge \bar a)^{2}}{2}\right)dt \right].$$

Following a computation similar to that of the call and put prices in the Black-Scholes model, we have

\begin{align*} \mathbb{E} [P_t(P_t\wedge \bar a)] & = p^{2} e^{(2\mu + \sigma^{2})t}F \left(\frac{\log(\frac{\bar a} {p})-(\mu+\frac{3\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right)\\ & \quad +p\bar a e^{\mu t} F \left(\frac{\log(\frac{p}{\bar a})+(\mu+\frac{\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right) \end{align*}

and

\begin{align*} \mathbb{E} [(P_t\wedge \bar a)^{2}] & = p^{2} e^{(2\mu+\sigma^{2})t}F \left(\frac{\log(\frac{\bar a} {p})-(\mu+\frac{3\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right)\\ & \quad +\bar a^{2} F\left(\frac{\log(\frac{p}{\bar a})+(\mu-\frac{\sigma^{2}}{2})t}{\sigma\sqrt{t}}\right), \end{align*}

which yields the final expression for $V_0(x,p)$.
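The integrand of the closed-form expression can be cross-checked against a direct numerical integration over the Gaussian law of $\log P_t$. The sketch below does this for a fixed $t \gt 0$; the function names, the truncation of the integral to $[-10,10]$ and the parameter values are assumptions made for this check.

```python
import math


def Phi(x):
    """Cumulative distribution function of N(0,1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def closed_form_gain(p, a_bar, mu, sigma, t):
    """E[P_t (P_t ^ a_bar) - (P_t ^ a_bar)^2 / 2] for a geometric Brownian P, t > 0."""
    s = sigma * math.sqrt(t)
    d1 = (math.log(a_bar / p) - (mu + 1.5 * sigma ** 2) * t) / s
    d2 = (math.log(p / a_bar) + (mu + 0.5 * sigma ** 2) * t) / s
    d3 = (math.log(p / a_bar) + (mu - 0.5 * sigma ** 2) * t) / s
    return (0.5 * p * p * math.exp((2.0 * mu + sigma ** 2) * t) * Phi(d1)
            + p * a_bar * math.exp(mu * t) * Phi(d2)
            - 0.5 * a_bar ** 2 * Phi(d3))


def quadrature_gain(p, a_bar, mu, sigma, t, n=20000, z_max=10.0):
    """Trapezoidal integration of the same expectation against the N(0,1) density."""
    h = 2.0 * z_max / n
    total = 0.0
    for k in range(n + 1):
        z = -z_max + k * h
        price = p * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        rate = min(price, a_bar)  # optimal rate P_t ^ a_bar
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * (price * rate - 0.5 * rate * rate) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


# Assumed parameters: p = 1, a_bar = 1.2, mu = 0.05, sigma = 0.3, t = 1.
difference = abs(closed_form_gain(1.0, 1.2, 0.05, 0.3, 1.0)
                 - quadrature_gain(1.0, 1.2, 0.05, 0.3, 1.0))
```

Integrating `closed_form_gain` against $e^{-\beta t}$ over $(0,T]$ with any standard quadrature rule would then give a numerical value for $V_0(x,p)$ in this example.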

In the previous example, we are able to derive a computable representation of the value function and give the associated optimal strategy. This model remains relevant as it takes into account the harvesting effort via the quadratic term $\alpha _t^{2}$. Moreover, the explicit computation of the optimal strategy is possible since the conditions on the parameters ensure that the constraint related to $\Gamma$ is satisfied. We notice that this optimal strategy is nondecreasing in the resource price $P_t$ and in the maximal effort $\bar a$. This behavior is quite natural: for a higher price, the manager should harvest more to increase the gain.

Unfortunately, we cannot always compute explicit solutions for our optimization problem due to the complexity of the state space. We therefore provide in the sequel a PDE characterization of the value function.

2.2. The value function

In order to provide an analytic characterization of the value function $V_0$ defined by (2.1), we need to extend the definition of this control problem to general initial conditions.

Unfortunately, the considered controlled system is not Markovian. Indeed, the control process $\alpha$ is subject to a constraint that is determined only at each time $T_k$ but holds over $[T_k,T_{k+1})$. Thus, we need to keep track of the constraint and we therefore consider two cases (harvesting is allowed or not) and two value functions. This approach is inspired by Bruder and Pham [Reference Bruder and Pham1] who consider a delayed controlled system. They enlarge the controlled system to make it Markovian. Similarly, we enlarge our system by adding a parameter which indicates whether the agent is allowed to harvest or not on the considered period $[T_k,T_{k+1})$. However, we notice that our resulting partial differential equation (in short PDE) is different from theirs since we get a coupled system whereas they get a recursive one.

For any $t \in [0,T]$, $x \geq 0$ and $i \in \{0,1\}$, we denote by ${\mathcal A} _{t,i}(x)$ the set

\begin{align*} {\mathcal A}_{t,i}(x) & :=\{ \alpha = (\alpha_s)_{t \leq s \leq T} , \ \alpha_s \text{ is }{\mathcal F}_s\text{-measurable and valued in } [0, \bar a] , \\ & \qquad \alpha_s=0 \text{ on } [t ,\tau^{\alpha}_{q(t)}) \text{ if } i=0 ,\\ & \qquad \alpha_s=0 \text{ on } [T_k ,\tau^{\alpha}_k) \text{ for any } q(t) +1 \leq k \leq N \} , \end{align*}

where $q(t) := \sup \{j : T_j \leq t\}$.

Let ${\mathcal {Z}}:= \mathbb{R} _+ \times (0,+\infty ) \times \{0,1\}$. For $z=(x,p,i) \in {\mathcal {Z}}$ and $\alpha \in {\mathcal A} _{t,i}(x)$, we denote by $Z^{t,z,\alpha } := (X^{t,x,\alpha }, P^{t,z,\alpha },I^{t,z,\alpha })$ the triple of processes defined by

\begin{align*} X^{t,x,\alpha}_s & = x + \int_t^{s} \eta X^{t,x,\alpha}_u ( \lambda - X^{t,x,\alpha}_u) du + \int_t^{s} \gamma X^{t,x,\alpha}_u dB_u - \int_t^{s} \alpha_u du ,\\ P^{t,z,\alpha}_s & = p + \int_t^{s} \mu(X^{t,x,\alpha}_u) P^{t,z,\alpha}_u du + \int_t^{s} \sigma P^{t,z,\alpha}_u dW_u ,\\ I^{t,z,\alpha}_s & = i {\mathbb 1}_{t \leq s \lt T_{q(t) + 1}} + \sum_{k=q(t)+1}^{N-1} {\mathbb 1}_{X^{t,x,\alpha}_{T_k} \gt \Gamma} {\mathbb 1}_{T_k \leq s \lt T_{k + 1}}. \end{align*}

For any $t \in [0,T]$ and $z\in {\mathcal {Z}}$, we consider the value function $v$ defined by

$$v(t,z) := \sup_{\alpha \in {\mathcal A}_{t,i}(x)} \mathbb{E} \left[ \int_t^{T} e^{-\beta (s-t)} (P^{t,z,\alpha}_s \alpha_s-C(\alpha_s)) ds - e^{-\beta (T-t)} f\big((\Gamma - X^{t,x,\alpha}_T)^{+},P^{t,z,\alpha}_T\big) \right].$$

We also consider the two value functions $v_0$ and $v_1$ defined on $[0,T] \times \mathbb{R} _+ \times (0,+\infty )$ by

$$v(t,z)=v_0(t,x,p){\mathbb 1}_{i=0} + v_1(t,x,p){\mathbb 1}_{i=1}.$$

The value function $v_0$ corresponds to the case where at time $t$ the manager cannot harvest until the next checking time, while the value function $v_1$ corresponds to the case where at time $t$ the manager can harvest.
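The index $q(t)$ and the regime process $I$ can be computed as in the following sketch, which assumes the convention $T_0 = 0$ and restricts to times $s \lt T_N$. The function names and the dictionary `x_at_checks`, which maps $k$ to an observed value of $X_{T_k}$, are hypothetical.

```python
from bisect import bisect_right


def q(t, times):
    """q(t) = sup{j : T_j <= t}, with times = [T_0 = 0, T_1, ..., T_N]."""
    return bisect_right(times, t) - 1


def regime(s, t, i, times, x_at_checks, gamma_level):
    """Value of I_s for t <= s < T_N, given the initial regime i at time t."""
    k = q(s, times)
    if k == q(t, times):
        return i  # no checking time in (t, s]: the regime is still the initial one
    return 1 if x_at_checks[k] > gamma_level else 0


# Illustrative data (assumed): N = 4 checks, threshold 0.3, start at t = 0.1 in regime 0.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
x_at_checks = {1: 0.4, 2: 0.2, 3: 0.5}
regimes = [regime(s, 0.1, 0, times, x_at_checks, 0.3) for s in (0.2, 0.3, 0.6, 0.8)]
```

The regime is piecewise constant between checking times, which is why the enlarged state $(X, P, I)$ carries enough information to make the system Markovian.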

3. HJB characterization

We use the HJB equation to characterize the value functions $v_0$ and $v_1$. The HJB equations related to the value functions $v_0$ and $v_1$ are for any $x \in \mathbb{R} _+$ and $p \in (0,+\infty )$

(3.1)

\begin{equation} \left\{\begin{array}{l} -\partial_t v_0(t,x,p) - {\mathcal{L}}^{0} v_0(t,x,p) = 0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\ v_0(T^{-}_j,x,p) = v_0(T_j,x,p) {\mathbb 1}_{x \leq \Gamma} + v_1(T_j,x,p) {\mathbb 1}_{x \gt \Gamma}, \quad j \in \{1,\ldots,N-1\} \\ v_0(T^{-}_N,x,p) ={-} f((\Gamma -x)^{+},p) \end{array}\right. \end{equation}

and

(3.2)

\begin{equation} \left\{\begin{array}{l} - \partial_t v_1(t,x,p) - \displaystyle\sup_{0 \leq a \leq \bar a} \{{\mathcal{L}}^{a} v_1(t,x,p)+pa-C(a)\} =0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\ v_1(T^{-}_j,x,p) = v_0(T_j,x,p){\mathbb 1}_{x \leq \Gamma} + v_1(T_j,x,p){\mathbb 1}_{x\gt \Gamma}, \quad j \in \{1,\ldots,N-1\} \\ v_1(T^{-}_N,x,p) ={-}f((\Gamma -x)^{+},p), \end{array}\right. \end{equation}

where ${\mathcal {L}}^{a}$ is the operator associated to the diffusions

$${\mathcal{L}}^{a} \varphi={-}\beta \varphi + \eta x( \lambda - x) \partial_x \varphi + \frac{|\gamma x |^{2} }{2} \partial^{2}_x \varphi + \mu(x)p \partial_p \varphi + \frac{|\sigma p |^{2}}{2} \partial^{2}_p \varphi + \rho \sigma \gamma p x \partial^{2}_{px} \varphi - a \partial_x \varphi.$$
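In (3.2), the only terms depending on $a$ are $a(p - \partial_x v_1) - C(a)$, so the supremum can be computed pointwise. The sketch below treats the quadratic cost $C(a) = a^{2}/2$ of Section 2.1.1 as an assumption and compares the resulting clipped formula with a brute-force search on a grid; the function names are hypothetical.

```python
def optimal_rate(p, dv_dx, a_bar):
    """Maximizer of a*(p - dv_dx) - a^2/2 over [0, a_bar] (quadratic cost assumed)."""
    return min(max(p - dv_dx, 0.0), a_bar)


def grid_maximizer(p, dv_dx, a_bar, cost, n=10000):
    """Brute-force maximizer on a uniform grid, usable for any cost function."""
    grid = [a_bar * k / n for k in range(n + 1)]
    return max(grid, key=lambda a: a * (p - dv_dx) - cost(a))


# Illustrative values (assumed): price 0.7, marginal value of the stock 0.2, a_bar = 1.
rate = optimal_rate(0.7, 0.2, 1.0)
rate_check = grid_maximizer(0.7, 0.2, 1.0, lambda a: 0.5 * a * a)
```

When the marginal value of the stock $\partial_x v_1$ vanishes, this formula reduces to $p \wedge \bar a$, in line with the explicit strategy of Proposition 2.1.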

Let $C^{0}$ be the set of continuous functions and $C^{1,2}$ be the set of functions that are differentiable with continuous derivative in their first argument and twice differentiable with continuous second derivatives in their second argument. We have the following verification result.

Theorem 3.1 Let $w_0$ and $w_1$ be two functions in $C^{1,2}([T_j, T_{j+1}) \times \mathbb{R} _+ \times (0,+\infty )) \;\cap \; C^{0}([T_j,T_{j+1}] \times \mathbb{R} _+ \times (0,+\infty ))$ for any $j \in \{0, \ldots, N-1\}$, with $T_0=0$, and satisfying a quadratic growth condition, that is, there exists a positive constant $C$ such that

$$|w_0(t,x,p)| + |w_1(t,x,p)| \leq C ( 1 + |x|^{2} + |p|^{2}) , \quad \forall \; (t,x,p) \in [0,T] \times \mathbb{R}_+{\times} (0,+\infty).$$

(i) Suppose that for any $x \in \mathbb{R} _+$ and $p \in (0,+\infty )$, we have
(3.3)
\begin{equation} \left\{\begin{array}{l} - \partial_t w_0 (t,x,p) - {\mathcal{L}}^{0} w_0(t,x,p) \geq 0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\ w_0(T^{-}_j,x,p) \geq w_0(T_j,x,p) {\mathbb 1}_{x \leq \Gamma} + w_1(T_j,x,p) {\mathbb 1}_{x \gt \Gamma}, \quad j \in \{1, \ldots, N-1\} \\ w_0(T^{-}_N,x,p) \geq{-}f((\Gamma-x)^{+},p) \end{array}\right. \end{equation}
and
(3.4)
\begin{equation} \left\{\begin{array}{l} - \partial_t w_1(t,x,p) -\displaystyle\sup_{0 \leq a \leq \bar a}\{ {\mathcal{L}}^{a} w_1(t,x,p) + pa-C(a)\} \geq 0, \quad t \in [0,T] - \{T_j\}_{1 \leq j \leq N} \\ w_1(T^{-}_j,x,p) \geq w_0(T_j,x,p){\mathbb 1}_{x\leq \Gamma} + w_1(T_j,x,p){\mathbb 1}_{x \gt \Gamma}, \quad j \in \{1, \ldots, N-1\} \\ w_1(T^{-}_N,x,p) \geq{-}f((\Gamma-x)^{+},p). \end{array}\right. \end{equation}
Then, the function $w$ defined by $w(t,z) :=w_0(t,x,p){\mathbb 1}_{i=0} + w_1(t,x,p){\mathbb 1}_{i=1}$ satisfies $w(t,z) \geq v(t,z)$ on $[0,T] \times {\mathcal {Z}}$.
(ii) Suppose further that for any $z \in {\mathcal {Z}}$, there exists a measurable function $\hat \alpha (t,z)$ valued in $[0, \bar a]$ such that if $i =0$, we have
$$-\partial_t w(t,z) - {\mathcal{L}}^{0} w(t,z) = 0$$
and if $i=1$, we have
\begin{align*} & \partial_t w(t,z) + \sup_{a \in [0, \bar a]} [ {\mathcal{L}}^{a} w(t,z) + pa -C(a) ]\\ & \quad =\partial_t w(t,z) + {\mathcal{L}}^{\hat \alpha(t,z)} w(t,z) + p\hat \alpha(t,z)-C(\hat \alpha(t,z)) =0 \end{align*}
with
$$w(T^{-}_j,z) = w(T_j,x,p,0) {\mathbb 1}_{x \leq \Gamma} + w(T_j,x,p,1) {\mathbb 1}_{x \gt \Gamma}, \quad (j,z) \in \{1, \ldots, N-1\} \times {\mathcal{Z}}$$
and
$$w(T^{-}_N,z) ={-}f((\Gamma-x)^{+},p),$$
and that the stochastic differential equations
\begin{align*} X^{t,x,\hat{\alpha}}_s & = x + \int_t^{s} \eta X^{t,x,\hat{\alpha}}_u ( \lambda - X^{t,x,\hat{\alpha}}_u) du + \int_t^{s} \gamma X^{t,x,\hat{\alpha}}_u dB_u - \int_t^{s} \hat{\alpha}_u du\\ P^{t,z,\hat{\alpha}}_s & = p + \int_t^{s} \mu(X^{t,x,\hat{\alpha}}_u) P^{t,z,\hat{\alpha}}_u du + \int_t^{s} \sigma P^{t,z,\hat{\alpha}}_u dW_u \\ I^{t,z, \hat \alpha}_s & = i {\mathbb 1}_{t \leq s \lt T_{q(t) + 1}} + \sum_{k=q(t)+1}^{N-1} {\mathbb 1}_{X^{t,x,\hat{\alpha}}_{T_k} \gt \Gamma} {\mathbb 1}_{T_k \leq s \lt T_{k + 1}} \end{align*}
admit a unique solution, denoted by $\hat Z^{t,z}_s$ given an initial condition $Z_t=z$, and the process $\{\hat {\alpha }(s, \hat Z^{t,z}_s), \; t \leq s \leq T\}$ lives in ${\mathcal A} _{t,i}(x)$. Then,
$$w=v \quad \text{on } [0,T] \times {\mathcal{Z}}$$
and $\hat \alpha$ is an optimal Markovian control.

Proof. In the proof, to simplify the notation, we introduce $K^{t,k,\alpha }_s := (X^{t,x,\alpha }_s, P^{t,z,\alpha }_s)$ and $k :=(x,p)$ for any $z \in {\mathcal {Z}}$ and $\alpha \in {\mathcal A} _{t,i}(x)$.

(i) We prove by induction that $w \geq v$ on $[T_j, T_{j+1}]$ for any $j \in \{0, \ldots, N-1\}$.

We first consider the case $j=N-1$ and $i=0$, which means that the manager cannot harvest on $[T_{N-1},T_N]$; thus $v(t,z)=\mathbb {E}[-e^{-\beta (T-t)} f( ( \Gamma - X^{t,x,0}_T)^{+},P^{t,z,0}_T)]$.

Since $w_0$ is in $C^{1,2}([T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,0}(x)$, $s \in [t,T_{N})$, and any stopping time $\tau$ valued in $[t,T]$, by Itô's formula

\begin{align*} & e^{-\beta (s \wedge \tau)} w_0(s \wedge \tau, K^{t,k,\alpha}_{s \wedge \tau})\\ & \quad = e^{-\beta t}w_0(t,k) + \int_t^{s \wedge \tau} e^{-\beta u}(\partial_t w_0(u, K^{t,k,\alpha}_u) + {\mathcal{L}}^{0} w_0(u, K^{t,k,\alpha}_u) ) du\\ & \qquad + \int_t^{s \wedge \tau} e^{-\beta u}( \partial_x w_0(u, K^{t,k,\alpha}_u) \gamma X^{t,x,\alpha}_u dB_u + \partial_p w_0(u, K^{t,k,\alpha}_u) \sigma P^{t,z,\alpha}_u dW_u ). \end{align*}

We choose $\tau = \tau _n := \inf \{s \geq t : \int _t^{s} (|\partial _x w_0(u, K^{t,k,\alpha }_u) X^{t,x,\alpha }_u|^{2} + |\partial _p w_0(u, K^{t,k,\alpha }_u) P^{t,z,\alpha }_u|^{2}) du \geq n\} \wedge T$ and we remark that $(\tau _n)_{n \geq 1}$ is an increasing sequence going to $T$ when $n$ goes to infinity. By taking the expectation, we get

\begin{align*} & \mathbb{E} [ e^{-\beta (s \wedge \tau_n)} w_0(s \wedge \tau_n, K^{t,k,\alpha}_{s \wedge \tau_n}) ]\\ & \quad =e^{-\beta t}w_0(t,k) + \mathbb{E} \left[ \int_t^{s \wedge \tau_n} e^{-\beta u} (\partial_t w_0(u, K^{t,k,\alpha}_u) + {\mathcal{L}}^{0} w_0(u, K^{t,k,\alpha}_u) ) du\right]. \end{align*}

Since $w_0$ satisfies (3.3), we have

$$\mathbb{E} [ e^{-\beta (s \wedge \tau_n)} w_0(s \wedge \tau_n, K^{t,k,\alpha}_{s \wedge \tau_n}) ] \leq e^{-\beta t}w_0(t,k).$$

By the quadratic growth condition on $w_0$ and the integrability condition on $K^{t,k,\alpha }$, we may apply the dominated convergence theorem and send $n$ to infinity

$$\mathbb{E} [e^{-\beta s} w_0(s, K^{t,k,\alpha}_s) ] \leq e^{-\beta t} w_0(t,k).$$

By sending $s$ to $T_N$, we obtain by the dominated convergence theorem

$$\mathbb{E} [ e^{-\beta T_N } w_0(T^{-}_N, K^{t,k,\alpha}_{T^{-}_N}) ] \leq e^{-\beta t} w_0(t,k) ,$$

which implies

$$e^{-\beta t} v(t,z) = \mathbb{E} [{-}e^{-\beta T_N } f((\Gamma - X^{t,x,0}_{T_N})^{+},P^{t,p,0}_{T_N}) ] \leq e^{-\beta t} w_0(t,k).$$

We now consider the case $j=N-1$ and $i=1$. Since $w^{1}$ is $C^{1,2}([T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{N-1},T_N) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, $s \in [t,T_N)$, and any stopping time $\tau$ valued in $[t,T]$, by Itô's formula

\begin{align*} & e^{-\beta (s \wedge \tau)} w_1(s \wedge \tau, K^{t,k,\alpha}_{s \wedge \tau})\\ & \quad = e^{-\beta t} w_1(t,k) + \int_t^{s \wedge \tau} e^{-\beta u} (\partial_t w_1(u, K^{t,k,\alpha}_u)+ {\mathcal{L}}^{\alpha_u} w_1(u, K^{t,k,\alpha}_u) ) du\\ & \qquad + \int_t^{s \wedge \tau} e^{-\beta u}( \partial_x w_1(u, K^{t,k,\alpha}_u) \gamma X^{t,x, \alpha}_u dB_u + \partial_p w_1(u, K^{t,k,\alpha}_u) \sigma P^{t,z,\alpha}_u dW_u ). \end{align*}

We choose $\tau = \tau _n := \inf \{s \geq t : \int _t^{s} (|\partial _x w_1(u, K^{t,k,\alpha }_u) X^{t,x, \alpha }_u|^{2} + |\partial _p w_1(u, K^{t,k,\alpha }_u) P^{t,z,\alpha }_u|^{2}) du \geq n\} \wedge T$ and we remark that $(\tau _n)_{n \geq 1}$ is an increasing sequence going to $T$ when $n$ goes to infinity. This stopping time ensures that the integrands of the stochastic integrals are square integrable up to $\tau _n$, so the stopped stochastic integrals are martingales. By taking the expectation, we get

\begin{align*} & \mathbb{E} [e^{-\beta (s \wedge \tau_n)} w_1(s \wedge \tau_n, K^{t,k,\alpha}_{s \wedge \tau_n}) ]\\ & \quad = e^{-\beta t} w_1(t,k) + \mathbb{E} \left[ \int_t^{s \wedge \tau_n} e^{-\beta u}(\partial_t w_1(u, K^{t,k,\alpha}_u)+ {\mathcal{L}}^{\alpha_u} w_1(u, K^{t,k,\alpha}_u)) du\right]. \end{align*}

By using (3.4), we get

$$\mathbb{E} [ e^{-\beta (s \wedge \tau_n)} w_1(s \wedge \tau_n, K^{t,k,\alpha}_{s \wedge \tau_n}) ] \leq e^{-\beta t} w_1(t,k) - \mathbb{E} \left[ \int_t^{s \wedge \tau_n} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right].$$

By sending $n$ to infinity, we obtain by the dominated convergence theorem

$$\mathbb{E} [ e^{-\beta s} w_1(s, K^{t,k,\alpha}_{s}) ] \leq e^{-\beta t} w_1(t,k) - \mathbb{E} \left[ \int_t^{s} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right].$$

By sending $s$ to $T^{-}_N$, we obtain by the dominated convergence theorem

$$\mathbb{E} [ e^{-\beta T_N } w_1(T^{-}_N, K^{t,k,\alpha}_{T^{-}_N}) ] \leq e^{-\beta t}w_1(t,k) - \mathbb{E} \left[ \int_t^{T_N} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right].$$

which implies, for any $\alpha \in {\mathcal A} _{t,1}(x)$,

$$\mathbb{E} \left[ \int_t^{T_N} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du - e^{-\beta T_N} f((\Gamma - X^{t,x, \alpha}_{T_N})^{+},P^{t,p, \alpha}_{T_N}) \right] \leq e^{-\beta t} w_1(t,k).$$

Thus, $v(t,z) \leq w(t,z)$ for any $(t,z) \in [T_{N-1},T_N] \times {\mathcal {Z}}$.

We now suppose the result holds on $[T_{j},T_{j+1}]$ for some $j \in \{1, \ldots,N-1\}$. We first consider the case $i=0$. Since $w^{0}$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,0}(x)$, $s \in [t,T_{j})$, and any stopping time $\tau$ valued in $[t,T_j]$, by Itô's formula

\begin{align*} & e^{-\beta (s \wedge \tau)} w_0(s \wedge \tau, K^{t,k,\alpha}_{s \wedge \tau}) \\ & \quad = e^{-\beta t} w_0(t,k) + \int_t^{s \wedge \tau} e^{-\beta u}(\partial_t w_0(u, K^{t,k,\alpha}_u) + {\mathcal{L}}^{0} w_0(u, K^{t,k,\alpha}_u) ) du\\ & \qquad + \int_t^{s \wedge \tau} e^{-\beta u}( \partial_x w_0(u, K^{t,k,\alpha}_u) \gamma X^{t,x, \alpha}_u dB_u + \partial_p w_0(u, K^{t,k,\alpha}_u) \sigma P^{t,z,\alpha}_u dW_u ). \end{align*}

By using the same techniques as previously, we get

$$\mathbb{E} [e^{-\beta T_j} w_0(T^{-}_j, K^{t,k,\alpha}_{T^{-}_j}) ] \leq e^{-\beta t} w_0(t,k).$$

By using the condition at time $T^{-}_j$ for $w_0$, we get

\begin{align*} e^{-\beta t}w_0(t,k) & \geq \mathbb{E} [e^{-\beta T_j } ( w_0(T_j, K^{t,k,\alpha}_{T_j}) {\mathbb 1}_{X^{t,x,\alpha}_{T_j} \leq \Gamma} + w_1(T_j, K^{t,k,\alpha}_{T_j}) {\mathbb 1}_{X^{t,x,\alpha}_{T_j} \gt \Gamma}) ]\\ & \geq \mathbb{E} [e^{-\beta T_j } w(T_j, Z^{t,z,\alpha}_{T_j}) ]\\ & \geq \mathbb{E} [e^{-\beta T_j } v(T_j, Z^{t,z,\alpha}_{T_j}) ] = e^{-\beta t}v(t,z). \end{align*}

We now consider the case $i=1$. Since $w^{1}$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, $s \in [t,T_{j})$, and any stopping time $\tau$ valued in $[t,T_j]$, by Itô's formula

\begin{align*} & e^{-\beta (s \wedge \tau)} w_1(s \wedge \tau, K^{t,k,\alpha}_{s \wedge \tau})\\ & \quad = e^{-\beta t}w_1(t,k) + \int_t^{s \wedge \tau} e^{-\beta u} (\partial_t w_1(u, K^{t,k,\alpha}_u) + {\mathcal{L}}^{\alpha_u} w_1(u, K^{t,k,\alpha}_u) ) du\\ & \qquad + \int_t^{s \wedge \tau} e^{-\beta u}( \partial_x w_1(u, K^{t,k,\alpha}_u) \gamma X^{t,x, \alpha}_u dB_u + \partial_p w_1(u, K^{t,k,\alpha}_u) \sigma P^{t,z,\alpha}_u dW_u ). \end{align*}

By using the previous arguments, we obtain

$$\mathbb{E} [ e^{-\beta T_{j}} w_1(T^{-}_{j}, K^{t,k,\alpha}_{T^{-}_{j}}) ] \leq e^{-\beta t} w_1(t,k) - \mathbb{E} \left[ \int_t^{T_{j}} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right].$$

By using the condition at time $T^{-}_j$ for $w_1$, we get

\begin{align*} e^{-\beta t}w_1(t,k) & \geq \mathbb{E} [ e^{-\beta T_{j}} (w_0(T_{j}, K^{t,k,\alpha}_{T_{j}}) {\mathbb 1}_{X^{t,x, \alpha}_{T_j} \leq \Gamma} +w_1(T_{j}, K^{t,k,\alpha}_{T_{j}}) {\mathbb 1}_{X^{t,x,\alpha}_{T_j} \gt \Gamma}) ]\\ & \quad + \mathbb{E} \left[ \int_t^{T_{j}} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right] \\ & \geq \mathbb{E} [ e^{-\beta T_{j}} w(T_{j}, Z^{t,z,\alpha}_{T_{j}}) ] + \mathbb{E} \left[ \int_t^{T_{j}} e^{-\beta u}(P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right]\\ & \geq \mathbb{E} [ e^{-\beta T_{j}} v(T_{j}, Z^{t,z,\alpha}_{T_{j}}) ] + \mathbb{E} \left[ \int_t^{T_{j}} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right]. \end{align*}

Then for any $\bar \alpha \in {\mathcal A} _{T_j,I^{t,i}_{T_j}}(X^{t,x,\alpha }_{T_j})$, we get

\begin{align*} e^{-\beta t} w_1(t,k) & \geq \mathbb{E} \left[\int_{T_j}^{T} e^{-\beta s} (P^{T_j,Z^{t,z,\alpha}_{T_j}, \bar \alpha }_s \bar \alpha_s -C(\bar \alpha_s))ds - e^{-\beta T} f((\Gamma - X^{T_j,X^{t,x,\alpha}_{T_j},\bar \alpha}_T)^{+},P^{T_j,P^{t,p,\alpha}_{T_j},\bar \alpha}_T) \right] \\ & \quad + \mathbb{E} \left[ \int_t^{T_{j}} e^{-\beta u} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du\right] , \end{align*}

which implies that, for any $\alpha \in {\mathcal A} _{t,i}(x)$,

$$w_1(t,k) \geq \mathbb{E} \left[ \int_t^{T} e^{-\beta (u-t)} (P^{t,z,\alpha}_u \alpha_u -C(\alpha_u)) du - e^{-\beta(T-t)} f((\Gamma - X^{t,x,\alpha}_T)^{+},P^{t,p,\alpha}_T) \right].$$

Thus, $w_1(t,x,p) \geq v(t,z)$.

(ii) We prove by induction that $w=v$ on $[T_j,T_{j+1}]$ for any $j \in \{0, \ldots, N-1\}$.

We first consider the case $j=N-1$ and $i=0$. We apply Itô's formula to $e^{-\beta u}w(u, \hat {Z}^{t,z}_u)$ between $t \in [T_{N-1},T_N)$ and $s \in [t,T)$ (after a localization for removing the stochastic integral term in the expectation)

$$\mathbb{E} [e^{-\beta T_N } w(T^{-}_N, \hat{Z}^{t,z}_{T^{-}_N})] = e^{-\beta t} w(t,z) + \mathbb{E} \left[\int_t^{T_N} e^{-\beta u } (\partial_t w(u, \hat{Z}^{t,z}_u) + {\mathcal{L}}^{0} w(u, \hat{Z}^{t,z}_u)) du \right].$$

Thus, we get

$$w(t,z) = \mathbb{E} [{-}e^{-\beta (T_N-t) } f((\Gamma - X^{t,x,0}_T)^{+},P^{t,p,0}_T)] = v(t,z).$$

We now consider the case $j=N-1$ and $i=1$. We apply Itô's formula to $e^{-\beta u}w(u, \hat {Z}^{t,z}_u)$ between $t \in [T_{N-1},T_N)$ and $T_N$ (after a localization for removing the stochastic integral term in the expectation)

$$\mathbb{E} [ e^{-\beta (T_N -t)} w(T^{-}_N , \hat{Z}^{t,z}_{T^{-}_N}) ]= w(t,z) + \mathbb{E} \left[ \int_t^{T_N} e^{-\beta (u -t)}(\partial_t w(u, \hat{Z}^{t,z}_u) + {\mathcal{L}}^{\hat \alpha(u,\hat{Z}^{t,z}_u)} w(u, \hat{Z}^{t,z}_u)) du\right].$$

which implies

\begin{align*} w(t,z)& = \mathbb{E} \left[ \int_t^{T_N} e^{-\beta (u -t)}(P^{t,z,\hat \alpha}_u \hat \alpha(u,\hat{Z}^{t,z}_u)-C(\hat\alpha(u,\hat{Z}^{t,z}_u))) du - e^{-\beta (T -t)}f((\Gamma - X^{t,x,\hat \alpha}_T)^{+},P^{t,p,\hat \alpha}_T) \right]\\ & = J(t,z,\hat \alpha). \end{align*}

Thus, $w(t,z)=J(t,z,\hat \alpha )=v(t,z)$ on $[T_{N-1},T_N] \times \mathbb{R} _+ \times (0,+\infty )$ with $i=1$.

We now suppose the result holds on $[T_j, T_{j+1}]$ for some $j \in \{1, \ldots, N-1\}$. We first consider the case $i=0$. Since $w$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$ by using the previous techniques

$$\mathbb{E} [e^{-\beta (T_j -t)} w(T^{-}_j, \hat Z^{t,z}_{T^{-}_j}) ] = w(t,z).$$

By using the condition at time $T^{-}_j$ for $w$, we get

\begin{align*} e^{-\beta t}w(t,z) & = \mathbb{E} [e^{-\beta T_j } ( w(T_j, X^{t,x,\hat \alpha}_{T_j},P^{t,z,\hat\alpha}_{T_j}, 0) {\mathbb 1}_{X^{t,x}_{T_j} \leq \Gamma} + w(T_j, X^{t,x,\hat\alpha}_{T_j},P^{t,z,\hat\alpha}_{T_j}, 1) {\mathbb 1}_{X^{t,x}_{T_j} \gt \Gamma}) ]\\ & = \mathbb{E}[e^{-\beta T_j } w(T_j, \hat Z^{t,z}_{T_j}) ]\\ & = \mathbb{E} [e^{-\beta T_j } v(T_j, \hat Z^{t,z}_{T_j}) ] = e^{-\beta t}v(t,z). \end{align*}

We now consider the case $i=1$. Since $w$ is $C^{1,2}([T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )) \cap C^{0}([T_{j-1},T_j] \times \mathbb{R} _+ \times (0,+\infty ))$, we have for any $(t, x,p) \in [T_{j-1},T_j) \times \mathbb{R} _+ \times (0,+\infty )$, $\alpha \in {\mathcal A} _{t,1}(x)$, by Itô's formula

\begin{align*} \mathbb{E}[e^{-\beta T_j} w(T_j, \hat Z^{t,z}_{T_j})] & = e^{-\beta t}w(t,z) + \mathbb{E}\left[\int_t^{T_j} e^{-\beta u} (\partial_t w(u, \hat Z^{t,z}_u) + {\mathcal{L}}^{\hat{\alpha}(u,\hat Z^{t,z}_u)} w(u, \hat Z^{t,z}_u) ) du\right] \\ & = e^{-\beta t}w(t,z) - \mathbb{E}\left[\int_t^{T_j} e^{-\beta u} (P^{t,z,\hat \alpha}_u \hat{\alpha}(u,\hat Z^{t,z}_u)-C(\hat{\alpha}(u,\hat Z^{t,z}_u))) du\right]. \end{align*}

By using the condition at time $T^{-}_j$ for $w$, we get

\begin{align*} e^{-\beta t}w(t,z) & = \mathbb{E}\left[ \int_{T_j}^{T} e^{-\beta u} (P^{T_j,\hat Z^{t,z}_{T_j}, \hat \alpha}_u \hat{\alpha}(u,\hat Z^{T_j,\hat Z^{t,z}_{T_j}}_u)-C(\hat{\alpha}(u,\hat Z^{T_j,\hat Z^{t,z}_{T_j}}_u))) du\right]\\ & \quad - \mathbb{E}[e^{-\beta T} f((\Gamma - X^{T_j, X^{t,x, \hat \alpha}_{T_j}, \hat \alpha}_T)^{+},P^{T_j, P^{t,p, \hat \alpha}_{T_j}, \hat \alpha}_T) ] \\ & \quad+ \mathbb{E}\left[\int_t^{T_j} e^{-\beta u} (P^{t,z, \hat \alpha}_u \hat{\alpha}(u,\hat Z^{t,z}_u)-C(\hat{\alpha}(u,\hat Z^{t,z}_u)))du\right] \\ & =\mathbb{E}\left[ \int_{t}^{T} e^{-\beta u} (P^{t,z,\hat \alpha}_u \hat{\alpha}(u,\hat Z^{t,z}_u)-C(\hat{\alpha}(u,\hat Z^{t,z}_u))) du - e^{-\beta T} f((\Gamma - X^{t, x, \hat \alpha}_T)^{+},P^{t, p, \hat \alpha}_T) \right]. \end{align*}

4. Numerical results

4.1. The discrete problem

In this section, we introduce the numerical tools that we use to solve the HJB equations linked to $v_0$ and $v_1$ and associated to the stochastic control problem. We use a finite difference scheme combined with an iterative procedure, which leads to the resolution of a Controlled Markov Chain problem. This class of problems is extensively studied by Kushner and Dupuis [9]. The convergence of the solution of the numerical scheme toward the solution of the HJB equation, when the time-space step goes to zero, can be shown using the standard local consistency argument, that is, the first and the second moments of the approximating Markov chain converge to those of the continuous process $(X,P)$. We refer to [2,6,7] for numerical schemes involving a Controlled Markov Chain control problem.
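To make the local consistency argument concrete, the sketch below checks it on a one-dimensional toy chain. It uses the standard upwind construction of the Markov chain approximation literature for a diffusion $dX = b\,dt + s\,dB$. The function names and the toy coefficients are illustrative assumptions, and this is not the seven-state chain of Table 1.

```python
def upwind_chain(b, s, h):
    """Transition probabilities and time step of a 1D upwind chain.

    The chain moves from x to x + h or x - h; b is the drift and s the
    diffusion coefficient at x (toy inputs, not the chain of the paper).
    """
    q = s**2 + h * abs(b)                      # normalizing factor
    p_up = (0.5 * s**2 + h * max(b, 0.0)) / q
    p_down = (0.5 * s**2 + h * max(-b, 0.0)) / q
    dt = h**2 / q                              # interpolation time step
    return p_up, p_down, dt


def moment_errors(b, s, h):
    """Gap per unit time between chain moments and diffusion moments."""
    p_up, p_down, dt = upwind_chain(b, s, h)
    mean = h * (p_up - p_down)                 # E[increment]
    second = h**2 * (p_up + p_down)            # E[increment^2]
    return abs(mean - b * dt) / dt, abs(second - s**2 * dt) / dt
```

In this toy construction the first moment matches $b\,\Delta t$ up to rounding, and the gap in the second moment is $h|b|$ per unit time, so it vanishes as the space step $h$ goes to zero.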

We begin by localizing the problem on the bounded domain $[0,T]\times [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$, where $x_{{\rm max}}$, $p_{{\rm min}}$, and $p_{{\rm max}}$ are nonnegative constants. Then, we assume the following Neumann boundary conditions on the localized boundary

\begin{align*} \frac{\partial v}{\partial x}(t,0,p)& =\frac{\partial v}{\partial x}(t,x_{{\rm max}},p)= 0 ,\\ \frac{\partial v}{\partial p}(t,x,p_{{\rm min}}) & = \frac{\partial v}{\partial p}(t,x,p_{{\rm max}}) = 0. \end{align*}
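One common way to impose homogeneous Neumann conditions of this kind in a finite difference code is to add a layer of ghost nodes that copy the adjacent boundary values, so that the one-sided difference across the boundary vanishes. The sketch below illustrates this on a two-dimensional slice stored as a list of rows. The helper `pad_neumann` is hypothetical, and the source does not state how the conditions are implemented.

```python
def pad_neumann(values):
    """Add one layer of ghost nodes copying the nearest boundary value.

    `values` is a list of rows (one row per x node, one column per p node).
    The result has two more rows and two more columns than the input.
    """
    padded_rows = [[row[0]] + list(row) + [row[-1]] for row in values]
    return [list(padded_rows[0])] + padded_rows + [list(padded_rows[-1])]
```

With this padding, the difference between a ghost node and its neighbor on the boundary is zero, which is the discrete counterpart of a vanishing normal derivative.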

Let $\delta$, $h$, and $k$ be the discretization steps along the directions $t$, $x$, and $p$ respectively. For $(t,x,p)$ in the time-space grid

\begin{align*} {\mathcal G}_{\delta,h,k} & := \{t_i=(i-1)\delta, \; i=1,\ldots,n_t \}\times \{x_j=(j-1)h, \; j=1,\ldots,n_x\}\\ & \quad \times \{p_l=p_{{\rm min}}+(l-1)k, \; l=1,\ldots,n_p\}, \end{align*}

where $n_t=T/\delta +1$, $n_x=x_{{\rm max}}/h+1$, and $n_p=(p_{{\rm max}}-p_{{\rm min}})/k+1$, we consider approximations of the following form

\begin{align*} \frac{\partial v}{\partial t}(t,x,p)& \sim \frac{v (t+\delta,x,p) - v(t,x,p)}{\delta},\\ \frac{\partial v}{\partial x }(t,x,p)& \sim{\pm} \frac{v (t,x\pm h,p) - v(t,x,p)}{h},\\ \frac{\partial v}{\partial p}(t,x,p)& \sim{\pm}\frac{v (t,x,p\pm k) - v(t,x,p)}{k},\\ \frac{\partial^{2} v}{\partial x^{2}}(t,x,p)& \sim \frac{v (t,x+ h,p) +v (t,x- h,p)- 2v(t,x,p)}{h^{2}},\\ \frac{\partial^{2} v}{\partial p^{2}}(t,x,p)& \sim \frac{v (t,x,p+k) +v (t,x,p-k)- 2v(t,x,p)}{k^{2}},\\ \frac{\partial^{2} v}{\partial x\partial p}(t,x,p)& \sim \frac{2v(t,x,p) + v (t,x+h,p+k)+ v (t,x-h,p-k)}{2hk}\\ & \quad -\frac{v (t,x+ h,p)+v (t,x,p+k)+v (t,x-h,p)+v (t,x,p-k)}{2hk}. \end{align*}
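As a quick sanity check of these difference quotients, the following sketch evaluates some of them on a smooth test function whose derivatives are known in closed form. The test function `v` and the helper names are assumptions introduced only for illustration.

```python
import math


def v(t, x, p):
    """Smooth test function (an arbitrary choice for illustration)."""
    return math.exp(-t) * math.sin(x) * p**2


def d_x(t, x, p, h, sign=+1):
    """One-sided quotient in x: forward for sign=+1, backward for sign=-1."""
    return sign * (v(t, x + sign * h, p) - v(t, x, p)) / h


def d_pp(t, x, p, k):
    """Centered second difference in p."""
    return (v(t, x, p + k) + v(t, x, p - k) - 2 * v(t, x, p)) / k**2


def d_xp(t, x, p, h, k):
    """Mixed second difference built from the seven neighboring nodes."""
    plus = 2 * v(t, x, p) + v(t, x + h, p + k) + v(t, x - h, p - k)
    minus = (v(t, x + h, p) + v(t, x, p + k)
             + v(t, x - h, p) + v(t, x, p - k))
    return (plus - minus) / (2 * h * k)
```

A second-order Taylor expansion shows why the mixed quotient works: the $h^{2}\partial^{2}_{x}$ and $k^{2}\partial^{2}_{p}$ contributions cancel between the two groups of terms, leaving $2hk\,\partial^{2}_{xp}v$ in the numerator.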

Let us introduce the following quantities which are used to approximate the value functions $v_0$ and $v_1$

\begin{align*} \eta_x(x,a)& :=\eta x(\lambda-x)-a ,\\ Q^{\delta,h,k}(x,p,a) & :=1+\frac{|\eta_x(x,a) |\delta}{h} +\frac{\mu(x)p\delta}{k}+\frac{\gamma^{2}x^{2}\delta}{h^{2}}+\frac{\sigma^{2}p^{2}\delta}{k^{2}}-\frac{\rho\sigma\gamma xp\delta}{hk} ,\\ \Delta t^{\delta,h,k}(x,p,a) & :=\frac{\delta}{Q^{\delta,h,k}(x,p,a)}. \end{align*}
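The three quantities above translate directly into code. The sketch below is a minimal transcription, assuming the parameter values and the drift function $\mu(x)$ of the data set used in the numerical interpretations; the function and constant names are illustrative.

```python
import math

# Parameter values taken from the data set of the numerical interpretations.
ETA, LAM, GAMMA_VOL, MU, SIGMA, RHO = 0.7, 0.5, 0.2, 0.1, 0.1, 0.01


def mu_of_x(x):
    """Price drift function mu(x) = mu + 0.5 * exp(-0.2 x)."""
    return MU + 0.5 * math.exp(-0.2 * x)


def eta_x(x, a):
    """Drift of the resource under harvest rate a: eta * x * (lambda - x) - a."""
    return ETA * x * (LAM - x) - a


def q_factor(x, p, a, delta, h, k):
    """Normalizing factor Q^{delta,h,k}(x, p, a)."""
    return (1.0
            + abs(eta_x(x, a)) * delta / h
            + mu_of_x(x) * p * delta / k
            + GAMMA_VOL**2 * x**2 * delta / h**2
            + SIGMA**2 * p**2 * delta / k**2
            - RHO * SIGMA * GAMMA_VOL * x * p * delta / (h * k))


def delta_t(x, p, a, delta, h, k):
    """Interpolation time step Delta t^{delta,h,k}(x, p, a) = delta / Q."""
    return delta / q_factor(x, p, a, delta, h, k)
```

For instance, `delta_t(0.7, 0.5, 0.25, 1 / 99, 1 / 39, 1 / 39)` gives the time step at an interior node for steps consistent with $n_t=100$ and $n_x=n_p=40$; since $Q$ exceeds one there, the step is smaller than $\delta$.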

In Table 1, we define the Markov chain states and the associated transition probabilities that we obtain when we apply the finite difference approach.

TABLE 1. The approximating Markov chain.

Thus, using the above notations and discretizing the space of controls as follows

$$\{0,\ldots,\bar{a}\} := \{a=(m-1)\bar{a}/(n_a-1), \; m=1,\ldots,n_a\},$$

where $n_a\in \mathbb {N}^{*}$, we approximate the HJB equations associated to the functions $v_0$ and $v_1$ for any $(x,p) \in [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$ by the following iterative scheme by starting with $v_0^{\delta,0}\equiv 0$ and $v_1^{\delta,0}\equiv 0$

$$v_0^{\delta,n+1}(t,x,p)=\frac{1}{1+\beta\Delta t^{\delta,h,k}(x,p,0)}\sum_{i=1}^{7}\pi_i(x,p,0)v_0^{\delta,n}(z_i) ,$$

for $t\in [0,T]- \{T_j\}_{1 \leq j \leq N}$,

$$v_0^{\delta,n+1}(T_j-\delta,x,p) = v^{\delta,n}_0(T_j,x,p){\mathbb 1}_{x \leq \Gamma} + v^{\delta,n}_1(T_j,x,p){\mathbb 1}_{x\gt \Gamma} ,$$

for $j\in \{1,\ldots,N-1\}$,

$$v_0^{\delta,n+1}(T_N-\delta,x,p) ={-}f((\Gamma -x)^{+},p)$$

and

$$v_1^{\delta,n+1}(t,x,p)=\max_{\{0, \ldots ,\bar{a}\}}\left\{\frac{\sum_{i=1}^{7}\pi_i(x,p,a)v_1^{\delta,n}(z_i)+(pa-C(a))\Delta t^{\delta,h,k}(x,p,a)}{1+\beta\Delta t^{\delta,h,k}(x,p,a)}\right\},$$

for $t\in [0,T]- \{T_j\}_{1 \leq j \leq N}$,

$$v_1^{\delta,n+1}(T_j-\delta,x,p) =v^{\delta,n}_0(T_j,x,p){\mathbb 1}_{x \leq \Gamma} + v^{\delta,n}_1(T_j,x,p){\mathbb 1}_{x\gt \Gamma},$$

for $j\in \{1,\ldots,N-1\}$,

$$v_1^{\delta,n+1}(T_N-\delta,x,p) ={-}f((\Gamma -x)^{+},p).$$
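The building blocks of this scheme can be sketched as follows, under stated assumptions. The transition probabilities of Table 1 are not reproduced here, so the expected next value $\sum_i \pi_i v_1(z_i)$ and the time step $\Delta t$ are passed in as caller-supplied functions. The cost $C(a)=a^{2}$, the penalty $f(x,p)=\kappa p x$, and the constants are those of the data set of Section 4.2; all function names are hypothetical.

```python
KAPPA, GAMMA, BETA = 5.0, 0.2308, 0.1   # values from Section 4.2


def control_grid(a_bar=0.5, n_a=10):
    """Discretized harvest rates 0, ..., a_bar."""
    return [m * a_bar / (n_a - 1) for m in range(n_a)]


def v1_update(p, controls, expected_next, time_step):
    """One maximization step of the scheme for v_1 at a grid node.

    expected_next(a) stands for sum_i pi_i(x, p, a) * v_1(z_i) and
    time_step(a) for Delta t(x, p, a); both are supplied by the caller.
    Returns the maximal value and a maximizing control.
    """
    best_value, best_a = -float("inf"), None
    for a in controls:
        dt = time_step(a)
        value = (expected_next(a) + (p * a - a**2) * dt) / (1.0 + BETA * dt)
        if value > best_value:
            best_value, best_a = value, a
    return best_value, best_a


def value_before_check(v0_at_check, v1_at_check, x):
    """Value just before an intermediate check date T_j, in either regime.

    Harvesting is allowed after the check (value v_1) only if the stock
    exceeds Gamma; otherwise the no-harvest value v_0 applies.
    """
    return v1_at_check if x > GAMMA else v0_at_check


def terminal_value(x, p):
    """Value just before T_N: -f((Gamma - x)^+, p) with f(x, p) = kappa p x."""
    return -KAPPA * p * max(GAMMA - x, 0.0)
```

The update for $v_0$ is the same formula with the single control $a=0$, so no maximization is needed in that regime.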

For any $(x,p) \in [0,x_{{\rm max}}]\times [p_{{\rm min}},p_{{\rm max}}]$, the above iterative scheme combined with the boundary conditions is explicit and fully implementable on the enlarged grid

\begin{align*} {\mathcal G}_{\delta,h,k}^{+} & :=\{t_i=(i-1)\delta, \; i=1,\ldots,n_t \}\times \{x_j=(j-1)h, \; j=0,\ldots,n_x+1\}\\ & \quad \times \{p_l=p_{{\rm min}}+(l-1)k, \; l=0,\ldots,n_p+1\} \end{align*}

with a given stopping criterion $\varepsilon$, that is, the iterative scheme is stopped when the relative error is less than $\varepsilon$.
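A generic fixed-point loop with such a stopping rule might look like the sketch below. The text does not specify how the relative error is measured, so the sup-norm relative change between successive iterates used here is an assumption, as are the names `relative_error` and `iterate`.

```python
def relative_error(new, old, floor=1e-12):
    """Largest absolute change between two iterates, relative to max |new|."""
    num = max(abs(a - b) for a, b in zip(new, old))
    den = max(max(abs(a) for a in new), floor)   # floor avoids division by zero
    return num / den


def iterate(update, v_init, eps=0.01, max_iter=10_000):
    """Apply `update` until the relative error falls below eps.

    `update` maps a flat list of grid values to the next iterate.
    Returns the last iterate and the number of iterations performed.
    """
    v = list(v_init)
    for n in range(1, max_iter + 1):
        v_new = update(v)
        if relative_error(v_new, v) < eps:
            return v_new, n
        v = v_new
    raise RuntimeError("no convergence within max_iter iterations")
```

As a toy usage example, `iterate(lambda v: [0.5 * x + 1.0 for x in v], [0.0, 0.0])` applies a contraction with fixed point 2 and stops once successive iterates differ by less than one percent.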

Remark 4.1 Since the first and the second moments of the Markov chain defined in Table 1 converge to those of the continuous process $(X,P)$ as the time and space steps go to zero, the convergence of our scheme may be obtained using the same analysis as developed in [9].

4.2. Numerical interpretations

The numerical computations are done using the following set of data.

• Dynamics values
  ◦ $\eta =0.7$, $\lambda =0.5$, $\gamma =0.2$, $\mu =0.1$, $\sigma =0.1$, $\rho =0.01$.
  ◦ $T=1,\quad \beta =0.1$.
  ◦ Drift function: $\mu (x)=\mu + 0.5 \times \exp (-0.2 x)$.
  ◦ Penalty function: $f(x,p)=\kappa px$ with $\kappa =5$.
  ◦ Cost function: $C(x)=x^{2}$.
  ◦ Regulation parameters: $N=10\text { (number of checks)}, \quad \Gamma =0.2308$.
• Grid values
  ◦ Localization: $x_{{\rm max}}=1,\quad p_{{\rm min}}=0.1, \quad p_{{\rm max}}=1.1, \quad \bar {a}=0.5$.
  ◦ Discretization: $n_x=40$, $n_p=40, \quad n_t=100, \quad n_a=10$.
  ◦ Stopping criterion: $\varepsilon =0.01$.
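The grid values above determine the discretization steps through the relations $n_t=T/\delta+1$, $n_x=x_{{\rm max}}/h+1$, and $n_p=(p_{{\rm max}}-p_{{\rm min}})/k+1$. The following sketch collects them in a small configuration object and builds the three grid axes; the class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridConfig:
    """Localization and discretization values of the data set above."""
    T: float = 1.0
    x_max: float = 1.0
    p_min: float = 0.1
    p_max: float = 1.1
    n_t: int = 100
    n_x: int = 40
    n_p: int = 40

    @property
    def delta(self):
        """Time step, from n_t = T / delta + 1."""
        return self.T / (self.n_t - 1)

    @property
    def h(self):
        """Step in x, from n_x = x_max / h + 1."""
        return self.x_max / (self.n_x - 1)

    @property
    def k(self):
        """Step in p, from n_p = (p_max - p_min) / k + 1."""
        return (self.p_max - self.p_min) / (self.n_p - 1)


def build_axes(cfg):
    """Return the three axes (t_i), (x_j), (p_l) of the grid as lists."""
    t_axis = [i * cfg.delta for i in range(cfg.n_t)]
    x_axis = [j * cfg.h for j in range(cfg.n_x)]
    p_axis = [cfg.p_min + l * cfg.k for l in range(cfg.n_p)]
    return t_axis, x_axis, p_axis
```

With the default values this gives $\delta = 1/99$ and $h = k = 1/39$.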

Remark 4.2 The choice of the parameter values is arbitrary since we study a general natural resource exploitation model. Nevertheless, our numerical algorithm could easily be adapted to any specific model, for instance fishery or forest management, and parameterized by values estimated from real samples.

We plot the shape of the value functions $v_0$ and $v_1$ sliced in the plane $(x, p)$ for a fixed date $t$. We can see, as expected, that $v_1\geq v_0$. In fact, the third row of Figure 1 shows that the spread $v_1 - v_0$ is always positive: if the manager can harvest, the payoff is greater. In the first graph of the second row of Figure 1, where we fix $(t,p)$, the two functions are nondecreasing w.r.t. $x$. This is natural, because the greater the size of the natural resource, the more the manager can harvest and the less he/she is penalized at terminal time $T$. In the second graph of the second row of Figure 1, where we fix $(t,x)$ and take $x\gt\Gamma$, the two functions are nondecreasing in $p$, because the higher the price, the wealthier the manager becomes when he/she harvests and sells. Conversely, in the first row of Figure 1, when $x\lt\Gamma$, the value functions $v_0$ and $v_1$ are nonincreasing in $p$, which is due to the penalty function $f$ being nondecreasing w.r.t. $p$ (i.e., the higher the price, the more the manager is penalized by the regulator).

FIGURE 1. The shape of the value functions $v_0$ and $v_1$ for a fixed time $t$ (First Line). The shape of the value functions $v_0$ and $v_1$ for fixed $(t,p)$ and $(t,x)$ (Second Line). The spread between the value functions $v_0$ and $v_1$ for fixed $(t,p)$ and $(t,x)$ (Third Line).

We plot the shape of the value functions $v_0$ and $v_1$ sliced in the plane $(t, x)$ for a fixed price $p$. In Figure 2, as expected, $v_0$ and $v_1$ are decreasing w.r.t. $t$ when $x$ is large: the further the manager is from the terminal date $T$, the more time he/she has to harvest. If $x$ is small, the monotonicity depends on the parameters $\mu$ and $\eta$. On the one hand, when $x$ is small, we can see that $v_0$ and $v_1$ are increasing w.r.t. $t$ because the manager knows a priori that he/she is going to pay the tax, since the quantity of natural resource will likely be smaller than $\Gamma$ at the terminal date $T$. On the other hand, if $t$ is small, the quantity of the natural resource increases with time, but so does the price, so the monotonicity of the value functions is not obvious. In fact, because $\mu$ dominates $\eta$, the price in this case increases faster than the quantity of natural resource, so a long time to maturity is not beneficial for the manager because the tax will be larger.

FIGURE 2. The shape of the value functions $v_0$ and $v_1$ for a fixed price $p$.

In Figure 3, we choose a smaller drift for the price. We remark that the monotonicity of the value functions $v_0$ and $v_1$ is the same as in the previous figure if $x$ is large. But now, the value functions $v_0$ and $v_1$ are decreasing w.r.t. $t$ when $x$ is small. This is because, unlike in Figure 2, the drift of the price is small, so the price increases more slowly than the available quantity of natural resource. Thus, a long time to maturity is advantageous for the manager: the tax will be smaller since the quantity of natural resource available has had enough time to grow.

FIGURE 3. The shape of the value functions $v_0$ and $v_1$ for a fixed price $p$ when $\mu (x)=0.01+0.05\times \exp (-0.002x)$.

We plot the shape of the optimal harvest strategy $\alpha ^{*}(t,x,p)$ sliced in the plane $(x,t)$ for a fixed price $p$ (Figure 4). We can see two main regions: a harvest region (with different harvest rates) and a no-harvest region (dark blue). When we are far from maturity $T$, it is not optimal to harvest when $X$ is under $\lambda$: under this level the resource quantity increases naturally, so it is better to let it grow in order to reach the maturity with $X$ greater than $\Gamma$ and thus avoid the penalization. As we get closer to $T$, it is best to harvest when $X\lt\lambda$ with different rates $a$, which allows the manager to reach $T$ without being penalized (i.e., $X_T\gt\Gamma$) while optimizing the profit generated by selling the harvest. The rates are greater as the natural resource population grows, and this is due to the cost function $C$ (the cost of harvesting).

FIGURE 4. The shape of the optimal control for a fixed price $p$.

In the following, we introduce a profit and loss measure, denoted P&L, defined as follows

$$\text{P\&L}(\alpha) = \sum_{i=0}^{N-1}\sum_{t\in\mathcal{T}^{i}} e^{\beta (T-t)} (P^{\alpha}_{t}\alpha_{t}-\alpha_{t}^{2}) \delta\mathbb{1}_{X_{T_i}\gt\Gamma} - f((\Gamma - X^{\alpha}_T)^{+},P^{\alpha}_T),$$

where $\mathcal {T}^{i}:=\{T_i,T_i+\delta,\ldots,T_{i+1}\}$ with the convention $T_0=0$. In fact, this measure represents the payoff of the agent when adopting a given strategy $\alpha$. This measure will allow us to compare the effectiveness of the optimal control $\alpha ^{*}$ against a given naive strategy.
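The P&L of a single simulated path can be evaluated along the lines of the sketch below. It follows the displayed formula with two labeled assumptions: each period is treated as the half-open interval $[T_i,T_{i+1})$ so that no grid date is counted twice, and the penalty is $f(x,p)=\kappa p x$ as in the data set of Section 4.2. The function name and argument layout are hypothetical.

```python
import math


def profit_and_loss(times, xs, ps, alphas, check_times, delta,
                    beta=0.1, gamma=0.2308, kappa=5.0):
    """Discrete P&L of one path sampled on times[0] = 0, ..., times[-1] = T.

    alphas[n] is the harvest rate applied on [times[n], times[n + 1]).
    check_times are the dates T_0 = 0, T_1, ..., T_{N-1}, assumed on the grid.
    """
    horizon = times[-1]
    total, allowed, j = 0.0, False, 0
    for n in range(len(times) - 1):
        # At a check date, harvesting is authorized only if X > Gamma.
        if j < len(check_times) and times[n] >= check_times[j] - 1e-12:
            allowed = xs[n] > gamma
            j += 1
        if allowed:
            a = alphas[n]
            growth = math.exp(beta * (horizon - times[n]))
            total += growth * (ps[n] * a - a**2) * delta
    # Terminal penalty f((Gamma - X_T)^+, P_T) with f(x, p) = kappa * p * x.
    total -= kappa * ps[-1] * max(gamma - xs[-1], 0.0)
    return total
```

Averaging this quantity over many simulated paths gives a Monte Carlo estimate of the P&L of a strategy.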

After computing the optimal strategy $\alpha ^{*}$ and the value function via the iterative procedure, we simulate the correlated Brownian motions $B$ and $W$ on the horizon $[0,T]$ and we adjust the dynamics of $X$ and $P$ according to the optimal control computed previously. Figure 5 represents a mean over 3,000 simulated paths of $X$ and $P$ controlled by the optimal strategy $\alpha ^{*}$ and two other naive strategies $\alpha ^{1}$ and $\alpha ^{2}$. The first naive strategy $\alpha ^{1}$ consists in harvesting at the maximum rate $\bar {a}$ at all times when harvesting is authorized, until $T$ (in red). The second one, $\alpha ^{2}$, consists in waiting until a certain time $t_0$ chosen by the manager (in Figure 5 we take $t_0=0.5$) and then harvesting at the maximum rate $\bar {a}$ when authorized, until time $T$ (in green). In Figure 5, the starting point is $X_0=0.7$ and $P_0=0.5$ and the P&Ls of the three strategies are respectively (with $95\%$ confidence level bounds): $\text{P\&L}(\alpha ^{*})= 0.0873$ $(\pm 0.0002)$, $\text{P\&L}(\alpha ^{1})= -0.0182$ $(\pm 0.0024)$, and $\text{P\&L}(\alpha ^{2})=0.0315$ $(\pm 0.0005)$. We can see that our computed strategy is better than the two others. Indeed, with our strategy, the manager begins to harvest continuously at a rate $a$ smaller than the maximum $\bar a$, which allows him/her to attain terminal date $T$ with a natural resource population above $\Gamma$, avoiding the penalization that occurs if $X_T\lt\Gamma$. On the one hand, adopting the strategy $\alpha ^{1}$, the manager harvests more but he/she is penalized because the quantity of resource is under $\Gamma$ at the terminal time $T$. On the other hand, using the strategy $\alpha ^{2}$, he/she is not penalized because the quantity of resource at time $T$ is above $\Gamma$, but he/she harvests for a shorter period of time (starting the harvesting at $t_0=0.5$), which results in lower revenue.
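A path simulation under the naive strategies can be sketched with an Euler scheme. Several elements below are assumptions rather than statements of the source: the dynamics $dX = (\eta X(\lambda - X) - \alpha)\,dt + \gamma X\,dB$ and $dP = \mu(X)P\,dt + \sigma P\,dW$ are inferred from the drift and diffusion terms appearing in the scheme, the check dates are taken equally spaced, and the stock is truncated at zero. The function name is hypothetical; setting `t0=0` gives $\alpha^{1}$ and `t0>0` gives $\alpha^{2}$.

```python
import math
import random

# Parameter values from Section 4.2; the SDE form itself is an assumption.
ETA, LAM, GAM, MU, SIGMA, RHO = 0.7, 0.5, 0.2, 0.1, 0.1, 0.01
T, GAMMA, A_BAR, N_CHECKS = 1.0, 0.2308, 0.5, 10


def simulate_naive(x0, p0, t0=0.0, n_steps=100, seed=0):
    """Euler path of (X, P) under 'harvest a_bar after t0 when authorized'."""
    rng = random.Random(seed)
    dt = T / n_steps
    steps_per_check = n_steps // N_CHECKS   # assumes equally spaced checks
    x, p, allowed = x0, p0, False
    xs, ps, alphas = [x0], [p0], []
    for n in range(n_steps):
        if n % steps_per_check == 0:        # check date: update authorization
            allowed = x > GAMMA
        a = A_BAR if (allowed and n * dt >= t0) else 0.0
        db = rng.gauss(0.0, math.sqrt(dt))
        dz = rng.gauss(0.0, math.sqrt(dt))
        dw = RHO * db + math.sqrt(1.0 - RHO**2) * dz   # corr(dB, dW) = RHO
        mu_x = MU + 0.5 * math.exp(-0.2 * x)
        x_next = x + (ETA * x * (LAM - x) - a) * dt + GAM * x * db
        p = p + mu_x * p * dt + SIGMA * p * dw
        x = max(x_next, 0.0)                # keep the stock nonnegative
        xs.append(x)
        ps.append(p)
        alphas.append(a)
    return xs, ps, alphas
```

Repeating the simulation with different seeds and averaging the resulting paths mimics the construction of the mean trajectories shown in the figures, up to the assumptions listed above.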

FIGURE 5. The optimal control $\alpha ^{*}$ vs naive controls $\alpha ^{1}$ and $\alpha ^{2}$.

As in Figure 5, Figure 6 represents a mean over 3,000 simulated paths of $X$ and $P$ for the optimal control and the same two other naive strategies. But this time we choose to start with $X_0=0.3$ and $P_0=0.5$. The P&Ls of the three strategies are respectively (with $95\%$ confidence level bounds): $\text{P\&L}(\alpha ^{*})=0.0302$ $(\pm 0.0008)$, $\text{P\&L}(\alpha ^{1})= -0.1032$ $(\pm 0.0027)$, and $\text{P\&L}(\alpha ^{2})=-0.0331$ $(\pm 0.0022)$. We can see that our strategy is still better than the two others. On the one hand, starting at time $t=0$ from a position under the threshold $\lambda$, the natural resource population tends to increase (mean-reverting effect); hence, as we can see in Figure 6, our optimal strategy is to wait until $X$ reaches a certain level over $\Gamma$ before starting to harvest (around time $t=0.2$). Doing this allows the manager to avoid the risk of being under the penalization barrier $\Gamma$ at time $T$. On the other hand, using the first naive strategy $\alpha ^{1}$, the natural resource population quickly falls under $\Gamma$ and, at time $t=0.2$, the regulator does not allow the manager to harvest anymore. Moreover, the manager is penalized as the natural resource population does not surpass $\Gamma$ at time $T$. On the contrary, if we wait up to time $t_0=0.75$ before starting to harvest at the maximum rate $\bar {a}$ (the naive strategy $\alpha ^{2}$), the natural resource population grows (since $X_t\lt\lambda$) until time $t_0=0.75$ and then decreases (because the harvesting starts at this date) but remains above $\Gamma$ at time $T$. Hence, we are not penalized, but our optimal strategy $\alpha ^{*}$ still outperforms $\alpha ^{2}$ in terms of P&L.
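Bounds of this type are commonly obtained from the sample mean and standard deviation of the simulated P&Ls. The sketch below uses the usual normal approximation, mean $\pm\, 1.96\, s/\sqrt{n}$; that the reported bounds were computed exactly this way is an assumption, and the function name is illustrative.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Sample mean and half-width of a normal-approximation 95% interval."""
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width
```

Applied to the list of P&Ls of the 3,000 simulated paths, the function returns the estimate and the half-width to report after the $\pm$ sign.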

FIGURE 6. The optimal control $\alpha ^{*}$ vs naive controls $\alpha ^{1}$ and $\alpha ^{2}$ ($X_0=0.3$ and $P_0=0.5$).

In Table 2, we choose to compute our optimal strategy and the corresponding simulations for $\Gamma =0.4872\gt\lambda =0.3$. With this configuration, we know that if the natural resource population drops under $\Gamma$, it is likely to stay under $\Gamma$. Therefore, the manager is obliged to keep the natural resource population $X$ over $\Gamma$ at all times in order to avoid the penalization at the maturity $T$. Except for $\Gamma$ and $\lambda$, we use the same parameter values as defined at the beginning of this numerical part, and we report in Table 2 the value of the P&L and the natural resource population $X$ at time $T$ starting with $X_0=0.7$ and $P_0=0.5$. These quantities were computed using a mean over 3,000 trajectories under the optimal control $\alpha ^{*}$ for different values of the penalty constant $\kappa$ (with $95\%$ confidence level bounds). We can see that the P&L is a decreasing function w.r.t. $\kappa$, which is natural because the less the manager is penalized, the more risks he/she takes and the richer he/she is. However, we remark that for $\kappa =1$ the natural resource population at time $T$ is under $\Gamma$: the penalization is not severe enough, so the manager prefers to be penalized and to harvest a little more, which makes him/her richer. Therefore, for our set of data, to create a fair balance between the biological requirements and the maximization of the profit induced by harvesting, we consider that a suitable choice for the penalty constant is $\kappa =2$. This amount of fines ensures the double objective of sustainable harvesting: the natural resource population does not fall below a certain threshold that guarantees its natural renewal, and the manager makes enough profit to avoid going bankrupt.

TABLE 2. P&L for different penalty constant $\kappa$.

5. Conclusion

In this paper, we have investigated the problem of sustainable harvesting of a natural resource. We built a model where harvesting depends continuously on the quantity of resource available in the harvesting region. The manager tries to maximize his/her profit under the constraint of fines when the quota is exceeded. We have also introduced the fact that the selling price of the natural resource depends on the quantity (stock) of resource remaining in the harvesting region.

We have shown some interesting results. First, we derived an optimal strategy from a verification theorem. We then numerically observed that this optimal strategy provides a better gain than naive ones. Second, we identified a level of fines which ensures the double objective of sustainable harvesting: on the one hand, the natural resource population stays above a certain threshold ensuring its natural renewal; on the other hand, the manager is free to attain a certain level of harvesting allowing acceptable profits. These results give a better understanding of the manager's behavior according to the amount of the fines, and of how to define a pricing rule for the fines to guarantee sustainable harvesting.

References

[1] Bruder, B. & Pham, H. (2009). Impulse control on finite horizon with execution delay. Stochastic Processes and their Applications 119: 1436–1469.

[2] Budhiraja, A. & Ross, K. (2007). Convergent numerical scheme for singular stochastic control with state constraints in a portfolio selection problem. SIAM Journal on Control and Optimization 45(6): 2169–2206.

[3] Clark, C.-W. & Reed, W.-J. (1989). The tree-cutting problem in a stochastic environment: The case of age-dependent growth. Journal of Economics & Management 2: 92–106.

[4] Conrad, C.-W. & Clark, C.-W. (1987). Natural resource economics: notes and problems. New York: Cambridge University Press.

[5] Danielsson, A. (2002). Efficiency of catch and effort quotas in the presence of risk. Journal of Environmental Economics and Management 43: 20–33.

[6] Hindy, A., Huang, C., & Zhu, H. (1993). Numerical analysis of a free-boundary singular control problem in financial economics. Journal of Economic Dynamics and Control 21: 297–327.

[7] Jin, Z., Yin, G., & Zhu, C. (2012). Numerical solutions of optimal risk control and dividend optimization policies under a generalized singular control formulation. Automatica 48(8): 1489–1501.

[8] Kharroubi, I., Lim, T., & Ly Vath, V. (2019). Optimal exploitation of a resource with stochastic population dynamics and delayed renewal. Journal of Mathematical Analysis and Applications 477: 627–656.

[9] Kushner, H. & Dupuis, P. (2001). Numerical methods for stochastic control problems in continuous time, 2nd ed. Stochastic Modelling and Applied Probability, Vol. 24. New York: Springer.

[10] Kvamsdal, S.-F., Poudel, D., & Sandal, L.-K. (2016). Harvesting in a fisheries with stochastic growth and a mean-reverting price. Environmental and Resource Economics 63(3): 643–663.

[11] Murillas, A. & Chamorro, J.-M. (2006). Valuation and management of fishing resources under price uncertainty. Environmental & Resource Economics 33: 39–71.

[12] Nostbakken, L. (2006). Regime switching in a fisheries with stochastic stock and price. Journal of Environmental Economics and Management 51(2): 231–241.

[13] Nostbakken, L., Thébaud, O., & Sorensen, L.-C. (2011). Investment behaviour and capacity adjustment in fisheries: A survey of the literature. Marine Resource Economics 26: 95–117.

[14] Pella, J.-J. & Tomlinson, P.-K. (1969). A generalized stock production model. Bulletin Inter-American Tropical Tuna Commission 13: 421–496.

[15] Reed, W.-J. & Clark, H.-R. (1990). Harvest decisions and asset valuation for biological resources exhibiting size-dependent stochastic growth. International Economic Review 31: 147–169.

[16] Reed, W.-J. & Heras, E. (1992). The conservation and exploitation of vulnerable resources. Bulletin of Mathematical Biology 54: 185–207.

[17] Sarkar, S. (2009). Optimal fisheries harvesting rules under uncertainty. Resource and Energy Economics 31: 272–286.

[18] Schaefer, M.B. (1954). Some aspects of the dynamics of populations important to the management of commercial marine fisheries. Bulletin Inter-American Tropical Tuna Commission 1: 23–56.

[19] Skiadas, C.H. (2010). Exact solutions of stochastic differential equations: Gompertz, generalized logistic and revised exponential. Methodology and Computing in Applied Probability 12: 261–270.

[20] Weitzman, M.-L. (2002). Landing fees vs harvest quotas with uncertain fish stocks. Journal of Environmental Economics and Management 43: 325–338.