On the development and analysis of coupled surface–subsurface models of catchments. Part 1. Analysis of dimensions and parameters for UK catchments

Piotr Morawiecki; Philippe H. Trinh

doi:10.1017/jfm.2023.1031

On the development and analysis of coupled surface–subsurface models of catchments. Part 1. Analysis of dimensions and parameters for UK catchments

Published online by Cambridge University Press: 11 March 2024

Piotr Morawiecki

and

Philippe H. Trinh

Show author details

Piotr Morawiecki*: Affiliation:
Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK
Philippe H. Trinh*: Affiliation:
Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK
*: †Email addresses for correspondence: piotr.morawiecki@bath.edu, p.trinh@bath.ac.uk
†Email addresses for correspondence: piotr.morawiecki@bath.edu, p.trinh@bath.ac.uk

Article contents

Abstract
Introduction
Fundamentals of catchment modelling
Data sources and processing methods
Further investigation of distribution and correlation
Discussion
Funding
Declaration of interests
References

Rights & Permissions

Abstract

The objective of this three-part work is to formulate and rigorously analyse a number of reduced mathematical models that are nevertheless capable of describing the hydrology at the scale of a river basin (i.e. catchment). Coupled surface and subsurface flows are considered. In this first part, we identify and analyse the key physical parameters that appear in the governing formulations used within hydrodynamic rainfall–runoff models. Such parameters include those related to catchment dimensions, topography, soil and rock properties, rainfall intensities, Manning's coefficients and river channel dimensions. Despite the abundance of research that has produced data sets describing properties of specific river basins, there have been few studies that have investigated the ensemble of typical scaling of key physical properties; these estimates are needed to perform a proper dimensional analysis of rainfall–runoff models. Therefore, in this work, we perform an extensive analysis of the parameters; our results form a benchmark and provide guidance to practitioners on the typical parameter sizes and interdependencies. Crucially, the analysis is presented in a fashion that can be reproduced and extended by other researchers and, wherever possible, uses publicly available data sets for catchments in the UK.

JFM classification

Geophysical and Geological Flows: River dynamics Geophysical and Geological Flows: Shallow water flows

Type: JFM Papers
Information: Journal of Fluid Mechanics , Volume 982 , 10 March 2024 , A28

DOI: https://doi.org/10.1017/jfm.2023.1031 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2024. Published by Cambridge University Press.

1. Introduction

A key problem in hydrology concerns the prediction of water flow into the river. Here, the basic geometry considered is a drainage basin or a catchment site, defined by the area of land from which flow induced by precipitation will reach a specified outlet. As one would expect, the complete physics of this fluid transport problem involves a host of complex multiscale effects related to the precise geological, ecological, urban and climate-related features of the environment.

The objective of this work, divided into three parts, is the formulation and rigorous analysis of a number of relatively simple mathematical models that, following the basic assumptions of coupled surface–subsurface models, are nevertheless capable of describing the hydrology at the scale of a typical catchment site. In particular, we wish to study the following questions.

(i) What is a relatively simple mathematical model that describes catchment-scale dynamics of subsurface and surface flow? What are the key non-dimensional quantities that govern the physics of such phenomena? What scaling laws can be predicted based on asymptotic analysis of these models?
(ii) Are these reduced models justifiable given available data of catchments in the UK? What are the typical parameter values to use in such models?

The above forms a set of general questions of interest. We may pose much more specific ones of a fluid mechanical origin. For instance: given a typical catchment geometry in the UK with typical length scales, typical terrain slope, typical soil conductivity and so forth, what are the predicted scaling laws characterising the flow rates into the river during a rainfall of typical intensity and duration?

One challenge is to define the notion of what constitutes a ‘typical’ parameter; this concerns the second question, (ii), we have asked above.

We wish to study the behaviour of mathematical models and to do so, we require some notion of reasonable parameter values. However, in the context of hydrology, determination of typical parameters is not always possible: real-world catchments are characterised by a wide range of shapes and topography, with hydraulic properties that are highly heterogeneous and may significantly vary across different scales (see discussion by Clauser (Reference Clauser1992)). With such high natural variability in catchment properties, it is not possible to formulate one benchmark scenario that reflects the properties of all existing catchments. In this three-part work, our overarching aim to formulate a mathematical model of a catchment that oversimplifies the complexity of real-world catchments, but is characterised by plausible/typical physical parameters.

The task of the present paper is to study this issue of parameter values. We wish to obtain the parameter choices in a transparent way via the statistical analysis of real-world data or via references to other works. In general, we shall estimate the median and interquartile range of each parameter considered for a wide range of UK catchments, and then study their correlations and dependencies. Whenever possible, we use publicly available spatial data. However, some parameters (e.g. soil parameters, such as hydraulic conductivity) are hard to measure (or average) at the catchment scale; in these cases, we based our estimates on existing models or individual experimental case studies.

1.1. On mathematical and computational models

We have observed that in standard practices of industrial hydrological research on coupled surface–subsurface flows at the catchment scale, the emphasis is often on obtaining the exact prediction of flow quantities given available data at specific catchments (see e.g. the review by Furman (Reference Furman2008) and references therein). While this approach allows site-specific predictions, there seems to have been less work on the study of the general properties of coupled surface–subsurface flow models. This is a challenging task because of the complexity of real-world catchments, which are characterised by multiple temporal and spatial scales.

We are interested in the development of minimal mathematical catchment models that are simple enough to allow the development of universal scaling laws, but at the same time, are complex enough to represent the key physical processes characterising real-world systems. The study of such fundamental properties for simple benchmark models may help in understanding the limitations of such models when applied to the real world catchments – beyond what can be learned from standard numerical approaches.

We shall present a more thorough literature review of mathematical and computational models in Part 2, but here we review some relevant strands of investigation. First, there are classic references concerning both subsurface flow (by e.g. Bear (Reference Bear1972) and Anderson, Woessner & Hunt (Reference Anderson, Woessner and Hunt2015)) and surface flow (by e.g. Chow (Reference Chow1959) and Chow & Ben-Zvi (Reference Chow and Ben-Zvi1973)). These texts introduce equations of e.g. Boussinesq, Richards, Darcy, Saint-Venant and so forth. However, the theories in many of these classical references are not so easily adapted to direct comparison with statistical catchment data in a given location (such as the UK). Their representation is typically one-dimensional (1-D), is constrained in simplified domains and without explicit specification of parameters; the analysis is also qualitative and presents a generalised theory of physics-based modelling. The result, however, is that it is not at all obvious how the required scaling laws, raised above in the Introduction, can be derived from these isolated theories.

Generally, modern implementations of the fundamental theory of surface and subsurface flow do not treat the governing equations in isolation – that is to say, as applied to a simplified mathematical model of a catchment site. Instead, they often take the form of extensive three-dimensional (3-D) computational models and software (see e.g. the introduction by Beven (Reference Beven2011) and reviews by Shaw et al. (Reference Shaw, Beven, Chappell and Lamb2010) and Blöschl (Reference Blöschl2006)). In the computational era, this approach has led to the development of codes such as TOP Model (Beven & Kirkby Reference Beven and Kirkby1977), MIKE SHE (Abbott et al. Reference Abbott, Bathurst, Cunge, O'Connell and Rasmussen1986), HydroGeoSphere (Brunner & Simmons Reference Brunner and Simmons2012), ParFlow (Kollet & Maxwell Reference Kollet and Maxwell2006), OpenGeoSys (Kolditz et al. Reference Kolditz2012) and many others. One of our core questions is whether the typical output of a large-scale physics-based model can be explained by a simplified fluid mechanical model.

In addition to physics-based models, many modern references have tended towards statistical or phenomenological modelling. These approaches include predictions of flow rates based on statistical methods, such as multidimensional linear regression (Calver, Stewart & Goodsell Reference Calver, Stewart and Goodsell2009), as well as so-called conceptual rainfall–runoff models (Sitterson et al. Reference Sitterson, Knightes, Parmar, Wolfe, Avant and Muche2018). A detailed comparative analysis between these three classes of models (statistical, conceptual and physics-based) is an interesting topic we have highlighted for future work – in some sense, we anticipate that this challenge of intermodel comparison must first begin by agreeing on the minimal mathematical model to consider.

There are challenges to estimating the typical parameters as it is required for further mathematical modelling. In Part 2 of our work, it will be argued that under certain conditions, catchment dynamics can be modelled in terms of simplified geometries where the subsurface and surface flow travels towards the river channel in a transverse direction to the channel flow. Such reduced-dimensional flows will be governed by non-dimensional parameters that involve, for instance, a typical catchment width, say $L_x$, measured in a specific direction. However, given the complex network of streams, rivers and land topography in any location, it is not clear how $L_x$ should be estimated. Moreover, what is the proper definition of $L_x$ that provides consistency with the underlying assumptions of the model? These and similar questions do not seem to have yet been addressed by existing research.

1.2. On the development of a reproducible framework for parameter estimation

During the course of this work, we have discovered that it is an entirely non-trivial task to seek such typical parameters required for mathematical modelling. In many cases, the parameters used by modern software are determined through a black-box calibration of a complex computational model; the details of these procedures are not often published, or their reproduction may be impossible without access to the original codes (see in addition Hutton et al. (Reference Hutton, Wagener, Freer, Han, Duffy and Arheimer2016)). Consequently, it is important to develop a reproducible framework so that scientific researchers without access to specialised datasets can reproduce our methodology. To this end, we have focused, as much as possible, on the use of publicly available datasets. Furthermore, all numerical algorithms used in this paper are available in a readily applicable form.

For UK catchments, notable examples of datasets include the publicly available National River Flow Archive (NRFA) (Fry & Swain Reference Fry and Swain2010), the 3-D soil hydraulic database of Europe created by the European Soil Data Centre (ESDAC) (Tóth et al. Reference Tóth, Weynants, Pásztor and Hengl2017), the bedrock geology model by the British Geological Society (BGS) (Waters et al. Reference Waters, Terrington, Cooper, Raine and Thorpe2016) and detailed spatial datasets shared by Ordnance Survey (OS) (Lilley Reference Lilley2011). Boundaries of gauged catchments, for which flow at the outlet is regularly monitored, are defined in the aforementioned NRFA.

In this report, we identify key physical parameters in § 2, and in § 3 we use the above datasets to extract typical values of these parameters for all gauged catchments in the UK. Mean values of parameters, as well as their correlations and spatial distribution, are investigated in § 4, followed by discussion in § 5. The goal is to build a foundation for formulation of benchmark scenarios and further dimensional analysis, continued in further parts of our work.

2. Fundamentals of catchment modelling

Three flows are associated with a general catchment. First, subsurface flow occurs beneath the ground; second, overland flow occurs on what we refer to as the hillslope; third, channel flow occurs within a system of rivers and streams. In this section, we shall review some of the accepted governing equations for these three flows. A more detailed formulation, non-dimensionalisation and analysis of simplified catchment models will be presented in Part 2 of our work. Here, our objective is to extract those key dimensional parameters that are expected to be relevant.

Below, we shall consider a 3-D system with a general position vector $\boldsymbol {x} = (x, y, z)$. The equations presented correspond to general geometries, but for consistency with later mathematical modelling, it will be convenient to associate the $y$ direction as generally oriented along the channel direction; the $x$ direction as generally oriented along the steepest gradient of the typical hillslope; and the $z$ direction as oriented in the vertical direction. Hence, we shall annotate the catchment dimensions with the typical channel length $L_y$, the typical hillslope width $L_x$ and the typical aquifer depth of $L_z$. This geometry is shown in figure 1.

Figure 1. (a) Parameters describing catchment shape: hillslope width $L_x$; aquifer depth $L_z$; elevation gradient along the hillslope $S_x$; and along the river $S_y$. (b) Three types of flow are presented; subsurface flow includes both flow through the unsaturated zone and groundwater flow through the saturated zone.

2.1. Subsurface flow

The subsurface flow that governs the pressure head, $h_g(\boldsymbol {x}, t)$, is commonly modelled using a 3-D Richards equation (Dogan & Motz Reference Dogan and Motz2005; Weill, Mouche & Patin Reference Weill, Mouche and Patin2009),

(2.1)

\begin{equation} C(h_g)\frac{\partial{h_g}}{\partial t}=\boldsymbol{\nabla}\boldsymbol{\cdot} [K_s K_r(h_g)\nabla(h_g+z)]. \end{equation}

Here, $K_s > 0$ is the saturated soil conductivity and $C(h)={\mathrm {d}\theta (h)}/{\mathrm {d}h}$ is the so-called specific moisture capacity. The pressure head is equal to zero ($h_g=0$) at the surface of the groundwater table, which separates the fully and partially saturated region of the soil. We assume that the volumetric water content, $\theta (h)$, and relative hydraulic conductivity, $K_r(h)$, are given by the Mualem–van Genuchten (MvG) model (Van Genuchten Reference Van Genuchten1980),

(2.2)

$$\begin{gather} \theta(h) =\begin{cases} \theta_r+\dfrac{\theta_s-\theta_r}{(1+(\alpha_{MvG}h)^n)^m} & h<0 \\ \theta_s & h\ge0 \end{cases}, \end{gather}$$

(2.3)

$$\begin{gather}K_r(h) =\begin{cases} \dfrac{(1-(\alpha_{MvG}h)^{n-1}(1+(\alpha_{MvG}h)^n)^{{-}m})^2}{(1+(\alpha_{MvG}h)^n)^{m/2}} & h<0\\ 1 & h\ge0 \end{cases}. \end{gather}$$

In essence, the MvG model describes the key hydraulic properties of the soil, such as hydraulic conductivity and saturation, as nonlinear functions of the pressure head, $h$, as well as other parameters $\theta _r$, $\theta _s$, $\alpha _{MvG}$, $n$ and $m=1-{1}/{n}$. The residual water content, $\theta _r$, and saturated water content, $\theta _s$, represent the lowest and the highest water content, respectively. The parameter $\alpha _{MvG}$ $[\mathrm {m}^{-1}]$ is a scaling factor for pressure head $h$ [m]. Finally, the coefficient, $n$, characterises the distribution of pore sizes. Since the MvG model parameters describe the soil/rock properties at a given location, in general, they are functions of the spatial coordinates $(x,y,z)$.

2.2. Overland flow

If rainfall exceeds the infiltration capacity, water can accumulate on the surface and form an overland flow. Typically, following e.g. Chow & Ben-Zvi (Reference Chow and Ben-Zvi1973), Tayfur & Kavvas (Reference Tayfur and Kavvas1994) and Liu et al. (Reference Liu, Chen, Li and Singh2004), this flow is described by the 2-D Saint-Venant equations, which govern the overland water height, $z = h_s(x, y, t)$. The first equation is the continuity equation for the overland flow, written as

(2.4)

\begin{equation} \frac{\partial{h_s}}{\partial t}=\boldsymbol{\nabla}\boldsymbol{\cdot}\boldsymbol{q}+R-I-E, \end{equation}

where the source term includes the precipitation rate $R = R(x,y,t)$, the infiltration rate $I = I(x,y,t)$ and the evapotranspiration rate $E = E(x,y,t)$. The flow $\boldsymbol {q}$ can be expressed as $\boldsymbol {q}=h_s\boldsymbol {u}_s$, where $\boldsymbol {u}_s = \boldsymbol {u}_s(x,y,t)$ is the mean flow speed.

Therefore, (2.4) is a continuity equation with two unknowns, $h_s$ and $u_s$ (or $q_s$). The second equation required to close this system is provided by momentum conservation. In the general form of the Saint-Venant equations, the following equation is used:

(2.5)

\begin{equation} \frac{\partial{\boldsymbol{u}_s}}{\partial t}+\boldsymbol{u}_s\boldsymbol{\nabla}\boldsymbol{\cdot} \boldsymbol{u}_s+g\boldsymbol{\nabla} h+g(\boldsymbol{S}_{\boldsymbol{f}}-\boldsymbol{S}_{\boldsymbol{0}})=0, \end{equation}

where $g$ is the gravitational acceleration, $\boldsymbol {S}_{\boldsymbol {0}}$ is the elevation gradient (bed slope) and $\boldsymbol {S}_{\boldsymbol {f}}$ is the friction slope, i.e. the rate at which energy is lost along the $x$ and $y$ directions. One can model $\boldsymbol {S}_{\boldsymbol {f}}$ by specifying the shear stress between the overland flow and the surface. However, in the computational hydrological models, rather than solving (2.5) directly, the flux, $\boldsymbol {q}=h\boldsymbol {u}$ is often obtained using an empirical relationship known as Manning's equation (see Shaw et al. (Reference Shaw, Beven, Chappell and Lamb2010, chap. 14.3) for more details). This equation, originally formulated to describe a turbulent channel flow, is also commonly applied to the two-dimensional (2-D) turbulent overland flow over a rough terrain. In the vector form, this equation can be written as

(2.6)

\begin{equation} \boldsymbol{q}(\boldsymbol{x}, h_s) = \frac{1}{n_s}\frac{\boldsymbol{S}_{\boldsymbol{f}}}{\sqrt{\|S_f\|}}h_s^{5/3}, \end{equation}

where $n_s$ is an empirically determined value known as Manning's coefficient and describes the land surface roughness. Following (2.5), the friction slope $\mathrm {S_f}$ is expressed as

(2.7)

\begin{equation} \boldsymbol{S}_{\boldsymbol{f}}=\boldsymbol{S}_{\boldsymbol{0}}-\boldsymbol{\nabla}h_s -\frac{1}{2g}\boldsymbol{\nabla}{v^2}-\frac{1}{g}\frac{\partial{\boldsymbol{v}}}{\partial t}. \end{equation}

Often in practice, the last two terms of (2.7) can be neglected; this forms the so-called diffusion approximation. An additional common simplification is the kinematic approximation, in which the second term in (2.7) is also neglected, giving

(2.8)

\begin{equation} \boldsymbol{S}_{\boldsymbol{f}}=\boldsymbol{S}_{\boldsymbol{0}}. \end{equation}

Note that in general, the gradient varies spatially, therefore $\boldsymbol {S}_{\boldsymbol {0}}$ is a function of spatial coordinates $(x,y)$. Similarly, $n_s$ varies not only spatially, but also depends on time through seasonal variations in vegetation (Song et al. Reference Song, Schmalz, Xu and Fohrer2017).

There is an ongoing discussion whether using the above 2-D model (2.4)–(2.7) is an appropriate model for the overland flow, since the actual overland flow reaches the channel through a series of channels (natural or artificial ones), rather than being a uniform sheet of surface water $h_s(x,y)$ (see an overview by Leibowitz et al. (Reference Leibowitz, Wigington, Schofield, Alexander, Vanderhoof and Golden2018)). Nevertheless, we use this formulation as the current standard used in computational physical catchment models.

2.3. Channel flow

Finally, consider the channel flow illustrated in figure 1. We ignore the flow dynamics in the transverse directions of the channel and consider a channel located along $(x(s), y(s))$, for a parameterisation parameter, $s$, defined as the distance from the catchment outlet measured along the channel. Then, the channel water height is described by the Saint-Venant equation applied to the height $z = h_c(s, t)$. There are three main differences between the channel flow equations and the Saint-Venant equations used in the 2-D formulation of overland flow in § 2.2: (i) the direction of flow is given by the direction of the channel; (ii) precipitation adds a negligible contribution to the channel flow – instead, the channel flow is primarily affected by both the surface flow passing through the channel perimeter and the overland flow passing over the river banks; and (iii) the roughness is introduced over the entire perimeter of the channel (e.g. in the case of a rectangular channel, its bottom and the submerged part of its walls).

Mathematically, these assumptions lead to the following governing equation (Vieira Reference Vieira1983; Chaudhry Reference Chaudhry2007):

(2.9)

\begin{equation} \frac{\partial{A}}{\partial t} = w\frac{\partial{h_c}}{\partial t} = q_{in} - \frac{\partial q}{\partial s}, \end{equation}

where $A = A(s, h_c)$ is the channel cross-section, $w = w(h_c, s) = {{\rm d} A}/{{\rm d} h_c}$ is the channel width (constant in the case of a rectangular channel) and $q=q(s,t)$ is the mean flow in the channel. The source term, $q_{in}$, is given by considering the total surface and subsurface inflow into the river; hence $q_{in}$ is linked to the boundary values of the solutions of the Richards (2.1) and 2-D Saint-Venant equations (2.4). As for the overland equations, the flux, $q = \mathrm {area} \times \mathrm {velocity} = Av$, rather than being solved using the full momentum equation, is again assumed to be given by the empirical Manning's law, which in the case of a channel takes the form

(2.10)

\begin{equation} q = A\frac{\sqrt{S_f}}{n_c}\left(\frac{A}{P}\right)^{2/3}, \end{equation}

where $P = P(s, h_c)$ is the channel wetted perimeter, and $n_c$ is Manning's coefficient dependent on the banks and channel bed roughness. The quantity $S_f$ is the friction slope, as defined by (2.7) (or its simplified forms), with $S_0$ representing the elevation gradient along the river, denoted here as $S_y$. In the case of a rectangular channel, $A=wh_c$ and $P=w+2h_c$. Additionally, we denote the surface water height ($h_c$) at the outlet as $d$. As before, the kinematic approximation (2.8) is often assumed.

We use the above Manning's law since it is a standard approach in computational hydrological models. However, there may be some scenarios in which this assumption may not be a valid approach. In supercritical flows (such as in the case of flash floods), the flow may rapidly vary along the $y$ direction, and other approaches should be considered instead (see e.g. Mujumdar Reference Mujumdar2001).

This completes our formulation of the system of three coupled partial differential equations (PDEs) used to describe the key water flow components at the catchment scale, namely Richards equation (2.1), the 2-D Saint-Venant equations for overland flow (2.4) and the 1-D Saint-Venant equation for channel flow (2.4).

2.4. Boundary and initial conditions

The solution of the governing PDEs (2.1), (2.4), (2.9) for the respective subsurface $h_g$, surface $h_s$ and channel $h_c$ flows must be accompanied by appropriate boundary and initial conditions. For instance, boundary conditions are required to specify the mass exchange between flow components, and these conditions may introduce additional catchment-dependent parameters such as the channel depth and the particular geometry at the river outlet. Computational models also require the specification of initial conditions, which are generally unknown. Typically, the simulations can be run for a burn-in period to allow the results to be independent of the initial conditions. Example formulations of boundary and initial conditions will be studied in Part 2 of our work, where we focus on the mathematical and numerical analysis of model catchments.

2.5. Typical values of model parameters reported in the literature

All parameters appearing in the equations are hence summarised in . Firstly, let us provide a brief review of the typical values of these parameters, known from the literature.

(i) The range of saturated soil conductivity $K_s$ can vary from $10^0$ to $10^{-3}\ \mathrm {m\ s}^{-1}$ for very productive aquifers (for well-sorted sand and gravel, and highly fractured rocks) to below $10^{-9}\ \mathrm {m\ s}^{-1}$ for impervious rocks (Bear Reference Bear1972).
(ii) According to Chow (Reference Chow1959), Manning's roughness coefficient for channels, $n_c$, can vary from $0.01\ \mathrm {s\ m}^{-1/3}$ for artificial (e.g. cement) channels to over $0.1\ \mathrm {s\ m}^{-1/3}$ for channels with dense vegetation. Its value for flood plains, $n_s$, can vary from $0.03\ \mathrm {s\ m}^{-1/3}$ for pastures and cultivated areas without crops to $0.1\ \mathrm {s\ m}^{-1/3}$ in densely forested areas. In addition, the value of $n_c$ for a specific area can also vary with time due to seasonal variation in vegetation.
(iii) The evapotranspiration rate, $E$, in computational physics-based catchment models is usually estimated using models such as the evapotranspiration model by Kristensen & Jensen (Reference Kristensen and Jensen1975) and the two-layer unsaturated zone/evapotranspiration (UZ/ET) model by Yan & Smith (Reference Yan and Smith1994), both of which are for example used in the MIKE SHE (Système Hydrologique Européen) integrated model. However, the formulation of these models is beyond the scope of this report. Cole et al. (Reference Cole, Slade, Jones and Gregory1991) reports the mean monthly precipitation and evaporation values for different regions of the UK. Precipitation $R$ highly depends on the UK region, varying from $645.5$ mm year$^{-1}$ in central and East England to $1601.9$ mm year$^{-1}$ in Northwest and North Scotland, with the highest precipitation levels observed between October and January. According to Faulkner & Prudhomme (Reference Faulkner and Prudhomme1998), the highest daily precipitation measured throughout the year varies from $25$ mm (during a single day) in the east of England to over $80$ mm in some sites in mountainous regions of Wales and Scotland. The evapotranspiration rate $E$ is similar for all regions, but highly varies in time, from 6–12 mm month$^{-1}$ in January to 63–78 mm month$^{-1}$ in July.
(iv) Apart from specifying the precipitation, one also needs to specify catchment geometry – its size, terrain topography and aquifer depth. The publications on integrated catchment models mostly focus on one or a few real-world catchments. There are also works aiming to understand the characteristic properties of catchments and channel networks. Many catchment characteristics were described by Horton (Reference Horton1932), including drainage density, $D_D$, defined as the ratio of total stream length, $L$, and catchment area, $A$ (the authors found typical values for investigated catchments to be $A = 0.64\unicode{x2013} 1.367\ \mathrm {km}^{-1}$), average distance between streams (0.73–1.56 km), average overland flow distance (0.38–0.80 km), slope along the streams (0.014–0.038) and slope along the land (0.072–0.177). The cited values were estimated manually based on topographic maps. Together with the development of computing power and the collection of digital terrain models (DTMs), the methods for estimation of the above quantities were automated (Tarboton & Ames Reference Tarboton and Ames2001).
(v) Grieve, Mudd & Hurst (Reference Grieve, Mudd and Hurst2016) compared three different estimation methods for the hillslope width, $L_x$, by estimating its distribution for a number of catchments in the USA. The first estimate was obtained by dividing the catchment area, $A$, by twice the total stream length, $2L$, which is equivalent to the drainage density $(2D_D)^{-1}$. Note that the factor of two accounts for each stream having a hillslope on either side. The second method used a DTM model to find streamlines following the direction of the steepest descent. The lengths of the streamlines are interpreted as a hillslope width. The last method applied by the authors used a slope–area plot to separate areas dominated by channels from hillslopes. The hillslope width, $L_x$, defined using the streamline (flow routing) method is higher than estimates using the drainage density method, however, both have a similar order of magnitude ranging from $30$ to $130$ m.
(vi) A systematic study of the typical parameter ranges and their effect on model predictions was done by Doummar, Sauter & Geyer (Reference Doummar, Sauter and Geyer2012). The authors used the MIKE SHE software for a karst system (the Gallusquelle spring in the Southwest Germany). The model parameters were calibrated to minimise the root-mean-square error of the daily observed discharge, and the relative importance of each parameter was numerically investigated using sensitivity analysis. The most significant parameters include the hydraulic conductivity, the moisture content of the unsaturated rock matrix and van Genuchten parameters.

Table 1. List of parameters appearing in the formulation of the integrated catchment model, and where they are discussed in this work. As discussed in the text, these parameters vary spatially, and in some cases also temporally.

3. Data sources and processing methods

As we have noted in § 1, there have been a lack of studies that have attempted to collate and analyse the collection of dimensional parameters listed in as a whole, particularly with the intention of further mathematical modelling. Our focus in this section is to describe the data processing techniques that we have used, in order to extract typical values of the physical parameters characterising UK catchments. Crucially, we have made these tools available publicly in a GitHub repository via Morawiecki (Reference Morawiecki2022). The data sources are primarily from openly sourced data on catchments in the UK, but we expect that the methodology can be applied similarly to data from other locations.

3.1. Catchment dimensions ($L_x$, $L_y$)

Capturing catchment dimensions and topography is challenging, as real-world river networks have a fractal-like structure (Rodriguez-Iturbe & Rinaldo Reference Rodriguez-Iturbe and Rinaldo2001). An ambiguous quantity concerns the estimation of the river length characterising the catchment length, $L_y$, since there are multiple ways in which it can be defined, and, furthermore, its value depends on the precision of the dataset used for estimation. Similarly, the estimate of the catchment/hillslope length, $L_x$, is challenging on account of the high spatial variation and ambiguous definition. The aquifer thickness, $L_z$, which depends on the properties of the soil, will be discussed in § 3.3.

In our work, we use three OS datasets, providing different data formats and levels of detail:

(i) the OS Open Rivers only consists of major rivers represented as spatial lines;
(ii) the OS VectorMap District includes all surface water bodies – wide rivers and standing bodies of water (e.g. lakes and ponds) are represented using spatial polygons, while narrow streams, artificial channels and ditches are represented using spatial lines;
(iii) the OS Water Network includes all rivers/streams forming the drainage network, but data can be downloaded only for small user-specified regions, not for the entire UK as in the case of the other two datasets.

Figure 2 demonstrates a comparison of the typical information provided by the three types of datasets.

Figure 2. Comparison of three spatial datasets of river locations (thick black lines): (b) and (c) offer a similar level of accuracy. The background colours represent groundwater depth from the BGS Groundwater Levels dataset in metres, indicating the areas of missing streams in (a), where groundwater reaches the surface in a characteristic finger-like pattern. The northings and eastings on the axes are expressed in kilometres.

3.1.1. Estimating the catchment length, $L_y$

In order to estimate the characteristic scale of the catchment length ($L_y$), we propose the following four measures, graphically represented in figure 3. Note that depending on their size, the OS Open Rivers dataset represent rivers in the form of polygons (we refer to them as major rivers) or lines (minor rivers).

(i) Length of main rivers ($L_y^{main}$) – the total length of rivers as defined in the OS Open Rivers dataset; we can use this measure to estimate the total flow into the drainage network.
(ii) Length of all rivers ($L_y^{all}$) – the total length of major rivers and minor streams as defined in the OS VectorMap District dataset; it serves the same function as the previous measure, but at a higher spatial resolution; we also use this measure later as one way of measuring the catchment/hillslope width.
(iii) Length of the longest river ($L_y^{long}$) – the length of the longest river measured from the spring to the catchment's outlet, extracted from the OS Open Rivers dataset (this measure is approximately the same regardless of whether the minor rivers are included or not); it may be used to investigate the characteristic length of the channel when studying the channel flow, given by (2.9).
(iv) Distance between river tributaries ($L_y^{trib}$) – the average distance between river tributaries estimated from the OS Open Rivers dataset, including the first-order streams (see figure 3d); it is the only intensive quantity in this list (i.e. it does not scale with the catchment size) and therefore can be used as the characteristic length of a river in which hillslope flow is not disturbed by the presence of tributaries.

Figure 3. Illustration of the different length scales characterising the drainage network, presented for the River Frome catchment, with the outlet located at Bishop's Frome. Each $L_y$ estimate is equal to the total length of all highlighted streams. Note this total length gradually decreases from (a) to (d), therefore $L_y^{all}\leq L_y^{main} \leq L_y^{long} \leq L_y^{trib}$.

Calculating each of the above measures poses some challenges. Firstly, the river banks and other natural boundaries have a fractal structure, which means that their length can be very sensitive to the chosen spatial resolution. Here, we have used the data in the original resolution to maintain high accuracy in the case of estimates (i)–(iii). In the case of estimate (iv), we are interested in the typical lengths of hillslopes rather than river streams. In this case, the length of the former should not be affected by small-scale local meanders. Therefore, in case (iv), before measuring the river length, we simplify the geometry using the Douglas–Peucker algorithm (Douglas & Peucker Reference Douglas and Peucker1973) with tolerances of $1000$ m. This algorithm effectively smooths river meanders shorter than the chosen tolerance and follows a similar methodology as in Stuetzle, Franklin & Cutler (Reference Stuetzle, Franklin and Cutler2009).

Secondly, when using the OS VectorMap District dataset for calculating the total river/stream length, data corresponding to standing bodies of water (lakes and ponds) should not be taken into account. This cannot be easily implemented since such standing bodies are represented in the same way as for wide rivers within the dataset. Nevertheless, since lakes have much shorter total boundary length than rivers for the majority of UK regions, including such data does not significantly impact the estimates. Therefore, we include all surface water bodies when estimating (ii), which we obtain by adding the sum of the lengths of all spatial lines and the sum of the perimeters of all spatial polygons divided by two (since each of the two banks is counted separately).

For each UK catchment specified in the NRFA, we calculated the total length of all major rivers ($L_y^{main}$), and all major and minor rivers ($L_y^{all}$), as well as the length of the longest stream ($L_y^{long}$) and the mean distance between major tributaries ($L_y^{trib}$). Note that from their definition for each catchment we have, $L_y^{all}\leq L_y^{main} \leq L_y^{long} \leq L_y^{trib}$ (see figure 3). Their median values taken over all considered catchments are: $\mathrm {med}(L_y^{main})=71.6$ km; $\mathrm {med}(L_y^{all})=130$ km; $\mathrm {med}(L_y^{long})=22.8$ km; $\mathrm {med}(L_y^{trib})=945$ m.

3.1.2. Estimating the catchment width, $L_x$

Estimating the characteristic width of the catchment/hillslope, $L_x$, is also challenging due to the high spatial variation and the ambiguous definition of this distance. We use the following two alternative measures for $L_x$:

(i) the ratio of total stream length to catchment area ($L_x^{area}$);
(ii) the length of overland streamlines reaching a given river or river network ($L_x^{stream}$).

In (i), the first estimate of $L_x^{area}$ is suggested by e.g. Rodriguez-Iturbe & Rinaldo (Reference Rodriguez-Iturbe and Rinaldo2001). Here, we express the catchment area as $A=2L_x^{area}L_y^{all}$, where the factor of two represents the hillslope on the left- and right-hand bank. This provides an estimate of $L_x^{area}$ given the area, $A$, and $L_y^{all}$ (see § 3.1.1). We use $L_y^{all}$ since it approximates all streams in the catchment. This estimate is easy to compute, but is sensitive to river boundary roughness. For example, in the case of meandering streams, $L_x^{area}$ computed in this fashion can be very large compared with a catchment of identical proportions, but with smooth river banks.

In (ii), the second estimate of $L_x^{stream}$ is obtained by estimating the average stream-flow distance between points in the catchment to the nearest stream. Here, we use a sampling method in which a large number of spatial points are distributed over a given area. For each point, we find a streamline from the given point to the stream, following the steepest descent path (which approximates the direction of the overland flow). We use the DTM from the OS Terrain 50 dataset is used in order to find the steepest descent directions, while the OS VectorMap dataset is used to determine when such a streamline reaches the river or another surface water body.

The implementation details of (ii) are described in Appendix A.2. Since even in catchments with a constant hillslope width $L_x^{stream}$, the streamline lengths are distributed uniformly between $0$ and $L_x^{stream}$, the mean (and median) distance $\langle \mathrm {dist}\rangle$ is equal to $\langle \mathrm {dist}\rangle =L_x^{stream}/2$. Therefore, we estimate the catchment width as $L_x^{stream}=2\langle \mathrm {dist}\rangle$. This approach, unlike the previous one, is not sensitive to the roughness of the river boundary; however, it is much more computationally demanding. Using this approach, we found distances to the nearest stream for over 57 000 000 points uniformly distributed over the entire UK (excluding Northern Ireland). By finding the median value of this distance for each investigated catchment, we estimated the catchment width.

Summarising the above measures for the UK: using estimate (i), we estimate a median value of $L_x^{area} \approx 683$ m when using the ratio of catchment area to the total stream length. Interestingly, this average is close to the median value of estimate (ii), $L_x^{stream} \approx 616$ m, which uses the stream-flow definition. Further discussion appears in Appendix B where we remark that $L_x$ significantly varies across different regions of the UK. Denser drainage networks can be found in areas with higher precipitation rates and significant overland flow.

3.2. Catchment topography ($S_x$, $S_y$)

The gradient of the terrain is important in order to estimate the size of the surface flow, as given by Manning's law (2.6). In order to estimate the typical values of the elevation gradient perpendicular to the river ($S_x$) and along the river ($S_y$), we use the DTM from the OS Terrain 50 dataset.

3.2.1. Estimating the gradient perpendicular to the river, $S_x$

One way to estimate $S_x$ is by plotting the valley elevation cross-sections perpendicular to the channel at random locations in the region of interest. However, river shapes and hillslope topography are highly irregular, and therefore instead of taking straight cross-sections, we will investigate the topography profile along the line of the steepest descent. For this purpose, we use the streamlines previously generated to estimate catchment width ($L_x^{stream}$ in § 3.1). We estimate the mean gradient for each catchment by dividing the total elevation difference for all streamlines by their total length; this is equivalent to calculating the average of a slope over all streamlines weighted by their length. Note that this method is not heavily affected by very high local gradients (e.g. if cliffs are present in a given catchment), which would be the case if an arithmetic mean of the gradient for each streamline were calculated. We obtained values of $S_x$ ranging from as low as $0.01$ in the lowlands up to $0.3$ to $0.45$ in some highland and mountainous regions.

3.2.2. Estimating the gradient along the river, $S_y$

The gradient $S_y$ can be estimated similarly to $S_x$, but the points are taken along the river streams instead of streamlines. We thus estimate $S_y$ by dividing the total elevation difference across all channels in the OS Open Rivers dataset by their total length; this method is equivalent to taking a weighted average of the slope for each individual channel, weighted by their length. The typical values of $S_y$ range from as low as $0.0005$ in the lowlands up to as high as $0.1$ in highland and mountainous regions.

3.3. Soil and rock properties ($L_z$, $K_s$, $\alpha _{MvG}$, $\theta _s$, $\theta _r$, $n$)

In this section, we focus on determining the hydraulic properties of soil and rock, which are important to estimate the saturated and relative hydraulic conductivities, $K_s$ and $K_r$, appearing in (2.1).

The geological structure and hydrological properties of soils differ significantly across the UK. In some areas, such as highly productive chalk aquifers in Southeast England, the soil has a very high conductivity, and almost all the rainwater reaches the groundwater table. On the other end of the spectrum, there are aquifers with essentially no groundwater, where the entire rainfall reaches the river either as subsurface flow through the soil or forms an overland flow. Based on the 625 K digital hydrogeological map of the UK developed by the BGS, 15 % of the UK area is classified as a highly productive aquifer, 26 % as moderately productive, 47 % as low productive and the remaining 12 % as rock with essentially no groundwater. The geographical distribution of aquifers of different productivity levels is presented in figure 4(a).

Figure 4. The hydrodynamic properties of aquifers in the UK as provided by the 625 K digital hydrogeological map by the BGS. (a) The map presents the productivity of the aquifer, while (b) presents the dominating mechanisms for the groundwater flow.

3.3.1. Estimating the soil depth, $L_z$

The typical depth of the conductive layer of soil can be estimated based on the UK3D dataset, which was constructed by the British Geological Survey based on measurements from 372 deep boreholes (Waters et al. Reference Waters, Terrington, Cooper, Raine and Thorpe2016). The authors of this dataset interpolated vertical profiles, which were then interpolated to form a national network, or ‘fence diagram model’, of bedrock geology cross-sections. Each segment of a cross-section is assigned to one of the following qualitative aquifer designations (see EA 2017).

(i) Principal aquifers: layers with high intergranular and/or fracture permeability that can support river base flow on a strategic scale.
(ii) Secondary aquifers A: permeable layers capable of forming an important source of base flow at a local rather than strategic scale.
(iii) Secondary aquifers B: predominantly lower permeability layers.
(iv) Secondary undifferentiated: assigned in cases where it has not been possible to attribute either category A or B to a rock type.
(v) Unproductive strata: layers with low permeability and negligible significance for river base flow.

We estimate the typical depth of each layer by calculating the total area of each type of rock layer and dividing it by the total length of the cross-sections, which are available in the dataset (approximately $20\,000$ km). More details about the data format and our processing methods are presented in Appendix A.4. Our analysis yields the following mean thicknesses, rounded to the nearest metre, for the different types of aquifers presented in the classification (i)–(v) above:

(3.1)

\begin{equation} \left.\begin{array}{c@{}} \textrm{Principal} = 405\ \mathrm{m}, \quad \textrm{Secondary A} = 946\ \mathrm{m}, \quad \textrm{Secondary B} = 946\ \mathrm{m}, \\ \textrm{Secondary Undifferentiated} = 132\ \mathrm{m}, \quad \textrm{Unproductive} = 75\ \mathrm{m}. \end{array}\right\} \end{equation}

Thus, the mean principal aquifer thickness is $L_z = 405$ m. However, in some catchments in the UK, this layer may be much smaller – even practically reaching zero thickness, when the groundwater flow is insignificant (notice the classification of an aquifer with no groundwater sites in figure 4b). Therefore, such catchments need to be studied separately. In these low-productive aquifers, the subsurface flow takes place dominantly through the thin layer of soil confined from the bottom by impenetrable bedrock. Detailed quantitative data on soil thickness is not available; however, according to the maps shared by the UK Soil Observatory (Lawley Reference Lawley2009), over a half of the UK consists of deep soils with thickness exceeding 1 m, and nearly half of the soils are less than 1 m thick (see figure 5). Therefore, in the limiting case of catchments with no groundwater, we propose $L_z=1$ m as a reasonable choice for the typical soil thickness in regions with shallow impenetrable bedrock.

Figure 5. Thickness of soil according to the BGS Soil Parent Material Model, which provides a wide range of physical and chemical characteristics of the top layer of soil over the UK.

3.3.2. Estimating the MvG parameters

The necessary quantitative data on rock hydraulic properties (hydraulic conductivity and MvG parameters) would ideally be estimated directly through detailed experiments. However, they can also be calculated indirectly based on known soil composition. The ESDAC shares the 3-D Soil Hydraulic Database Europe, which uses European pedotransfer functions to estimate all MvG parameters (Tóth et al. Reference Tóth, Weynants, Pásztor and Hengl2017). The data is available at $1$ km and $250$ m resolutions for seven different depths (0, 5, 15, 30, 60, 100 and 200 cm). However, these estimates are not very precise since each property in the original database takes only one of a few discrete values, corresponding to specific soil types (see figure 6). In order to provide a single estimate of these parameters, we use data corresponding to $30$ cm and extract their mean value for individual catchments.

Figure 6. The MvG model parameters (from left to right: $K_S$; $\theta _R$; $\theta _S$; $\alpha _{MvG}$; and $n$) at a depth of 15 cm (a) and 1 m (b) according to the 3-D Soil Hydraulic Database Europe. As the legend indicates, each parameter in this dataset can only have one of a few discrete values.

3.3.3. Estimating hydraulic conductivity $K_s$

The values of the conductivity from the 3-D Soil Hydraulic Database, which range from around $10^{-8}$ to $10^{-6}$ m s$^{-1}$ (or approx. 0.001–0.1 m day$^{-1}$). However, as we shall discuss below, the direct measurements conducted in several highly productive chalk aquifers in England give higher estimates, likely because of the flow through macropores and fractures, which dominates over the intergranular flow in many areas in the UK (see figure 4b). Many studies, therefore, focus not only on analysing the hydraulic conductivity of the rock matrix, but also the bulk hydraulic conductivity, which includes the effect of both micropores forming the matrix and naturally occurring macropores/fractures.

In table 2, we cite several prior studies that have shown that the ratio of field-to-laboratory hydraulic conductivity values can be even greater than $1:1000$. For instance, measurements conducted by Robins & Buckley (Reference Robins and Buckley1988) on several boreholes in the Permian and Triassic aquifers in Southwest Scotland showed that hydraulic conductivity measured in the field varies from $0.1$ m day$^{-1}$ (Solway basin) to $20$ m day$^{-1}$ (Dumfries basin). In the study of chalk aquifers in the South Downs, Jones & Robins (Reference Jones and Robins1999) measured hydraulic conductivity values ranging from $0.15$ to $0.67$ m s$^{-1}$. Gardner et al. (Reference Gardner, Cooper, Wellings, Bell and Hodnett1990) observed that the hydraulic conductivity of English chalk aquifers increases from 1–6 mm day$^{-1}$ for low hydraulic potential, up to 100–1000 mm day$^{-1}$, as the potential is increased above $5$ kPa. This large increase in conductivity is caused by the fractures becoming saturated after increasing the potential.

Table 2. Comparison of laboratory (matrix) and field (fractures and matrix) hydraulic conductivity [m day$^{-1}$] for three different types of aquifers discussed by Robins & Ball (Reference Robins and Ball2006). Field and laboratory values can greatly differ.

Therefore, for highly conductive aquifers, we take typical values of conductivity ranging from $K_s=10^{-6}$ m s$^{-1}$ ($0.09$ m day$^{-1}$) to $K_s=10^{-4}$ m s$^{-1}$ ($9$ m day$^{-1}$), but these values can be significantly lower in low-productive aquifers. In the limiting scenario of catchments with no groundwater flow, the typical conductivity is given by the conductivity of the top layer of soil.

According to Kirkham (Reference Kirkham2014), the hydraulic conductivity of natural soils can vary from $0.05$ m day$^{-1}$ for a clay to $30$ m day$^{-1}$ for a silty clay loam. The variation is higher for disturbed soil materials, ranging from $0.02$ m day$^{-1}$ for silt and clay to $600$ m day$^{-1}$ for gravel. Therefore, for the soil, we may assume the same range. Even though the variation is higher than in the aforementioned studies for rock bulk conductivity, the typical values remain similar – between $10^{-6}$ and $10^{-4}$ m s$^{-1}$.

3.4. Water balance terms ($R$, $Q$, $E$)

The source term of the Saint-Venant equation for the overland flow (2.4) can be found by estimating the mean value of terms included in the water balance. It is given by

(3.2)

\begin{equation} R=Q-E-\Delta S, \end{equation}

where $R$ is precipitation over a given catchment, $Q$ is river flow (runoff) at the outlet, $E$ is total evapotranspiration and $\Delta S$ is the change in the water volume stored within the catchment (e.g. in groundwater and reservoirs). Here, we not only estimate the values of $R$, $Q$ and $E$ for the UK, but also their annual peak values, which can provide a good parametrisation to model extreme precipitation events (and are used in many flood estimation models, e.g. by Kjeldsen, Jones & Bayliss (Reference Kjeldsen, Jones and Bayliss2008)). We define these quantities in terms of the mean flow per unit area of the catchment (measured in metres per second).

The precipitation and river flow data for all UK catchments are available in the NRFA. We use the standardised annual average rainfall (1961–90) to estimate $P$, and the mean of gauged daily flow to estimate $Q$ (divided by the catchment's area, also provided by NRFA). By averaging all terms over a long period of time (several years), we should expect $\Delta S$ to tend to $0$, which allows us to estimate the mean actual evapotranspiration as $E=Q-R$.

The relationship between $Q$ and $R$ is presented in figure 7(a). The mean runoff is smaller than the mean rainfall, as expected (especially for catchments with low mean rainfall). However, there are some outliers, especially among dry catchments. They correspond to situations where the mean gauged river flow exceeds the mean rainfall rate in a given catchment. Several possible reasons for this observation include the inflow of water from neighbouring catchments, a long-term trend of decreasing groundwater storage ($\Delta S$), or the mismatch between the time for which the precipitation and runoff data were available (in some catchments, the runoff data is available only for a limited period).

Figure 7. Dependencies between water balance terms obtained from the NRFA. Panel (a) shows the relationship between the mean rainfall, $R$, and the mean runoff, $Q$. The difference, $R-Q$, which is a result of evapotranspiration, is approximately constant, $E \approx 1.4\times 10^{-8}\ \textrm {m s}^{-1}$ for the UK catchments. Panel (b) shows the relationship between the median of the annual maximum rainfall (RMED) and the mean rainfall, $R$. The thick solid lines in both plots correspond to the line of best fit; in (b) the fitted line confirms that RMED scales approximately linearly with $R$.

Based on this graph, we can deduce that the evapotranspiration is approximately independent of the other parameters $R$ and $Q$. Rainfall $R$ values vary from $10^{-8}$ to $10^{-7}$ m s$^{-1}$, runoff $Q$ from $10^{-9}$ to $10^{-7}$, while mean evapotranspiration $E$ is usually around 1–1.5$\ \times \ 10^{-8}$ m s$^{-1}$.

Finally, we estimate annual peak values of precipitation, as a measure of how intensive rainfall should be considered in rainfall–runoff models. Following Faulkner & Prudhomme (Reference Faulkner and Prudhomme1998), the standard quantity we can use is the RMED. As shown in figure 7(b), it is closely correlated with mean precipitation $R$, and is approximately 8–10 times higher.

3.5. Manning's coefficients ($n_s$, $n_c$)

Manning's roughness coefficient, which appears in the overland flow equation of (2.6), is not measured directly at the catchment scale, but is either obtained by physical model calibration or in controlled small-scale experiments. Therefore, we take typical values from engineering tables.

The Manning's coefficient of the surface, $n_s$, depends on the type of the surface (Chow Reference Chow1959; Brunner Reference Brunner2016). The NRFA includes information about the area of each catchment covered by arable lands and grasslands ($n_s\approx 0.035\ \mathrm {s\ m}^{-1/3}$), mountains ($n_s\approx 0.025\ \mathrm {s\ m}^{-1/3}$), urban areas ($n_s\approx 0.015\ \mathrm {s\ m}^{-1/3}$) and woodlands ($n_s\approx 0.16\ \mathrm {s\ m}^{-1/3}$). Following these estimates, for each investigated catchment, we calculated a mean value of $n_s$ weighted by the area covered by each of the above terrain types.

Similarly, Manning's coefficient for the channel, $n_c$, depends on the type of the channel. Since there is no UK-wide database on channels types, we took the typical range of $n_c$ values from Chow (Reference Chow1959). They vary from $n_c=0.01\ \mathrm {s\ m}^{-1/3}$ (for smooth cement channels) to $n_c=0.1\ \mathrm {s\ m}^{-1/3}$ (for natural channels with very weedy reaches), with typical values around $n_c=0.035\ \mathrm {s\ m}^{-1/3}$.

3.6. Channel dimensions ($w$, $d$)

Interestingly, estimating the typical channel dimensions (width $w$ and depth $d$), which appear in Manning's law for the channel flow (2.10), turns out to be quite challenging. Firstly, there is no database that provides a complete set of data on channel dimensions for UK rivers. Secondly, channel dimensions vary greatly depending on the river discharge, which can itself vary greatly depending on the catchment (though mainly on the catchment area and average rainfall). Instead, we perform the analysis via an indirect route.

Our central reference is the work by Nixon (Reference Nixon1959), who measured the width and depth of channels of several major rivers at 29 gauging stations in England and Wales. Values ranged from a width of $w=35$ ft ($11$ m) and a depth of $d=3.1$ ft ($0.94$ m) for the River Blackwater at the Shallowfield station to a width of $w=257$ ft ($78$ m) for the Thames at the Kingston Day's Weir station and a depth of $d=16.37$ ft ($4.99$ m) for the Wye River at the Cadora station. The data, together with similar measurements done for American rivers, were used to fit power laws describing the dependence between the channel dimensions and the full-bank discharge $Q$. From Nixon (Reference Nixon1959), the derived power laws are

(3.3a)

$$\begin{gather} d=C_1Q^{1/3}\quad\text{with}\ C_1=0.545\ \mathrm{s}^{1/3}, \end{gather}$$

(3.3b)

$$\begin{gather}w=C_2Q^{1/2}\quad\text{with}\ C_2=1.65\ \mathrm{s}^{1/2}\ \mathrm{ft}^{{-}1/2}=3.00\ \mathrm{s}^{1/2}\ \mathrm{m}^{{-}1/2}, \end{gather}$$

where $Q$ is the full-bank discharge, while $C_1$ and $C_2$ are fitted constants. These scaling laws can also be justified by analysis of mechanical equilibrium of the sediment bed, see e.g. Henderson (Reference Henderson1963) and Devauchelle et al. (Reference Devauchelle, Petroff, Lobkovsky and Rothman2011).

Because data on channel dimensions is not available for most UK catchments, we use the above power laws to estimate the channel dimensions at the outlet of each studied UK catchment. We note that this is a reasonable approach: Nixon (Reference Nixon1959) derived such laws based on empirical relations for a representative sample of UK catchments. Equally, Wharton (Reference Wharton1992) used similar scaling laws to estimate the peak annual flows in 70 UK catchments based on the channel dimensions.

3.7. A summary of results on characteristic values

Table 3 summarises the typical values of catchment model parameters for all gauged catchments in the UK, with boundaries defined as in the NRFA. A sample of raw data and their spatial distributions is included in Appendix B. In cases where no spatial data was available, an estimate from the literature was obtained (these entries are marked with a ${\dagger}$). Where available estimates obtained with more than one method are compared. The median value of the parameters presented in table 3 will be used in Part 2 to formulate a simple benchmark scenario characterised by similar physical properties to the UK catchments.

Table 3. Summary of catchment model parameters values for UK catchments.

$^{\dagger}$Numbers represent a range of values obtained from the literature; these quantities are not estimated individually on the catchment level.

$^\textrm {a}$Hydraulic conductivity of both bedrock and soil estimated based on various studies. $^\textrm {b}$Hydraulic conductivity of the soil as given in Hydraulic Database of Europe by ESDAC.

Based on table 3, we can make the following notes.

(i) The total stream length in the catchment is approximately twice as high when including all surface water streams and bodies from the OS VectorMap ($L_y^{all}$) than when looking only at the main rivers, which are included in OS Open Rivers dataset ($L_y^{main}$), cf. figure 2. The longest stream ($L_y^{long}$) typically contributes to from 20 % to 50 % of the total length of the rivers in OS Open River, with a higher percentage corresponding to shorter rivers (with fewer and shorter tributaries). The estimated lengths greatly vary as a result of the wide range of gauged catchments areas used in the study. The exception is the distance between tributaries ($L_t^{trib}$) ranging from $725$ to $1212$ m. As an intensive quantity, this quantity does not scale with the catchment area.
(ii) Hillslope width estimates in the majority of catchments are consistent between the two proposed methods. A difference can be observed in catchments with the lowest density of streams, for which streamlines turn out to be much shorter than the width between streams. The reason is that the data includes numerous streamlines that start far from the river, but terminate at a local elevation minimum, and hence do not reconnect with the river. In our study, such streamlines cover over $18\,\%$ of the UK's area. We do not account for such streamlines when estimating the hillslope width.
(iii) Table 3 provides an indication that the slope along the hillslope $S_x$ is typically much larger than the slope along the channel $S_y$, which is a general feature of the topography of river valleys. By the 2-D Manning's equation (2.6), the flow follows the direction of the steepest descent. Therefore, we would then expect that the rainfall water is transferred towards the main river through streams mainly directed along the hillslopes (with $S_x$ slope). Only after it reaches the channel, it is transferred down the river with $S_y$ slope. The consequences of this observation will be explored in detail in Part 2.
(iv) The highest disagreement can be observed between the soil conductivity as reported in the ESDAC dataset and the literature. The latter measured the bulk soil conductivity of the soil (and bedrock), which is a better macroscopic estimate than the one provided in the ESDAC dataset.
(v) The last two columns of table 3 include example values from the literature. The first column includes parameter ranges suggested by Doummar et al. (Reference Doummar, Sauter and Geyer2012) to model a real-world Karst catchment in Southwest Germany. The catchment width is higher than typical values for the UK. Notable differences between the suggested parameter ranges and estimations for UK catchments are in soil conductivity, where higher values come from measurements of observed flow velocities using artificial tracers in highly conductive areas of the investigated system. These velocities can be very high in well-fractured Karst systems, as is also observed in some measurements in the UK (cf. the chalk aquifer of Yorkshire studied by Ward, Williams & Chadha (Reference Ward, Williams and Chadha1997)).
(vi) The last column includes parameter values suggested in two simple benchmark scenarios by Maxwell et al. (Reference Maxwell2014) for integrated catchment model intercomparison. The first scenario includes a single hillslope tilted in the $x$-direction, ending in a flat river of constant depth. Only subsurface flow in a thin 5 m-deep soil layer is modelled. The second scenario includes a 2-D tilted V-shaped catchment with a different elevation gradient along and perpendicular to the river. However, this latter one is only used to model surface flow – subsurface flow is not involved.

One should note a few differences between parameter values used in these scenarios. The rainfall values are higher, since they correspond to typical intensive rainfall, not to an annual mean estimated for UK catchments. The next difference is a much lower aquifer thickness, since in benchmark scenarios, it represents only the soil layer ($L_z^*=5$ m), ignoring the subsurface water percolation to the usually much deeper permeable bedrock layer (on average $L_z=684$ m based on the UK3D dataset). Another important difference is the low soil conductivity in the scenarios: the saturated hydraulic conductivity, $K_s$, between $1.16\times 10^{-7}$ to $1.16\times 10^{-5}$ m s$^{-1}$ is lower than the reported typical values ranging from $10^{-4}$ to $10^{-6}$ m s$^{-1}$ (estimated based on large-scale studies in the UK, see § 3.3.3). Note that the combination of the two previous factors (thin aquifer with low conductivity) typically leads to model results, in simple benchmarks, where the majority of land is covered with surface water even for small rainfalls (cf. modelling and numerical simulations in Part 2).

4. Further investigation of distribution and correlation

In the previous section, we presented the results of an extensive analysis of the typical scaling of those key parameters expected to play a role in the dynamics of hydrology at the level of a catchment site. Naturally, an important question relates to the issue of which parameters are expected to be the most important in the description of physical observations or in numerical simulations. Such questions will form the basis of our investigation in Parts 2 and 3, where we develop scaling laws involving non-dimensional parameters. In this section, we focus on a statistical interpretation of such issues, and study the distribution and correlation of parameters, as well as the subsequent catchment classification. Our analysis is divided into the application of three statistical methods: analysis of cross-correlation of parameters (§ 4.1); grouping of catchments using k-means clustering (§ 4.2); principal component analysis (PCA) for catchment properties (§ 4.3).

4.1. Regional variability and correlation of parameters

Note that the estimates presented in table 3 refer to mean values across the UK. However, regional averages can significantly differ, as is represented by the interquartile range. As an example, consider the distribution of certain parameter mean values for UK catchments, as listed in the NRFA and presented in Appendix B. One can observe that highland areas, characterised by a higher elevation gradient (both along rivers and hillslopes), are also accompanied by higher rainfalls and lower soil conductivity (and usually low aquifer thickness). These observations are confirmed by the correlation matrix presented in the figure 8. The higher precipitation in mountainous regions was already observed by Duckstein, Fogel & Thames (Reference Duckstein, Fogel and Thames1973). Its impact on the aquifer properties (such as soil capacity and infiltration rate) was used by Bell & Moore (Reference Bell and Moore1998) to formulate the grid model and its successor, the grid-to-grid model (Bell et al. Reference Bell, Kay, Jones and Moore2007). The high correlation of catchment area and river length comes from the fractal structure of drainage networks (Rodriguez-Iturbe & Rinaldo Reference Rodriguez-Iturbe and Rinaldo2001), which develop to uniformly cover available space. Finally, there is a strong correlation between the $\alpha _{MvG}$, $n$ and $\theta _S$ parameters from the MvG model parameters, since all were estimated by Tóth et al. (Reference Tóth, Weynants, Pásztor and Hengl2017) based on the same soil composition dataset.

Figure 8. Correlogram of physical parameters summarised in table 3. It presents the Pearson correlation coefficient calculated based on values of the given parameters, which were estimated for UK catchments specified in the NRFA. Incomplete records were omitted.

4.2. Classification of catchments using cluster analysis

In order to better understand the similarities between different types of catchments, characterised by different combinations of physical parameters, we perform a cluster analysis. The goal is to construct a specified number of typical parameter combinations that reflect the spatial variation of UK catchments. Here, we use k-means clustering on the parameters summarised in table 3 to find a set of three clusters (see e.g. Kaufman & Rousseeuw Reference Kaufman and Rousseeuw2009). Parameters not directly associated or extracted with specific individual catchments in mind are removed from consideration ($w$, $d$, $n_c$, $L_z^*$). Also ignored are quantities that scale with the catchment areas, since these parameters are dependent on the location of the catchment outlet, rather than the physical properties characterising a given region ($L_y^{all}$, $L_y^{main}$ and $L_y^{long}$). Finally, the residual water content, $\theta _r$, was removed from consideration since it is almost always zero. This leaves us with the remaining 15 variables. Only catchments for which all these parameters are available are used for clustering.

The distribution of parameter values for each cluster is presented in figure 9. The figure highlights some key features of each cluster.

(i) From figure 9(f,g), we note that the first cluster is characterised by the highest elevation gradients, which mostly represent catchments located in highlands, while the third cluster is associated with the lowest elevation gradients, representing relatively flat lowlands. Intermediate values are represented by the largest second cluster.
(ii) Figure 9(i–l) shows that the third cluster, representing the lowlands, has significantly different soil properties from the other two clusters. It has the highest hydraulic conductivity, $K_s$, and the MvG $\alpha _{MvG}$ parameter, as well as the lowest saturated water content, $\theta _S$, and MvG coefficient, $n$.
(iii) Finally, in figure 9(b,c), we note that the first cluster, representing the highlands, has significantly higher mean precipitation, $R$, and runoff, $Q$, values, which is related to the fact that precipitation is higher in eastern parts of the UK, coinciding with many of the highland and mountainous areas.

Figure 9. Number of catchments belonging to each cluster and box plots showing the distribution of parameters among the catchments belonging to each graph.

4.3. Catchment characterisation using PCA

In order to better understand the separation between the class properties, one can use PCA, which allows us to find the linear combination of the original parameters with the highest variation in the dataset. Table 5 summarises the linear coefficients (eigenvectors) representing the first four principal components. Based on the parameters with the highest eigenvalues, our intuitive interpretation of the components is as follows. One can interpret the first principal component as strongly related to the topography of the terrain. The highland terrain with higher slopes and higher rainfall/runoff values is characterised by high values of the first principal components, while the lowlands correspond to low values. The second principal component is high for catchments with high aquifer thickness, high soil saturation, low $\alpha _{MvG}$ and/or high Manning's $n$ roughness coefficient. Typically, such a configuration of parameters represents catchments, in which groundwater flow has a relatively high importance. On the contrary, catchments with a low value of the second component will be characterised by a higher contribution of the overland flow component.

Figure 10 presents the clusters in a 2-D space spanned by the first two principal components. Here the separation between classes is apparent – as observed before, they vary in terrain topography. Additionally, the second cluster represents the most productive aquifers with the highest importance of groundwater. In the other two classes, the lower importance of groundwater is balanced by higher overland flows. In the case of cluster 1, the groundwater flow is limited by low hydraulic conductivity of the soil (cf. figure 9i), while in cluster 3, it is limited by the very flat terrain (cf. figure 9f,g), which can easily become saturated.

Figure 10. The illustration of the first two principal components. The graph on the left shows the values of principal components for catchments belonging to each of the three clusters. The map on the right shows the geographic distribution of these clusters.

Figure 11 also illustrates the spatial distribution of the first two principal components, as well as the geographical distribution of catchments belonging to different classes. Based on visual inspection of the distribution of regions associated with different values of the first principal component in figure 11(a), the measure seems to be highly spatially autocorrelated. This is due to the UK's distinct lowland, midland and highland regions, characterised by low, medium and high values of the first principal component, respectively. The spatial autocorrelation of the second principal component can also be observed in figure 11(b); however, this component varies much more locally – we expect it to be dependent on the local geological structure of the aquifer and the local soil composition.

Figure 11. Spatial distribution of values of the first (a) and second (b) principal components of the catchment parameters. Note that visually the value of the first component seems to coincide with the main lowland, midland and highland regions in the UK.

5. Discussion

In order to produce minimal mathematical models of surface–subsurface flows in catchments, a crucial step is to establish the typical size and distribution of the key parameters, which has been the main goal of this work. As we have noted, a comprehensive treatment of this topic does not seem to have been previously undertaken in the style we have presented here.

In this paper, we compiled a set of data processing methods that allow the extraction of typical values of physical parameters used in PDE-based catchment models. The methods were applied to extract properties of UK catchments based on publicly available datasets from the NRFA, ESDAC, BGS and OS. The parameter values for each catchment and the source code used to obtain them are published in our GitHub repository (Morawiecki Reference Morawiecki2022) for use by other researchers and practitioners. The key values of the parameters are summarised in table 3.

Our application of cluster analysis on the collected data sets allowed us to further classify those groupings of parameters that determine different classes of catchments. The intuitive interpretation of our findings seems reasonable: a key clustering component relates to those parameters characterising the terrain topography, while the second component is related to the groundwater importance in the runoff generation. For the demonstration purposes, UK catchments were divided into three clusters, each characterised by distinctive values of these principal components. Based on figure 10, we concluded that the clusters approximately represent lowland, midland and highland areas, with midland areas often characterised by the highest groundwater importance in water transfer.

The presented results can be used for verification of parameter values calibrated by numerical catchment models and for preparing realistic benchmark scenarios. Ideal catchment model(s) should thus provide quantitative or qualitative agreement with catchments belonging to any of the identified clusters. We expect that this is also an important criterion when creating and validating simplified (conceptual and statistical) catchment models, which often have a tendency to become overfitted to the training data (Beven Reference Beven2018, Reference Beven2019). Identifying scenarios where a given model may fail can lead to further model improvement.

We shall continue our work in Parts 2 and 3, where the parameter estimates are crucial in allowing for the development and analysis of physical models of surface and subsurface flows.

Acknowledgements

We thank S. Longfield (Environmental Agency) for many useful interactions and for motivating this work via the 7th Integrative Think Tank hosted by the Statistical and Applied Mathematics CDT at Bath (SAMBa). We also thank T. Kjeldsen (Bath), T. Pryer (Bath) and R. Lamb (Lancaster/JBA Trust) for insightful discussions. We are indebted to the reviewers and the JFM editorial team – their comments and suggestions were instrumental in the final development of this paper.

Funding

P.M. is supported by a scholarship from the EPSRC Centre for Doctoral Training in Statistical Applied Mathematics at Bath (SAMBa), under the project EP/S022945/1.

Declaration of interests

The authors report no conflict of interest.

Appendix A. Data extraction methods implementation

In this appendix, we describe the numerical methods used for data extraction and processing. In some cases, our methods require significant optimisation to reduce the processing time for large datasets. Only those parameters that require special handling are discussed here, namely $L_x^{stream}$, $L_y^{main}$ and $L_y^{all}$, $L_z$, $S_x$ and $S_y$. Processing methods for other parameters were sufficiently covered in § 3.

All data processing methods were implemented in the R programming language. The source code with further explanation of implementation details is publicly available in our GitHub repository (Morawiecki Reference Morawiecki2022).

A.1. Extracting catchment length ($L_y$)

As described in § 3.1, in order to determine the length of rivers for each catchment, we use two alternative datasets: OS Open Rivers and OS VectorMap District. The first dataset is available in a single shape file for the entire UK. In this case, we find the intersection of all streams defined in the file with the boundary of each respective catchment. Then, we calculate the total length of the streams located inside the catchment.

The OS VectorMap District dataset is divided into 55 OS National Grid reference system tiles of size $100\ \mathrm {km}\times 100\ \mathrm {km}$. Since the amount of spatial data in each of these tiles is large, we further divide them into $5\times 5$ subtiles. For each subtile, we find all rivers (and other surface water bodies) located within its boundary, as illustrated in figure 12. Then, we find the intersection of this river subset with each catchment, which is overlapping the given subtile. By introducing subtiles, we significantly reduce the computational time. Determining whether a given river is within the square subtile is relatively quick compared with finding the intersection of a river with an irregular catchment boundary. This approach allows us to significantly limit the number of rivers that need to be intersected with the boundary of each catchment. As described in § 3.1, for surface water bodies represented using polygons (usually corresponding to wide rivers), the length is defined as half the polygon's perimeter. We repeat this procedure for all National Grid tiles and their subtiles, summing the length of all streams intersecting with each catchment.

Figure 12. Illustration showing the division of the UK into tiles. Panel (a) shows nine National Grid tiles, each divided into 25 subtiles. The dark grey area in (b) represents the area from which the streamlines reach a given outlet (typically following the direction of steepest descent). The resultant catchment has an intersection with three tiles shown in (c). When analysing each of these subtiles, all streams within a given subtile are overlapped with the catchment boundary, as illustrated in (c). Adding the lengths of these streams $L_1$, $L_2$ and $L_3$, allows us to calculate the total stream length for the given catchment.

A.2. Extracting hillslope width ($L_x$) and gradient ($S_x$)

The OS Terrain 50 dataset includes a DTM with a resolution of $50$ m. The dataset is represented as a raster divided into OS National Grid tiles and further subdivided into $10\times 10$ subtiles (size of $10\ \mathrm {km}\times 10\ \mathrm {km}$).

To extract the hillslope width and gradient, we create an additional raster for each subtile that stores information about the type of terrain in the given tile. The types are as follows: (i) tidal boundary (e.g. coast); (ii) large channel/lake (represented as a polygon); (iii) small channel (represented as a line); (iv) subtile boundary; (v) local elevation minimum (without surface water). Types (i)–(iii) are determined by overlapping different surface water bodies from the OS VectorMap dataset District with a raster representing the given subtile. We determine a local elevation minimum based on the altitude of the neighbouring raster elements. An example of a terrain-type raster is presented in figure 13.

Figure 13. Streamline analysis steps. Panel (a) presents the input datasets: DTM in raster form with surface water bodies (black lines and polygons). These datasets are used to construct the terrain-type raster presented in (b). Then, from each land tile a streamline is generated until reaching another terrain type, which is presented in (c). For each starting point, the length of each streamline is presented in (d).

For all raster tiles that were not classified into one of the above types, we find one out of the eight neighbouring tiles that is located in the direction of the steepest descent. Then, for each element, we iteratively construct the steepest descent path, until it reaches one of the specified terrain types. This procedure allows us to find overland streamlines that follow the hillslope gradient. When the path is found for all raster tiles, we save their length and the elevation difference between the start and end point. Then, for each catchment, we aggregate the values assigned to all raster tiles located inside it and ending in rivers (terrain types (ii) or (iii)). The streamlines ending in the tidal boundary, local elevation minima, or leaving through the subtile boundary, are not taken into account.

To find the catchment width, we calculate the mean streamflow path and multiply it by two, as discussed in § 3.1.2. To find the elevation gradient along the hillslope, we divide the total elevation difference along all paths by the total path length, as discussed in § 3.2.1.

A.3. Extracting gradient along a river ($S_y$)

To estimate the typical value of the gradient along the rivers, we use the OS Terrain 50 and OS Open Rivers datasets. For each subtile represented in the OS Terrain dataset, we extract rivers that overlap with the given subtile. Then, for each node of the river, we find its elevation. The mean gradient $S_y$ is estimated as the total elevation difference along the river segments divided by their total length, as discussed in § 3.2.2. As in the case of the $S_x$ estimation method, this measure is not highly sensitive to local terrain drops along the river.

A.4. Extracting aquifer thickness ($L_z$)

Extracting aquifer thickness at the catchment scale is a challenging task, since the bedrock geology model by the BGS represents it in the form of 3-D polygons located sparsely across the UK, and the data is not appropriately aligned in the $xy$-plane. A sample of this dataset is presented in figure 14. The idea of extracting the mean aquifer thickness is to calculate the total area of polygons within a given catchment extent, and then divide it by the estimated length of these cross-sections projected onto the $xy$-plane.

Figure 14. A fence diagram showing the hydrological properties of the aquifer of the River Avon catchment with the outlet at Bath St James. Data was extracted from the UK3D dataset. The fence diagram for the entire UK can be generated using the Geology of Britain viewer (British Geological Society 2022).

To find the total aquifer area, we divide each polygon into triangular parts. Then, each triangle is further divided into two triangular parts with a common edge directed along the $z$-axis, as illustrated in figure 15. We refer to the point on the $xy$-plane that corresponds to the common edge as the ‘midpoint’, while the other two points are referred to as ‘endpoints’. For each catchment, we find all triangular polygons, with at least one of these three points located inside the given catchment.

Figure 15. (a) Method of dividing triangular polygons in order to calculate their area within the catchment. (b) Method of estimating the cross-section length $L$ inside the catchment.

We need to find the area of each polygon within the catchment boundary. If both endpoints are inside the catchment, we take the total polygon area $A$. If any endpoint is outside the catchment, we calculate the area of the part of this polygon that is located inside the catchment. This area, denoted as $A^*$, is expressed either as $A^*=({x^*}/{x})^2A$ if the midpoint is outside or $A^*=A-({x^*}/{x})^2A$ if the endpoint is inside the catchment. Here, $x^*$ denotes the length of the triangle projection in the $xy$-plane included inside the catchment boundary (see figure 15). The area of all triangular polygons within the catchment is then summed up.

Finding the projection of the profiles on the catchment area is challenging since the triangular polygons in the BGS database are not aligned, and in addition, there are many polygons. To solve this problem, we first convert each polygon projection into a thin rectangular polygon of width $d$, with its centreline going along the given projection. Then, we find the union of all polygons intersecting a given catchment (see figure 15). The length of the cross-section can be estimated in two ways: either as the area of the union divided by $2d$; or as the perimeter of the union. In both cases, there is a small overestimation of the true length $L$, which is minimised by choosing a value of $d$ that is significantly larger than the misalignment of polygon projections, and at the same time significantly smaller than typical catchment dimensions. In this study, we estimated the cross-section length by calculating the union's area obtained for rectangular polygons with a width $d=15$ m. Dividing the total area by the total length gives us the mean thickness of the aquifer.

Appendix B. Spatial distribution of catchment parameters

In this study, we estimated the physical parameters for 1601 catchments in the UK as defined in the NRFA. The resultant database is available in the project's GitHub repository. A sample of the constructed dataset, including the parameters for the first five catchments, is presented in table 4.

Table 4. A sample of dataset consisting of estimates of physical parameters for all UK catchments included in NRFA. The first column represents the names of the parameters used in our dataset. The top row represents the ID numbers assigned to the given catchments in the NRFA database. The detailed description of all catchments can be found in the NRFA (2022) dataset.

In § 4.3, we discuss the results of applying PCA to the catchment dataset. Each principal component is defined as a linear combination of the selected physical parameters from table 4,

(B1)

\begin{equation} PC_k=\sum_i=1^n w_{i,k} x_i, \end{equation}

where $x_i$ is the $i$th physical parameter, and $w_{i,k}$ is the weight of the $i$th parameter and $k$th principal component. The weights for the first five principal components are presented in table 5. Additionally, the spatial distribution of the values of the selected catchment parameters from table 4 is presented in figure 16.

Table 5. The first five principal components obtained from UK catchments.

Figure 16. Map of UK catchments listed in NRFA with extracted mean values of the physical parameters.

References

Abbott, M.B., Bathurst, J.C., Cunge, J.A., O'Connell, P.E. & Rasmussen, J. 1986 An introduction to the European Hydrological System – Systeme Hydrologique Europeen, ‘SHE’, 1. History and philosophy of a physically-based, distributed modelling system. J. Hydrol. 87 (1–2), 45–59.CrossRef Google Scholar

Allen, D.J., Bloomfield, J.P., Gibbs, B.R. & Wagstaff, S.J. 1998 Fracturing and the hydrogeology of the Permo-Triassic sandstones in England and Wales. Tech. Rep., British Geological Survey.Google Scholar

Anderson, M.P., Woessner, W.W. & Hunt, R.J. 2015 Applied Groundwater Modeling: Simulation of Flow and Advective Transport. Academic Press.Google Scholar

Bear, J. 1972 Dynamics of Fluids in Porous Media. Dover.Google Scholar

Bell, V.A., Kay, A.L., Jones, R.G. & Moore, R.J. 2007 Development of a high resolution grid-based river flow model for use with regional climate model output. Hydrol. Earth Syst. Sci. 11 (1), 532–549.CrossRef Google Scholar

Bell, V.A. & Moore, R.J. 1998 A grid-based distributed flood forecasting model for use with weather radar data. Part 1. Formulation. Hydrol. Earth Syst. Sci. 2 (2/3), 265–281.CrossRef Google Scholar

Beven, K. 2018 On hypothesis testing in hydrology: why falsification of models is still a really good idea. Wiley Interdiscip. Rev.: Water 5 (3), e1278.CrossRef Google Scholar

Beven, K. 2019 How to make advances in hydrological modelling. Hydrol. Res. 50 (6), 1481–1494.CrossRef Google Scholar

Beven, K.J. 2011 Rainfall-Runoff Modelling: The Primer. John Wiley & Sons.Google Scholar

Beven, K.J. & Kirkby, M.J. 1977 Towards a simple, physically-based, variable contributing area model of catchment hydrology, University of Leeds, School of Geography.Google Scholar

Blöschl, G. 2006 Rainfall-runoff modeling of ungauged catchments. Encyclopedia of Hydrological Sciences (ed. M.G. Anderson), chapter 133. John Wiley & Sons.CrossRef Google Scholar

British Geological Society 2022 Geology of Britain viewer. https://www.bgs.ac.uk/map-viewers/geology-of-britain-viewer/.Google Scholar

Brunner, G.W. 2016 HEC-RAS river analysis system 2D modeling user's manual. US Army Corps of Engineers—Hydrologic Engineering Center, pp. 1–171.Google Scholar

Brunner, P. & Simmons, C.T. 2012 Hydrogeosphere: a fully integrated, physically based hydrological model. Groundwater 50 (2), 170–176.CrossRef Google Scholar

Calver, A., Stewart, E. & Goodsell, G. 2009 Comparative analysis of statistical and catchment modelling approaches to river flood frequency estimation. J. Flood Risk Manage. 2 (1), 24–31.CrossRef Google Scholar

Chaudhry, M.H. 2007 Open-Channel Flow. Springer Science & Business Media.Google Scholar

Chow, V.T. 1959 Open-channel Hydraulics. McGraw-Hill.Google Scholar

Chow, V.T. & Ben-Zvi, A. 1973 Hydrodynamic modeling of two-dimensional watershed flow. J. Hydraul. Div. ASCE 99 (11), 2023–2040.CrossRef Google Scholar

Clauser, C. 1992 Permeability of crystalline rocks. Eos, Trans. Am. Geophys. Union 73 (21), 233–238.CrossRef Google Scholar

Cole, J.A., Slade, S., Jones, P.D. & Gregory, J.M. 1991 Reliable yield of reservoirs and possible effects of climatic change. Hydrol. Sci. J. 36 (6), 579–598.CrossRef Google Scholar

Devauchelle, O., Petroff, A.P., Lobkovsky, A.E. & Rothman, D.H. 2011 Longitudinal profile of channels cut by springs. J. Fluid Mech. 667, 38–47.CrossRef Google Scholar

Dogan, A. & Motz, L.H. 2005 Saturated-unsaturated 3D groundwater model. I. Development. J. Hydrol. Engng ASCE 10 (6), 492–504.CrossRef Google Scholar

Douglas, D.H. & Peucker, T.K. 1973 Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica 10 (2), 112–122.CrossRef Google Scholar

Doummar, J., Sauter, M. & Geyer, T. 2012 Simulation of flow processes in a large scale karst system with an integrated catchment model (Mike She)–identification of relevant parameters influencing spring discharge. J. Hydrol. 426, 112–123.CrossRef Google Scholar

Duckstein, L., Fogel, M.M. & Thames, J.L. 1973 Elevation effects on rainfall: a stochastic model. J. Hydrol. 18 (1), 21–35.CrossRef Google Scholar

EA 2017 Guidance: protect groundwater and prevent groundwater pollution. https://www.gov.uk/government/publications/protect-groundwater-and-prevent-groundwater-pollution/protect-groundwater-and-prevent-groundwater-pollution.Google Scholar

Faulkner, D.S. & Prudhomme, C. 1998 Mapping an index of extreme rainfall across the UK. Hydrol. Earth Syst. Sci. 2 (2/3), 183–194.CrossRef Google Scholar

Fry, M.J. & Swain, O. 2010 Hydrological data management systems within a national river flow archive. Tech. Rep., British Hydrological Society.CrossRef Google Scholar

Furman, A. 2008 Modeling coupled surface–subsurface flow processes: a review. Vadose Zone J. 7 (2), 741–756.CrossRef Google Scholar

Gardner, C.M.K., Cooper, J.D., Wellings, S.R., Bell, J.P. & Hodnett, M.G. 1990 Hydrology of the unsaturated zone of the chalk of south-east England. In International Chalk Symposium, pp. 611–618.Google Scholar

Grieve, S.W.D., Mudd, S.M. & Hurst, M.D. 2016 How long is a hillslope? Earth Surf. Process. Landf. 41 (8), 1039–1054.CrossRef Google Scholar

Henderson, F.M. 1963 Stability of alluvial channels. Trans. Am. Soc. Civil Engrs 128 (1), 657–686.CrossRef Google Scholar

Horton, R.E. 1932 Drainage-basin characteristics. Trans. Am. Geophys. Union 13 (1), 350–361.Google Scholar

Hutton, C., Wagener, T., Freer, J., Han, D., Duffy, C. & Arheimer, B. 2016 Most computational hydrology is not reproducible, so is it really science? Water Resour. Res. 52 (10), 7548–7555.CrossRef Google Scholar

Jones, H.K. & Robins, N.S. 1999 The Chalk Aquifer of the South Downs. British Geological Survey.Google Scholar

Kaufman, L. & Rousseeuw, P.J. 2009 Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons.Google Scholar

Kirkham, M.B. 2014 Principles of Soil and Plant Water Relations. Academic Press.Google Scholar

Kjeldsen, T.R., Jones, D.A. & Bayliss, A.C. 2008 Improving the FEH Statistical Procedures for Flood Frequency Estimation. Environment Agency.Google Scholar

Kolditz, O., et al. 2012 Opengeosys: an open-source initiative for numerical simulation of thermo-hydro-mechanical/chemical (THM/C) processes in porous media. Environ. Earth Sci. 67 (2), 589–599.CrossRef Google Scholar

Kollet, S.J. & Maxwell, R.M. 2006 Integrated surface–groundwater flow modeling: a free-surface overland flow boundary condition in a parallel groundwater flow model. Adv. Water Resour. 29 (7), 945–958.CrossRef Google Scholar

Kristensen, K.J. & Jensen, S.E. 1975 A model for estimating actual evapotranspiration from potential evapotranspiration. Hydrol. Res. 6 (3), 170–188.CrossRef Google Scholar

Lawley, R. 2009 The soil-parent material database: a user guide. Tech. Rep., British Geological Survey.Google Scholar

Leibowitz, S.G., Wigington, P.J. Jr., Schofield, K.A., Alexander, L.C., Vanderhoof, M.K. & Golden, H.E. 2018 Connectivity of streams and wetlands to downstream waters: an integrated systems framework. JAWRA J. Am. Water Resour. Assoc. 54 (2), 298–322.CrossRef Google Scholar PubMed

Lilley, B. 2011 The ordnance survey openData initiative. Cartogr. J. 48 (3), 179–182.CrossRef Google Scholar

Liu, Q.Q., Chen, L., Li, J.C. & Singh, V.P. 2004 Two-dimensional kinematic wave model of overland-flow. J. Hydrol. 291 (1–2), 28–41.CrossRef Google Scholar

Maxwell, R.M., et al. 2014 Surface-subsurface model intercomparison: a first set of benchmark results to diagnose integrated hydrology and feedbacks. Water Resour. Res. 50 (2), 1531–1549.CrossRef Google Scholar

Morawiecki, P.W. 2022 GitHub repository for parametric analysis of UK catchments. https://github.com/Piotr-Morawiecki/UK-catchment-properties.Google Scholar

Mujumdar, P.P. 2001 Flood wave propagation: the saint venant equations. Resonance 6 (5), 66–73.CrossRef Google Scholar

Nixon, M. 1959 A study of the bank-full discharges of rivers in England and Wales. Proc. Inst. Civil Engrs 12 (2), 157–174.Google Scholar

NRFA 2022 Catchment descriptors dataset. https://nrfaapps.ceh.ac.uk/nrfa/ws/station-info?station=*&format=html&fields=all.Google Scholar

Price, M. 1996 Introducing Groundwater. Taylor & Francis.CrossRef Google Scholar

Robins, N. & Ball, D. 2006 The Dumfries Basin Aquifer. British Geological Survey.Google Scholar

Robins, N.S. & Buckley, D.K. 1988 Characteristics of the Permian and Triassic aquifers of south-west Scotland. Q. J. Engng Geol. Hydrogeol. 21 (4), 329–335.CrossRef Google Scholar

Rodriguez-Iturbe, I. & Rinaldo, A. 2001 Fractal River Basins: Chance and Self-Organization. Cambridge University Press.Google Scholar

Shaw, E., Beven, K., Chappell, N. & Lamb, R. 2010 Hydrology in Practice, 3 edn. CRC Press.Google Scholar

Sitterson, J., Knightes, C., Parmar, R., Wolfe, K., Avant, B. & Muche, M. 2018 An overview of rainfall-runoff model types. In Proceedings of 9th International Congress on Environmental Modelling and Software.Google Scholar

Song, S., Schmalz, B., Xu, Y.P. & Fohrer, N. 2017 Seasonality of roughness-the indicator of annual river flow resistance condition in a lowland catchment. Water Resour. Manage. 31, 3299–3312.CrossRef Google Scholar

Stuetzle, C., Franklin, W.R. & Cutler, B. 2009 Evaluating hydrology preservation of simplified terrain representations. SIGSPATIAL Special 1 (1), 51–56.CrossRef Google Scholar

Tarboton, D.G. & Ames, D.P. 2001 Advances in the mapping of flow networks from digital elevation data. In Bridging the Gap: Meeting the World's Water and Environmental Resources Challenges, pp. 1–10.Google Scholar

Tayfur, G. & Kavvas, M.L. 1994 Spatially averaged conservation equations for interacting rill-interrill area overland flows. ASCE J. Hydraul. Engng 120 (12), 1426–1448.CrossRef Google Scholar

Tóth, B., Weynants, M., Pásztor, L. & Hengl, T. 2017 3D soil hydraulic database of Europe at 250 m resolution. Hydrol. Process. 31 (14), 2662–2666.CrossRef Google Scholar

Van Genuchten, M.T.. 1980 A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci. Soc. Am. J. 44 (5), 892–898.CrossRef Google Scholar

Vieira, J.H.D. 1983 Conditions governing the use of approximations for the Saint-Venant equations for shallow surface water flow. J. Hydrol. 60 (1–4), 43–58.CrossRef Google Scholar

Ward, R.S., Williams, A.T. & Chadha, D.S. 1997 The use of groundwater tracers for assessment of protection zones around water supply boreholes–a case study. In Proceedings of the 7th International Symposium in water tracing, Portoroz, Slovenia, pp. 369–375.Google Scholar

Waters, C.N., Terrington, R.L., Cooper, M.R., Raine, R.J. & Thorpe, S. 2016 The construction of a bedrock geology model for the uk: UK3D_v2015. Tech. Rep., British Geological Survey.Google Scholar

Weill, S., Mouche, E. & Patin, J. 2009 A generalized Richards equation for surface/subsurface flow modelling. J. Hydrol. 366 (1–4), 9–20.CrossRef Google Scholar

Wharton, G. 1992 Flood estimation from channel size: guidelines for using the channel-geometry method. Appl. Geogr. 12 (4), 339–359.CrossRef Google Scholar

Yan, J. & Smith, K.R. 1994 Simulation of integrated surface water and ground water systems – model formulation. JAWRA J. Am. Water Resour. Assoc. 30 (5), 879–890.CrossRef Google Scholar

Figure 5. Thickness of soil according to the BGS Soil Parent Material Model, which provides a wide range of physical and chemical characteristics of the top layer of soil over the UK.

Table 2. Comparison of laboratory (matrix) and field (fractures and matrix) hydraulic conductivity [m day$^{-1}$] for three different types of aquifers discussed by Robins & Ball (2006). Field and laboratory values can greatly differ.

Table 3. Summary of catchment model parameters values for UK catchments.

Figure 9. Number of catchments belonging to each cluster and box plots showing the distribution of parameters among the catchments belonging to each graph.

Figure 15. (a) Method of dividing triangular polygons in order to calculate their area within the catchment. (b) Method of estimating the cross-section length $L$ inside the catchment.

Table 5. The first five principal components obtained from UK catchments.

Figure 16. Map of UK catchments listed in NRFA with extracted mean values of the physical parameters.

Article contents

On the development and analysis of coupled surface–subsurface models of catchments. Part 1. Analysis of dimensions and parameters for UK catchments

Abstract

JFM classification

1. Introduction

1.1. On mathematical and computational models

1.2. On the development of a reproducible framework for parameter estimation

2. Fundamentals of catchment modelling

2.1. Subsurface flow

2.2. Overland flow

2.3. Channel flow

2.4. Boundary and initial conditions

2.5. Typical values of model parameters reported in the literature

3. Data sources and processing methods

3.1. Catchment dimensions ($L_x$, $L_y$)

3.1.1. Estimating the catchment length, $L_y$

3.1.2. Estimating the catchment width, $L_x$

3.2. Catchment topography ($S_x$, $S_y$)

3.2.1. Estimating the gradient perpendicular to the river, $S_x$

3.2.2. Estimating the gradient along the river, $S_y$

3.3. Soil and rock properties ($L_z$, $K_s$, $\alpha _{MvG}$, $\theta _s$, $\theta _r$, $n$)

3.3.1. Estimating the soil depth, $L_z$

3.3.2. Estimating the MvG parameters

3.3.3. Estimating hydraulic conductivity $K_s$

3.4. Water balance terms ($R$, $Q$, $E$)

3.5. Manning's coefficients ($n_s$, $n_c$)

3.6. Channel dimensions ($w$, $d$)

3.7. A summary of results on characteristic values

4. Further investigation of distribution and correlation

4.1. Regional variability and correlation of parameters

4.2. Classification of catchments using cluster analysis

4.3. Catchment characterisation using PCA

5. Discussion

Acknowledgements

Funding

Declaration of interests

Appendix A. Data extraction methods implementation

A.1. Extracting catchment length ($L_y$)

A.2. Extracting hillslope width ($L_x$) and gradient ($S_x$)

A.3. Extracting gradient along a river ($S_y$)

A.4. Extracting aquifer thickness ($L_z$)

Appendix B. Spatial distribution of catchment parameters

References

Linked content

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests