Introduction
The global population is expected to reach 9.7 billion and the associated global demand for food is expected to roughly double by 2050 (Koning and van Ittersum, 2009). Although this will create demand for an additional 593 million hectares of land to feed the increasing population, the potential to acquire such land is very slim (Searchinger et al., 2019). Hence, the most feasible option is to sustainably improve overall system productivity on current arable land. The use of adequate and appropriate inputs is a critical entry point to sustainably intensify production and tackle food insecurity, undernutrition, and poverty (Garnett et al., 2013; Vanlauwe et al., 2014). The use of improved agronomic practices, including the application of macronutrients, resulted in an unprecedented increase in crop yield during the 1960s Green Revolution (Otsuka and Larson, 2012; Khush, 2001).
Compared with other regions, the situation in Africa is more complex as both input supply and market access are limited, and the ability of farmers to apply the required inputs is low. Although the crop production potential of Africa is large under optimal agronomic management conditions (Tittonell and Giller, 2013), its yield gaps are among the largest (van Ittersum et al., 2016) due to poor soil fertility (Chianu et al., 2012; Stewart et al., 2020), erratic rainfall (Fei et al., 2020; Li et al., 2015), and poor agronomic practices (Kalra et al., 2007; Moswetsi et al., 2017), among others. Consistent with this, fertilizer application in sub-Saharan Africa (SSA) in 2010 was among the lowest (8 kg/ha) when compared with South-East Asia (200 kg/ha), Europe (110 kg/ha), Latin America (95 kg/ha), and North America (107 kg/ha) (Wanzala and Groot, 2013). In Ethiopia, the focus of this study, productivity is still well below its potential despite efforts to increase the use of fertilizer and improved seed for many crops. One of the main challenges in Ethiopia, as elsewhere in SSA, is the high variability in environmental, social, and other factors that determine the adoption of and response to seed and fertilizer inputs (Spielman et al., 2010) and agronomic practices (Kassie et al., 2009; Marenya et al., 2020).
It is thus important to understand the spatio-temporal dynamics of production conditions and the types and amounts of inputs required for farmers to use in their fields.
One of the challenges in Ethiopia (and elsewhere) is the widespread use of ‘blanket’ recommendations, whereby a single fertilizer rate is prescribed for a large area or across the entire country. This approach can lead to either a zero or negative yield response to fertilizer, and generally low profitability from fertilizer use (Kihara et al., 2016; Vanlauwe et al., 2011). Site-specific fertilizer application can thus constitute an effective means to address low crop response to fertilizer and seed inputs as well as to reduce the overall environmental impact of inorganic fertilizer pollution in agricultural landscapes (Kihara et al., 2020; Rodriguez, 2020). Many factors determine crop response to fertilizers, including topography, climate (e.g. Getnet et al., 2015), availability of nutrients in the soil (Kihara and Njoroge, 2013), soil catena (Thelemann et al., 2010), landscape position (Amede et al., 2020), and soil moisture regime (Getnet et al., 2015).
Given the low agronomic effectiveness and poor economic efficiency of blanket fertilizer applications, some efforts have already been made to improve such recommendations. For example, Optimizing Fertilizer Recommendations for Africa (OFRA) developed an agro-ecological zone (AEZ)-based fertilizer recommendation for Ethiopia using data from a few sites (Kaizzi et al., 2017). Although the AEZ-based approach provided relatively refined fertilizer recommendations, it did not take into account the several micro-factors that influence crop response to nutrients, and hence still results in a coarse recommendation that can lead to sub- or supra-optimal fertilizer applications for farmers. Amede et al. (2020) showed the effect of local-scale topography on yield response to fertilizer, and developed a topography-based fertilizer recommendation. However, this approach, too, ignores other important determinants of fertilizer response and hence is not suited for holistic, site-specific fertilizer recommendations. Another effort on disaggregated fertilizer recommendation is the recent soil fertility map developed under the EthioSIS project. EthioSIS developed a nutrient recommendation map based on the level of nutrients in the soil. However, data show that many areas can still be non-responsive to fertilizer application because a specific nutrient is present in only low amounts in the soil (Tittonell et al., 2007). Thus, identifying existing nutrient pools in the soil alone is not enough to provide reliable fertilizer recommendations, as nutrient uptake is influenced by soil and non-soil related factors. None of the methods mentioned above considers the influence of relevant soil, climate, topographic, and soil-nutrient pool factors in an integrated manner.
In this study, we demonstrate a site-specific fertilizer recommendation approach based on a fertilizer-yield response function for wheat in Ethiopia. Wheat was chosen because it is a key staple crop grown by 4.7 million farmers, and because of its reported large yield gaps; that is, yield is only about 20% of its potential (Silva et al., 2019; van Ittersum et al., 2016). The novel approach included: (1) a machine-learning (ML) model to identify the most important site-specific variables determining wheat yield based on data collated from various sources; (2) the generation of spatially distributed nutrient response curves for wheat, covering nitrogen (N), phosphorus (P), potassium (K), and sulfur (S); and (3) the identification of the biophysical optimal nutrient recommendation for wheat. Over the last decade, the application of ML to guide agronomic management decisions has been increasing in many parts of the world (Chlingaryan et al., 2018). Unlike process-based crop models, which depend on limited and predefined input–output functions, ML methods ‘learn’ to develop any form of transfer function to predict output based on the provided inputs (Jeong et al., 2016; Shahhosseini et al., 2019). In addition, they can integrate large and ever-increasing in situ, remote sensing, and other legacy data, and handle non-linear tasks to make the best-informed decisions towards site-specific nutrient management (Chlingaryan et al., 2018; Jeong et al., 2016).
The approach implemented here adds to the development of decision support tools that can contribute to the sustainable intensification of African cropping systems.
Materials and Methods
Study area
Our study focuses on Ethiopia, a major wheat-growing country in the Horn of Africa and the largest wheat producer in SSA. Wheat is grown across the diverse landscapes in the mid- and highlands of the country ranging from 1800 to 3800 m asl (Figure 1). Wheat was selected for this case study because of its importance as a food security crop in the country and its wide adoption across large areas of Ethiopia (Hodson et al., 2020). Wheat is grown by more than 4.7 million farmers on approximately 1.6–1.8 million hectares of land, or about 15–18% of the total cropland of the country (CSA, 2016; Hodson et al., 2020; Minot et al., 2019). Currently, wheat is produced mostly under rainfed conditions and with relatively low inputs (Anteneh and Asrat, 2020).
Yield response to fertilizer dataset
The agronomic dataset used for the modelling was collected from various sources in Ethiopia. Several researchers and institutions in Ethiopia have been conducting agronomic fertilizer-yield trials across the wheat-growing areas since the 1960s (Zegeye, 2001). We compiled agronomic data from various existing sources, namely: (1) published data from peer-reviewed scientific journals (Kihara et al., 2017); (2) the OFRA database (EIAR, 2020); and (3) Ethiopian Institute of Agricultural Research (EIAR) trial data from sites that are coordinated at both regional state and federal levels. The combined dataset comprises a total of 6585 agronomic experiments (at 179 unique locations) conducted in a wide range of wheat-growing environments of Ethiopia between 1986 and 2017. The data cover almost all wheat-growing areas in the central highlands of Ethiopia (Figure 1). The database consists primarily of nutrient omission trials for N, P, K, and S. For each trial, the district and location (latitude and longitude), treatment type (i.e. nutrient type), nutrient application rate, resulting yield, and year in which the trial was conducted are reported. A normalized response ratio of yield, calculated as the yield of the treatment divided by the yield of the control for each location, is used to model crop yield response to nutrients.
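The normalized response ratio described above can be illustrated with a minimal sketch. The record keys (`location`, `rate_kg_ha`, `yield_kg_ha`), the example values, and the choice to treat zero-rate plots as the control are assumptions made for illustration only; they are not taken from the trial database.

```python
from collections import defaultdict


def response_ratios(records):
    """Return (location, rate, ratio) tuples, with ratio = treatment yield / control yield.

    Assumption: the control at a location is the mean yield of its zero-rate plots.
    """
    control_yields = defaultdict(list)
    for rec in records:
        if rec["rate_kg_ha"] == 0:
            control_yields[rec["location"]].append(rec["yield_kg_ha"])

    ratios = []
    for rec in records:
        controls = control_yields.get(rec["location"])
        if not controls:
            continue  # no control plot at this location, so the ratio is undefined
        control_mean = sum(controls) / len(controls)
        if control_mean <= 0:
            continue  # guard against division by zero
        ratios.append(
            (rec["location"], rec["rate_kg_ha"], rec["yield_kg_ha"] / control_mean)
        )
    return ratios


# Hypothetical records for a single site (values invented for the example).
trials = [
    {"location": "site_A", "rate_kg_ha": 0, "yield_kg_ha": 2000.0},
    {"location": "site_A", "rate_kg_ha": 46, "yield_kg_ha": 3000.0},
    {"location": "site_A", "rate_kg_ha": 92, "yield_kg_ha": 3500.0},
]
ratios = response_ratios(trials)
```

A ratio above 1 indicates a yield gain over the unfertilized control at the same location, which makes trials from sites with very different baseline yields comparable.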
Environmental covariates
To model the response of wheat yield to fertilizer, we included climate variables such as rainfall, temperature, and solar radiation; topographic variables (elevation and topographic index); and soil factors, particularly soil organic carbon (SOC), soil pH, soil texture, and cation exchange capacity (CEC). As climatic elements are very dynamic and their variability is strongly associated with yield variability (Hoffman et al., 2018), we used monthly rainfall, maximum and minimum temperature, and solar radiation data for the first four months of the growing season in the modelling.
We downloaded climate data from the TerraClimate database (Abatzoglou et al., 2018). TerraClimate is a global monthly database of 2.5 arc-min (∼4 km) spatial resolution that covers the period 1958–2019. TerraClimate was developed by combining high spatial resolution (30 arc-sec) average climatology data from WorldClim (Fick and Hijmans, 2017) with the monthly temporal resolution data from CRU TS4.0 and the Japanese 55-year Reanalysis (JRA-55). The combination produces a dataset with greater temporal resolution than WorldClim and greater spatial detail than CRU TS4.0 and JRA-55. We extracted monthly precipitation, maximum and minimum temperature, and solar radiation data for each of the trials using their geographic coordinates and reported growing seasons.
Soil data were downloaded from the 250-m spatial resolution SoilGrids database (Hengl et al., 2017). SoilGrids is a pan-Africa database constructed by applying random forest (RF) modelling to more than 28 000 individual soil sampling locations and many geospatial covariates. Using the point locations of the wheat trials, we extracted the organic carbon content, pH, clay percentage, silt percentage, and CEC as key variables that potentially influence the fertilizer response of wheat. Finally, topographic variables (i.e. the elevation and the topographic position index) were obtained from the void-filled Shuttle Radar Topography Mission (SRTM) database at 90 m spatial resolution (Jarvis et al., 2008). The full list of environmental factors (covariates) used in the ML model is presented in Table 1.
Actual yield prediction using machine learning
The fundamental formulation of yield at any given location can be expressed as a function of genotype, environment, and agronomic management as follows:
$$Y_A = f(G \times E \times M) \qquad (1)$$
where $Y_A$ is the actual yield response ratio (i.e. yield normalized by the control experiment at a given location), $G$ is the crop varieties grown, $E$ is the site-specific environmental variables, and $M$ is the agronomic management practices applied at a given plot. To capture site-specific yield prediction and understand fertilizer response, we included several environmental factors that are important in determining yield in the ‘E’ term (Table 1). The ‘M’ term is the chemical fertilizer rate reported in the trial database.
To model $Y_A$, we used an ML algorithm as it allows a non-linear relationship to be constructed between the response and the predictor variables. We selected the RF model because of its relatively robust performance in handling collinearity among predictor variables and noisy covariate data, in addition to its comparatively better performance relative to other ML tools (Breiman, 2001; Svetnik et al., 2003). The data were split into training (70% of the data) and testing (30%) components for model building and model testing, respectively (Svetnik et al., 2003).
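As an illustration of the 70/30 partition, the following minimal sketch shuffles the records and splits them. The function name, the fixed seed, and the use of a simple random split (rather than, for example, a split stratified by location or year) are assumptions; the text does not specify how the partition was drawn.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it into training and testing lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in for trial records: 100 integer identifiers.
rows = list(range(100))
train, test = train_test_split(rows)
```

Every record ends up in exactly one of the two sets, so the testing set provides an evaluation on data the model has not seen during training.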
Model building followed a stepwise procedure starting from all potential variables that can explain yield response, then dropping those variables that show no variability or do not improve model performance. Variable importance is computed from the prediction error (specifically the mean square error, MSE) on the out-of-bag data for each tree, and then the same error computed after permuting a covariate. The differences are averaged over all trees and normalized by the standard error. Variables are then ranked by the resulting mean decrease in accuracy (i.e. increase in MSE). For this purpose, we used the variable selection method in the caret R package (Kuhn, 2008). The RF model is optimized for the number of trees to grow (ntree) and the number of predictors used at each node (mtry). In this study, several values were considered for the mtry parameter, varying from 2 to the total number of predictors (Kuhn and Johnson, 2013). To assess the agreement between predicted and observed yield in both the training and testing datasets, $R^2$ and the Willmott Index of Agreement (d) (Willmott et al., 1985) were used.
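The two agreement statistics can be sketched as follows. This is an illustration under two assumptions that the text does not confirm: $R^2$ is taken as the squared Pearson correlation between observed and predicted values, and d is taken in its commonly used original form, $d = 1 - \sum (P_i - O_i)^2 / \sum (|P_i - \bar{O}| + |O_i - \bar{O}|)^2$. The example values are invented.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def willmott_d(observed, predicted):
    """Index of agreement: 1 means perfect agreement, 0 means none."""
    mean_o = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - squared_error / potential_error


# Hypothetical observed and predicted response ratios.
obs = [1.0, 1.5, 2.0, 2.5]
pred = [1.1, 1.4, 2.1, 2.4]
r2 = r_squared(obs, pred)
d = willmott_d(obs, pred)
```

Reporting both is informative because a high correlation alone does not penalize systematic bias, whereas d compares the squared prediction error with the largest error the data could plausibly support.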
A site-specific fertilizer response function
While Eq. 1 provides yield response prediction at any given location based on site-specific inputs, the effect of fertilizer application rate on yield can be disentangled from the other parameters using partial dependence plots and Individual Conditional Expectation (ICE). Many studies have used partial dependence analysis (Cutler et al., 2007; Friedman et al., 2001) to estimate the average partial effect of one or more variables on the outcome of the ML model, in this case yield (Delerce et al., 2016; Jiménez et al., 2019). Partial dependence plots show the general trend between the model output and the target variable, whether it is linear, monotonic, or complex. However, this analysis assumes that the other variables are held at their average values, and hence the result does not represent the actual yield response to fertilizer application and cannot be disentangled for a specific location. In this study, we used the ICE method (Goldstein et al., 2015) to analyze fertilizer responses at any specific location by varying the fertilizer application rate under the prevailing conditions of the other environmental factors, which generates a fertilizer response curve for that location. ICE estimates the predicted response as a function of a target variable, ${\rm{Fert}}_{\rm{Rate}}$, conditional on the observed covariates (G, E, M). Mathematically, it can be expressed as in Equation 2.
$$Y_A\left({\rm{Fert}}_{\rm{Rate}}\right) = f\left(G \times E \times {\rm{Fert}}_{\rm{Rate}}\right) \qquad (2)$$
where $f(G \times E \times {\rm{Fert}}_{\rm{Rate}})$ is the yield estimated by the trained RF model. Although we focus specifically on fertilizer response curves, Equation 2 can also be applied to evaluate the relationship of other environmental variables with yield. For each location with a response curve, the biophysical optimal nutrient rate can be identified as the fertilization level that corresponds to the highest yield (yield response ratio). This is not the same as an agronomic or economic optimal nutrient recommendation.
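The ICE procedure and the identification of the biophysical optimum can be sketched as below. The function `toy_model` is a hypothetical stand-in for the trained RF model: its quadratic form, its coefficients, and the `rain_mm` covariate are invented for illustration and do not reflect the fitted model.

```python
def toy_model(covariates, fert_rate):
    """Hypothetical stand-in for a trained model; returns a yield response ratio."""
    a = 0.01 * covariates["rain_mm"] / 500.0  # wetter sites respond more steeply
    b = 0.00005                               # curvature producing a peak
    return 1.0 + a * fert_rate - b * fert_rate ** 2


def ice_curve(predict, covariates, rates):
    """Predict the response at each rate while the site's covariates stay fixed."""
    return [(rate, predict(covariates, rate)) for rate in rates]


def biophysical_optimum(curve):
    """Rate with the highest predicted response; ties resolve to the lowest rate."""
    best_rate, best_value = curve[0]
    for rate, value in curve[1:]:
        if value > best_value:
            best_rate, best_value = rate, value
    return best_rate, best_value


site = {"rain_mm": 500.0}          # one site's covariates (hypothetical)
rates = list(range(0, 201, 20))    # candidate rates in kg/ha
curve = ice_curve(toy_model, site, rates)
opt_rate, opt_ratio = biophysical_optimum(curve)
```

Repeating this for every pixel or trial location, each with its own covariates, gives one curve per location, which is what distinguishes ICE from a single averaged partial dependence curve.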
We then grouped the fertilizer response curves into three categories based on the lower, middle, and upper quartile ranges of the slope of the curves. Accordingly, the three response curve groups represent low, medium, and high nutrient-responsive areas. We applied a principal component analysis (PCA) to identify the environmental variables that characterize the nutrient response groups. In all cases, we retained the first three principal components (PCs), as these have the largest contribution to the total explained variance in the PCA.
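One possible reading of the slope-based grouping is sketched below. It assumes that the slope of a curve is its least-squares slope over the full rate range, that curves at or below the first quartile are labelled low, and that curves at or above the third quartile are labelled high. The exact quartile rule used in the study is not stated, so these thresholds are illustrative.

```python
import statistics


def curve_slope(curve):
    """Least-squares slope of response ratio against fertilizer rate."""
    xs = [x for x, _ in curve]
    ys = [y for _, y in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def group_by_slope(slopes):
    """Label each slope low, medium, or high using the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(slopes, n=4)
    labels = []
    for s in slopes:
        if s <= q1:
            labels.append("low")
        elif s >= q3:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


# Eight hypothetical linear response curves with increasing steepness.
curves = [
    [(0, 1.0), (50, 1.0 + 50 * k * 0.001), (100, 1.0 + 100 * k * 0.001)]
    for k in range(8)
]
slopes = [curve_slope(c) for c in curves]
labels = group_by_slope(slopes)
```

With this rule roughly a quarter of the curves fall in each of the low and high groups and the remaining half in the medium group.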
Results
Model performance
The RF model generally performed well at predicting yield response to nutrients, with N and P showing the best model performance and K the poorest (Table 2, Figure 2). The model optimization procedure produced optimal values of the RF model parameters (i.e. mtry, Splitrule, and min.node.size), with $R^2$ values ranging from 0.43 to 0.78 and index of agreement (d) values ranging from 0.73 to 0.93 for both the training and evaluation datasets (Table 2). The model performance results, particularly for N and P, can be considered very good given the difficulty of estimating yield in environments as heterogeneous as the wheat-growing areas of Ethiopia. Notably, a greater number of well-distributed data points along the response range led to greater predictive skill since the response signal became stronger.
Importance of variables for yield response prediction
Figure 3 shows the importance of soil and climatic variables in predicting wheat yield under different N, P, K, and S application rates. In all cases, fertilizer application rate was the most important variable in predicting yield, followed by some soil and climatic variables (Figure 3). For N and P, soil variables ranked second and third, ahead of climate variables; the converse was true for K and S. Some variables (e.g. elevation, topographic features, and texture) made consistently low contributions to model performance. However, the importance of the variables considered varied among the nutrients studied.
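The permutation idea behind such an importance ranking can be illustrated with a simplified sketch. It differs from the RF procedure described in the Methods in two ways, both of which are simplifications made here: the error is computed on a single evaluation set rather than on the out-of-bag data of each tree, and the increase in MSE is not normalized by its standard error. The model, feature names, and values are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean square error of model predictions against targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, n_repeats=10, seed=0):
    """Mean increase in MSE after shuffling one feature's values across rows."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        values = [r[feature] for r in rows]
        rng.shuffle(values)
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, values)]
        increases.append(mse(model, permuted, targets) - baseline)
    return sum(increases) / n_repeats


def toy_model(row):
    """Hypothetical model that uses the fertilizer rate and ignores elevation."""
    return 1.0 + 0.005 * row["fert_rate"]


rows = [
    {"fert_rate": float(rate), "elevation": 2000.0 + 10 * i}
    for i, rate in enumerate([0, 25, 50, 75, 100, 125, 150, 175])
]
targets = [toy_model(r) for r in rows]
imp_fert = permutation_importance(toy_model, rows, targets, "fert_rate")
imp_elev = permutation_importance(toy_model, rows, targets, "elevation")
```

A feature the model relies on produces a large error increase when shuffled, while a feature the model ignores produces none, which is the basis for ordering the variables.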
Crop fertilizer response
We assessed the relationship between nutrient level and yield using ICE (see the section ‘A site-specific fertilizer response function’). Figure 4 shows both the average crop response for each nutrient response category and the responses for individual sites and years. The bold black curves for each category show a monotonically increasing relationship between nutrient rate and yield response ratio for N, P, and S. On average, the crop is responsive to all nutrients, although primarily N and P show response ratios that vary strongly with the quantity of fertilizer used. The crop response curves for N and P have a similar shape, increasing continuously until they flatten, whereas the average response to K and S is relatively flat (Figure 4). The spread across sites and years is considerably larger for K and S than for N and P. Some of the individual curves show shapes that do not agree with established agronomic experimental studies. This could be due to irregularities in the data generation process. Depending on the response curve gradients, the locations/areas are divided into three response categories: low, medium, and high (Figure 4). These suggest that in some sites and years, the response ratios vary strongly with the amount of fertilizer applied, whereas in others the response ratios remain the same regardless of the fertilizer applied. In some cases, especially for K, response ratios decrease with increasing fertilizer amounts. This shows that K application increases yield up to a certain level, beyond which additional application does not increase yield and may even decrease it; similar evidence was reported by Amede et al. (2020). The decrease in yield response ratio with increasing K could be due to its effect on plant N metabolism; its impact on the N/P/K ratio needed for optimal yield response (Xu et al., 2020); and its inhibition of Mg uptake, which may induce Mg deficiency in plants (Tränkner et al., 2018).
We used the developed crop nutrient response curves to identify the biophysical optimal nutrient rate. We present the optimal levels of N, P, and S for Basona Worena woreda for the year 2018 to show how the approach can be used to develop site-specific fertilizer recommendations (Figure A1). The recommendation applies only to that specific year and is not static, as it varies with weather elements such as rainfall and temperature. The spatial variability of the recommended optimal nutrient rates is high (Figure A1), indicating that the approach can be useful for developing a tool to guide fertilizer decisions at any spatial or administrative scale, from the lowest administrative units such as kebele or woreda to higher units such as the regional and federal levels. For instance, for Basona Worena woreda in 2018, our analysis shows that applying the site-specific optimal N rate can increase yield by about 263 kg/ha on average compared with the blanket recommendation (i.e. 100 kg/ha) throughout the woreda (Figure A2).
Determinants of crop response to nutrients
The PCA shows that about 50% of the difference between the low, medium, and high response categories can be explained by the first and second PCs (Figure 5). For N, the first three components accounted for about 62.1% of the total variation among the three nutrient response groups. Principal component one (PC1) alone explained 30% of the total variation for the N response groups. In PC1, climatic parameters (particularly solar radiation and temperature), soil pH, and elevation had strong loadings in categorizing the N response functions. On the other hand, elevation, rain_1, and SOC were negatively correlated with PC1, indicating that they are not among the first-order variables determining N response curves. In PC2, the first-month maximum temperature and elevation had a strong positive influence on N response groups (Figure 5a). Silt percentage and CEC are the key elements constituting PC3.
For P response groups, PC1, PC2, and PC3 explained 29%, 18%, and 16% of the variation, respectively. Radiation and temperature in PC1, texture and elevation in PC2, and temperature and topographic position in PC3 explained most of the variation in grouping P response functions into the three categories (Figure 5b). PC1 correlates negatively with rainfall (Rain_1, Rain_2, Rain_3) in determining P response curve shapes.
With respect to K, PC1 explained more than 52% of the determinants for K response function groupings, with PC2 and PC3 having almost the same effects (11–12%) (Figure 5c). Most environmental covariates have contributions to PC1, whereas temperature covariates have a strong negative correlation with PC2, and rainfall has positive loading for PC3. The PCA for S response function showed that PC1 and PC2 had almost the same effect (30–34%) in explaining the S response function groupings, whereas PC3 explained only 8% of the variation (Figure 5c). PC1 has high and positive loadings on elevation, soil texture, pH, and temperature; PC2 has high loading with temperature and SOC. The former correlates negatively, the latter correlates positively (Figure 5c). PC3 correlates negatively with rainfall and positively with maximum temperature.
For all nutrients, many covariates contribute to the first PC. No single covariate or pair of covariates has an exceptionally large loading, indicating that no single variable determines the shape of the fertilizer response curve.
Discussion
Crop yields in Ethiopia need to increase considerably to reduce import dependency and keep up with the expected population increase and dietary changes. Despite the yield increases observed for many crops, including wheat, in recent years, crop yield gaps remain large. Although Ethiopia is the largest wheat producer in SSA, the country imported 1.5 million tonnes of wheat, corresponding to a value of around $600 million (CSA, 2019). Currently, the national average yield of wheat is 2.9 t/ha, or roughly 20% of the crop’s rainfed yield potential (Silva et al., 2021). Low and blanket fertilizer application has long been considered the main cause of low yields in Ethiopia (Tamene et al., 2017). Given the rapidly increasing demand for cereals due to fast population growth and dietary change in the country (van Ittersum et al., 2016), the application of higher amounts of nutrients with higher agronomic and economic efficiency is needed to increase wheat productivity under rainfed systems.
In the present study, we aimed to develop site-specific nutrient response functions for wheat using large datasets collated from various in situ observations. The datasets were used to train and evaluate an ML model to develop wheat nutrient response functions for four nutrients (N, P, K, and S) at a fine spatial resolution. As shown by the statistical indices used to evaluate model performance against measured observations (Table 2), the developed model performed well in representing measured nutrient responses across Ethiopia’s diverse wheat-growing environments. The study also categorized wheat-growing environments into three groups based on patterns of nutrient response functions to facilitate manageable nutrient recommendations across relatively similar response environments. Moreover, the study identified the major factors that influence nutrient response for each response group. Since the nutrient response functions are developed by considering all relevant climatic and edaphic factors, they can be used to match fertilizer application rates with soil fertility problems effectively in the wheat-growing environments. Although most smallholder farmers appreciate the benefit of fertilizers, they rarely apply them at recommended rates and at the appropriate time because of unreliable returns, high cost, lack of supportive policies on access, and limited knowledge about their efficient use (Tamene et al., 2017). Therefore, the availability of high-resolution fertilizer recommendations, such as the one developed for wheat in this study, can increase nutrient efficiency, affordability, and economic returns to smallholder farmers. Moreover, government policies aimed at increasing wheat production in rainfed systems should focus on fostering the accessibility and affordability of inputs, particularly fertilizers (Silva et al., 2021).
The approach followed in this study is one step closer to developing precision nutrient management based on optimal fertilizer recommendations. This would help to fill the gap that exists between coarse-resolution farming-system or agroecology-based fertilizer recommendations and more complex process-based model approaches, which have site-specific data and intensive calibration requirements (Basso et al., 2011). The RF ML model presented in this study provided high-resolution nutrient recommendations compared with other fertilizer recommendation efforts such as OFRA (Wortmann and Sones, 2017) and the soil test-based methods of EthioSIS (Tegbaru, 2015).
The ML approach is data-driven and captures the important biophysical factors that determine yield. The approach thus has huge potential for developing spatially explicit fertilizer recommendations for several crops, and these will improve with the increased availability and accessibility of agronomic and other agricultural data. Note, however, that further improvement will be needed to minimize uncertainties associated with the ML approach as more data with several temporal and spatial dimensions become available. In future, the ML model can be improved by considering: (1) the use of organic fertilizers and new crop varieties for different crops; (2) high-resolution and updated soil datasets such as the data collected by the EthioSIS project; and (3) innovations in agronomic data acquisition technologies (e.g. the ‘internet of things’) from crop plots, which can facilitate real-time data streaming to help improve the predictive capacity of models.
Conclusion
This study used data from nutrient omission field trials conducted across the diverse wheat-growing environments in Ethiopia to develop site-specific nutrient response functions using an ML approach. The RF model developed for predicting wheat response to N, P, K, and S at the different sites performed well when compared against observed records, particularly for N, P, and S. The ML model also provided high-resolution nutrient recommendations compared with previous fertilizer recommendation efforts in Ethiopia. Moreover, the ML approach enabled us to identify the major climatic and soil conditions that influence the response of wheat to the four nutrients studied. The study demonstrated the potential of using ML approaches for developing spatially explicit nutrient response functions and fertilizer recommendations, using wheat as a case study crop. The study also points to the need for further improvement to minimize uncertainties associated with the current approach as more data with several temporal and spatial dimensions become available. The approach will help develop decision support tools for site-specific nutrient recommendations for smallholder farmers and thereby enhance nutrient use efficiency and increase yield and income. This research is expected to respond to the national policy demand for a sound method to identify the optimal fertilizer rate to increase economic returns on fertilizer investments, and to take fertilizer utilization research one step further. The approach is also useful for setting an evidence-based threshold (limit) of nutrients, above which additional applications are prohibited or limited due to a low probability of positive crop response and a high probability of negative environmental impacts on soil and water.
Acknowledgements
This research was financed by the Supporting Soil Health Interventions in Ethiopia project, funded by the Bill and Melinda Gates Foundation and managed by the Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ). The Excellence in Agronomy (EIA) and the Accelerating the Impact of CGIAR Climate Research in Africa (AICCRA) projects have also supported the staff time of some of the authors. In addition, we recognize the support from the European Union–International Fund for Agricultural Development (EU-IFAD) project under the CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS). JR-V is supported by CCAFS through its Agroclimas project. CCAFS is carried out with support from CGIAR Trust Fund donors and through bilateral funding agreements. For details, visit https://ccafs.cgiar.org/donors. The views expressed in this paper cannot be taken to reflect the official opinions of these organizations.
Funding Support
This work was supported, in whole or in part, by the Bill & Melinda Gates Foundation [INV-005460]. Under the grant conditions of the Foundation, a Creative Commons Attribution 4.0 Generic License has already been assigned to the Author Accepted Manuscript version that might arise from this submission.
Appendix