Sensitivity analysis of point and parametric pedotransfer functions for estimating water retention of soils in Algeria

Improving the accuracy of pedotransfer functions (PTFs) requires studying how prediction uncertainty can be apportioned to different sources of uncertainty in inputs. In this study, the question addressed was as follows: which variable input is the main or best complementary predictor of water retention, and at which water potential? Two approaches were adopted to generate PTFs: multiple linear regressions (MLRs) for point PTFs and multiple nonlinear regressions (MNLRs) for parametric PTFs. Reliability tests showed that point PTFs provided better estimates than parametric PTFs (root mean square error, RMSE: 0.0414 and 0.0444 cm3 cm−3, and 0.0613 and 0.0605 cm3 cm−3 at −33 and −1500 kPa, respectively). The local parametric PTFs provided better estimates than Rosetta PTFs at −33 kPa. No significant difference in accuracy, however, was found between the parametric PTFs and Rosetta H2 at −1500 kPa with RMSE values of 0.0605 cm3 cm−3 and 0.0636 cm 3 cm−3, respectively. The results of global sensitivity analyses (GSAs) showed that the mathematical formalism of PTFs and their input variables reacted differently in terms of point pressure and texture. The point and parametric PTFs were sensitive mainly to the sand fraction in the fineand medium-textural classes. The use of clay percentage (C %) and bulk density (BD) as inputs in the medium-textural class improved the estimation of PTFs at −33 kPa.


Introduction
Predictive information on the spatial distribution of soil water and its availability for plants enables producers to take effective decisions (e.g., on nutrient management and plant cover) to maximize profitability.The soil-water balance is central to many processes that influence plant growth and the degradation of soil and water resources.
Hydrologists face the situation where soil hydraulic data such as water retention or hydraulic conductivity are often missing.Therefore, pedotransfer functions (PTFs) are used as an alternative to estimate these properties.The extrapolation of PTFs in different agropedoclimatic context limits their performance (Touil et al., 2016).The development of local PTFs could be useful in meeting the agricultural requirements for modelling with reasonable accuracy.
Soil-water retention (SWR) curves can usually be estimated using two approaches: point PTFs and parameter PTFs.With point PTFs, SWR is estimated at defined pressure points (Pachepsky et al., 1996;Minasny et al., 1999).One of the most commonly used SWR curves is the Van Genuchten model (1980).With parameter PTFs, the parameters of SWR models, such as θs, θr, α and n, are estimated by fitting them to the data and then relating then by empirical correlation to basic soil properties (Vereecken et al., 1992;Wösten et al., 1995;Schaap et al., 1998;Minasny and McBratney, 2002;Rawls and Brakensiek, 1985;Van Genuchten et al., 1992;Wösten et al., 2001;Vereecken et al., 2010).Schaap et al. (2001) developed the Rosetta package based on the artificial neural network (ANN) method, which uses five hierarchical models to predict the van Genuchten (VG) parameters (θs, θr, α and n) with soil texture classes only and the input data (texture, bulk density [BD], and one or two water content values at -33 and -1500 kPa).
PTFs for point and parametric estimation of SWR from basic soil properties can be developed using multiple regression methods (Lin et al., 1999;Mayr and Jarvis, 1999;Tomasella et al., 2000).Some 97% of PTFs are based on multiple linear and polynomial regressions of n th order techniques (Botula et al. 2014).
Using PTFs in environments that differ from those from which they were derived can lead to an under-or overestimation of SWR.Several studies have shown that SWR is a complex function of soil structure and composition (Rawls et al., 1991;Wösten et al., 2001;Rawls et al., 2003;Mirus et al., 2015).Applying PTFs to different textural or structural classes could also be a source of uncertainty (Bruand et al., 2002;Pachepsky et al., 2003).SWR and hydraulic conductivity vary widely and nonlinearly with soil-water potential.Soil texture is the main determinant of the water-holding characteristics of most agricultural soils (Saxton et al., 1986).The relationship between the SWR curve and particle size distribution (PSD) has been investigated in many studies (Jonasson et al., 1992;Minasny et al., 2006;Ghanbarian et al., 2009;Xu Yang et al., 2013;Tae-Kyu Lee et al., 2014).
The variability in PTF response depends on the variability and uncertainty of one or more of the input variables.Uncertainty analysis in the variety of available PTF approaches is necessary to minimize error in estimation and identify its source.Recently, sensitivity analysis techniques and uncertainty analysis have begun to receive considerable attention in PTF studies (Nemes et al., 2006b;Kay et al., 1997;Grunwald et al., 2001;Deng et al., 2009;Moeys et al., 2012;Loosvelt et al., 2013).The question is: Which variable input is the main or best complementary predictor of SWR, and at which potential?Global sensitivity analysis (GSA) enables us to study how uncertainty in the output of a model can be apportioned to different sources of uncertainty in the model inputs (Saltelli et al., 2000).Generally, GSA is useful for identifying which variables make the main contribution to output variables (Jaques et al., 2004).
The objectives of this study were to: a maximum of 30 cm for surface horizons and more than 30 cm for subsurface horizons.
PSD analysis was conducted using the international Robinson's pipette method (Robinson, 1922).
Undisturbed soil samples obtained with 500-1,000 cm 3 cylinders were used to determine BD.The SWR values at -33 kPa and -1500 kPa were obtained using Richards's apparatus (Richards et al., 1943).Undisturbed soil samples were collected near field capacity with 100 cm 3 cylinders.Water content was measured using the gravimetric method at 105°C (24 h).Organic carbon content was determined using the wet oxidation method (Walkley and Black, 1934).Variation in soil texture in the dataset is displayed using the textural triangle proposed by FAO (1990) in Figure 3b.
The SWR model devised by Van Genuchten (1980) is defined as: Where θ r and θ s are residual and saturated soil-water content (cm 3 cm -3 ), respectively, and α (cm -1 ) and n are the shape factors of the SWR function.The VG parameters were indirectly estimated for each soil sample from four levels of measured data inputs: sand, silt and clay percentages, and BD using the Rosetta model H3 (Schaap et al., 2001).The 'm' parameter was calculated as follows:

PTF development
Two approaches were used in this study to develop the PTFs: point PTFs for estimating SWR for particular points of pressure (h); and parametric PTFs for predicting the VG parameters.Each water content level at selected water potentials of -33 kPa and -1500 kPa and estimated VG parameters were related to basic soil properties (i.e., sand, silt, clay content, OM content and BD) using multiple regression techniques (Table 2).The most significant input variables were determined using the Pearson correlation (α =5%).T For the multiple-linear regression (MLR) models, the general form of the resulting equations was expressed thus: For the multiple-non-linear regression (MNLR) models, it was expressed thus: (3) Where Y represents the dependent variable, a 0 is the intercept; b 1 …, b n are the regression coefficients, and X 1 to X 4 refer to the independent variables representing the basic soil properties.
The prediction quality of the point and parametric PTFs developed from Algerian soils were then compared with three Rosetta PTFs (H1, H2 and H3).We chose the Rosetta model because it gives the user flexibility in inputting the data required (Stumpp et al., 2009), with the option of five levels based on input data (Schaap et al. 2002): The Rosetta model was also chosen because it has given reasonable predictions in several evaluation studies (Frederick et al., 2004, Nemes et al., 2003).In our study, the three Rosetta model levels (H1, H2, and H3) were selected to compare their performance in the Algerian soils because they require only texture data and BD as inputs, as well as the locally developed PTFs.

Evaluation criteria
PTFs are regularly assessed by comparing the values that they predict with the measured values (Pachepsky and Rawls, 1999).In order to assess the validity of the PTFs developed, we used the following criteria: mean prediction error (ME) to indicate the bias of the estimate; root mean square error (RMSE) to assess the quality of the prediction (it is frequently used in studies on PTFs); and the index of agreement (d) developed by Willmott and Wicks (1980) and Willmott (1981) as a standardized measure of the degree of model prediction error.They were calculated using the following equations, respectively: Where N is number of horizons, and θ p , θ m , predicted and measured volumetric water content, respectively.The estimate was better when ME was close to 0'.Negative ME values indicated an average underestimation of θm, whereas positive values indicated overestimation.
(5) Thus, the lower the RMSE, the better the estimate.
The index of agreement varied from 0 to 1, with higher index values indicating that the modeled values θ p were in better agreement with the observations θ m .

Global sensitivity analysis (GSA)
GSA involves determining which part of the variance in model response is due to variance in which input variable or group of inputs.The impact of the parameters is quantified by calculating the global sensitivity indices.
The Sobol method (Sobol, 1990) is an independent GSA method based on decomposition of the variance.When the model is non-linear and non-monotonic, the decomposition of the output variance is still defined and can be used.The Sobol model is represented by the following function: Where Y is the model output (or objective function) and X=(X 1 ,….., X p ) is the input variable set.
Where V(Y) is the total variance in the model, V (E(Y|X)) and E (Var(Y|X)) signify variance in the conditional expected value and expected value of the conditional variance, respectively.When the input variables X i are independent, the variance decomposition of the model is: Where V i is the proportion of variance due to variable X i .Dividing V i by V(Y) produces the expression of the first-order sensitivity index (S i ), such that: The term S i is the measure that guarantees an informed choice in cases where the factors are correlated and interact (Saltelli and Tarantola, 2002).This index is always between 0 and 1, and represents a proper measurement of the sensitivity used to classify the input variables in order of importance (Saltelli and Tarantola, 2001).
In order to quantify variation in the sensitivity index (V Si ) of an input factor X i , we fixed it at Xi = Xi * (Xi * : the average when the variable follows the normal distribution, the median when the variable follows the lognormal distribution).In order to calculate how much this assumption changed the variance of Y, we used this formula: ) * 100 (11) V Si > 0 and S i close to 1 indicate increasing accuracy of PTFs; V Si < 0 and S i close to 1 indicate increasing accuracy of PTFs; V Si > 0 and S i close to 0 indicate decreasing accuracy of PTFs; V Si < 0 and S i close to 0 indicate decreasing accuracy of PTFs.
In addition, combining the RMSE and S i enabled us to detect the contribution of each variable to improvement in the quality of prediction of the PTFs.

III. Results and discussion
In Table 3, most of the PTFs underestimated SWR except for the point PTF at the two pressure points (-33 kPa and -1500 kPa).The Rosetta H2 model, which considers only texture as an input, gave a smaller ME value than the H1 and H3 models (-0.0728; -0.0436 cm 3 cm -3 at -33 kPa and -1500 kPa, respectively ) .
The poor ME values indicated better estimates of PTFs.They were produced after the application of point PTFs followed by parametric PTFs.
Among the five tested models in the Lower Cheliff soils, the point PTFs (MLR) derived from a database taken from some Algerian soils had the lowest RMSE values (0.041 and 0.044 cm 3 cm -3 at -33 kPa and -1500 kPa, respectively).Performances equivalent or superior to PTFs derived by multiple regression methods have been reported in some studies (Minasny et al., 1999;Nemes et al., 2003).
The non-linear models (parametric PTFs), however, gave a better estimation than the Rosetta models based on ANN (RMSE: 0.0613 and 0.0605 cm 3 cm -3 at -33 kPa and -1500 kPa, respectively).The RMSE and ME values of the three Rosetta models also showed that H2 was better than H1 or H3 (Table 3).
The index of agreement results showed that point PTFs were more suitable for Lower Cheliff soils than parametric PTFs (Figure 6), with values of 0.9975 and 0.9911 cm 3 cm -3 ).Similar comparisons in different regions were undertaken by Minasny et al. (1999), Tomasella et al. (2003) and Ghorbani Dashtaki et al. (2010), who all reported similar differences between these two PTF approaches.
As Table 3 shows, there was no significant difference in RMSE values between the parametric PTFs and Rosetta H2 at -1500 kPa (RMSE: 0.0605 cm 3 cm -3 and 0.0636 cm 3 cm -3 , respectively).

Sensitivity index before textural grouping
In the development of PTFs, using PSD as an input is the usual approach (texture as an overall expression of PSD, clay, silt and sand content) and its contribution is fundamental to understanding the process of retaining water at different pressure points, although various physical and chemical characteristics are used to describe the SWR curve, such as BD and OM.
The importance of each input variable was assessed by the first order S i .It was clear for the PTFs developed that OM% and clay percentages (C%) were the variables with the greatest impact (Figure 2).For the point PTFs (MLR), the most sensitive estimations were at two pressure points (S i : 0.821; 0.782 at -33 kPa and 0.630; 0.585 at -1500 kPa for OM% and C%, respectively.The percentage of silt (S i %) was second in importance in parametric PTFs (0.576 at -33 kPa) after OM, followed by BD and C (Fig. 2).The S i values placed sand content in third place in the MLR (0.262; 0.162), indicating that its impact on the parametric model was almost insignificant, with very low values (S i : 0.077; 0.017) at -33 kPa and -1500 kPa, respectively).
The prediction quality of point PTFs (MLR) can be explained, first, by taking into account the basic characteristics of soil as an input from the textural and structural information given by the BD.
Second, point PTFs (MLR) are based mainly on these input variables, unlike parameter PTFs (MNLR), which have inputs other than texture and BD, as well as other parameters (VG parameters: θ r , θ s , α, n).

Sensitivity and uncertainty analysis after the textural grouping
The sensitivity of the multiple regression methods (linear and non-linear) used to develop PTFs from basic soil characteristics for estimating SWR for different textural classes was analyzed.We grouped the samples into three classes of particles (Figure 3) in line with FAO guidelines (FAO, 1990): very fine (12 samples); fine (31 samples); and medium (10 samples).
The results showed that after the textural grouping, there was an improvement in the quality estimation of PTFs only in the medium class (Figure 4).A better prediction at -1500 kPa was provided by point PTFs (RMSE = 0.027 cm 3 cm -3 ) and parametric PTFs (RMSE = 0.038 cm 3 cm -3 ) at -1500 kPa.
1. Texture: After textural grouping, the MLR and MNLR PTFs developed were always sensitive mainly to the sand fraction in the fine and medium classes (Table 4).The variation in the first S i in the point PTFs was significantly greater in the medium texture class at the two pressure points ( -33 kPa and -1500 kPa).In the MNLR, sand had the most influence, particularly with regard to the fine class (-40.9%,18.9% at -33 kPa and 1500 kPa) and the medium class (-16.7% at -1500 kPa).
The S i of a variable quantifies the influence of its uncertainty on the output.This is the part of the variability output explained by the variability input.What was confirmed after calculating the variation in the first order S i was that the PTFs developed were still more influenced by the variability in sand at -33 kPa than at -1500 kPa.This impact could be explained by the irregularity of the dispersion of sand content in the validation database, with a coefficient of variation (CV) of about 119% compared with the other input variables (33%, 18%, 9% and 57% for clay, silt, BD and OM, respectively).This heterogeneity in the sand data series clearly influenced the uncertainty of the PTF response.
Looking at the matrix correlation (Table 5), the clay and silt fractions were significantly correlated with sand content.Saltelli and Tarantola (2002) observed that when X 1 and X 2 were correlated with a third factor, X 3 , the S i calculated depended on the force of this correlation as well as the distribution of X 3 .In this case, the index power could be influenced by this statistical association, as it explains the higher value difference of index variation in the sand percentage compared with the other variables (Fig. 2).
We observed that point PTF (MLR) produced a lower error of estimation when the variation of the first order S i for sand was the most important (MLR in the medium class: RMSE 0.030; 0.027 cm 3 cm -3 with V Si -103% and 86.4% at -33 kPa and -1500 kPa, respectively].A negative S i variation in sand content when the latter was fixed was apparent in all texture classes (Table 4).This could be explained by the proportional relationship between sand and clay content, particularly in the validation dataset with a dominant clay texture.Insignificant sensitivity of sand was recorded for the very fine texture.Rawls et al. (2003) observed that 10% of sand provides an increase in SWR at low clay content and a decrease in SWR at high clay content of more than 50%.
The relationship between VG's SWR curve parameters (especially n and α) and PSD has been examined in many studies (e.g., Minasny et al., 2007;Benson et al., 2014) in order to explain why the sand impact increases in the fine texture class in parametric PTFs.It could be explained by the predominant presence of sand and clay content as inputs in parametric PTFs.For soils with clay content between 35% and 70%, water content is greatly influenced by the percentage of sand in the soil (Loosvelt et al., 2013).
In addition, when the sand content of a sample increased to 60%, the drying rate was faster and water absorbing ability was weaker than with the low sand content.When sand content falls to 20%, the small pores occupy a large part of the pore structure, making the soil compact (Hao et al., 2015).
In the medium texture class, there was increasing accuracy in PTFs after fixing the clay content at -33 kPa.This could be explained by the reduced clay percentage in the medium class (mean of clay [%] = 23%), which produced fewer errors at -33 kPa.The greatest impact of clay (%) was observed at -1500 kPa in the point and parametric PTFs in different textural classes (Figure 4).The clay content of soils is a major predictor for modelling the permanent wilting point of soils (Minasny et al., 1999).
The accuracy of the PTFs decreased when they were applied to some soil samples with a clay content (%) > 60% (Figure 4).In the very fine class, insignificant sensitivity was recorded at all pressures defined in this study.In this class, the variation in clay was much lower because it is only the dominant solid fraction, which could explain the smaller variation in S i after fixing the clay percentage.SWR was higher in the very fine and fine classes than in the medium class, because they quickly drained water initially retained.
The silt percentage was introduced as an explanatory variable only in parametric PTFs (MNLR).This fraction is known for its ability to retain water at high and medium soil water potentials.The GSA showed that the silt percentage had a stronger impact on the estimation of parametric PTFs at -1500 kPa than at -33 kPa with the MNLR model.After textural grouping, an important variation in the first order S i was observed in the medium class (-36.7% to -1500 kPa).The lowest values were recorded in the very fine class.It was clear that the silt percentage has an important role in estimating VG's parameters (α, n), and that its use as an input influences the estimate in the medium and fine classes.
There was an increasing accuracy, however, in the PTFs recorded in the fine class at -1500 kPa.With silt and clay as inputs, there was a better estimation.Plant-available water content variation is more related to sand and silt than to clay content (Reichert et al., 2009).
2. Bulk density: this is the second most influential variable on the point PTF (MLR) response is by variation of sensitivity index on all textural class, mainly in the very fine textural class with elevated values at -33 kPa (V Si = -50, 5%).In parametric PTFs, BD influenced the medium class at -33 kPa.
The results showed the accuracy of quality estimation in the medium class when fixing the BD at -33 kPa for the two PTF approaches (Table 4).The very fine textural class represented 16 surface samples (0-30 cm) with a dominance of clay texture.In a similar study on clay soils, volumetric water content (VWC) was highly related to the inverse of BD at field capacity (Bruand et al., 1996).This might also explain the fact that many soils with high clay content in the database are Vertisols in which BD and VWC are lower (Rawls et al., 2003).The inclusion of BD as an input provides information on pore volume, which can influence the performance of PTFs when applied to soil with high clay content.
In addition, the soil structural information characterized by BD measurements is an indirect measurement of pore space and is affected mainly by texture and structure.For structureless soils, primarily coarse and medium textured soils, the capillary pore-size distribution can be satisfactorily described by PSD.The medium texture is related in general to pore-size distribution, as large particles give rise to large pores between them, and therefore have a major influence on the SWR curve (Arya and Paris, 1981;Nimmo, 2004).With BD and texture as inputs in point PTF (MLR), predicted values very close to the experimental results are obtained.This study showed that the effect of using soil structural information in estimating SWR depended on the type of regression technique (Nguyen et al., 2015).

Organic matter content:
The most insignificant variation in the S i after textural grouping related to OM content.This could be explained, first, by the poor OM content in the Algerian soil samples.Lal (1979) did not find any effect of OM content on SWR.Danalatos et al. (1994) attributed this to the generally low OM content in their samples.Second, homogeneity of the data for OM content in every textural class reduced the variation in PTF response.The increasing accuracy of parametric PTFs, however, was apparent for medium-textured soils at -33 kPa, where OM was used as an input to predict θ s .SWR at -33 kPa is affected more strongly by organic carbon than at -1500 kPa (Rawls et al., 2003).The sensitivity analysis conducted by Rawls et al. (2003) to study the role of OM content as a predictor showed that the SWR of coarse-textured soils is much more sensitive to changes in organic carbon than is the case with fine-textured soils.Bauer and Black (1981) found that the effect of organic carbon on SWR in disturbed samples was substantial in sandy soil and marginal in medium and fine textured soils.

IV. Conclusion
The objective of this study was to analyze the sensitivity of estimating the SWR properties of Algerian soil using PTFs.We developed and validated point and parametric PTFs from basic soil properties using regression techniques and compared their predictive capabilities with the Rosetta models (H1, H2, and H3).The reliability tests showed that point PTFs produce more accurate estimations than parametric PTFs.The derived parametric PTFs, however, provided better estimates than the Rosetta models originally developed from a large intercontinental database.
The GSA showed that the mathematical formalism of the PTF models and their input variables reacted differently in terms of point pressure and textural class:  After textural grouping, the two PTF approaches developed (MLR and MNLR) were always sensitive primarily to the sand fraction in the fine and medium classes at -33 kPa, rather than at -1500 kPa.
 The results illustrated the accuracy of quality estimation in the medium class for the two PTF approaches when fixing the clay percentage (C %) and BD at -33 kPa.
 The accuracy of PTFs decreased when they were applied to soil samples with a clay content > 60%.
 The most insignificant variation in the S i after textural grouping was related to the OM content in Algerian soils.

Figure 5 .
Figure 5. Variation in first sensitivity index with RMSE after textural grouping.

Figure 6 .
Figure 6.Scatter plots of measured soil water retention versus predicted soil water retention.


Develop and validate two PTF approaches using regression methods: point PTFs for estimating SWR in Algerian soils at -33 kPa and -1500 kPa; and parametric PTFs for estimating the VG parameters Study the impact of each input on the PTF responsesThe soil dataset used for this study was collected from various regions in Algeria, mainly in the north, which has a Mediterranean climate.It contained 242 samples, with basic soil properties: texture fractions (based on the USDA system; clay and silty-clayey for most of the soils, Fig.3a), BD, OM content and water content at -33 kPa and -1500kPa.Descriptive statistics of the development and validation datasets are presented in Table1.The available database was split into two datasets.Used to verify the PTFs, they were collected from Benziane valley in the lower south-western Cheliff plain.The depth of the two upper horizons varied from site to site, with Subset 1, which was used to develop the PTFs, contained 78.1% of the samples.Used as the calibration set, they were collected from the coastal plain of Annaba in north-eastern Algeria (13 samples), the Beni Slimane plain of Media (42 samples), the Kherba El Abadia plain of Ain Defla (54 samples) and the Lower Cheliff plain in north-western Algeria (80 samples).Subset 2 contained the remaining 21.9% of the samples.

Table 1 .
Soil characteristics of the developed and validated datasets.

Table 2 .
Multiple regression coefficient R 2 and regression coefficients of models developed.

Table 4 .
Variation of first order sensitivity index (S i ) in the different textural classes.

Table 5 .
Pearson correlation matrix between basic soil characteristics in the validation dataset of 53