A probabilistic approach to quantifying soil property change 1 through time integration of energy and mass input 2 3

Discussion on manuscript title. The authors are not consequent in their objectives: the model calculates the soil state (at present), but this is presented as soil change (a.o. in the title). For “change” you need (minimally) 2 values in time, these are the initial value and the final/current value. As the authors state there is currently no role of the parent material in their functions, there is no initial value and thus there can be no “change” calculated. Following the discussion inside the manuscript on reasons for poor model fits, parent materials may enter in future versions of the model and then change may come in. Advice: do not suggest “change” when it is not calculated, certainly not in the title.


Introduction
The need for pedogenic models that can be widely applied and easily utilized is paramount for understanding soil-landscape evolution, soil property change with time, and predicting future soil conditions.A mathematically simple, easily parameterized approach has yet to be developed that is capable of predicting current soil properties or recreating potential soil evolution with time.Here we address this knowledge gap through development of a probabilistic model of soil property change capable of predicting soil properties across a wide range of terrains, climates, and ecosystems.
The state factor approach has been one of the primary pedogenic models since it's development in the late 1800's and early 1900's (Dokuchaev, 1883;Jenny, 1941).The soil state factor approach (Jenny, 1941) assumes the state of the soil system or specific soil properties (S) may be described as a function of the external environment, represented by climate (cl), biology (o), relief (r), parent material (p), and time (t): S = f (cl, o, r, p, t).This approach increased our understanding of soil variation across each factor, but more complex, multivariate approaches are generally not possible or difficult to derive from this formulation (Yaalon, 1975).From the original state factor model have evolved pedogenic models that include functional approaches (Jenny, 1961), energetic approaches (Rasmussen and Tabor, 2007;Rasmussen et al., 2005Rasmussen et al., , 2011;;Runge, 1973;Smeck et al., 1983;Volobuyev, 1964), and mechanistic approaches (Finke, 2012;Minasny and McBratney, 1999;Salvador-Blanes et al., 2007;Vanwalleghem et al., 2013).
However, many of these approaches are either limited to a site-specific basis, require a high degree of parameterization, or lack wide-scale applicability.
Here we develop a simple probabilistic approach to predict soil physical properties using a large dataset of chronosequences studies.The model compresses state factor variability into 2 key components (parent material and total pedogenic energy, defined in Section 1.1) that were parameterized and calibrated using the chronosequence database.Additionally, we modified the model to include soil depth to capture the influence of redistributive hillslope processes to predict soil properties.We hypothesized that by including soil depth, the model would effectively predict the clay content in an independent dataset synthesizing soil and landscape variability in complex, hilly terrain from a wide range of environments.

Probabilistic model of soil property change
The model presented here is based on a reformulated state-factor model, where a location has a probability of displaying a range of differing soil morphologies and properties based upon the state factors, with some range of values more probable than others, meaning the state-factor model (Jenny, 1941) may be restated as: where, the left hand side of the equation, ℙ s !≤ S ≤ s !, represents the probability that a given soil will have a value located between a lower limit (s 1 ) and an upper limit (s 2 ) (Phillips, 1993b).
Eq. 1 can be restated more simply as: where, the original soil forming state factors have been simplified to represent the fluxes of matter and energy into the soil system (P x ), incorporating the influence of climate and biology, and the initial state of the soil forming conditions (L o ), incorporating the influence of the initial topography and original soil parent material, and time or age of the soil system (t) (Jenny, 1961).
Equation 2 was further simplified to make the approach operational.A quantitative measure of climate and biology was needed to represent the influence of P x on soil formation.
We used a quantification of P x calculated from effective precipitation and biological productivity, termed effective energy and mass transfer (EEMT, J m -2 yr -1 ) (Rasmussen and Tabor, 2007;Rasmussen et al., 2005Rasmussen et al., , 2011)).EEMT provides a measure of the energy transferred to the subsurface, in the form of reduced carbon from primary productivity and heat transfer from effective precipitation, which has the potential to perform pedogenic work, e.g., chemical weathering and carbon cycling.Using EEMT as a simplification of P x , Eq. 2 was restated as (Rasmussen et al., 2011): We further simplified Eq. 3 by combining the flux term EEMT and the age of the soil system (t).
EEMT multiplied by the age of the soil system, i.e.EEMT*t, provides an estimate of the total energy transferred to the soil system over the course of its evolution, referred to here as "total pedogenic energy" (TPE, J m -2 ).The TPE provides an estimate of P x that incorporates soil age, thus Eq. 3 may be restated as: where at a certain point in time the probability of a soil property existing between s 1 and s 2 is a function of L o and TPE.Explicitly including time in Eq. 4 through TPE partially captures variation in soil property change attributable to topography and parent material.Soil residence time may be directly related to landscape position through topographic control on soil production and sediment transport/deposition (Heimsath et al., 1997(Heimsath et al., , 2002;;Yoo et al., 2007).Additionally, parent material modulates soil residence time through control on soil depth (Heckman and Rasmussen, 2011;Rasmussen et al., 2005), soil production, and sediment transport rates (Andre and Anderson, 1961;Portenga and Bierman, 2011).The initial conditions of the soil forming system (L o ) are never fully known; however, representing the state of the soil system as a probable distribution of values, implicitly accounting for soil age, and not constraining the initial soil forming conditions, Eq. 4 can be stated simply as: where the probability state of the soil, ℙ s !≤ S ≤ s !, bounded by a lower and upper limit, is a function of one quantifiable variable.
Quantitatively realizing Eq. 5 required the use of predetermined joint probability density functions parameterized with TPE and a selected soil physical property.Bivariate normal density functions were calculated to determine the probability of a soil property range given a TPE value.The bivariate normal density distribution (Ugarte et al., 2008) was calculated as: where, ρ represents the correlation coefficient, µ x is the mean of TPE, µ y is the mean of the selected soil physical property, σ x is the standard deviation of TPE, σ y is the standard deviation of the selected soil physical property.Using the bivariate normal density functions, conditional mean and variance values were calculated given a value of TPE; the conditional means and variances parameterized conditional univariate normal distributions for the selected soil physical properties.The conditional mean (Ugarte et al., 2008) was calculated as: where, µ Y|X=x is the conditional mean soil property value given a value for TPE.The conditional variance (Ugarte et al., 2008) was calculated as: is the conditional variance of the soil property given a value of TPE.Applying this approach required certain assumptions and simplifications.The model assumes that climate was constant over the entire duration of pedogenesis.The model makes no assumptions about the progressive and regressive processes that drive pedogenesis; by weighing all profiles equally, both progressive (e.g., horizonation, clay accumulation, reddening, etc.) and regressive (e.g., haplodization, erosion, pedoturbation, etc.) pedogenic processes (Johnson and Watson-Stegner, 1987;Phillips, 1993a), are implicitly captured in the model structure.The model makes no assumptions about the initial soil forming system; the model simply describes the probability of a location exhibiting a range of soil properties based on TPE.The model assumes all changes in soil physical properties are due to pedogenic processes.We used a bivariate normal distribution; consequently the model assumes the data conforms to a normal distribution.

Data collection and preparation
The probability distributions were parameterized using an extensive literature review of chronosequence studies.Over 140 chronosequence publications were identified using Google

Total Pedogenic Energy
The influence of both climate and vegetation at the locations of each soil profile was determined using effective energy and mass transfer (EEMT) (Rasmussen and Tabor, 2007;Rasmussen et al., 2005).The EEMT values for each soil profile were extracted from a global map of EEMT derived from the monthly global climate dataset of New et al. (1999) at 0.5°x0.5°resolution using ArcMap 10.1 (ESRI, Redlands, CA) (Rasmussen et al., 2011).Total pedogenic energy (TPE, J m -2 ) was derived simply by multiplying EEMT (J m -2 yr -1 ) for each soil profile by its reported age (yr).TPE was used because it was a better predictor of soil physical properties relative to mean annual temperature, mean annual precipitation, or net primary productivity (Table 3).

Application to chronosequence data
The chronosequence database included 45 distinct chronosequences representing 416 different soil profiles.We focused here on changes in sand, silt, and clay content and solum thickness as proxies for soil change with time.We tested the approach on depth weighted (DWT) sand, silt and clay content (reported as weight %), as well as the maximum measured value of sand, silt, and clay content within each soil profile.Buried horizons were removed from the soil profiles before either the maximum or DWT content values were calculated.For soils reported in McFadden et al. (1986), surficial modern-aged eolian horizons were removed; the reported ages of the soil-geomorphic surface more closely matched the buried horizons under the eolian horizons.Solum thickness was extracted for each profile, defined as the thickness of the horizons influenced by pedogenic processes or the depth to C horizons (Schaetzl and Anderson, 2005).The site RW-14 from McFadden and Weldon (1987) was not included in the solum thickness model calculations, the measured solum thickness of RW-14 was 1460 cm, an order of magnitude greater than all other soil profiles included in the study.Four hundred and sixteen profiles reported clay content data, only 398 profiles reported sand and silt content, and 410 soil profiles contained a developed solum.We classified the soil profiles by parent material in terms of igneous, metamorphic, or sedimentary and by geomorphic landform (e.g., alluvial surface, marine terrace, or moraine, etc.) (Shoeneberger et al., 2012); for example, if a soil was formed on an alluvial fan from granitic parent material, it would be defined as alluvial and igneous.
Using the soils data, we calculated bivariate normal probability distributions using TPE and the soil physical properties (Eq.6).The soil data were transformed using logarithmic and square root transformations when appropriate to meet the normality assumption of the bivariate normal probability distribution.Conditional univariate normal distributions (Eqs.7, 8) were calculated to approximate probable ranges of soil properties using leave one out cross validation (LOOCV).Each of the soil chronosequences was removed from the model dataset, with the remaining chronosequence data used to calculate the parameters of the bivariate and conditional univariate normal distributions.The conditional univariate normal distributions were calculated using the TPE values for the profiles within the left-out chronosequence.

Application to complex terrain
By design, soil chronosequences are generally sited on gentle, low sloping terrain to minimize the influence of topography and erosion/deposition on soil formation (Harden, 1982).
However, much of the Earth's surface is characterized by complex topography with high relief, steep slopes, and differences in slope aspect.Any predictive soil model or approach must be effective in both simple terrain and complex terrain.To test the ability of the model to predict soil properties in complex terrain, we compiled data from upland catchments with variable parent material and topography from the literature, as well as data available from the US NSF Critical Zone Observatory Network (CZO, wwww.criticalzone.org)(Table 1) (Bacon et al., 2012;Dethier et al., 2012;Foster et al., 2015;Holleran et al., 2015;Lybrand and Rasmussen, 2015;Rasmussen, 2008;West et al., 2013).Data from several additional studies from complex terrain were also included to test the model (Table 1) (Dixon et al., 2009;Yoo et al., 2007).These data were accessed from: www.criticalzone.org, or Google Scholar (scholar.google.com).These studies were included because they all contained horizon-level soil texture data, soil depth, percent volume rock fragment data, and 10 Be or U-series measures of soil erosion rates or residence time, where mean residence time (MRT) was calculated as: MRT=h/E, where h is soil depth (m) and E is erosion rate (m/yr) (Pelletier and Rasmussen, 2009b).redistribution and primary productivity that can lead to significant topographic variation in EEMT (Rasmussen et al., 2015).Using Eq. 6 and the parameters generated from the chronosequence database, conditional mean depth weighted clay content was calculated for each profile.
Due to the influence of redistributive hillslope processes on soil development (Yoo et al., 2007), soil depth varies systematically across hillslopes (Heimsath et al., 1997); thus, soil depth can be used to incorporate information about these processes within the model calculations.We calculated the mass per area clay content of these profiles using soil depth to correct for these processes, as: where, ρ b is the soil bulk density assumed to be 1500 kg m -3 for all soil profiles, µ Y|X=x, DWT CLAY is the predicted conditional mean for depth weighted clay content (DWT CLAY) using Eq. 7, RF% is the measured depth weighted percent volume rock fragments within the soil, when no RF% data were available we assumed a value of 41.7%, which was the average RF% for profiles with reported values, and h is the soil depth in meters.Using Eq. 9, mass per area clay was calculated for each soil profile.Further, we examined the impact of depth, rock fragment percentage, and predicted conditional mean DWT clay on the predicted mass per area clay predictions using multiple linear regression.

Coupling geomorphic model with probabilistic model
Additionally, we applied the model independent of measured soil data, across a small complex catchment in the Santa Catalina Mountains (Catalina-Jemez CZO, Fig 2a-b EEMT and soil residence time to predict DWT clay, and coupled with modeled depth in Eq. 9 to predict mass per area clay at 2 m pixel resolution.We assumed a constant 50% rock fragment value for each location.The coupled geomorphic-TPE model outputs were compared with point measures of mass per area clay from Holleran et al. ( 2015) and Lybrand and Rasmussen (2015).
Model data were completely independent from Holleran et al. and Lybrand and Rasmussen and these datasets served as a validation for the modeled output.

Application and parameterization to chronosequences
The relationships between TPE and soil texture and solum thickness were used to calculate the bivariate probability distributions.The bivariate probability distributions (Eq.6) were parameterized using the chronosequence database (Table 2).Furthermore, the relationship between TPE and the soil properties was stronger than just using age, NPP, MAP, or MAT alone (Table 3).Age was expected to strongly correlate to the soil properties due to the design of chronosequence studies; however, comparing age and TPE separately, the percent increase in Spearman rank correlations (r) ranged from 1.9% (DWT Silt) to 22.4% (Max Sand).Maximum

Application in complex terrain
The model was much less effective in complex terrain and highly overpredicted DWT

Coupled geomorphic-TPE model
The

Model results for chronosequences
The model predicted maximum clay content across a diverse range of lithologies, climates, and landforms.Weathering and clay production are primary pedogenic processes (Birkeland, 1999;Schaetzl and Anderson, 2005), and because the model assumed all changes in the soil profile are due to these processes, the model was the most effective at predicting clay content.For initial soil states that begin pedogenesis with a potentially significant amount of clay-sized particles the model was much less effective.The soils of the Taiwanese Additionally, the model highly underestimated the clay content of soils located on coral reef terraces in tropical environments (Maejima et al., 2005;Muhs, 2001).Coral reef terraces represent a relatively unique landform that weathers rapidly to fine sized particles, especially under tropical climates, and generally have complicated parent material compositions (Muhs et al., 1987).The combination of these factors limited the ability of the model to predict the soil properties on these surfaces.
Sand and silt displayed weaker relationships with increasing total pedogenic energy.The lack of correlation of sand and silt to TPE may result in part from the definitions of the particle size classes.Sand sized particles span several orders of magnitude difference in particle size, ranging from particles of 2 mm to 0.05 mm (Soil Survery Staff, 2010), whereas clays are constrained to particles less than 0.002 mm.The sequential weathering of rock fragments and coarse sand to fine and very fine sands therefore is not reflected in total sand content and likely diminishes the relationship between sand content and total pedogenic energy and time (Pye and Sperling, 1983;Pye, 1983;Sharmeen and Willgoose, 2006).The relationship between silt content and pedogenic energy was the weakest of the three broad particles size classes (Tables 2,    3).Similar to sand, the silt size fractions span an order of magnitude in particle size ranging from 0.05 to 0.002 mm in diameter.Additionally, the silt fraction may also be heavily influenced by deposition of eolian material and thereby introduce an additional mass of silt that was not derived from the direct weathering of the initial soil forming system (McFadden et al., 1987) effectively uncoupling silt content from total pedogenic energy.Solum thickness displayed a relatively strong relationship with increasing pedogenic energy, with TPE explaining up to 42% of the variance in solum thickness (Tables 2, 3).Soil production is related to climatic variation (Amundson et al., 2015), with this variation partly captured by EEMT and TPE, leading to the slightly stronger predictive power of the model.
However, soil production is also highly influenced by redistributive hillslope process, chemical and physical weathering, and tectonic uplift (Heimsath et al., 1997;Riebe et al., 2004;Yoo and Mudd, 2008b), and can be a highly non-linear process (Pelletier and Rasmussen, 2009a).These factors were not directly accounted for in this study in that topography was not a quantified factor, which likely represents a large proportion of the remaining unexplained variance in solum thickness.

Model results in complex terrain
Due to using soil chronosequence data to parameterize the approach, the influence of redistributive hillslope processes was not captured.Additionally, in the amount of time required to transport soil across a hillslope, chemical and physical alterations of the soil particles are possible and may not be reflected in mean residence time calculations (Yoo and Mudd, 2008a;Yoo et al., 2007).Soil thickness is highly dependent upon hillslope position and landscape morphology (Dietrich et al., 2003;Heimsath et al., 1997;Pelletier and Rasmussen, 2009a).By using soil thickness as a proxy for the strength of these redistributive hillslope processes, and converting the predicted conditional mean clay content value to a mass per area basis, the model was able to capture differences in clay content across complex terrain for a variety of lithologies and climates.The differing lithologies, climates, or vegetation types did not appear to impact the ability of the model to predict clay contents, likely because soil depth accounts for many of these controls.Parent material and climate influence the weathering process and production of clay in soils (Harden and Taylor, 1983;Muhs et al., 2001); however, these factors are collinear with soil depth (Heckman and Rasmussen, 2011;Lybrand and Rasmussen, 2015;Pelletier and Rasmussen, 2009a), such that by including soil depth, differences due to lithology or climate were partly incorporated in the model prediction.

Results from coupled geomorphic-TPE model
For the majority of sites in the Marshall Gulch sub-catchment, the coupled geomorphic- assemblage, the amphibolite materials are likely weather at a faster rate compared to the granite derived soils (White et al., 2001;Wilson, 2004), resulting in greater clay production and likely explaining the underestimated clay contents.Inclusion of differential weathering rates for varying lithologies within the geomorphic model would likely lead to better prediction of clay contents.With these adjustments, the coupled geomorphic-TPE model represents an effective, independent prediction of clay stocks.

Advantages of probabilistic approach
Simplifying and representing the soil-forming factors as multivariate distributions and probabilities has the potential to quantitatively represent the general state-factor model, making the approach universally applicable.The initial state of the soil can likely never be fully known, leading to variability in soil properties over time that cannot necessarily, or ever, be attributed to any external factor (Phillips, 1989(Phillips, , 1993b)).A probabilistic approach utilizes that variability to drive predictions and understanding of these systems.Similar to the approach taken here, building distributions of the soil-forming state factors that are associated with distributions of particular soil properties could yield probabilistic predictions of soil formation and change.We selected to use a representation of climate and biology (EEMT), however, depending on the soil property of interest the variables needed to parameterize the distributions would likely change; for example, if interested in organic matter content, aboveground net primary productivity or normalized difference vegetation index may be better predictors of organic matter accumulation.
The strength of this approach lies in the fact that no assumptions are made about the initial conditions of the soil forming system or the specific soil forming processes.Predicting probable distributions of soil physical properties implicitly acknowledges that our understanding of any system is incomplete, but explicitly quantifies uncertainty in predictions and constrains the potential observable values to a predicted range.Utilizing this approach will require the necessary data to build distributions that are widely representative and applicable to most locations (Yaalon, 1975).With wide accessibility to large databases of soil information, such as the US National Soil Information System (NASIS) and the FAO Harmonized World Soil Database, access to the required amount and quality of data may be possible.Similar to the present study, simple bivariate distributions could be solved to calculate conditional distributions based on the soil-forming state factors, effectively producing quantitative probabilistic representations of Jenny's original equation (Jenny, 1941).
The simplicity of the present approach allows easy integration into pre-existing geomorphic models of landscape evolution.Past approaches that have combined pedogenic and landscape evolution models have generally focused on producing hypothetical soil-landscape relationships that progress forward through time (Minasny and McBratney, 2001;Vanwalleghem et al., 2013), or have focused on idealized landscapes (Temme and Vanwalleghem, 2015).
However, by combining probabilistic approaches parameterized using known landscapes, and geomorphically based landscape evolution models, both potential soil-landscape evolution scenarios can be investigated, as well as predictions of the current state of the soil-landscape.As There are obvious limitations within the current model: lack of consideration of parent material influences, topographic variation, or internal soil feedbacks and thresholds, and differences in paleoclimate variation.Parent material control on the relative proportion of weatherable minerals and mineral weathering rates (Jackson et al., 1948) can manifest as vastly different soil morphologies and rates of pedogenesis when controlling for other soil forming factors (Heckman and Rasmussen, 2011;Parsons and Herriman, 1975).The current approach implicitly assumes no information about the initial conditions, only that all clay production is a pedogenic process.Applying this approach to parent materials, where a large fraction of claysized particles formed through non-pedogenic processes, is thus limited and may explain why the model was ineffective for some soils.Refining the current approach would require normalization of soil to the particle size distribution of the soil parent material.Past studies have utilized highly characterized parent material data to model soil property change with time (Chadwick et al., 1990;Harden, 1982), but these data are generally difficult to obtain and often not reported in the available chronosequence literature.
Topography dictates soil chemical and physical properties and residence times, especially in complex terrain (Almond et al., 2007;Egli et al., 2008;Lybrand and Rasmussen, 2015), where non-linear diffusive hillslope processes control the fluxes of matter and energy into and out of the soil system (Heimsath et al., 1997;Pelletier and Rasmussen, 2009a;Rasmussen et al., 2015;Yoo and Mudd, 2008b;Yoo et al., 2007).Using earlier versions of EEMT (Rasmussen and Tabor, 2007;Rasmussen et al., 2005), the current formulation of the model and TPE does not explicitly quantify topographic variation, which may account for error within current soil property distributions and predictions.With the inclusion of topographic variation within EEMT (Rasmussen et al., 2015) and topographic control of soil residence times (Foster et al., 2015;SOIL Discuss., doi:10.5194/soil-2016-63, 2016 Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.West et al., 2013), we were able to correct this error within the present approach, particularly in complex terrain, and effectively predicted clay stocks.
Internal or intrinsic feedbacks and thresholds within the soil system drive pedogenic development without changes in the external state factors (Chadwick and Chorover, 2001;Muhs, 1984).For example, greater chemical weathering and clay production due to increased water residence time caused by argillic horizon development is the result of an internal feedback that is independent of the external climatic and biological system (Schaetzl and Anderson, 2005).These thresholds can operate as progressive or regressive processes, driving soil formation forward or hindering further development (Johnson and Watson-Stegner, 1987;Phillips, 1993a).Internal soil development feedbacks were not explicitly considered in the present model formulation.The presence of these internal feedbacks may partially explain error within the model predictions.
Changes in EEMT would not explain all observed differences in soil properties over the age of the soil.However, if these feedbacks were operating in the included soils, the influence of intrinsic thresholds was implicitly captured within the probability distributions, partially accounting for the role of internal soil development feedbacks on soil formation.
Furthermore, global climate patterns have shifted dramatically over the last 65 Mya (Zachos et al., 2001).The majority of soils observed in the compiled chronosequence database span the Quaternary, including both the Holocene and Pleistocene.The Pleistocene was marked by a number of major glacial-interglacial cycles at approximately 100,000-year intervals (Imbrie et al., 1992;Wallace and Hobbs, 2006), which corresponded with shifting climatic conditions, e.g., for large portions of the northern mid-latitudes glacial periods were generally cooler and wetter, and interglacial periods were warmer and drier (Connin et al., 1998;Petit et al., 1999).
Further, the Pleistocene climate shifts likely influenced the rates of weathering and clay production (Hotchkiss et al., 2000).Taking into account the differences in past and modern climate would likely diminish disparities between observed and modeled soil physical properties.
Reconstructed global paleo-EEMT values would improve model accuracy, and limit uncertainty in the probabilistic ranges of soil properties for soils older than Holocene age.

Conclusion
The present approach effectively predicts soil physical properties across a diverse range of geomorphic surfaces, lithologies, ecosystems, and climates.Further, this approach is mathematically simple and only requires knowledge of the probable age of a geomorphic surface and the effective energy and mass transfer value associated with a given location, making this approach universally applicable.The simplicity of the probabilistic approach is the lack of assumptions about the initial conditions of the soil forming state or the processes driving soil property change.A probabilistic approach does not exactly predict a soil physical property value at a given location, but constrains the probable values based upon the state of the external environment to the soil.Using probabilistic approaches, we can model probable soil-landscape evolution scenarios, greatly informing our understanding of the evolution of critical zone structure.
provided in support of the Catalina-Jemez Critical Zone Observatory.LiDAR data acquisition was supported by U.S. National Science Foundation grant no.EAR-0922307 (P.I.Qinghua Guo).
Table 1.Complex terrain study sites and characteristics.
We used published coordinates to extract EEMT values, calculated from New et al. (1999), for each soil profile using ArcGIS 10.1, and used EEMT and MRT to calculate TPE.It should be noted the coarse resolution of New et al. (1999) EEMT values do not account for local scale variation in water SOIL Discuss., doi:10.5194/soil-2016-63,2016   Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.
clay contents in soils located in complex landscapes(Fig 6a, r 2 =0.26, y=0.39x+7.27,p<0.0001,RMSE=5.3%).The model highly over predicted the clay content of the South Carolina site and the Gordon Gulch soils, and under predicted the clay content of the Rincon, Santa Catalina, Jemez sites.When correcting for the influence of hillslope processes by explicitly including soil depth and calculating mass per area clay, the approach effectively predicted clay content, with an r 2 =0.81(Fig 6b, p<0.0001,RMSE=84.4 kg clay m -2 ), only slightly overpredicting clay content, with a slope of 1.56.Soil depth was the strongest contributing factor to the mass per area clay prediction with the greatest sums of squares in a simple multiple linear regression including depth, RF%, and DWT clay% (Table4); predicted conditional mean clay content percentage was the second strongest contributing factor to the mass per area clay prediction.Rock fragment percentage did not influence the mass per area clay content prediction.SOIL Discuss., doi:10.5194/soil-2016-63,2016 Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.
coupled geomorphic-TPE model effectively predicted mass per area clay for the majority of soils located within the Marshall Gulch subcatchment with an r 2 =0.74 (Fig 7a, y=0.85x-5.00,p<0.0001,RMSE=17.7 kg clay m -2 ).For a subset of soils, the model did not effectively predict mass per area clay, and were excluded from the regression in Fig 7a; four of these soils were located on the east-facing ridge of the catchment, and an additional two soils were formed on amphibolite rather than the granite or quartzite materials that all of the other soils in the catchment were derived from.All of these locations also exhibited a poor fit between modeled and measured soil depth (Fig 2e).The spatial distribution of mass per area clay was also predicted across the catchment (Fig 7b), independently of measured data, and generally conformed to previously predicted spatial distribution of clay stocks in the Marshall Gulch catchment (Holleran et al., 2015).

SOIL
Discuss., doi:10.5194/soil-2016Discuss., doi:10.5194/soil--63, 2016     Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.chronosequence formed from conglomerates(Huang et al., 2010); conglomerates are typically poorly sorted, such that these soils initially formed with high clay contents slowing clay accumulation, limiting the effectiveness of the model to predict clay contents in these soils.
TPE model was highly effective at predicting clay content, and the spatial distribution of clay stocks.Large differences were found for four soils located on the east-facing ridge of the catchment underlain by granite with the model generally over-predicting soil depth and clay content.Discrepancies between the modeled and measured depths were likely the primary sources of error within the mass per area clay predictions for the 4 east-facing ridge soils (Fig2e).The geomorphic model predicted deeper soil depths due to the presence of an apparent convergent zone on the east-facing ridge of the sub-catchment; however, this convergent zone is only a small feeder tributary to the larger catchment drainage.The inability of the model to effectively predict clay contents and the mismatch between modeled and actual soil depths in the 4 soils located on the east-facing ridge is likely due to this local, fine-scale topographic variation Error in predicted soil depths due to fine-scale differences in lithology within the Marshall Gulch sub-catchment partly explains the discrepancies between measured and predicted mass per area clay contents.For two amphibolite-derived soils, the model greatly underestimated mass per area clay.The geomorphic soil depth model assumed a uniform weathering rate based on the granitic soils(Pelletier and Rasmussen, 2009a); due to differences in primary mineralSOIL Discuss., doi:10.5194/soil-2016-63,2016   Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.
was demonstrated in Fig 7B, combining the present approach with geomorphically based soil depth models generated from DEMs has great potential to predict soil properties across a diverse range of environments, without needing prior knowledge of the landscape other than topography and climate.4.3 Limitations and potential refinements SOIL Discuss., doi:10.5194/soil-2016-63,2016 Manuscript under review for journal SOIL Published: 18 October 2016 c Author(s) 2016.CC-BY 3.0 License.

Figure 2 .
Figure 2. Marshall Gulch study site.(A) Location of the Santa Catalina Mountains and the Marshall Gulch catchment within Arizona, USA; (B) Elevation of the granite sub-catchment of Marshall Gulch; (C) Predicted soil depth in the granite sub-catchment (Pelletier and Rasmussen, 2009a); (D) EEMTv2.0 in the granite sub-catchment(Rasmussen et al., 2015); (E) Mismatch between the measured soil depths and predicted soil depths.

Figure 3 .
Figure 3. Bivariate normal distribution between TPE and max clay content.The points indicate individual soils.The red ellipses represent lines of equal probability, which corresponds to a three dimensional probability distribution.From this relationship the conditional mean and variances for the soil physical properties were calculated.

Figure 4 .
Figure 4. LOOCV results for max clay content.The results were subdivided by general soil parent material: igneous, metamorphic, and sedimentary; the points represent the geomorphic surface each soil formed on, and the colors represents the EEMT value for the location of each soil.Using LOOCV, where one chronosequence was removed from the model dataset and the remaining datasets were used to predict the parameters of the bivariate distributions, the conditional means of the left out chronosequence was determined.The model was effectively able to predict the conditional mean values of the max clay contents with an r 2 =0.54 (RMSE=14.7%).The model was least capable of predicting the clay contents on coral reef terraces (+), and appeared the most effective for alluvial surfaces (□).

Figure 5 .
Figure 5. Selected relationships between the measured maximum clay content and predicted maximum clay content.A) Harden, 1987, B) Howard et al., 1993, C) Birkeland 1984, D) Muhs, 2001, E) Huang et al., 2010, and F) Merritts et al., 1991.The errors represent the conditional standard deviations around the mean, which correspond to a probability of 50%.The model effectively predicted clay content across a diverse range of climates, landforms, and parent materials.The model was the least effective at predicting the clay content of soils in tropical climates, and soils forming on coral reef terraces.

Table 2 .
Parameters for the bivariate normal probability distributions for the soil physical properties and TPE.

Table 2 . Soil property parameters
a ρ, Pearson correlation between soil variables and Total Pedogenic Energy

Table 3 .
Spearman rank correlations between soil physical properties and TPE and age.

Table 3 . Spearman Rank Correlations
Max indicates maximum content DWT indicates depth weighted average content a Precent increase in Spearman rank correlation between TPE and age