Next Article in Journal
Gap-Scale Disturbance Patterns and Processes in a Montane Pinus palustris Woodland
Previous Article in Journal
Effective Methods for Adventitious Root Regeneration on Weeping Fig Stems
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Comparison of Modeling Approaches for the Height–diameter Relationship: An Example with Planted Mongolian Pine (Pinus sylvestris var. mongolica) Trees in Northeast China

Key Laboratory of Sustainable Forest Ecosystem Management-Ministry of Education, School of Forestry, Northeast Forestry University, Harbin 150040, China
*
Author to whom correspondence should be addressed.
Forests 2022, 13(8), 1168; https://doi.org/10.3390/f13081168
Submission received: 13 June 2022 / Revised: 19 July 2022 / Accepted: 21 July 2022 / Published: 23 July 2022
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract

:
In the process of modeling height–diameter models for Mongolian pine (Pinus sylvestris var. mongolica), the fitting abilities of six models were compared: (1) a basic model with only diameter at breast height (D) as a predictor (BM); (2) a plot-level basic mixed-effects model (BMM); (3) quantile regression with nine quantiles based on BM (BQR); (4) a generalized model with stand or competition covariates (GM); (5) a plot-level generalized mixed-effects model (GMM); and (6) quantile regression with nine quantiles based on GM (GQR). The prediction bias of the developed models was assessed in cases of total tree height (H) predictions with calibration or without calibration. The results showed that extending the Chapman–Richards function with the dominant height and relative size of individual trees improved the prediction accuracy. Prediction accuracy was improved significantly when H predictions were calibrated for all models, among which GMM performed best because random effect calibration provided the lowest prediction bias. When at least 8% of the trees were selected from a new plot, relatively accurate and low-cost prediction results were obtained by all models. When predicting the H values of Mongolian pine for a new stand, GMM and BMM were preferable if there were available height measurements for calibration; otherwise, GQR was the best choice.

1. Introduction

Diameter at breast height (D) and total tree height (H) are the two most important variables for measuring forest inventory, and are often used to estimate site index, biomass, growth and yield, and other important parameters [1,2,3,4,5]. Measuring D is a quick, easy and reliable process, while measuring H is difficult and time-consuming [6]. Moreover, the measurement of H is often severely affected by visual impairment, which may produce considerable bias [7]. Therefore, developing a highly accurate height–diameter (H–D) model is necessary since forest managers prefer to use the H–D model to estimate H. The H–D relationship varies in different growing conditions, as well as with differences in other factors. Research has shown that predicting H with only the D variable is unreliable because trees with similar D values may have different H values, and the base H–D model (BM) is only suitable for homogeneous stands [8]. Numerous studies have shown that the generalized H–D model (GM) has wider applicability than the BM [9,10]. The GM takes stand and competitive factors into account, and the local variation in the relationship between D and H can be explained by extending other stand-level and tree-level variables [11]. The variables expanded to the BM usually include the average height of dominant trees (Hdom) [12], the quadratic mean diameter (Dq) [13], stand density (SD) [14], the ratio of diameter to Dq (Rd) [10], and stand basal area (BAS) [15]. Among these stand variables, Hdom is the most useful because it can represent the combined impact of stand level and site quality [16]. Thus, the accuracy of H predictions can be significantly improved by incorporating Hdom into the model [6,17]. Additionally, competing variables are sometimes introduced into the H–D model to improve prediction accuracy [18]. To date, numerous H–D models have been developed for different tree species around the world [14,19,20,21,22]. Although current research suggests many different methods for modeling the H–D relationship, there is no unified conclusion as to which method is most useful. Therefore, choosing the most appropriate modeling method to accurately predict H is a problem that urgently needs to be solved.
In the past, ordinary least squares (OLS) regression, which can provide appropriate parameter estimates for the overall average, was widely used to develop the H–D model [8]. In general, H–D data are characterized by a grouping structure of more than one level, such as trees nested within the plots and plots nested within the region, which may influence the H–D relationship. The OLS method is insufficient for explaining potential differences among groups [23]. Consequently, the nonlinear mixed effects (NLME) modeling approach is suitable to account for differences between groups and can reduce predicted bias [24,25]. In NLME, part or all of the model parameters include both fixed effects and random effects. The fixed effects enable the data modeling of typical sample plots, while the random effects explain the differences between each level based on the data structure [26,27]. In addition, NLME can solve the problem of high correlation encountered in datasets with a hierarchical data structure [28], while the OLS method is unable to solve this problem [14]. Therefore, NLME is more popular in the field of forestry [18,29,30].
Quantile regression (QR) was proposed by Koenker and Bassett [31] and has become increasingly popular in forestry research. QR is a method for estimating the conditional distribution of the dependent variable, which allows for more flexibility in expressing the relationship between the response variable and the predictor variable [9]. QR has been applied to self-thinning boundary lines [32], height–diameter [10], taper models [33,34], diameter growth [35], and crown modeling [36,37]. For different quantile curves, two adjacent quantile curves can be used to calibrate the plot-specific H–D curve, which can provide more accurate tree height predictions [10]. As a result, each plot has its H–D curve, which is similar to the NLME approach. QR calibration has been used for tree diameter growth prediction [35], tree height prediction [9,10], branch growth prediction [38], and stand volume growth modeling [39]. Currently, research on QR calibration has not received a great deal of attention, and more diverse data are needed to validate the method.
The calibration of model predictions has become a trend in recent research regarding the H–D model. Models based on OLS, NLME, and QR require calibration for the most accurate predictions [10]. For the OLS method, the calibration factor can be constructed and used to calibrate H predictions [40]. For NLME models, the empirical best linear unbiased prediction (EBLUP) method is used to calculate the random effects parameters [41]. The QR method constructs a new H–D curve for each group, which is achieved by combining two consecutive known curves for a specific quantile [38]. With the calibration of model predictions, an additional subsample consisting of H measurements from a new group is indispensable for the three modeling approaches. At present, the most critical issue is the size of the actual calibration subsample used. Therefore, it is important to analyze the effect of different sample sizes on the prediction accuracy of the three modeling methods [38,42,43]. If the subsample is not measured and the model predictions cannot be calibrated, H can still be predicted, but may with a larger bias [10].
Mongolian pine (Pinus sylvestris var. mongolica) is a geographic variety of Scots pine (Pinus sylvestris) and plays an important role in sand fixation, carbon sequestration, water and soil conservation, and timber production [44]. After the successful introduction of Mongolian pine, this tree species has been used in afforestation in large numbers in Northeast China [45]. It has the advantages of being good quality timber and of exhibiting strong growth ability under harsh site conditions [46]. Therefore, it is necessary to provide a variety of management schemes based on tree and stand attributes to maximize the ecological benefits of Mongolian pine [47]. To this end, a systematic study of the H–D curve of Mongolian pine is necessary to provide highly accurate H predictions, which can also help forest managers effectively plan Mongolian pine plantations. Until now, relatively few H–D models of Mongolian pine have been developed in Northeast China; only the H–D model of Mongolian pine grown in sandy areas has been developed [48], and an H–D model of Mongolian pine with wider applicability has not been proposed.
Thus, the objectives of this study were: (1) to evaluate the predictive capability of H–D models developed using the basic model (BM), basic mixed-effects model (BMM), basic nine-quantile regression model (BQR), generalized model (GM), generalized mixed-effects model (GMM), and generalized nine-quantile regression model (GQR); (2) to evaluate the prediction performance of different modeling methods using the jackknife technique; (3) to determine the required sampling percentage for calibration; and (4) to recommend several alternatives, considering their accuracy and applicability.

2. Materials and Methods

2.1. Study Area and Data Collection

Heilongjiang Province is the northernmost and easternmost provincial administrative region in China (from 121°11′–135°05′ E and from 43°26′–53°33′ N), and it is one of the largest Chinese forestry provinces. The forest area, total amount of forest volume, and timber production in Heilongjiang are among the highest ranked in the country. The area has a temperate continental climate, with a low-temperature and dry spring, warm and rainy summer, waterlogging-prone autumn with an early frost, and a cold, long winter. The annual average temperature is 3.5 °C, the altitude of the mountains is between 50 and 1000 m, and the average annual precipitation is 525 mm.
The data in this study are from pure plantations of Mongolian pine with different ages (12–59 years). The experimental data are mainly from Mengjiagang Forest Farm (130°32′–130°52′ E, 46°20′–45°30′ N) and Maoershan Forest Farm (127°30′–127°34′ E, 45°20′–45°25′ N). Based on criteria, such as stand development stage, site quality, and stand density, a number of pure plantations in Mengjiagang Forest Farm and Maoershan Forest Farm were selected randomly. In each selected plantation, rectangular or square plots were established and the sample plot sizes ranged from 0.02 to 0.2 ha. The differences in shapes and sizes of sample plots were mainly determined by the stand area, terrain, and other conditions. Finally, a total of 183 plots were established during the last twenty years. The distribution of the plots was representative of the current age range, stand density, and altitude of Mongolian pine. In the field work, the basic properties of all living trees were measured and recorded, such as D, H, tree status, and so on. D was measured using a diameter tape measure, and H was measured by a hypsometer (Vertex IV Ultrasonic, Haglöf, Sweden). A total of 10,841 individual Mongolian pines were measured. A basic summary of the modeling data is presented in Table 1, and the scatter plot between D and H is shown in Figure 1. Additionally, stand and competition variables were calculated, such as dominant tree height (Hdom, the average height of the 100 thickest trees per hectare), basal area (BAS), dominant tree diameter (Ddom, the average D of the 100 thickest trees per hectare), quadratic mean diameter (Dq), stand density (SD), the basal area of trees with larger diameter than the subjected tree (BAL), and relative size in the stand (Rd, ratio of D to Dq).

2.2. Methods

2.2.1. Basic and Generalized Model

The relationship between H and D has been described using a number of functions. In previous studies, the Chapman–Richards model has been shown to be highly flexible and exhibits satisfactory performance for fitting statistics; this model is one of the most widely used functions for developing H–D models [17,49,50,51,52]. Therefore, the Chapman–Richards function is used in this study as the form of the BM for Mongolian pine. The specific formula is
H i j = 1.3 + β 1 ( 1 e β 2 D i j ) β 3 + ε i j
where H i j and D i j are the total height and diameter at breast height of the j th Mongolian pine in the i th plot, respectively; β 1 β 3 are the parameters of this model; and the ε i j terms are the residuals.
The H–D relationship is greatly affected by stand characteristics and competition status [29]. Therefore, it is necessary to extend the BM to the GM by introducing covariates. To consider the effects of stand and competition variables on the H–D relationship, we took all variables in Table 1 as candidate variables. After evaluating the effect of stand variables and competing variables on the H–D relationship by a two-step covariate selection method, we developed a GM to predict H. First, we used the BM to fit the data of each plot and obtain the corresponding parameter estimates. Then, the relationship between the model coefficients and candidate variables was assessed through scatter plots and correlation analysis [50]. The stand factor and competitive factor introduced need to be significant and the fitting accuracy of the GM needs to be significantly improved. In addition, a collinearity analysis that make sure the independence of the variables was also needed.
As a result, Hdom and Rd were selected as the best covariates, as the inclusion of the two variables outperformed other candidates and the corresponding model coefficients were significant (at p < 0.05 level) according to the approximate t-statistic. In addition, the condition number (CN) was used as a collinearity diagnostic indicator to assess the collinearity effects, which indicated that there is no multicollinearity among the variables, using the CN = 30 as a threshold value [53]. Indeed, the CN of the GM was 21.33, which suggested that multicollinearity is not detected. Finally, the specific form of GM was as follows:
H i j = 1.3 + ( β 1 + β 2 H d o m i + β 3 R d i j ) ( 1 e β 4 D i j ) β 5 + ε i j
where β 1 β 5 are the parameters of this model, and the others were defined above.
Temesgen et al. [40] showed that predictions of ordinary models can be calibrated by a calibration factor, which can reduce a certain bias. According to the study of Temesgen et al. [40], the following calibration factors can be calculated when an H subsample is obtained:
c f = i = 1 n k ( H ^ i 1.3 ) × ( H i 1.3 ) i = 1 n k ( H ^ i 1.3 ) 2
where c f is the calibration factor, H ^ i is the H prediction made by the BM or GM, H i is the observed H of tree i th in plot k , and n k is the size of the subsample in plot k . Then the calibrated H predictions ( H C ) can be expressed as follows:
H C = 1.3 + c f × ( H ^ i 1.3 )
where H ^ i is the prediction value of the BM or GM, and H C represents the predicted value after calibration.

2.2.2. Basic and Generalized Mixed-Effects Models

NLME models can handle the intrinsic variation and nested structure of sample data. In NLME, all parameters in the model can be represented as fixed effect parameters, and some or all the parameters can contain additional random components. The single-level NLME form was as follows [54]:
y i = f ( Φ i , x i ) + ε i , i = 1 , 2 , , m
Φ i = A i β + B i b i
where m is the total number of plots; Φ i represents the formal parameter vector, which includes the fixed-effect parameter vector β and random-effect parameter vector b i of the i th sample plot; and symbols A i and B i are the design matrices for β and b i , respectively. y i and x i represent the response and independent variable vectors, respectively, and ε i is normally distributed within the group error. b i and ε i follow a normal distribution:
b i ~ N ( 0 , Ψ )
ε i ~ N ( 0 , R i )
R i = σ 2 G i 0.5 Γ i G i 0.5
where Ψ   is   the   variance covariance   matrix   of   b i , R i is the within-plot variance–covariance matrix, σ 2 is the residual variance common to all plots, G i accounts for a within-plot variance heteroscedasticity, and Γ i is the identity matrix and accounts for the within-plot autocorrelation structure of residual errors. Since the H–D model was prone to heteroscedasticity, a power variance function was applied to correct for variance heterogeneity (Equation (10)) [55,56]. In addition, the heteroscedasticity of the Basic and Generalized models can also be solved by the same weighting function.
v a r ( ε i j ) = σ 2 H ^ i j 2 δ
where δ is the estimated parameter and ε i j and H ^ i j are the residuals and H predictions, respectively, obtained by fixed-effect unweighted regression.
In NLME, there were two different scenarios of model prediction: fixed effects and a combination of fixed and random effects [57,58]. The NLME model results without random effects were referred to as uncalibrated predictions; conversely, the model results using random effects were referred to as calibrated predictions. Estimates of random effects parameters were calculated using the EBLUP method, where the detailed formula was [59,60,61,62]
u k = Ψ Z k T ( Z k Ψ Z k T + R k ) 1 ( y k y ^ k f i x e d )
where u k is a vector of the random effects for plot k , Ψ is the estimate of the variance-covariance matrix for the random effects, R k is the estimated variance-covariance matrix of within-group errors, Z k is the design matrix of the partial derivatives of the nonlinear function corresponding to the random parameters, Z K T represents the transposition of Z k and e k is the vector of residuals calculated by the fixed-effects predictions and observations.

2.2.3. Basic and Generalized Quantile Regression Models

QR can accurately estimate the complete conditional distribution of the dependent variable and can assess the predictors’ effects at different quantiles [63]. Based on Equations (1) and (2), the basic and generalized QR models were as follows:
H ^ τ ( D i j ) = 1.3 + ( β 1 ) ( 1 e ( β 2 ) D i j ) β 3
H ^ τ ( D i j ) = 1.3 + ( ( β 1 ) + β 2 H d o m i + ( β 3 ) R d i j ) ( 1 e β 4 D i j ) β 5
where H ^ τ is the predicted tree H at the τ th quantile and τ is the quantile. The quantile regression parameters were estimated by minimizing Equation (14), which was different from minimizing the sum of squared deviations in the OLS technique [64].
S = y ( D i j ) y ^ τ ( D i j )   τ [ y i j y ^ τ ( D i j ) ] + y ( D i j ) < y ^ τ ( D i j ) ( 1 τ ) [ y ^ τ ( D i j ) y i j ]
where S is the sum of the weighted residuals of the τ th quantile and y i j and y ^ τ ( D i j ) are the measured H and predicted H in the τ th quantile, respectively.
Model predictions of QR can be calibrated using different quantile curves. In this study, the nine-quantile regression ( τ = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) was found to generate better statistical results than other combinations of quantiles because the current data were too scattered. In predicting Hs for plot k by the QR model, the first step was to identify the two correct closest quantile regression curves based on the smallest absolute mean difference between observed and QR predicted Hs of the subsampled trees measured from plot k . Then, two consecutive quantile regressions (i.e., q th and ( q + 0.1)th) were chosen for interpolation of the H–D curve:
y ^ k i ( D i j ) = α y ^ q ( D i j ) + ( 1 α ) y ^ q + 0.1 ( D i j )
where y ^ k i ( D i j ) indicates the predicted H of the i th tree in plot k , y ^ q ( D i j ) and y ^ q + 0.1 ( D i j ) are the QR predictions in the q th and ( q + 0.1)th quantiles when D is D i j , and q can be 0.1, 0.2,…,0.8. α represents the interpolated ratio and was obtained by minimizing [35]:
Q = [ y k i ( D i j ) y ^ k i ( D i j ) ] 2

2.2.4. Model Evaluation and Predictions with/without Calibration

Six models were evaluated in this study: (1) BM; (2) BMM; (3) BQR; (4) GM; (5) GMM; and (6) GQR. In the current cases, given the lack of independent data sets, data splitting is unreasonable. Therefore, the use of cross validation techniques, such as the jackknife technique, was considered to be an appropriate way to evaluate the prediction accuracy of a model [65,66]. Thus, the jackknife technique [38,67] was applied to test the prediction ability of the developed models in both calibrated and uncalibrated cases, where the models were fitted using leave-one-out sample plots (i.e., m-1 sample plots), and the fitted model was used to predict each tree height in the excluded sample plot. Three indices were used for the assessment of fitting performance in this study: adjusted coefficient of determination ( R a 2 ), root mean square error (RMSE), and Akaike information criterion (AIC). Prediction bias was evaluated by two statistics: mean absolute percent error (MAPE) and mean absolute error (MAE).
R a 2 = 1 ( n 1 n p ) i = 1 m j = 1 n i ( y i j y ^ i j ) 2 i = 1 m j = 1 n i ( y i j y ¯ ) 2
R M S E = i = 1 m j = 1 n i ( y i j y ^ i j ) 2 i = 1 m n i p
A I C = 2 L L + 2 p
M A P E = i = 1 m j = 1 n i | y i j y ^ i j , i | / y i j i = 1 m n i
M A E = i = 1 m j = 1 n i | y i j y ^ i j , i | i = 1 m n i
where m is the total number of sampled plots; n i is the size of the i th plot; L L was the log likelihood value, p is the number of effective parameters; y i j and y ^ i j are the observed and fitted tree heights, respectively; and y ^ i j , i is the predicted tree height obtained by the jackknife technique.
To obtain H predictions with higher accuracy for a new plot, the predictions of all six models were calibrated once there was an H subsample. It was reported that the prediction accuracy improved as the sample size of measured subsamples increased [9,10]. However, more measurements require more work, which would increase the cost and time of fieldwork. Therefore, it was necessary to find an appropriate size that considered both the cost of the survey and the prediction accuracy. In this study, a percentage of trees in each plot were sampled for calibration to improve the prediction ability as there were obvious differences in the sizes of the plots. l ( l = 2%, 4%, 6%, 8%, …, 20%) sample trees were selected to adjust the calibration coefficients, random effects, and two specific continuous quantile curves to compare the predictive performance of the H–D models. The selection was a random process without replacement, and the calibration was repeated 200 times in each subsample size [6,68].

3. Results

3.1. Model Fitting

A preliminary result for the analysis of random effects parameters showed that the optimal forms of the BMM (Equation (22)) and GMM (Equation (23)) were expressed as follows:
H i j = 1.3 + ( β 1 + u 1 ) ( 1 e ( β 2 + u 2 ) D i j ) β 3 + ε i j
H i j = 1.3 + ( β 1 + u 1 + β 2 H d o m i + ( β 3 + u 3 ) R d i j ) ( 1 e β 4 D i j ) β 5 + ε i j
where the variables are as defined above, β 1 β 5 are fixed–effects parameters, u 1 u 3 are random effects parameters.
Among the BM, BMM, GM, and GMM, the BM exhibited heteroscedasticity and a weighted regression was applied with Equation (10), while others had homogeneous variance (Figure 2). The parameter estimates of the BM, BMM, and the basic nine quantile regressions are listed in Table 2, and the parameter estimates of the GM, GMM, and the generalized nine quantile regressions are given in Table 3. All parameters were statistically significant, with p values < 0.05. The fit statistics of the six models are shown in Table 4. BM described 71.76% of the variation when only D was used as a predictor. After the introduction of other variables (Hdom and Rd) in BM, the model fit was significantly improved, as R a 2 increased by 34.04% (from 0.7176 to 0.9619), RMSE decreased by 63.28% (from 3.4644 to 1.2718), and AIC decreased by 36.60%. After the introduction of random effects in the BM and GM, the fit of the model was further improved, which had a higher value of R a 2 and lower values of RMSE and AIC. The AIC value of the GMM was lower than that of the BMM; however, the R a 2 and RMSE of the BMM were similar to (or even better than) the GMM results. In addition, based on the fit statistics for BQR (τ = 0.5) and GQR (τ = 0.5), all indicators were similar to those of the BM and GM, respectively, with the exception of a large gap between the AIC of BQR (τ = 0.5) and that of the BM.

3.2. Model Evaluation and Prediction Comparison

The results of the evaluation of the six models are listed in Table 5. The performance of the BM was relatively poor when compared to other multivariate equations, which suggested that using D alone as a predictor is not sufficient to explain the variation in H. When all methods were uncalibrated, all models with D as the only predictor had low prediction accuracy, and the highest prediction accuracy was obtained with BQR (τ = 0.5), followed by the BM and BMM. In uncalibrated methods with multiple predictors, the GQR model (τ = 0.5) had the best model performance when compared with the other two models, as the MAPE and MAE were 6.9132% and 0.9706, respectively. The performance of the GM was worse than that of the GQR model but slightly better than that of the GMM. Overall, the GQR model had the highest prediction accuracy, followed by the GM, GMM, BQR model, BM, and BMM in the case of using models without calibration.
When all methods were calibrated, in calibrated methods containing only D as a predictor, the prediction accuracy of the calibrated BMM and calibrated BQR model was better than that of the calibrated BM, while the prediction accuracy of the calibrated BMM was higher than that of the BQR model (Table 5). In the calibrated methods with multiple predictors, the prediction accuracies of the calibrated GMM and calibrated GQR model were better than that of the calibrated GM, while the prediction accuracy of the calibrated GMM was higher than that of the GQR model. When considering all calibration models, the GMM had the highest prediction accuracy, followed by the GQR model, GM, BMM, BQR model, and BM. Scatter plots of measured tree heights and calibrated predicted tree heights are shown in Figure 3.
Figure 4 shows the statistics for calibrating the six developed models using different sampling percentages. The prediction performance of each model improved as the sampling percentage per plot increased. Compared to the case where the subsample was zero and, thus, the model was uncalibrated, the BM, BMM, BQR model, and GMM performed better based on the MAPE and MAE values. The calibrated GQR model and calibrated GM yielded better statistics than the corresponding uncalibrated models when measuring 4% and 6% of trees per plot, respectively (Table 6). Regardless of the sampling percentage, the GMM always exhibited better predictive performance than the other models, followed by the GQR model, GM, BMM, BQR model, and BM. Thus, tree height predictions calibrated by sampling 8% of trees per plot were optimal in terms of the prediction accuracy and considering the measurement costs in our study.
A further comparison was carried out using trees with a variety of diameters and calibrated based on 8% of the trees in each plot. Boxplots of MAPE and MAE across D classes (taking 4 cm as a class) were used to compare the BM, BMM, BQR model, GM, GMM, and GQR model in terms of prediction performance and stability (Figure 5). The trees with D values greater than 36 cm were classified into 34 diameter classes, as there were few trees that in which D was greater than 36 cm. Both BM and BQR exhibited large changes and relatively poor prediction performance in all D classes predicted at the tree level in terms of MAPE. For the BMM, GM, GMM, and GQR model, the mean MAPE in different D classes decreased with increasing D, and the mean MAPE in different D classes for the GMM was closer to 0 than those of other models, which had a narrower distribution. The MAE, BM and BQR model yielded poor prediction results in different D classes, and the MAE distribution range was larger. In addition, the BMM, GM, GMM, and GQR model were relatively stable in predicting tree height in different D classes. The MAE distribution range of the GMM was always narrowest in different D classes; therefore, a higher prediction accuracy was obtained. Overall, the GMM outperformed other models in predicting individual tree heights, followed by the GQR model, GM, BMM, BQR model, and BM.

4. Discussion

4.1. Specification of a Generalized H–D Model

H–D models are often used to predict individual H, which is necessary for estimating tree volume and predicting stand development over time [14]. Therefore, it is particularly important to determine the most appropriate method for developing H–D models. Thus, this study used the H–D dataset of Mongolian pine to develop H–D models and compared the predictive power of OLS, NLME and QR approaches. The Chapman–Richards function has been recognized as one of the most widely used functions in the H–D model due to its flexibility in current studies [50,51]. Previous studies have shown that there are certain differences in the H–D relationship between different stands, and the BM with only D as the predictor was only applicable to the stand for which the data were collected [17,69]. Therefore, a GM with wider applicability was proposed by introducing significant stand-level covariates or competition indices into the basic Chapman–Richards model [9]. In the GM of this study, the asymptotic parameters were best explained by the stand-related variable Hdom, as Hdom can represent the pros and cons of stand quality [70]. In addition, Hdom represented the developmental stage of the stand. In the H–D model, Hdom is often used as a covariate to reduce the bias of H prediction, [29,51,68]. For the tree-level competing factors that affect the H–D relationship, we chose Rd as the representative variable, and the larger the RD, the stronger the competitiveness of the tree. We incorporated Rd into the asymptote parameter as well. Additionally, the independence of all the variables (i.e., D, HDOM, and RD) was further tested by using the CN indictor, which would influence the inferences made about significance and parameter estimates [71,72]. It was found that the CN of the GM was smaller than 30, which suggested that there was no multicollinearity. The final form of GM was a Chapman–Richards function with Hdom and Rd, which had a high degree of flexibility and could further explain the variation in H for similar values of D in different stands.

4.2. Comparison of Modeling Approaches

There are still many stochastic variabilities in the H–D relationship controlled by other factors, such as soil, terrain, and climate factors [28,70,73]. The useful variables are sometimes difficult to find, which suggests that the GM may be not sufficient to explain all the differences between stands [70]. Therefore, this study developed H–D models by using the NLME and QR methods. Plot-level random effects can accurately express the H–D relationship. In addition, a set of QR models can accurately reveal possible relationships that standard regression analysis is unable to determine [38]. Different quantile curves can provide the H–D relationship of a certain plot more flexibly.
The fit indices of the BMM and GMM were somewhat unexpected, the statistics R a 2 and RMSE indicated that the BMM has a better fit than the GMM, while the AIC of the GMM was better than that of the BMM, which was consistent with Han et al. [18,48]. A possible explanation for why the GMM does not provide better precision than the BMM is related to the ability of the random-effects parameter to explain the various H–D relationships among stands. In our research, both the stand variable and the random effects were introduced alongside the asymptote coefficient, and the random effects of the BMM were significantly larger than those of the GMM [48].
Among the three modeling methods OLS, NLME, and QR, it was found that the developed NLME model and QR model were better than the OLS model because NLME and QR took into account the potential differences between plots [39]. In addition, the OLS and NLME methods for H–D modeling have been widely studied, while the use of QR has been relatively limited [7]. In our study, the results showed that the GMM and GQR model achieved accurate predictions with calibration. Generally, QR is more flexible than the NLME model because it can provide different parameter estimates based on different quantiles [10,38], and it has the ability to provide different H–D curves, as shown in Table 2 and Table 3. However, evaluation statistics showed that the prediction accuracy of the calibrated NLME model was better than that of the calibrated QR model (Table 5). Özçelik et al. [9] found that QR curves from different plots generally follow the basic shapes dictated by the QR model, and do not accept the fact that some of these QR curves might cross one another. In contrast, the plot is used as a grouping variable and enables specific and accurate predictions in NLME. The random effects in the NLME model can describe the differences among plots and account for variation in the H–D curve between plots, which improves the prediction accuracy of NLME model [69,74,75]. Nevertheless, QR still has its own advantages; QR can robustly estimate the extreme values of the response variable when the data are highly scattered and contain certain outliers in this case, and it is suitable to apply QR for analysis [76]. In addition, QR not only does not have high data requirements but also does not require multilevel structural data. The calibration of a QR model can be based on any category of classification, so even if QR prediction accuracy is slightly worse than that of NLME, when there is no obvious hierarchical structure (for example, there is no clustering by plot), QR models are more flexible and appropriate.
As shown in Table 5, calibrating the H predictions of all models using all trees per plot improved the prediction accuracy compared to the case that was not calibrated. However, it is very difficult and impractical to measure H for all trees in a forest inventory, and it is unreasonable to measure many samples for the calibration of H predictions [68]. Because of the uneven distribution of plot area in the data in this study, it may be a more reasonable approach to perform sampling calibration on the percentage of trees in each plot, which can more accurately predict the H of all trees in each plot. Sampling is a simulation of the actual survey situation, and the measured subsample can be used to adjust the calibration coefficient, random effect, or interpolation ratio of the two consecutive quantile regression curves, and effectively improve model estimation accuracy. Based on the results of OLS, NLME, and QR with calibration, the prediction accuracy of NLME was always higher than that of the OLS and QR methods regardless of the sample sizes (Table 6), which was consistent with Özçelik et al. [9]. In addition, the calibrated OLS prediction accuracy was the worst. For the calibration of the BM and GM, the calibration factor was the same for the predicted H of each tree in the plot, indicating that the default bias was the same; this was different from the NLME and QR methods and may be the reason for the poor performance of the the BM and GM’s calibration. In practice, it is crucial to determine the number of suitable measurement samples due to the high cost of sampling. Figure 4 shows that with the increase in the sampling percentage of each plot, the change curve of each index value showed a steep trend and then gradually became a straight line. In addition, the study found that for the GQR model and the GM, 4% and 6% of trees were selected as subsamples in each plot, respectively, and the prediction accuracy was better than that of the corresponding uncalibrated model, which was different from the GMM (Table 6). Furthermore, when a smaller percentage of trees was selected for calibration from each plot, the GMM was found to have certain advantages, and the prediction accuracy was significantly better than that of the GQR model and the GM; therefore, NLME was indeed the best recommended technique. Considering the sampling cost and prediction accuracy of the model, we believe that sampling 8% (or more) of trees per plot for calibration is sufficient to provide high-accuracy predictions. The calculation was based on the number of trees and the number of sample plots in this study, and was equivalent to taking an average of five trees from each sample plot for calibration calculation.
The statistical measures presented in this paper are based on the data used for the regression analysis, only assessing the quality of the fitting process. The model fit is currently assessed but not the error of the model application. In the future, it is possible to validate the developed models by using an independent data set to evaluate the quality of the predictions in real world applications. Hence, in a follow-up research, we will also focus on the combination of quantile regression and mixed effects models to explore a new method, namely, the QRNLMM. At present, the generalized QRNLMM has some problems in developing models, such as difficulty obtaining convergence, the calculation principle of random effects, and the prediction of QRNLMM, which are all required for making further explorations. In conclusion, we aim to explore the prediction of H using the QRNLMM, including calibration with different subsamples, in future studies to minimize the burden of field measurements.

5. Conclusions

In this study, six modeling approaches, the BM, BMM, BQR model, GM, GMM, and GQR model, were used to develop an H–D model based on the H–D data from 183 plots in Mongolian pine plantations in Northeast China, and were further compared. All models benefited from calibration when compared with uncalibrated models, and the calibrated models exhibited more stable performance and higher prediction accuracy. Among the calibrated models, the GMM obtained the highest prediction accuracy, followed by the GQR model, GM, BMM, BQR model, and BM; when there were no available H measurements for calibration, the GQR model provided the most accurate H predictions, suggesting that the calibration of random effects has advantages over both the combination of quantile regressions with different quantiles and calibration factors for tree-specific H prediction. The variation in prediction accuracy with increasing calibration sample size was assessed, and we found that the appropriate sample size for balancing both prediction accuracy and investigation cost was achieved when 8% of trees were randomly selected from a plot. Overall, the mixed-effects model showed the best performance for predicting H. However, the proposed models were developed for Mongolian pine plantations in Northeast China, and caution should be exercised when applying the newly developed models for other tree species or other areas.

Author Contributions

Conceptualization, L.D.; Data curation, F.L., L.X., Y.H., Z.M. and L.D.; Formal analysis, L.X., Y.H. and L.D.; Funding acquisition, L.D.; Investigation, F.L., L.X., Y.H., Z.M. and L.D.; Methodology, Z.M. and L.D.; Project administration, L.D.; Resources, L.D.; Software, F.L., L.X. and Z.M.; Supervision, L.X., Y.H., Z.M. and L.D.; Validation, L.X. and L.D.; Visualization, F.L., L.X. and Y.H.; Writing—original draft, F.L.; Writing—review & editing, F.L., L.X., Y.H., Z.M. and L.D. All authors have read and agreed to the published version of the manuscript.

Funding

The Joint Funds for Regional Innovation and Development of the National Natural Science Foundation of China (Grant No. U21A20244),the Fundamental Research Funds for the Central Universities (2572020DR03), and the Heilongjiang Touyan Innovation Team Program (Technology Development Team for High-efficient Silviculture of Forest Resources).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality.

Acknowledgments

The authors would like to thank the faculty and students of the Department of Forest Management, Northeast Forestry University (NEFU), China who provided and collected the data for this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dong, L.; Zhang, Y.; Zhang, Z.; Xie, L.; Li, F. Comparison of tree biomass modeling approaches for larch (Larix olgensis Henry) trees in Northeast China. Forests 2020, 11, 202. [Google Scholar] [CrossRef] [Green Version]
  2. Trim, K.R.; Coble, D.W.; Weng, Y.; Stovall, J.P.; Hung, I.-K. A new site index model for intensively managed loblolly pine (Pinus taeda) plantations in the West Gulf Coastal Plain. For. Sci. 2020, 66, 2–13. [Google Scholar] [CrossRef] [Green Version]
  3. Yang, S.-I.; Burkhart, H.E. Evaluation of total tree height subsampling strategies for estimating volume in loblolly pine plantations. For. Ecol. Manag. 2020, 461, 117878. [Google Scholar] [CrossRef]
  4. Soares, P.; Tomé, M. Height–diameter equation for first rotation eucalypt plantations in Portugal. For. Ecol. Manag. 2002, 166, 99–109. [Google Scholar] [CrossRef]
  5. Mensah, S.; Pienaar, O.L.; Kunneke, A.; du Toit, B.; Seydack, A.; Uhl, E.; Pretzsch, H.; Seifert, T. Height—Diameter allometry in South Africa’s indigenous high forests: Assessing generic models performance and function forms. For. Ecol. Manag. 2018, 410, 1–11. [Google Scholar] [CrossRef]
  6. Raptis, D.I.; Kazana, V.; Kazaklis, A.; Stamatiou, C. Mixed-effects height–diameter models for black pine (Pinus nigra Arn.) forest management. Trees 2021, 35, 1167–1183. [Google Scholar] [CrossRef]
  7. Castaño-Santamaría, J.; Crecente-Campo, F.; Fernández-Martínez, J.L.; Barrio-Anta, M.; Obeso, J.R. Tree height prediction approaches for uneven-aged beech forests in northwestern Spain. For. Ecol. Manag. 2013, 307, 63–73. [Google Scholar] [CrossRef]
  8. Curtis, R.O. Height-diameter and height-diameter-age equations for second-growth Douglas-fir. For. Sci. 1967, 13, 365–375. [Google Scholar]
  9. Özçelik, R.; Cao, Q.V.; Trincado, G.; Göçer, N. Predicting tree height from tree diameter and dominant height using mixed-effects and quantile regression models for two species in Turkey. For. Ecol. Manag. 2018, 419, 240–248. [Google Scholar] [CrossRef]
  10. Xie, L.; Widagdo, F.R.A.; Miao, Z.; Dong, L.; Li, F. Evaluation of the mixed-effects model and quantile regression approaches for predicting tree height in larch (Larix olgensis) plantations in northeastern China. Can. J. For. Res. 2022, 52, 309–319. [Google Scholar] [CrossRef]
  11. Crecente-Campo, F.; Tomé, M.; Soares, P.; Diéguez-Aranda, U. A generalized nonlinear mixed-effects height–diameter model for Eucalyptus globulus L. in northwestern Spain. For. Ecol. Manag. 2010, 259, 943–952. [Google Scholar] [CrossRef] [Green Version]
  12. Castedo-Dorado, F.; Dieguez-Aranda, U.; Barrio-Anta, M.; Sanchez-Rodriguez, M.; von Gadow, K. A generalized height-diameter model including random components for radiata pine plantations in northwestern Spain. For. Ecol. Manag. 2006, 229, 202–213. [Google Scholar] [CrossRef]
  13. Saud, P.; Lynch, T.B.; KC, A.; Guldin, J.M. Using quadratic mean diameter and relative spacing index to enhance height–diameter and crown ratio models fitted to longitudinal data. Forestry 2016, 89, 215–229. [Google Scholar] [CrossRef] [Green Version]
  14. Ciceu, A.; Garcia-Duro, J.; Seceleanu, I.; Badea, O. A generalized nonlinear mixed-effects height–diameter model for Norway spruce in mixed-uneven aged stands. For. Ecol. Manag. 2020, 477, 118507. [Google Scholar] [CrossRef]
  15. Sharma, M.; Parton, J. Height–diameter equations for boreal tree species in Ontario using a mixed-effects modeling approach. For. Ecol. Manag. 2007, 249, 187–198. [Google Scholar] [CrossRef]
  16. Huang, S.; Wiens, D.P.; Yang, Y.; Meng, S.X.; Vanderschaaf, C.L. Assessing the impacts of species composition, top height and density on individual tree height prediction of quaking aspen in boreal mixedwoods. For. Ecol. Manag. 2009, 258, 1235–1247. [Google Scholar] [CrossRef]
  17. Sharma, R.P.; Vacek, Z.; Vacek, S.; Kučera, M. Modelling individual tree height–diameter relationships for multi-layered and multi-species forests in central Europe. Trees 2019, 33, 103–119. [Google Scholar] [CrossRef]
  18. Zang, H.; Lei, X.D.; Zeng, W.S. Height-diameter equations for larch plantations in northern and northeastern China: A comparison of the mixed-effects, quantile regression and generalized additive models. Forestry 2016, 89, 434–445. [Google Scholar] [CrossRef]
  19. Huang, S.; Titus, S.J.; Wiens, D.P. Comparison of nonlinear height–diameter functions for major Alberta tree species. Can. J. For. Res. 1992, 22, 1297–1304. [Google Scholar] [CrossRef]
  20. Ng’andwe, P.; Chungu, D.; Yambayamba, A.M.; Chilambwe, A. Modeling the height-diameter relationship of planted Pinus kesiya in Zambia. For. Ecol. Manag. 2019, 447, 1–11. [Google Scholar] [CrossRef]
  21. Gollob, C.; Ritter, T.; Vospernik, S.; Wassermann, C.; Nothdurft, A. A Flexible Height–Diameter Model for Tree Height Imputation on Forest Inventory Sample Plots Using Repeated Measures from the Past. Forests 2018, 9, 368. [Google Scholar] [CrossRef] [Green Version]
  22. Kearsley, E.; Moonen, P.C.; Hufkens, K.; Doetterl, S.; Lisingo, J.; Bosela, F.B.; Boeckx, P.; Beeckman, H.; Verbeeck, H. Model performance of tree height-diameter relationships in the central Congo Basin. Ann. For. Sci. 2017, 74, 1–13. [Google Scholar] [CrossRef] [Green Version]
  23. Pinero, J.; Bates, D. Mixed-Effects Models in S and S-PLUS (Statistics and Computing); Springer: New York, NY, USA, 2000. [Google Scholar]
  24. Zhang, X.; Lei, Y.; Liu, X.J. Modeling stand mortality using Poisson mixture models with mixed-effects. iForest-Biogeosci. For. 2015, 8, 333. [Google Scholar] [CrossRef]
  25. Timilsina, N.; Staudhammer, C.L. Individual tree-based diameter growth model of slash pine in Florida using nonlinear mixed modeling. For. Sci. 2013, 59, 27–37. [Google Scholar] [CrossRef]
  26. Mehtätalo, L.; de-Miguel, S.; Gregoire, T.G. Modeling height-diameter curves for prediction. Can. J. For. Res. 2015, 45, 826–837. [Google Scholar] [CrossRef] [Green Version]
  27. Lappi, J. A longitudinal analysis of height/diameter curves. For. Sci. 1997, 43, 555–570. [Google Scholar]
  28. Temesgen, H.; Zhang, C.; Zhao, X. Modelling tree height–diameter relationships in multi-species and multi-layered forests: A large observational study from Northeast China. For. Ecol. Manag. 2014, 316, 78–89. [Google Scholar] [CrossRef]
  29. Calama, R.; Montero, G. Interregional nonlinear height-diameter model with random coefficients for stone pine in Spain. Can. J. For. Res. 2004, 34, 150–163. [Google Scholar] [CrossRef] [Green Version]
  30. Fang, Z.; Bailey, R. Height–diameter models for tropical forests on Hainan Island in southern China. For. Ecol. Manag. 1998, 110, 315–327. [Google Scholar] [CrossRef]
  31. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econom. J. Econom. Soc. 1978, 46, 33–50. [Google Scholar] [CrossRef]
  32. Zhang, L.; Bi, H.; Gove, J.H.; Heath, L.S. A comparison of alternative methods for estimating the self-thinning boundary line. Can. J. For. Res. 2005, 35, 1507–1514. [Google Scholar] [CrossRef]
  33. Cao, Q.V.; Wang, J.J.F.S. Evaluation of methods for calibrating a tree taper equation. For. Sci. 2015, 61, 213–219. [Google Scholar] [CrossRef] [Green Version]
  34. He, P.; Hussain, A.; Shahzad, M.K.; Jiang, L.; Li, F. Evaluation of four regression techniques for stem taper modeling of Dahurian larch (Larix gmelinii) in Northeastern China. For. Ecol. Manag. 2021, 494, 119336. [Google Scholar] [CrossRef]
  35. Bohora, S.B.; Cao, Q.V. Prediction of tree diameter growth using quantile regression and mixed-effects models. For. Ecol. Manag. 2014, 319, 62–66. [Google Scholar] [CrossRef]
  36. Raptis, D.I.; Kazana, V.; Kechagioglou, S.; Kazaklis, A.; Stamatiou, C.; Papadopoulou, D.; Tsitsoni, T. Nonlinear Quantile Mixed-Effects Models for Prediction of the Maximum Crown Width of Fagus sylvatica L., Pinus nigra Arn. and Pinus brutia Ten. Forests 2022, 13, 499. [Google Scholar] [CrossRef]
  37. Sun, Y.; Gao, H.; Li, F. Using linear mixed-effects models with quantile regression to simulate the crown profile of planted Pinus sylvestris var. Mongolica trees. Forests 2017, 8, 446. [Google Scholar] [CrossRef] [Green Version]
  38. Miao, Z.; Widagdo, F.R.A.; Dong, L.; Li, F. Prediction of branch growth using quantile regression and mixed-effects models: An example with planted Larix olgensis Henry trees in Northeast China. For. Ecol. Manag. 2021, 496, 119407. [Google Scholar] [CrossRef]
  39. Wang, T.; Xie, L.; Miao, Z.; Widagdo, F.R.A.; Dong, L.; Li, F. Stand Volume Growth Modeling with Mixed-Effects Models and Quantile Regressions for Major Forest Types in the Eastern Daxing’an Mountains, Northeast China. Forests 2021, 12, 1111. [Google Scholar] [CrossRef]
  40. Temesgen, H.; Monleon, V.; Hann, D. Analysis and comparison of nonlinear tree height prediction strategies for Douglas-fir forests. Can. J. For. Res. 2008, 38, 553–565. [Google Scholar] [CrossRef] [Green Version]
  41. Mehtätalo, L.; Lappi, J. Biometry for Forestry and Environmental Data: With Examples in R; Chapman and Hall/CRC: Boca Raton, FL, USA, 2020. [Google Scholar]
  42. Fu, L.; Zhang, H.; Sharma, R.P.; Pang, L.; Wang, G. A generalized nonlinear mixed-effects height to crown base model for Mongolian oak in northeast China. For. Ecol. Manag. 2017, 384, 34–43. [Google Scholar] [CrossRef]
  43. Miao, Z.; Zhang, L.; Widagdo, F.R.A.; Dong, L.; Li, F. Modeling the number of the first-and second-order branches within the live tree crown of Korean larch plantations in Northeast China. Can. J. For. Res. 2021, 51, 704–719. [Google Scholar] [CrossRef]
  44. Zhu, J.; Zheng, X. The prospects of development of the Three-North Afforestation Program (TNAP): On the basis of the results of the 40-year construction general assessment of the TNAP. Chin. J. Ecol. 2019, 38, 1600–1610. [Google Scholar]
  45. Zheng, X.; Zhu, J.; Yan, Q.; Song, L. Effects of land use changes on the groundwater table and the decline of Pinus sylvestris var. mongolica plantations in southern Horqin Sandy Land, Northeast China. Agric. Water Manag. 2012, 109, 94–106. [Google Scholar] [CrossRef]
  46. Stankova, T.V.; Diéguez-Aranda, U. Height-diameter relationships for Scots pine plantations in Bulgaria: Optimal combination of model type and application. Ann. For. Res. 2012, 56, 149–163. [Google Scholar]
  47. Scolforo, H.F.; McTague, J.P.; Burkhart, H.; Roise, J.; Campoe, O.; Stape, J.L. Eucalyptus growth and yield system: Linking individual-tree and stand-level growth models in clonal Eucalypt plantations in Brazil. For. Ecol. Manag. 2019, 432, 1–16. [Google Scholar] [CrossRef]
  48. Han, Y.; Lei, Z.; Ciceu, A.; Zhou, Y.; Zhou, F.; Yu, D. Determining an Accurate and Cost-Effective Individual Height-Diameter Model for Mongolian Pine on Sandy Land. Forests 2021, 12, 1122. [Google Scholar] [CrossRef]
  49. Richards, F.J. A flexible growth function for empirical use. J. Exp. Bot. 1959, 10, 290–301. [Google Scholar] [CrossRef]
  50. Karatepe, Y.; Diamantopoulou, M.J.; Özçelik, R.; Sürücü, Z. Total tree height predictions via parametric and artificial neural network modeling approaches. iForest-Biogeosci. For. 2022, 15, 95. [Google Scholar] [CrossRef]
  51. Patrício, M.S.; Dias, C.R.; Nunes, L. Mixed-effects generalized height-diameter model: A tool for forestry management of young sweet chestnut stands. For. Ecol. Manag. 2022, 514, 120209. [Google Scholar] [CrossRef]
  52. Nigul, K.; Padari, A.; Kiviste, A.; Noe, S.M.; Korjus, H.; Laarmann, D.; Frelich, L.E.; Jõgiste, K.; Stanturf, J.A.; Paluots, T.; et al. The Possibility of Using the Chapman–Richards and Näslund Functions to Model Height–Diameter Relationships in Hemiboreal Old-Growth Forest in Estonia. Forests 2021, 12, 184. [Google Scholar] [CrossRef]
  53. Belsley, D.A. Conditioning Diagnostics: Collinearity and Weak Data in Regression; Wiley: New York, NY, USA, 1991; p. 396. [Google Scholar]
  54. Pinheiro, J.; Bates, D. Mixed-Effects Models in S and S-PLUS; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  55. Sharma, R.P.; Bílek, L.; Vacek, Z.; Vacek, S. Modelling crown width–diameter relationship for Scots pine in the central Europe. Trees 2017, 31, 1875–1889. [Google Scholar] [CrossRef]
  56. Fu, L.; Duan, G.; Ye, Q.; Meng, X.; Luo, P.; Sharma, R.P.; Sun, H.; Wang, G.; Liu, Q. Prediction of individual tree diameter using a nonlinear mixed-effects modeling approach and airborne LiDAR Data. Remote Sens. 2020, 12, 1066. [Google Scholar] [CrossRef] [Green Version]
  57. Sainz, R.A.G.; Montero, G. Multilevel linear mixed model for tree diameter increment in stone pine (Pinus pinea) A calibrating approach. Silva. Fennica. 2005, 39, 37–54. [Google Scholar]
  58. Yang, Y.; Huang, S.; Meng, S.X.; Trincado, G.; VanderSchaaf, C.L. A multilevel individual tree basal area increment model for aspen in boreal mixedwood stands. Can. J. For. Res. 2009, 39, 2203–2214. [Google Scholar] [CrossRef]
  59. Robinson, G.K. That BLUP is a good thing: The estimation of random effects. Stat. Sci. 1991, 6, 15–32. [Google Scholar]
  60. Hao, Y.; Widagdo, F.R.A.; Liu, X.; Quan, Y.; Dong, L.; Li, F. Individual tree diameter estimation in small-scale forest inventory using UAV laser scanning. Remote Sens. 2020, 13, 24. [Google Scholar] [CrossRef]
  61. Liu, X.; Hao, Y.; Widagdo, F.R.A.; Xie, L.; Dong, L.; Li, F. Predicting Height to Crown Base of Larix olgensis in Northeast China Using UAV-LiDAR Data and Nonlinear Mixed Effects Models. Remote Sens. 2021, 13, 1834. [Google Scholar] [CrossRef]
  62. Wang, Y.; LeMay, V.M.; Baker, T.G. Modelling and prediction of dominant height and site index of Eucalyptus globulus plantations using a nonlinear mixed-effects model approach. Can. J. For. Res. 2007, 37, 1390–1403. [Google Scholar] [CrossRef]
  63. Kalliovirta, J.; Laasasenaho, J.; Kangas, A. Evaluation of the laser-relascope. For. Ecol. Manag. 2005, 204, 181–194. [Google Scholar] [CrossRef]
  64. Koenker, R.; Hallock, K.F. Quantile regression. J. Econ. Perspect. 2001, 15, 143–156. [Google Scholar] [CrossRef]
  65. Li, H.; Zhao, P. Improving the accuracy of tree-level aboveground biomass equations with height classification at a large regional scale. For. Ecol. Manag. 2013, 289, 153–163. [Google Scholar] [CrossRef]
  66. Dong, L.; Zhang, L.; Li, F. A compatible system of biomass equations for three conifer species in Northeast, China. For. Ecol. Manag. 2014, 329, 306–317. [Google Scholar] [CrossRef]
  67. Dong, L.; Zhang, L.; Li, F. Additive biomass equations based on different dendrometric variables for two dominant species (Larix gmelini Rupr. and Betula platyphylla Suk.) in natural forests in the Eastern Daxing’an Mountains, Northeast China. Forests 2018, 9, 261. [Google Scholar] [CrossRef] [Green Version]
  68. Xie, L.; Widagdo, F.R.A.; Dong, L.; Li, F. Modeling height–diameter relationships for mixed-species plantations of Fraxinus mandshurica Rupr. and Larix olgensis Henry in Northeastern China. Forests 2020, 11, 610. [Google Scholar] [CrossRef]
  69. Bronisz, K.; Mehtätalo, L. Mixed-effects generalized height–diameter model for young silver birch stands on post-agricultural lands. For. Ecol. Manag. 2020, 460, 117901. [Google Scholar] [CrossRef]
  70. Zheng, J.; Zang, H.; Yin, S.; Sun, N.; Zhu, P.; Han, Y.; Kang, H.; Liu, C.; Greening, U. Modeling height-diameter relationship for artificial monoculture Metasequoia glyptostroboides in sub-tropic coastal megacity Shanghai, China. Urban For. Urban Greenin. 2018, 34, 226–232. [Google Scholar] [CrossRef]
  71. Lavery, M.R.; Acharya, P.; Sivo, S.A.; Xu, L.J. Number of Predictors and Multicollinearity: What Are Their Effects on Error and Bias in Regression? Commun. Stat.-Simul. Comput. 2017, 48, 27–38. [Google Scholar] [CrossRef]
  72. Shahzad, M.K.; Hussain, A.; Burkhart, H.E.; Li, F.; Jiang, L. Stem taper functions for Betula platyphylla in the Daxing’an Mountains, northeast China. J. For. Res. 2021, 32, 529–541. [Google Scholar] [CrossRef]
  73. Zhang, X.; Chhin, S.; Fu, L.; Lu, L.; Duan, A.; Zhang, J. Climate-sensitive tree height–diameter allometry for Chinese fir in southern China. Forestry 2018, 92, 167–176. [Google Scholar] [CrossRef]
  74. Xu, Q.; Lei, X.; Zang, H.; Zeng, W. Climate Change Effects on Height–Diameter Allometric Relationship Vary with Tree Species and Size for Larch Plantations in Northern and Northeastern China. Forests 2022, 13, 468. [Google Scholar] [CrossRef]
  75. Zhang, X.; Fu, L.; Sharma, R.P.; He, X.; Zhang, H.; Feng, L.; Zhou, Z. A Nonlinear Mixed-Effects Height-Diameter Model with Interaction Effects of Stand Density and Site Index for Larix olgensis in Northeast China. Forests 2021, 12, 1460. [Google Scholar] [CrossRef]
  76. Cade, B.S.; Noon, B.R. A gentle introduction to quantile regression for ecologists. Front. Ecol. Environ. 2003, 1, 412–420. [Google Scholar] [CrossRef]
Figure 1. Scatter plot of diameter at breast height (D) and tree height (H) of Mongolian pine.
Figure 1. Scatter plot of diameter at breast height (D) and tree height (H) of Mongolian pine.
Forests 13 01168 g001
Figure 2. Standardized residual plot of the BM, BMM, GM, and GMM.
Figure 2. Standardized residual plot of the BM, BMM, GM, and GMM.
Forests 13 01168 g002
Figure 3. The calibrated predicted H vs. measured H calculated by different models.
Figure 3. The calibrated predicted H vs. measured H calculated by different models.
Forests 13 01168 g003
Figure 4. Mean absolute error percentage (MAPE; (A)) and mean absolute error (MAE; (B)) for the calibrated BM, BMM, BQR model, GM, GMM and GQR model with different percentages of trees premeasured per plot.
Figure 4. Mean absolute error percentage (MAPE; (A)) and mean absolute error (MAE; (B)) for the calibrated BM, BMM, BQR model, GM, GMM and GQR model with different percentages of trees premeasured per plot.
Forests 13 01168 g004
Figure 5. MAPE (A) and MAE (B) of the fit results for the BM, BQR, BMM, GM, GMM and GQR methods across several D classes. Solid symbols in each box represent the mean value.
Figure 5. MAPE (A) and MAE (B) of the fit results for the BM, BQR, BMM, GM, GMM and GQR methods across several D classes. Solid symbols in each box represent the mean value.
Forests 13 01168 g005
Table 1. Descriptive statistics of variables for the sample plots and sample trees.
Table 1. Descriptive statistics of variables for the sample plots and sample trees.
ItemVariablesNMaximumMinimumMeanStd.
Tree-levelD (cm)10,84141.703.4019.307.28
H (m)30.503.0017.016.51
BAL (m2·ha−1)53.020.0020.1210.69
Rd2.210.300.980.20
Stand-levelAGE (year)183591245.5413.08
BAS (m2·ha−1)53.559.6233.766.89
Ddom (cm)38.659.3328.205.78
Dq (cm)33.577.2121.985.96
SD (trees·ha−1)44503361163.16841.47
Hdom (m)29.625.1521.285.89
N is the sample size; Std. is the standard deviation; BAL is the basal area of trees with larger diameter than the subject tree; Rd is the relative size in the stand; AGE is the stand year; BAS is the stand basal area; Ddom is the average diameter of dominant trees; Dq is the quadratic mean diameter; SD is the stand density; and Hdom is the average height of dominant trees.
Table 2. Parameter estimates (standard error) for the basic model (BM), basic mixed effects model (BMM) and basic quantile regression (BQR) model at nine quantiles (τ).
Table 2. Parameter estimates (standard error) for the basic model (BM), basic mixed effects model (BMM) and basic quantile regression (BQR) model at nine quantiles (τ).
Method β 1 β 2 β 3 φ σ 2 σ μ 1 2 σ μ 2 2 σ μ 1 μ 2
BM31.31590.05921.67640.72501.4811
(0.6287)(0.0025)(0.0448)
BMM21.07970.07070.6376 1.226834.26630.0002−0.0168
(0.4754)(0.0038)(0.0265)
BQR(τ)
τ = 0.126.09320.05121.7087
(0.9622)(0.0036)(0.0688)
τ = 0.225.86630.06091.8218
(0.7057)(0.0032)(0.0624)
τ = 0.327.96000.05691.6794
(0.9067)(0.0036)(0.0612)
τ = 0.431.11910.05271.6053
(1.5717)(0.0051)(0.0829)
τ = 0.536.07000.04621.5256
(1.6903)(0.0039)(0.0601)
τ = 0.636.01390.05191.6015
(1.2384)(0.0038)(0.0657)
τ = 0.734.90030.05991.6791
(0.8392)(0.0032)(0.0567)
τ = 0.833.13760.07331.8325
(0.5763)(0.0034)(0.0633)
τ = 0.931.67630.08991.9441
(0.4080)(0.0035)(0.0650)
Table 3. Parameter estimates (standard error) for the generalized H–D model (GM), generalized nonlinear mixed effects model (GMM) and generalized nonlinear quantile regression (GQR) model at nine quantiles (τ).
Table 3. Parameter estimates (standard error) for the generalized H–D model (GM), generalized nonlinear mixed effects model (GMM) and generalized nonlinear quantile regression (GQR) model at nine quantiles (τ).
Method β 1 β 2 β 3 β 4 β 5 σ 2 σ μ 1 2 σ μ 3 2 σ μ 1 μ 3
GM−3.31450.98441.76640.08490.47861.6168
(0.1912)(0.0050)(0.1133)(0.0043)(0.0270)
GMM−4.00671.00121.98440.08420.45851.26241.50590.6118−0.8614
(0.4572)(0.0096)(0.2325)(0.0069)(0.0423)
GQR (τ)
τ = 0.1−3.93380.95951.55600.09310.7510
(0.4083)(0.0112)(0.2457)(0.0082)(0.0787)
τ = 0.2−4.08270.96661.88310.09570.6049
(0.3306)(0.0073)(0.1762)(0.0064)(0.0524)
τ = 0.3−4.16620.97542.06430.09790.5567
(0.2491)(0.0049)(0.1495)(0.0047)(0.0418)
τ = 0.4−3.62670.97121.95890.09670.5314
(0.2105)(0.0039)(0.1275)(0.0043)(0.0310)
τ = 0.5−3.24540.97541.87730.09070.4902
(0.2140)(0.0057)(0.1288)(0.0056)(0.0338)
τ = 0.6−3.09980.98761.85150.08150.4175
(0.1907)(0.0053)(0.1130)(0.0051)(0.0271)
τ = 0.7−2.89160.99651.77500.07480.3622
(0.2052)(0.0056)(0.1124)(0.0051)(0.0254)
τ = 0.8−2.73191.00941.73490.06720.3100
(0.3100)(0.0082)(0.1037)(0.0064)(0.0244)
τ = 0.9−2.30531.01431.56730.06910.3060
(0.2540)(0.0069)(0.1593)(0.0063)(0.0266)
Table 4. Fit statistics for the BM, GM, BMM, GMM, BQR ( τ   = 0.5), GQR (τ = 0.5).
Table 4. Fit statistics for the BM, GM, BMM, GMM, BQR ( τ   = 0.5), GQR (τ = 0.5).
Method R a 2 RMSE (m)AIC
BM0.71763.464456,761.79
GM0.96191.271835,985.07
BMM0.97111.107834,737.53
GMM0.97101.110433,948.04
BQR ( τ = 0.5 )0.71423.485359,101.60
GQR ( τ = 0.5 )0.96151.279735,981.73
Table 5. Model evaluation statistics under uncalibrated or calibrated conditions with the BM, BMM, BQR model, GM, GMM, and GQR model, based on the jackknife technique. Bold numbers indicate the best model.
Table 5. Model evaluation statistics under uncalibrated or calibrated conditions with the BM, BMM, BQR model, GM, GMM, and GQR model, based on the jackknife technique. Bold numbers indicate the best model.
MethodTypeMAPE (%)MAE (m)
BMNo calibration18.24952.8405
Calibration10.56921.5804
BMMNo calibration33.89114.3097
Calibration6.03950.8680
BQRNo calibration17.87082.8245
Calibration9.86341.4418
GMNo calibration6.92080.9793
Calibration5.71870.8457
GMMNo calibration6.96520.9866
Calibration5.61980.8341
GQRNo calibration6.91320.9706
Calibration5.70630.8428
Table 6. Statistics of the BM, BMM, BQR model, GM, GMM, and GQR model in different sampling percentages of trees. The numbers in bold indicate the best model.
Table 6. Statistics of the BM, BMM, BQR model, GM, GMM, and GQR model in different sampling percentages of trees. The numbers in bold indicate the best model.
ItemSizeBMBMMBQRGMGMMGQR
MAPE
(%)
0%18.249533.891116.18946.92086.96526.9132
2%13.29957.537111.94517.12396.46866.9088
4%11.94356.914411.05126.54196.29216.4722
6%11.51186.657710.69606.30726.16576.2694
8%11.30296.508610.51926.17466.08836.1471
10%11.16576.406410.39826.08356.02166.0643
12%11.05486.330410.29996.01865.97486.0049
14%10.97946.281210.23275.97765.93615.9675
16%10.92966.238210.18745.94135.90555.9318
18%10.88876.206710.14975.91615.88015.9076
20%10.85406.180110.11725.89475.85905.8861
MAE
(m)
0%2.84054.30972.51680.97930.98660.9706
2%2.04941.09761.78731.08480.93611.0352
4%1.81641.00021.64100.98560.91750.9649
6%1.74330.96121.58320.94680.90420.9328
8%1.70660.93801.55350.92430.89510.9135
10%1.68360.92251.53340.90910.88770.9006
12%1.66450.91111.51700.89830.88210.8915
14%1.65140.90341.50520.89110.87720.8852
16%1.64280.89681.49760.88470.87330.8794
18%1.63620.89211.49120.88040.87010.8755
20%1.63040.88821.48580.87660.86740.8720
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lin, F.; Xie, L.; Hao, Y.; Miao, Z.; Dong, L. Comparison of Modeling Approaches for the Height–diameter Relationship: An Example with Planted Mongolian Pine (Pinus sylvestris var. mongolica) Trees in Northeast China. Forests 2022, 13, 1168. https://doi.org/10.3390/f13081168

AMA Style

Lin F, Xie L, Hao Y, Miao Z, Dong L. Comparison of Modeling Approaches for the Height–diameter Relationship: An Example with Planted Mongolian Pine (Pinus sylvestris var. mongolica) Trees in Northeast China. Forests. 2022; 13(8):1168. https://doi.org/10.3390/f13081168

Chicago/Turabian Style

Lin, Fucheng, Longfei Xie, Yuanshuo Hao, Zheng Miao, and Lihu Dong. 2022. "Comparison of Modeling Approaches for the Height–diameter Relationship: An Example with Planted Mongolian Pine (Pinus sylvestris var. mongolica) Trees in Northeast China" Forests 13, no. 8: 1168. https://doi.org/10.3390/f13081168

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop