Regression Modeling
Description
Statistical regression models are used to establish a relationship between a particular climate variable and the tree-ring chronologies (Fritts 1976). Typically, the period of overlap between the climate and the tree-ring data is split in two, and the regression model is "trained" or calibrated on one portion of the overlap period. The other portion is used as a verification period, which evaluates the skill of the regression model on data purposely withheld from calibration (Fritts 1976; Cook and Pederson 2011). Dendro Tools performs simple linear, multiple linear, and principal components regression modeling. Leading and lagging of the predictor data is supported. If multiple linear or principal components regression is performed, the variables are added to the model in a forward stepwise fashion. See regression modeling texts for further details (e.g., Draper and Smith 1981; Kutner et al. 2005; Wilks 2006). This tool is based on the program PcReg developed by Ed Cook and Paul Krusic.
Running the Tool
This complex tool performs key procedures in a series of steps. These steps and the files that are output are detailed in the following outline:
Input Predictor Data
File
Select the input data to analyze by clicking on "Browse" and navigating to the proper file.
Input Data Format
Year / Data
Each "/" is a tab (i.e., data should be tab-delimited). Missing data should be denoted as "-99".
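For readers preparing input files outside Dendro Tools, the layout described above can be checked with a short script. The following is a minimal sketch, not the tool's own reader: the function name `read_series`, its parameters, and the sample data are hypothetical, and it assumes the first column is the year followed by one column per variable.

```python
import io

MISSING = -99.0  # missing-data marker described above


def read_series(handle, start_row=1, n_vars=1):
    """Parse tab-delimited 'Year<TAB>Data...' rows.

    start_row is the 1-based row where the data begin (hypothetical
    parameter mirroring the "Start Row" input). Missing values are
    returned as None.
    """
    years = []
    columns = [[] for _ in range(n_vars)]
    for row_number, line in enumerate(handle, start=1):
        if row_number < start_row or not line.strip():
            continue  # skip header rows and blank lines
        fields = line.rstrip("\n").split("\t")
        years.append(int(fields[0]))
        for j in range(n_vars):
            value = float(fields[1 + j])
            columns[j].append(None if value == MISSING else value)
    return years, columns


# Made-up sample: one header row, then data beginning on row 2.
sample = "Year\tChron1\n1900\t0.95\n1901\t-99\n1902\t1.10\n"
years, columns = read_series(io.StringIO(sample), start_row=2, n_vars=1)
print(years, columns)
```

Representing missing values as `None` keeps them from being mistaken for real measurements in later calculations.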
Start Row
This input is the row number where the main data to be processed begin in the input file selected above. An example of how row numbers are entered is shown in Figure 2 located in the Running Dendro Tools topic.
Number of Variables
Enter the number of variables located in the input file.
Prewhiten
Check this box to perform autoregressive modeling on the input predictor data. Dendro Tools will also ask for the years to start and end the autoregressive modeling. These are often the same years as the calibration period.
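As an illustration of what prewhitening does, the sketch below fits a first-order autoregressive model over a chosen span of years and returns the residual ("whitened") series. This is a simplification under stated assumptions: the order is fixed at 1 here, whereas the tool selects the autoregressive model by criterion, and the function names and sample values are hypothetical.

```python
def fit_ar1(values):
    """Estimate the mean and lag-1 coefficient by least squares."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    numerator = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    denominator = sum(a * a for a in centered[:-1])
    return mean, numerator / denominator


def prewhiten(years, values, fit_start, fit_end):
    """Fit AR(1) on [fit_start, fit_end], then whiten the whole series."""
    fit_values = [v for y, v in zip(years, values) if fit_start <= y <= fit_end]
    mean, phi = fit_ar1(fit_values)
    centered = [v - mean for v in values]
    # The first year has no predecessor, so it is dropped from the output.
    residuals = [centered[t] - phi * centered[t - 1]
                 for t in range(1, len(centered))]
    return years[1:], residuals, phi


# Made-up chronology values for illustration only.
years = list(range(1900, 1910))
chron = [0.9, 1.0, 1.2, 1.1, 0.8, 0.7, 0.9, 1.1, 1.3, 1.2]
white_years, white_values, phi = prewhiten(years, chron, 1900, 1909)
print(round(phi, 3), len(white_values))
```

Fitting the coefficient on the calibration years and applying it to the full series mirrors the note above that the two periods are often the same.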
Lead / Lag
Check this box to create predictor data with leads and/or lags. Dendro Tools will then activate the lead and lag checkboxes. Check boxes with a minus sign to lead the predictor data (i.e., -1 will move each predictor series back one year so year t is directly compared with year t + 1). Check boxes with a plus sign to lag the predictor data (i.e., +1 will move each predictor series forward one year so year t is directly compared with year t - 1). A checkmark included in the box "0" will tell Dendro Tools to also consider the original predictor data in year t with no lead or lag.
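The lead and lag conventions above can be made concrete with a small sketch. It assumes a predictor stored as a year-to-value mapping; the function names and sample values are hypothetical. A shift of -1 pairs year t with the predictor value from year t + 1, and a shift of +1 pairs year t with the value from year t - 1, as described above.

```python
def shift_predictor(series, shift):
    """Return a copy in which year t holds the value from year t - shift."""
    return {year + shift: value for year, value in series.items()}


def build_predictors(series, shifts):
    """Build one column per shift, restricted to years present in all columns."""
    shifted = {s: shift_predictor(series, s) for s in shifts}
    common = sorted(set.intersection(*(set(d) for d in shifted.values())))
    return common, {s: [shifted[s][y] for y in common] for s in shifts}


# Made-up chronology values for illustration only.
chron = {1900: 0.9, 1901: 1.1, 1902: 1.0, 1903: 0.8, 1904: 1.2}
years, columns = build_predictors(chron, shifts=[-1, 0, 1])
for s in (-1, 0, 1):
    print(s, dict(zip(years, columns[s])))
```

Each lead or lag shortens the usable overlap by one year at one end of the series, which is why only 1901 to 1903 survive in this example.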
Input Predictand Data
File
Select the input data to analyze by clicking on "Browse" and navigating to the proper file.
Input Data Format
Year / Data
Each "/" is a tab (i.e., data should be tab-delimited). Missing Data should be denoted as "-99".
Start Row
This input is the row number where the main data to be processed begin in the input file selected above. An example of how row numbers are entered is shown in Figure 2 located in the Running Dendro Tools topic.
Prewhiten
Check this box to perform autoregressive modeling on the input predictand data. Dendro Tools will also ask for the years to start and end the autoregressive modeling. These are often the same years as the calibration period.
Output Data
File
Select or enter the name of the file where the results will be stored.
Header
Information entered here will be output in the first line of the output file.
Parameters
Perform Principal Components Regression
Check this box to perform principal components regression. Dendro Tools will then activate the eigenvalue cutoff criterion. The following five options are available: eigenvalue < 1, a specific number of eigenvalues, a proportional variance threshold, a cumulative variance threshold, and all eigenvalues. Principal components whose eigenvalues do not meet the selected criterion (e.g., those with eigenvalues below 1 under the default option) will not be considered in the regression modeling phase. Eigenvalue < 1 is selected by default.
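The five cutoff options can be sketched as simple rules applied to a list of eigenvalues that has already been computed. This is an illustration under assumptions, not the tool's implementation: the rule names are hypothetical, the proportional threshold is read as "each eigenvalue's share of total variance", and the cumulative threshold is read as "keep components until the running share reaches the threshold".

```python
def retain_components(eigenvalues, rule="eigenvalue_lt_1", value=None):
    """Return the eigenvalues kept under a given cutoff rule (largest first)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if rule == "eigenvalue_lt_1":
        # Drop components whose eigenvalue is below 1.
        return [ev for ev in ordered if ev >= 1.0]
    if rule == "count":
        # Keep a specific number of leading components.
        return ordered[:int(value)]
    if rule == "proportion":
        # Keep components explaining at least `value` of the variance each.
        return [ev for ev in ordered if ev / total >= value]
    if rule == "cumulative":
        # Keep components until the running share reaches `value`.
        kept, running = [], 0.0
        for ev in ordered:
            if running >= value:
                break
            kept.append(ev)
            running += ev / total
        return kept
    if rule == "all":
        return ordered
    raise ValueError(f"unknown rule: {rule}")


# Made-up eigenvalues for illustration only.
eigenvalues = [2.5, 1.2, 0.2, 0.1]
print(retain_components(eigenvalues))
print(retain_components(eigenvalues, "cumulative", 0.9))
```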
Screen Predictors Prior to Regression Modeling
Check this box to correlate all predictors against the predictand. A probability criterion is also required. This routine will examine the Pearson correlation between each predictor and the predictand for statistical significance. If p is less than the value entered for the probability criterion, then the specific predictor passes the screening test. Predictors that fail to pass screening will not be considered in the regression modeling phase. This parameter is checked by default with a probability criterion of 0.10 automatically entered.
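The screening step can be sketched as follows. The source does not state which significance test is applied to the Pearson correlation, so this example assumes a two-sided test and uses the Fisher z approximation purely to stay within the Python standard library; the function names and sample data are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z approximation (needs n > 3, |r| < 1)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def screen(predictors, predictand, criterion=0.10):
    """Return the names of predictors whose p-value is below the criterion."""
    passed = []
    for name, series in predictors.items():
        r = pearson_r(series, predictand)
        if approx_p_value(r, len(series)) < criterion:
            passed.append(name)
    return passed


# Made-up data for illustration only.
predictand = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
predictors = {
    "good": [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.1, 8.8, 10.2],
    "noise": [5, 1, 4, 2, 6, 3, 5, 2, 4, 3],
}
print(screen(predictors, predictand))
```

With the default criterion of 0.10, only the predictor that tracks the predictand closely passes; the weakly correlated series is dropped before regression.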
Criterion for Model Selection
Autoregressive and forward stepwise regression models are selected based on a specific criterion that maximizes the explained variance while minimizing the number of variables fit to the model. Two options are available: the minimum AICc and the minimum BIC. See Cook et al. (1999) for a discussion of the minimum AICc, which is selected by default. Note that the AICc can have multiple minima; Dendro Tools selects the first minimum.
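The difference between the first minimum and the overall minimum can be seen in a short sketch. The formulas below are the common least-squares forms of AICc and BIC, written in terms of the residual sum of squares; whether Dendro Tools uses exactly these expressions is an assumption, and the function names and residual sums of squares are made up for illustration.

```python
import math


def aicc(rss, n, k):
    """Corrected AIC for a least-squares fit with k parameters and n observations."""
    return n * math.log(rss / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


def bic(rss, n, k):
    """BIC for a least-squares fit with k parameters and n observations."""
    return n * math.log(rss / n) + k * math.log(n)


def first_minimum(scores):
    """Index of the first local minimum: stop once the next score fails to improve."""
    for i in range(len(scores) - 1):
        if scores[i + 1] >= scores[i]:
            return i
    return len(scores) - 1


# Made-up residual sums of squares after stepwise steps 1..5 (n = 50).
n = 50
rss_by_step = [40.0, 30.0, 29.5, 29.4, 20.0]
scores = [aicc(rss, n, k) for k, rss in enumerate(rss_by_step, start=1)]
chosen = first_minimum(scores)
overall = scores.index(min(scores))
print(chosen + 1, overall + 1)
```

In this example the criterion rises after the second step before falling again at the fifth, so a first-minimum rule stops at two variables even though a later model scores lower.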
Calibration Period
Enter the years to start and end the regression model calibration.
Verification Period
Enter the years to start and end the regression model verification. These input boxes can be left blank if verification will not be performed (e.g., use the entire overlapping period to calibrate the regression model).
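The split-period procedure can be sketched for the simple linear case. The example below calibrates a line on one span of years and checks it against withheld years using the squared correlation between observed and estimated values, which is only one of several possible skill measures. The function names and data are hypothetical, and this is not a reproduction of the statistics Dendro Tools reports.

```python
def select_period(years, x, y, start, end):
    """Return predictor and predictand values for years in [start, end]."""
    rows = [(a, b) for yr, a, b in zip(years, x, y) if start <= yr <= end]
    return [a for a, _ in rows], [b for _, b in rows]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def squared_correlation(observed, estimated):
    """Squared Pearson correlation between observed and estimated values."""
    n = len(observed)
    mo, me = sum(observed) / n, sum(estimated) / n
    cov = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_e = sum((e - me) ** 2 for e in estimated)
    return cov ** 2 / (var_o * var_e)


# Made-up chronology (predictor) and climate (predictand) values.
years = list(range(1900, 1910))
chron = [0.8, 1.0, 1.2, 0.9, 1.1, 0.7, 1.3, 1.0, 0.9, 1.2]
climate = [41.0, 52.0, 58.0, 47.0, 55.0, 38.0, 66.0, 49.0, 44.0, 61.0]

cal_x, cal_y = select_period(years, chron, climate, 1900, 1904)
ver_x, ver_y = select_period(years, chron, climate, 1905, 1909)
slope, intercept = fit_line(cal_x, cal_y)
estimates = [slope * v + intercept for v in ver_x]
print(round(squared_correlation(ver_y, estimates), 3))
```

Leaving the verification years out of `fit_line` is what makes the check independent; when no verification period is entered, the whole overlap would be passed to the fit instead.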
Output Statistics
Statistics Information
Once the regression modeling tool has been run, the output file with the word "Statistics" appended to the end of the filename is read into Dendro Tools and displayed for the user to scroll through (Fig. 1).
Figure 1. The regression modeling tool has been successfully run. The results are output in a series of output files as detailed in the outline of steps located under "Running the Tool" above. The main statistics needed for assessment of the regression model quality are output to their own output file and are displayed under "Output Statistics" as illustrated by the red box. Two buttons underneath this textbox allow for further visual inspection (see below).
Plot...
Clicking this button will display the observed and reconstructed data during the calibration and verification periods and the final reconstruction for visual inspection (Fig. 2). The observed and reconstructed data during the calibration and verification periods are displayed by default. The user can display the final reconstruction by selecting "Reconstruction" from the drop-down list next to "Displayed Data". In Figure 2 below, the calibration was performed on the older data (denoted by the orange arrow; left of the gray vertical line) and verification analysis was performed on the more recent data (denoted by the green arrow; right of the gray vertical line). The calibration and verification periods will always be separated by the gray vertical line, and the text in the header will change depending on which side of the gray line (left or right) the calibration and verification periods reside. If the calibration period is left of the gray line, the text will read "Calibration vs. Verification" (i.e., like Fig. 2), but if the verification period is left of the gray line, the text will read "Verification vs. Calibration" (i.e., opposite of Fig. 2). See the Running Dendro Tools topic for more general details about plots in Dendro Tools.
Figure 2. Plots of the observed and reconstructed data during the calibration and verification periods and the final reconstruction can be accessed by clicking on the "Plot..." button shown above in Figure 1.
Residuals...
Clicking this button will display various residual plots (Fig. 3). Residual plots can be used to check for potential biases in the regression model (e.g., residuals that are asymmetric relative to the zero line, suggesting non-randomness). Each variable used in the reconstruction, the estimated value (y-hat), and the year can be plotted against the observed data or the residuals. The plot of the first variable vs. the observed data is displayed by default. Drop-down lists next to "Displayed Data" can be used to select other variables. Once the proper selection has been made, press the "Replot" button to show the specified plot. Hovering the mouse pointer over a red dot will display the year to which that dot corresponds. See the Running Dendro Tools topic for more general details about plots in Dendro Tools.
Figure 3. Example plot of the residuals against a predictor variable.