Paper 329-2011
Extending SAS High-Performance Forecasting Using User-Defined Models and External Forecasts
Udo Sglavo, SAS Institute Inc., Cary, NC
ABSTRACT
For automatic forecasting of large numbers of time series, SAS High-Performance Forecasting (part of SAS Forecast Server) provides access to the most robust models available today. However, as an experienced forecaster, you might want to extend the model families provided. SAS High-Performance Forecasting allows you to extend the models in three ways:
- Custom repositories: You can create your own model repository.
- External models: Your forecasts are provided by methods that are external to the system.
- User-defined models: You add forecasting methods that are not provided by SAS High-Performance Forecasting.
In this paper, the use of custom repositories, external models, and user-defined models in SAS High-Performance Forecasting will be discussed, and four easy-to-follow examples will be provided.
HPFDIAGNOSE PROCEDURE
SAS High-Performance Forecasting provides a comprehensive diagnostic engine (called the HPFDIAGNOSE procedure) to automatically identify a univariate time series model. This procedure automatically diagnoses the statistical characteristics of each time series and identifies appropriate models. The models that the HPFDIAGNOSE procedure considers for each time series include:
- autoregressive integrated moving average with exogenous inputs (ARIMAX) models
- exponential smoothing models
- unobserved components models
- intermittent demand models
Log transformation and stationarity tests are automatically performed. The ARIMAX model diagnostics find the autoregressive (AR) and moving average (MA) orders, detect outliers, and select the best input variables. The unobserved components model (UCM) diagnostics find the best components and select the best input variables. As a result of running the HPFDIAGNOSE procedure, model specifications for each model family and a model selection list are created and stored in an automatic model repository.
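As a minimal sketch of this diagnose-then-forecast flow (not taken from the paper), the steps below run the HPFDIAGNOSE procedure on SASHELP.AIR and hand its results to the HPFENGINE procedure. The repository name work.auto_repository and the data set names diag_est and outfor_auto are assumptions chosen for illustration only.

```sas
/* Diagnose the series and store candidate model specifications
   plus a model selection list in a repository (assumed name). */
proc hpfdiagnose data=sashelp.air
                 outest=diag_est
                 modelrepository=work.auto_repository;
   id date interval=month;
   forecast air;
   arimax;   /* consider ARIMAX models                 */
   esm;      /* consider exponential smoothing models  */
   ucm;      /* consider unobserved components models  */
run;

/* Fit the diagnosed candidates and forecast with the selected model. */
proc hpfengine data=sashelp.air
               inest=diag_est
               modelrepository=work.auto_repository
               out=_null_
               outfor=outfor_auto
               lead=12;
   id date interval=month;
   forecast air;
run;
```

The OUTEST= data set is what links the two steps: it tells the HPFENGINE procedure which selection list in the repository belongs to which series.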
EXAMPLES
To illustrate how you can extend SAS High-Performance Forecasting, four easy-to-follow examples will be provided. The focus of these examples is to demonstrate the required steps and not so much the complexity of the models that are created. In fact, you will find that we are extending SAS High-Performance Forecasting with a moving average model of three observations. Furthermore, you will see how we implement this method first using a custom model repository, then using an external forecast, and finally using a user-defined model. Eventually, you will see how these three approaches can be combined in one repository.
As an example data set, we will use the airline passenger series, given as Series G in Box and Jenkins (1976). This series is often used in time series literature as an example of a non-stationary seasonal time series. This series is a monthly series consisting of the number of airline passengers in the United States who traveled during the years 1949 to 1960. Its main features are a steady rise in the number of passengers from year to year and the seasonal variation in the numbers during any given year. The airline passenger series is available in the SASHELP library as the AIR data set. A moving average model of three observations is by no means an appropriate model for the AIR data set. In fact, the HPFDIAGNOSE procedure will automatically come up with the so-called AIRLINE model or a Winters exponential smoothing model on log-transformed data. However, given the widespread use of the airline data and the simplicity of the moving average model, they make ideal candidates for illustrating the required steps for extending SAS High-Performance Forecasting.
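To make the three-observation moving average concrete, here is a small DATA step sketch that computes it by hand, outside of SAS High-Performance Forecasting. It assumes that the forecast for a month is the plain mean of the three preceding months; the data set name work.ma3_check and the variables prev1 to prev3 are hypothetical names used only for this illustration.

```sas
/* Hand-computed three-observation moving average for SASHELP.AIR.
   PREDICT for a month is the mean of the three preceding months. */
data work.ma3_check;
   set sashelp.air;
   /* Call the LAG functions on every row so that their queues stay in step. */
   prev1 = lag1(air);
   prev2 = lag2(air);
   prev3 = lag3(air);
   /* The first three rows do not have three predecessors yet. */
   if _n_ > 3 then predict = (prev1 + prev2 + prev3) / 3;
run;

/* Show the last months of the series next to their hand-made forecasts. */
proc print data=work.ma3_check noobs;
   where date >= '01SEP1960'd;
   var date air predict;
run;
```

For December 1960, the three preceding values are 508, 461, and 390, so the hand-computed forecast is (508 + 461 + 390) / 3 = 453.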
As the next step, we will need to build our model selection list using the HPFSELECT procedure:

   proc hpfselect modelrepository=work.custom_repository
                  selectname=custom_select
                  selectlabel="Custom MA 3 Model";
      diagnose seasontest=none;
      specification custom_MA3;
   run;

A couple of things to note in this code:
- We need to make sure that the model selection list is in the same custom repository as the model specification (custom_repository).
- The name of our selection list is custom_select.
- The SPECIFICATION statement points to our model specification (custom_MA3).
- By default, SAS High-Performance Forecasting runs a seasonality test, which we disable with the DIAGNOSE statement. This statement is required to make sure that SAS High-Performance Forecasting does not drop our inadequate MA3 model and use a seasonal model instead.
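Because a model repository is stored as a SAS catalog, one way to check that the model specification and the selection list ended up in the same place is to list the catalog entries. This is only a sanity-check sketch and is not part of the original example; it assumes that both steps have already been run against work.custom_repository.

```sas
/* List the entries of the custom repository. If both steps ran,
   the listing should include custom_MA3 and custom_select. */
proc catalog catalog=work.custom_repository;
   contents;
run;
quit;
```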
After defining the model specification and the model selection list, our custom repository is ready to use. Note that we have only one model available. Hence, there is no need to pick and choose a model from a wide range of candidates. In general, a custom repository can feature many different models. In order to apply our model, we will need to run the HPFENGINE procedure. Note that we need to point the procedure to our repository (custom_repository) and the model selection list (custom_select). We will not only ask the procedure to create a data set (which contains our forecasts, also referred to as predictions, with confidence limits), but we will also create a forecast plot.
To get an idea of the forecasting accuracy, we are also using an out-of-sample period of 1 (specified by BACK=1), and we are forecasting one period ahead (specified by LEAD=1).

   ods graphics on;
   proc hpfengine data=sashelp.air
                  modelrepository=custom_repository
                  globalselection=custom_select
                  out=_null_
                  outfor=outfor1
                  back=1
                  lead=1
                  plot=forecasts;
      id date interval=month;
      forecast air;
   run;
   ods graphics off;

Here is the output from our first example. Note that our MA3 model forecasts 453 (rounded) for our lead time in Dec 1960.
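To look at the held-out month directly, the forecast data set can be printed for December 1960. This is a sketch that assumes OUTFOR= carries its usual columns (ACTUAL, PREDICT, LOWER, UPPER, and ERROR) alongside the ID variable DATE; it is not part of the original example.

```sas
/* Print the single out-of-sample row written by BACK=1 and LEAD=1. */
proc print data=outfor1 noobs;
   where date = '01DEC1960'd;
   var date actual predict lower upper error;
run;
```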
In order for SAS High-Performance Forecasting to identify the external forecasts, we will use the HPFEXMSPEC procedure. We are specifying a specification name (external_MA3) and a model repository (work.external_repository).

   proc hpfexmspec modelrepository=work.external_repository
                   specname=external_MA3;
      exm;
   run;

As before, the next step is to define a model selection list using the HPFSELECT procedure.

   proc hpfselect modelrepository=work.external_repository
                  selectname=external_select
                  selectlabel="External MA 3";
      specification external_MA3 / exmmap(predict=predict);
   run;

Unlike before, we will need to tell the procedure how to reference the external forecasts. Using the EXMMAP option, we specify the name of the variable (Predict) that contains the forecasts. If we also have information about confidence intervals and standard deviation, we could provide this information as well.

   ods graphics on;
   proc hpfengine data=external_MA3
                  modelrepository=work.external_repository
                  globalselection=external_select
                  out=_null_
                  outfor=outfor2
                  back=1
                  lead=1
                  plot=forecasts;
      id date interval=month;
      forecast air;
      external predict;
   run;
   ods graphics off;

Note that we need to use the data set that contains the external forecasts (not SASHELP.AIR). In our case, this data set was created using the EXPAND procedure and is called external_MA3. The EXTERNAL statement tells the HPFENGINE procedure where to look for the external forecasts. As expected, the resulting outputs are similar to those in Example 1. Again, the MA3 prediction is 453 for Dec 1960.
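The PROC EXPAND step that builds external_MA3 is not shown in this section. A minimal sketch of how such a data set might be produced follows; the choice of transformation operations (MOVAVE 3 followed by LAG 1) is an assumption about how the three-observation moving average was implemented, not a reproduction of the original code.

```sas
/* Build the input data set for the external-forecast example.
   METHOD=NONE: no interpolation, only the listed transformations. */
proc expand data=sashelp.air out=external_MA3 method=none;
   id date;
   /* Keep the original series alongside the forecasts. */
   convert air;
   /* MOVAVE 3: backward moving average over three observations.
      LAG 1:    shift the result by one period, so each value is
                built only from earlier months. */
   convert air=predict / transformout=(movave 3 lag 1);
run;
```

This step would have to run before the HPFENGINE step above, because that step reads both AIR and Predict from external_MA3.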
   run;

Using the EXMFUNC option on the SPECIFICATION statement in PROC HPFSELECT, we are now able to call our user-defined function (move_avg3). As defined earlier, we are passing the actual values of our series to our function (specified by the _ACTUAL_ option), and the function returns the predictions (specified by the _PREDICT_ option).

   proc hpfselect modelrepository=work.user_repository
                  selectname=user_select;
      diagnose seasontest=none;
      specification user_MA3 / exmfunc('move_avg3(_actual_ _predict_ )');
   run;

The main difference from Example 2 is that we are not feeding the HPFENGINE procedure with forecasts that were created before running the procedure. This time we are creating these forecasts on the fly by running our newly defined function. Note that the syntax for the HPFENGINE procedure is almost identical to that in Example 1.

   ods graphics on;
   proc hpfengine data=sashelp.air
                  modelrepository=work.user_repository
                  globalselection=user_select
                  out=_null_
                  outfor=outfor3
                  back=1
                  lead=1
                  plot=forecasts;
      id date interval=month;
      forecast air;
   run;
   ods graphics off;

The results are again similar to those in Examples 1 and 2. The MA3 prediction is 453 for Dec 1960.
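The definition of move_avg3 and of the user_MA3 specification is not reproduced in this section. The sketch below shows one way they might look, written with PROC FCMP. Several details are assumptions rather than the paper's code: the library location work.hpfuser.funcs, the two-array signature (inferred from the _ACTUAL_ and _PREDICT_ arguments), the choice to leave the first three predictions missing, and the use of earlier predictions once the actual values run out.

```sas
/* Assumed location for the compiled subroutine. */
proc fcmp outlib=work.hpfuser.funcs;
   subroutine move_avg3(act[*], pred[*]);
      outargs pred;                 /* pred is filled in by this routine */
      actlen  = dim(act);
      predlen = dim(pred);
      do i = 1 to predlen;
         if i <= 3 then pred[i] = .;   /* fewer than three predecessors */
         else do;
            total = 0;
            do k = 1 to 3;
               j = i - k;
               value = .;
               if j <= actlen then value = act[j];
               /* Past the end of the actuals (or where an actual is
                  missing), fall back on the earlier prediction. */
               if value = . then value = pred[j];
               total = total + value;
            end;
            pred[i] = total / 3;
         end;
      end;
   endsub;
run;

/* Make the compiled subroutine visible to later procedures. */
options cmplib=work.hpfuser;

/* Register an external model specification under the name used above. */
proc hpfexmspec modelrepository=work.user_repository
                specname=user_MA3;
   exm;
run;
```

These steps would need to run before the HPFSELECT step shown above. How the start of the series is handled is one place where implementations can differ, which in turn changes the fit statistics.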
Output 1 Output from Print Statement

Note that by default the model selection criterion is MAPE. The differences of the MAPE values are due to the way we have implemented our forecasting models. As expected, all three models are performing in a similar way.
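As a rough cross-check of a MAPE value, the mean absolute percentage error can be recomputed from a forecast data set. This sketch uses outfor1 from Example 1 and assumes the ACTUAL and PREDICT columns of the OUTFOR= data set; it averages over every row where both values are present, so it need not match the statistic that the HPFENGINE procedure reports for model selection.

```sas
/* Mean absolute percentage error over all rows with both values present. */
proc sql;
   select 100 * mean(abs(actual - predict) / abs(actual)) as mape
   from outfor1
   where actual is not missing
     and predict is not missing
     and actual ne 0;
quit;
```

Replacing outfor1 with outfor2 or outfor3 gives the corresponding figure for the external and user-defined variants.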
CONCLUSION
For automatic forecasting of large numbers of time series, SAS High-Performance Forecasting provides access to the most robust models for forecasting, such as exponential smoothing models, unobserved components models, autoregressive integrated moving average models, and intermittent demand models. As illustrated, SAS High-Performance Forecasting also allows experienced forecasters to extend the models in three ways:
- Custom repositories: You can create your own custom model repository.
- External models: Your forecasts are provided by methods that are external to the system.
- User-defined models: You add forecasting methods that are not provided by SAS High-Performance Forecasting.
By providing this ultimate flexibility, SAS High-Performance Forecasting can be considered as one of the most complete automatic forecasting engines available today.
REFERENCES
SAS Institute Inc. 2004. SAS Institute white paper. Large-Scale Automatic Forecasting with Inputs and Calendar Events. http://www.sas.com/reg/wp/corp/3478
Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. 1994. Time Series Analysis: Forecasting and Control. Englewood Cliffs, NJ: Prentice Hall.
SAS Institute Inc. 2009. SAS High-Performance Forecasting 3.1: User's Guide. Cary, NC: SAS Institute Inc.
ACKNOWLEDGMENTS
A special thanks to Michael Leonard, Meredith John, Mike Gilliland, all SAS Institute Inc., and Snurre Jensen from SAS Institute in Denmark for providing technical and other information used in this paper.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author:
Udo Sglavo
SAS Campus Drive
SAS Institute Inc.
E-mail: [email protected]
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.