Parameters
The library has the following parameters. Some of them have default values, but it's better to tune them for best results.
TSF parameters
- data: DataFrame containing the time series you want to forecast. It may be incomplete (missing dates) and may contain multiple time series for different products/locations.
- date_col: Name of the column containing dates.
- target_col: Name of the column containing the target to be forecasted.
- validation_starting_date: Date on which validation starts. Evaluation is performed on the validation data.
- validation_ending_date: Date on which validation ends. Evaluation is performed on the validation data.
- forecast_starting_date: Date from which you want the forecast to begin.
- forecast_ending_date: Date up to which you want to forecast.
- period: Periodicity of the input data, given as a frequency string such as 'D' for daily, 'M' for monthly, 'A' for annual, 'A-MAR' for annual with the year ending in March, 'MS' for the start of every month, 'ME' for the end of every month, etc. (default 'D')
- aggregate: Frequency at which to forecast: daily, monthly or annual. (default 'M' for monthly)
- agg: Flag indicating whether to aggregate the data. Set to True if, for example, you have daily data and want monthly forecasts. Set to False if the data is already at the forecast frequency, e.g. monthly data for monthly forecasts. (default True)
- starting_date (optional): Date from which the time series should begin, if you don't want to use all of the available historical data. (default None)
- product_col (optional): Name of the column containing the names of the different products. (default None)
- discontinue: Whether to discontinue products with 0 values in the checked time periods. (default False)
- check: List of periods to check when deciding whether to discontinue products. (default [])
- threshold: If discontinue is True, a product is removed when its target total over the check periods falls below this threshold. (default 1)
- covariates: List of covariates to be joined to the time series dataframe. (default None)
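
To make the period/aggregate/agg combination concrete, here is a minimal pandas sketch of what aggregating daily data to monthly totals per product can look like. It is an illustration only, not the library's internal code; the column names (date, product, sales) and the sample values are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical daily data for two products, with missing dates.
data = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-01-01", "2023-01-15", "2023-02-03", "2023-01-10", "2023-02-20"]
    ),
    "product": ["A", "A", "A", "B", "B"],
    "sales": [10, 5, 7, 3, 4],
})

# Sum the daily target into month-start buckets, separately for each product.
# "MS" (month start) is used here as the monthly frequency string.
monthly = (
    data.groupby(["product", pd.Grouper(key="date", freq="MS")])["sales"]
    .sum()
    .reset_index()
)

print(monthly)
```

With data that is already monthly, this step would be unnecessary, which corresponds to setting agg to False.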
Feature Engineering parameters
- n_lags: Number of time lags to use. (default 12)
- order: Order of the time trend to use. (default 3)
- fourier: Number of Fourier components to use for seasonality. (default 0)
- seasonal_features: Whether to use trend and seasonality features. (default True)
- cyclic_features: Whether to use cyclic features, i.e., sine and cosine waves encoding the date. These capture the fact that, for example, the 30th of one month and the 1st of the next are close together even though the raw numbers are far apart. (default True)
- time_features: Whether to use the time features year, month and quarter. (default False)
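
The feature types listed above can be sketched with plain pandas and NumPy. This is a hedged illustration of the general techniques (lags, a polynomial time trend, and sine/cosine encoding), not the library's actual implementation; the column names and the small values chosen for n_lags and order are assumptions for the example. The cyclic encoding is shown for month of year, where December and January play the same role as the 30th and the 1st of a month.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series; names and values are illustrative only.
df = pd.DataFrame({
    "date": pd.date_range("2022-01-01", periods=24, freq="MS"),
    "sales": np.arange(24, dtype=float),
})

# Lag features: the target shifted back by 1..n_lags periods.
n_lags = 3
for lag in range(1, n_lags + 1):
    df[f"lag_{lag}"] = df["sales"].shift(lag)

# Polynomial time trend up to the given order.
order = 2
t = np.arange(len(df))
for power in range(1, order + 1):
    df[f"trend_{power}"] = t ** power

# Cyclic features: map the month onto a circle so that month 12 and
# month 1 end up next to each other.
month = df["date"].dt.month
df["month_sin"] = np.sin(2 * np.pi * month / 12)
df["month_cos"] = np.cos(2 * np.pi * month / 12)


def cyclic_distance(m1, m2):
    """Euclidean distance between two months in the (sin, cos) encoding."""
    a1, a2 = 2 * np.pi * m1 / 12, 2 * np.pi * m2 / 12
    return float(np.hypot(np.sin(a1) - np.sin(a2), np.cos(a1) - np.cos(a2)))


print(cyclic_distance(12, 1), cyclic_distance(12, 6))
```

The first n_lags rows contain missing values in the lag columns, so they would normally be dropped or imputed before fitting a model.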
Forecasting parameters
- model: The regression model you want to use. Can be a model from sklearn, xgboost, or any other valid regressor with fit() and predict() methods. (default StackingRegressor with base estimators CatBoostRegressor and RandomForestRegressor and final estimator HuberRegressor)
Note that the tsf() method accepts the same parameters as the Feature Engineering and Forecasting parameters listed here.
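
As a sketch of the kind of object the model parameter expects, the snippet below builds a stacking regressor similar in shape to the documented default. To keep it runnable with scikit-learn alone, GradientBoostingRegressor is swapped in for CatBoostRegressor; that substitution, the hyperparameters, and the synthetic data are assumptions for illustration and do not reproduce the library's defaults.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import HuberRegressor

# Stacked model: two tree-based base estimators, with a robust linear
# model combining their predictions.
model = StackingRegressor(
    estimators=[
        # Stand-in for CatBoostRegressor (assumption, see the note above).
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=HuberRegressor(),
)

# Synthetic regression data, used only to show the fit()/predict() contract.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

model.fit(X, y)
preds = model.predict(X[:5])
print(preds)
```

Any object exposing fit() and predict() in this way should be a candidate for the model parameter.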