
Usage

This library can be used in two ways. In both cases, you first have to perform these steps:

  1. Import the library and read the historical data and any covariate datasets.

    import pandas as pd
    from tsf import TSF

    df = pd.read_parquet("data/delivery_data.parquet")
    weather_df = pd.read_csv("data/weather.csv")
  2. Create the TSF instance with the desired parameters.

    tsf = TSF(data=df, date_col="date", target_col="quantity_delivered",
              validation_starting_date="2023-01-01", validation_ending_date="2023-12-01",
              forecast_starting_date="2024-01-01", forecast_ending_date="2024-12-01",
              period="D", aggregate="M", agg=True, starting_date="2018-01-01",
              product_col="brand", discontinue=True,
              check=[("2023-01-01", "2023-12-01")], covariates=[weather_df])
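
The exact input schema TSF expects is not spelled out here; as a hedged sketch, a historical dataset consistent with the column names passed above (`date`, `quantity_delivered`, `brand`) might look like the toy frame below (the real data is read from `data/delivery_data.parquet`):

```python
import pandas as pd

# Toy historical dataset matching the column names used above.
# The library's actual schema requirements may differ; this is an assumption.
toy_df = pd.DataFrame({
    "date": pd.date_range("2018-01-01", periods=6, freq="D").repeat(2),
    "brand": ["acme", "globex"] * 6,
    "quantity_delivered": [120, 95, 130, 90, 125, 99, 118, 101, 140, 88, 133, 97],
})
print(toy_df.head())
```

One row per (date, product) pair, with the target in its own numeric column, is the usual long format for multi-product forecasting.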

Step-by-step

Use this approach if you want to perform preprocessing, feature engineering, and forecasting one step at a time.

This is useful if you want to apply custom preprocessing or feature engineering between the library's steps, or if you need the intermediate dataframes rather than only the final forecast.

  1. Preprocessing:

    df = tsf.preprocess()
  2. Feature engineering with the appropriate parameters:

    df = tsf.engineer(df=df, n_lags=12, order=2, fourier=6,
                      seasonal_features=False, cyclic_features=True, time_features=True)
  3. Forecasting with an appropriate model:

    from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
    from sklearn.linear_model import HuberRegressor

    rf = RandomForestRegressor(random_state=42)
    gb = GradientBoostingRegressor(random_state=42)
    hr = HuberRegressor()
    model = StackingRegressor(estimators=[("rf", rf), ("gb", gb)], final_estimator=hr, passthrough=False)

    df, model, total_scores, product_scores = tsf.forecast(df=df, model=model)
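
The internals of `tsf.engineer` are not documented here. As a rough, hedged illustration only (not the library's implementation), lag features, Fourier terms, and simple time features of the kind its parameters suggest can be built in plain pandas/NumPy:

```python
import numpy as np
import pandas as pd

# Monthly toy series standing in for the output of tsf.preprocess().
feats = pd.DataFrame({
    "date": pd.date_range("2022-01-01", periods=24, freq="MS"),
    "quantity_delivered": np.arange(24, dtype=float),
})

# Lag features (cf. n_lags): shifted copies of the target.
for lag in (1, 2, 3):
    feats[f"lag_{lag}"] = feats["quantity_delivered"].shift(lag)

# Fourier terms (cf. fourier): sine/cosine pairs over a 12-month cycle.
month = feats["date"].dt.month
for k in (1, 2):
    feats[f"sin_{k}"] = np.sin(2 * np.pi * k * month / 12)
    feats[f"cos_{k}"] = np.cos(2 * np.pi * k * month / 12)

# Simple calendar features (cf. time_features).
feats["year"] = feats["date"].dt.year
feats["month"] = month

feats = feats.dropna()  # drop rows lost to lagging
```

How the library handles per-product lagging, polynomial `order`, or NaN rows is its own business; this sketch only shows the general shape of such features.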

In one go

Alternatively, if you want to perform the full forecast directly, use this method with the appropriate parameters:

from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import HuberRegressor

rf = RandomForestRegressor(random_state=42)
gb = GradientBoostingRegressor(random_state=42)
hr = HuberRegressor()
model = StackingRegressor(estimators=[("rf", rf), ("gb", gb)], final_estimator=hr, passthrough=False)

df, model, total_scores, product_scores = tsf.tsf(
    model=model, n_lags=3, order=2, fourier=2,
    seasonal_features=True, cyclic_features=True, time_features=False)
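
The stacked model itself is standard scikit-learn and can be sanity-checked independently of TSF on synthetic data, for example:

```python
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.linear_model import HuberRegressor

# Synthetic regression problem: linear signal plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

model = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(random_state=42)),
                ("gb", GradientBoostingRegressor(random_state=42))],
    final_estimator=HuberRegressor(),
    passthrough=False,
)
model.fit(X[:150], y[:150])

# R^2 on the held-out tail confirms the ensemble fits and predicts.
r2 = model.score(X[150:], y[150:])
print(f"holdout R^2: {r2:.3f}")
```

This is useful when debugging: if the stack trains fine here but fails inside `tsf.tsf`, the problem lies in the data or parameters passed to the library, not the model.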