Modeling Commodity Prices
I wanted to try out an idea that's been rolling around in my head for a while. Given a small amount of multivariate time series data, what if I used a mixture of splines and tensor decomposition to train a multivariate time series forecasting model? What could we take away as insight about the underlying causal system generating that data? Would that kind of model perform well enough to be valuable, and if so, how?
There's a lot to unpack here. Let's start with a high-level overview of the dataset.
The most efficient way to proceed is to list the questions that come to mind about this modeling approach.
- Why not just use neural methods (e.g., LSTMs, RNNs, Transformers) for this multivariate sequence modeling problem? This is a question that's definitely worth asking. There are many cases of neural models outperforming other kinds of models. However, it's important to know that they aren't without their issues. The biggest one here is that these techniques often require relatively large amounts of data to train well.
- Why not use VARIMA, SARIMAX, or something similar since this is a time series modeling problem? Given the small amount of data for this problem, we don't have any discernible evidence of seasonality (the S in SARIMAX), and given the variance of the dataset there's good reason to believe that relatively simple linear representations (the AutoRegressive and Moving Average terms in VARIMA) will not be able to represent the patterns well. There's another reason why I wanted to at least consider other techniques: I wanted to reduce inductive bias as much as possible in the beginning phases of solving this problem. The mathematics behind these models explicitly builds in seasonality and trend. I wanted to start with a more generic sequence modeling approach, since I already know that there isn't much inherent structure within the dataset that can be leveraged by a corresponding symmetry in the model. Basically, I have reason to believe that this would be better modeled with pure sequence modeling.
- Why not use a different kind of sequence modeling like HMMs (Hidden Markov Models)? Well, why not just use Q-learning while I'm at it? Casting multidimensional sequence modeling as a state transition process introduces the kind of bias that, while extremely flexible, requires too much data to fit properly.
- Are you really sure about this approach? Honestly, no. But the nice thing about this is that I can create ensembles later on. I can add complexity to this model until the system as a whole behaves as I need - and splines are a very flexible way to do that. For example, when looking at a visualization of my data, it's hard to think that smooth splines could reasonably represent the spikiness in the data. However, later on I can ensemble the splines with a Generalized Additive Model (GAM) to increase the range of what can be represented. I can also tune hyperparameters like the number of knots to increase its capacity. Ultimately, it's a pretty good place to start.
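
To make the point about knots concrete, here is a minimal sketch of how the knot count changes what a spline can represent. It is not the model from this post: the series is synthetic (a smooth wave plus one narrow spike), the helper names `spline_basis`, `fit_spline`, and `predict` are hypothetical, and the truncated-power basis with evenly spaced knots is just one simple way to build a regression spline using only NumPy.

```python
import numpy as np

def spline_basis(x, knots, degree=3):
    """Truncated-power basis: 1, x, ..., x^d, plus (x - k)_+^d for each knot k."""
    x = np.asarray(x, dtype=float)
    cols = [x ** p for p in range(degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def fit_spline(x, y, n_knots, degree=3):
    """Least-squares fit with n_knots interior knots spaced evenly over x."""
    knots = np.linspace(x.min(), x.max(), n_knots + 2)[1:-1]
    coef, *_ = np.linalg.lstsq(spline_basis(x, knots, degree), y, rcond=None)
    return knots, coef

def predict(x, knots, coef, degree=3):
    """Evaluate the fitted spline at the points x."""
    return spline_basis(x, knots, degree) @ coef

# Synthetic stand-in for a spiky series (an assumption, not the real data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 120)
spike = 1.5 * np.exp(-((x - 0.6) / 0.02) ** 2)
y = np.sin(2 * np.pi * x) + spike + rng.normal(0.0, 0.05, x.size)

# Compare the in-sample error for a few knot counts.
train_errors = {}
for n_knots in (3, 7, 31):
    knots, coef = fit_spline(x, y, n_knots)
    resid = y - predict(x, knots, coef)
    train_errors[n_knots] = float(np.sqrt(np.mean(resid ** 2)))
    print(f"{n_knots:>2} knots -> training RMSE {train_errors[n_knots]:.4f}")
```

More knots let the curve bend sharply enough to follow the spike, so the in-sample error drops. That number only measures flexibility, though. For forecasting, the knot count would have to be chosen on held-out data, since a spline with too many knots will start fitting noise.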


