Brute force correlation investment models are becoming a thing of the past. Investors need to harness the power of AI to deal with the exploding amount of market data, but require observability into the investment process to get comfortable.
Ensembling allows big data to be disaggregated into unique, observable investment insight. EquBot AI platforms use multiple focused models rather than a single large model to produce that insight.
EquBot believes industry innovation will continue as a result of using ensembling techniques on large investment data sets.
Ensembling is the statistical approach of combining multiple algorithms to produce better predictive models. Unlike early quantitative funds, which leaned heavily on singular brute force correlation models (the “black box” approach to deriving investment recommendations), new AI investment models are delivering superior results by employing multiple smaller, more focused modules. In this piece we will discuss the advantages of ensembling investment data models beyond superior performance, but the analogy to consider is a single musician playing multiple instruments versus a group of specialized musicians. There are certainly talented individuals who can accomplish musical masterpieces while playing a variety of instruments simultaneously, but the result is often lower quality and less sophisticated than the output of a well-trained band or orchestra.
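
To make the idea concrete, here is a minimal Python sketch of the technique: several small specialist models each score a security, and a weighted average combines them into a single signal. The model names, features, and weights are hypothetical placeholders for illustration, not EquBot's actual models.

```python
# Illustrative sketch of ensembling: small specialist models each score
# a security, then a weighted average combines them into one signal.
import numpy as np

def momentum_model(prices: np.ndarray) -> float:
    """Score the recent price trend (toy example: 20-day return)."""
    return prices[-1] / prices[-20] - 1.0

def sentiment_model(sentiment_scores: np.ndarray) -> float:
    """Score average news sentiment over the window."""
    return float(np.mean(sentiment_scores))

def valuation_model(pe_ratio: float, sector_pe: float) -> float:
    """Score relative valuation versus the sector (cheaper -> higher)."""
    return (sector_pe - pe_ratio) / sector_pe

def ensemble_signal(scores: list[float], weights: list[float]) -> float:
    """Combine the specialist scores into a single investment signal."""
    w = np.asarray(weights)
    return float(np.dot(scores, w) / w.sum())

# Toy data: a gently rising price series and a few sentiment readings.
prices = np.linspace(95, 105, 60)
sentiment = np.array([0.1, 0.3, 0.2])
scores = [
    momentum_model(prices),
    sentiment_model(sentiment),
    valuation_model(pe_ratio=18.0, sector_pe=22.0),
]
print(f"ensemble signal: {ensemble_signal(scores, weights=[0.4, 0.3, 0.3]):+.3f}")
```

Because each specialist is a separate, inspectable function, a surprising ensemble output can be traced back to the individual score that drove it, which is the observability advantage discussed throughout this piece.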

Originally, quantitative investment models could hardly be explained. So much data was piped into a program that the end result was often confusing and questionable, with no hope of constructing a clear rationale for the investment output beyond a broad appeal to data correlation. This archaic approach is prone to failure on numerous levels. Investment data is perpetually plagued with misprints and errors, and finding a data issue in one of these broad correlation models is the virtual equivalent of finding a needle in a haystack. Because investment datasets are so large, quantitative desks try to explain this issue away through the law of large numbers, but as seasoned data scientists know, a large error on a key signal in a trained model can be disastrous: in many common cases, a single bad price print can push a model to increase risk incorrectly. It is difficult to trust a system lacking observability. Returning to the analogy, if the solo performer fails on one instrument there is little to no support from others, and the performance goes off track. This has historically been true for many of the founding algorithmic trading models.

Ensembling through smaller, distinct models offers the opportunity for regular monitoring of specific model areas. Large price changes, or increases in sentiment that may lead to investment purchases, can be identified and validated given a proper system architecture. The data issue still persists at the system's data ingestion level, but with proper checks in place an alert can be delivered to flag outsized price developments or extreme investment outputs. Strings break during orchestral performances, but observant conductors can lead a performance through such difficulties and allow supporting performers to carry the tune. An output free of issues would obviously be preferable, but as with performance art, data and system issues occur regularly for AI investment platforms. Ensembling gives data scientists and quantitative traders a fighting chance to resolve these issues in a timely manner.
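
As a hedged illustration of the kind of ingestion-level check described above (not EquBot's production system), the sketch below flags an outsized price move before it reaches downstream models; the threshold and the logging destination are assumptions chosen for the example.

```python
# Sketch of an ingestion-level sanity check: flag outsized price moves
# before they reach downstream models. Threshold is a hypothetical value.
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("ingestion-checks")

MAX_DAILY_MOVE = 0.25  # flag any one-day move beyond +/-25% (assumed threshold)

def validate_price(ticker: str, prev_close: float, new_price: float) -> bool:
    """Return True if the print passes the sanity check, else alert and reject."""
    if prev_close <= 0 or new_price <= 0:
        logger.warning("%s: non-positive price (%.2f -> %.2f)",
                       ticker, prev_close, new_price)
        return False
    move = abs(new_price / prev_close - 1.0)
    if move > MAX_DAILY_MOVE:
        logger.warning("%s: outsized move %.1f%% (%.2f -> %.2f); holding for review",
                       ticker, 100 * move, prev_close, new_price)
        return False
    return True

# A bad print (e.g., a misplaced decimal) is caught before models see it.
validate_price("XYZ", prev_close=104.20, new_price=10.42)   # alert, rejected
validate_price("XYZ", prev_close=104.20, new_price=103.75)  # passes
```

In an ensemble architecture, a rejected print quarantines only the input to the affected specialist model, so the remaining models can continue to carry the tune while the issue is reviewed.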

Stepping away from the musical metaphor and looking at what the data says about the investment management industry, we see both retail and institutional investors favoring transparency. Understanding why a position is being traded aligns closely with institutional requirements, as large investors rarely accept investment results without supporting material. This approach also allows a more candid discussion of what the data actually shows, free from the conjectures of a human portfolio manager. Trillions of dollars in assets will continue to flow into US ETFs given the improved liquidity and transparency of these investment vehicles, and we see this trend only continuing as global markets develop and grow.

Overfitting and bias correction capabilities are important considerations in long-lived quantitative models. Here an ensemble approach can lead to better results and more timely corrective action, since smaller, separate models are easier to diagnose and adjust. Retraining models is a must as data sets continue to grow, so we see managing smaller focused models as a more efficient methodology than the bulky black box models so frequently used in the past.
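
The sketch below illustrates, under assumed model names and thresholds, why separate small models are convenient to retrain: each specialist is refit on its own window and its out-of-sample error is checked independently, so overfitting in one model can be flagged without retraining a monolith.

```python
# Illustrative retraining loop for small specialist models: refit each on
# recent data, hold out a validation window, and flag a widening gap
# between training and validation error as possible overfitting.
import numpy as np

rng = np.random.default_rng(0)

def fit_linear(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares fit for one small specialist model."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mse(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:
    return float(np.mean((X @ w - y) ** 2))

def retrain_and_check(name: str, X: np.ndarray, y: np.ndarray,
                      holdout: int = 50, max_gap: float = 2.0) -> np.ndarray:
    """Refit on the recent window; warn if validation error balloons."""
    X_tr, y_tr = X[:-holdout], y[:-holdout]
    X_va, y_va = X[-holdout:], y[-holdout:]
    w = fit_linear(X_tr, y_tr)
    train_err, val_err = mse(X_tr, y_tr, w), mse(X_va, y_va, w)
    if val_err > max_gap * train_err:  # crude overfitting check (assumed ratio)
        print(f"[{name}] possible overfit: val {val_err:.4f} vs train {train_err:.4f}")
    return w

# Synthetic features and targets standing in for two specialist models.
X = rng.normal(size=(500, 5))
y = X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + rng.normal(scale=0.1, size=500)
w_momentum = retrain_and_check("momentum", X, y)
w_sentiment = retrain_and_check("sentiment", X, y)
```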

There are exceptional models and individuals that can perform multiple tasks at a high level, but statistical models should drive investment and operational decisions. Error reduction, observability, and sustainability all support an ensemble AI-powered investment framework in a growing data environment. As tools and product offerings in the AI space continue to be bolstered by innovation from large and small technology and finance players, we anticipate the ensembling practice becoming more normalized. This ultimately should translate to better experiences with AI investment products.