The Future of Predictive Modeling
The future of predictive modeling is, undoubtedly, closely tied to artificial intelligence. As computing power continues to increase, data collection rises exponentially, and new technologies and methods are born, computers will bear the brunt of the load when it comes to creating models. The global management consulting firm McKinsey and Co. recently studied future trends, some of which are detailed below.
Partly because of recent advances in computing power and data volume, predictive modeling technologies have improved to the point that newsworthy breakthroughs occur regularly. Predictive algorithms are becoming extremely sophisticated in many fields, notably computer vision, complex games, and natural language.
Changes in Work
With more intelligent computers, the work of predictive modeling professionals, much like that of other occupations, will change to adapt to newly available predictive technology. People who work in predictive modeling are unlikely to become obsolete, but their roles will shift in a way that complements new predictive technological features and abilities, and they will need to acquire new skills to excel in these new roles.
Advances in predictive technology are extremely promising in terms of commercial and scientific value creation, but they do require risk mitigation as well. Some of these risks center on data privacy and security. With exponential increases in data volume, the importance of protecting data from hackers and mitigating other privacy concerns increases as well. Additionally, researchers point out the risk of hardwiring overt and unconscious societal biases into predictive models and algorithms, an issue that will be of great importance to policymakers and big technology companies.
The Limitations of Predictive Modeling
Despite its numerous high-value benefits, predictive modeling certainly has its limitations. Unless certain conditions are met, predictive modeling may not provide the entirety of its potential value. In fact, if these conditions are not met, predictive models may not provide any value over legacy methods or conventional wisdom. It is important to consider these limitations to capture the maximum amount of value from predictive modeling initiatives. According to McKinsey and Co., which recently analyzed use cases, value creation, and limitations, here are some of the challenges:
Especially in machine learning, in which a computer constructs the predictive model, data must be labeled and categorized appropriately. This process can be imprecise, full of errors, and generally a colossal undertaking. However, it is a necessary component of constructing a model, and, if proper classification and labeling cannot be completed, any predictive model produced will suffer from poor performance and issues associated with improper categorization.
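To make the cost of labeling errors concrete, here is a minimal sketch in Python. It is an illustration only, not a method from the McKinsey analysis: the one-dimensional data, the `true_label` rule, the nearest-neighbor classifier, and the "flip every fourth label" noise are all assumptions chosen to keep the example small.

```python
def nearest_neighbor_label(train, x):
    """Return the label of the training point closest to x (1-nearest-neighbor)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(train, test):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(nearest_neighbor_label(train, x) == y for x, y in test)
    return hits / len(test)

def true_label(x):
    # Hypothetical ground truth: class 1 at or above 10, class 0 below.
    return 1 if x >= 10 else 0

# Training set with correct labels.
clean_train = [(x, true_label(x)) for x in range(20)]
# Simulate annotation mistakes by flipping every fourth label (assumed noise pattern).
noisy_train = [(x, 1 - y) if x % 4 == 0 else (x, y) for x, y in clean_train]
# Test points sit slightly to the right of each training point.
test = [(x + 0.4, true_label(x + 0.4)) for x in range(20)]

clean_accuracy = accuracy(clean_train, test)
noisy_accuracy = accuracy(noisy_train, test)
print(clean_accuracy, noisy_accuracy)
```

In this toy setup the model trained on mislabeled records scores lower on the same test points, because it faithfully reproduces the labeling mistakes it was given.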
Obtaining Massive Training Datasets
In order for statistical methods to be consistently successful at predicting outcomes, a basic tenet needs to be met: sufficient sample size. If a predictive modeling professional doesn’t have sufficient amounts of data to construct the model, the model produced will be unduly influenced by noise in the data that is used. Of course, relatively small datasets tend to exhibit more variation or, in other words, more noise. Currently, the number of records required to reach sufficiently high model performance ranges from the thousands to the millions. In addition to size, the data used must be representative of the target population. If the sample size is large enough, the data should have a wide variety of records, including unique or odd cases, to refine the model.
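The link between sample size and noise can be illustrated with a short simulation. This is a sketch under stated assumptions: the data are drawn from a standard normal distribution, the "model" is just the sample mean, and the function name `spread_of_estimates` and the sizes 10 and 1000 are chosen purely for illustration.

```python
import random
import statistics

def spread_of_estimates(sample_size, trials=200, seed=0):
    """Standard deviation of the sample mean across repeated draws of one size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

# The true mean is 0.0; the spread shows how far estimates wander from it.
small_spread = spread_of_estimates(10)
large_spread = spread_of_estimates(1000)
print(f"n=10:   spread of estimates = {small_spread:.3f}")
print(f"n=1000: spread of estimates = {large_spread:.3f}")
```

Estimates built from 10 records scatter far more widely around the true value than estimates built from 1,000 records, which is the sense in which a model fit to a small dataset is unduly influenced by noise.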
The Explainability Problem
As more complex and esoteric models and methodologies become available, it will often be a great challenge to untangle models to determine why a certain decision or prediction was made. As models take in more data records or more variables, the factors that could explain predictions become murky, a significant limitation in some fields. In industries or use cases that require explainability, such as environments that have significant legal or regulatory consequences, the need to document processes and decisions can hinder the use of complex models. This limitation will likely drive demand for new methodologies that can handle huge data volumes and complexities while also remaining transparent in decision making.
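For contrast, it may help to see what a transparent model looks like. The sketch below assumes a simple linear scoring model, where every prediction splits cleanly into one contribution per variable. The function `explain_linear`, the variable names, and the weights are hypothetical and are not taken from any real system.

```python
def explain_linear(weights, bias, features):
    """Return a linear model's prediction and each feature's contribution to it."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    prediction = bias + sum(contributions.values())
    return prediction, contributions

# Hypothetical weights for an illustrative scoring model.
weights = {"income": 0.5, "debt": -2.0}
bias = 1.0
applicant = {"income": 4.0, "debt": 1.5}

prediction, contributions = explain_linear(weights, bias, applicant)
print(prediction)      # 0.0
print(contributions)   # {'income': 2.0, 'debt': -3.0}
```

Here the reason for the score can be read directly from the contributions: debt pulled the score down by more than income raised it. Complex models with many interacting variables generally do not decompose this neatly, which is the difficulty described above.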
Generalizability of Learning
Generalizability refers to the ability of the model to be generalized from one use case to another. Unlike humans, models tend to struggle with generalizability, also known as external validity. In general, when a model is constructed for a particular case, it should not be used for a different case. Although methods like transfer learning, an approach that attempts to remedy this very issue, are in development, generalizability remains a significant limitation of predictive modeling.
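A small numerical sketch shows why reusing a model outside its original case is risky. The assumptions here are deliberately simple: the true relationship is y = x squared, the model is a straight line fit by ordinary least squares, the original case covers x from 0 to 1, and the new case covers x from 2 to 3. All names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(slope, intercept, xs, ys):
    """Average absolute gap between the line's predictions and the true values."""
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

def truth(x):
    # Assumed true relationship, unknown to the model.
    return x * x

original_x = [i / 10 for i in range(11)]      # original case: x from 0 to 1
new_x = [2 + i / 10 for i in range(11)]       # different case: x from 2 to 3

slope, intercept = fit_line(original_x, [truth(x) for x in original_x])
in_case_error = mean_abs_error(slope, intercept, original_x, [truth(x) for x in original_x])
new_case_error = mean_abs_error(slope, intercept, new_x, [truth(x) for x in new_x])
print(f"error on original case: {in_case_error:.3f}")
print(f"error on new case:      {new_case_error:.3f}")
```

The line is a reasonable approximation inside the range it was built for and a poor one outside it, even though the underlying relationship never changed.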
Bias in Data and Algorithms
Though it’s more of an ethical or philosophical issue than a technical one, some argue that researchers and professionals creating predictive models must be careful when choosing which data to use and which to exclude. Because historical biases can be engrained at the lowest level of data, great care must be taken when attempting to address these biases, or their repercussions could be perpetuated into the future by predictive models.
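One simple way to look for bias already present in historical data is to compare outcome rates across groups before any model is built. The sketch below is an assumption-laden illustration: the groups "A" and "B", the records, and the function `positive_rate_by_group` are invented for the example, and a gap in rates is a prompt for closer review, not proof of bias on its own.

```python
from collections import defaultdict

def positive_rate_by_group(records):
    """Share of positive outcomes (1) per group in (group, outcome) records."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        positives[group] += outcome
    return {group: positives[group] / totals[group] for group in totals}

# Hypothetical historical decisions: 1 = approved, 0 = declined.
records = [("A", 1)] * 8 + [("A", 0)] * 2 + [("B", 1)] * 3 + [("B", 0)] * 7

rates = positive_rate_by_group(records)
gap = max(rates.values()) - min(rates.values())
print(rates)
print(f"gap between groups: {gap:.2f}")
```

A model trained on records like these would tend to learn and repeat the gap, which is how past patterns can be carried into future predictions.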