Data science has been evolving rapidly, with a seemingly endless number of new applications each year. Recent use cases focus on streamlining the data science process by removing barriers that slow down model building, such as unclean or missing data. In this blog post, we’ll discuss three machine learning trends that can revolutionize your 2019.
1. Improve Data Quality via Outlier Detection
Innovations like the Internet of Things (IoT) and unstructured data are producing exponentially more data, allowing organizations to discover unexpected insights from new and more granular sources. However, these growing sources of data introduce a greater risk of training machine learning models on unclean data and therefore producing inaccurate forecasts. The risks from bad data extend beyond poor forecasts—IBM estimates that bad data costs the US economy roughly $3.1 trillion each year.
Luckily, machine learning is here to help through outlier detection, which uses an algorithm to identify observations that are extremely different from the rest of the data, allowing data scientists to determine whether these extreme values are genuine or were recorded erroneously. Running outlier detection on a data set before making predictions can save time and money and helps ensure the model is trained on clean data. As more objects become “smart” and the sources of data continue to expand, expect to see outlier detection implemented more frequently.
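To make the idea concrete, here is a minimal sketch of one common outlier-detection technique, the median-based "modified z-score" (Iglewicz and Hoaglin). The sensor readings and the 3.5 threshold are illustrative assumptions, not data or methods from any particular product:

```python
from statistics import median

def detect_outliers(values, threshold=3.5):
    """Return indices of candidate outliers using the modified z-score.

    The median and median absolute deviation (MAD) are used instead of
    the mean and standard deviation because they are far less distorted
    by the very outliers we are trying to find.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        return []  # data has no spread to measure against
    # 0.6745 rescales MAD to be comparable to a standard deviation
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]

# A hypothetical sensor feed where one reading was logged incorrectly
readings = [21.5, 22.0, 21.8, 22.3, 21.9, 22.1, 98.6, 21.7]
print(detect_outliers(readings))  # -> [6], the 98.6 spike
```

The flagged indices are handed back to a human: the algorithm surfaces the suspicious observations, and the data scientist decides whether each one is a true value or an error to be removed.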
2. Continued Growth of Natural Language Processing
Natural language processing (NLP), a subfield of artificial intelligence that allows computers to process and act on human language, has seen explosive growth over the past year, mostly through chatbots and AI assistants like Alexa. But NLP is showing up in new and unexpected places. For example, Google now interprets the intent behind searches rather than simply matching the words themselves, and car manufacturers are installing AI assistants in their vehicles for facial recognition and even music recommendations. These features allow people to quickly gain more information and control their environment with simple commands and fewer clicks.
MicroStrategy already integrates natural language processing into its platform. Users can ask Alexa questions about their KPIs, use natural language queries to ask questions about their data in dossiers, and find answers in new visualizations. With MicroStrategy 2019, HyperIntelligence will take NLP to the next level and provide answers with zero clicks. This technology lets users hover over a word in their browsers and immediately see pertinent information about it in a card right on their screen. In 2019, expect to see NLP utilized in ways that democratize data and speed up interactions between analysts and their data sets.
3. Automated Machine Learning with AI
Data scientists are in short supply, and model creation is a time-intensive process involving data cleaning, model selection, and many rounds of training and testing; both factors slow the rate at which machine learning models can be deployed. To combat this, AI is being used to both create and tune models, significantly speeding up model construction and allowing analysts who lack coding skills to build their own models.
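The core loop behind automated model construction can be sketched in miniature: fit several candidate models, score each on held-out validation data, and keep the winner. This toy example is an illustration of that idea only — the candidate models, data, and names are invented for this sketch and are not the API of Google AutoML or any other product, which search vastly richer model and hyperparameter spaces:

```python
import random

def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares fit for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return lambda x: a * x + b

def validation_error(model, xs, ys):
    """Mean squared error on a held-out set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data with a clear linear trend plus a little noise
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [3 * x + 1 + random.gauss(0, 0.1) for x in xs]
train_x, val_x = xs[::2], xs[1::2]
train_y, val_y = ys[::2], ys[1::2]

# The "automated" part: no human picks the model; validation error does
candidates = {"mean": fit_mean, "linear": fit_linear}
best = min(candidates,
           key=lambda name: validation_error(
               candidates[name](train_x, train_y), val_x, val_y))
print(best)  # -> linear
```

Production AutoML systems extend this same select-by-validation-score loop across many model families and hyperparameter settings, which is why they can hand a working model to an analyst who never writes the search code themselves.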
In 2017, Google released AutoML, a cloud-based collection of machine learning products that automates the construction of machine learning models, with additional modules for natural language processing and image detection. In 2019, expect enterprises to start integrating these products into their model suites, and other companies to release similar packages of their own.
Interested in getting in on these trends with MicroStrategy? Check out our machine learning offerings and get started.