Now that so many companies have had a taste of self-service BI, business users are looking for even more control over their data. In order to take their data-backed decision making to the next level, users are demanding access to information from multiple enterprise sources, including cloud applications, social media, departmental data, and big data platforms such as Hadoop (MapReduce and NoSQL), to name a few.
The hunger for more information has led to the transformation of data architectures. Today, organizations seek new ways to replace their legacy data systems with Hadoop and other modern sources. Luckily, as discussed in our recent webcast, the business intelligence technology available today enables users to connect to, prepare, and visualize information in any big data environment.
The Value of Hadoop
Several technical and business scenarios have emerged as the most common reasons for moving data from relational stores into Hadoop. In situations where users have large amounts of unstructured data requiring a high level of processing, Hadoop is quickly becoming the most viable option. Why?
Many companies are moving cold data (data sets not frequently called upon) to Hadoop because of the potential to bring down storage costs, reduce the load on the data warehouse, extend the life of the current data warehouse, and establish the foundation for a modern data architecture. This is executed through horizontal data partitioning (best for large historical data sets 20 to 50 years old) and vertical data partitioning (best for preserving full, original data sets across joined systems).
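To make the two partitioning styles concrete, here is a minimal sketch in plain Python. The record layout, field names, and the cutoff date are illustrative assumptions, not details from any particular product: horizontal partitioning splits whole rows by age, while vertical partitioning splits columns by access pattern.

```python
from datetime import date

# Hypothetical order records; field names are illustrative only.
records = [
    {"id": 1, "placed": date(1998, 3, 14), "amount": 120.0, "notes": "bulk"},
    {"id": 2, "placed": date(2015, 6, 2), "amount": 35.5, "notes": "gift"},
    {"id": 3, "placed": date(2023, 1, 9), "amount": 80.0, "notes": "repeat"},
]

CUTOFF = date(2010, 1, 1)  # assumed boundary between "cold" and "hot" data

# Horizontal partitioning: split whole rows by age, so cold rows can be
# offloaded to low-cost Hadoop storage while hot rows stay in the warehouse.
cold_rows = [r for r in records if r["placed"] < CUTOFF]
hot_rows = [r for r in records if r["placed"] >= CUTOFF]

# Vertical partitioning: split columns, keeping frequently queried fields
# in the warehouse and bulky, rarely read fields in the external store.
hot_columns = [{"id": r["id"], "amount": r["amount"]} for r in records]
cold_columns = [{"id": r["id"], "notes": r["notes"]} for r in records]
```

In practice the same split is expressed declaratively (for example, as partitioned tables in Hive or date-based directory layouts in HDFS) rather than in application code, but the row-versus-column distinction is the same.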
Moving to Hadoop
In order to properly harness data from disparate sources like Hadoop, organizations need an adaptive model. This way, companies can move from legacy systems to modern data options without impacting existing reports, dashboards, or projects. Without an adaptive model, it can take data scientists months, even years, to handwrite all of the necessary calling scripts.
A hybrid data warehouse approach is typically employed with the adaptive model to virtually and dynamically combine data stored in warehouses with other information stored in external, independent systems, like Hadoop.
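The hybrid idea above can be sketched in a few lines of Python. Everything here is an illustrative assumption (the customer table, the clickstream events, and the `hybrid_report` helper): the point is that the report joins the warehouse and the external store at query time, without copying the external data into the warehouse.

```python
# Warehouse table, keyed by customer id (would live in the RDBMS).
warehouse = {
    101: {"name": "Acme", "region": "east"},
    102: {"name": "Globex", "region": "west"},
}

# Raw clickstream events (would live in an external system such as Hadoop).
external_events = [
    {"customer_id": 101, "clicks": 42},
    {"customer_id": 102, "clicks": 7},
    {"customer_id": 101, "clicks": 13},
]

def hybrid_report():
    """Virtually combine both stores at query time, materializing no copy."""
    # Aggregate the external events per customer...
    totals = {}
    for event in external_events:
        cid = event["customer_id"]
        totals[cid] = totals.get(cid, 0) + event["clicks"]
    # ...then enrich the totals with warehouse attributes.
    return [
        {"name": warehouse[cid]["name"], "clicks": n}
        for cid, n in sorted(totals.items())
    ]
```

A data virtualization or federation layer performs this same join dynamically across the live systems, which is what lets existing reports keep working while the storage underneath changes.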
Businesses should make sure their adaptive model includes metadata partitioning infrastructure, which allows data to be collected, sorted, and stored, tying together multiple sources to deliver a trustworthy system-of-record repository.
Interested in learning more about moving data from traditional to modern systems? Sign up to attend my upcoming talk at the Washington Big Data Conference.