Setting up a Reliable and Powerful Platform
Data spread across multiple systems.
Centralized data storage in a data lake, using CDH tools.
One of our customers had numerous problems with data management.
Data was spread across multiple systems, and it was very difficult to work out where any given data set was actually stored.
This made effective business intelligence virtually impossible.
We created a data lake where data could be consolidated and quickly accessed for business intelligence.
CDH tools were implemented to manage the data.
Health check systems, cluster maintenance logs, workshop checks, and housekeeping systems were added to keep everything running smoothly.
An example of how this works in practice is trading energy as a commodity on the market.
If you know in advance how much energy to expect, you can make better offers and better deals.
With a proper data lake and the necessary tools for rapid forecasting, weather data could be quickly analyzed to predict energy production.
Energy trading could then be conducted much more efficiently.
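To make the forecasting idea concrete, here is a minimal sketch in plain Python. It is not the customer's actual model: it assumes, purely for illustration, that energy output depends linearly on a single weather feature (wind speed), and the numbers are synthetic placeholders rather than real measurements. The names fit_line and forecast are hypothetical helpers. In a setup like the one described, the same idea would more likely run on Spark against data in the lake.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(slope, intercept, x):
    """Predict energy output for a forecast weather value x."""
    return slope * x + intercept


# Synthetic, illustrative history (assumption, not real data):
# wind speed in m/s and the energy produced in MWh in the same period.
wind_speed_ms = [3.0, 5.0, 7.0, 9.0]
energy_mwh = [7.0, 11.0, 15.0, 19.0]

slope, intercept = fit_line(wind_speed_ms, energy_mwh)

# Expected production for tomorrow's forecast wind speed of 6 m/s.
expected_mwh = forecast(slope, intercept, 6.0)
print(f"Expected production: {expected_mwh:.1f} MWh")
```

A trader could use an estimate like expected_mwh to size an offer before the market closes. A real model would use more weather features and be validated against historical production.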
FRAMEWORK & TOOLS
Cloudera (including full Cloudera ecosystem), Jupyter Hub, Spark, R, Oozie, Hive