Microsoft has developed a big data framework called REEF (Retainable Evaluator Execution Framework) and plans to open source it next month. The framework sits on top of Hadoop’s new YARN resource manager and is designed to let users build jobs that can maintain state even after they’re done, and that can grab data from wherever they need it.
YARN, developed as a resource manager within the Apache Hadoop project, lets users run and manage multiple types of jobs atop the same cluster of physical machines. This allows companies to consolidate the systems they have to manage, and to run different types of analysis on the same data in the same location.