Apache OODT evolved from a principled software architecture for designing NASA data systems into an open source framework for big data management and processing systems. This talk will discuss the history of OODT, its core architectural principles, why it was developed, and how it fits into the big data suite of open source capabilities for managing and analyzing massive data sets.
Apache OODT was conceived in 1998 to address challenges in sharing scientific data from NASA missions. From its inception, OODT has been considered first a reference architecture for distributed data management, and second an implementation of that reference architecture. The early designers of OODT took great care to ensure that the implementation adhered to a key set of architectural principles, which have guided the implementation of Apache OODT to this day. This disciplined approach has been critical to developing a reusable framework and to identifying how such a framework could be applied to different data management and processing systems. The early implementations of OODT validated this approach and showed where key decisions had to be made so that the software framework did not over-specify an implementation that could not be reused. These implementations, which came from vastly different scientific disciplines, used OODT to support research and analysis of scientific data returned from experiments in both space research and biology. The development of the architecture and the framework, and their application to very different disciplines, has been key to developing a capability that can scale to meet the big data demands that are upon us today.
This talk, by Dan Crichton, the original architect of OODT, will discuss the core architectural principles behind the software, why it was developed, and how it fits into the big data suite of open source capabilities for managing and analyzing massive data sets. Several lessons learned will be presented, along with a discussion of how OODT can continue to evolve to support the needs of multiple disciplines in big data management.