From the invention of the abacus onwards, there has always been a tension between those who use computing and those who manage it. This tension has been a vital driving force in the innovation cycle of data and analytics deployments.
Applications on the mainframe enabled a leap in productivity, but the equipment was expensive and skills were in short supply, so a culture of operations (today called dev-ops) emerged to provide predictable and efficient use of the valuable compute resources.
Unfortunately, this also shifted these new, automated business processes out of the direct visibility and control of the business users. The operations team became disconnected from the changing needs of the business – which led to the demand for business reporting.
In the early days, creating a new business report was an act of programming. This overloaded development teams and delayed the reports. Out of the resulting frustration, relational databases were born – providing business users with self-service queries and reporting. Relational databases, of course, consumed valuable compute resources, escalating tensions between the IT teams operating the technology and the business users trying to manage and improve the business.
Eventually, these mainframe-based relational databases transitioned into a transaction workhorse supporting operations under the management of IT. New relational databases were then established in minicomputer-based departmental systems, becoming the new-fangled data warehouse where business users could do ad hoc analyses and reporting without disturbing the productivity of the transactional systems.
As the data warehouses gained in size, sophistication, and importance to business operations, they moved under the purview of IT so that they could be properly controlled and managed. The projects became more ambitious and often less nimble. Once again, frustration led to innovation. Our intrepid and entrepreneurial business analysts needed a new sandbox to support local analytic needs, and the data mart was born…
And the cycle repeated … leading to the development of the data lake, then the lake-house, and now the data fabric.
Over time, each of these technologies finds its place in the pantheon of data architectures. But the one thing of which we can be assured is that the cycle will repeat. The pull between control and flexibility, between central and local management, between production and experimentation, will continue to be a source of tension and innovation.
As our data environments evolve, it is important that we enable the right degrees of management and control, of visibility and sharing. Locking down data to protect and preserve it for the few privileged teams that know of its existence often results in the business establishing a new “copy” of the data (sometimes called shadow IT). This “copy” needs protecting too.
We need a more flexible, distributed, and federated approach to data governance that enables the right controls on the right data, for the right purpose – and at the same time exposes and promotes data sets (products) for sharing and reuse. This is accomplished through a blending of local autonomy and governance that is sensitive to the needs and opportunities of the business, along with thoughtful organization-wide policies to ensure legal and ethical compliance.
The Egeria open-source project has been designed to enable these blended environments. With a flexible and dynamic deployment model, Egeria can be easily configured and reconfigured in managed ways by trusted teams. Some organizations may choose to employ static configurations laid out by dev-ops teams – but others will choose more flexible approaches that allow governance teams and data owners to enable management and governance of the data they own and use. And these are not either/or choices. They work together and can give an organization the freedom to innovate and consolidate as the need arises.
Egeria (https://egeria-project.org/) provides an open framework with auditing, security, integration, and management of metadata and governance that meets the needs of both IT control and business agility. It does not require that you replace your current investments. Instead, it integrates what you have and fills the gaps where needed. If you believe that innovation, resilience, cost-effectiveness, and safety are not mutually exclusive, visit https://pdr-associates.com/ to find out more.