An analytic warehouse is designed for the explicit purpose of producing and delivering analytic results, unlike most traditional data warehouses, which ended up mostly just storing data. My brother Dan and I are teaching a course for O’Reilly Online Learning titled “Enterprise Analytic Warehousing: Data Warehouse Architecture and Strategy Designed for Analytics” this Thursday, October 29 (Register here). I just spent the weekend reviewing our presentation materials, polishing a few demos, and debating several key points with Dan during a rehearsal (hey, he’s my brother, so it’s only natural that we take opposing sides). It’s going to be fun!
We see a lot of large enterprise data warehouse systems in our consulting practice and have lived through many of the “hot trends” in technology that were supposed to make everything smarter, faster, and more productive. Are there any “magic bullets” out there? Not really, and especially not for organizations that have existed for many years and have allowed some of their systems to grow unchecked while others stagnated. We see a lot of data lakes that are more like data swamps and data warehouses that are more like data dumps (imagine a “hoarder house” that is tens of thousands of square feet/square meters in size). The huge surge in generated data is exceeded only by the need to use that data intelligently: to improve organizational performance and position through better decision making, and to achieve organizational coherence.
Once upon a time there was a vision for a “single source of truth” that included ALL of an organization’s data. Most organizations never got beyond the objective of gathering all their data through ETL processes. Indeed, most data warehouses were conceived and designed for efficiency in managing inbound data flows. It’s a bit like organizing a distribution warehouse to receive goods but forgetting that the real business returns lie in picking, packing, and delivering orders in an efficient, safe, and timely manner. That’s where we focus most of our presentation: on the processes that organize and aggregate data flows, generate new information and insights from raw data, and deliver that information quickly to visualization and analytic interfaces.
We cover things like machine learning and predictive modeling, geospatial analytics and map generation, multi-dimensional data structures and hierarchies, and network theory and graph analytics. We also cover the foundational practices that must be in place, along with frameworks for valuing data and determining how much to invest in each data source (not all data is created equal!).
Honestly, the thing I enjoy the most is working through really challenging situations and sharing the stories and experiences with my brother. He’s the guy with the Ivy League Computer Science degree who has led a software development team. I’m the guy with the MBA and the business frameworks. We have a great team in our consulting practice who are smarter and more experienced than both of us, but it doesn’t get more fun than digging into something that every organization needs help with.
To learn more or register for the event, click here.