DATABASE SYSTEMS, PART 1: SPOILT FOR CHOICE

- Falk Borgmann


Some of us probably hanker for the good old days when it was mainframes or nothing. Things have gotten harder for today’s IT departments, since the world has grown a lot more complicated. Even the simple fact of being able to choose between different technologies and models underlines the challenges that we now face. One example where this becomes very clear indeed is that of databases. Until about 15 years ago, it would have been virtually unthinkable to use anything other than licensed relational databases in German companies. And the number of providers to choose from could have been counted on the fingers of one hand. The license was purchased, the software installed, the training course completed—and you were good to go.

It is in this same area of databases that the many approaches and models available today have not only created greater choice but have also led to a situation where there is no standard technical answer to the various business questions. Unless, of course, you are satisfied with average or suboptimal solutions. During the 1990s, an IT strategy often considered state-of-the-art was to address every challenge with a standard technology, ideally one from just a single manufacturer. In reality, this always meant using a relational database from Oracle, Microsoft, or IBM, or occasionally MS Access for internal use.

Today, this kind of standardization would seem as strange as thinking that driving is the solution to every transport problem, regardless of the distance to the destination, what we need to transport, or whether we have to cross an ocean to get there. A strange comparison? After all, it’s obviously easier to walk to the mailbox round the corner, automobiles obviously don’t float, and you can’t fit a closet into a subcompact. OK, that’s right, and the comparison may be somewhat overstated. But how many projects do you know of that were too expensive, too complicated, or too time-consuming, and finished up with results that didn’t meet expectations? And how many of these projects had to work with IT in a kind of architectural straitjacket? It’s not that I’d say a standard is generally a bad thing. Standardization in IT is only bad if it means that you have to leave technical potential by the wayside, or are forced to take a square piece of technology and hammer it into a round hole so hard that it just about fits but can never be pulled back out. Alongside other missteps, this is certainly one of the major problems to be found in IT projects today.

Reinventing IT Management, Rethinking IT Personnel

For a number of years now, US technology companies like Google or Amazon have shown us just what a powerful and flexible IT organization is capable of. Compared to the US market, we must unfortunately acknowledge that Europe probably has about five to eight years of catching up to do in terms of the technological development of IT solutions. Some progress has been made in recent years to close this gap, however. One good recent example is the GAIA-X project, which should be warmly welcomed. However, it would have been even better if this project had been greenlighted five, or even eight, years ago. Today, every large company has set up its own think tanks, innovation labs, and a bunch of PoC projects. All too often, the pendulum then swings in the other direction, resulting in projects where only the latest, bleeding-edge technologies are allowed to be used. But these kinds of technology playgrounds rarely lead to a robust, long-term technical solution. Typically, the lessons learned range from “OK, so these new technologies don’t actually work as well as everyone says they do” to “we like what you’ve done, so let’s make the topic part of our strategic roadmap somewhere”. Less risk aversion combined with an organizational shake-up would be a good move for many large companies. To avoid results that bring no real benefits and promote innovation only in the rarest of cases, while still tying up plenty of investment capital, a modern IT team should take the following approach:

  1. Scout out technological developments and models (including open source)
  2. Trial technologies and models
  3. Design a solution/operating/maintenance plan
  4. Decide which technology is suited to a specific requirement

To put these four interrelated steps into practice successfully, a specific kind of supportive framework is necessary. Namely: a new kind of IT staffer (someone who is permanently engaged with new technologies and capable of internalizing ideas, comparing them, and testing them against one another) and a model for IT organization that also lets these employees flourish. The traditional domain specialist of the 90s or 00s is not so much in demand here. This is where many companies will find their potential. Not least because agile working also means analyzing and trying out technologies on a continuous, proactive basis, while also taking direct responsibility for the team’s decisions and results (‘you build it, you run it’). A careful separation of strategy and implementation, as would be advised by conventional 1960s management theory, no longer works in today’s IT. All the same, far too many organizations still follow the old patterns, and top management is surprised by the lack of progress. This doesn’t have to mean shutting the door on the future, however, since the first two steps of this approach can be bought in with technology consulting services.

The World of Database Technology

So, to gain an initial overview of the world of database technologies, we should first make sure that we are clear about the basic concepts. At the top level of our overview, we could set up a range of categories, such as open source versus commercial, for example, or cloud versus on-premise. If we recall the question of scaling associated with the Big Data movement, however, we can use a key decision criterion to drill down into our topic, namely:

  a) Centralized database systems
  b) Distributed database systems

It should also be remembered that a database system is always made up of two parts: the data that will actually be managed (database) and the database management system (DBMS) that lies on top of this. The question of centralization or distribution can also be answered independently for each part of the database system. A system is only truly distributed if both components work according to a distributed model while still being addressable as a single system from the outside.
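The rule in the preceding paragraph can be sketched as a small decision function. This is only an illustration of the reasoning, not a formal taxonomy: the class `DatabaseSystem`, its three flags, the function `classify`, and the label "partially distributed" are all hypothetical names introduced here, and the two example systems are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseSystem:
    """Hypothetical description of a database system (names are assumptions)."""
    name: str
    data_distributed: bool   # is the managed data spread across several nodes?
    dbms_distributed: bool   # does the management layer itself run on several nodes?
    single_endpoint: bool    # does the system appear as one system to its clients?


def classify(system: DatabaseSystem) -> str:
    """Apply the rule from the text: a system is only truly distributed if
    both parts are distributed and it is still addressable as one system."""
    if system.data_distributed and system.dbms_distributed and system.single_endpoint:
        return "distributed"
    if not system.data_distributed and not system.dbms_distributed:
        return "centralized"
    # Only one part is distributed, or the parts are not addressable as one system.
    return "partially distributed"


# Made-up examples, used purely to exercise the rule.
single_server = DatabaseSystem("single-server RDBMS", False, False, True)
shared_storage = DatabaseSystem("one DBMS on networked storage", True, False, True)
cluster = DatabaseSystem("clustered store", True, True, True)

for s in (single_server, shared_storage, cluster):
    print(f"{s.name}: {classify(s)}")
```

The middle case is the one that marketing material tends to blur: distributing only the storage, or only the management layer, does not on its own make the whole system distributed in the sense used here.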

Centralized Versus Distributed

Which of the two models offers advantages or disadvantages in terms of concrete technical use cases? To answer this question, we’ll be taking a closer look at the general properties of both models in the next article in the series. Specifically, we’ll make sure we understand the basic concepts behind each database solution (Part 2 – Fundamentals). We’ll then take a look at the differences between these database models and use a simple dataset to compare one with the other: relational databases versus NoSQL versus NewSQL, and open source versus subscription versus licenses. While many database marketing experts certainly know how to pitch their product, the devil is always in the details. At the end of this series, I hope to have given you at least some food for thought, or even a solid basis for making better decisions in your company or organization.