3.3. Data Oriented Architecture

Data is more important than code.

It was, is, and always will be.

Many paradigms (OOP, DDD, Clean Architecture, and similar ones) forget about this, and sooner or later programmers who use them pay the price with months of refactoring.

But as soon as you accept this axiom and start writing code in this manner, your programs will become significantly more understandable, faster, and more flexible.

At its core, almost any program does the same thing: it retrieves data from some data source (application memory, API, file system, message queue, etc.), modifies it, and saves it to the same or a different data source.

Data Oriented Architecture (DOA) – an approach that puts working with data first.

If we formulate this as a set of coding rules, it would be as follows:

Don't create abstractions on top of storage data structures

Operations on data sources take priority over business logic

i. Don't create abstractions

As soon as a programmer starts writing a program, their first desire is to write some Domain Models, then write business logic for them, and only then design a storage that will correspond to these Models.


class UserModel {
	id: number // This field is in the `user` table of the main database
	profile: ProfileModel // In the DB, Profile is not part of User, but just a link through a foreign key to Profile
	email: string // This field is stored in a separate system
}

class ProfileModel {
	id: number // This field is in the `profile` table of the main database
	age: number // This field is in the `profile` table of the main database
	firstName: string // And this field is in the `profile` table of the legacy database
	user: UserModel // User is not part of Profile, but is linked to it by the `userId` field
}

When they retrieve or send requests, they will also use Models as parameters for these requests, because for them, the Model is the center of the universe of their program.


type ChangeUserProfileRequest = {
	data: ProfileModel // we won't describe the necessary fields, but simply reuse the already described ProfileModel
}

In most cases, though, this creates more problems than it solves.

The idea of DOA is to work with our data as it appears in the original source, without creating unnecessary abstractions on top:


// # Tables from the main DB
type UserTable = {
	id: number // In the DB this will be Serial
}

type ProfileTable = {
	id: number
	legacyProfileId: string
	userId: number
	age: number
}
 
// # Tables from the legacy DB
type OldProfileTable = {
	id: string
	firstName: string
}

// # Data from a third-party authentication system (Auth0)
type GetUserDataRequest = {
	id: string
	userId: number
	email: string
}

And likewise, when we describe requests and responses, we should describe them exactly as they appear:


type ChangeUserProfileRequest = {
	data: {
		firstName: string
		age: number
		userId: number
	}
}

The entire application should then work with these structures directly, wherever it needs the data.
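As a minimal sketch of what working with these structures directly can look like, the handler below applies a ChangeUserProfileRequest to the rows it touches. The `Sources` type, the in-memory arrays standing in for the two databases, and the `changeUserProfile` name are assumptions made for illustration; a real application would issue queries against the main and legacy databases instead.

```typescript
// Row shapes copied from the structures described above.
type ProfileTable = { id: number; legacyProfileId: string; userId: number; age: number }
type OldProfileTable = { id: string; firstName: string }

type ChangeUserProfileRequest = {
	data: { firstName: string; age: number; userId: number }
}

// Hypothetical in-memory stand-ins for the main and the legacy database.
type Sources = { profile: ProfileTable[]; oldProfile: OldProfileTable[] }

// Applies the request directly to the rows it touches; no domain model in between.
function changeUserProfile(sources: Sources, req: ChangeUserProfileRequest): void {
	const { userId, age, firstName } = req.data

	const profile = sources.profile.find((row) => row.userId === userId)
	if (!profile) {
		throw new Error(`No profile row for userId ${userId}`)
	}
	profile.age = age // `age` lives in the main DB

	const legacy = sources.oldProfile.find((row) => row.id === profile.legacyProfileId)
	if (!legacy) {
		throw new Error(`No legacy profile row with id ${profile.legacyProfileId}`)
	}
	legacy.firstName = firstName // `firstName` lives in the legacy DB
}

const sources: Sources = {
	profile: [{ id: 1, legacyProfileId: "p-1", userId: 10, age: 30 }],
	oldProfile: [{ id: "p-1", firstName: "Ann" }],
}
changeUserProfile(sources, { data: { firstName: "Anna", age: 31, userId: 10 } })
```

Because the handler names the exact tables it writes to, it is visible at a glance that one request touches two different data sources.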

What if it's inconvenient to work with the original structure?

There are situations where we frequently need to use a combination of several data points, and that's when Projection comes to the rescue.

Projection – a pattern that allows extending data structures without disrupting their original structure.

The rules for building a Projection are as follows:

You can extend types as long as the extension is backward compatible.

You can add computed fields

If you use another entity, include it in full rather than picking individual fields from it.

Example:


type PositiveNumber = BrandedType<number, "PositiveNumber"> // We'll talk about branded types later

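const PositiveNumber = {
  // Validates a plain number and brands it as PositiveNumber
  ofNumber: (val: number): PositiveNumber => {
    if (val < 0) {
      throw new Error(`Expected a non-negative number, got ${val}`)
    }
    return val as PositiveNumber
  }
}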

type Profile = ProfileTable & {
	age: PositiveNumber // Example of extending an existing type
}

type UserWithProfile = {
	user: UserTable // Example of fully using an existing DB structure
	profiles: ProfileTable[] // Example of fully using an existing DB structure
	profileCount: number // Example of a computed field
}

In other words, all operations that were available when working with, for example, UserTable will remain available for this Projection as well, because it is part of the Projection.
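A minimal sketch of how such a Projection might be assembled and used. The helper names `toUserWithProfile` and `describeUser` and the sample rows are hypothetical and only illustrate the rules above.

```typescript
// Row shapes and the projection, as described above.
type UserTable = { id: number }
type ProfileTable = { id: number; legacyProfileId: string; userId: number; age: number }

type UserWithProfile = {
	user: UserTable
	profiles: ProfileTable[]
	profileCount: number
}

// Builds the projection from untouched rows and adds one computed field.
function toUserWithProfile(user: UserTable, allProfiles: ProfileTable[]): UserWithProfile {
	const profiles = allProfiles.filter((p) => p.userId === user.id)
	return { user, profiles, profileCount: profiles.length }
}

// A function written against the plain table row...
const describeUser = (user: UserTable): string => `user #${user.id}`

const projection = toUserWithProfile({ id: 10 }, [
	{ id: 1, legacyProfileId: "p-1", userId: 10, age: 30 },
	{ id: 2, legacyProfileId: "p-2", userId: 11, age: 41 },
])

// ...still works on the projection, because the row is embedded unchanged.
const label = describeUser(projection.user)
```

Nothing written for `UserTable` or `ProfileTable` has to change when the Projection is introduced, which is the point of keeping the original rows intact inside it.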

ii. Operations on data sources take priority over business logic

To simplify, this means that you should be able to use the functionality of your data source anywhere in your program.

If you've ever heard of Clean / Onion / Hexagonal Architecture or DDD, then you know about the concept of "layers," where, for example, you're only allowed to work with the database in a separate layer, and in business logic, database work should be abstracted by interfaces.

Some of the main disadvantages of such an approach are:

Inability to use specific functions of your data source (once we've built an abstraction, using some special capability of PostgreSQL means tying the code to PostgreSQL anyway, which defeats the purpose of the abstraction)

Inability to perform low-level optimizations when working with a data source. Any optimization will require using more specific operations on the data source, but we're "not allowed" to pass it where needed.

Spending a lot of time creating interfaces and abstractions that often won't abstract the data source well enough (for example, to actually replace it), will still cause the problems above, and will mostly just forward calls from one method to the next (how often have I seen a Controller call an Adapter method, which in turn simply calls a UseCase, which does nothing but call a Repository... and that's 90% of the codebase...)

So, the idea of DOA is exactly the opposite: don't create additional layers, use your data sources wherever and however you want.

By doing so, you'll not only avoid the problems described above, but also be able to make low-level optimizations, use ALL the capabilities of your data source, and stop wasting time inventing unnecessary abstractions.
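To make this concrete, here is a hedged sketch of a business operation that talks to the database directly. It assumes a SQL client whose `query(text, values)` method resolves to an object with a `rows` array (the shape used by clients such as node-postgres); the `SqlClient` type, the `incrementAge` function, the column quoting, and the fake client are all assumptions made for illustration.

```typescript
// Assumed minimal client shape: query(text, values) resolves to { rows }.
type SqlClient = {
	query: (text: string, values: unknown[]) => Promise<{ rows: unknown[] }>
}

type ProfileTable = { id: number; legacyProfileId: string; userId: number; age: number }
type AgeRow = Pick<ProfileTable, "id" | "age">

// Business operation written straight against the data source.
// The increment happens atomically inside the database, and PostgreSQL's
// RETURNING clause hands back the updated rows in the same round trip,
// instead of a load / modify / save sequence through a repository.
async function incrementAge(db: SqlClient, userId: number): Promise<AgeRow[]> {
	const result = await db.query(
		`UPDATE profile SET age = age + 1 WHERE "userId" = $1 RETURNING id, age`,
		[userId],
	)
	// The client returns untyped rows; the cast mirrors the RETURNING list.
	return result.rows as AgeRow[]
}

// Hypothetical fake client that records calls, so the sketch runs without a DB.
const calls: { text: string; values: unknown[] }[] = []
const fakeDb: SqlClient = {
	query: async (text, values) => {
		calls.push({ text, values })
		return { rows: [{ id: 1, age: 31 }] }
	},
}

const pending = incrementAge(fakeDb, 10)
```

The trade-off is explicit: this function is tied to PostgreSQL, and in exchange it gets a single atomic statement that a generic repository interface would have a hard time expressing.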

What's next?

Now we're going to touch on another super important topic that will provide an answer about when and how to reuse code:

👇

3.4. What you need, where you need it (WYNWYN): Context Dependence and Code Connectedness
