Schemas & Contracts

The Power of Explicit Over Implicit

If you've ever worked with both dynamically and statically typed programming languages, you understand the fundamental tradeoff: dynamic typing offers flexibility and rapid development but pushes error detection to runtime; static typing catches errors early but requires upfront structure.

Datoria firmly believes in the "static typing" approach for data: explicit, complete schemas that serve as contracts between data producers and consumers. While other tools might let you operate without defined schemas—relying on inference or runtime discovery—we've seen how this leads to brittleness, troubleshooting nightmares, and trust issues across teams.

Why Explicit Schemas Matter

  • Catch issues at development time, not runtime - No more pipeline failures because of unexpected nulls or type mismatches
  • Enable confident refactoring - Understand impact before making changes
  • Create clear team boundaries - Explicit contracts between data producers and consumers
  • Power automated validation - Tooling can verify schema compatibility across your entire platform
  • Improve documentation - Schemas serve as living documentation of your data structures

Just as static typing transforms software development—enabling better tooling, clearer interfaces, and fewer runtime surprises—explicit schemas transform data engineering from a reactive firefighting exercise into a proactive, confident discipline.

Schema Definition in Datoria

In Datoria, schemas are defined as arrays of field definitions, each with a name, type, mode (required/nullable/repeated), and optional properties:

import { BQField, BQFieldType } from "@datoria/sdk";

// Define a simple schema
const userSchema = [
  BQField("id", BQFieldType.String, "REQUIRED"),
  BQField("name", BQFieldType.String, "REQUIRED"),
  BQField("email", BQFieldType.String, "NULLABLE"),
  BQField("created_at", BQFieldType.Timestamp, "REQUIRED"),
];

This explicit approach forces important questions to be answered early: What fields exist? Which are required? What types should they have? The result is a data platform built on clarity rather than assumptions.
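
To make the required/nullable distinction concrete, here is a minimal sketch of checking a row against a schema. It does not use the Datoria SDK: `FieldDef`, `Mode`, `FieldType`, and `validateRow` are hypothetical stand-ins defined only for this illustration, and the type check is deliberately simplified.

```typescript
// Hypothetical stand-ins for illustration; these are NOT the Datoria SDK types.
type Mode = "REQUIRED" | "NULLABLE";
type FieldType = "STRING" | "TIMESTAMP";

interface FieldDef {
  name: string;
  type: FieldType;
  mode: Mode;
}

const userFields: FieldDef[] = [
  { name: "id", type: "STRING", mode: "REQUIRED" },
  { name: "name", type: "STRING", mode: "REQUIRED" },
  { name: "email", type: "STRING", mode: "NULLABLE" },
  { name: "created_at", type: "TIMESTAMP", mode: "REQUIRED" },
];

// Returns a list of violations; an empty list means the row conforms.
function validateRow(schema: FieldDef[], row: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const known = new Set(schema.map((f) => f.name));

  for (const field of schema) {
    const value = row[field.name];
    if (value === null || value === undefined) {
      // Only REQUIRED fields must be present and non-null.
      if (field.mode === "REQUIRED") {
        errors.push(`missing required field "${field.name}"`);
      }
      continue;
    }
    // Simplified type check: STRING expects a string, TIMESTAMP expects a Date.
    const ok = field.type === "STRING" ? typeof value === "string" : value instanceof Date;
    if (!ok) {
      errors.push(`field "${field.name}" is not a valid ${field.type}`);
    }
  }

  // Fields the schema does not declare are reported as well.
  for (const key of Object.keys(row)) {
    if (!known.has(key)) {
      errors.push(`unexpected field "${key}"`);
    }
  }
  return errors;
}

const goodRow = { id: "u1", name: "Ada", email: null, created_at: new Date(0) };
const badRow = { id: "u2", created_at: "yesterday", nickname: "x" };

console.log(validateRow(userFields, goodRow)); // no violations
console.log(validateRow(userFields, badRow)); // three violations
```

The second row fails on three counts: `name` is missing, `created_at` is not a timestamp, and `nickname` is not declared. With an explicit schema, all three can be reported before the row reaches a table.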

Rich Field Definitions

Fields can include detailed properties that directly mirror what's supported in the underlying database:

// A field with additional options
export const emailField = BQField(
  "email", // Field name
  BQFieldType.String, // Data type
  "NULLABLE", // Mode
  {
    description: "User's email address", // Documentation
    defaultValueExpression: "NULL", // Default value
    dataPolicies: ["PII"], // Custom metadata
  },
);

This comprehensive approach ensures both humans and systems understand the full context of each field—from technical constraints to business meaning and data governance requirements.
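
As one illustration of how tooling might consume this metadata, the sketch below lists fields tagged with a given policy and fields that lack a description. The `FieldDef` interface and both helper functions are hypothetical and defined only for this example; they are not the SDK's types or API.

```typescript
// Hypothetical field shape for illustration; NOT the Datoria SDK's type.
interface FieldDef {
  name: string;
  mode: "REQUIRED" | "NULLABLE" | "REPEATED";
  description?: string;
  dataPolicies?: string[];
}

const fields: FieldDef[] = [
  { name: "id", mode: "REQUIRED", description: "Stable user identifier" },
  {
    name: "email",
    mode: "NULLABLE",
    description: "User's email address",
    dataPolicies: ["PII"],
  },
  { name: "created_at", mode: "REQUIRED" },
];

// Names of all fields tagged with the given policy.
function fieldsWithPolicy(schema: FieldDef[], policy: string): string[] {
  return schema
    .filter((f) => (f.dataPolicies ?? []).some((p) => p === policy))
    .map((f) => f.name);
}

// Names of fields without a description, e.g. as input to a documentation lint.
function undocumentedFields(schema: FieldDef[]): string[] {
  return schema.filter((f) => !f.description).map((f) => f.name);
}

console.log(fieldsWithPolicy(fields, "PII")); // → ["email"]
console.log(undocumentedFields(fields)); // → ["created_at"]
```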

Structured and Nested Data

Modern data rarely fits into flat structures. Datoria fully supports nested and repeated fields:

import { BQStructField } from "@datoria/sdk";

// Define a struct field
export const addressField = BQField.struct("address", "NULLABLE")(
  BQStructField("street", BQFieldType.String, "REQUIRED"),
  BQStructField("city", BQFieldType.String, "REQUIRED"),
  BQStructField("state", BQFieldType.String, "REQUIRED"),
  BQStructField("zip", BQFieldType.String, "REQUIRED"),
);
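
Nested schemas form a tree, so code that inspects them is usually recursive. The sketch below walks a struct and returns the dotted path of every leaf field. `FieldNode` and `leafPaths` are hypothetical names used only for this illustration; the SDK's own field objects may be shaped differently.

```typescript
// Hypothetical recursive field shape for illustration; NOT the Datoria SDK's type.
interface FieldNode {
  name: string;
  type: string;
  mode: "REQUIRED" | "NULLABLE" | "REPEATED";
  fields?: FieldNode[]; // present only for struct fields
}

const address: FieldNode = {
  name: "address",
  type: "STRUCT",
  mode: "NULLABLE",
  fields: [
    { name: "street", type: "STRING", mode: "REQUIRED" },
    { name: "city", type: "STRING", mode: "REQUIRED" },
    { name: "state", type: "STRING", mode: "REQUIRED" },
    { name: "zip", type: "STRING", mode: "REQUIRED" },
  ],
};

// Depth-first walk that returns the dotted path of every leaf field.
function leafPaths(field: FieldNode, prefix = ""): string[] {
  const path = prefix ? `${prefix}.${field.name}` : field.name;
  if (!field.fields || field.fields.length === 0) {
    return [path];
  }
  const paths: string[] = [];
  for (const child of field.fields) {
    paths.push(...leafPaths(child, path));
  }
  return paths;
}

console.log(leafPaths(address));
// → ["address.street", "address.city", "address.state", "address.zip"]
```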

Schema Manipulation and Reuse

Schemas are composable and transformable using standard array operations:

// Combine schemas
export const combinedSchema = [...userSchema, addressField];

// Transform schemas
export const nullableSchema = userSchema.map((field) => ({
  ...field,
  mode: "NULLABLE" as const, // keep the literal type instead of widening to string
}));

// Filter schemas
export const userSchemaWithoutEmail = userSchema.filter(
  (field) => field.name !== "email",
);

This approach brings software engineering principles to schema management, enabling modularity and reuse without sacrificing clarity.
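
One thing plain array operations do not guard against is two fragments that define the same field name. A small helper can check for that when schemas are combined. The sketch below assumes only that each field has a `name` property; `NamedField`, `duplicateNames`, and `combine` are hypothetical helpers, not part of the SDK.

```typescript
// Hypothetical minimal field shape; NOT the Datoria SDK's type.
interface NamedField {
  name: string;
}

// Returns every field name that appears more than once.
function duplicateNames(schema: readonly NamedField[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const field of schema) {
    if (seen.has(field.name)) {
      dupes.add(field.name);
    }
    seen.add(field.name);
  }
  return Array.from(dupes);
}

// Concatenates schema fragments and throws if the result has name clashes.
function combine<T extends NamedField>(...parts: T[][]): T[] {
  const combined = ([] as T[]).concat(...parts);
  const dupes = duplicateNames(combined);
  if (dupes.length > 0) {
    throw new Error(`duplicate field names: ${dupes.join(", ")}`);
  }
  return combined;
}

const userPart = [{ name: "id" }, { name: "email" }];
const phonePart = [{ name: "phone" }];
const contactPart = [{ name: "phone" }, { name: "email" }];

console.log(combine(userPart, phonePart).map((f) => f.name)); // → ["id", "email", "phone"]

try {
  combine(userPart, contactPart);
} catch (e) {
  console.log((e as Error).message); // → "duplicate field names: email"
}
```

Because `combine` builds a new array, the input fragments are left unchanged and can be reused elsewhere.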

Contract-Based Evolution

As business needs evolve, schemas must change too. Explicit schema definitions are essential for performing safe migrations without breaking downstream dependencies. Learn more about how Datoria manages schema evolution in the Migrations documentation.
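
To illustrate why explicit definitions make evolution checkable, the sketch below compares two schema versions and lists changes that could break consumers. The rule set (removed field, changed type, NULLABLE tightened to REQUIRED, new REQUIRED field) is an assumption chosen for this example, and `FieldDef` and `breakingChanges` are hypothetical names. This is not a description of how Datoria's migrations work.

```typescript
// Hypothetical field shape and rules for illustration; NOT Datoria's migration logic.
type Mode = "REQUIRED" | "NULLABLE";

interface FieldDef {
  name: string;
  type: string;
  mode: Mode;
}

// Lists changes between two schema versions that this sketch treats as breaking.
function breakingChanges(oldSchema: FieldDef[], newSchema: FieldDef[]): string[] {
  const problems: string[] = [];
  const oldByName = new Map<string, FieldDef>();
  for (const f of oldSchema) oldByName.set(f.name, f);
  const newByName = new Map<string, FieldDef>();
  for (const f of newSchema) newByName.set(f.name, f);

  for (const oldField of oldSchema) {
    const next = newByName.get(oldField.name);
    if (!next) {
      problems.push(`removed field "${oldField.name}"`);
      continue;
    }
    if (next.type !== oldField.type) {
      problems.push(
        `changed type of "${oldField.name}" from ${oldField.type} to ${next.type}`,
      );
    }
    if (oldField.mode === "NULLABLE" && next.mode === "REQUIRED") {
      problems.push(`tightened "${oldField.name}" from NULLABLE to REQUIRED`);
    }
  }

  // A new REQUIRED field breaks producers that do not yet supply it.
  for (const newField of newSchema) {
    if (!oldByName.has(newField.name) && newField.mode === "REQUIRED") {
      problems.push(`added REQUIRED field "${newField.name}"`);
    }
  }
  return problems;
}

const v1: FieldDef[] = [
  { name: "id", type: "STRING", mode: "REQUIRED" },
  { name: "email", type: "STRING", mode: "NULLABLE" },
];
const v2: FieldDef[] = [
  { name: "id", type: "INTEGER", mode: "REQUIRED" },
  { name: "phone", type: "STRING", mode: "NULLABLE" },
  { name: "tenant", type: "STRING", mode: "REQUIRED" },
];

console.log(breakingChanges(v1, v2));
```

Here the check reports the type change on `id`, the removal of `email`, and the new required `tenant` field, while the new nullable `phone` field passes. Relaxing REQUIRED to NULLABLE is not flagged in this sketch, although consumers that assume non-null values may still need to be reviewed.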