Views
Views in Datoria are first-class concepts that serve as stable interfaces to your data. Unlike some platforms that treat views as simply materialized results of jobs, Datoria gives views their own dedicated definition structure.
Creating a View
To create a view in Datoria, you define it with a name, dataset, schema, and query:
import {
BQView,
sql,
BQField,
BQFieldType,
BQPartitioning,
BQTable,
} from "@datoria/sdk";
import { analyticsDataset } from "../../core-concepts/tables-and-partitioning.md";
// Example table
const personEventDailyTable = BQTable({
dataset: analyticsDataset,
name: "person_event_daily",
partitioning: BQPartitioning.Day("date"),
schema: [
BQField("person_id", BQFieldType.String, "REQUIRED"),
BQField("event", BQFieldType.String, "REQUIRED"),
BQField("date", BQFieldType.Date, "REQUIRED"),
],
});
// BigQuery has a limitation where view columns cannot be required,
// so we convert the schema fields to view types
const schema = personEventDailyTable.schema.map((x) => x.asViewType());
// Since a view is unpartitioned, we need to reference all partitions
// when depending on a partitioned table
const query = sql`
SELECT x.* FROM ${personEventDailyTable.allPartitions()} x
WHERE x.event in unnest(['login', 'signup'])
`;
export const login_signup_events_view = BQView({
dataset: analyticsDataset,
name: "login_signup_events",
query,
schema,
description: "A view that selects only login and signup events",
});
Views vs. Partitioned Tables
You can set up a dependency on a view and use it as input data. However, this doesn't work well with partitioning, since views fundamentally do not have partitions. For partition-aware transformations, jobs are the recommended approach.
Sharing Logic Between Jobs and Views
When you need similar logic for both views and jobs, we recommend using TypeScript functions rather than duplicating SQL:
import { Dependency, TransformationJob, BQPartition } from "@datoria/sdk";
export const loginAndSignupEventsQuery = (dep: Dependency<any>) => sql`
SELECT x.* FROM ${dep} x
WHERE x.event in unnest(['login', 'signup'])
`;
export const view = BQView({
dataset: analyticsDataset,
name: "login_signup_events_view",
query: loginAndSignupEventsQuery(personEventDailyTable.allPartitions()),
schema,
});
export const job = TransformationJob({
name: "login_signup_events_job",
destination: {
...personEventDailyTable,
name: "login_signup_events_table",
},
query: loginAndSignupEventsQuery(personEventDailyTable.atExecutionDate()),
startPartition: BQPartition.Day(2025, 1, 1),
});
This approach allows you to reuse the logic while handling partitioning differently for each context.
Working with Table Aliases
You can also create reusable predicates that work with any table alias:
import { StringIdent } from "@datoria/sdk";
export const predicate = (name: StringIdent) => sql`
${name}.event in unnest(['login', 'signup'])
`;
export const view2 = BQView({
dataset: analyticsDataset,
name: "login_signup_events_view2",
query: sql`
SELECT x.* FROM ${personEventDailyTable.allPartitions()} x
WHERE ${predicate(StringIdent("x"))}
`,
schema,
});
export const job2 = TransformationJob({
name: "login_signup_events_job2",
destination: {
...personEventDailyTable,
name: "login_signup_events_table2",
},
query: sql`
SELECT x.* FROM ${personEventDailyTable.atExecutionDate()} x
WHERE ${predicate(StringIdent("x"))}
`,
startPartition: BQPartition.Day(2025, 1, 1),
});
Common View Patterns
Data Transformation Views
Views are useful for transforming raw data into more usable formats:
// Define a raw events table
const rawEventsTable = BQTable({
dataset: analyticsDataset,
name: "raw_events",
partitioning: BQPartitioning.Day("timestamp"),
schema: [
BQField("user_id", BQFieldType.String, "REQUIRED"),
BQField("event_type", BQFieldType.String, "REQUIRED"),
BQField("timestamp", BQFieldType.Timestamp, "REQUIRED"),
],
});
// Define schema for cleaned events
const cleanedEventsSchema = [
BQField("user_id", BQFieldType.String, "NULLABLE"),
BQField("event_type", BQFieldType.String, "NULLABLE"),
BQField("event_time", BQFieldType.DateTime, "NULLABLE"),
BQField("event_date", BQFieldType.Date, "NULLABLE"),
];
export const cleanedEventsView = BQView({
dataset: analyticsDataset,
name: "cleaned_events",
query: sql`
SELECT
user_id,
LOWER(event_type) as event_type,
DATETIME(timestamp) as event_time,
DATE(timestamp) as event_date
FROM ${rawEventsTable.allPartitions()}
`,
schema: cleanedEventsSchema,
});
Entity Relationship Views
Views can join related tables to provide a unified view of an entity:
// Define users table
const usersTable = BQTable({
dataset: analyticsDataset,
name: "users-view-example",
partitioning: BQPartitioning.Unpartitioned,
schema: [
BQField("user_id", BQFieldType.String, "REQUIRED"),
BQField("name", BQFieldType.String, "REQUIRED"),
BQField("email", BQFieldType.String, "REQUIRED"),
BQField("state", BQFieldType.String, "REQUIRED"),
BQField("city", BQFieldType.String, "REQUIRED"),
],
});
// Define subscriptions table
const subscriptionsTable = BQTable({
dataset: analyticsDataset,
name: "subscriptions",
partitioning: BQPartitioning.Unpartitioned,
schema: [
BQField("user_id", BQFieldType.String, "REQUIRED"),
BQField("plan_name", BQFieldType.String, "REQUIRED"),
BQField("subscription_status", BQFieldType.String, "REQUIRED"),
],
});
// Define schema for the user profile view
const userProfileSchema = [
BQField("user_id", BQFieldType.String, "NULLABLE"),
BQField("name", BQFieldType.String, "NULLABLE"),
BQField("email", BQFieldType.String, "NULLABLE"),
BQField("plan_name", BQFieldType.String, "NULLABLE"),
BQField("subscription_status", BQFieldType.String, "NULLABLE"),
];
export const userProfileView = BQView({
dataset: analyticsDataset,
name: "user_profile",
query: sql`
SELECT
u.user_id,
u.name,
u.email,
p.plan_name,
p.subscription_status
FROM ${usersTable.allPartitions()} u
LEFT JOIN ${subscriptionsTable.allPartitions()} p
ON u.user_id = p.user_id
`,
schema: userProfileSchema,
});
Security Views
Views can restrict access to sensitive data:
// Define schema for the limited user view
const limitedUserSchema = [
BQField("user_id", BQFieldType.String, "NULLABLE"),
BQField("name", BQFieldType.String, "NULLABLE"),
BQField("city", BQFieldType.String, "NULLABLE"),
BQField("state", BQFieldType.String, "NULLABLE"),
];
export const limitedUserView = BQView({
dataset: analyticsDataset,
name: "limited_user_view",
query: sql`
SELECT
user_id,
name,
city,
state
-- PII fields like email, phone, etc. are excluded
FROM ${usersTable.allPartitions()}
`,
schema: limitedUserSchema,
});
Testing Views
You can test views using Datoria's unit testing framework:
import { UnitTest, TestRelation } from "@datoria/sdk";
export const loginSignupViewTest = UnitTest({
testSubject: login_signup_events_view,
testRelations: [
TestRelation(
personEventDailyTable,
{ person_id: "user1", event: "login", date: "2025-01-01" },
{ person_id: "user2", event: "signup", date: "2025-01-01" },
{ person_id: "user3", event: "purchase", date: "2025-01-01" },
),
],
assertEquals: [
{ person_id: "user1", event: "login", date: "2025-01-01" },
{ person_id: "user2", event: "signup", date: "2025-01-01" },
],
});
Best Practices for Views
When to Use Views
- Interface Stability - Views can provide a stable interface while underlying tables change
- Data Access Control - Restrict access to sensitive columns or rows
- Simplifying Complex Joins - Abstract away complex join logic for common entity relationships
- Legacy Compatibility - Maintain backward compatibility with older interfaces
When to Use Jobs Instead
- When Partitioning is Important - For performance and cost efficiency with large datasets
- When Materialization is Beneficial - To improve query performance for frequently accessed data
- For Incremental Processing - When you need to process data incrementally by partition
Migrating Views
When you run datoria migrate
, Datoria will:
- Identify new or changed views
- Create or update them in your database
- Ensure the schema matches the view definition
No special handling is needed for view migrations—they're treated like any other component in your data platform.
Conclusion
Views in Datoria provide a powerful way to create stable interfaces to your data. By defining them explicitly with schemas and queries, you ensure consistency and maintainability across your data platform.