The Virtual Data Platform

The Virtual Data Platform (VDP) offers a compact, structured representation of your data platform. It encapsulates all key parts (tables, views, routines, transformations, dependencies, and rules) in a form that's fast to load, inspect, interpret, and move between systems.

Think of it as a comprehensive blueprint that allows Datoria to understand, validate, and optimize your entire data workflow before execution.

The full details are in the architecture reference.
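
To make the idea of a structured, portable representation concrete, here is a minimal sketch in Python. It is not Datoria's actual format: the `Table`, `Transformation`, and `PlatformModel` classes and their fields are hypothetical, chosen only to show how a plain-data model of a platform can be built in memory, serialized, and loaded again elsewhere.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Table:
    name: str                 # e.g. "analytics.orders"
    columns: dict[str, str]   # column name -> type name


@dataclass
class Transformation:
    name: str
    reads: list[str]          # names of the tables this step depends on
    writes: str               # name of the table it produces


@dataclass
class PlatformModel:
    """Hypothetical stand-in for a VDP-like model; not Datoria's real schema."""
    tables: dict[str, Table] = field(default_factory=dict)
    transformations: list[Transformation] = field(default_factory=list)

    def to_json(self) -> str:
        # Plain data serializes directly, which is what makes the model portable.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "PlatformModel":
        raw = json.loads(text)
        return PlatformModel(
            tables={k: Table(**v) for k, v in raw["tables"].items()},
            transformations=[Transformation(**t) for t in raw["transformations"]],
        )
```

Because the model is only data, a round trip through `to_json` and `from_json` yields an equal object, so the same description can be inspected locally or shipped to another system.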

Key Benefits

  • Millisecond-fast validation - Catch errors during development, not production
  • Complete dependency awareness - Know exactly what affects what
  • Schema enforcement - Ensure compatibility across your platform
  • End-to-end lineage - Track data flows with precision

Practical Applications

Real-Time Validation

The VDP enables comprehensive validation of your data platform:

  • Schema compatibility checking between sources and destinations
  • SQL syntax validation against the actual database engine
  • Detection of missing dependencies or circular references
  • Validation of partitioning and invalidation rules

This means issues are caught instantly during development, not discovered after costly production runs.
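
Two of the checks above, missing dependencies and circular references, can be sketched over a simple dependency map. This is an illustration under stated assumptions rather than Datoria's implementation: the graph shape (node name mapped to the set of names it reads) and the helper names `find_missing` and `find_cycle` are hypothetical. Cycle detection relies on `graphlib` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def find_missing(deps: dict[str, set[str]], sources: set[str]) -> dict[str, list[str]]:
    """Map each node to the references it reads that are defined nowhere.

    `deps` maps a node to the names it reads; `sources` are external tables
    that exist without being produced by any node (both shapes are assumed).
    """
    defined = set(deps) | set(sources)
    missing = {}
    for node, refs in deps.items():
        unknown = sorted(set(refs) - defined)
        if unknown:
            missing[node] = unknown
    return missing


def find_cycle(deps: dict[str, set[str]]):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # graphlib reports the cycle as the second element of the exception args;
        # the first and last node in that list are the same.
        return list(err.args[1])
    return None
```

Both checks run entirely on the in-memory graph, with no query sent to a database, which is why validation of this kind can finish quickly during development.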

IDE Integration

The VDP powers an exceptional development experience through the Language Server Protocol (LSP):

  • Autocompletion for table and column names
  • Inline error highlighting
  • Hover documentation showing schemas and dependencies
  • Jump-to-definition for navigating complex projects

These capabilities transform how engineers interact with data transformations, bringing the tooling quality of software engineering to data work.
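
As a rough illustration of why an in-memory model makes features like autocompletion cheap, the sketch below resolves a typed prefix against a catalog of tables and columns. It covers only the lookup step, not the Language Server Protocol itself, and the `catalog` shape and the `complete` function are hypothetical.

```python
def complete(catalog: dict[str, list[str]], prefix: str) -> list[str]:
    """Suggest table names, or column names once a full table name and a dot are typed.

    `catalog` maps a qualified table name to its column names (assumed shape).
    """
    table, dot, partial = prefix.rpartition(".")
    if dot and table in catalog:
        # Everything before the last dot is a known table: complete its columns.
        return sorted(c for c in catalog[table] if c.startswith(partial))
    # Otherwise treat the whole prefix as the start of a table name.
    return sorted(t for t in catalog if t.startswith(prefix))
```

With a catalog containing `analytics.orders` and `analytics.customers`, typing `analytics.o` narrows to the orders table, and typing `analytics.orders.a` switches to that table's columns.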

Migration Planning

When deploying changes, the VDP enables:

  • Comparison between local and production environments
  • Detection of breaking schema changes
  • Identification of affected components
  • Proper sequencing of operations

This ensures deployments are predictable and safe, eliminating the "deploy and pray" approach.
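
Detecting breaking schema changes amounts to diffing two descriptions of the same table. The sketch below is a simplified example, not Datoria's migration logic: the `Column` type, the `breaking_changes` function, and the rules for what counts as breaking (a removed column, a changed type, a newly added non-nullable column) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type: str
    nullable: bool = True


def breaking_changes(prod: list[Column], local: list[Column]) -> list[str]:
    """Describe differences between production and local that could break users."""
    local_by_name = {c.name: c for c in local}
    prod_names = {c.name for c in prod}
    problems = []
    for col in prod:
        new = local_by_name.get(col.name)
        if new is None:
            # Readers that select this column would fail.
            problems.append(f"column removed: {col.name}")
        elif new.type != col.type:
            problems.append(f"type changed: {col.name} {col.type} -> {new.type}")
    for col in local:
        if col.name not in prod_names and not col.nullable:
            # Existing writers do not supply a value for a new required column.
            problems.append(f"required column added: {col.name}")
    return problems
```

Adding a nullable column produces no findings under these rules, which matches the usual intuition that additive, optional changes are safe to deploy.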

Environment Management

The VDP's structure allows for sophisticated environment management:

  • Copying definitions from one environment to another
  • Rewriting dataset references for development or testing
  • Isolation of changes to prevent interference
  • Verification of consistency across environments

Engineers can work in isolation while ensuring changes integrate smoothly with the broader environment.
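
Rewriting dataset references is straightforward when references are structured values rather than strings embedded in SQL. The following sketch assumes hypothetical `TableRef` and `Job` types and a hypothetical `rewrite_datasets` helper; it shows one way a job definition could be pointed at a personal development dataset without altering the original.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TableRef:
    dataset: str
    table: str


@dataclass(frozen=True)
class Job:
    name: str
    reads: tuple[TableRef, ...]
    writes: TableRef


def rewrite_datasets(job: Job, mapping: dict[str, str]) -> Job:
    """Return a copy of `job` with datasets renamed according to `mapping`.

    Datasets not present in `mapping` are left untouched.
    """
    def move(ref: TableRef) -> TableRef:
        return replace(ref, dataset=mapping.get(ref.dataset, ref.dataset))

    return replace(job, reads=tuple(move(r) for r in job.reads), writes=move(job.writes))
```

Mapping only the output dataset, for example `{"analytics": "dev_alice_analytics"}`, lets a developer read from shared inputs while writing results somewhere that cannot interfere with others.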

Dependency Tracking

With the VDP, dependencies are explicit and precise:

  • Tracking which tables a job reads from
  • Identifying which partitions are needed for a specific execution
  • Monitoring complex dependency chains
  • Analyzing the impact of changes

This granular dependency information enables cost optimization by processing only the necessary data.
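
Two of these questions, which partitions an execution needs and what a change affects, can be sketched with small functions. The example assumes daily partitions, a fixed lookback window, and jobs named after the table they produce; `required_partitions` and `downstream_of` are hypothetical helpers, not part of Datoria.

```python
from datetime import date, timedelta


def required_partitions(target: date, lookback_days: int) -> list[date]:
    """Daily input partitions needed to build `target`, oldest first.

    Assumes the job reads the target day plus `lookback_days` earlier days.
    """
    return [target - timedelta(days=n) for n in range(lookback_days, -1, -1)]


def downstream_of(changed: str, reads: dict[str, set[str]]) -> set[str]:
    """Every job that directly or transitively reads from `changed`.

    `reads` maps a job (named after its output table) to the tables it reads.
    """
    impacted: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for job, inputs in reads.items():
            if current in inputs and job not in impacted:
                impacted.add(job)
                frontier.append(job)
    return impacted
```

Knowing the exact partition list means a run can scan only those days instead of the whole table, and the downstream set tells you which jobs need to be re-run after a change.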

Conclusion

The Virtual Data Platform is the foundation of Datoria's approach to data engineering. By modeling your entire data ecosystem in memory, it enables validation, tooling, and precision that wouldn't be possible with traditional approaches. This architectural shift allows teams to move quality checks earlier in the development process, catching issues before they become costly production problems.