The Virtual Data Platform
The Virtual Data Platform (VDP) offers a compact, structured representation of your data platform. It encapsulates all key parts (tables, views, routines, transformations, dependencies, and rules) in a form that's fast to load, inspect, interpret, and move between systems.
Think of it as a comprehensive blueprint that allows Datoria to understand, validate, and optimize your entire data workflow before execution.
The full details are in the architecture reference.
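To make the idea of a compact, portable representation concrete, here is a minimal sketch. It is not Datoria's actual format: the `Table` and `Platform` classes, their fields, and the JSON encoding are all assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Table:
    """Hypothetical table definition: a name plus column name -> type."""
    name: str
    columns: dict[str, str]


@dataclass
class Platform:
    """Hypothetical in-memory model of a small data platform."""
    tables: list[Table] = field(default_factory=list)
    # Output table -> tables it reads from.
    dependencies: dict[str, list[str]] = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() recursively converts nested dataclasses to plain dicts.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "Platform":
        raw = json.loads(text)
        return cls(
            tables=[Table(**t) for t in raw["tables"]],
            dependencies=raw["dependencies"],
        )
```

Because the whole model is plain data, it can be serialized, moved to another system, and reloaded without touching a database.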
Key Benefits
- Millisecond-fast validation: catch errors during development, not in production
- Complete dependency awareness: know exactly what affects what
- Schema enforcement: ensure compatibility across your platform
- End-to-end lineage: track data flows with precision
Practical Applications
Real-Time Validation
The VDP enables comprehensive validation of your data platform:
- Schema compatibility checking between sources and destinations
- SQL syntax validation against the actual database engine
- Detection of missing dependencies or circular references
- Validation of partitioning and invalidation rules
This means issues are caught instantly during development, not discovered after costly production runs.
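Two of the checks listed above, missing dependencies and circular references, can be sketched in a few lines once dependencies are available as data. The function name `check_dependencies` and the shape of its inputs are assumptions for illustration, not part of Datoria's API; cycle detection uses Python's standard `graphlib` module.

```python
from graphlib import CycleError, TopologicalSorter


def check_dependencies(reads: dict[str, set[str]], sources: set[str]) -> list[str]:
    """Return human-readable problems found in a dependency map.

    reads maps each job's output table to the tables it reads from.
    sources are tables that exist without being produced by a job.
    """
    problems = []
    known = set(reads) | sources
    for output, inputs in sorted(reads.items()):
        for table in sorted(inputs - known):
            problems.append(f"{output}: missing dependency {table}")
    try:
        # prepare() raises CycleError if the graph contains a cycle.
        TopologicalSorter(reads).prepare()
    except CycleError as err:
        # err.args[1] holds the nodes that form the cycle.
        problems.append("circular reference: " + " -> ".join(err.args[1]))
    return problems
```

An empty list means both checks passed. Since nothing here touches a database, a check like this can run on every keystroke.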
IDE Integration
The VDP powers an exceptional development experience through the Language Server Protocol (LSP):
- Autocompletion for table and column names
- Inline error highlighting
- Hover documentation showing schemas and dependencies
- Jump-to-definition for navigating complex projects
These capabilities transform how engineers interact with data transformations, bringing software-engineering-quality tooling to data work.
Migration Planning
When deploying changes, the VDP enables:
- Comparison between local and production environments
- Detection of breaking schema changes
- Identification of affected components
- Proper sequencing of operations
This ensures deployments are predictable and safe, eliminating the "deploy and pray" approach.
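As a rough illustration of detecting breaking schema changes, the sketch below compares a local table schema with its production counterpart. The function name, the column-to-type dictionaries, and the rule that only removed columns and changed types count as breaking are assumptions made for this example; the type names are placeholders.

```python
def breaking_changes(local: dict[str, str], production: dict[str, str]) -> list[str]:
    """Compare two column name -> type mappings for the same table.

    Assumption: added columns are treated as non-breaking, while
    removed columns and changed types are treated as breaking.
    """
    changes = []
    for column, prod_type in production.items():
        if column not in local:
            changes.append(f"column removed: {column}")
        elif local[column] != prod_type:
            changes.append(f"type changed: {column} {prod_type} -> {local[column]}")
    return changes
```

A deployment tool could refuse to proceed, or ask for confirmation, whenever the returned list is non-empty.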
Environment Management
The VDP's structure allows for sophisticated environment management:
- Copying definitions from one environment to another
- Rewriting dataset references for development or testing
- Isolation of changes to prevent interference
- Verification of consistency across environments
Engineers can work in isolation while ensuring changes integrate smoothly with the broader environment.
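Rewriting dataset references is straightforward when references are stored as structured data rather than embedded in SQL text. The sketch below assumes a hypothetical `TableRef` type and a `rewrite_datasets` helper; neither is taken from Datoria itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TableRef:
    """Hypothetical structured reference to a table in a dataset."""
    dataset: str
    table: str


def rewrite_datasets(refs: list[TableRef], mapping: dict[str, str]) -> list[TableRef]:
    """Return new references with datasets swapped according to mapping.

    Datasets not present in mapping are left unchanged, and the
    original references are never mutated.
    """
    return [replace(ref, dataset=mapping.get(ref.dataset, ref.dataset)) for ref in refs]
```

For example, mapping `analytics` to a per-developer dataset lets an engineer run the same definitions against an isolated copy while shared source datasets stay untouched.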
Dependency Tracking
With the VDP, dependencies are explicit and precise:
- Tracking which tables a job reads from
- Identifying which partitions are needed for a specific execution
- Monitoring complex dependency chains
- Analyzing the impact of changes
This granular dependency information enables cost optimization by processing only the necessary data.
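Impact analysis amounts to walking the dependency graph downstream from a changed table. The following is a minimal sketch under the assumption that dependencies are available as a map from each output table to the tables it reads; the `downstream` function is hypothetical.

```python
from collections import deque


def downstream(reads: dict[str, set[str]], changed: str) -> set[str]:
    """Return every table that directly or transitively reads from changed."""
    # Invert the map: table -> tables that consume it.
    consumers: dict[str, set[str]] = {}
    for output, inputs in reads.items():
        for table in inputs:
            consumers.setdefault(table, set()).add(output)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, ()):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)
    return affected
```

Only the tables in the returned set need reprocessing after a change, which is where the cost savings come from.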
Conclusion
The Virtual Data Platform is the foundation of Datoria's approach to data engineering. By modeling your entire data ecosystem in memory, it enables validation, tooling, and precision that wouldn't be possible with traditional approaches. This architectural shift lets teams move quality checks earlier in the development process, catching issues before they become costly production problems.