Ontology Model

ENVITED-X: A modular ontology ecosystem for simulation asset metadata.

What is an Ontology?

In this context, an ontology is a formal description of the types, properties, and relationships of simulation assets. It defines:

  • What types of assets exist (HD maps, scenarios, environment models, ...)
  • What properties they have (road types, lane counts, formats, countries, ...)
  • What values are allowed (road types must be one of: motorway, urban, rural, ...)
  • How they relate to each other (a scenario references an HD map)

The ENVITED-X ontologies use two W3C standards:

| Standard | Role | What it defines |
|---|---|---|
| OWL (Web Ontology Language) | Class and property definitions | "An HDMap has properties roadTypes, laneCount, formatType" |
| SHACL (Shapes Constraint Language) | Value constraints and validation | "roadTypes must be one of: motorway, urban, rural, interstate, highway" |

Domain Structure

Each simulation asset type has its own domain ontology, and all domain ontologies follow the same pattern. Because the pattern is uniform across all domains, the system discovers properties and values automatically from the SHACL shapes.

Vocabulary Extraction

The system does not use a manually maintained vocabulary. Instead, at startup it extracts the properties and their allowed values directly from the SHACL shapes in the ontology.

Example: How sh:in becomes vocabulary

The ontology defines allowed road types like this:

turtle
hdmap:RoadTypesPropertyShape a sh:PropertyShape ;
  sh:path hdmap:roadTypes ;
  sh:in ("motorway" "urban" "rural" "interstate" "highway"
         "country-road" "pedestrian" "bicycle" "parking" "ramp") .

The vocabulary extractor runs a SPARQL query against the schema graph:

sparql
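PREFIX sh:  <http://www.w3.org/ns/shacl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?property ?value ?domain WHERE {
  GRAPH <urn:graph:schema> {
    # Property shapes with an sh:in list; the path walks every list member
    ?shape sh:path ?property ;
           sh:in/rdf:rest*/rdf:first ?value .
    # The node shape that owns this property shape
    ?parentShape sh:property ?shape .
    # Assumption: ?domain is bound from the node shape's target class
    OPTIONAL { ?parentShape sh:targetClass ?domain . }
  }
}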
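The property path `sh:in/rdf:rest*/rdf:first` works because an `sh:in` value list is stored as an RDF linked list: each node holds one value in `rdf:first` and points to the next node through `rdf:rest`, ending at `rdf:nil`. The following is a minimal sketch of that traversal over an in-memory triple array. The `Triple` type, the helper names, and the blank-node labels are assumptions made for illustration and are not part of the ENVITED-X code base.

```typescript
// Hypothetical minimal triple representation; a real system would use an RDF library.
type Triple = { s: string; p: string; o: string };

const SH_IN = "sh:in";
const RDF_FIRST = "rdf:first";
const RDF_REST = "rdf:rest";
const RDF_NIL = "rdf:nil";

// Return the object of the first triple matching (subject, predicate), if any.
function objectOf(triples: Triple[], s: string, p: string): string | undefined {
  return triples.find((t) => t.s === s && t.p === p)?.o;
}

// Follow sh:in to the list head, then collect rdf:first at every rdf:rest hop.
function listValues(triples: Triple[], shape: string): string[] {
  const values: string[] = [];
  const seen = new Set<string>(); // guards against malformed, cyclic lists
  let node = objectOf(triples, shape, SH_IN);
  while (node !== undefined && node !== RDF_NIL && !seen.has(node)) {
    seen.add(node);
    const first = objectOf(triples, node, RDF_FIRST);
    if (first !== undefined) values.push(first);
    node = objectOf(triples, node, RDF_REST);
  }
  return values;
}

// A shortened version of the roadTypes list, expanded into list triples.
const exampleTriples: Triple[] = [
  { s: "hdmap:RoadTypesPropertyShape", p: SH_IN, o: "_:b0" },
  { s: "_:b0", p: RDF_FIRST, o: "motorway" },
  { s: "_:b0", p: RDF_REST, o: "_:b1" },
  { s: "_:b1", p: RDF_FIRST, o: "urban" },
  { s: "_:b1", p: RDF_REST, o: "_:b2" },
  { s: "_:b2", p: RDF_FIRST, o: "rural" },
  { s: "_:b2", p: RDF_REST, o: RDF_NIL },
];

const roadTypeValues = listValues(exampleTriples, "hdmap:RoadTypesPropertyShape");
console.log(roadTypeValues); // ["motorway", "urban", "rural"]
```

In the SPARQL query the triple store performs this walk itself; the sketch only makes explicit which triples the path matches.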

This produces a structured OntologyVocabulary that the prompt builder and slot validator consume. The extraction is fully automatic and requires no manual mapping.
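To illustrate how query results could become a vocabulary, here is a minimal sketch that groups result rows by domain and property and then checks a candidate value against the allowed list. The row shape mirrors the `SELECT` variables of the query; the nested-record layout of `OntologyVocabulary`, the function names, and the `"hdmap"` domain key are assumptions, since the source does not show the actual structure.

```typescript
// Hypothetical row shape, one entry per SPARQL result binding.
type BindingRow = { property: string; value: string; domain: string };

// Assumed layout: domain -> property -> allowed values.
type OntologyVocabulary = Record<string, Record<string, string[]>>;

// Group rows into the nested vocabulary, skipping duplicate values.
function buildVocabulary(rows: BindingRow[]): OntologyVocabulary {
  const vocab: OntologyVocabulary = {};
  for (const row of rows) {
    let props = vocab[row.domain];
    if (props === undefined) {
      props = {};
      vocab[row.domain] = props;
    }
    let values = props[row.property];
    if (values === undefined) {
      values = [];
      props[row.property] = values;
    }
    if (values.indexOf(row.value) === -1) values.push(row.value);
  }
  return vocab;
}

// A slot is valid only if its value appears in the extracted sh:in list.
function isAllowedValue(
  vocab: OntologyVocabulary,
  domain: string,
  property: string,
  value: string
): boolean {
  const allowed = vocab[domain]?.[property];
  return allowed !== undefined && allowed.indexOf(value) !== -1;
}

const sampleRows: BindingRow[] = [
  { domain: "hdmap", property: "roadTypes", value: "motorway" },
  { domain: "hdmap", property: "roadTypes", value: "urban" },
  { domain: "hdmap", property: "roadTypes", value: "motorway" }, // duplicate row
];

const sampleVocab = buildVocabulary(sampleRows);
console.log(sampleVocab.hdmap.roadTypes); // ["motorway", "urban"]
console.log(isAllowedValue(sampleVocab, "hdmap", "roadTypes", "motorway")); // true
console.log(isAllowedValue(sampleVocab, "hdmap", "roadTypes", "autobahn")); // false
```

A value such as "autobahn" fails this check by design; per the comparison with SKOS below, mapping synonyms onto allowed values is left to the LLM.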

Why not SKOS?

An earlier design used manually maintained SKOS vocabularies as an intermediate layer. This was replaced because:

| SKOS approach | Direct OWL+SHACL approach |
|---|---|
| Manual maintenance per ontology change | Automatic extraction at startup |
| Risk of vocabulary drift | Always in sync with ontology |
| Concept matcher with fuzzy string matching | LLM handles synonym resolution natively |
| Extra layer of indirection | Simpler, fewer moving parts |

Supported Domains

The system auto-discovers 22 ontology domains from the ontology source. Currently, 5 domains have sample instance data:

| Domain | Instance Assets | Key Properties |
|---|---|---|
| HD Map | 117 | roadTypes, laneCount, speedLimit, formatType, country, trafficDirection |
| Scenario | 50 | scenarioCategory, weather, timeOfDay, trafficDensity |
| OSI Trace | 50 | roadTypes, granularity, fileFormat, numberFrames |
| Environment Model | 30 | terrainType, vegetationType, weatherCondition |
| Surface Model | 20 | materialType, frictionCoefficient, textureFormat |

Additional ontology-only domains are still discovered at startup (for example automotive-simulator, simulation-model, openlabel, simulated-sensor, and vv-report).

Cross-Domain Relationships

Domains reference each other through SHACL property paths. The compiler discovers these cross-references at runtime, so no predicate name is hardcoded in production code.

At warmup, reference-index.ts runs a breadth-first search (BFS) from every typed asset instance and records each outgoing reference as a (sourceClass, predicatePath, targetClass) signature. The compiler uses these signatures to emit JOINs, and the per-row traceability breadcrumb under each result renders the actual predicate path that connected the two assets.
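The indexing step can be sketched as a breadth-first walk over an in-memory graph. This is not the code of reference-index.ts: the `GraphNode` type, the predicate names, the depth limit, and the choice to stop expanding once another typed asset is reached are all assumptions made for illustration.

```typescript
// Hypothetical in-memory graph node; rdfType is set only on typed instances.
type GraphNode = {
  id: string;
  rdfType?: string;
  edges: { predicate: string; target: string }[];
};

type RefSignature = { sourceClass: string; predicatePath: string; targetClass: string };

function indexReferences(
  nodes: GraphNode[],
  assetClasses: string[],
  maxDepth = 4 // assumed safety limit on path length
): RefSignature[] {
  const byId = new Map<string, GraphNode>();
  nodes.forEach((n) => byId.set(n.id, n));
  const signatures = new Map<string, RefSignature>(); // keyed to deduplicate

  for (const start of nodes) {
    const sourceClass = start.rdfType;
    if (sourceClass === undefined || assetClasses.indexOf(sourceClass) === -1) continue;

    // Each queue entry carries the predicate path walked so far.
    const queue: { id: string; path: string[] }[] = [{ id: start.id, path: [] }];
    const visited = new Set<string>([start.id]);

    while (queue.length > 0) {
      const { id, path } = queue.shift()!;
      const node = byId.get(id);
      if (node === undefined || path.length >= maxDepth) continue;

      for (const edge of node.edges) {
        if (visited.has(edge.target)) continue;
        visited.add(edge.target);
        const nextPath = [...path, edge.predicate];
        const target = byId.get(edge.target);
        const targetClass = target ? target.rdfType : undefined;

        if (targetClass !== undefined && assetClasses.indexOf(targetClass) !== -1) {
          // Reached another typed asset: record the signature, do not expand further.
          const predicatePath = nextPath.join("/");
          signatures.set(`${sourceClass}|${predicatePath}|${targetClass}`, {
            sourceClass,
            predicatePath,
            targetClass,
          });
        } else {
          queue.push({ id: edge.target, path: nextPath });
        }
      }
    }
  }
  return Array.from(signatures.values());
}

// Hypothetical data: a scenario that reaches an HD map through one intermediate node.
const sampleGraph: GraphNode[] = [
  { id: "ex:scenario1", rdfType: "scenario:Scenario", edges: [{ predicate: "ex:hasReferences", target: "_:refs" }] },
  { id: "_:refs", edges: [{ predicate: "ex:referencedMap", target: "ex:map1" }] },
  { id: "ex:map1", rdfType: "hdmap:HdMap", edges: [] },
];

const referenceSignatures = indexReferences(sampleGraph, ["scenario:Scenario", "hdmap:HdMap"]);
console.log(referenceSignatures);
```

For the sample graph this yields a single signature whose predicate path is `ex:hasReferences/ex:referencedMap`. A path of that form is what a compiler could turn into a JOIN and what a breadcrumb could display.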

When a user searches for "scenarios on German motorways", the compiler generates a SPARQL query that joins scenario assets with their referenced HD map's georeference shape group. Broader queries can stay multi-domain as well: because roadTypes exists in both HD map and OSI trace ontologies, a search like "German motorway assets" can match both domains without hardcoded domain tables.
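The multi-domain case can be illustrated with a small lookup: given the properties discovered per domain, find every domain that declares a given property. The property lists below are copied from the Supported Domains table; the `DomainProperties` type and the function name are assumptions, and a real compiler would work from the extracted vocabulary rather than a literal object.

```typescript
// Hypothetical structure: domain name -> property names discovered from its shapes.
type DomainProperties = Record<string, string[]>;

// Return all domains that declare the property, sorted for stable output.
function domainsWithProperty(domains: DomainProperties, property: string): string[] {
  return Object.keys(domains)
    .filter((name) => (domains[name] ?? []).indexOf(property) !== -1)
    .sort();
}

const discoveredProperties: DomainProperties = {
  "HD Map": ["roadTypes", "laneCount", "speedLimit", "formatType", "country", "trafficDirection"],
  "Scenario": ["scenarioCategory", "weather", "timeOfDay", "trafficDensity"],
  "OSI Trace": ["roadTypes", "granularity", "fileFormat", "numberFrames"],
};

const roadTypeDomains = domainsWithProperty(discoveredProperties, "roadTypes");
console.log(roadTypeDomains); // ["HD Map", "OSI Trace"]
```

Because the lookup is driven by discovered data, adding roadTypes to another domain ontology would extend the match without any code change.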