Contributing to BioSim Schema

This page explains how to contribute new terms and improvements to the BioSim schema, how LinkML is used in this repository, and how schema changes flow into downstream artefacts used by other BioSimDR tools.

What is LinkML

LinkML is a schema language for defining data models in YAML and generating downstream representations such as JSON-LD, JSON Schema, Python data models, and human-readable documentation.
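To make the idea of one source model with generated representations concrete, here is a minimal sketch in Python. It is not LinkML's own generator and not part of this repository: the Temperature class, the TYPE_MAP table, and the to_json_schema helper are all hypothetical, and the class is written as a Python dict rather than loaded from a YAML file so that the example needs only the standard library.

```python
import json

# Hypothetical LinkML-style class definition, written as a Python dict.
# In a real schema this would live in a YAML file and use the same keys.
linkml_class = {
    "name": "Temperature",
    "description": "A temperature value with its unit.",
    "attributes": {
        "value": {"range": "float", "required": True},
        "unit": {"range": "string", "required": True},
    },
}

# Assumed mapping from LinkML built-in type names to JSON Schema type names.
TYPE_MAP = {
    "string": "string",
    "integer": "integer",
    "float": "number",
    "boolean": "boolean",
}


def to_json_schema(cls: dict) -> dict:
    """Derive a small JSON Schema object from one class definition."""
    properties = {}
    required = []
    for attr_name, attr in cls["attributes"].items():
        # Attributes without an explicit range are treated as strings here.
        properties[attr_name] = {"type": TYPE_MAP[attr.get("range", "string")]}
        if attr.get("required", False):
            required.append(attr_name)
    return {
        "title": cls["name"],
        "description": cls.get("description", ""),
        "type": "object",
        "properties": properties,
        "required": required,
    }


schema = to_json_schema(linkml_class)
print(json.dumps(schema, indent=2))
```

The point of the sketch is the direction of flow: the class definition is the only thing edited by hand, and the JSON Schema is derived from it.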

How this Schema is Designed

The root schema is defined in biosim_schema_root. It imports domain components from schema_components_folder.

Design principles used in this project:

  • Keep top-level structure stable via the SimulationMetadata root class.

  • Group terms by scientific domain in component files.

  • Prefer reusable quantity classes for values with units.

  • Use engine_mapping annotations on slots to map MD engine-native terms to canonical schema terms.

  • Generate derived artefacts from schema source, rather than editing generated files manually.
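The engine_mapping principle above can be sketched as follows. This is an illustration under stated assumptions, not the repository's actual annotation format: the slot names, the quantity class names, the shape of the engine_mapping value (a dict from engine name to native parameter name), and both helper functions are hypothetical.

```python
# Hypothetical slot definitions, mirroring how annotated slots might look
# once the schema YAML is loaded. The engine_mapping layout is an assumption.
slots = {
    "timestep": {
        "range": "TimeQuantity",
        "annotations": {
            "engine_mapping": {"gromacs": "dt", "amber": "dt", "namd": "timestep"}
        },
    },
    "temperature": {
        "range": "TemperatureQuantity",
        "annotations": {
            "engine_mapping": {"gromacs": "ref_t", "amber": "temp0"}
        },
    },
}


def build_engine_lookup(slot_defs: dict, engine: str) -> dict:
    """Return a dict from engine-native term to canonical slot name."""
    lookup = {}
    for slot_name, slot in slot_defs.items():
        mapping = slot.get("annotations", {}).get("engine_mapping", {})
        if engine in mapping:
            lookup[mapping[engine]] = slot_name
    return lookup


def to_canonical(native_params: dict, lookup: dict) -> tuple:
    """Rename native keys to canonical slot names; report unmapped keys."""
    canonical = {}
    unmapped = []
    for key, value in native_params.items():
        if key in lookup:
            canonical[lookup[key]] = value
        else:
            unmapped.append(key)
    return canonical, unmapped


gromacs_lookup = build_engine_lookup(slots, "gromacs")
canonical, unmapped = to_canonical(
    {"dt": 0.002, "ref_t": 300, "nsteps": 1000}, gromacs_lookup
)
print(canonical)
print(unmapped)
```

Keeping the mapping on the slot itself means a new engine term is added in one place, next to the canonical term it maps to, and any lookup table can be rebuilt from the schema rather than maintained separately.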

How to Propose Changes

You can contribute through either a GitHub issue or a pull request.

Direct pull request workflow:

  1. Fork and branch from main.

  2. Update relevant schema component file(s) in schema_components_folder.

  3. Regenerate artefacts with the utility command.

  4. Run tests and checks.

  5. Submit a PR to biosim_schema_pulls that states the rationale for the change and includes example usage.