Contributing to BioSim Schema
This page explains how to contribute new terms and improvements to the BioSim schema, how LinkML is used in this repository, and how schema changes flow into downstream artefacts used by other BioSimDR tools.
What is LinkML
LinkML is a schema language for defining data models in YAML and generating downstream representations such as JSON-LD, JSON Schema, Python data models, and human-readable documentation.
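As a rough illustration of what a LinkML schema looks like, the sketch below defines one class with two attributes. The schema id, prefix, class name, and attribute names are hypothetical and are not taken from the BioSim schema.

```yaml
# Minimal LinkML schema sketch. All names and URLs here are hypothetical.
id: https://example.org/demo-schema
name: demo_schema
prefixes:
  linkml: https://w3id.org/linkml/
  demo: https://example.org/demo-schema/
default_prefix: demo
default_range: string      # attributes without an explicit range are strings
imports:
  - linkml:types           # built-in types such as string, integer, float

classes:
  Simulation:
    description: A single simulation run.
    attributes:
      title:
        description: Human-readable name of the run.
        required: true
      n_steps:
        description: Number of integration steps.
        range: integer
```

LinkML's generator commands, such as gen-json-schema or gen-python, take a file of this shape as input and emit the corresponding downstream representation.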
Useful references:
How this Schema is Designed
The root schema is defined in biosim_schema_root. It imports domain components from schema_components_folder.
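The relationship between a root schema and its component files might look roughly like the following. This is a sketch under assumptions: the schema id, the component file names, and the slot names are placeholders and do not reflect the repository's actual layout. Only the SimulationMetadata class name comes from this page.

```yaml
# Sketch of a root schema that imports domain components.
# The id, component paths, and slot names are placeholders.
id: https://example.org/biosim-schema
name: biosim_schema
prefixes:
  linkml: https://w3id.org/linkml/
  biosim: https://example.org/biosim-schema/
default_prefix: biosim
imports:
  - linkml:types
  - components/system        # hypothetical file components/system.yaml
  - components/integrator    # hypothetical file components/integrator.yaml

classes:
  SimulationMetadata:
    tree_root: true          # marks this class as the top-level entry point
    description: Root container for all simulation metadata.
    slots:
      - system               # assumed to be defined in an imported component
      - integrator           # assumed to be defined in an imported component
```

Keeping the root file limited to imports and the root class means new terms can be added in a component file without touching the top-level structure.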
Design principles used in this project:
Keep top-level structure stable via the SimulationMetadata root class.
Group terms by scientific domain in component files.
Prefer reusable quantity classes for values with units.
Use engine_mapping annotations on slots to map MD engine-native terms to canonical schema terms.
Generate derived artefacts from schema source, rather than editing generated files manually.
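Two of the principles above, reusable quantity classes and engine_mapping annotations, can be sketched as a fragment of a component file. The class name, the slot name, and in particular the shape of the engine_mapping value are assumptions made for illustration. Check an existing component file for the convention this project actually uses.

```yaml
# Fragment of a hypothetical component file (assumes linkml:types is imported).
# QuantityValue, timestep, and the engine_mapping format are illustrative only.
classes:
  QuantityValue:
    description: A numeric value paired with its unit.
    attributes:
      value:
        range: float
        required: true
      unit:
        range: string
        required: true

slots:
  timestep:
    description: Integration time step.
    range: QuantityValue     # reuse the quantity class instead of a bare float
    inlined: true
    annotations:
      # Assumed format: engine name followed by the engine-native key.
      engine_mapping: "gromacs:dt, amber:dt"
```

Pairing the value with its unit in one class keeps unit handling consistent across slots, while the annotation records how each MD engine names the same concept.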
How to Propose Changes
You can contribute through either a GitHub issue or a pull request.
Issue-first workflow (recommended for new terms):
Open an issue in biosim_schema_issues describing the term, scientific meaning, expected value type, and unit.
Include at least one engine-specific key where available, for example the name the term has in gromacs or amber.
State where the term should live in the schema hierarchy.
Direct pull request workflow:
Fork and branch from main.
Update relevant schema component file(s) in schema_components_folder.
Regenerate artefacts with the utility command.
Run tests and checks.
Submit a PR in biosim_schema_pulls with rationale and examples.