[FEATURE]: Proposed changes for MLBOM schema for CycloneDX 2.0 #702

@mrutkows

Describe the feature

This issue captures proposed changes and action items raised by the MLBOM work group to improve the ML schema for CycloneDX 2.0.

  • Fields in modelParameters should be moved to the top-level modelCard schema.
  • Explore making releaseNotes plural to account for the same (identical) component being released simultaneously to different platforms/repositories (e.g., models released to HF, Ollama, etc.).
    • We agree that "best practices" should be written for this use case, covering when and how to use multiple release notes and how to ensure the component identity is the same.
  • architectureFamily as a simple string may not be enough: hybrid architectures are now common and are diverging at an increasing rate (even just assuming transformers, and also taking into account models that include multiple models, such as smolDocling or tableFormer).
    • Consider whether this field still has value if modelArchitecture is redesigned to allow a more precise description of the architectural "layers".
    • Additionally, consider properties such as dense, sparse, MoE, etc.
  • Redesign modelArchitecture to allow a description of the layers that compose the model. -> TODO @mrutkows
  • modelParameters should reflect the number of "learned" parameters.
    • Note: Does the EU CRA say each parameter needs to be described? More information is needed, since for a BOM this would be unsupportable as well as impossible to derive from any scanning tool.
  • task needs to be plural, i.e., tasks.
    • TODO: Does this need to be a string, an enum, or something more complex?
  • NEW: Add trainingConsiderations to describe the training processes
    • TODO: more discussion/design needed
  • TODO: Discuss reworking inputs and outputs (both strings), whose meaning is not clear. Judging by the description, these appear to be (chat) template parameters, which may vary by template since a model can have multiple templates.
  • TODO: Need to add a description of required (or fixed) hyperparameters (e.g., params.json).
    • Many models require certain parameters and will NOT work properly (producing invalid results) if these are not set correctly (e.g., clip rects for image models, temperature set to zero for guardrail models, etc.).
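
To make a few of the items above concrete, here is a minimal Python sketch of how plural tasks, a learned-parameter count, a layer-level architecture description, and required hyperparameters might fit together. Every name in it (Layer, RequiredHyperparameter, ModelCardSketch, check_hyperparameters, and all field names) is hypothetical and chosen only for illustration; none of it is part of the CycloneDX schema or an agreed design.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Layer:
    """One group of architectural layers (hypothetical shape)."""
    kind: str        # e.g. "attention", "moe", "ssm"
    count: int = 1   # how many layers of this kind the model stacks


@dataclass
class RequiredHyperparameter:
    """A hyperparameter that must hold a fixed value for valid results."""
    name: str
    value: Any
    reason: str = ""


@dataclass
class ModelCardSketch:
    """Illustrative only; NOT the CycloneDX modelCard schema."""
    tasks: List[str] = field(default_factory=list)  # plural "tasks"
    learned_parameter_count: Optional[int] = None   # number of learned parameters
    layers: List[Layer] = field(default_factory=list)
    required_hyperparameters: List[RequiredHyperparameter] = field(
        default_factory=list
    )


def check_hyperparameters(card: ModelCardSketch, runtime: Dict[str, Any]) -> List[str]:
    """Return one message per required hyperparameter that is missing or wrong."""
    problems = []
    for hp in card.required_hyperparameters:
        if hp.name not in runtime:
            problems.append(f"{hp.name}: missing, must be {hp.value!r}")
        elif runtime[hp.name] != hp.value:
            problems.append(
                f"{hp.name}: is {runtime[hp.name]!r}, must be {hp.value!r}"
            )
    return problems


# Example: a guardrail model that needs temperature set to zero.
guardrail = ModelCardSketch(
    tasks=["text-classification"],
    layers=[Layer(kind="attention", count=12)],
    required_hyperparameters=[
        RequiredHyperparameter("temperature", 0, "deterministic verdicts"),
    ],
)
print(check_hyperparameters(guardrail, {"temperature": 0.7}))
```

The point of the sketch is only that a machine-readable list of fixed hyperparameters would let a consumer detect a misconfigured deployment; whether such values belong in the BOM itself or in a referenced file such as params.json remains an open design question.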

Additional considerations

  • Extend modelCards to allow for similar new concepts such as "system cards" (system-level usage of a model) and "agent cards" (models used in agentic instances), which are being adopted by model providers.
