feat(medcat): CU-869bhm1zy Improve plugins #272
base: main
…register and their metadata
… without the necessary plugins
… fallback during provider finding
Task linked: CU-869bhm1zy Improve plugin system
tomolopolis
left a comment
nice - v small comments - an update to the tutorial(s) might also be useful once this is in
        "provider": provider,
    }
else:
    pipeline_description["addons"].append({
iirc - external addons would be called plugins, and addons would be internal only, i.e. MetaCAT, RelCAT? But now reading this, do addons still remain internal? For ease, I'm okay with just sticking to one name for all...
This is to describe the pipeline - core components (ner, linking) and addon components (MetaCAT, RelCAT).
If it's not a core component, it's an addon component. And it gets added here.
Perhaps some clarity:
A component that adds on top of the core components is called an addon. For instance, MetaCAT or RelCAT.
An external source of code that hooks into MedCAT is a plugin. This plugin can provide any component - core or addon. For instance, medcat-gliner is a plugin.
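The distinction above can be sketched in code: whether a component is core or an addon is independent of whether it comes from MedCAT itself or from a plugin. This is only an illustration, not the PR's implementation. `ComponentInfo`, `describe_pipeline`, `CORE_TYPES`, and the `meta_cat` names are hypothetical; the core component types and provider names are taken from the model card example in the PR description.

```python
from dataclasses import dataclass

# Assumption: the core component types listed in the example model card.
CORE_TYPES = ("tagging", "token_normalizing", "ner", "linking")
# Assumption: components shipped with MedCAT report "medcat" as provider.
BUILTIN_PROVIDER = "medcat"


@dataclass(frozen=True)
class ComponentInfo:
    """Hypothetical record describing one pipeline component."""
    comp_type: str
    name: str
    provider: str

    @property
    def is_addon(self) -> bool:
        # Anything that is not a core component is an addon.
        return self.comp_type not in CORE_TYPES

    @property
    def from_plugin(self) -> bool:
        # A plugin is an external provider; it may supply core or addon components.
        return self.provider != BUILTIN_PROVIDER


def describe_pipeline(components: list[ComponentInfo]) -> dict:
    """Build a description shaped like the "Pipeline Description" entry."""
    description: dict = {"core": {}, "addons": []}
    for comp in components:
        entry = {"name": comp.name, "provider": comp.provider}
        if comp.is_addon:
            description["addons"].append(entry)
        else:
            description["core"][comp.comp_type] = entry
    return description


# A core component supplied by a plugin, and an addon supplied by MedCAT itself.
gliner = ComponentInfo("ner", "gliner_ner", "medcat_gliner")
meta_cat = ComponentInfo("meta_cat", "meta_cat", "medcat")  # hypothetical names
pipeline = describe_pipeline([gliner, meta_cat])
```

Here `gliner` is a core component from a plugin, while `meta_cat` is an addon that is not from a plugin, so the two terms never collide.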
def get_all_extra_deps_raw(package_name: str) -> list[str]:
    """Get all the dependencies for a pcakge that are for an extra component.
spelling of package
Improved Plugin System: Model Awareness and Better Error Handling
Overview
This PR enhances the plugin system to make models aware of what plugins they use, improves error handling when required plugins are missing, and provides better visibility into plugin dependencies through model cards.
Key Features
1. Plugin Registration and Tracking
A plugin registry (PluginRegistry) that tracks all loaded plugins and their metadata (name, version, author, URL)
2. Model Card Enhancements
3. Improved Error Handling
4. Backward Compatibility
Example: Missing Plugin Error
When a user tries to load a model pack without the required plugins installed, they now get a clear error message:
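The exact message text is not reproduced here. As a rough sketch of how such a check could work, the snippet below compares a model card's "Required Plugins" entries against the installed distributions and raises one readable error listing everything that is missing. `MissingPluginError`, `is_installed`, and `check_required_plugins` are hypothetical names, the message wording is invented for illustration, and looking plugins up by distribution name via `importlib.metadata` is an assumption rather than the PR's actual mechanism.

```python
from importlib import metadata
from typing import Callable


class MissingPluginError(Exception):
    """Hypothetical error raised when required plugins are not installed."""


def is_installed(dist_name: str) -> bool:
    # Assumption: a plugin counts as available if its distribution is installed.
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


def check_required_plugins(
    required: list[dict],
    installed_check: Callable[[str], bool] = is_installed,
) -> None:
    """Raise MissingPluginError if any entry in `required` is unavailable.

    `required` follows the shape of the "Required Plugins" model card entry:
    each item has a "name" and a "provides" list of [type, component] pairs.
    """
    missing = [plugin for plugin in required if not installed_check(plugin["name"])]
    if not missing:
        return
    lines = ["The model pack requires plugins that are not installed:"]
    for plugin in missing:
        provided = ", ".join(
            f"{comp_type}:{comp_name}"
            for comp_type, comp_name in plugin.get("provides", [])
        )
        lines.append(f"  - {plugin['name']} (provides {provided})")
    raise MissingPluginError("\n".join(lines))
```

Passing `installed_check` explicitly keeps the check testable without installing or uninstalling anything.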
Model card example
{
  "Model ID": "234dda1597f635e3",
  "Last Modified On": "2025-11-07T12:57:03.405689",
  "History (from least to most recent)": [
    "234dda1597f635e3"
  ],
  "Description": "This is a UK KCH medcat model. Created on the 20220913. It contains mappings to ICD10 and OPCS4. Enjoy!",
  "Source Ontology": [
    "20220803_SNOMED_UK_clinical_ext",
    "20220831_SNOMED_UK_drug_ext",
    "Enriched via UMLS v2022AA English terms only"
  ],
  "Location": "N/A",
  "Pipeline Description": {
    "core": {
      "tagging": {
        "name": "tag-and-skip-tagger",
        "provider": "medcat"
      },
      "token_normalizing": {
        "name": "token_normalizer",
        "provider": "medcat"
      },
      "ner": {
        "name": "gliner_ner",
        "provider": "medcat_gliner"
      },
      "linking": {
        "name": "medcat2_linker",
        "provider": "medcat"
      }
    },
    "addons": []
  },
  "Required Plugins": [
    {
      "name": "medcat_gliner",
      "provides": [
        [
          "ner",
          "gliner_ner"
        ]
      ],
      "author": "Mart Ratas <[email protected]>",
      "url": "Homepage, https://github.com/CogStack/medcat-ops/tree/main/medcat-gliner"
    }
  ],
  "MetaCAT models": [],
  "Basic CDB Stats": {
    "Number of concepts": 760283,
    "Number of names": 3080847,
    "Number of concepts that received training": 38460,
    "Number of seen training examples in total": 153875883,
    "Average training examples per concept": 4000.932995319813,
    "Unsupervised training history": [],
    "Supervised training history": []
  },
  "Performance": {},
  "Important Parameters (Partial view, all available in cat.config)": {
    "config.components.ner.min_name_len": {
      "value": 2,
      "description": "Minimum detection length (found terms/mentions shorter than this will not be detected)."
    },
    "config.components.ner.upper_case_limit_len": {
      "value": 2,
      "description": "All detected terms shorter than this value have to be uppercase, otherwise they will be ignored."
    },
    "config.components.linking.similarity_threshold": {
      "value": 0.3,
      "description": "If the confidence of the model is lower than this a detection will be ignore."
    },
    "config.components.linking.filters.cuis": {
      "value": 0,
      "description": "Length of the CUIs filter to be included in outputs. If this is not 0 (i.e. not empty) its best to check what is included before using the model"
    },
    "config.general.spell_check": {
      "value": true,
      "description": "Is spell checking enabled."
    },
    "config.general.spell_check_len_limit": {
      "value": 4,
      "description": "Words shorter than this will not be spell checked."
    }
  },
  "MedCAT Version": "2.2.0.dev0"
}
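As an illustration of how the model card above could be consumed, the following sketch lists the core components that come from a provider other than `medcat` and checks that each such provider is declared under "Required Plugins". The helper names (`plugin_provided_components`, `undeclared_providers`) are hypothetical, the card is a trimmed copy of the example, and the key names are assumed to match the example exactly.

```python
import json

# Trimmed copy of the example model card, keeping only the relevant keys.
CARD_JSON = """
{
  "Pipeline Description": {
    "core": {
      "tagging": {"name": "tag-and-skip-tagger", "provider": "medcat"},
      "token_normalizing": {"name": "token_normalizer", "provider": "medcat"},
      "ner": {"name": "gliner_ner", "provider": "medcat_gliner"},
      "linking": {"name": "medcat2_linker", "provider": "medcat"}
    },
    "addons": []
  },
  "Required Plugins": [
    {"name": "medcat_gliner", "provides": [["ner", "gliner_ner"]]}
  ]
}
"""


def plugin_provided_components(card: dict) -> list[tuple[str, str, str]]:
    """Return (type, name, provider) for core components not provided by medcat."""
    core = card["Pipeline Description"]["core"]
    return [
        (comp_type, info["name"], info["provider"])
        for comp_type, info in core.items()
        if info["provider"] != "medcat"
    ]


def undeclared_providers(card: dict) -> set[str]:
    """Return providers used in the pipeline but absent from "Required Plugins"."""
    declared = {plugin["name"] for plugin in card.get("Required Plugins", [])}
    used = {provider for _, _, provider in plugin_provided_components(card)}
    return used - declared


card = json.loads(CARD_JSON)
external = plugin_provided_components(card)
undeclared = undeclared_providers(card)
```

For the example card, `external` contains only the `gliner_ner` NER component and `undeclared` is empty, since `medcat_gliner` is listed as a required plugin.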