[ENH] Improve Global Config Architecture #1564

@SimonBlanke

The config module uses extensive globals and I agree with this comment: https://github.com/openml/openml-python/blob/main/openml/config.py#L395

This is problematic for the following reasons:

  • No encapsulation: any code can run openml.config.apikey = "x".
  • Flaky tests: global state persists between tests, so results can depend on test order.
  • No validation: openml.config.apikey = 123 silently accepts a value of the wrong type.

And I am sure there are more reasons.
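
To make these points concrete, here is a minimal sketch. It uses a throwaway `types.ModuleType` object named `config` as a stand-in for `openml.config` (an assumption, so the snippet runs without openml installed), and `first_test` / `second_test` are hypothetical test functions:

```python
import types

# Stand-in for openml.config: a plain module whose settings are bare globals.
# (Hypothetical; the real module defines more attributes.)
config = types.ModuleType("config")
config.apikey = ""
config.server = "https://www.openml.org/api/v1/xml"


def first_test():
    # Any caller can overwrite the setting; nothing restores it afterwards.
    config.apikey = "key-from-first-test"


def second_test():
    # Implicitly assumes the default value, so the result depends on test order.
    return config.apikey == ""


first_test()
leaked = not second_test()  # state from first_test is still there

# No validation: an int is accepted where a string is expected.
config.apikey = 123
wrong_type_accepted = isinstance(config.apikey, int)

print(leaked, wrong_type_accepted)  # True True
```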

I think it was done to have this kind of API instead of function-call-like syntax:

openml.config.apikey = "my-key"
openml.config.server = "https://test.openml.org/api/v1/xml"

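We can preserve this API and still get rid of most of these globals()/global uses by defining a module-level __getattr__ (PEP 562) and using a dataclass for data encapsulation and validation. Python never calls a module-level __setattr__ function, so assignments have to be intercepted by swapping the module's class for a types.ModuleType subclass instead:

  import sys
  import types
  from dataclasses import dataclass, fields, replace

  @dataclass
  class OpenMLConfig:
      apikey: str = ""
      server: str = "https://www.openml.org/api/v1/xml"
      # ...

      def __post_init__(self):
          # Dataclasses do not enforce annotations; validation goes here.
          if not isinstance(self.apikey, str):
              raise TypeError("apikey must be a str")

  _config = OpenMLConfig()
  _FIELDS = {f.name for f in fields(OpenMLConfig)}

  def __getattr__(name: str):
      # PEP 562: only called when normal module attribute lookup fails.
      if name in _FIELDS:
          return getattr(_config, name)
      raise AttributeError(f"module 'openml.config' has no attribute '{name}'")

  class _ConfigModule(types.ModuleType):
      # Assignment is intercepted on the module's class, not in the module.
      def __setattr__(self, name: str, value):
          if name in _FIELDS:
              # replace() builds a new instance, re-running __post_init__.
              super().__setattr__("_config", replace(_config, **{name: value}))
          else:
              raise AttributeError(f"module 'openml.config' has no attribute '{name}'")

  sys.modules[__name__].__class__ = _ConfigModule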
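
One caveat that is easy to check in isolation: Python honours a module-level `__getattr__` (PEP 562) but never calls a module-level `__setattr__`, so assignment has to be intercepted on the module's class. The sketch below is one way to try this outside the package, not a finished implementation. The module name `fake_config`, the class `_ConfigModule`, and the string-only type check are assumptions made for the demo; both hooks live on the class here so the example fits in one file.

```python
import types
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class OpenMLConfig:
    apikey: str = ""
    server: str = "https://www.openml.org/api/v1/xml"

    def __post_init__(self):
        # Dataclasses do not enforce annotations, so check types by hand.
        for f in fields(self):
            if not isinstance(getattr(self, f.name), str):
                raise TypeError(f"{f.name} must be a str")


_FIELD_NAMES = {f.name for f in fields(OpenMLConfig)}


class _ConfigModule(types.ModuleType):
    """Module type that routes config fields through the dataclass."""

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name in _FIELD_NAMES:
            return getattr(self.__dict__["_config"], name)
        raise AttributeError(f"module {self.__name__!r} has no attribute {name!r}")

    def __setattr__(self, name, value):
        if name in _FIELD_NAMES:
            # replace() builds a new instance, which re-runs __post_init__.
            new = replace(self.__dict__["_config"], **{name: value})
            super().__setattr__("_config", new)
        else:
            raise AttributeError(f"module {self.__name__!r} has no attribute {name!r}")


# Hypothetical stand-in for openml.config, built by hand for the demo.
config = _ConfigModule("fake_config")
config.__dict__["_config"] = OpenMLConfig()

config.apikey = "my-key"
print(config.apikey)  # my-key

try:
    config.apikey = 123  # rejected by __post_init__
except TypeError as exc:
    print(exc)  # apikey must be a str
```

Inside a real module the same class would be installed with `sys.modules[__name__].__class__ = _ConfigModule`. Because a failed `replace()` raises before `_config` is rebound, the previous valid configuration stays in place.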

This is still not a solution I am completely content with, but it is an improvement and not too difficult to implement.
