Skip to content

Conversation

@synchon
Copy link
Member

@synchon synchon commented Nov 13, 2025

Are you requiring a new dataset or marker?

  • I understand this is not a marker or dataset request

Which feature do you want to include?

We want to be able to validate the yaml so we avoid issues like #296

How do you imagine this integrated in junifer?

Use pydantic and validate the yaml.

Do you have a sample code that implements this outside of junifer?

No response

Anything else to say?

No response

@fraimondo fraimondo added the enhancement New feature or request label Sep 24, 2024
@fraimondo fraimondo added this to the 0.0.6 (alpha 5) milestone Sep 24, 2024
@codecov
Copy link

codecov bot commented Nov 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (5abcacf) to head (39b1e18).
⚠️ Report is 12 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##             main      #364      +/-   ##
===========================================
+ Coverage   91.64%   100.00%   +8.35%     
===========================================
  Files         146         1     -145     
  Lines        5862         1    -5861     
  Branches      934         0     -934     
===========================================
- Hits         5372         1    -5371     
+ Misses        318         0     -318     
+ Partials      172         0     -172     
Flag Coverage Δ
docs 100.00% <ø> (ø)
junifer ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
docs/conf.py 100.00% <ø> (ø)

... and 144 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@synchon synchon requested a review from fraimondo November 26, 2025 10:39
Copy link
Contributor

@fraimondo fraimondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks promising. Though there are some changes that go against the easy-to-use QA:

  • Many parameters were str or list of str. Now they are lists. This adds the unnecessary burden to add a list in the YAMLs where the most common use case is to use a single element.

  • some parameters were str or Path and now they have to be Paths. This also makes it more complicated for users to define code. This can easily be validated later on and not with pydantic. Same for HttpUrl

  • I think there's an issue with the datadir which now will be part of the meta due to missing it's _. Also, there was a reason why it was not called fulldir in the datababber, but datadir, on purpose.

  • Some enums might add validation to pydantic, but will restraint the extensibility of junifer. E.g. DataType, ConfoundFormat, etc... These should be dynamic, as I might want to add some of them in an extension, without modifying junifer code.

Since there are 197 files changed, and many changes happen because of the first item. I will review more later with the next changes.

@synchon synchon force-pushed the feat/pydantic-validation branch from 5831270 to f9e207c Compare November 27, 2025 10:17
@synchon
Copy link
Member

synchon commented Nov 28, 2025

  • Many parameters were str or list of str. Now they are lists. This adds the unnecessary burden to add a list in the YAMLs where the most common use case is to use a single element.

All of them should be reverted now, let me know if you find any stray ones.

  • some parameters were str or Path and now they have to be Paths. This also makes it more complicated for users to define code.

Not at all, users can keep passing str and they'll be converted to Path by pydantic.

This can easily be validated later on and not with pydantic. Same for HttpUrl

The introduction of pydantic is primarily to not hand-roll these validations anymore and users should only need to add logical validations, but anyway these can be easily added via pydantic. HttpUrl is now AnyUrl and is same as the Path argument above.

  • I think there's an issue with the datadir which now will be part of the meta due to missing it's _.

Has been addressed in the specific comment.

Also, there was a reason why it was not called fulldir in the datababber, but datadir, on purpose.

The reason was to keep it in sync with non-datalad based datagrabbers and is already taken care of internally via introduction of _repodir. The UX should be exactly as it is now, let me know if you find any difference while using.

  • Some enums might add validation to pydantic, but will restraint the extensibility of junifer. E.g. DataType, ConfoundFormat, etc... These should be dynamic, as I might want to add some of them in an extension, without modifying junifer code.

Has been addressed in specific comments.

@synchon synchon requested a review from fraimondo November 28, 2025 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants