Global maintenance 2025 turn 2

vecxoz · vecxoz · commit bcbca766ddaa · 2025-09-24T15:45:14.000+03:00
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 [![PyPI version](https://img.shields.io/pypi/v/vecstack.svg?colorB=4cc61e)](https://pypi.python.org/pypi/vecstack)
 [![PyPI license](https://img.shields.io/pypi/l/vecstack.svg)](https://github.com/vecxoz/vecstack/blob/master/LICENSE.txt)
-[![Build Status](https://travis-ci.org/vecxoz/vecstack.svg?branch=master)](https://travis-ci.org/vecxoz/vecstack)
+[![Build status](https://github.com/vecxoz/vecstack/actions/workflows/actions.yaml/badge.svg?branch=master)](https://github.com/vecxoz/vecstack/actions)
 [![Coverage Status](https://coveralls.io/repos/github/vecxoz/vecstack/badge.svg?branch=master)](https://coveralls.io/github/vecxoz/vecstack?branch=master)
 [![PyPI pyversions](https://img.shields.io/pypi/pyversions/vecstack.svg)](https://pypi.python.org/pypi/vecstack/)
 
@@ -137,7 +137,7 @@ S_test = stack.transform(X_test)
 28. [Can I use `(Randomized)GridSearchCV` to tune the whole stacking Pipeline?](https://github.com/vecxoz/vecstack#28-can-i-use-randomizedgridsearchcv-to-tune-the-whole-stacking-pipeline)
 29. [How to define custom metric, especially AUC?](https://github.com/vecxoz/vecstack#29-how-to-define-custom-metric-especially-auc)
 30. [Do folds (splits) have to be the same across estimators and stacking levels? How does `random_state` work?](https://github.com/vecxoz/vecstack#30-do-folds-splits-have-to-be-the-same-across-estimators-and-stacking-levels-how-does-random_state-work)
-31. [How does `vecstack.StackingTransformer` differ from `sklearn.ensemble.StackingClassifier`?](https://github.com/vecxoz/vecstack#31)
+31. [How does `vecstack.StackingTransformer` differ from `sklearn.ensemble.StackingClassifier`?](https://github.com/vecxoz/vecstack#31-how-does-vecstackstackingtransformer-differ-from-sklearnensemblestackingclassifier)
 
 ### 1. How can I report an issue? How can I ask a question about stacking or vecstack package?
 
@@ -410,13 +410,6 @@ It significantly differs. Please see a [detailed explanation](https://github.com
 9. You can also look at animation of [Variant A](https://github.com/vecxoz/vecstack#variant-a-animation) and [Variant B](https://github.com/vecxoz/vecstack#variant-b-animation).
 
 
-# References
-
-* [Ensemble Learning](https://en.wikipedia.org/wiki/Ensemble_learning) ([Stacking](https://en.wikipedia.org/wiki/Ensemble_learning#Stacking)) in Wikipedia
-* Classical [Kaggle Ensembling Guide](https://mlwave.com/kaggle-ensembling-guide/) or try [another link](https://web.archive.org/web/20210727094233/https://mlwave.com/kaggle-ensembling-guide/)
-* [Stacked Generalization](https://www.researchgate.net/publication/222467943_Stacked_Generalization) paper by David H. Wolpert
-
-
 # Variant A
 
 ![Fold 1 of 3](https://github.com/vecxoz/vecstack/raw/master/pic/dia1.png "Fold 1 of 3")
@@ -442,3 +435,10 @@ It significantly differs. Please see a [detailed explanation](https://github.com
 # Variant B. Animation
 
 ![Variant B. Animation](https://github.com/vecxoz/vecstack/raw/master/pic/animation2.gif "Variant B. Animation")
+
+
+# References
+
+* [Ensemble Learning](https://en.wikipedia.org/wiki/Ensemble_learning) ([Stacking](https://en.wikipedia.org/wiki/Ensemble_learning#Stacking)) in Wikipedia
+* Classical [Kaggle Ensembling Guide](https://mlwave.com/kaggle-ensembling-guide/) or try [another link](https://web.archive.org/web/20210727094233/https://mlwave.com/kaggle-ensembling-guide/)
+* [Stacked Generalization](https://www.researchgate.net/publication/222467943_Stacked_Generalization) paper by David H. Wolpert
diff --git a/setup.py b/setup.py
@@ -5,6 +5,7 @@
 long_desc = '''
 Python package for stacking (stacked generalization) featuring lightweight functional API and fully compatible scikit-learn API.
 Convenient way to automate OOF computation, prediction and bagging using any number of models.
+All details, FAQ, and tutorials: https://github.com/vecxoz/vecstack
 '''
 
 setup(name='vecstack',
diff --git a/tests/test_sklearn_api_regression.py b/tests/test_sklearn_api_regression.py
@@ -2042,6 +2042,48 @@ def test_compare_with_stackingregressor_from_sklearn(self):
         y_pred_rf = rf.fit(X_train, y_train).predict(X_train)
         assert_array_equal(S_train_sklearn, np.hstack([y_pred_et.reshape(-1, 1), y_pred_rf.reshape(-1, 1)]))
 
+    # -------------------------------------------------------------------------
+    # Added 20250924
+    # Explicitly confirm that `validate_data` checks the number of features
+    # -------------------------------------------------------------------------
+    
+    def test_inconsistent_shape_passed_to_transform(self):
+        """
+        When transforming a non-training set, there used to be a manual check:
+        ```
+        if X.shape[1] != self.n_features_:
+            raise ValueError('Inconsistent number of features.')
+        ```
+        It was needed because I used the `check_array` function to validate data,
+        and it probably did not check the number of features.
+
+        Now I validate data with `validate_data`, which checks `self.n_features_in_`.
+        As a result, my manual check could never be triggered, and coverage dropped.
+        So I removed the manual check and added this test case to confirm explicitly that `validate_data` works.
+
+        In version 0.4.0 there was no specific test for this case,
+        probably because it was covered by `check_estimator`.
+        """
+        estimators = [
+            ('lr', LinearRegression()),
+            ('ridge', Ridge())]
+        
+        stack = StackingTransformer(estimators=estimators,
+                                    regression=True,
+                                    variant='B',
+                                    n_folds=5,
+                                    shuffle=False)
+        
+        stack = stack.fit(X_train, y_train)
+        S_train = stack.transform(X_train)  # OK
+        S_test = stack.transform(X_test)  # OK
+        
+        # Transform the train set with a different number of features (it is treated as a non-train set because its shape differs)
+        assert_raises(ValueError, stack.transform, X_train[:, 1:])
+        
+        # Transform the test set with a different number of features
+        assert_raises(ValueError, stack.transform, X_test[:, :-1])
+
 # -----------------------------------------------------------------------------
 # -----------------------------------------------------------------------------
 
diff --git a/vecstack/coresk.py b/vecstack/coresk.py
@@ -755,9 +755,10 @@ def transform(self, X, is_train_set=None):
         # Transform any other set
         # *********************************************************************
         else:
+            # Legacy check, now covered by `validate_data`
             # Check n_features
-            if X.shape[1] != self.n_features_:
-                raise ValueError('Inconsistent number of features.')
+            # if X.shape[1] != self.n_features_:
+            #     raise ValueError('Inconsistent number of features.')
 
             # Create empty numpy array for test predictions
             S_test = np.zeros((X.shape[0], self.n_estimators_ * self.n_classes_implicit_))
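
The docstring of `test_inconsistent_shape_passed_to_transform` explains that the manual feature-count check became unreachable once validation started comparing the input against `self.n_features_in_`. A minimal pure-Python sketch of that idea follows. `TinyTransformer` and `raises_value_error` are hypothetical names used only for illustration; this is neither the vecstack code nor the scikit-learn implementation.

```python
class TinyTransformer:
    """Hypothetical toy transformer; not part of vecstack or scikit-learn."""

    def fit(self, X):
        # Remember the width of the training data, mirroring the role
        # of the `n_features_in_` attribute mentioned in the docstring.
        self.n_features_in_ = len(X[0])
        return self

    def transform(self, X):
        # Reject input whose width differs from what was seen in `fit`.
        n_features = len(X[0])
        if n_features != self.n_features_in_:
            raise ValueError(
                f"X has {n_features} features, but TinyTransformer "
                f"is expecting {self.n_features_in_} features as input."
            )
        # Return a copy; a real transformer would compute predictions here.
        return [list(row) for row in X]


def raises_value_error(func, *args):
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


X_train = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
tiny = TinyTransformer().fit(X_train)

# Same width as in `fit`: accepted.
same_width_ok = not raises_value_error(tiny.transform, X_train)

# One column dropped, as in `X_train[:, 1:]` from the test: rejected.
narrow_rejected = raises_value_error(tiny.transform, [row[1:] for row in X_train])
```

Because the width is compared inside the validation step, a second comparison after it can never fire, which is why the commit comments out the manual check and relies on the new test instead.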