
combined_ftest_5x2cv: accuracy vs error rates #1086

Open
AlbertoImg opened this issue Mar 24, 2024 · 0 comments

AlbertoImg commented Mar 24, 2024

Hi @rasbt,
First of all, thanks for the implementations and the teaching material; they are really appreciated.
I want to use this test in R, so I was looking at your Python implementation to get an idea. I would like to ask about a technical detail that is not clear to me:
I saw that in your function "combined_ftest_5x2cv", in the classification case, you use "accuracy" as the default scoring. The papers (Dietterich 1998 and Alpaydin 1998) describe this score ($p_i^{(j)}$) in terms of "error rates", i.e. the "observed proportion of test examples misclassified by algorithm".
Are you using "accuracy" as the scoring because the math does not change in the end (considering accuracy = 1 - error rate, the "1 -" terms cancel in the difference between the two algorithms, leaving only a sign flip)?

Thanks in advance
Best
Alberto

Your implementation:
```python
if scoring is None:
    if estimator1._estimator_type == "classifier":
        scoring = "accuracy"  # <-- HERE
    elif estimator1._estimator_type == "regressor":
        scoring = "r2"
    else:
        raise AttributeError("Estimator must be a Classifier or Regressor.")
if isinstance(scoring, str):
    scorer = get_scorer(scoring)
else:
    scorer = scoring

variances = []
differences = []

def score_diff(X_1, X_2, y_1, y_2):
    estimator1.fit(X_1, y_1)
    estimator2.fit(X_1, y_1)
    est1_score = scorer(estimator1, X_2, y_2)  # <-- HERE
    est2_score = scorer(estimator2, X_2, y_2)  # <-- HERE
    score_diff = est1_score - est2_score  # <-- HERE
    return score_diff
```
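
If it helps, here is a small numerical sketch of my reasoning (using made-up per-fold accuracies, and the combined F statistic written out directly from Alpaydin 1998, $F = \frac{\sum_{i=1}^{5}\sum_{j=1}^{2} (p_i^{(j)})^2}{2 \sum_{i=1}^{5} s_i^2}$, rather than calling your function). It suggests the statistic comes out identical whether accuracies or error rates are plugged in, because the sign flip disappears when squaring:

```python
import numpy as np

# Hypothetical per-fold accuracies of two classifiers over 5 repetitions x 2 folds.
# These numbers are made up purely for illustration.
acc1 = np.array([[0.90, 0.88], [0.91, 0.87], [0.89, 0.90], [0.92, 0.86], [0.88, 0.89]])
acc2 = np.array([[0.85, 0.86], [0.84, 0.87], [0.86, 0.85], [0.87, 0.84], [0.85, 0.86]])


def combined_f_stat(p):
    """Combined 5x2cv F statistic (Alpaydin 1998):
    F = sum_{i,j} (p_i^(j))^2 / (2 * sum_i s_i^2),
    where s_i^2 = (p_i^(1) - pbar_i)^2 + (p_i^(2) - pbar_i)^2.
    `p` is the 5x2 array of per-fold score differences."""
    p_bar = p.mean(axis=1, keepdims=True)          # mean difference per repetition
    s2 = ((p - p_bar) ** 2).sum(axis=1)            # variance estimate per repetition
    return (p ** 2).sum() / (2.0 * s2.sum())


# Differences based on accuracies vs. differences based on error rates (= 1 - accuracy):
diff_acc = acc1 - acc2                 # p_i^(j) from accuracies
diff_err = (1 - acc1) - (1 - acc2)     # p_i^(j) from error rates, i.e. -diff_acc

print(combined_f_stat(diff_acc))   # both print the same value:
print(combined_f_stat(diff_err))   # squaring removes the sign flip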
