It says, "high precision relates to a low false positive rate", and in a few places it links these two together, e.g., "false positives, decreasing precision."
I think you're technically right, @daniel-yj-yang, for readers who have been trained to use terms like "false discovery rate". That is true of much of the medical community, but unfortunately not of much of the machine learning community. The problem here is that a technical term is being used inadvertently: an increase in false positives will indeed decrease precision if the number of true positives remains constant, and the counts of "false positives" and "false negatives" are indeed all that differ between the formulas for precision and recall. The difference between FPR and FDR is that the denominator of FDR depends on the estimator, whereas the denominator of FPR depends only on the ground truth. That is an important difference, but one that might not be easy to draw out in the context of this example. In any case, an attempt to improve the wording that avoids misusing jargon would be helpful.
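To make the denominator point concrete, here is a minimal sketch with hypothetical confusion-matrix counts (not taken from the linked example): raising the false-positive count while true positives stay fixed lowers precision and raises FDR in lockstep, while FPR moves against a denominator fixed by the ground-truth negatives.

```python
# Hypothetical counts for illustration only.
tp, tn = 80, 890  # true positives held constant; ground-truth negatives fixed

for fp in (10, 40):  # increase false positives
    precision = tp / (tp + fp)  # denominator depends on the estimator's positive calls
    fdr = fp / (tp + fp)        # false discovery rate: same estimator-dependent denominator
    fpr = fp / (fp + tn)        # false positive rate: denominator is ground-truth negatives
    print(f"fp={fp}: precision={precision:.3f}  fdr={fdr:.3f}  fpr={fpr:.3f}")
```

Note that precision and FDR always share a denominator, so they sum to 1; FPR does not.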
Describe the issue linked to the documentation
https://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html
It says, "high precision relates to a low false positive rate", and in a few places it links these two together, e.g., "false positives, decreasing precision."
Suggest a potential alternative/fix
"Precision = 1 - false discovery rate" and "Specificity = 1 - false positive rate"
Thus, the term "false discovery rate" should be emphasized, and "false positive rate" deemphasized, when talking about high precision.