Presentation #302.10 in the session Computation, Data Handling, Image Analysis — iPoster Session.
Bias in machine learning that arises from training set composition is usually addressed by weighting the training data to simulate equal composition. Here we show that this practice biases the probabilities that the machine learning classifier uses to assign labels to new data, and that weighting can also degrade the classifier’s performance. We demonstrate both effects using simulated data and real photometric data from SDSS and WISE. We also present techniques for detecting when a classifier’s probabilities are biased and a method for compensating for detected biases.
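The effect of weighting on class probabilities can be illustrated with a minimal sketch. It is not the analysis from the presentation: it assumes an idealized classifier that learns the exact posterior, two hypothetical one-dimensional Gaussian classes, and an illustrative true fraction of 10% for the rare class. Weighting to equal composition is modeled as replacing the true class fractions with 0.5 each. The function `undo_balancing` applies the standard prior-shift correction from Bayes' rule, which may differ from the compensation method used by the authors.

```python
import math


def gaussian_pdf(x, mean, sigma=1.0):
    """Density of a normal distribution with the given mean and sigma."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_rare(x, prior_rare, mean_common=0.0, mean_rare=2.0):
    """Exact probability that x belongs to the rare class, by Bayes' rule.

    The class means are hypothetical values chosen for illustration.
    """
    num = prior_rare * gaussian_pdf(x, mean_rare)
    den = num + (1.0 - prior_rare) * gaussian_pdf(x, mean_common)
    return num / den


def undo_balancing(p_balanced, prior_rare):
    """Map a probability learned under equal composition back to the true mix.

    Standard prior-shift correction: rescale the odds by the true class
    fractions. The 0.5 factors of the balanced priors cancel.
    """
    num = p_balanced * prior_rare
    den = num + (1.0 - p_balanced) * (1.0 - prior_rare)
    return num / den


TRUE_PRIOR = 0.1  # assumed true fraction of the rare class (illustrative)

for x in (0.0, 1.0, 2.0, 3.0):
    p_true = posterior_rare(x, TRUE_PRIOR)   # unweighted, true composition
    p_bal = posterior_rare(x, 0.5)           # weighted to equal composition
    p_fix = undo_balancing(p_bal, TRUE_PRIOR)
    print(f"x={x:.1f}  true={p_true:.3f}  weighted={p_bal:.3f}  corrected={p_fix:.3f}")
```

In this idealized setting the weighted probability for the rare class exceeds the true probability at every point, and the correction recovers the true value exactly. A real classifier trained on finite data would only approximate this behavior.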