Classifying Unbalanced Data
January 23 @ 2:10 pm - 3:00 pm
Presenter: Professor Norm Matloff
Dates: January 23, 2020
Location: DataLab classroom (Shields Library room 360)
In this talk, Computer Science Professor Norm Matloff will discuss a challenge faced by applied data researchers and practitioners from across domains – working with unbalanced data. Many resources on machine learning (ML) classification problems recommend that if your dataset has unbalanced class sizes, you should modify the data to have equal class counts. Yet, this is both unnecessary and often harmful. He’ll discuss a few motivating examples to unpack where unbalanced data naturally occurs, why it can be a problem, and potential solutions when faced with these scenarios.
About the instructor
Dr. Norm Matloff is a professor of computer science at the University of California at Davis, and was formerly a professor of statistics. He was a database software developer in Silicon Valley, and has been a statistical consultant for firms such as the Kaiser Permanente Health Plan. He holds a PhD in pure mathematics from UCLA, specializing in probability/functional analysis and statistics. His research interests include machine learning, parallel processing, statistical computing and missing value analysis. He has authored a number of books, including Statistical Regression and Classification: From Linear Models to Machine Learning, which received the Ziegel Award in 2017. He is Editor-in-Chief of the R Journal.
Link to talk slides: http://heather.cs.ucdavis.edu/DSI.pdf
- DataLab: Data Science and Informatics (DSI)