Using partial pooling when preparing data for machine learning applications
Win-Vector Blog 2018-04-18
Summary:
Geoffrey Simmons writes: I reached out to John Mount and Nina Zumel over at Win-Vector with a suggestion for their vtreat package, which automates many common challenges in preparing data for machine learning applications. The default behavior for impact coding high-cardinality variables had been a naive Bayes approach, which I found to be problematic due to its multi-modal output (assigning […]
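To make the idea of partial pooling concrete, here is a minimal Python sketch of a shrinkage-style impact code for a numeric outcome. It is not vtreat's implementation and not necessarily the method the post proposes; the function name `partial_pooling_impact` and the smoothing constant `prior_strength` are assumptions made up for this illustration. Each level's deviation from the grand mean is pulled toward zero, and levels with few observations are pulled harder.

```python
from collections import defaultdict
from statistics import mean


def partial_pooling_impact(levels, y, prior_strength=5.0):
    """Return {level: shrunken impact code} for a numeric outcome y.

    prior_strength is a hypothetical smoothing constant: it acts like a
    number of pseudo-observations sitting at the grand mean. A fitted
    hierarchical model would estimate the amount of shrinkage from the
    data instead of fixing it by hand.
    """
    grand_mean = mean(y)

    # Collect the outcome values observed for each level.
    groups = defaultdict(list)
    for level, value in zip(levels, y):
        groups[level].append(value)

    codes = {}
    for level, values in groups.items():
        n = len(values)
        # Weight approaches 1 for well-populated levels and 0 for rare ones.
        weight = n / (n + prior_strength)
        codes[level] = weight * (mean(values) - grand_mean)
    return codes


# Toy data: level "a" has ten rows, level "b" has a single row.
levels = ["a"] * 10 + ["b"]
y = [1.0] * 10 + [0.0]
codes = partial_pooling_impact(levels, y)
```

With this toy data, the single-row level "b" keeps only one sixth of its raw deviation from the grand mean, while the ten-row level "a" keeps two thirds. That is the behavior partial pooling is meant to give for high-cardinality variables with many rare levels.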
The post Using partial pooling when preparing data for machine learning applications appeared first on Statistical Modeling, Causal Inference, and Social Science.