Framework for Semi-Automated Labeling for Predictive Analytics
Thursday, Oct 17, 2019
With large volumes of data collected from assets, supervised machine learning is increasingly being used to develop predictive analytics solutions for maintenance and failure prediction. Examples include failure prediction for main bearings in locomotive engines and troubleshooting of faults in healthcare assets such as CT/MR scanners. Developing any supervised learning model requires labeled training data. In fact, labeling errors are the single biggest source of “bad” machine learning models. Typically, experts are consulted to label “interesting” events such as the failure of an asset, equipment downtime, or a part replacement. The history of these “interesting” events resides in engineer and technician notes, which the experts mine manually in order to label the input data. In this presentation, Tapan will explain an unsupervised learning method that mines the technical notes, creates a corpus of “interesting” events, and assigns each data point to one of these events. The method also allows for structured expert feedback to edit the labels where required.
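
The abstract does not say which algorithm the talk uses, so the following is only a rough sketch of the general idea, not the presented method. It assumes a simple stand-in technique: greedy clustering of notes by Jaccard similarity of their words, followed by an optional expert override step. The stop-word list, the 0.3 threshold, the sample notes, and all function names are hypothetical choices made for illustration.

```python
import re
from typing import Dict, List, Set

# Hypothetical stop-word list; a real system would use a domain-specific one.
STOP_WORDS = {"the", "a", "an", "and", "of", "on", "in", "to",
              "was", "is", "for", "with", "at"}


def tokenize(note: str) -> Set[str]:
    """Lowercase a note and return its set of alphabetic, non-stop-word tokens."""
    return {t for t in re.findall(r"[a-z]+", note.lower()) if t not in STOP_WORDS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_notes(notes: List[str], threshold: float = 0.3) -> List[int]:
    """Greedy clustering: each note joins the first cluster whose first
    member (the "leader") is at least `threshold` similar; otherwise it
    starts a new cluster. Returns one cluster id per note."""
    leaders: List[Set[str]] = []
    labels: List[int] = []
    for note in notes:
        tokens = tokenize(note)
        for cluster_id, leader in enumerate(leaders):
            if jaccard(tokens, leader) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing cluster was similar enough: this note becomes a leader.
            leaders.append(tokens)
            labels.append(len(leaders) - 1)
    return labels


def apply_expert_feedback(labels: List[int], overrides: Dict[int, int]) -> List[int]:
    """Return a copy of `labels` with expert corrections {note_index: label} applied."""
    corrected = list(labels)
    for index, label in overrides.items():
        corrected[index] = label
    return corrected


# Made-up example notes, not real maintenance records.
notes = [
    "Replaced main bearing after vibration alarm",
    "Main bearing replaced, vibration high",
    "Scanner tube cooling fault, downtime 4 hours",
    "Cooling fault on scanner tube",
]
labels = cluster_notes(notes)                      # automatic event labels
final_labels = apply_expert_feedback(labels, {})   # no expert edits in this run
```

Each cluster id plays the role of one "interesting" event type, and the override dictionary is one simple way an expert's corrections could be recorded without rerunning the clustering. A production system would likely replace the word-overlap similarity with a stronger text representation.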