Non-exhaustive learning

Non-exhaustive learning for the discovery of emerging pathogens using phenotypic screening systems

Most tools for microbial pathogen detection and recognition are based on the physiological or genetic properties of microorganisms: metabolic characteristics of substrate utilization, analysis of signature molecules by antibodies, nucleic acid analysis, and analysis of the interaction of pathogens with eukaryotic cells. However, there is enormous interest in devising truly label-free and reagentless biosensors that exploit the biophysical nature of the samples without the need for sensing and reporting biochemistry. Phenotypic biophysical sensors come closest to realizing this goal. Whether based on mass spectrometry or forward-scatter phenotyping, phenotypic biosensors rely on a library of spectral signatures generated for different bacterial classes, which is then used to detect and classify future samples of unknown nature. However, owing to pathogen evolution and migration and to the challenges involved in sample collection and preparation, these training libraries are inherently non-exhaustive, which severely limits the use of traditional supervised-learning algorithms in phenotypic biosensing. It is impractical, and often impossible, to define a training library with a complete set of classes and then collect samples for each class: some bacterial classes may not exist at the time of training, some may exist but be unknown, and some may be known to exist but have no obtainable samples. This research project therefore aims to develop robust algorithms that use non-exhaustively defined training data sets to identify samples of emerging bacterial classes as novelties, to model their underlying distributions, and to associate them with higher-level groups of known classes as new samples are sequentially classified in real time.
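As a rough illustration of the sequential setting described above, the following Python sketch classifies samples one at a time against a library of known classes and opens a new class whenever a sample lies too far from every existing one. It is a minimal sketch under stated assumptions, not the project's algorithm: each class is summarized by a diagonal Gaussian, novelty is decided by a fixed distance threshold, and the names ClassModel, NonExhaustiveClassifier, threshold, and default_var are hypothetical. It does not attempt the association of new classes with higher-level groups.

```python
import math
from dataclasses import dataclass


@dataclass
class ClassModel:
    """Diagonal Gaussian summary of one class (hypothetical structure)."""
    mean: list
    var: list
    count: int = 1
    known: bool = True  # False for classes discovered after training

    def distance(self, x):
        # Squared Mahalanobis distance under a diagonal covariance.
        return sum((xi - m) ** 2 / v for xi, m, v in zip(x, self.mean, self.var))

    def update(self, x):
        # Incremental mean update; the variance is kept fixed for simplicity.
        self.count += 1
        self.mean = [m + (xi - m) / self.count for xi, m in zip(x, self.mean)]


class NonExhaustiveClassifier:
    """Sequential classifier that adds a class when a sample looks novel."""

    def __init__(self, threshold=9.0, default_var=1.0):
        self.threshold = threshold      # assumed novelty cutoff on the distance
        self.default_var = default_var  # assumed variance for a brand-new class
        self.classes = {}
        self._n_novel = 0

    def fit_known(self, label, samples):
        # Estimate per-feature mean and variance from the training library.
        n, dim = len(samples), len(samples[0])
        mean = [sum(s[d] for s in samples) / n for d in range(dim)]
        var = [max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, 1e-6)
               for d in range(dim)]
        self.classes[label] = ClassModel(mean, var, count=n, known=True)

    def classify(self, x):
        """Return (label, created_new_class) for one incoming sample."""
        best_label, best_dist = None, math.inf
        for label, model in self.classes.items():
            d = model.distance(x)
            if d < best_dist:
                best_label, best_dist = label, d
        if best_dist > self.threshold:
            # No existing class explains the sample: start a new one.
            self._n_novel += 1
            label = f"novel_{self._n_novel}"
            self.classes[label] = ClassModel(
                list(x), [self.default_var] * len(x), count=1, known=False)
            return label, True
        self.classes[best_label].update(x)
        return best_label, False


clf = NonExhaustiveClassifier(threshold=9.0, default_var=1.0)
clf.fit_known("A", [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
clf.fit_known("B", [[10.0, 10.0], [11.0, 11.0], [10.0, 11.0], [11.0, 10.0]])
r1 = clf.classify([0.6, 0.4])  # close to known class A
r2 = clf.classify([5.0, 5.0])  # far from both known classes
r3 = clf.classify([5.5, 4.5])  # close to the class opened for r2
```

Because the class opened for a novel sample stays in the library, later samples from the same emerging class are assigned to it rather than flagged again. A fixed threshold is the simplest possible novelty rule; a probabilistic treatment of the class models would be a natural replacement.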

Page updated on 2022-12-07 15:50:09 -0500