Non-exhaustive learning

Non-exhaustive learning for discovery of emerging pathogens using phenotypic screening systems

The majority of tools for microbial pathogen detection and recognition are based on physiological or genetic properties of microorganisms: metabolic characteristics of substrate utilization, analysis of signature molecules by antibodies, nucleic acid analysis, and analysis of the interaction of pathogens with eukaryotic cells. However, there is enormous interest in devising truly label-free and reagentless biosensors that operate on the biophysical nature of the samples, without the need for sensing and reporting biochemistry. Phenotypic biophysical sensors come closest to realizing this goal. Whether they are based on mass spectrometry or on forward-scatter phenotyping, phenotypic biosensors rely on a library of spectral signatures generated for different bacterial classes, which is then used to detect and classify future samples of unknown nature. However, owing to pathogen evolution and migration, and to the challenges involved in sample collection and preparation, the use of traditional supervised-learning algorithms in phenotypic biosensing is severely limited by the non-exhaustive nature of these training libraries. It is impractical, and often impossible, to define a training library with a complete set of classes and then collect samples for each class: some bacterial classes may not yet exist at the time of training, some may exist but be unknown, and some may be known but have no obtainable samples. The purpose of this research project is therefore to develop robust algorithms that can use non-exhaustively defined training data sets to identify samples of emerging bacterial classes as novelties, to model their underlying distributions, and to associate them with higher-level groups of known classes as new samples are sequentially classified in real time.
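The sequential workflow described above can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes each class is summarized by a running mean in feature space, and that a sample whose squared Euclidean distance to every known class mean exceeds a fixed cutoff is a novelty that seeds a new class. The class name `NonExhaustiveClassifier`, the `threshold` parameter, and the `novel_N` labels are all hypothetical, and the grouping of new classes under higher-level groups is not covered.

```python
import numpy as np


class NonExhaustiveClassifier:
    """Toy sequential classifier over a non-exhaustive training library.

    Assumption (not from the source): each class is represented only by
    its running mean, and novelty is decided by a fixed distance cutoff.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # squared-distance cutoff (hypothetical)
        self.means = {}             # label -> running mean vector
        self.counts = {}            # label -> number of samples seen
        self.n_novel = 0            # number of classes discovered online

    def fit(self, X, y):
        """Build the initial library from the known (incomplete) classes."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        for label in np.unique(y):
            points = X[y == label]
            self.means[str(label)] = points.mean(axis=0)
            self.counts[str(label)] = len(points)
        return self

    def classify(self, x):
        """Assign one sample to a known class, or open a new class for it."""
        x = np.asarray(x, dtype=float)
        # Find the closest class mean (call fit first so the library is not empty).
        label, dist = min(
            ((name, float(np.sum((x - mean) ** 2)))
             for name, mean in self.means.items()),
            key=lambda pair: pair[1],
        )
        if dist > self.threshold:
            # Too far from every modeled class: treat as an emerging class.
            self.n_novel += 1
            label = f"novel_{self.n_novel}"
            self.means[label] = x.copy()
            self.counts[label] = 1
            return label
        # Otherwise update the matched class model with the new sample.
        n = self.counts[label]
        self.means[label] = (n * self.means[label] + x) / (n + 1)
        self.counts[label] = n + 1
        return label


# Two known classes in a made-up 2-D feature space.
X_train = [[0.0, 0.0], [0.2, -0.2], [-0.2, 0.2],
           [5.0, 5.0], [5.2, 4.8], [4.8, 5.2]]
y_train = ["A", "A", "A", "B", "B", "B"]

clf = NonExhaustiveClassifier(threshold=4.0).fit(X_train, y_train)
for sample in ([0.1, 0.1], [20.0, 20.0], [20.5, 19.5]):
    print(sample, "->", clf.classify(sample))
```

Because a newly opened class is added to the library immediately, later samples from the same emerging class are matched to it rather than being flagged again. A fuller treatment would replace the fixed cutoff with a probabilistic model of each class distribution.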

Page updated on 2022-01-17 12:51:32 -0500