The Pattern Analysis and Learning (PAL) Team, part of the Pattern Recognition Theory and Methods Group of the National Laboratory of Pattern Recognition (NLPR), was founded in 2005. The team's research topics include the theory and methods of pattern recognition and machine learning, and their applications to document analysis, image analysis, and data mining.
Pattern recognition methods have found many successful applications, but the problems encountered are becoming more complex. The major challenges include the curse of dimensionality, small sample sizes, large category sets, changing distributions, unlabeled samples, multiple domains, context, etc. The research issues include parametric and non-parametric density estimation, dimensionality reduction and feature selection, classifier learning and adaptation, hybrid statistical/structural classification, semi-supervised learning, contextual classification, etc.
Character recognition and document analysis techniques aim to process large volumes of paper and pen-based documents, as well as text in video and scene images. The remaining difficult problems include the recognition of unconstrained handwritten characters, character segmentation, handwritten text recognition, scene text recognition, the recognition of text in low-resolution or degraded images, and so on. We also need to raise document analysis to the level of linguistic content, that is, to combine it with text categorization and retrieval.