Workshop on POMDP, Classification and Regression:
Relationships and Joint Utilization

June 7, 2006

Held in conjunction with ICAPS'06
16th International Conference on Automated Planning & Scheduling
June 6-10, 2006
Ambleside, The English Lake District, U.K.

Workshop Homepage:
http://www.ee.duke.edu/~xjliao/pomdpMLworkshop_noFrame.html

============================================================================

Overview:

The partially observable Markov decision process (POMDP) is a popular model
for planning under uncertainty. Classification and regression are standard
statistical tools for reconstructing a source (or its attributes) from
noise-corrupted data. Studies of POMDPs and of classification/regression
have mostly been pursued independently in the past. Recently, however, a
number of papers have reported using classification/regression techniques
to solve POMDPs, or using a POMDP to build cost-sensitive classifiers. Much
work is still underway, however, in exploring how POMDP and
classification/regression techniques can be applied to each other in a
mutually beneficial way.

The aim of this workshop is to bring together researchers from the POMDP
community and researchers from the statistical learning community, and to
create an opportunity for exchanging views and reporting ongoing work on
how a POMDP and a classifier/regressor can benefit each other. The research
possibilities on this subject are far from fully explored, and it is time
to bring this new interdisciplinary area to the attention of more
researchers. We believe that looking at POMDPs and classification/regression
from new and unified perspectives will stimulate a broader range of
contributions to both.

----------------------------------------------------------------------------

Related work:

Kearns et al.
[1] showed that the concept of "sample complexity" used in classification
can be extended to POMDPs, and they established an upper bound on the
number of trajectories that must be used to ensure good generalization.
Their work is pioneering in trajectory-based methods and in relating POMDPs
to classification. Several researchers have investigated using modern
classifiers such as the SVM to learn MDP policies, including Dietterich and
Wang [2], Lagoudakis and Parr [3], and Blatt and Hero [7]. Bagnell et al.
[4] reported some preliminary results on classification-based policy search
in POMDPs, and Langford and Zadrozny [5] provided some theoretical analysis
of it. Mahadevan [6] and Li et al. [8] studied regression methods in POMDPs.

In the opposite direction, Dimitrakakis and Bengio [11] reported using an
MDP as a gating network in a mixture of experts; Bonet and Geffner [9] and
Guo [10] applied POMDP techniques to classification problems in which the
class features and misclassification are cost-sensitive. The main drawback
of the methods in [9-10] is that the features are assumed independent.
Relaxation of this naive Bayes assumption is studied in [12], which reports
encouraging results.

The work in [1-12] signals nontrivial relationships between POMDPs and
classification/regression that can be utilized to the benefit of both.

References

[1] M. Kearns, Y. Mansour, A. Y. Ng, "Approximate planning in large POMDPs
    via reusable trajectories", NIPS 12, 2000
[2] T. Dietterich, X. Wang, "Batch Value Function Approximation via Support
    Vectors", NIPS 14, 2001
[3] M. Lagoudakis, R. Parr, "Reinforcement Learning as Classification:
    Leveraging Modern Classifiers", ICML, 2003
[4] J. A. Bagnell, S. Kakade, A. Y. Ng, J. Schneider, "Policy search by
    dynamic programming", NIPS 16, 2004
[5] J. Langford, B. Zadrozny, "Relating Reinforcement Learning Performance
    to Classification Performance", ICML, 2005
[6] S. Mahadevan, "Proto-Value Functions: Developmental Reinforcement
    Learning", ICML, 2005
[7] D. Blatt, A. Hero, "From Weighted Classification to Policy Search",
    NIPS, 2005
[8] H. Li, L. He, X. Liao, S. Ji, L. Carin, "Region-Based Value Iteration
    and Its Application to Robot Navigation in a Minefield", NIPS Workshop
    on Machine Learning Based Robotics in Unstructured Environments, 2005
[9] B. Bonet, H. Geffner, "Learning Sorting and Decision Trees with
    POMDPs", ICML, 1998
[10] A. Guo, "Decision-theoretic Active Sensing for Autonomous Agents",
    AAMAS, July 2003
[11] C. Dimitrakakis, S. Bengio, "Online Policy Adaptation for Ensemble
    Classifiers", Proceedings of European Symposium on Artificial Neural
    Networks, 28-30, 2004
[12] H. Li, X. Liao, L. Carin, "A Value-directed Bayesian Classifier",
    ICASSP, 2006

----------------------------------------------------------------------------

Topics:

We seek contributed work that addresses the many challenging questions
summarized in the following topics. Submissions on related topics are also
welcome.

- Trajectory-based policy search using classification and regression
  approaches.
- Value function and Q-function approximation using neural networks, kernel
  methods, etc.
- Novel methods for translating policy learning into
  classification/regression problems.
- Application of classification/regression techniques to MDPs with a very
  large discrete state space or a continuous state space.
- The use of classification/regression techniques in modeling and policy
  learning for POMDPs with a continuous observation space, a continuous
  action space, or a continuous state space.
- Application of POMDPs to non-myopic active learning in SVMs, logistic
  regression, and other discriminative classifiers.
- POMDP methods for cost-sensitive feature selection and sensor scheduling,
  with the ultimate goal of classification or regression.
- Methods for relaxing the naive Bayes assumption in cost-sensitive
  classification.
- Planning and decision making in mixture of experts and Bayesian networks.

----------------------------------------------------------------------------

Important Dates:

Paper Submission Deadline: February 30, 2006
Notification of Acceptance/Rejection: March 15, 2006
Camera-ready Copy Due Date: March 30, 2006
Workshop date: June 7, 2006

----------------------------------------------------------------------------

Paper Submissions:

Authors are encouraged to submit papers electronically in PDF format.
Papers must be formatted using the AAAI style template
(http://www.aaai.org/Publications/Author/macros-link.html) and must not
exceed 10 pages in length. Please send submissions by e-mail to either
xjliao@ee.duke.edu or lcarin@ee.duke.edu.

----------------------------------------------------------------------------

Organizing Committee:

Xuejun Liao, Duke University, USA
Lawrence Carin, Duke University, USA

Program Committee:

Alfred Hero, University of Michigan at Ann Arbor, USA
Carey E. Priebe, Johns Hopkins University, USA
Ronald Parr, Duke University, USA
Carey Schwartz, DARPA/DSO, USA
Douglas Cochran, Arizona State University, USA
Vikram Krishnamurthy, University of British Columbia, Canada
David Castanon, Boston University, USA