Affiliation: BNU-HKBU United International College
Title: Understanding Query Interfaces by Statistical Parsing
Users submit queries to an online database via its query interface. Query interface parsing, which is important for many applications, extracts the query capabilities of a query interface. Since most query interfaces are organized hierarchically, we present a novel query interface parsing method, StatParser (Statistical Parser), to automatically extract the hierarchical query capabilities of query interfaces. StatParser learns from a set of parsed query interfaces and then parses new query interfaces. It starts from a small grammar and enhances the grammar with a set of probabilities learned from the parsed query interfaces under the maximum-entropy principle. Given a new query interface, the probability-enhanced grammar selects the parse tree with the largest global probability as the query capabilities of that interface. Experimental results show that StatParser extracts the query capabilities with high accuracy and effectively overcomes the problems of existing query interface parsers.
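To illustrate the idea of choosing the parse tree with the largest global probability, the following is a minimal sketch and not StatParser itself. It assumes a plain probabilistic context-free grammar in Chomsky normal form rather than the maximum-entropy model described above. The grammar symbols (Attr, Label, Input, Range), the element types (text, textbox), the probabilities, and the function name best_parse are all hypothetical and hand-picked for the example.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form. All symbols and
# probabilities are invented for illustration, not learned from data.
LEXICAL = {  # (nonterminal, interface element type) -> probability
    ("Label", "text"): 1.0,
    ("Input", "textbox"): 1.0,
}
BINARY = {  # (parent, left child, right child) -> probability
    ("Attr", "Label", "Input"): 0.5,
    ("Attr", "Label", "Range"): 0.3,
    ("Attr", "Attr", "Input"): 0.2,
    ("Range", "Input", "Input"): 1.0,
}


def best_parse(elements, root="Attr"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(elements)
    # chart[(i, j)][symbol] = (probability, tree) covering elements[i:j]
    chart = defaultdict(dict)
    for i, elem in enumerate(elements):
        for (sym, word), p in LEXICAL.items():
            if word == elem:
                chart[(i, i + 1)][sym] = (p, (sym, elem))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the children
                for (parent, left, right), p in BINARY.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        lp, lt = chart[(i, k)][left]
                        rp, rt = chart[(k, j)][right]
                        prob = p * lp * rp
                        best_so_far = chart[(i, j)].get(parent, (0.0, None))[0]
                        if prob > best_so_far:
                            chart[(i, j)][parent] = (prob, (parent, lt, rt))
    return chart[(0, n)].get(root)


# A label followed by two text boxes, e.g. "Price" [from] [to].
result = best_parse(["text", "textbox", "textbox"])
print(result)
```

In this toy grammar the input has two possible trees: one groups the two text boxes into a range under a single label, and the other attaches the second text box to an already complete label-and-box attribute. The chart keeps only the higher-scoring tree for each symbol and span, so the returned tree is the one whose rule probabilities multiply to the largest value.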
Dr. SU received his PhD from the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology in 2007. He obtained his master's degree from Xiamen University in 2002 and his bachelor's degree from China University of Petroleum in 1995. He has published papers in top conferences and journals, including ICDE, EDBT, TODS, TKDE and TWEB. His research interests include Deep Web, Data Mining, Machine Learning, Word Sense Disambiguation, and Natural Language Processing.