【Date/Time】The University of Tokyo, Faculty of Science Bldg. 7, Room 214 (Tsujii Lab)
# A dinner is also planned for the evening.
【Speaker】Ms. Ruihua Song
【Title】Identification of ambiguous queries in web search
【Abstract】It is widely believed that many queries submitted to search engines
are inherently ambiguous (e.g., java and apple). However, few studies
have tried to classify queries by ambiguity or to answer the question
"what proportion of queries is ambiguous?" Our work addresses these
issues. First, we clarify the definition of ambiguous queries by
constructing a taxonomy of queries ranging from ambiguous to
specific. Second, we ask human annotators to manually classify
queries. From the manually labeled results, we observe that query
ambiguity is to some extent predictable. Third, we propose a
supervised learning approach to automatically identify ambiguous
queries. Experimental results show that we can correctly identify 87%
of labeled queries with the approach. Finally, by using our approach,
we estimate that about 16% of queries in a real search log are
ambiguous.
Bio: Ms. Ruihua Song received B.E. and M.E. degrees from Tsinghua
University in 2000 and 2003, respectively. She then joined Microsoft
Research Asia, where she now works as a researcher in the Web Data
Management group. Her main research interests are Web information
retrieval and Web information extraction. She serves as a PC member
for SIGIR, SIGKDD, CIKM, ECIR, etc., and as a coordinator for NTCIR.
Her homepage is http://research.microsoft.com/users/rsong/.