Query Rewriting Using Shallow Language Processing: Effects on Keyword Subject Searches
Other Titles: Workshop proceedings
Authors: Mastora, Anna 
Kapidakis, Sarantos 
Issue Date: 27-Sep-2012
Conference: International Workshop on Supporting User’s Exploration on Digital Libraries 
Keywords: Digital libraries, Inflectional languages, Natural Language Processing, Spelling, Lemmatising, Query patterns
Abstract: 
This study investigates and reports on the potential implications of using shallow language processing to rewrite keyword subject queries in Greek. The processing comprises a speller and a lemmatiser, along with stop word removal and query normalisation with respect to punctuation. Among our findings is that users tend to submit morphologically variant words, which the Aspell spell checking and correction tool processes consistently in 98.7% of cases. After applying the spell checker, we recorded a semantic drift from the initial query intent in approximately 8.2% of all submitted queries. The lemmatiser (ilsp_nlp) performs extremely well for the words it identifies: of the initial 750 queries we submitted to the tool, only five led to a semantic drift. However, the lemmatiser recognises neither misspelled nor truncated words, which remained unaltered during this step of the process. We therefore conclude that, for the examined data, spelling before lemmatising is the appropriate order in which to apply this shallow language processing.
URI: https://uniwacris.uniwa.gr/handle/3000/387
Type: Conference Paper
Department: Department of Archival, Library and Information Studies 
School: School of Administrative, Economics and Social Sciences 
Affiliation: University of West Attica (UNIWA) 
Appears in Collections: Conference Papers or Poster or Presentation / Δημοσιεύσεις σε Συνέδρια
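
The processing order described in the abstract (punctuation normalisation, stop word removal, spelling, then lemmatising) can be illustrated with a minimal sketch. This is not the authors' implementation: the study used Aspell and the ilsp_nlp lemmatiser on Greek queries, whereas the lookup tables, English toy words, and function names below are hypothetical stand-ins chosen only to show why the order of the two steps matters.

```python
import string

# Hypothetical toy resources (assumptions for illustration only).
# The study itself used Aspell and the ilsp_nlp lemmatiser on Greek queries.
STOP_WORDS = {"the", "of", "in", "and"}
SPELLING = {"librareis": "libraries", "digtal": "digital"}
LEMMAS = {"libraries": "library", "digital": "digital", "archives": "archive"}


def normalise(query):
    """Lowercase the query, replace punctuation with spaces, and tokenise."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return query.lower().translate(table).split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def spell(tokens):
    """Replace known misspellings; leave every other token unchanged."""
    return [SPELLING.get(t, t) for t in tokens]


def lemmatise(tokens):
    """Map known word forms to their lemma; unknown forms pass through unaltered."""
    return [LEMMAS.get(t, t) for t in tokens]


def rewrite(query, spell_first=True):
    """Rewrite a query, applying the speller and lemmatiser in the chosen order."""
    tokens = remove_stop_words(normalise(query))
    steps = (spell, lemmatise) if spell_first else (lemmatise, spell)
    for step in steps:
        tokens = step(tokens)
    return tokens


if __name__ == "__main__":
    query = "The digtal librareis, of archives!"
    print(rewrite(query, spell_first=True))
    print(rewrite(query, spell_first=False))
```

In this toy setting, lemmatising first leaves the misspelled form untouched, so the later spelling correction yields an unlemmatised word. Spelling first lets the lemmatiser see a form it recognises, which mirrors the abstract's observation that the lemmatiser leaves misspelled words unaltered.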