Teach Time Encyclopedia - Learn About Our World

Friday, December 05, 2008

Word sense disambiguation

In computational linguistics, word sense disambiguation (WSD) is the problem of determining which sense of a word with several distinct senses is being used in a given sentence. For example, consider the word "bass", two distinct senses of which are:
  1. a type of fish
  2. tones of low frequency
and the sentences "The bass part of the song is very moving" and "I went fishing for some sea bass". To a human it is obvious that the first sentence uses "bass" in sense 2 above and the second uses it in sense 1. Developing algorithms that replicate this human ability, however, is a difficult task.

One problem with word sense disambiguation is deciding what the senses are. In cases like the word "bass" above, at least some senses are obviously different. In other cases, however, the different senses can be closely related (one meaning being a metaphorical or metonymic extension of another), and in those cases dividing a word into senses becomes much more difficult. Different dictionaries divide the same words into senses in many different ways. One solution some researchers have used is to choose a particular dictionary and simply use its set of senses. Generally, however, research results using broad sense distinctions have been much better than those using narrow ones, so most researchers ignore fine-grained distinctions in their work.

Another problem is interjudge variance. WSD systems are normally tested by comparing their results on a task against those of a human. However, humans do not agree on the task at hand: given a list of senses and sentences, they will not always agree on which sense a word has in each sentence. A computer cannot be expected to perform better on such a task than a human (indeed, since the human serves as the standard, the idea of the computer being better than the human is incoherent), so human performance serves as an upper bound. Human agreement, however, is much higher on coarse-grained than on fine-grained distinctions, which is another reason why research on coarse-grained distinctions is most useful.
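As a rough illustration of interjudge variance, the sketch below measures how often two human judges assign the same sense to the same occurrences of a word. The judge labels are invented for the example. Cohen's kappa, a common chance-corrected agreement statistic, is not mentioned in the text above and is included here only as one assumed way to quantify agreement.

```python
from collections import Counter

def observed_agreement(labels_a, labels_b):
    """Fraction of items on which two judges chose the same sense."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both judges must label the same items")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

def cohen_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    p_observed = observed_agreement(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: both judges pick the same sense independently.
    p_chance = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both judges always use the same single label
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical sense tags from two judges for six occurrences of "bass".
judge_1 = ["fish", "music", "music", "fish", "music", "fish"]
judge_2 = ["fish", "music", "fish", "fish", "music", "music"]

agreement = observed_agreement(judge_1, judge_2)
kappa = cohen_kappa(judge_1, judge_2)
print(f"observed agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

Under the argument above, a system scored against one of these judges could not meaningfully exceed the agreement the judges reach with each other.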

As in all natural language processing, there are two main approaches to WSD: deep approaches and shallow approaches. We shall deal with the deep approaches first.

Deep approaches presume access to a comprehensive body of world knowledge. Knowledge such as "you can go fishing for a type of fish, but not for low frequency sounds" and "songs have low frequency sounds as parts, but not types of fish" is then used to determine in which sense the word is used. These approaches are not very successful in practice, mainly because we don't have access to such a body of knowledge, except in very limited domains. But if such knowledge did exist, deep approaches would be much better than shallow ones.
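A toy sketch of the deep idea follows, assuming the two facts quoted above are the entire knowledge base and that some earlier step (not shown) has already worked out how "bass" relates to another word in the sentence. The relation names, sense labels, and semantic types are all hypothetical.

```python
# Hypothetical semantic type of each sense of "bass".
SENSE_TYPES = {"bass/fish": "animal", "bass/music": "sound"}

# Hypothetical world knowledge: (relation, other word) -> required type.
CONSTRAINTS = {
    ("object_of", "fishing"): "animal",  # you can go fishing for a fish
    ("part_of", "song"): "sound",        # songs have sounds as parts
}

def disambiguate_deep(relation, other_word):
    """Pick the sense whose type satisfies the known constraint, if any."""
    required = CONSTRAINTS.get((relation, other_word))
    if required is None:
        return None  # the knowledge base says nothing about this case
    candidates = [s for s, t in SENSE_TYPES.items() if t == required]
    return candidates[0] if len(candidates) == 1 else None

print(disambiguate_deep("object_of", "fishing"))
print(disambiguate_deep("part_of", "song"))
print(disambiguate_deep("object_of", "guitar"))
```

The third call returns no answer because nothing is known about that pairing, which mirrors the practical weakness described above: coverage is only as good as the knowledge that has been written down.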

Shallow approaches don't try to understand the text. They just consider the surrounding words, using information like "if 'bass' has the words 'sea' or 'fishing' nearby, it is probably in the fish sense; if 'bass' has the words 'music' or 'song' nearby, it is probably in the music sense." These rules can be automatically derived by the computer, using a training corpus of words tagged with their word senses. This approach, while theoretically not as powerful as deep approaches, gives superior results in practice, due to our limited world knowledge. It can, though, be confused by sentences like "The dog barked at the tree", in which "barked" appears near both "dog" and "tree", words that point to two different senses of "bark".
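The cue-word rule quoted above can be sketched directly. This is a minimal hand-written version, not rules derived from a corpus; the window size, the tokenizer, and the choice to give no answer on a tie are assumptions made for the example.

```python
import re

# Cue words taken from the rule quoted in the text.
CUE_WORDS = {
    "fish": {"sea", "fishing"},
    "music": {"music", "song"},
}

def disambiguate_shallow(sentence, target="bass", window=5):
    """Guess the sense of `target` from cue words within `window` tokens."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if target not in tokens:
        return None
    i = tokens.index(target)
    nearby = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
    scores = {sense: len(cues & nearby) for sense, cues in CUE_WORDS.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    # No cue word nearby, or cues for several senses tied: give no answer.
    if top == 0 or list(scores.values()).count(top) > 1:
        return None
    return best

print(disambiguate_shallow("I went fishing for some sea bass"))
print(disambiguate_shallow("The bass part of the song is very moving"))
```

A sentence that contains cues for both senses produces a tie here, which is the same kind of confusion the paragraph above describes.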

These approaches normally work by defining a window of N content words around each word to be disambiguated in the corpus, and statistically analyzing those N surrounding words. Two shallow approaches used to train and then disambiguate are naïve Bayes classifiers and decision lists.
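The windowing step and a naïve Bayes classifier can be sketched as follows. This is a minimal illustration under several assumptions: the stopword list, the four-sentence training corpus, the window of three content words on each side, and add-one smoothing (with one extra slot reserved for unseen words) are all choices made for the example, not details given above.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed list of function words to skip when collecting content words.
STOPWORDS = {"the", "a", "an", "of", "for", "i", "is", "some", "to",
             "in", "and", "on", "was", "very"}

def context_window(sentence, target, n=3):
    """Return up to n content words on each side of the target word."""
    tokens = [t for t in re.findall(r"[a-z']+", sentence.lower())
              if t not in STOPWORDS]
    if target not in tokens:
        return []
    i = tokens.index(target)
    return tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]

class NaiveBayesWSD:
    def __init__(self, target, n=3):
        self.target = target
        self.n = n
        self.sense_counts = Counter()            # sentences per sense
        self.word_counts = defaultdict(Counter)  # context words per sense
        self.vocab = set()

    def train(self, tagged_sentences):
        for sentence, sense in tagged_sentences:
            self.sense_counts[sense] += 1
            for word in context_window(sentence, self.target, self.n):
                self.word_counts[sense][word] += 1
                self.vocab.add(word)

    def classify(self, sentence):
        context = context_window(sentence, self.target, self.n)
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log P(sense) + sum of log P(word | sense), add-one smoothed.
            score = math.log(count / total)
            denom = sum(self.word_counts[sense].values()) + len(self.vocab) + 1
            for word in context:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense

# Hypothetical sense-tagged training corpus.
corpus = [
    ("I went fishing for some sea bass", "fish"),
    ("He caught a huge bass in the lake", "fish"),
    ("The bass part of the song is very moving", "music"),
    ("She plays bass guitar in a band", "music"),
]
model = NaiveBayesWSD("bass")
model.train(corpus)
print(model.classify("We grilled the sea bass we caught"))
print(model.classify("The bass guitar solo in that song"))
```

With a corpus this small the classifier is only picking up on a handful of repeated words; a realistic system would need far more tagged examples per sense.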

It is instructive to compare the WSD problem with the problem of part-of-speech (POS) tagging. Both involve disambiguating or tagging words, be it with senses or with parts of speech. However, algorithms used for one do not tend to work well for the other, mainly because the part of speech of a word is primarily determined by the one to three immediately adjacent words, whereas the sense of a word may be determined by words considerably further away. The success rate for POS tagging algorithms is at present much higher than that for WSD.



Brought to you by NoChildLeftBehind.com and the Beaches and Towns Network, LLC.