emolib.wsd.simlib
Class SimilarityAssessor

java.lang.Object
  extended by emolib.wsd.simlib.SimilarityAssessor

public class SimilarityAssessor
extends java.lang.Object

Title: Java WordNet Similarity

Description: Assesses the semantic similarity between a pair of words as described in Seco, N., Veale, T., Hayes, J. (2004) "An Intrinsic Information Content Metric for Semantic Similarity in WordNet". In Proceedings of the European Conference on Artificial Intelligence.

This class is responsible for the similarity calculations. Note that Documents in the context of this class correspond to synsets. Each Document structure holds the synset offset, the list of words in the synset, and a list containing all hypernym offsets. For computational simplicity when finding the best MSCA (most specific common abstraction), the list of hypernyms also contains the synset of the current document itself.
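
As a rough illustration of the metric in the referenced paper, the sketch below computes intrinsic information content from hyponym counts and combines it with a Jiang-Conrath style similarity. The class IntrinsicIcSketch, its methods, and the four-node toy taxonomy are hypothetical and are not part of emolib; the actual implementation may differ in detail. The sketch also shows why keeping each synset in its own hypernym list helps: intersecting the two lists already covers the case where both inputs are the same synset or one subsumes the other.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IntrinsicIcSketch {

    /** Intrinsic IC: 1 - log(hypo + 1) / log(maxNodes). A leaf gets 1, the root gets 0. */
    static double ic(int hyponymCount, int maxNodes) {
        return 1.0 - Math.log(hyponymCount + 1) / Math.log(maxNodes);
    }

    /**
     * IC of the MSCA: the highest IC among synsets present in both hypernym lists.
     * Each list is assumed to include the synset itself. Returns 0 if nothing is shared.
     */
    static double mscaIc(List<Integer> hypernymsA, List<Integer> hypernymsB, int[] hyponymCounts) {
        Set<Integer> shared = new HashSet<>(hypernymsA);
        shared.retainAll(hypernymsB);
        double best = 0.0;
        for (int synset : shared) {
            best = Math.max(best, ic(hyponymCounts[synset], hyponymCounts.length));
        }
        return best;
    }

    /** Jiang-Conrath style similarity in [0, 1], built on the intrinsic IC values. */
    static double similarity(int a, List<Integer> hypernymsA,
                             int b, List<Integer> hypernymsB, int[] hyponymCounts) {
        int n = hyponymCounts.length;
        double icA = ic(hyponymCounts[a], n);
        double icB = ic(hyponymCounts[b], n);
        return 1.0 - (icA + icB - 2.0 * mscaIc(hypernymsA, hypernymsB, hyponymCounts)) / 2.0;
    }

    public static void main(String[] args) {
        // Toy taxonomy with made-up ids: 0 = entity, 1 = animal, 2 = dog, 3 = cat.
        // Hyponym counts: entity has 3 descendants, animal has 2, dog and cat have none.
        int[] hyponymCounts = {3, 2, 0, 0};
        List<Integer> dog = Arrays.asList(2, 1, 0); // own id first, then hypernyms
        List<Integer> cat = Arrays.asList(3, 1, 0);

        System.out.println(similarity(2, dog, 3, cat, hyponymCounts)); // roughly 0.2075
        System.out.println(similarity(2, dog, 2, dog, hyponymCounts)); // identical synsets: 1.0
    }
}
```

In this toy taxonomy the only informative shared ancestor of dog and cat is animal, so the score stays low; a real WordNet hierarchy is far deeper and yields finer-grained values.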

Copyright: Nuno Seco Copyright (c) 2004

Version:
1.0
Author:
Nuno Seco

Constructor Summary
SimilarityAssessor()
          No-argument constructor.
SimilarityAssessor(java.lang.String wnIndexPath)
          Constructor that obtains an instance of an Index Broker.
 
Method Summary
 org.apache.lucene.search.Hits getHits(java.lang.String query)
          Returns the list of documents that fulfill the given query.
 double getSenseSimilarity(java.lang.String word1, int senseForWord1, java.lang.String word2, int senseForWord2)
          Calculates the similarity between two specific senses.
 double getSimilarity(java.lang.String word1, java.lang.String word2)
          Calculates the similarity between the two words, given as parameters, according to the referenced paper.
 java.lang.String getWordsField()
          Function to retrieve the WORDS field from the broker.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimilarityAssessor

public SimilarityAssessor(java.lang.String wnIndexPath)
The constructor. Obtains an instance of an Index Broker.


SimilarityAssessor

public SimilarityAssessor()
No-argument constructor.

Method Detail

getHits

public org.apache.lucene.search.Hits getHits(java.lang.String query)
Returns the list of documents that fulfill the given query.

Parameters:
query - String The query to be searched
Returns:
Hits A list of hits

getWordsField

public java.lang.String getWordsField()
Function to retrieve the WORDS field from the broker.

Returns:
The WORDS field.

getSenseSimilarity

public double getSenseSimilarity(java.lang.String word1,
                                 int senseForWord1,
                                 java.lang.String word2,
                                 int senseForWord2)
                          throws WordNotFoundException
Calculates the similarity between two specific senses.

Parameters:
word1 - String The first word
senseForWord1 - int The sense number for the first word
word2 - String The second word
senseForWord2 - int The sense number for the second word
Returns:
double The degree of similarity between the words; 0 means no similarity and 1 means that they may belong to the same synset.
Throws:
WordNotFoundException - Thrown if either word is not contained in the WordNet dictionary.
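
A minimal sketch of how a caller might search for the best-scoring sense pair. The SenseScorer interface is a hypothetical stand-in whose single method mirrors the getSenseSimilarity signature above, so the sketch compiles without emolib. It assumes that sense numbers start at 1 and that the caller already knows how many senses each word has, since this page documents no method for querying sense counts.

```java
final class BestSensePair {

    /** Hypothetical stand-in mirroring getSenseSimilarity(word1, sense1, word2, sense2). */
    interface SenseScorer {
        double getSenseSimilarity(String word1, int senseForWord1,
                                  String word2, int senseForWord2) throws Exception;
    }

    /**
     * Tries every sense combination and returns {senseForWord1, senseForWord2}
     * for the highest score. Sense numbers are assumed to be 1-based.
     */
    static int[] bestPair(SenseScorer scorer, String word1, int senseCount1,
                          String word2, int senseCount2) {
        int[] best = {1, 1};
        double bestScore = -1.0;
        for (int s1 = 1; s1 <= senseCount1; s1++) {
            for (int s2 = 1; s2 <= senseCount2; s2++) {
                double score;
                try {
                    score = scorer.getSenseSimilarity(word1, s1, word2, s2);
                } catch (Exception e) {
                    // With the real class this would be a WordNotFoundException.
                    throw new IllegalArgumentException("Word not found: " + word1 + " / " + word2, e);
                }
                if (score > bestScore) {
                    bestScore = score;
                    best = new int[] {s1, s2};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Fake scorer for demonstration only; a method reference to a real
        // SimilarityAssessor instance could be passed here instead.
        SenseScorer fake = (w1, s1, w2, s2) -> (s1 == 2 && s2 == 1) ? 0.9 : 0.1;
        int[] pair = bestPair(fake, "bank", 3, "river", 2);
        System.out.println(pair[0] + ", " + pair[1]); // prints "2, 1"
    }
}
```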

getSimilarity

public double getSimilarity(java.lang.String word1,
                            java.lang.String word2)
                     throws WordNotFoundException
Calculates the similarity between the two words, given as parameters, according to the referenced paper.

Parameters:
word1 - String The first word
word2 - String The second word
Returns:
double The degree of similarity between the words; 0 means no similarity and 1 means that they may belong to the same synset.
Throws:
WordNotFoundException - Thrown if either word is not contained in the WordNet dictionary.
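
One possible use of the word-level score is ranking candidate words against a target, skipping any word that is not in the dictionary. In the sketch below, WordScorer is a hypothetical stand-in that mirrors the getSimilarity signature so the example runs without emolib; the generic Exception takes the place of WordNotFoundException, and the scores come from a fake scorer, not from WordNet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RankBySimilarity {

    /** Hypothetical stand-in mirroring getSimilarity(word1, word2). */
    interface WordScorer {
        double getSimilarity(String word1, String word2) throws Exception;
    }

    /** Returns the candidates that could be scored, most similar to the target first. */
    static List<String> rank(WordScorer scorer, String target, List<String> candidates) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (String candidate : candidates) {
            try {
                scores.put(candidate, scorer.getSimilarity(target, candidate));
            } catch (Exception notInDictionary) {
                // With the real class, catch WordNotFoundException here and skip the word.
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((String w) -> scores.get(w)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Fake scorer for demonstration only.
        WordScorer fake = (a, b) -> {
            if (b.equals("xyzzy")) {
                throw new Exception("not in dictionary");
            }
            return b.equals("cat") ? 0.8 : 0.3;
        };
        System.out.println(rank(fake, "dog", Arrays.asList("car", "xyzzy", "cat"))); // prints [cat, car]
    }
}
```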