ca.site.elkb
Class RogetELKB

java.lang.Object
  extended by ca.site.elkb.RogetELKB

public class RogetELKB
extends java.lang.Object

Main class of the Roget's Thesaurus Electronic Lexical KnowledgeBase. It is made up of three major components: the Index, the Tabular Synopsis of Categories, and the Text.

Required files: These files are found in the $HOME/roget_elkb directory.

Version:
1.4 2013
Author:
Mario Jarmasz and Alistair Kennedy

Field Summary
static java.lang.String CATEG
          Location of the ELKB Tabular Synopsis of Categories.
 Category category
          The ELKB Tabular Synopsis of Categories.
static java.lang.String ELKB_PATH
          Location of the ELKB data directory.
static java.lang.String HEADS
          Location of the Heads.
 Index index
          The ELKB Index.
static java.lang.String INDEX
          Location of the ELKB Index.
static java.lang.String PATH_1911
          Year 1911.
static java.lang.String PATH_1911R
          Year 1911R.
static java.lang.String PATH_1911X
          Year 1911X.
static java.lang.String PATH_1987
          Year 1987.
static java.lang.String PATH_1987R
          Year 1987R.
static java.lang.String PATH_1987X
          Year 1987X.
static java.lang.String PATH_ELKB
          Holds the real path.
 RogetText text
          The ELKB Text.
static java.lang.String USER_HOME
          Location of user's Home directory.
 
Constructor Summary
RogetELKB()
          Default constructor.
RogetELKB(int year)
           
RogetELKB(java.lang.String year)
          Non-default constructor.
RogetELKB(java.lang.String year, boolean brokenUpPhrases)
          Constructor allows you to choose between an index with broken phrases and one without.
 
Method Summary
 java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1, java.lang.String strWord2)
          Returns all the paths between two words or phrases.
 java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1, java.lang.String strWord2, java.lang.String POS)
          Returns all the paths between two words or phrases of a given part-of-speech.
 java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1, java.lang.String POS1, java.lang.String strWord2, java.lang.String POS2)
          Used to help compute distances for the analogy problem.
 java.util.ArrayList<java.lang.String> getGrouping(java.lang.String identifier)
          Takes an identifier in the form of "1.2.5.3.2" where numbers represent the class, section, subsection, head group, head, POS, paragraph, semicolon group, word.
 java.util.ArrayList<java.lang.String> getHead(int head)
           
 java.util.ArrayList<java.lang.String> getPara(int head, int POS, int para)
           
 java.util.ArrayList<java.lang.String> getPOS(int head, int POS)
           
 java.util.ArrayList<java.lang.String> getSG(int head, int POS, int para, int sg)
           
 void lookUpWordInIndex()
          Looks up a word or phrase in the Index and returns all possible references.
static void main(java.lang.String[] args)
          Allows the ELKB to be used via the command line.
 Path path(java.lang.String strWord1, java.lang.String strRef1, java.lang.String strWord2, java.lang.String strRef2)
          Calculates the path between two senses of words or phrases.
 java.lang.String t1Relation(java.lang.String strWord1, int iHeadNum1, java.lang.String sRefName1, java.lang.String sPos1, java.lang.String strWord2)
          Determines the thesaural relation that exists between a specific sense of a word or phrase and another word or phrase.
 java.lang.String t1Relation(java.lang.String strWord1, java.lang.String strWord2)
          Determines the thesaural relation that exists between two words or phrases.
 java.lang.String t1RelationHeadOnly(java.lang.String strWord1, int iHeadNum1, java.lang.String sRefName1, java.lang.String sPos1, java.lang.String strWord2)
          Determines the thesaural relation that exists between a specific sense of a word or phrase and another word or phrase.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

USER_HOME

public static final java.lang.String USER_HOME
Location of user's Home directory.


ELKB_PATH

public static final java.lang.String ELKB_PATH
Location of the ELKB data directory.
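
The class description places the required files in the $HOME/roget_elkb directory. The following is a minimal sketch of how that location could be resolved and checked before constructing a RogetELKB. The class name ElkbDirSketch and its methods are hypothetical and not part of the ELKB. The sketch assumes the directory sits directly under the Java user.home system property, and how the library itself computes ELKB_PATH is not shown here.

```java
import java.io.File;

class ElkbDirSketch {
    /** Builds <user.home>/roget_elkb, the directory named in the class description. */
    static File defaultDataDir() {
        // Assumption: $HOME corresponds to the Java "user.home" system property.
        return new File(System.getProperty("user.home"), "roget_elkb");
    }

    /** Reports whether the assumed data directory is present on this machine. */
    static boolean dataDirExists() {
        return defaultDataDir().isDirectory();
    }

    public static void main(String[] args) {
        System.out.println(defaultDataDir() + " exists: " + dataDirExists());
    }
}
```

A check of this kind gives a clear message when the data files are missing, before any ELKB component tries to load them.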


INDEX

public static final java.lang.String INDEX
Location of the ELKB Index.

See Also:
Constant Field Values

CATEG

public static final java.lang.String CATEG
Location of the ELKB Tabular Synopsis of Categories.

See Also:
Constant Field Values

HEADS

public static final java.lang.String HEADS
Location of the Heads.

See Also:
Constant Field Values

PATH_1987

public static final java.lang.String PATH_1987
Year 1987.


PATH_1911

public static final java.lang.String PATH_1911
Year 1911.


PATH_1911X

public static final java.lang.String PATH_1911X
Year 1911X.


PATH_1987X

public static final java.lang.String PATH_1987X
Year 1987X.


PATH_1911R

public static final java.lang.String PATH_1911R
Year 1911R.


PATH_1987R

public static final java.lang.String PATH_1987R
Year 1987R.


PATH_ELKB

public static java.lang.String PATH_ELKB
Holds the real path.


index

public Index index
The ELKB Index.


category

public Category category
The ELKB Tabular Synopsis of Categories.


text

public RogetText text
The ELKB Text.

Constructor Detail

RogetELKB

public RogetELKB()
Default constructor.


RogetELKB

public RogetELKB(int year)

RogetELKB

public RogetELKB(java.lang.String year)
Non-default constructor.

Parameters:
year -

RogetELKB

public RogetELKB(java.lang.String year,
                 boolean brokenUpPhrases)
Constructor allows you to choose between an index with broken phrases and one without.

Parameters:
year -
brokenUpPhrases -
Method Detail

main

public static void main(java.lang.String[] args)
Allows the ELKB to be used via the command line.

Parameters:
args -

getGrouping

public java.util.ArrayList<java.lang.String> getGrouping(java.lang.String identifier)
Takes an identifier in the form of "1.2.5.3.2", where the numbers represent, in order, the class, section, subsection, head group, head, POS, paragraph, semicolon group, and word. The identifier may name a grouping at any level, so it need not contain all 9 numbers. Returns null if the identifier is invalid.

Parameters:
identifier -
Returns:
ArrayList of Strings
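
The dotted identifier can be read level by level. The sketch below illustrates that reading only: it splits an identifier into named levels in the order given in the description. GroupingIdSketch and parse are hypothetical names and are not part of the ELKB, and the validation rules (digits only, at most 9 parts) are assumptions. The real getGrouping also checks the identifier against the thesaurus data, which this sketch does not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GroupingIdSketch {
    // Level names, in the order listed in the getGrouping description.
    static final String[] LEVELS = {
        "class", "section", "subsection", "head group", "head",
        "POS", "paragraph", "semicolon group", "word"
    };

    /**
     * Splits a dotted identifier into named levels.
     * Returns null for a malformed identifier (assumed rules: 1 to 9 parts,
     * each part made of digits only).
     */
    static Map<String, Integer> parse(String identifier) {
        if (identifier == null || identifier.isEmpty()) {
            return null;
        }
        // The -1 limit keeps trailing empty parts, so "1.2." is rejected.
        String[] parts = identifier.split("\\.", -1);
        if (parts.length > LEVELS.length) {
            return null;
        }
        Map<String, Integer> levels = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].matches("\\d+")) {
                return null;
            }
            try {
                levels.put(LEVELS[i], Integer.parseInt(parts[i]));
            } catch (NumberFormatException e) {
                return null; // digit string too large for an int
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(parse("1.2.5.3.2"));
    }
}
```

Under this reading, "1.2.5.3.2" stops at the head level, so it names a whole head rather than a single word.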

getSG

public java.util.ArrayList<java.lang.String> getSG(int head,
                                                   int POS,
                                                   int para,
                                                   int sg)

getPara

public java.util.ArrayList<java.lang.String> getPara(int head,
                                                     int POS,
                                                     int para)

getPOS

public java.util.ArrayList<java.lang.String> getPOS(int head,
                                                    int POS)

getHead

public java.util.ArrayList<java.lang.String> getHead(int head)

getAllPaths

public java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1,
                                           java.lang.String strWord2)
Returns all the paths between two words or phrases. The paths are sorted from the smallest to the biggest distance. A set of size 0 indicates that an error occurred when determining the paths.

Parameters:
strWord1 -
strWord2 -
Returns:
TreeSet of paths
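
Because the result is a TreeSet sorted by distance, the shortest path is the first element and an empty set is the error signal. The sketch below shows that calling pattern with a hypothetical stand-in class, SketchPath, since the ELKB Path class is not reproduced here. Its fields, its ordering, and the distance values used are assumptions made for illustration.

```java
import java.util.TreeSet;

class SortedPathsSketch {
    /** Hypothetical stand-in for the ELKB Path class: a distance plus a label. */
    static final class SketchPath implements Comparable<SketchPath> {
        final int distance;
        final String label;

        SketchPath(int distance, String label) {
            this.distance = distance;
            this.label = label;
        }

        @Override
        public int compareTo(SketchPath other) {
            // Order by distance first; break ties on the label so that
            // distinct paths with equal distance are not dropped by the set.
            int byDistance = Integer.compare(distance, other.distance);
            return byDistance != 0 ? byDistance : label.compareTo(other.label);
        }
    }

    /** Returns the shortest path, or null when the set is empty (the error case). */
    static SketchPath closest(TreeSet<SketchPath> paths) {
        return paths.isEmpty() ? null : paths.first();
    }

    public static void main(String[] args) {
        TreeSet<SketchPath> paths = new TreeSet<>();
        paths.add(new SketchPath(16, "far"));
        paths.add(new SketchPath(0, "same group"));
        paths.add(new SketchPath(4, "same head"));
        System.out.println(closest(paths).label);
    }
}
```

Callers that only need the best match can stop at the first element, while callers that need every sense pairing can iterate the set in order.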

getAllPaths

public java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1,
                                           java.lang.String strWord2,
                                           java.lang.String POS)
Returns all the paths between two words or phrases of a given part-of-speech. The part-of-speech can be any of N., VB., ADJ., ADV. The paths are sorted from the smallest to the biggest distance.

Parameters:
strWord1 -
strWord2 -
POS -
Returns:
TreeSet of paths
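
The part-of-speech argument is a plain string, so a typo such as "N" instead of "N." would not be caught by the compiler. The sketch below checks a value against the four tags listed above before it is passed on. PosTagSketch and isKnownPos are hypothetical names, not part of the ELKB, and whether the library accepts other tags is not stated here.

```java
import java.util.Arrays;
import java.util.List;

class PosTagSketch {
    // The four part-of-speech tags named in the getAllPaths description.
    static final List<String> POS_TAGS = Arrays.asList("N.", "VB.", "ADJ.", "ADV.");

    /** Returns true only for an exact match, including the trailing period. */
    static boolean isKnownPos(String pos) {
        return pos != null && POS_TAGS.contains(pos);
    }

    public static void main(String[] args) {
        System.out.println(isKnownPos("N."));
        System.out.println(isKnownPos("N"));
    }
}
```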

getAllPaths

public java.util.TreeSet<Path> getAllPaths(java.lang.String strWord1,
                                           java.lang.String POS1,
                                           java.lang.String strWord2,
                                           java.lang.String POS2)
Used to help compute distances for the analogy problem. Returns all the paths between two words or phrases, each with its own part-of-speech. Each part-of-speech can be any of N., VB., ADJ., ADV. Note that the two words need not have the same part-of-speech. The paths are sorted from the smallest to the biggest distance.

Parameters:
strWord1 -
POS1 -
strWord2 -
POS2 -
Returns:
TreeSet of paths

t1RelationHeadOnly

public java.lang.String t1RelationHeadOnly(java.lang.String strWord1,
                                           int iHeadNum1,
                                           java.lang.String sRefName1,
                                           java.lang.String sPos1,
                                           java.lang.String strWord2)
Determines the thesaural relation that exists between a specific sense of a word or phrase and another word or phrase. There are two kinds of thesaural relations:

t1Relation

public java.lang.String t1Relation(java.lang.String strWord1,
                                   int iHeadNum1,
                                   java.lang.String sRefName1,
                                   java.lang.String sPos1,
                                   java.lang.String strWord2)
Determines the thesaural relation that exists between a specific sense of a word or phrase and another word or phrase. There are two kinds of thesaural relations:

t1Relation

public java.lang.String t1Relation(java.lang.String strWord1,
                                   java.lang.String strWord2)
Determines the thesaural relation that exists between two words or phrases. There are two kinds of thesaural relations:

path

public Path path(java.lang.String strWord1,
                 java.lang.String strRef1,
                 java.lang.String strWord2,
                 java.lang.String strRef2)
Calculates the path between two senses of words or phrases. The references are used to identify the senses. They must be supplied in the following format:

Parameters:
strWord1 -
strRef1 -
strWord2 -
strRef2 -
Returns:
Path between the two word senses

lookUpWordInIndex

public void lookUpWordInIndex()
Looks up a word or phrase in the Index and returns all possible references.