WordNet::Synset class

WordNet synonym-set object class

Instances of this class encapsulate the data for a synonym set (‘synset’) in a WordNet lexical database. A synonym set is a set of words that are interchangeable in some context.

You can either fetch the synset from a connected Lexicon:

lexicon = WordNet::Lexicon.new( 'postgres://localhost/wordnet31' )
ss = lexicon[ :first, 'time' ]
# => #<WordNet::Synset:0x7ffbf2643bb0 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

or, if you’ve already created a Lexicon, use its connection indirectly to look up a Synset by its ID:

ss = WordNet::Synset[ 115265518 ]
# => #<WordNet::Synset:0x7ffbf257e928 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

You can fetch a list of the lemmas (base forms) of the words included in the synset:

ss.words.map( &:lemma )
# => ["commencement", "first", "get-go", "offset", "outset", "start",
#     "starting time", "beginning", "kickoff", "showtime"]

The main use of a synset, though, is its lexical and semantic links to other words and synsets. For instance, its hypernym is the equivalent of its superclass: the class of things of which the receiving synset is a member.

ss.hypernyms
# => [#<WordNet::Synset:0x7ffbf25c76c8 {115180528} 'point, point in
#        time' (noun): [noun.time] an instant of time>]

The synset’s hyponyms, on the other hand, are kind of like its subclasses:

ss.hyponyms
# => [#<WordNet::Synset:0x7ffbf25d83b0 {115142167} 'birth' (noun):
#       [noun.time] the time when something begins (especially life)>,
#     #<WordNet::Synset:0x7ffbf25d8298 {115268993} 'threshold' (noun):
#       [noun.time] the starting point for a new state or experience>,
#     #<WordNet::Synset:0x7ffbf25d8180 {115143012} 'incipiency,
#       incipience' (noun): [noun.time] beginning to exist or to be
#       apparent>,
#     #<WordNet::Synset:0x7ffbf25d8068 {115266164} 'starting point,
#       terminus a quo' (noun): [noun.time] earliest limiting point>]

Traversal

Synset also provides a few ‘traversal’ methods for recursively searching a Synset’s semantic links:

# Recursively search for more-general terms for the synset, and print out
# each one with indentation according to how distantly it's related.
lexicon[ :fencing, 'sword' ].
    traverse(:hypernyms).with_depth.
    each {|ss, depth| puts "%s%s [%d]" % ['  ' * (depth-1), ss.words.first, ss.synsetid] }
# (outputs:)
play [100041468]
  action [100037396]
    act [100030358]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
combat [101170962]
  battle [100958896]
    group action [101080366]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
      act [100030358]
        event [100029378]
          psychological feature [100023100]
            abstract entity [100002137]
              entity [100001740]

See the Traversal Methods section for more details.
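
The depth-annotated walk shown above can be approximated in plain Ruby. This is only a sketch, not the library's implementation: HYPERNYMS is a hand-written toy graph loosely modelled on the output above, and traverse_with_depth is a hypothetical helper standing in for traverse(:hypernyms).with_depth.

```ruby
# Toy hypernym graph (hypothetical data, not read from a WordNet database).
HYPERNYMS = {
  'fencing'      => ['play', 'combat'],
  'play'         => ['action'],
  'action'       => ['act'],
  'act'          => ['event'],
  'combat'       => ['battle'],
  'battle'       => ['group action'],
  'group action' => ['event', 'act'],
  'event'        => [],
}.freeze

# Depth-first walk that yields each more-general term together with its
# distance from the starting term. Returns an Enumerator without a block.
def traverse_with_depth( term, depth = 1, &block )
  return enum_for( :traverse_with_depth, term, depth ) unless block

  HYPERNYMS.fetch( term, [] ).each do |parent|
    yield parent, depth
    traverse_with_depth( parent, depth + 1, &block )
  end
end

# Print each term indented by how distantly it is related.
traverse_with_depth( 'fencing' ).each do |name, depth|
  puts "%s%s" % [ '  ' * (depth - 1), name ]
end
```

Because a term can be reached along more than one path, it may be visited more than once, which is why 'event' and 'act' repeat in the output.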

Low-Level API

This library is implemented using Sequel::Model, an ORM layer on top of the excellent Sequel database toolkit. This means that in addition to the high-level methods above, you can also make use of a database-oriented API if you need to do something not provided by a high-level method.

In order to make use of this API, you’ll need to be familiar with Sequel, especially Datasets and Model Associations. Most of Ruby-WordNet’s functionality is implemented in terms of one or both of these.

Datasets

The main dataset is available from WordNet::Synset.dataset:

WordNet::Synset.dataset
# => #<Sequel::SQLite::Dataset: "SELECT * FROM `synsets`">

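In addition to this, Synset also defines a few other canned datasets. To facilitate searching by part of speech on the Synset class:

  • WordNet::Synset.nouns

  • WordNet::Synset.verbs

  • WordNet::Synset.adjectives

  • WordNet::Synset.adverbs

  • WordNet::Synset.adjective_satellites

or by the semantic links for a particular Synset: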

  • WordNet::Synset#also_see_dataset

  • WordNet::Synset#attributes_dataset

  • WordNet::Synset#causes_dataset

  • WordNet::Synset#domain_categories_dataset

  • WordNet::Synset#domain_member_categories_dataset

  • WordNet::Synset#domain_member_regions_dataset

  • WordNet::Synset#domain_member_usages_dataset

  • WordNet::Synset#domain_regions_dataset

  • WordNet::Synset#domain_usages_dataset

  • WordNet::Synset#entailments_dataset

  • WordNet::Synset#hypernyms_dataset

  • WordNet::Synset#hyponyms_dataset

  • WordNet::Synset#instance_hypernyms_dataset

  • WordNet::Synset#instance_hyponyms_dataset

  • WordNet::Synset#member_holonyms_dataset

  • WordNet::Synset#member_meronyms_dataset

  • WordNet::Synset#part_holonyms_dataset

  • WordNet::Synset#part_meronyms_dataset

  • WordNet::Synset#semlinks_dataset

  • WordNet::Synset#semlinks_to_dataset

  • WordNet::Synset#senses_dataset

  • WordNet::Synset#similar_words_dataset

  • WordNet::Synset#substance_holonyms_dataset

  • WordNet::Synset#substance_meronyms_dataset

  • WordNet::Synset#sumo_terms_dataset

  • WordNet::Synset#verb_groups_dataset

  • WordNet::Synset#words_dataset

Constants

SEMANTIC_TYPEKEYS

Semantic link type keys; maps what the API calls them to what they are in the DB.

Public Class Methods

db=( newdb )

Overridden to reset any lookup tables that may have been loaded from the previous database.

# File lib/wordnet/synset.rb, line 296
def self::db=( newdb )
        self.reset_lookup_tables
        super
end
lexdomain_table()

Return the table of lexical domains, keyed by id.

# File lib/wordnet/synset.rb, line 314
def self::lexdomain_table
        @lexdomain_table ||= self.db[:lexdomains].to_hash( :lexdomainid )
end
lexdomains()

Lexical domains, keyed by name as a String (e.g., “verb.cognition”).

# File lib/wordnet/synset.rb, line 320
def self::lexdomains
        @lexdomains ||= self.lexdomain_table.inject({}) do |hash,(id,domain)|
                hash[ domain[:lexdomainname] ] = domain
                hash
        end
end
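
The two lookup tables above hold the same rows under different keys. The sketch below shows that re-keying step on stand-in data; the ids in LEXDOMAIN_ROWS are hypothetical, and each_with_object is used here in place of the inject call in the listing.

```ruby
# Hypothetical rows standing in for the `lexdomains` table; real ids and
# names come from the connected database.
LEXDOMAIN_ROWS = [
  { lexdomainid: 1, lexdomainname: 'noun.time' },
  { lexdomainid: 2, lexdomainname: 'verb.cognition' },
].freeze

# Keyed by id, analogous to what lexdomain_table returns.
BY_ID = LEXDOMAIN_ROWS.to_h { |row| [ row[:lexdomainid], row ] }

# Re-keyed by name. each_with_object returns the accumulator itself, so
# the block does not need a trailing `hash` line as inject does.
BY_NAME = BY_ID.each_with_object( {} ) do |(_id, domain), hash|
  hash[ domain[:lexdomainname] ] = domain
end

puts BY_NAME['verb.cognition'][:lexdomainid]
```
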
linktype_table()

Return the table of link types, keyed by linkid.

# File lib/wordnet/synset.rb, line 329
def self::linktype_table
        @linktype_table ||= self.db[:linktypes].inject({}) do |hash,row|
                hash[ row[:linkid] ] = {
                        id: row[:linkid],
                        typename: row[:link],
                        type: row[:link].gsub( /\s+/, '_' ).to_sym,
                        recurses: row[:recurses] && row[:recurses] != 0,
                }
                hash
        end
end
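
To see what the entries built above look like, the same transformation can be run over stand-in rows. This is a sketch under assumptions: LINK_ROWS and its values are hypothetical, and build_linktype_table is an illustrative helper rather than part of the library.

```ruby
# Hypothetical rows standing in for the `linktypes` table.
LINK_ROWS = [
  { linkid: 1, link: 'hypernym',        recurses: 1 },
  { linkid: 2, link: 'domain category', recurses: 0 },
  { linkid: 3, link: 'also see',        recurses: nil },
].freeze

# Build a table keyed by linkid, mirroring the listing above.
def build_linktype_table( rows )
  rows.each_with_object( {} ) do |row, hash|
    hash[ row[:linkid] ] = {
      id:       row[:linkid],
      typename: row[:link],
      # Whitespace becomes underscores so the name is usable as a Symbol.
      type:     row[:link].gsub( /\s+/, '_' ).to_sym,
      # true for non-zero, false for zero, nil when the column is NULL.
      recurses: row[:recurses] && row[:recurses] != 0,
    }
  end
end

table = build_linktype_table( LINK_ROWS )
puts table[2][:type].inspect
```

Note that the recurses entry is only truthy or falsy, not strictly boolean: a NULL column comes through as nil.
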
linktypes()

Return the table of link types, keyed by name.

# File lib/wordnet/synset.rb, line 343
def self::linktypes
        @linktypes ||= self.linktype_table.inject({}) do |hash,(id,link)|
                hash[ link[:type] ] = link
                hash
        end
end
postype_table()

Return the table of part-of-speech types, keyed by letter identifier.

# File lib/wordnet/synset.rb, line 352
def self::postype_table
        @postype_table ||= self.db[:postypes].inject({}) do |hash, row|
                hash[ row[:pos].to_sym ] = row[:posname]
                hash
        end
end
postypes()

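Return the table mapping part-of-speech names to letter identifiers; this is the inverse of postype_table, so the identifiers are Symbols and the names are whatever the posname column holds.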

# File lib/wordnet/synset.rb, line 361
def self::postypes
        @postypes ||= self.postype_table.invert
end
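
The inversion above is a plain Hash#invert. A minimal sketch with stand-in data follows; the letter identifiers match the ones used by the dataset methods in this class, but the names other than 'noun' are assumptions about what the database holds.

```ruby
# Hypothetical contents of postype_table: letter identifier => name.
POSTYPE_TABLE = {
  n: 'noun',
  v: 'verb',
  a: 'adjective',
  r: 'adverb',
  s: 'adjective satellite',
}.freeze

# Swap keys and values to look up the identifier by name.
POSTYPES = POSTYPE_TABLE.invert

puts POSTYPES['noun'].inspect
```

Hash#invert keeps only one key per distinct value, so this relies on every part-of-speech name being unique.
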
reset_lookup_tables()

Unload all of the cached lookup tables that have been loaded.

# File lib/wordnet/synset.rb, line 303
def self::reset_lookup_tables
        @lexdomain_table = nil
        @lexdomains      = nil
        @linktype_table  = nil
        @linktypes       = nil
        @postype_table   = nil
        @postypes        = nil
end
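
The class methods above share one pattern: each table is built on first use with ||=, and every cached copy is dropped when the database changes. The sketch below isolates that pattern in plain Ruby; LookupCache and its methods are hypothetical names, not part of the library.

```ruby
# Hypothetical stand-in for a model class with one cached lookup table.
class LookupCache
  attr_reader :loads

  def initialize( source )
    @source = source   # callable that produces the table
    @loads  = 0        # how many times the table has been built
    @table  = nil
  end

  # Build the table on first use, then reuse the cached copy.
  def table
    @table ||= begin
      @loads += 1
      @source.call
    end
  end

  # Drop the cached copy so the next call to #table rebuilds it.
  def reset
    @table = nil
  end

  # Switching the source invalidates the cache first, as db= does above.
  def source=( new_source )
    reset
    @source = new_source
  end
end

cache = LookupCache.new( -> { { 1 => 'noun.time' } } )
cache.table
cache.table                                   # served from the cache
cache.source = -> { { 1 => 'verb.cognition' } }
puts cache.table[1]
puts cache.loads
```

Without the reset, the second source would never be consulted, because ||= only evaluates its right-hand side while the variable is nil or false.
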

Public Instance Methods

inspect()

Return a human-readable representation of the object, suitable for debugging.

# File lib/wordnet/synset.rb, line 688
def inspect
        return "#<%p:%0#x {%d} '%s' (%s): [%s] %s>" % [
                self.class,
                self.object_id * 2,
                self.synsetid,
                self.wordlist.join(', '),
                self.part_of_speech,
                self.lexical_domain,
                self.definition,
        ]
end
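
The format string above mixes several directives: %p calls inspect on its argument, %0#x renders an Integer as hexadecimal with a 0x prefix, and %d and %s fill in the rest. The sketch below feeds the same format string with stand-in values; render is a hypothetical helper, and String and 255 are placeholders for the real class and object address.

```ruby
# Same format string as the listing above.
INSPECT_FORMAT = "#<%p:%0#x {%d} '%s' (%s): [%s] %s>"

# Fill the format with explicit values instead of reading them from a synset.
def render( klass, address, synsetid, words, pos, domain, definition )
  INSPECT_FORMAT % [ klass, address, synsetid, words.join(', '), pos, domain, definition ]
end

puts render( String, 255, 115265518, %w[first start], 'noun', 'noun.time',
             'the time at which something is supposed to begin' )
```
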
lexical_domain()

Return the name of the lexical domain the synset belongs to; this also corresponds to the lexicographer’s file the synset was originally loaded from.

# File lib/wordnet/synset.rb, line 446
def lexical_domain
        return self.class.lexdomain_table[ self.lexdomainid ][ :lexdomainname ]
end
part_of_speech()

Return the name of the Synset’s part of speech (pos).

# File lib/wordnet/synset.rb, line 416
def part_of_speech
        return self.class.postype_table[ self.pos.to_sym ]
end
samples()

Return any sample sentences.

# File lib/wordnet/synset.rb, line 452
def samples
        return self.db[:samples].
                filter( synsetid: self.synsetid ).
                order( :sampleid ).
                map( :sample )
end
senses()

The WordNet::Senses associated with the receiver.

# File lib/wordnet/synset.rb, line 187
one_to_many :senses,
        key: :synsetid,
        primary_key: :synsetid
sumo_terms()

Terms from the Suggested Upper Merged Ontology.

# File lib/wordnet/synset.rb, line 212
many_to_many :sumo_terms,
        join_table: :sumomaps,
        left_key: :synsetid,
        right_key: :sumoid
to_s()

Stringify the synset.

# File lib/wordnet/synset.rb, line 422
def to_s

        # Make a sorted list of the semantic link types from this synset
        semlink_list = self.semlinks_dataset.
                group_and_count( :linkid ).
                to_hash( :linkid, :count ).
                collect do |linkid, count|
                        '%s: %d' % [ self.class.linktype_table[linkid][:typename], count ]
                end.
                sort.
                join( ', ' )

        return "%s (%s): [%s] %s (%s)" % [
                self.words.map( &:to_s ).join(', '),
                self.part_of_speech,
                self.lexical_domain,
                self.definition,
                semlink_list
        ]
end
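
The link summary at the end of the string is assembled from a linkid-to-count Hash. The sketch below reproduces just that step on stand-in data; the ids, names and counts are hypothetical, and semlink_summary is an illustrative helper.

```ruby
# Hypothetical link-type names keyed by linkid.
TYPENAMES = { 1 => 'hypernym', 2 => 'hyponym', 40 => 'also see' }.freeze

# Turn { linkid => count } into a sorted, comma-separated summary.
def semlink_summary( counts, typenames )
  counts.
    map { |linkid, count| '%s: %d' % [ typenames[linkid], count ] }.
    sort.
    join( ', ' )
end

puts semlink_summary( { 2 => 4, 1 => 1, 40 => 2 }, TYPENAMES )
# also see: 2, hypernym: 1, hyponym: 4
```

The sort runs on the formatted strings, so entries are ordered alphabetically by link name rather than by count.
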
wordlist()

Return the Synset’s Words as an Array of Strings.

# File lib/wordnet/synset.rb, line 682
def wordlist
        return self.words.map( &:to_s )
end
words()

The WordNet::Words associated with the receiver.

# File lib/wordnet/synset.rb, line 179
many_to_many :words,
        join_table: :senses,
        left_key: :synsetid,
        right_key: :wordid

Dataset Methods

Public Instance Methods

adjective_satellites()

Limit results to adjective satellites.

# File lib/wordnet/synset.rb, line 285
def adjective_satellites
        return self.where( pos: 's' )
end
adjectives()

Limit results to adjectives.

# File lib/wordnet/synset.rb, line 271
def adjectives
        return self.where( pos: 'a' )
end
adverbs()

Limit results to adverbs.

# File lib/wordnet/synset.rb, line 278
def adverbs
        return self.where( pos: 'r' )
end
nouns()

Limit results to nouns.

# File lib/wordnet/synset.rb, line 257
def nouns
        return self.where( pos: 'n' )
end
verbs()

Limit results to verbs.

# File lib/wordnet/synset.rb, line 264
def verbs
        return self.where( pos: 'v' )
end
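
Each of these methods returns a narrower dataset, so they can be chained with further filters. The sketch below imitates that behaviour without a database: ToyDataset is a hypothetical stand-in for a Sequel dataset that supports only an equality-based where, and the rows are invented for illustration.

```ruby
# Toy stand-in for a Sequel dataset: wraps an Array of row Hashes, and
# #where returns a new, narrower wrapper instead of modifying the receiver.
class ToyDataset
  include Enumerable

  def initialize( rows )
    @rows = rows
  end

  def each( &block )
    @rows.each( &block )
  end

  # Keep only rows whose columns equal every value in +conditions+.
  def where( conditions )
    matching = @rows.select do |row|
      conditions.all? { |column, value| row[column] == value }
    end
    self.class.new( matching )
  end

  # Part-of-speech filters written the same way as the methods above.
  def nouns
    where( pos: 'n' )
  end

  def verbs
    where( pos: 'v' )
  end
end

# Hypothetical rows; real synsets carry more columns.
SYNSET_ROWS = [
  { synsetid: 1, pos: 'n', lexdomainid: 28 },
  { synsetid: 2, pos: 'v', lexdomainid: 31 },
  { synsetid: 3, pos: 'n', lexdomainid: 5 },
].freeze

dataset = ToyDataset.new( SYNSET_ROWS )
puts dataset.nouns.where( lexdomainid: 28 ).map { |row| row[:synsetid] }.inspect
```

A real Sequel dataset builds SQL lazily rather than filtering an Array in memory; the sketch only mirrors the chaining.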