wn.taxonomy¶
Functions for working with hypernym/hyponym taxonomies.
Overview¶
Among the valid synset relations for wordnets (see wn.constants.SYNSET_RELATIONS), those used for describing is-a taxonomies are given special treatment, and they are generally the most well-developed relations in any wordnet. Typically these are the hypernym and hyponym relations, which encode is-a-type-of relationships (e.g., a hermit crab is a type of decapod, which is a type of crustacean, etc.). They also include instance_hypernym and instance_hyponym, which encode is-an-instance-of relationships (e.g., Oregon is an instance of American state).
The taxonomy forms a multiply-inheriting hierarchy with the synsets as nodes. In the English wordnets, such as the Princeton WordNet and its derivatives, nearly all nominal synsets form such a hierarchy with a single root node, while verbal synsets form many smaller hierarchies without a common root. Other wordnets may have different properties, but as many are based on the Princeton WordNet, they tend to follow this structure.
Functions to find paths within the taxonomies form the basis of all wordnet similarity measures. For instance, the Leacock-Chodorow Similarity measure uses both shortest_path() and (indirectly) taxonomy_depth().
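To make the dependence on path length and taxonomy depth concrete, here is a minimal sketch of a Leacock-Chodorow style score. It assumes one common formulation, -log((d + 1) / (2 * D)), where d is the number of edges on the shortest path and D is the taxonomy depth. The function name lch_score and its parameters are hypothetical, and the exact form used by the library's own similarity module is not shown in this section.

```python
import math


def lch_score(path_length: int, depth: int) -> float:
    """Leacock-Chodorow style score (assumed formulation, not wn's code).

    path_length: number of edges on the shortest path between two synsets
    depth: maximum taxonomy depth for the part of speech (must be > 0)
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    # Shorter paths give larger scores; depth normalizes across taxonomies.
    return -math.log((path_length + 1) / (2 * depth))


# With a fixed depth, the score drops as the path gets longer.
for d in (0, 2, 5, 10):
    print(d, round(lch_score(d, depth=19), 3))
```

In practice the two inputs would come from the length of shortest_path() and the value of taxonomy_depth() for the relevant part of speech.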
Wordnet-level Functions¶
Root synsets are those with no hypernyms (hypernym, instance_hypernym, etc.), and leaf synsets are those with no hyponyms (hyponym, instance_hyponym, etc.).
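The definition can be illustrated on a toy taxonomy without loading any wordnet data. The sketch below is not the library's implementation: the HYPERNYMS mapping and the toy_roots and toy_leaves helpers are hypothetical names, and the graph is a hand-made simplification in which each node maps to its direct hypernyms.

```python
# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
# 'dog' has two hypernyms, so the hierarchy is multiply-inheriting.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_roots(hypernyms):
    """Nodes with no hypernyms."""
    return sorted(n for n, parents in hypernyms.items() if not parents)


def toy_leaves(hypernyms):
    """Nodes that are nobody's hypernym, i.e. that have no hyponyms."""
    has_hyponym = {p for parents in hypernyms.values() for p in parents}
    return sorted(n for n in hypernyms if n not in has_hyponym)


print(toy_roots(HYPERNYMS))   # ['entity']
print(toy_leaves(HYPERNYMS))  # ['dog', 'squirrel']
```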
Finding root and leaf synsets¶
- wn.taxonomy.roots(wordnet, pos=None)¶
Return the list of root synsets in wordnet.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> len(wn.taxonomy.roots(ewn, pos='v'))
573
- wn.taxonomy.leaves(wordnet, pos=None)¶
Return the list of leaf synsets in wordnet.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> len(wn.taxonomy.leaves(ewn, pos='v'))
10525
Computing the taxonomy depth¶
The taxonomy depth is the maximum depth from a root node to a leaf node within synsets for a particular part of speech.
- wn.taxonomy.taxonomy_depth(wordnet, pos)¶
Return the maximum depth of the taxonomy in wordnet for the part of speech pos.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> wn.taxonomy.taxonomy_depth(ewn, 'n')
19
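As a rough illustration of what "maximum depth from a root node to a leaf node" means, the following sketch computes it on a small hand-made graph. It is only a toy under stated assumptions: the HYPERNYMS mapping and toy_taxonomy_depth are hypothetical, the graph is assumed to be acyclic, and depth is counted in edges. The library may compute the value differently.

```python
from collections import defaultdict, deque

# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_taxonomy_depth(hypernyms):
    """Length in edges of the longest root-to-leaf chain (acyclic graph assumed)."""
    # Invert the mapping so we can walk downward from the roots.
    hyponyms = defaultdict(list)
    for child, parents in hypernyms.items():
        for parent in parents:
            hyponyms[parent].append(child)

    # Visit each node only after all of its hypernyms (topological order),
    # keeping the longest distance from any root seen so far.
    pending = {node: len(parents) for node, parents in hypernyms.items()}
    depth = {node: 0 for node in hypernyms}
    queue = deque(node for node, count in pending.items() if count == 0)
    while queue:
        node = queue.popleft()
        for child in hyponyms[node]:
            depth[child] = max(depth[child], depth[node] + 1)
            pending[child] -= 1
            if pending[child] == 0:
                queue.append(child)
    return max(depth.values())


print(toy_taxonomy_depth(HYPERNYMS))  # 6
```

The longest chain here runs from entity down to dog through canine, which is why the shorter route through domestic_animal does not affect the result.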
Synset-level Functions¶
- wn.taxonomy.hypernym_paths(synset, simulate_root=False)¶
Return the list of hypernym paths to a root synset.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets('dog', pos='n')[0]
>>> for path in wn.taxonomy.hypernym_paths(dog):
...     for i, ss in enumerate(path):
...         print(' ' * i, ss, ss.lemmas()[0])
...
 Synset('pwn-02083346-n') canine
  Synset('pwn-02075296-n') carnivore
   Synset('pwn-01886756-n') eutherian mammal
    Synset('pwn-01861778-n') mammalian
     Synset('pwn-01471682-n') craniate
      Synset('pwn-01466257-n') chordate
       Synset('pwn-00015388-n') animal
        Synset('pwn-00004475-n') organism
         Synset('pwn-00004258-n') animate thing
          Synset('pwn-00003553-n') unit
           Synset('pwn-00002684-n') object
            Synset('pwn-00001930-n') physical entity
             Synset('pwn-00001740-n') entity
 Synset('pwn-01317541-n') domesticated animal
  Synset('pwn-00015388-n') animal
   Synset('pwn-00004475-n') organism
    Synset('pwn-00004258-n') animate thing
     Synset('pwn-00003553-n') unit
      Synset('pwn-00002684-n') object
       Synset('pwn-00001930-n') physical entity
        Synset('pwn-00001740-n') entity
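The shape of the result above (one list per route to a root, starting at the direct hypernym and excluding the synset itself) can be reproduced on a toy graph. This is a sketch for illustration only, not the library's code: HYPERNYMS and toy_hypernym_paths are hypothetical names, and the simulate_root option is not modeled.

```python
# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_hypernym_paths(node, hypernyms):
    """All paths from node's direct hypernyms up to a root (node excluded)."""
    paths = []
    for parent in hypernyms[node]:
        upper = toy_hypernym_paths(parent, hypernyms)
        if upper:
            # Prefix every path that continues above the parent.
            paths.extend([parent] + rest for rest in upper)
        else:
            # The parent is a root, so the path ends here.
            paths.append([parent])
    return paths


for path in toy_hypernym_paths("dog", HYPERNYMS):
    print(" > ".join(path))
# canine > carnivore > mammal > chordate > animal > entity
# domestic_animal > animal > entity
```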
- wn.taxonomy.min_depth(synset, simulate_root=False)¶
Return the minimum taxonomy depth of the synset.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets('dog', pos='n')[0]
>>> wn.taxonomy.min_depth(dog)
8
- wn.taxonomy.max_depth(synset, simulate_root=False)¶
Return the maximum taxonomy depth of the synset.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets('dog', pos='n')[0]
>>> wn.taxonomy.max_depth(dog)
13
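The two depths in the examples above (8 and 13) match the lengths of the shortest and longest hypernym paths listed for the same synset under hypernym_paths(). A minimal sketch of that relationship on a toy graph follows. HYPERNYMS, toy_min_depth, and toy_max_depth are hypothetical names, depth is counted in edges to a root, and the graph is assumed to be acyclic.

```python
from functools import lru_cache

# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


@lru_cache(maxsize=None)
def toy_min_depth(node):
    """Edges on the shortest route from node up to a root."""
    parents = HYPERNYMS[node]
    return 0 if not parents else 1 + min(toy_min_depth(p) for p in parents)


@lru_cache(maxsize=None)
def toy_max_depth(node):
    """Edges on the longest route from node up to a root."""
    parents = HYPERNYMS[node]
    return 0 if not parents else 1 + max(toy_max_depth(p) for p in parents)


print(toy_min_depth("dog"), toy_max_depth("dog"))            # 3 6
print(toy_min_depth("squirrel"), toy_max_depth("squirrel"))  # 5 5
```

The two values differ only for nodes that can reach a root by more than one route, such as dog in this toy graph.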
- wn.taxonomy.shortest_path(synset, other, simulate_root=False)¶
Return the shortest path from synset to the other synset.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> dog = ewn.synsets('dog', pos='n')[0]
>>> squirrel = ewn.synsets('squirrel', pos='n')[0]
>>> for ss in wn.taxonomy.shortest_path(dog, squirrel):
...     print(ss.lemmas())
...
['canine', 'canid']
['carnivore']
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
['rodent', 'gnawer']
['squirrel']
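In the example above, the path climbs from the first synset to a shared hypernym and then descends to the second, omitting the starting synset and including the target. The sketch below reproduces that shape on a toy graph. It is one plausible way to do it, not the library's implementation: HYPERNYMS, upward_paths, and toy_shortest_path are hypothetical names, and ties are broken alphabetically as an arbitrary choice.

```python
from collections import deque

# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def upward_paths(start, hypernyms):
    """Map start and each of its ancestors to the shortest upward path reaching it."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in hypernyms[node]:
            if parent not in paths:  # breadth-first, so first visit is shortest
                paths[parent] = paths[node] + [parent]
                queue.append(parent)
    return paths


def toy_shortest_path(a, b, hypernyms):
    """Shortest up-then-down path from a to b; excludes a, includes b."""
    up_a = upward_paths(a, hypernyms)
    up_b = upward_paths(b, hypernyms)
    common = set(up_a) & set(up_b)
    if not common:
        raise ValueError(f"no path between {a!r} and {b!r}")
    # Pick the meeting point that minimizes the combined climb.
    meet = min(common, key=lambda n: (len(up_a[n]) + len(up_b[n]), n))
    climb = up_a[meet][1:]                       # a's side, without a itself
    descent = list(reversed(up_b[meet][:-1]))    # b's side, without the meeting node
    return climb + descent


print(toy_shortest_path("dog", "squirrel", HYPERNYMS))
# ['canine', 'carnivore', 'mammal', 'rodent', 'squirrel']
```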
- wn.taxonomy.common_hypernyms(synset, other, simulate_root=False)¶
Return the hypernyms shared by synset and the other synset.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> dog = ewn.synsets('dog', pos='n')[0]
>>> squirrel = ewn.synsets('squirrel', pos='n')[0]
>>> for ss in wn.taxonomy.common_hypernyms(dog, squirrel):
...     print(ss.lemmas())
...
['entity']
['physical entity']
['object', 'physical object']
['unit', 'whole']
['animate thing', 'living thing']
['organism', 'being']
['fauna', 'beast', 'animate being', 'brute', 'creature', 'animal']
['chordate']
['craniate', 'vertebrate']
['mammalian', 'mammal']
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
- wn.taxonomy.lowest_common_hypernyms(synset, other, simulate_root=False)¶
Return the common hypernyms furthest from the root.
- Parameters
- Return type
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet('ewn:2020')
>>> dog = ewn.synsets('dog', pos='n')[0]
>>> squirrel = ewn.synsets('squirrel', pos='n')[0]
>>> len(wn.taxonomy.lowest_common_hypernyms(dog, squirrel))
1
>>> wn.taxonomy.lowest_common_hypernyms(dog, squirrel)[0].lemmas()
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
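The relationship between common hypernyms and the lowest ones can be sketched with plain set operations on a toy graph. The names HYPERNYMS, ancestors, depth, toy_common_hypernyms, and toy_lowest_common_hypernyms are hypothetical. The sketch also assumes that only proper ancestors count and that "furthest from the root" is measured by the longest route to a root; the library's handling of these details may differ.

```python
# Toy taxonomy (an assumption for illustration): node -> direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "chordate": ["animal"],
    "mammal": ["chordate"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic_animal": ["animal"],
    "dog": ["canine", "domestic_animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def ancestors(node):
    """All proper ancestors of node (node itself excluded)."""
    seen = set()
    stack = list(HYPERNYMS[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(HYPERNYMS[current])
    return seen


def depth(node):
    """Edges on the longest route from node up to a root."""
    parents = HYPERNYMS[node]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def toy_common_hypernyms(a, b):
    """Shared ancestors, ordered from the root downward."""
    return sorted(ancestors(a) & ancestors(b), key=lambda n: (depth(n), n))


def toy_lowest_common_hypernyms(a, b):
    """The shared ancestors that lie furthest from the root."""
    common = toy_common_hypernyms(a, b)
    if not common:
        return []
    deepest = max(depth(n) for n in common)
    return [n for n in common if depth(n) == deepest]


print(toy_common_hypernyms("dog", "squirrel"))
# ['entity', 'animal', 'chordate', 'mammal']
print(toy_lowest_common_hypernyms("dog", "squirrel"))
# ['mammal']
```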