wn.taxonomy
Functions for working with hypernym/hyponym taxonomies.
Overview
Among the valid synset relations for wordnets (see
wn.constants.SYNSET_RELATIONS), those used for describing
is-a taxonomies are
given special treatment and they are generally the most
well-developed relations in any wordnet. Typically these are the
hypernym and hyponym relations, which encode is-a-type-of
relationships (e.g., a hermit crab is a type of decapod, which is
a type of crustacean, etc.). They also include instance_hypernym
and instance_hyponym, which encode is-an-instance-of
relationships (e.g., Oregon is an instance of American state).
The taxonomy forms a multiply-inheriting hierarchy with the synsets as nodes. In the English wordnets, such as the Princeton WordNet and its derivatives, nearly all nominal synsets form such a hierarchy with a single root node, while verbal synsets form many smaller hierarchies without a common root. Other wordnets may have different properties, but as many are based on the Princeton WordNet, they tend to follow this structure.
Functions to find paths within the taxonomies form the basis of all
wordnet similarity measures. For instance, the
Leacock-Chodorow Similarity measure uses both
shortest_path() and (indirectly) taxonomy_depth().
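To make the connection between paths, depth, and similarity concrete, here is a minimal sketch of a Leacock-Chodorow-style score. It assumes one common formulation, -log((p + 1) / (2d)), where p is the length of the shortest path and d is the taxonomy depth. The function name `toy_lch_similarity` is hypothetical and this is not necessarily how the library computes the measure.

```python
import math


def toy_lch_similarity(path_length: int, taxonomy_depth: int) -> float:
    """Leacock-Chodorow-style score (assumed formulation, for illustration).

    path_length    -- number of steps on the shortest path between two synsets
    taxonomy_depth -- maximum depth of the taxonomy for the part of speech
    """
    if taxonomy_depth <= 0:
        raise ValueError("taxonomy_depth must be greater than 0")
    return -math.log((path_length + 1) / (2 * taxonomy_depth))


# Shorter paths yield higher scores within the same taxonomy.
near = toy_lch_similarity(2, 19)
far = toy_lch_similarity(10, 19)
print(near > far)
```

A deeper taxonomy raises the score for the same path length, which is why the depth must be computed for the same part of speech as the synsets being compared.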
Wordnet-level Functions
Root synsets in the taxonomy are those with no hypernyms (hypernym,
instance_hypernym, etc.), and leaf synsets are those with no hyponyms
(hyponym, instance_hyponym, etc.).
Finding root and leaf synsets
- wn.taxonomy.roots(wordnet: Wordnet, pos: str | None = None) → list[Synset]
Return the list of root synsets in wordnet.
- Parameters:
wordnet – The wordnet from which root synsets are found.
pos – If given, only return synsets with the specified part of speech.
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> len(wn.taxonomy.roots(ewn, pos="v"))
573
- wn.taxonomy.leaves(wordnet: Wordnet, pos: str | None = None) → list[Synset]
Return the list of leaf synsets in wordnet.
- Parameters:
wordnet – The wordnet from which leaf synsets are found.
pos – If given, only return synsets with the specified part of speech.
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> len(wn.taxonomy.leaves(ewn, pos="v"))
10525
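The definitions of root and leaf synsets can be illustrated on a toy taxonomy without loading any wordnet data. The sketch below assumes a plain dictionary mapping each node to its list of hypernyms. The names `HYPERNYMS`, `find_roots`, and `find_leaves` are hypothetical and are not part of the wn API.

```python
# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],  # multiple inheritance
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def find_roots(hypernyms):
    """Return the nodes that have no hypernyms."""
    return sorted(node for node, parents in hypernyms.items() if not parents)


def find_leaves(hypernyms):
    """Return the nodes that are not the hypernym of any other node."""
    has_hyponym = {p for parents in hypernyms.values() for p in parents}
    return sorted(node for node in hypernyms if node not in has_hyponym)


print(find_roots(HYPERNYMS))   # ['entity']
print(find_leaves(HYPERNYMS))  # ['dog', 'squirrel']
```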
Computing the taxonomy depth
The taxonomy depth is the maximum depth from a root node to a leaf node within synsets for a particular part of speech.
- wn.taxonomy.taxonomy_depth(wordnet: Wordnet, pos: str) → int
Return the maximum depth of the taxonomy for the given part of speech.
- Parameters:
wordnet – The wordnet for which the taxonomy depth will be calculated.
pos – The part of speech for which the taxonomy depth will be calculated.
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> wn.taxonomy.taxonomy_depth(ewn, "n")
19
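As a rough illustration of the definition above, the following sketch computes the longest root-to-leaf distance in a toy taxonomy. The dictionary `HYPERNYMS` and the function `toy_taxonomy_depth` are hypothetical, and the approach (memoized recursion over hypernym links) is an assumption for illustration rather than the library's implementation.

```python
from functools import lru_cache

# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_taxonomy_depth(hypernyms):
    """Return the largest number of hypernym steps from any node to a root."""

    @lru_cache(maxsize=None)
    def longest_path_to_root(node):
        parents = hypernyms[node]
        if not parents:
            return 0  # a root is at depth 0
        return 1 + max(longest_path_to_root(p) for p in parents)

    return max(longest_path_to_root(node) for node in hypernyms)


print(toy_taxonomy_depth(HYPERNYMS))  # 5 (dog -> canine -> ... -> entity)
```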
Synset-level Functions
- wn.taxonomy.hypernym_paths(synset: Synset, simulate_root: bool = False) → list[list[Synset]]
Return the list of hypernym paths to a root synset.
- Parameters:
synset – The starting synset for paths to a root.
simulate_root – If True, find the path to a simulated root node.
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets("dog", pos="n")[0]
>>> for path in wn.taxonomy.hypernym_paths(dog):
...     for i, ss in enumerate(path):
...         print(" " * i, ss, ss.lemmas()[0])
...
 Synset('pwn-02083346-n') canine
  Synset('pwn-02075296-n') carnivore
   Synset('pwn-01886756-n') eutherian mammal
    Synset('pwn-01861778-n') mammalian
     Synset('pwn-01471682-n') craniate
      Synset('pwn-01466257-n') chordate
       Synset('pwn-00015388-n') animal
        Synset('pwn-00004475-n') organism
         Synset('pwn-00004258-n') animate thing
          Synset('pwn-00003553-n') unit
           Synset('pwn-00002684-n') object
            Synset('pwn-00001930-n') physical entity
             Synset('pwn-00001740-n') entity
 Synset('pwn-01317541-n') domesticated animal
  Synset('pwn-00015388-n') animal
   Synset('pwn-00004475-n') organism
    Synset('pwn-00004258-n') animate thing
     Synset('pwn-00003553-n') unit
      Synset('pwn-00002684-n') object
       Synset('pwn-00001930-n') physical entity
        Synset('pwn-00001740-n') entity
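The way a multiply-inheriting node yields more than one path can be sketched on a toy taxonomy. The dictionary `HYPERNYMS` and the function `toy_hypernym_paths` below are hypothetical. As in the example output above, each path starts at a direct hypernym and ends at a root, and the starting node itself is left out.

```python
# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],
}


def toy_hypernym_paths(node, hypernyms):
    """Return every path from the node's direct hypernyms up to a root."""
    paths = []
    for parent in hypernyms[node]:
        upper_paths = toy_hypernym_paths(parent, hypernyms)
        if upper_paths:
            # Prefix this parent onto each path that continues above it.
            paths.extend([parent] + rest for rest in upper_paths)
        else:
            # The parent is a root, so the path ends here.
            paths.append([parent])
    return paths


for path in toy_hypernym_paths("dog", HYPERNYMS):
    print(" > ".join(path))
```

A root node has no hypernyms, so the sketch returns an empty list for it.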
- wn.taxonomy.min_depth(synset: Synset, simulate_root: bool = False) → int
Return the minimum taxonomy depth of the synset.
- Parameters:
synset – The starting synset for paths to a root.
simulate_root – If True, find the depth to a simulated root node.
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets("dog", pos="n")[0]
>>> wn.taxonomy.min_depth(dog)
8
- wn.taxonomy.max_depth(synset: Synset, simulate_root: bool = False) → int
Return the maximum taxonomy depth of the synset.
- Parameters:
synset – The starting synset for paths to a root.
simulate_root – If True, find the depth to a simulated root node.
Example
>>> import wn, wn.taxonomy
>>> dog = wn.synsets("dog", pos="n")[0]
>>> wn.taxonomy.max_depth(dog)
13
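The two depth values above (8 and 13) correspond to the lengths of the shortest and longest hypernym paths for the same synset. A minimal sketch of that relationship on a toy taxonomy follows. The names `HYPERNYMS` and `toy_depth` are hypothetical, and the recursion is an illustration under the assumption that depth counts hypernym steps to a root.

```python
# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],
}


def toy_depth(node, hypernyms, pick):
    """Return the depth of node, where pick is min or max.

    Depth counts the hypernym steps to a root; a root has depth 0.
    """
    parents = hypernyms[node]
    if not parents:
        return 0
    return 1 + pick(toy_depth(p, hypernyms, pick) for p in parents)


print(toy_depth("dog", HYPERNYMS, min))  # 3, via "domestic animal"
print(toy_depth("dog", HYPERNYMS, max))  # 5, via "canine"
```

The two values differ only for nodes that can reach a root along paths of different lengths.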
- wn.taxonomy.shortest_path(synset: Synset, other: Synset, simulate_root: bool = False) → list[Synset]
Return the shortest path from synset to the other synset.
- Parameters:
other – endpoint synset of the path
simulate_root – if True, ensure any two synsets are always connected by positing a fake root node
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> dog = ewn.synsets("dog", pos="n")[0]
>>> squirrel = ewn.synsets("squirrel", pos="n")[0]
>>> for ss in wn.taxonomy.shortest_path(dog, squirrel):
...     print(ss.lemmas())
...
['canine', 'canid']
['carnivore']
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
['rodent', 'gnawer']
['squirrel']
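The idea of a shortest path through the taxonomy can be sketched with a breadth-first search that treats hypernym links as undirected edges. This is a swapped-in technique chosen for illustration and may differ from how the library finds the path. The names `HYPERNYMS` and `toy_shortest_path` are hypothetical. As in the example above, the returned path excludes the starting node and includes the endpoint.

```python
from collections import deque

# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_shortest_path(start, goal, hypernyms):
    """Return a shortest path from start to goal, excluding start."""
    # Build an undirected neighbor map from the hypernym links.
    neighbors = {node: set(parents) for node, parents in hypernyms.items()}
    for node, parents in hypernyms.items():
        for parent in parents:
            neighbors[parent].add(node)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in sorted(neighbors[node]):  # sorted for determinism
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    if goal not in previous:
        raise ValueError(f"no path from {start!r} to {goal!r}")

    # Walk back from the goal to the start, then reverse.
    path = []
    node = goal
    while node != start:
        path.append(node)
        node = previous[node]
    return path[::-1]


print(toy_shortest_path("dog", "squirrel", HYPERNYMS))
# ['canine', 'carnivore', 'mammal', 'rodent', 'squirrel']
```

In this toy data a second path of the same length exists through "domestic animal"; the sorted neighbor order decides which one is returned.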
- wn.taxonomy.common_hypernyms(synset: Synset, other: Synset, simulate_root: bool = False) → list[Synset]
Return the hypernyms shared by synset and other.
- Parameters:
other – synset that is a hyponym of any shared hypernyms
simulate_root – if True, ensure any two synsets always share a hypernym by positing a fake root node
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> dog = ewn.synsets("dog", pos="n")[0]
>>> squirrel = ewn.synsets("squirrel", pos="n")[0]
>>> for ss in wn.taxonomy.common_hypernyms(dog, squirrel):
...     print(ss.lemmas())
...
['entity']
['physical entity']
['object', 'physical object']
['unit', 'whole']
['animate thing', 'living thing']
['organism', 'being']
['fauna', 'beast', 'animate being', 'brute', 'creature', 'animal']
['chordate']
['craniate', 'vertebrate']
['mammalian', 'mammal']
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
- wn.taxonomy.lowest_common_hypernyms(synset: Synset, other: Synset, simulate_root: bool = False) → list[Synset]
Return the common hypernyms furthest from the root.
- Parameters:
other – synset that is a hyponym of any shared hypernyms
simulate_root – if True, ensure any two synsets always share a hypernym by positing a fake root node
Example
>>> import wn, wn.taxonomy
>>> ewn = wn.Wordnet("ewn:2020")
>>> dog = ewn.synsets("dog", pos="n")[0]
>>> squirrel = ewn.synsets("squirrel", pos="n")[0]
>>> len(wn.taxonomy.lowest_common_hypernyms(dog, squirrel))
1
>>> wn.taxonomy.lowest_common_hypernyms(dog, squirrel)[0].lemmas()
['eutherian mammal', 'placental', 'placental mammal', 'eutherian']
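The relationship between common hypernyms and lowest common hypernyms can be sketched on a toy taxonomy. The names below (`HYPERNYMS`, `toy_ancestors`, `toy_common_hypernyms`, `toy_lowest_common_hypernyms`) are hypothetical. The sketch assumes that "furthest from the root" is measured by maximum depth and that only proper ancestors count, which may not match the library's behavior in every case.

```python
# Toy taxonomy (hypothetical data): each node maps to its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "animal": ["entity"],
    "mammal": ["animal"],
    "carnivore": ["mammal"],
    "canine": ["carnivore"],
    "domestic animal": ["animal"],
    "dog": ["canine", "domestic animal"],
    "rodent": ["mammal"],
    "squirrel": ["rodent"],
}


def toy_ancestors(node, hypernyms):
    """Return the set of all direct and indirect hypernyms of node."""
    seen = set()
    stack = list(hypernyms[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(hypernyms[current])
    return seen


def toy_max_depth(node, hypernyms):
    """Return the longest number of hypernym steps from node to a root."""
    parents = hypernyms[node]
    return 0 if not parents else 1 + max(toy_max_depth(p, hypernyms) for p in parents)


def toy_common_hypernyms(a, b, hypernyms):
    """Return the hypernyms shared by a and b, ordered from root downward."""
    common = toy_ancestors(a, hypernyms) & toy_ancestors(b, hypernyms)
    return sorted(common, key=lambda n: (toy_max_depth(n, hypernyms), n))


def toy_lowest_common_hypernyms(a, b, hypernyms):
    """Return the shared hypernyms that lie deepest in the taxonomy."""
    common = toy_common_hypernyms(a, b, hypernyms)
    if not common:
        return []
    deepest = max(toy_max_depth(n, hypernyms) for n in common)
    return [n for n in common if toy_max_depth(n, hypernyms) == deepest]


print(toy_common_hypernyms("dog", "squirrel", HYPERNYMS))
# ['entity', 'animal', 'mammal']
print(toy_lowest_common_hypernyms("dog", "squirrel", HYPERNYMS))
# ['mammal']
```

The result is a list because, in a multiply-inheriting hierarchy, several shared hypernyms can sit at the same greatest depth.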