wn

Wordnet Interface.

Project Management Functions

wn.download(project_or_url: str, add: bool = True, progress_handler: type[~wn.util.ProgressHandler] | None = <class 'wn.util.ProgressBar'>) Path

Download the resource specified by project_or_url.

First the URL of the resource is determined and then, depending on the parameters, the resource is downloaded and added to the database. The function then returns the path of the cached file.

If project_or_url starts with 'http://' or 'https://', then it is taken to be the URL for the resource. Otherwise, project_or_url is taken as a project specifier and the URL is taken from a matching entry in Wn's project index. If no project matches the specifier, wn.Error is raised.

If the URL has been downloaded and cached before, the cached file is used. Otherwise the URL is retrieved and stored in the cache.
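The resolution and caching steps described above can be pictured with a small sketch. It is not Wn's implementation: the `INDEX` dictionary, the placeholder URL, the `resolve_url()` and `cached_path()` helpers, the hash-based cache file name, and the use of `LookupError` in place of `wn.Error` are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical stand-in for Wn's project index; the URL is a placeholder.
INDEX = {"ewn:2020": "https://example.com/english-wordnet-2020.xml.gz"}


def resolve_url(project_or_url: str) -> str:
    """Return the argument if it is a URL, else look it up in INDEX."""
    if project_or_url.startswith(("http://", "https://")):
        return project_or_url
    if project_or_url not in INDEX:
        # Wn raises wn.Error here; LookupError keeps the sketch stdlib-only.
        raise LookupError(f"no project matches {project_or_url!r}")
    return INDEX[project_or_url]


def cached_path(url: str, cache_dir: Path, fetch) -> Path:
    """Fetch url only if it has no cached file yet; return the cached path."""
    path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if not path.exists():
        path.write_bytes(fetch(url))
    return path


calls = []


def fake_fetch(url: str) -> bytes:
    calls.append(url)  # record each simulated retrieval
    return b"<LexicalResource/>"


with tempfile.TemporaryDirectory() as tmp:
    url = resolve_url("ewn:2020")
    first = cached_path(url, Path(tmp), fake_fetch)
    second = cached_path(url, Path(tmp), fake_fetch)
    same_file = first == second
```

The second call finds the cached file, so the simulated retrieval happens only once.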

If the add parameter is True (default), the downloaded resource is added to the database.

>>> wn.download('ewn:2020')
Added ewn:2020 (English WordNet)

The progress_handler parameter takes a subclass of wn.util.ProgressHandler. An instance of the class will be created, used, and closed by this function.

wn.add(source: str | ~pathlib.Path, progress_handler: type[~wn.util.ProgressHandler] | None = <class 'wn.util.ProgressBar'>) None

Add the LMF file at source to the database.

The file at source may be gzip-compressed or plain text XML.

>>> wn.add('english-wordnet-2020.xml')
Added ewn:2020 (English WordNet)

The progress_handler parameter takes a subclass of wn.util.ProgressHandler. An instance of the class will be created, used, and closed by this function.

wn.add_lexical_resource(resource: ~wn.lmf.LexicalResource, progress_handler: type[~wn.util.ProgressHandler] | None = <class 'wn.util.ProgressBar'>) None

Add the lexical resource resource to the database.

The resource argument is an in-memory lexical resource as from wn.lmf.load() and not a file on disk.

>>> resource = wn.lmf.load('english-wordnet-2020.xml')
>>> wn.add_lexical_resource(resource)
Added ewn:2020 (English WordNet)

The progress_handler parameter takes a subclass of wn.util.ProgressHandler. An instance of the class will be created, used, and closed by this function.

wn.remove(lexicon: str, progress_handler: type[~wn.util.ProgressHandler] | None = <class 'wn.util.ProgressBar'>) None

Remove lexicon(s) from the database.

The lexicon argument is a lexicon specifier. Note that this removes a lexicon and not a project, so the lexicons of projects containing multiple lexicons must be removed individually or, if applicable, with a star specifier.
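As a rough illustration of how a star specifier can select several lexicons at once, the sketch below matches an id:version specifier against a list of installed lexicons. The `matches()` helper, the treatment of a missing version part, and the version strings in `installed` are assumptions for illustration, not Wn's actual matching code.

```python
def matches(specifier: str, lexicon_id: str, version: str) -> bool:
    """Check an 'id:version' specifier, where '*' matches anything."""
    spec_id, _, spec_version = specifier.partition(":")
    id_ok = spec_id in ("*", lexicon_id)
    # Assumption: a specifier without a version part matches any version.
    version_ok = spec_version in ("", "*", version)
    return id_ok and version_ok


# Made-up inventory of (id, version) pairs.
installed = [("ewn", "2019"), ("wnja", "1.3+omw"), ("zsmwn", "1.3+omw")]
selected = [lex for lex in installed if matches("*:1.3+omw", *lex)]
```

Here `selected` holds the two lexicons whose version is 1.3+omw, while a specifier such as 'ewn:2019' would select a single lexicon.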

The progress_handler parameter takes a subclass of wn.util.ProgressHandler. An instance of the class will be created, used, and closed by this function.

>>> wn.remove('ewn:2019')  # removes a single lexicon
>>> wn.remove('*:1.3+omw')  # removes all lexicons with version 1.3+omw

wn.export(lexicons: Sequence[Lexicon], destination: str | Path, version: str = '1.0') None

Export lexicons from the database to a WN-LMF file.

More than one lexicon may be exported in the same file, subject to these conditions:

  • identifiers on wordnet entities must be unique in all lexicons

  • lexicon extensions may not be exported with their dependents

>>> w = wn.Wordnet(lexicon='cmnwn zsmwn')
>>> wn.export(w.lexicons(), 'cmn-zsm.xml')

Parameters:
  • lexicons – sequence of wn.Lexicon objects

  • destination – path to the destination file

  • version – LMF version string

wn.projects() list[dict]

Return the list of indexed projects.

This returns the same dictionaries of information as wn.config.get_project_info, but for all indexed projects.

Example

>>> infos = wn.projects()
>>> len(infos)
36
>>> infos[0]['label']
'Open English WordNet'

Wordnet Query Functions

While it is best to first instantiate a Wordnet object with a specific lexicon and use that for querying (see Default Mode Queries), the following functions are also available for quick and simple queries.

wn.word(id: str, *, lexicon: str | None = None, lang: str | None = None) Word

Return the word with id in lexicon.

This will create a Wordnet object using the lang and lexicon arguments. The id argument is then passed to the Wordnet.word() method.

>>> wn.word('ewn-cell-n')
Word('ewn-cell-n')

wn.words(form: str | None = None, pos: str | None = None, *, lexicon: str | None = None, lang: str | None = None) list[Word]

Return the list of matching words.

This will create a Wordnet object using the lang and lexicon arguments. The remaining arguments are passed to the Wordnet.words() method.

>>> len(wn.words())
282902
>>> len(wn.words(pos='v'))
34592
>>> wn.words(form="scurry")
[Word('ewn-scurry-n'), Word('ewn-scurry-v')]

wn.sense(id: str, *, lexicon: str | None = None, lang: str | None = None) Sense

Return the sense with id in lexicon.

This will create a Wordnet object using the lang and lexicon arguments. The id argument is then passed to the Wordnet.sense() method.

>>> wn.sense('ewn-flutter-v-01903884-02')
Sense('ewn-flutter-v-01903884-02')

wn.senses(form: str | None = None, pos: str | None = None, *, lexicon: str | None = None, lang: str | None = None) list[Sense]

Return the list of matching senses.

This will create a Wordnet object using the lang and lexicon arguments. The remaining arguments are passed to the Wordnet.senses() method.

>>> len(wn.senses('twig'))
3
>>> wn.senses('twig', pos='n')
[Sense('ewn-twig-n-13184889-02')]

wn.synset(id: str, *, lexicon: str | None = None, lang: str | None = None) Synset

Return the synset with id in lexicon.

This will create a Wordnet object using the lang and lexicon arguments. The id argument is then passed to the Wordnet.synset() method.

>>> wn.synset('ewn-03311152-n')
Synset('ewn-03311152-n')

wn.synsets(form: str | None = None, pos: str | None = None, ili: str | ILI | None = None, *, lexicon: str | None = None, lang: str | None = None) list[Synset]

Return the list of matching synsets.

This will create a Wordnet object using the lang and lexicon arguments. The remaining arguments are passed to the Wordnet.synsets() method.

>>> len(wn.synsets('couch'))
4
>>> wn.synsets('couch', pos='v')
[Synset('ewn-00983308-v')]

wn.ili(id: str, *, lexicon: str | None = None, lang: str | None = None) ILI

Return the interlingual index with id.

This will create a Wordnet object using the lang and lexicon arguments. The id argument is then passed to the Wordnet.ili() method.

wn.ilis(status: str | None = None, *, lexicon: str | None = None, lang: str | None = None) list[ILI]

Return the list of matching interlingual indices.

This will create a Wordnet object using the lang and lexicon arguments. The remaining arguments are passed to the Wordnet.ilis() method.

>>> len(wn.ilis())
120071
>>> len(wn.ilis(status='proposed'))
2573
>>> wn.ilis(status='proposed')[-1].definition()
'the neutrino associated with the tau lepton.'

wn.lexicons(*, lexicon: str | None = '*', lang: str | None = None) list[Lexicon]

Return the lexicons matching a language or lexicon specifier.

Example

>>> wn.lexicons(lang='en')
[<Lexicon ewn:2020 [en]>, <Lexicon omw-en:1.4 [en]>]

The Wordnet Class

class wn.Wordnet(lexicon: str | None = None, *, lang: str | None = None, expand: str | None = None, normalizer: ~collections.abc.Callable[[str], str] | None = <function normalize_form>, lemmatizer: ~collections.abc.Callable[[str, str | None], dict[str | None, set[str]]] | None = None, search_all_forms: bool = True)

Class for interacting with wordnet data.

A wordnet object acts essentially as a filter by first selecting matching lexicons and then searching only within those lexicons for later queries. Lexicons can be selected on instantiation with the lexicon or lang parameters. The lexicon parameter is a string with a space-separated list of lexicon specifiers. The lang argument is a BCP 47 language code that selects any lexicon matching the given language code. As the lexicon argument more precisely selects lexicons, it is the recommended method of instantiation. Omitting both lexicon and lang arguments triggers default-mode queries.

Some wordnets were created by translating the words from a larger wordnet, namely the Princeton WordNet, and then relying on the larger wordnet for structural relations. An expand argument is a second space-separated list of lexicon specifiers which are used for traversing relations, but not as the results of queries. Setting expand to an empty string (expand='') disables expand lexicons. For more information, see Cross-lingual Relation Traversal.

The normalizer argument takes a callable that normalizes word forms in order to expand the search. The default function downcases the word and removes diacritics via NFKD normalization so that, for example, searching for san josé in the English WordNet will find the entry for San Jose. Setting normalizer to None disables normalization and forces exact-match searching. For more information, see Normalization.
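A minimal sketch of a normalizer in the spirit of the description above: it lowercases the form and drops the combining marks left by NFKD decomposition. The name `normalize_sketch` is hypothetical, and the code is not necessarily identical to Wn's default function.

```python
import unicodedata


def normalize_sketch(form: str) -> str:
    """Lowercase form and drop combining marks left by NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", form.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


query = normalize_sketch("san josé")
entry = normalize_sketch("San Jose")
```

Because both strings normalize to the same value, a search for one can find the other. Any callable that takes a str and returns a str has the shape the normalizer parameter expects.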

The lemmatizer argument may be None, which is the default and disables lemmatizer-based query expansion, or a callable that takes a word form and optional part of speech and returns base forms of the original word. To support lemmatizers that use the wordnet for instantiation, such as wn.morphy, the lemmatizer may be assigned to the lemmatizer attribute after creation. For more information, see Lemmatization.
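The shape of a lemmatizer callable can be sketched with a toy function that treats a trailing "s" as a noun plural. The function name, the plural rule, and the choice to keep the original form among the candidates are assumptions for illustration; only the argument and return types follow the signature shown above.

```python
from typing import Dict, Optional, Set


def plural_lemmatizer(
    form: str, pos: Optional[str] = None
) -> Dict[Optional[str], Set[str]]:
    """Map parts of speech to candidate base forms of form."""
    candidates: Dict[Optional[str], Set[str]] = {pos: {form}}
    # Toy rule: treat a trailing "s" as a noun plural marker.
    if pos in (None, "n") and len(form) > 1 and form.endswith("s"):
        candidates.setdefault("n", set()).add(form[:-1])
    return candidates


unrestricted = plural_lemmatizer("dogs")
as_noun = plural_lemmatizer("dogs", "n")
as_verb = plural_lemmatizer("runs", "v")
```

A callable of this shape could be passed as the lemmatizer argument or assigned to the lemmatizer attribute after the Wordnet object is created.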

If the search_all_forms argument is True (the default), searches of word forms consider all forms in the lexicon; if False, only lemmas are searched. Non-lemma forms may include, depending on the lexicon, morphological exceptions, alternate scripts or spellings, etc.

lemmatizer

A lemmatization function or None.

word(id: str) Word

Return the first word in this wordnet with identifier id.

words(form: str | None = None, pos: str | None = None) list[Word]

Return the list of matching words in this wordnet.

Without any arguments, this function returns all words in the wordnet's selected lexicons. A form argument restricts the words to those matching the given word form, and pos restricts words by their part of speech.

sense(id: str) Sense

Return the first sense in this wordnet with identifier id.

senses(form: str | None = None, pos: str | None = None) list[Sense]

Return the list of matching senses in this wordnet.

Without any arguments, this function returns all senses in the wordnet's selected lexicons. A form argument restricts the senses to those whose word matches the given word form, and pos restricts senses by their word's part of speech.

synset(id: str) Synset

Return the first synset in this wordnet with identifier id.

synsets(form: str | None = None, pos: str | None = None, ili: str | ILI | None = None) list[Synset]

Return the list of matching synsets in this wordnet.

Without any arguments, this function returns all synsets in the wordnet's selected lexicons. A form argument restricts synsets to those whose member words match the given word form. A pos argument restricts synsets to those with the given part of speech. An ili argument restricts synsets to those with the given interlingual index; generally this should select a unique synset within a single lexicon.

ili(id: str) ILI

Return the first ILI in this wordnet with identifier id.

ilis(status: str | None = None) list[ILI]

Return the list of ILIs in this wordnet.

If status is given, only return ILIs with a matching status.

lexicons() list[Lexicon]

Return the list of lexicons covered by this wordnet.

expanded_lexicons() list[Lexicon]

Return the list of expand lexicons for this wordnet.

describe() str

Return a formatted string describing the lexicons in this wordnet.

Example

>>> oewn = wn.Wordnet('oewn:2021')
>>> print(oewn.describe())
Primary lexicons:
  oewn:2021
    Label  : Open English WordNet
    URL    : https://github.com/globalwordnet/english-wordnet
    License: https://creativecommons.org/licenses/by/4.0/
    Words  : 163161 (a: 8386, n: 123456, r: 4481, s: 15231, v: 11607)
    Senses : 211865
    Synsets: 120039 (a: 7494, n: 84349, r: 3623, s: 10727, v: 13846)
    ILIs   : 120039

Words, Senses, and Synsets

The results of primary queries against a lexicon are Word, Sense, or Synset objects. See The Structure of a Wordnet for more information about the concepts these objects represent.

Word Objects

class wn.Word

Word (or "lexical entry") objects encode information about word forms independent from their meaning.

id: str

The identifier used within a lexicon.

pos: str

The part of speech of the Word.

lemma(*, data: Literal[False] = False) str
lemma(*, data: Literal[True] = True) Form
lemma(*, data: bool) str | Form

Return the canonical form of the word.

If the data argument is False (the default), the lemma is returned as a str type. If it is True, a wn.Form object is used instead.
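The three signature lines above are typing overloads: the return type depends on the literal value of data. A self-contained sketch of the same pattern follows, using hypothetical `ToyWord` and `ToyForm` classes rather than Wn's own.

```python
from dataclasses import dataclass
from typing import Literal, Union, overload


@dataclass(frozen=True)
class ToyForm:
    value: str


class ToyWord:
    def __init__(self, lemma: str) -> None:
        self._lemma = ToyForm(lemma)

    @overload
    def lemma(self, *, data: Literal[False] = False) -> str: ...
    @overload
    def lemma(self, *, data: Literal[True]) -> ToyForm: ...
    def lemma(self, *, data: bool = False) -> Union[str, ToyForm]:
        # Return the rich object only when the caller asks for it.
        return self._lemma if data else self._lemma.value


wolf = ToyWord("wolf")
plain = wolf.lemma()
rich = wolf.lemma(data=True)
```

A type checker infers str for `plain` and `ToyForm` for `rich`; at runtime only the final, undecorated definition is called.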

Example

>>> wn.words('wolves')[0].lemma()
'wolf'
>>> wn.words('wolves')[0].lemma(data=True)
Form(value='wolf')

forms(*, data: Literal[False] = False) list[str]
forms(*, data: Literal[True] = True) list[Form]
forms(*, data: bool) list[str] | list[Form]

Return the list of all encoded forms of the word.

If the data argument is False (the default), the forms are returned as str types. If it is True, wn.Form objects are used instead.

Example

>>> wn.words('wolf')[0].forms()
['wolf', 'wolves']
>>> wn.words('wolf')[0].forms(data=True)
[Form(value='wolf'), Form(value='wolves')]

senses() list[Sense]

Return the list of senses of the word.

Example

>>> wn.words('zygoma')[0].senses()
[Sense('ewn-zygoma-n-05292350-01')]

synsets() list[Synset]

Return the list of synsets of the word.

Example

>>> wn.words('addendum')[0].synsets()
[Synset('ewn-06411274-n')]

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the word's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

derived_words() list[Word]

Return the list of words linked through derivations on the senses.

Example

>>> wn.words('magical')[0].derived_words()
[Word('ewn-magic-n'), Word('ewn-magic-n')]

translate(lexicon: str | None = None, *, lang: str | None = None) dict[Sense, list[Word]]

Return a mapping of word senses to lists of translated words.

Parameters:
  • lexicon – lexicon specifier of translated words

  • lang – BCP-47 language code of translated words

Example

>>> w = wn.words('water bottle', pos='n')[0]
>>> for sense, words in w.translate(lang='ja').items():
...     print(sense, [jw.lemma() for jw in words])
...
Sense('ewn-water_bottle-n-04564934-01') ['水筒']

Sense Objects

class wn.Sense

Sense objects represent a pairing of a Word and a Synset.

id: str

The identifier used within a lexicon.

word() Word

Return the word of the sense.

Example

>>> wn.senses('spigot')[0].word()
Word('pwn-spigot-n')

synset() Synset

Return the synset of the sense.

Example

>>> wn.senses('spigot')[0].synset()
Synset('pwn-03325088-n')

examples(*, data: Literal[False] = False) list[str]
examples(*, data: Literal[True] = True) list[Example]
examples(*, data: bool) list[str] | list[Example]

Return the list of examples for the sense.

If the data argument is False (the default), the examples are returned as str types. If it is True, wn.Example objects are used instead.

lexicalized() bool

Return True if the sense is lexicalized.

adjposition() str | None

Return the adjective position of the sense.

Values include "a" (attributive), "p" (predicative), and "ip" (immediate postnominal). Note that this is only relevant for adjectival senses. Senses for other parts of speech, or for adjectives that are not annotated with this feature, will return None.

frames() list[str]

Return the list of subcategorization frames for the sense.

counts(*, data: Literal[False] = False) list[int]
counts(*, data: Literal[True] = True) list[Count]
counts(*, data: bool) list[int] | list[Count]

Return the corpus counts stored for this sense.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the sense's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

relations(*args: str) dict[str, list[Sense]]

Return a mapping of relation names to lists of senses.

One or more relation names may be given as positional arguments to restrict the relations returned. If no such arguments are given, all relations starting from the sense are returned.

See get_related() for getting a flat list of related senses.

relation_map() dict[Relation, Sense]

Return a dict mapping Relation objects to targets.

get_related(*args: str) list[Sense]

Return a list of related senses.

One or more relation types should be passed as arguments which determine the kind of relations returned.

Example

>>> physics = wn.senses('physics', lexicon='ewn')[0]
>>> for sense in physics.get_related('has_domain_topic'):
...     print(sense.word().lemma())
...
coherent
chaotic
incoherent

Return a list of related synsets.

closure(*args: str) Iterator[T]

relation_paths(*args: str, end: T | None = None) Iterator[list[T]]

translate(lexicon: str | None = None, *, lang: str | None = None) list[Sense]

Return a list of translated senses.

Parameters:
  • lexicon – lexicon specifier of translated senses

  • lang – BCP-47 language code of translated senses

Example

>>> en = wn.senses('petiole', lang='en')[0]
>>> pt = en.translate(lang='pt')[0]
>>> pt.word().lemma()
'pecíolo'

Synset Objects

class wn.Synset

Synset objects represent a set of words that share a meaning.

id: str

The identifier used within a lexicon.

pos: str

The part of speech of the Synset.

property ili: ILI | None

The interlingual index of the Synset.

definition(*, data: Literal[False] = False) str | None
definition(*, data: Literal[True] = True) Definition | None
definition(*, data: bool) str | Definition | None

Return the first definition found for the synset.

If the data argument is False (the default), the definition is returned as a str type. If it is True, a wn.Definition object is used instead.

Example

>>> wn.synsets('cartwheel', pos='n')[0].definition()
'a wheel that has wooden spokes and a metal rim'
>>> wn.synsets('cartwheel', pos='n')[0].definition(data=True)
Definition(text='a wheel that has wooden spokes and a metal rim',
  language=None, source_sense_id=None)

definitions(*, data: Literal[False] = False) list[str]
definitions(*, data: Literal[True] = True) list[Definition]
definitions(*, data: bool) list[str] | list[Definition]

Return the list of definitions for the synset.

If the data argument is False (the default), the definitions are returned as str objects. If it is True, wn.Definition objects are used instead.

Example

>>> wn.synsets('tea', pos='n')[0].definitions()
['a beverage made by steeping tea leaves in water']
>>> wn.synsets('tea', pos='n')[0].definitions(data=True)
[Definition(text='a beverage made by steeping tea leaves in water',
  language=None, source_sense_id=None)]

examples(*, data: Literal[False] = False) list[str]
examples(*, data: Literal[True] = True) list[Example]
examples(*, data: bool) list[str] | list[Example]

Return the list of examples for the synset.

If the data argument is False (the default), the examples are returned as str types. If it is True, wn.Example objects are used instead.

Example

>>> wn.synsets('orbital', pos='a')[0].examples()
['"orbital revolution"', '"orbital velocity"']

senses() list[Sense]

Return the list of sense members of the synset.

Example

>>> wn.synsets('umbrella', pos='n')[0].senses()
[Sense('ewn-umbrella-n-04514450-01')]

lexicalized() bool

Return True if the synset is lexicalized.

lexfile() str | None

Return the lexicographer file name for this synset, if any.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the synset's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

words() list[Word]

Return the list of words linked by the synset's senses.

Example

>>> wn.synsets('exclusive', pos='n')[0].words()
[Word('ewn-scoop-n'), Word('ewn-exclusive-n')]

lemmas(*, data: Literal[False] = False) list[str]
lemmas(*, data: Literal[True] = True) list[Form]
lemmas(*, data: bool) list[str] | list[Form]

Return the list of lemmas of words for the synset.

If the data argument is False (the default), the lemmas are returned as str types. If it is True, wn.Form objects are used instead.

Example

>>> wn.synsets('exclusive', pos='n')[0].lemmas()
['scoop', 'exclusive']
>>> wn.synsets('exclusive', pos='n')[0].lemmas(data=True)
[Form(value='scoop'), Form(value='exclusive')]

hypernyms() list[Synset]

Return the list of synsets related by any hypernym relation.

Both the hypernym and instance_hypernym relations are traversed.

hyponyms() list[Synset]

Return the list of synsets related by any hyponym relation.

Both the hyponym and instance_hyponym relations are traversed.

holonyms() list[Synset]

Return the list of synsets related by any holonym relation.

Any of the following relations are traversed: holonym, holo_location, holo_member, holo_part, holo_portion, holo_substance.

meronyms() list[Synset]

Return the list of synsets related by any meronym relation.

Any of the following relations are traversed: meronym, mero_location, mero_member, mero_part, mero_portion, mero_substance.
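The "any of these relations" behaviour of holonyms() and meronyms() can be pictured as concatenating the targets of several named relations. The sketch below uses a made-up relation table and a hypothetical `related_by_any()` helper; it does not call Wn.

```python
HOLONYM_RELATIONS = (
    "holonym", "holo_location", "holo_member",
    "holo_part", "holo_portion", "holo_substance",
)


def related_by_any(relations, names):
    """Concatenate the targets of every relation in names, in order."""
    return [target for name in names for target in relations.get(name, [])]


# Made-up relation table for one synset; the targets are placeholders.
wheel = {
    "holo_part": ["car", "bicycle"],
    "holo_member": ["wheel set"],
    "hypernym": ["circle"],
}
holonyms = related_by_any(wheel, HOLONYM_RELATIONS)
```

The hypernym entry is ignored because it is not one of the holonym relation names.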

relations(*args: str) dict[str, list[Synset]]

Return a mapping of relation names to lists of synsets.

One or more relation names may be given as positional arguments to restrict the relations returned. If no such arguments are given, all relations starting from the synset are returned.

See get_related() for getting a flat list of related synsets.

Example

>>> button_rels = wn.synsets('button')[0].relations()
>>> for relname, sslist in button_rels.items():
...     print(relname, [ss.lemmas() for ss in sslist])
...
hypernym [['fixing', 'holdfast', 'fastener', 'fastening']]
hyponym [['coat button'], ['shirt button']]

relation_map() dict[Relation, Synset]

Return a dict mapping Relation objects to targets.

get_related(*args: str) list[Synset]

Return the list of related synsets.

One or more relation names may be given as positional arguments to restrict the relations returned. If no such arguments are given, all relations starting from the synset are returned.

This method does not preserve the relation names that lead to the related synsets. For a mapping of relation names to related synsets, see relations().

Example

>>> fulcrum = wn.synsets('fulcrum')[0]
>>> [ss.lemmas() for ss in fulcrum.get_related()]
[['pin', 'pivot'], ['lever']]

closure(*args: str) Iterator[T]

relation_paths(*args: str, end: T | None = None) Iterator[list[T]]

translate(lexicon: str | None = None, *, lang: str | None = None) list[Synset]

Return a list of translated synsets.

Parameters:
  • lexicon – lexicon specifier of translated synsets

  • lang – BCP-47 language code of translated synsets

Example

>>> es = wn.synsets('araña', lang='es')[0]
>>> en = es.translate(lexicon='ewn')[0]
>>> en.lemmas()
['spider']

hypernym_paths(simulate_root=False)

Shortcut for wn.taxonomy.hypernym_paths().

min_depth(simulate_root=False)

Shortcut for wn.taxonomy.min_depth().

max_depth(simulate_root=False)

Shortcut for wn.taxonomy.max_depth().

shortest_path(other, simulate_root=False)

Shortcut for wn.taxonomy.shortest_path().

common_hypernyms(other, simulate_root=False)

Shortcut for wn.taxonomy.common_hypernyms().

lowest_common_hypernyms(other, simulate_root=False)

Shortcut for wn.taxonomy.lowest_common_hypernyms().

Relations

The Sense.relation_map() and Synset.relation_map() methods return a dictionary mapping Relation objects to resolved target senses or synsets. They differ from Sense.relations() and Synset.relations() in two main ways:

  1. Relation objects map 1-to-1 to their targets instead of to a list of targets sharing the same relation name.

  2. Relation objects encode not just relation names, but also the identifiers of sources and targets, the lexicons they came from, and any metadata they have.

One reason why Relation objects are useful is for inspecting relation metadata, particularly in order to distinguish 'other' relations that differ only by the value of their dc:type metadata:

>>> oewn = wn.Wordnet('oewn:2024')
>>> alloy = oewn.senses("alloy", pos="v")[0]
>>> alloy.relations()  # appears to only have one 'other' relation
{'derivation': [Sense('oewn-alloy__1.27.00..')], 'other': [Sense('oewn-alloy__1.27.00..')]}
>>> for rel in alloy.relation_map():  # but in fact there are two
...     print(rel, rel.subtype)
...
Relation('derivation', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') None
Relation('other', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') material
Relation('other', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') result
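One way to picture why the name-keyed view shows a single 'other' target while the relation map shows two entries: grouping by relation name merges entries that share a name and a target. The sketch below uses a hypothetical `ToyRelation` tuple and plain identifier strings in place of Wn objects, so it illustrates the idea rather than Wn's internals.

```python
from collections import namedtuple

ToyRelation = namedtuple("ToyRelation", "name source_id target_id subtype")

src, tgt = "oewn-alloy__2.30.00..", "oewn-alloy__1.27.00.."
relation_map = {
    ToyRelation("derivation", src, tgt, None): tgt,
    ToyRelation("other", src, tgt, "material"): tgt,
    ToyRelation("other", src, tgt, "result"): tgt,
}


def group_by_name(rel_map):
    """Collapse a relation-to-target map into a name-to-targets map."""
    grouped = {}
    for rel, target in rel_map.items():
        targets = grouped.setdefault(rel.name, [])
        if target not in targets:  # identical targets merge under one name
            targets.append(target)
    return grouped


grouped = group_by_name(relation_map)
```

The map keeps three distinct keys because the subtypes differ, while the grouped view keeps one target per relation name.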

Another reason why they are useful is to determine the source of a relation used in interlingual queries.

>>> es = wn.Wordnet("omw-es", expand="omw-en")
>>> mapa = es.synsets("mapa", pos="n")[0]
>>> rel, tgt = next(iter(mapa.relation_map().items()))
>>> rel, rel.lexicon()  # relation comes from omw-en
(Relation('hypernym', 'omw-en-03720163-n', 'omw-en-04076846-n'), <Lexicon omw-en:1.4 [en]>)
>>> tgt, tgt.words(), tgt.lexicon()  # target is in omw-es
(Synset('omw-es-04076846-n'), [Word('omw-es-representación-n')], <Lexicon omw-es:1.4 [es]>)

class wn.Relation

Relation objects model relations between senses or synsets.

name

The name of the relation. Also called the relation "type".

source_id

The identifier of the source entity of the relation.

target_id

The identifier of the target entity of the relation.

subtype

The value of the dc:type metadata.

If dc:type is not specified in the metadata, None is returned instead.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the relation's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

Additional Classes

class wn.Form

Form objects are returned by Word.lemma() and Word.forms() when the data=True argument is used, and they make accessible several optional properties of word forms. The word form itself is available via the value attribute.

>>> inu = wn.words('犬', lexicon='wnja')[0]
>>> inu.forms(data=True)[3]
Form(value='いぬ')
>>> inu.forms(data=True)[3].script
'hira'

The script is often unspecified (i.e., None) and this carries the implicit meaning that the form uses the canonical script for the word's language or wordnet, whatever it may be.

value

The word form string.

id

An optional form identifier used within a lexicon. These identifiers are often None.

script

The script of the word form. This should be an ISO 15924 code, or None.

pronunciations()

Return the list of Pronunciation objects.

tags()

Return the list of Tag objects.

lexicon() Lexicon

Return the lexicon containing the element.

class wn.Pronunciation

Pronunciation objects encode a text or audio representation of how a word is pronounced. They are returned by Form.pronunciations().

value: str

The encoded pronunciation.

variety: str | None = None

The language variety this pronunciation belongs to.

notation: str | None = None

The notation used to encode the pronunciation. For example: the International Phonetic Alphabet (IPA).

phonemic: bool = True

True when the encoded pronunciation is a generalized phonemic description, or False for more precise phonetic transcriptions.

audio: str | None = None

A URI to an associated audio file.

class wn.Tag(tag: str, category: str)

A general-purpose tag class for word forms.

Tag objects encode categorical information about word forms. They are returned by Form.tags().

tag: str

The text value of the tag.

category: str

The category, or kind, of the tag.

class wn.Count(value: int, _lexicon: str = '', _metadata: Metadata | None = None)

A count of sense occurrences in some corpus.

Count objects model sense counts previously computed over some corpus. They are returned by Sense.counts().

value: int

The count of sense occurrences.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the count's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

class wn.Example

Example objects model example phrases for senses and synsets. They are returned by Sense.examples() and Synset.examples() when the data=True argument is given.

text: str

The example text.

language: str | None = None

The language of the example.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the example's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

class wn.Definition

Definition objects model synset definitions. They are returned by Synset.definition() and Synset.definitions() when the data=True argument is given.

text: str

The definition text.

language: str | None = None

The language of the definition.

source_sense_id: str | None = None

The id of the particular sense the definition is for.

lexicon() Lexicon

Return the lexicon containing the element.

metadata() dict[str, Any]

Return the definition's metadata.

confidence() float

Return the confidence score of the element.

If the element does not have an explicit confidence score, the value defaults to that of the lexicon containing the element.

Interlingual Indices

class wn.ILI

ILI objects represent Interlingual Indices.

property id: str | None

The interlingual index identifier. Unlike id attributes for Word, Sense, and Synset, ILI identifiers may be None (see the proposed status).

status: str

The known status of the interlingual index. Loading an interlingual index into the database provides the following explicit, authoritative status values:

  • active – the ILI is in use

  • provisional – the ILI is being staged for permanent inclusion

  • deprecated – the ILI is, or should be, no longer in use

Without an interlingual index loaded, ILIs present in loaded lexicons get an implicit, temporary status from the following:

  • presupposed – a synset uses the ILI, assuming it exists in an ILI file

  • proposed – a synset introduces a concept not yet in an ILI and is suggesting that one should be added for it in the future
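The distinction between explicit and implicit statuses can be summarized in a few lines. This is a sketch under assumptions: the `ili_status()` helper, the `loaded` dictionary standing in for a loaded interlingual index, and the identifier "i123" are hypothetical.

```python
from typing import Dict, Optional


def ili_status(ili_id: Optional[str], loaded: Dict[str, str]) -> str:
    """Pick an explicit status if one is loaded, else an implicit one."""
    if ili_id is None:
        # No identifier yet: the synset proposes a new ILI.
        return "proposed"
    # Without an authoritative entry, the ILI is assumed to exist.
    return loaded.get(ili_id, "presupposed")


without_index = ili_status("i123", {})
with_index = ili_status("i123", {"i123": "active"})
new_concept = ili_status(None, {})
```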

definition() str | None

Return the definition of the ILI, if any.

metadata() dict[str, Any]

Return the ILI's metadata.

confidence() float

Return the confidence score of the ILI.

Lexicon Objects

class wn.Lexicon

Lexicon objects contain attributes and metadata about a single lexicon.

id: str

The lexicon's identifier.

label: str

The full name of the lexicon.

language: str

The BCP 47 language code of the lexicon.

email: str

The email address of the wordnet maintainer.

license: str

The URL or name of the wordnet's license.

version: str

The version string of the resource.

url: str | None = None

The project URL of the wordnet.

citation: str | None = None

The canonical citation for the project.

logo: str | None = None

A URL or path to a project logo.

metadata() dict[str, Any]

Return the lexicon's metadata.

confidence() float

Return the confidence score of the lexicon.

If the lexicon does not specify a confidence score, it defaults to 1.0.
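The fallback chain for confidence scores (element, then lexicon, then 1.0) can be sketched as follows. The helper name and the "confidenceScore" metadata key are assumptions for illustration; the code works on plain dictionaries rather than Wn objects.

```python
from typing import Any, Dict


def effective_confidence(
    element_meta: Dict[str, Any], lexicon_meta: Dict[str, Any]
) -> float:
    """Use the element's score, then the lexicon's, then 1.0."""
    for meta in (element_meta, lexicon_meta):
        if "confidenceScore" in meta:  # assumed metadata key
            return float(meta["confidenceScore"])
    return 1.0


unspecified = effective_confidence({}, {})
from_lexicon = effective_confidence({}, {"confidenceScore": 0.8})
from_element = effective_confidence(
    {"confidenceScore": "0.5"}, {"confidenceScore": 0.8}
)
```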

specifier() str

Return the id:version lexicon specifier.

modified() bool

Return True if the lexicon has local modifications.

requires() dict[str, Lexicon | None]

Return the lexicon dependencies.

extends() Lexicon | None

Return the lexicon this lexicon extends, if any.

If this lexicon is not an extension, return None.

extensions(depth: int = 1) list[Lexicon]

Return the list of lexicons extending this one.

By default, only direct extensions are included. This is controlled by the depth parameter: if extensions are viewed as children in a tree whose root is the current lexicon, depth=1 selects the immediate extensions. Increasing the number also includes extensions of extensions, and a negative number includes all "descendant" extensions.
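The effect of depth can be pictured with a depth-limited walk over a plain dictionary. This is a sketch of the idea, not Wn's implementation: collect_extensions(), the tree, and the specifiers in it are made up for illustration, and returning an empty list for depth=0 is an assumption.

```python
def collect_extensions(tree: dict[str, list[str]], root: str, depth: int = 1) -> list[str]:
    """Collect extensions of root down to depth levels.

    A negative depth never reaches zero, so it walks the whole tree.
    """
    if depth == 0:
        return []
    found = []
    for child in tree.get(root, []):
        found.append(child)
        found.extend(collect_extensions(tree, child, depth - 1))
    return found


# Hypothetical extension relationships, keyed by lexicon specifier.
tree = {
    "base:1": ["ext-a:1", "ext-b:1"],
    "ext-a:1": ["ext-a2:1"],
}

print(collect_extensions(tree, "base:1"))            # direct extensions only
print(collect_extensions(tree, "base:1", depth=-1))  # all descendants
```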

describe(full: bool = True) str

Return a formatted string describing the lexicon.

The full argument (default: True) may be set to False to omit word and sense counts.

Also see: Wordnet.describe()

The wn.config Object

Wn's data storage and retrieval can be configured through the wn.config object.

See also

Installation and Configuration describes how to configure Wn using the wn.config instance.

wn.config = <wn._config.WNConfig object>

It is an instance of the WNConfig class, which is defined in a non-public module and is not meant to be instantiated directly. Configuration should occur through the single wn.config instance.

class wn._config.WNConfig

data_directory

The file system directory where Wn's data is stored.

Assign a new path to change where the database and downloads are stored.

>>> wn.config.data_directory = "~/.cache/wn"
>>> wn.config.database_path
PosixPath('/home/username/.cache/wn/wn.db')
>>> wn.config.downloads_directory
PosixPath('/home/username/.cache/wn/downloads')

database_path

The path to the database file.

The database path is derived from data_directory and cannot be changed directly.

allow_multithreading

If set to True, the database connection may be shared across threads. In this case, it is the user's responsibility to ensure that multiple threads don't try to write to the database at the same time. The default is False.
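One common way to meet that responsibility is to guard every write with a single lock. The sketch below shows only the locking pattern: add_resource() is a hypothetical stand-in that appends to a list rather than calling Wn, and it assumes all writing threads go through the same function.

```python
import threading

write_lock = threading.Lock()
written: list[str] = []


def add_resource(name: str) -> None:
    """Hypothetical stand-in for a call that writes to the database."""
    # Only one thread at a time may enter the write section.
    with write_lock:
        written.append(name)


threads = [
    threading.Thread(target=add_resource, args=(f"lexicon-{i}",))
    for i in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(sorted(written))
```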

downloads_directory

The file system directory where downloads are cached.

The downloads directory is derived from data_directory and cannot be changed directly.

add_project(id: str, type: str = 'wordnet', label: str | None = None, language: str | None = None, license: str | None = None, error: str | None = None) None

Add a new wordnet project to the index.

Parameters:
  • id – short identifier of the project

  • type – project type (default 'wordnet')

  • label – full name of the project

  • language – BCP 47 language code of the resource

  • license – link or name of the project's default license

  • error – if set, the error message to use when the project is accessed

add_project_version(id: str, version: str, url: str | None = None, error: str | None = None, license: str | None = None) None

Add a new resource version for a project.

Exactly one of url or error must be specified.

Parameters:
  • id – short identifier of the project

  • version – version string of the resource

  • url – space-separated list of web addresses for the resource

  • license – link or name of the resource's license; if not given, the project's default license will be used.

  • error – if set, the error message to use when the project is accessed
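The "exactly one of url or error" rule and the space-separated url format can be sketched as follows. The helper is hypothetical, and ValueError is a placeholder; the exception Wn actually raises in this case is not stated here.

```python
from typing import Optional


def split_version_urls(url: Optional[str] = None, error: Optional[str] = None) -> list[str]:
    """Validate the url/error pair and return the individual web addresses.

    Hypothetical helper; ValueError is a placeholder, not Wn's own error type.
    """
    # Exactly one of the two arguments may be given.
    if (url is None) == (error is None):
        raise ValueError("exactly one of url or error must be specified")
    # A version registered with an error message has no addresses to fetch.
    if url is None:
        return []
    # The url argument is a space-separated list of web addresses.
    return url.split()


print(split_version_urls(url="https://example.com/a.xml.gz https://example.org/a.xml.gz"))
```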

get_project_info(arg: str) dict

Return information about an indexed project version.

If the project has been downloaded and cached, the "cache" key will point to the path of the cached file, otherwise its value is None.

Parameters:
  • arg – a project specifier

Example

>>> info = wn.config.get_project_info('oewn:2021')
>>> info['label']
'Open English WordNet'

get_cache_path(url: str) Path

Return the path for caching url.

Note that in general this is just a path operation and does not signify that the file exists in the file system.

update(data: dict) None

Update the configuration with items in data.

Items are only inserted or replaced, not deleted. If a project index is provided in the "index" key, then for each project either the project must not already be indexed, or any project fields (label, language, or license) that are specified must match those of the indexed project.
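The consistency rule for already-indexed projects can be approximated as a field comparison. This is a sketch under assumptions, not Wn's code: conflicting_fields() is a hypothetical helper, and treating a field that is specified but absent from the indexed project as a conflict is a guess.

```python
PROJECT_FIELDS = ("label", "language", "license")


def conflicting_fields(indexed: dict, incoming: dict) -> list[str]:
    """Return the project fields that incoming specifies with a different value."""
    return [
        field
        for field in PROJECT_FIELDS
        if field in incoming and incoming[field] != indexed.get(field)
    ]


indexed = {"label": "Open English WordNet", "language": "en"}
print(conflicting_fields(indexed, {"language": "en"}))       # nothing conflicts
print(conflicting_fields(indexed, {"label": "Other Name"}))  # label differs
```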

load_index(path: str | Path) None

Load and update with the project index at path.

The project index is a TOML file containing project and version information. For example:

[ewn]
  label = "Open English WordNet"
  language = "en"
  license = "https://creativecommons.org/licenses/by/4.0/"
  [ewn.versions.2019]
    url = "https://en-word.net/static/english-wordnet-2019.xml.gz"
  [ewn.versions.2020]
    url = "https://en-word.net/static/english-wordnet-2020.xml.gz"

Exceptions

exception wn.Error

Generic error class for invalid wordnet operations.

exception wn.DatabaseError

Error class for issues with the database.

exception wn.WnWarning

Generic warning class for dubious wordnet operations.