wn.compat.sensekey
Functions Related to Sense Keys
Sense keys are identifiers of senses that (mostly) persist across
wordnet versions. They are only used by the English wordnets. For the
OMW lexicons derived from the Princeton WordNet and the EWN 2019/2020
lexicons, the sense key is encoded in the identifier metadata of a
Sense:
>>> import wn
>>> en = wn.Wordnet("omw-en:1.4")
>>> sense = en.sense("omw-en-carrousel-02966372-n")
>>> sense.metadata()
{'identifier': 'carrousel%1:06:01::'}
For OEWN 2021+ lexicons, the sense key is encoded in the sense ID, but some characters are escaped or replaced to ensure it is a valid XML ID:
>>> oewn = wn.Wordnet("oewn:2024")
>>> sense = oewn.sense("oewn-carousel__1.06.01..")
>>> sense.id
'oewn-carousel__1.06.01..'
This module has four functions:
- escape() transforms a sense key into a form that is valid for XML IDs. The flavor keyword argument specifies the escaping mechanism and it defaults to "oewn-v2".
- unescape() transforms an escaped sense key back into the original form. The flavor keyword is the same as with escape().
- sense_key_getter() creates a function for retrieving the sense key for a given wn.Sense object. Depending on the lexicon, it will retrieve the sense key from metadata or it will unescape the sense ID.
- sense_getter() creates a function for retrieving a wn.Sense object given a sense key. Depending on the lexicon, it will build and use a mapping of sense key metadata to wn.Sense objects, or it will escape the sense key and use the escaped form as the id argument for wn.Wordnet.sense().
See also
The documentation from the Princeton WordNet: https://wordnet.princeton.edu/documentation/senseidx5wn
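For orientation, the Princeton documentation describes a sense key as lemma%ss_type:lex_filenum:lex_id:head_word:head_id. The following is a minimal sketch of splitting a key into those fields. The names parse_sense_key, SenseKeyParts, and SS_TYPES are hypothetical helpers for illustration and are not part of wn.compat.sensekey; the sketch also assumes the lemma contains no "%" and the head word contains no ":".

```python
from typing import NamedTuple

# ss_type codes as listed in the Princeton senseidx documentation
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


class SenseKeyParts(NamedTuple):  # hypothetical helper type, not part of wn
    lemma: str
    ss_type: int
    lex_filenum: int
    lex_id: int
    head_word: str  # empty unless the sense is an adjective satellite
    head_id: str    # kept as a string because it is usually empty


def parse_sense_key(sense_key: str) -> SenseKeyParts:
    """Split lemma%ss_type:lex_filenum:lex_id:head_word:head_id into fields."""
    # Assumption: the lemma contains no '%', so the first '%' is the delimiter.
    lemma, sep, lex_sense = sense_key.partition("%")
    if not sep:
        raise ValueError(f"not a sense key: {sense_key!r}")
    fields = lex_sense.split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields: {sense_key!r}")
    ss_type, lex_filenum, lex_id, head_word, head_id = fields
    return SenseKeyParts(
        lemma, int(ss_type), int(lex_filenum), int(lex_id), head_word, head_id
    )


parts = parse_sense_key("carrousel%1:06:01::")
print(parts.lemma, SS_TYPES[parts.ss_type], parts.lex_filenum, parts.lex_id)
```

The two trailing colons in most keys are simply the empty head_word and head_id fields.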
- wn.compat.sensekey.escape(sense_key: str, /, flavor: str = 'oewn-v2') -> str
Return an escaped sense key that is valid for XML IDs.
The flavor argument specifies how the escaping will be done. Its default value is "oewn-v2", which escapes like the Open English Wordnet 2025 editions, including separate rules for the left and right side of the % delimiter. The other possible value is "oewn", which escapes like the Open English Wordnet 2024 and prior editions.
>>> from wn.compat import sensekey
>>> sensekey.escape("ceramic%3:01:00::")
'ceramic__3.01.00..'
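To make the idea of side-specific rules concrete, here is a deliberately simplified sketch that reproduces only the two substitutions visible in the example above: "%" becomes "__", and ":" becomes "." on the right-hand side. The function name simple_escape is hypothetical. The sketch leaves the lemma side untouched, so it does not cover whatever additional characters escape() handles for either flavor.

```python
def simple_escape(sense_key: str) -> str:
    """Simplified illustration; NOT the full rule set used by wn."""
    # Assumption: the first '%' separates the lemma from the rest of the key.
    lemma, sep, lex_sense = sense_key.partition("%")
    if not sep:
        raise ValueError(f"not a sense key: {sense_key!r}")
    # Right of the delimiter, ':' separates fields and is written as '.'.
    # The lemma is copied as-is here; a real implementation would also
    # escape lemma characters that are not allowed in XML IDs.
    return f"{lemma}__{lex_sense.replace(':', '.')}"


print(simple_escape("ceramic%3:01:00::"))
```

For real data, prefer sensekey.escape(), which knows the complete rules for each flavor.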
- wn.compat.sensekey.unescape(s: str, /, flavor: str = 'oewn-v2') -> str
Return the original form of an escaped sense key.
The flavor argument specifies how the unescaping will be done. Its default value is "oewn-v2", which unescapes like the Open English Wordnet 2025 editions, including separate rules for the left and right side of the __ delimiter. The other possible value is "oewn", which unescapes like the Open English Wordnet 2024 and prior editions.
>>> from wn.compat import sensekey
>>> sensekey.unescape("ceramic__3.01.00..")
'ceramic%3:01:00::'
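The reason for treating the two sides of the __ delimiter separately can be seen in a simplified sketch. If "." were replaced everywhere, a lemma that itself contains a period would be corrupted. The function simple_unescape below is a hypothetical illustration that handles only "__" and "."; it does not implement the full rules of either flavor, and the second input is a made-up string chosen to show the effect.

```python
def simple_unescape(escaped: str) -> str:
    """Simplified illustration; NOT the full rule set used by wn."""
    # Assumption: the last '__' separates the lemma from the rest.
    lemma, sep, lex_sense = escaped.rpartition("__")
    if not sep:
        raise ValueError(f"not an escaped sense key: {escaped!r}")
    # Only the right-hand side has '.' turned back into ':';
    # any '.' in the lemma is left untouched.
    return f"{lemma}%{lex_sense.replace('.', ':')}"


print(simple_unescape("ceramic__3.01.00.."))
print(simple_unescape("a.m.__4.02.00.."))  # hypothetical input with '.' in the lemma
```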
Note that this function does not remove any lexicon ID prefixes on sense IDs, so that may need to be done manually:
>>> sensekey.unescape("oewn-ceramic__3.01.00..")
'oewn-ceramic%3:01:00::'
>>> sensekey.unescape("oewn-ceramic__3.01.00..".removeprefix("oewn-"))
'ceramic%3:01:00::'
- wn.compat.sensekey.sense_key_getter(lexicon: str) -> Callable[[Sense], str | None]
Return a function that gets sense keys from senses.
The lexicon argument determines how the function will retrieve the sense key; i.e., whether it is read from the identifier metadata or obtained by unescaping the sense ID. For any unsupported lexicon, an error is raised.
The function that is returned accepts one argument, a wn.Sense (ideally from the same lexicon specified in the lexicon argument), and returns a str if the sense key exists in the lexicon or None otherwise.
>>> import wn
>>> from wn.compat import sensekey
>>> oewn = wn.Wordnet("oewn:2024")
>>> get_sense_key = sensekey.sense_key_getter("oewn:2024")
>>> get_sense_key(oewn.senses("alabaster")[0])
'alabaster%3:01:00::'
When unescaping a sense ID, if the ID starts with its lexicon's ID and a hyphen (e.g., "oewn-"), it is assumed to be a conventional ID prefix and is removed prior to unescaping.
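The two retrieval strategies and the prefix handling can be pictured with a small closure-based sketch. Everything here is hypothetical: FakeSense stands in for wn.Sense (it only mimics the id attribute and metadata() method shown earlier), make_key_getter is not a wn function, and the inline unescaping covers only "__" and ".". It is meant to show the shape of the idea, not how wn implements it.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeSense:  # hypothetical stand-in for wn.Sense
    id: str
    meta: dict = field(default_factory=dict)

    def metadata(self) -> dict:
        return self.meta


def make_key_getter(
    lexicon_id: str, from_metadata: bool
) -> Callable[[FakeSense], Optional[str]]:
    """Return a getter using one of the two strategies described above."""
    prefix = f"{lexicon_id}-"

    def from_meta(sense: FakeSense) -> Optional[str]:
        # Strategy 1: the key is stored in the 'identifier' metadata.
        return sense.metadata().get("identifier")

    def from_id(sense: FakeSense) -> Optional[str]:
        # Strategy 2: drop the conventional "<lexicon id>-" prefix,
        # then undo the (simplified) escaping of the sense ID.
        sense_id = sense.id
        if sense_id.startswith(prefix):
            sense_id = sense_id[len(prefix):]
        lemma, sep, rest = sense_id.rpartition("__")
        if not sep:
            return None  # the ID does not look like an escaped sense key
        return f"{lemma}%{rest.replace('.', ':')}"

    return from_meta if from_metadata else from_id


get_key = make_key_getter("oewn", from_metadata=False)
print(get_key(FakeSense("oewn-alabaster__3.01.00..")))
```

Choosing the strategy once, when the getter is created, keeps the per-sense call cheap.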
- wn.compat.sensekey.sense_getter(lexicon: str, wordnet: Wordnet | None = None) -> Callable[[str], Sense | None]
Return a function that gets the sense for a sense key.
The lexicon argument determines how the function will retrieve the sense; i.e., whether a mapping between each sense's identifier metadata and the sense will be created and used, or the escaped sense key will be used as the sense ID. For any unsupported lexicon, an error is raised.
The optional wordnet object is used as the source of the returned wn.Sense objects. If none is provided, a new wn.Wordnet object is created using the lexicon argument.
The function that is returned accepts one argument, a str of the sense key, and returns a wn.Sense if the sense key exists in the lexicon or None otherwise.
>>> import wn
>>> from wn.compat import sensekey
>>> get_sense = sensekey.sense_getter("oewn:2024")
>>> get_sense("alabaster%3:01:00::")
Sense('oewn-alabaster__3.01.00..')
Warning
The mapping built for the omw-en* or ewn lexicons requires significant memory (around 100 MiB). The oewn lexicons do not require such a mapping, so their memory usage is negligible.
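The memory cost comes from holding one dictionary entry per sense. A rough sketch of such a mapping is shown below, using a hypothetical FakeSense stand-in for wn.Sense and a hypothetical make_sense_getter helper. It illustrates the general technique of indexing senses by their identifier metadata, under the assumption that the index is built in a single pass, and is not a description of wn's internals.

```python
from typing import Callable, Dict, Iterable, Optional


class FakeSense:  # hypothetical stand-in for wn.Sense
    def __init__(self, id: str, identifier: Optional[str] = None) -> None:
        self.id = id
        self._meta = {"identifier": identifier} if identifier else {}

    def metadata(self) -> dict:
        return self._meta


def make_sense_getter(
    senses: Iterable[FakeSense],
) -> Callable[[str], Optional[FakeSense]]:
    """Index senses by sense key once, then answer lookups from the index."""
    index: Dict[str, FakeSense] = {}
    for sense in senses:  # one pass over every sense in the lexicon
        key = sense.metadata().get("identifier")
        if key is not None:
            index[key] = sense
    # dict.get returns None for unknown keys, matching the documented behavior
    return index.get


get_sense = make_sense_getter(
    [FakeSense("omw-en-carrousel-02966372-n", "carrousel%1:06:01::")]
)
print(get_sense("carrousel%1:06:01::").id)
```

The index stays alive for as long as the returned function does, so creating the getter once and reusing it avoids rebuilding the mapping.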