Description

Python API for BabelNet

Attribute	Value
Operating system	-
File name	babelnet-1.1.0
Name	babelnet
Library version	1.1.0
Maintainer	[]
Maintainer email	[]
Author	Babelscape
Author email	info@babelscape.com
Home page	https://babelnet.org/
Package URL	https://pypi.org/project/babelnet/
License	-

This package consists of a Python API to work with BabelNet, a very large multilingual semantic network. For more information, please refer to the documentation below on how to use the software, and to our website (https://babelnet.org) for news, updates and papers.

# Version compatibility

BabelNet Python API can be used with BabelNet 4.0 and above.

# Configuration

After the installation, the first step to take when you want to use BabelNet in another project (or in the REPL) is to create a file called `babelnet_conf.yml` in the current working directory. Alternatively, the path of the configuration file can be specified using the `BABELNET_CONF` environment variable.

The content of `babelnet_conf.yml` depends on the usage mode you choose:

**Online Mode**: uses the online REST service to retrieve the data. To use this mode you need an internet connection and a [valid API key](https://babelnet.org/guide#access).

**RPC Mode**: reads data directly from a local copy of the BabelNet indices, making it more suitable for heavy workloads than the online mode since it is faster and doesn't have usage limits. To use this mode you need the [BabelNet indices](https://babelnet.org/downloads) and [Docker installed in your system](https://www.docker.com/get-started/). The *RPC server controller* (see below) requires additional dependencies that can be installed with the following pip command:

```bash
pip install babelnet[rpc]
```

Further details on how to use these modes are provided in the following sections.

## Online Mode

This is the simplest mode to use, since it requires only [a valid API key](https://babelnet.org/guide#access). The drawback is that the iterators are unavailable, i.e. the `iterator`, `offset_iterator`, `lexicon_iterator` and `wordnet_iterator` methods.
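Before editing the configuration, it can help to confirm which file will be picked up. The lookup described in the Configuration section (the `BABELNET_CONF` environment variable, or `babelnet_conf.yml` in the current working directory) might be sketched as follows. This is only an illustration and not the library's own code: the helper `resolve_conf_path` is hypothetical, and giving the environment variable precedence over the file in the working directory is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONF_NAME = "babelnet_conf.yml"


def resolve_conf_path(env: Optional[Mapping[str, str]] = None,
                      cwd: Optional[Path] = None) -> Path:
    """Return the path where the configuration file is expected.

    Hypothetical helper: assumes BABELNET_CONF, when set, takes
    precedence over babelnet_conf.yml in the working directory.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    override = env.get("BABELNET_CONF")
    if override:
        return Path(override)
    return cwd / DEFAULT_CONF_NAME


if __name__ == "__main__":
    path = resolve_conf_path()
    print(path, "(found)" if path.is_file() else "(missing)")
```

Running the script from a project directory prints the expected location and whether a file exists there, which is a quick way to spot a configuration file created in the wrong directory.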
Assuming you have received by e-mail the key `3x54mp13-8au0-o97q-9vzz-3vakcpec8w4p`, add the following line to `babelnet_conf.yml`:

```yaml
RESTFUL_KEY: '3x54mp13-8au0-o97q-9vzz-3vakcpec8w4p'
```

This will automatically be used to authenticate you on the official BabelNet REST service.

The supported REST endpoints are:

- `https://babelnet.io/v8/service` for BabelNet 5.2 *(default)*
- `https://babelnet.io/v7/service` for BabelNet 5.1
- `https://babelnet.io/v6/service` for BabelNet 5.0
- `https://babelnet.io/v5/service` for BabelNet 4.0

If you want to use a different REST endpoint, add the following line to `babelnet_conf.yml`:

```yaml
# BabelNet 5.2 REST endpoint
RESTFUL_URL: 'https://babelnet.io/v8/service'
```

## RPC Mode

To use the RPC mode you need a **local copy of the BabelNet indices**. To download them, follow the procedure on the [official website](https://babelnet.org/downloads).

This can be considered a _full_ mode, because it has no usage limit and faster responses.

BabelNet Python API requires PyLucene, which has a dependency on Lucene itself. The installation process of Lucene can be tricky since it has many dependencies that need compiling. Because of this, we moved this PyLucene build and install process to a simple Docker image. In the RPC mode, the Remote Procedure Call paradigm is applied in calling this Docker container as a remote service, effectively decoupling PyLucene and BabelNet.

To configure the APIs in RPC mode, you just need to add one of these lines to your `babelnet_conf.yml`, depending on which protocol you want to use.

The default protocol used by the RPC server is TCP. You can specify the URL where the server is listening with the following configuration line.

```yaml
# TCP URL example
RPC_URL: "tcp://127.0.0.1:7790"
```

If the RPC server has the optional IPC protocol enabled, you can use it with the following configuration line.
```yaml
# IPC URL example
RPC_URL: "ipc:///home/user/your_ipc_dir/socket"
```

*__Important__:* to use lambdas in RPC mode, the client code must be run using the **same Python version** as the server, i.e. Python 3.8, and the **same (or older) version of cloudpickle**, i.e. 2.1.0.

To start the server, you can either use the *RPC server controller* or manually start the Docker container. In either case you need [Docker to be installed in your system](https://www.docker.com/get-started/). The controller is described in the following section; for details on how to use the Docker image directly, please follow the documentation on the [Docker Hub page](https://hub.docker.com/r/babelscape/babelnet-rpc).

***Note:*** when you update the API to a newer version, you need to either restart the server using the controller or pull the new Docker image from the hub and start a new server with the updated image.

# RPC server controller

To simplify the management of the RPC server, you can use the `babelnet-rpc` command.

The additional dependencies required by the controller can be installed with the command:

```bash
pip install babelnet[rpc]
```

***For Windows users:*** if you are working in an Anaconda environment, you need to install `pywin32` using Anaconda with the following command:

```bash
conda install pywin32=227
```

## Documentation

Once the server is started, the documentation of the Python API will be available at [http://localhost:7780](http://localhost:7780), or at the port defined by the arguments of the `start` command.

## Start the server

To start the server, you can use the command `babelnet-rpc start`. If no arguments are provided, it will start in interactive mode, in which you will be prompted to provide the required values.

```console
$ babelnet-rpc start
BabelNet indices path: /home/user/BabelNet-5.2
Port for documentation ([7780], -1 to ignore): 8080
RPC mode ([tcp]/ipc/all): all
Port for TCP mode ([7790]):
IPC directory: your_ipc_dir
Starting server...
Server started
BabelNet Python API documentation is available at http://localhost:8080
To use BabelNet in RPC mode, add one of these lines in your babelnet_conf.yml file
RPC_URL: "tcp://127.0.0.1:7790"
RPC_URL: "ipc:///home/user/your_ipc_dir/socket"
```

Alternatively, the values can be passed as arguments. The available arguments are:

- `--bn <path>` required, the BabelNet indices path
- `--doc <port>` port for the BabelNet API documentation (default `7780`)
- `--no-doc` disable the documentation port
- `-m`, `--mode` the RPC mode enabled on the server (`tcp`, `ipc` or `all`, default `tcp`). _On Windows the only available mode is `tcp`_.
- `--tcp <port>` the port for TCP mode (default `7790`)
- `--ipc <path>` the IPC directory (required with mode `ipc` or `all`)
- `--print` print the command instead of executing it

### Examples of usage

#### Basic usage

```console
$ babelnet-rpc start --bn /home/user/BabelNet-5.2
Starting server...
Server started
BabelNet Python API documentation will be available at http://localhost:7780
To use BabelNet in RPC mode, add this line in your babelnet_conf.yml file
RPC_URL: "tcp://127.0.0.1:7790"
```

#### IPC mode without documentation

```console
$ babelnet-rpc start --bn /home/user/BabelNet-5.2 --no-doc -m ipc --ipc your_ipc_dir
Starting server...
Server started
To use BabelNet in RPC mode, add this line in your babelnet_conf.yml file
RPC_URL: "ipc:///home/user/your_ipc_dir/socket"
```

#### Custom TCP port, print docker command

```console
$ babelnet-rpc start --bn /home/user/BabelNet-5.2 --print --tcp 1234
To start the RPC server, run the following command:
docker run -d --name babelnet-rpc -p 7780:8000 -p 1234:1234 -v "/home/user/BabelNet-5.2:/root/babelnet" babelscape/babelnet-rpc:latest
BabelNet Python API documentation will be available at http://localhost:7780
To use BabelNet in RPC mode, add this line in your babelnet_conf.yml file
RPC_URL: "tcp://127.0.0.1:1234"
```

## Stop the server

To stop a running RPC server, run the command:

```bash
babelnet-rpc stop
```

# Code

Assuming the installation and configuration phases have been completed, you can start working with BabelNet.

The entry point in the library is the `babelnet` package. It contains a set of functions that query the available content. You can import the package by calling:

```python
import babelnet as bn
```

The two main classes of BabelNet are:

- `BabelSynset` (a concept or named entity identified by a set of multilingual lexicalizations, each being a BabelSense)
- `BabelSense` (a lexicalization of a given concept, i.e. a BabelSynset)

For more details, see the API documentation at [https://babelnet.org/pydoc/1.1/](https://babelnet.org/pydoc/1.1/).

## BabelSynset

A `BabelSynset` is a set of multilingual lexicalizations that are synonyms expressing a given concept or named entity. For instance, the synset for car in the motorcar sense looks [like this](https://babelnet.org/synset?word=bn:00007309n&details=1&orig=car&lang=EN).

After importing `babelnet` as `bn` we can use its functions to retrieve one or many `BabelSynset` objects.
For instance, to retrieve all the synsets containing `car` we can call `get_synsets`:

```python
from babelnet.language import Language

# Given a word in a certain language,
# returns the concepts (BabelSynsets) denoted by the word.
byl = bn.get_synsets('car', from_langs=[Language.EN])
```

We can also specify which of the parts of speech we are interested in and obtain only synsets for the specified part of speech. In the following example, we retrieve all the verbal synsets containing the English lexicalization `run`:

```python
from babelnet.language import Language
from babelnet.pos import POS

# Given a word in a certain language and pos (part of speech),
# returns the concepts denoted by the word.
byl = bn.get_synsets('run', from_langs=[Language.EN], poses=[POS.VERB])
```

Due to the [nature of BabelNet](https://babelnet.org/about), a `BabelSynset` may contain lexicalizations from different sources. You can restrict your search only to your sources of interest. For instance:

```python
from babelnet.language import Language
from babelnet.pos import POS
from babelnet.data.source import BabelSenseSource

# Given a word in a certain language, returns the concepts
# for the word available in the given sense sources.
byl = bn.get_synsets('run', from_langs=[Language.EN], poses=[POS.NOUN],
                     sources=[BabelSenseSource.WIKI, BabelSenseSource.OMWIKI])
```

Each `BabelSynset` has an ID that univocally identifies the synset, and that can be obtained via the `id` attribute of BabelSynset instances. If we have an ID and want to retrieve the corresponding synset, we can use `get_synset`. For instance:

```python
from babelnet.resources import BabelSynsetID

# Gets a BabelSynset from a concept identifier (Babel synset ID).
by = bn.get_synset(BabelSynsetID('bn:03083790n'))
```

returns the BabelSynset corresponding to ID [bn:03083790n](https://babelnet.org/synset?id=bn%3A03083790n&orig=bn%3A03083790n&lang=EN), that is, the synset about BabelNet.
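When synset IDs come from user input or a file, it may be worth checking their shape before passing them to `BabelSynsetID`. The sketch below uses only the standard library and does not call the BabelNet API. The pattern it checks (`bn:` followed by eight digits and one part-of-speech letter among `n`, `v`, `a`, `r`) is an assumption inferred from the IDs shown in this document, and `parse_synset_id` is a hypothetical helper, not part of the `babelnet` package.

```python
import re
from typing import NamedTuple

# Assumed ID shape: "bn:" + 8 digits + one POS letter (n, v, a, r).
_SYNSET_ID = re.compile(r"bn:(\d{8})([nvar])")


class ParsedSynsetID(NamedTuple):
    offset: str  # the eight-digit numeric part, kept as a string
    pos: str     # the trailing part-of-speech letter


def parse_synset_id(text: str) -> ParsedSynsetID:
    """Split a synset ID string into its parts, or raise ValueError."""
    match = _SYNSET_ID.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a BabelNet synset ID: {text!r}")
    return ParsedSynsetID(offset=match.group(1), pos=match.group(2))


if __name__ == "__main__":
    print(parse_synset_id("bn:03083790n"))
```

A string that passes this check can then be wrapped as `BabelSynsetID(text)` and handed to `bn.get_synset`; a string that fails it is rejected before any request is sent.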
If we want to retrieve the BabelSynset corresponding to a given WordNet 3.0 ID, we can do the following:

```python
from babelnet.resources import WordNetSynsetID

# Gets the BabelSynsets corresponding to an input WordNet offset.
by = bn.get_synset(WordNetSynsetID('wn:06879521n'))
```

If we want to retrieve the BabelSynset corresponding to a given Wikidata page ID, we can do the following:

```python
from babelnet.resources import WikidataID

# Gets the BabelSynsets corresponding to an input Wikidata page ID.
by = bn.get_synset(WikidataID('Q4837690'))
```

If we want to retrieve the BabelSynsets containing a given Wikipedia page title, we can use the function `get_synsets`:

```python
from babelnet.language import Language
from babelnet.pos import POS
from babelnet.resources import WikipediaID

# Given a Wikipedia title, returns the BabelSynsets which contain it.
byl = bn.get_synsets(WikipediaID('Men in Black (film 1997)', Language.IT, POS.NOUN))
```

## BabelSense

A `BabelSense` is a term (either a word or a multi-word expression) in a given language occurring in a certain `BabelSynset`. Each occurrence of the same term (e.g., car) in different synsets is, therefore, a different `BabelSense` of that term.

Now let's look at the functions to retrieve a `BabelSense` using the `bn` module we have imported earlier:

```python
from babelnet.language import Language
from babelnet.pos import POS
from babelnet.data.source import BabelSenseSource

# Returns the senses for the word in a certain language.
senses1 = bn.get_senses('run', from_langs=[Language.EN])

# Returns the senses for the word in a certain language and Part-Of-Speech.
senses2 = bn.get_senses('run', from_langs=[Language.EN], poses=[POS.VERB])

# Returns the senses for the word with the given constraints.
senses3 = bn.get_senses('run', from_langs=[Language.EN], poses=[POS.VERB],
                        sources=[BabelSenseSource.WIKI, BabelSenseSource.OMWIKI])
```

Once we have a `BabelSense`, we can go back to the synset it belongs to with the `synset` property:

```python
by = sense.synset
```

We can view the `BabelSynset` as a container of `BabelSense`s, i.e., the lexicalizations in the various languages contained in the synset that express its concept or named entity.

## Some attributes of BabelSynset and BabelSense

We now go into detail about the important attributes (methods, properties) of the `BabelSynset` and `BabelSense` classes.

### BabelSynset

A `BabelSynset` is composed of various elements, which we describe below. Furthermore, a `BabelSynset` is connected to other `BabelSynset` objects.

The main components of a `BabelSynset` are objects of the following types:

1. `BabelSense` (a lexicalization of the concept, see above)
2. `POS` (the synset's part of speech)
3. `BabelGloss` (a definition of the concept in a given language)
4. `BabelExample` (an example sentence of the meaning expressed by the synset)
5. `BabelImage` (an image depicting the concept)
6. `BabelSynsetRelation` (an edge semantically connecting the synset to another synset)

Let's take a look at the main methods and properties of a `BabelSynset` object which we call `by`.

*Note:* to obtain `BabelSynset` objects we can also use the above examples.

```python
import babelnet as bn
from babelnet import BabelSynsetID, Language

# Get a BabelSynset from a concept identifier (Babel synset ID).
by = bn.get_synset(BabelSynsetID('bn:03083790n'))

# Most relevant BabelSense to this BabelSynset for a given language.
bs = by.main_sense(Language.EN)

# The part of speech of this BabelSynset.
pos = by.pos

# True if the BabelSynset is a key concept
is_key_concept = by.is_key_concept

# Gets the senses contained in this BabelSynset.
senses = by.senses()

# Collects all BabelGlosses in the given source for this BabelSynset.
glosses = by.glosses()

# Collects all BabelExamples for this BabelSynset.
examples = by.examples()

# The images (BabelImages) of this BabelSynset.
images = by.images

# Collects all the edges incident on this BabelSynset.
edges = by.outgoing_edges()

# Gets the BabelCategory objects of this BabelSynset.
cats = by.categories()
```

### BabelSense

We now have a look at the BabelSense attributes.

The main components of a BabelSense are:

1. `BabelSynset` (the synset the sense belongs to)
2. `POS` (its part-of-speech tag)
3. the lemma string (the lexicalization of the sense)
4. `BabelSensePhonetics` (the written and audio pronunciations of this sense)
5. `BabelSenseSource` (the source of the sense, e.g.: Wikipedia, WordNet, etc.)

Some code retrieving the above information follows:

```python
bs = by.main_sense(Language.EN)

# The language of this BabelSense
lang = bs.language

# The part-of-speech tag of this BabelSense
pos = bs.pos

# True if the BabelSense is a key concept
is_key_concept = bs.is_key_sense

# The lemma of this BabelSense
lemma = bs.full_lemma

# The normalized lemma of this sense (i.e., lowercase, without parentheses, etc.)
normalized_lemma = bs.normalized_lemma

# The pronunciations of this sense
pronunciations = bs.pronunciations

# The source of the sense; ex: Wikipedia, WordNet, etc.
source = bs.source
```

# Usage examples

Here we show full examples of how you can use the BabelNet API to accomplish several tasks.
## Retrieve all BabelSynset objects for a specific word

```python
import babelnet as bn
from babelnet import Language

for synset in bn.get_synsets('home', from_langs=[Language.EN]):
    print('Synset ID:', synset.id)
```

## Retrieve all BabelSynset objects for a specific word in English, Italian and French

```python
import babelnet as bn
from babelnet import Language

synsets = bn.get_synsets('home', from_langs=[Language.EN],
                         to_langs=[Language.IT, Language.FR])
for synset in synsets:
    print('Synset ID:', synset.id)
```

## Retrieve all BabelSense objects for a specific BabelSynset object

```python
import babelnet as bn
from babelnet import BabelSynsetID

synset = bn.get_synset(BabelSynsetID('bn:00000356n'))
# a synset is an iterator over its senses
for sense in synset:
    print('Sense: ' + sense.full_lemma,
          'Language: ' + str(sense.language),
          'Source: ' + str(sense.source), sep='\t')
    phonetic = sense.pronunciations
    for audio in phonetic.audios:
        print('Audio URL', audio.validated_url)
```

## Retrieve all BabelSense objects for a specific Wikidata page id

```python
import babelnet as bn
from babelnet.resources import WikidataID

synset = bn.get_synset(WikidataID('Q4837690'))
# a synset is an iterator over its senses
for sense in synset:
    print('Sense: ' + sense.full_lemma,
          'Language: ' + str(sense.language),
          'Source: ' + str(sense.source), sep='\t')
    phonetic = sense.pronunciations
    for audio in phonetic.audios:
        print('Audio URL', audio.validated_url)
```

## Retrieve Wikidata id for each BabelSense in a BabelSynset

```python
import babelnet as bn
from babelnet import BabelSynsetID, BabelSenseSource

by = bn.get_synset(BabelSynsetID('bn:00000356n'))
for sense in by.senses(source=BabelSenseSource.WIKIDATA):
    sensekey = sense.sensekey
    print(sense.full_lemma, sense.language, sensekey, sep='\t')
```

## Retrieve neighbors of a BabelSynset object

```python
import babelnet as bn
from babelnet import BabelSynsetID, Language
from babelnet.data.relation import BabelPointer

by = bn.get_synset(BabelSynsetID('bn:00015556n'))
for edge in by.outgoing_edges(BabelPointer.ANY_HYPERNYM):
    print(str(by.id) + '\t' + by.main_sense(Language.EN).full_lemma,
          edge.pointer, edge.id_target, sep=' - ')
```

## Retrieve the distribution of relationships (frequency of each BabelPointer type) for a specific word

```python
from itertools import groupby

import babelnet as bn
from babelnet import Language

synsets = bn.get_synsets('car', from_langs=[Language.EN])
li = [edge.pointer.symbol for synset in synsets for edge in synset.outgoing_edges()]
for p, l in groupby(sorted(li)):
    print(p, len(list(l)), sep='\t')
```

# Multithreading

In **online mode**, requests can come from different threads or processes and are processed concurrently.

In **RPC mode**, using the API simultaneously from multiple threads is discouraged due to Python's threading management and the limitations of the RPC library.

Since sending concurrent requests to the server can lead to long response times, it is recommended to use a limited pool to avoid timeouts, as in the following example.
```python
import concurrent.futures
from datetime import datetime
from sys import stdout

import babelnet as bn
from babelnet import Language


# function called from the threads
def func(name: str, word: str):
    stdout.write(datetime.now().strftime("%H:%M:%S.%f") + " - Start - " + name + "\n")
    synsets = bn.get_synsets(word, from_langs=[Language.EN])
    glosses = []
    for synset in synsets:
        gloss = synset.main_gloss(Language.EN)
        if gloss:
            glosses.append(gloss.gloss)
    stdout.write(datetime.now().strftime("%H:%M:%S.%f") + " - End - " + name + "\n")
    return {word: glosses}


word_list = ["vocabulary", "article", "time", "bakery", "phoenix", "stunning", "judge", "clause", "anaconda",
             "patience", "risk", "scribble", "writing", "zebra", "trade"]

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
    future = []
    for i, w in enumerate(word_list):
        future.append(executor.submit(func, f'Thread {i} "{w}"', w))
    results = {}
    for f in future:
        results.update(f.result())
    for w, gs in results.items():
        for g in gs:
            print(w, g, sep='\t')
```

# Authors

Babelscape ([info@babelscape.com](mailto:info@babelscape.com))

# License

**BabelNet** and its **API** are licensed under the [BabelNet Non-Commercial License](https://babelnet.org/license).

Requirements

Version constraint	Name
-	cloudpickle
-	zerorpc
!=23.0.*,>=22.3	pyzmq
-	dataclasses
>=1.2	text-unidecode
>=3.0.1	ordered-set
>=2.19.1	requests
>=4.2b4	pyyaml
>=2.1.2	aenum
-	docker

Required language

Version constraint	Name
==3.8.*	Python

How to install

Install the babelnet-1.1.0 whl package:

pip install babelnet-1.1.0.whl

Install the babelnet-1.1.0 tar.gz package:

pip install babelnet-1.1.0.tar.gz