

anda-0.0.8



Description

A package collecting various functions to work with ancient Mediterranean datasets (textual, spatial, etc.)
| Property | Value |
| --- | --- |
| Operating system | - |
| File name | anda-0.0.8 |
| Name | anda |
| Library version | 0.0.8 |
| Maintainer | [] |
| Maintainer email | [] |
| Author | Vojtech Kase |
| Author email | vojtech.kase@gmail.com |
| Home page URL | https://github.com/sdam-au/anda |
| Web address | https://pypi.org/project/anda/ |
| License | - |
# anda

```bash
pip install anda
```

This is a Python package for collecting, manipulating, and visualizing various ancient Mediterranean data. It focuses on their temporal, textual, and spatial aspects. It is structured into several gradually evolving submodules, namely `gr`, `imda`, `concs`, and `textnet`.

## anda.gr

```python
from anda import gr
```

This module is dedicated to preprocessing ancient Greek textual data. It contains functions for lemmatization, POS tagging, and translation. It relies heavily on the Morpheus dictionary.

### Lemmatization

A minimal use is to lemmatize an individual word. You can ask either for only the first lemma (`return_first_lemma()`) or for all possibilities (`return_all_unique_lemmata()`). In most cases, the outcome is the same:

```python
gr.return_first_lemma("ἐπιστήμην")
# > 'ἐπιστήμη'
gr.return_all_unique_lemmata("ἐπιστήμην")
# > 'ἐπιστήμη'
```

Built on top of these are the functions `lemmatize_string()` and `get_lemmatized_sentences()`. Both work with a string of any length. The first returns a flat list of lemmata; the second returns a list of lemmatized sentences.

```python
string = "Πρότασις μὲν οὖν ἐστὶ λόγος καταφατικὸς ἢ ἀποφατικὸς τινὸς κατά τινος. Οὗτος δὲ ἢ καθόλου ἢ ἐν μέρει ἢ ἀδιόριστος. Λέγω δὲ καθόλου μὲν τὸ παντὶ ἢ μηδενὶ ὑπάρχειν, ἐν μέρει δὲ τὸ τινὶ ἢ μὴ τινὶ ἢ μὴ παντὶ ὑπάρχειν, ἀδιόριστον δὲ τὸ ὑπάρχειν ἢ μὴ ὑπάρχειν ἄνευ τοῦ καθόλου, ἢ κατὰ μέρος, οἷον τὸ τῶν ἐναντίων εἶναι τὴν αὐτὴν ἐπιστήμην ἢ τὸ τὴν ἡδονὴν μὴ εἶναι ἀγαθόν."

gr.lemmatize_string(string)
# > ['πρότασις', 'λόγος', 'καταφατικός', 'ἀποφατικός', 'καθόλου', 'μέρος', 'ἀδιόριστος', 'λέγω', 'καθόλου', 'πᾶς', 'μηδείς', 'ὑπάρχω', 'μέρος', 'πᾶς', 'ὑπάρχω', 'ἀδιόριστον', 'ὑπάρχω', 'ὑπάρχω', 'ἄνευ', 'καθόλου', 'μέρος', 'οἷος', 'ἐναντίος', 'αὐτην', 'ἐπιστήμη', 'ἡδονην', 'ἀγαθός']

gr.get_lemmatized_sentences(string)
# > [['πρότασις', 'λόγος', 'καταφατικός', 'ἀποφατικός'], ['καθόλου', 'μέρος', 'ἀδιόριστος'], ['λέγω', 'καθόλου', 'πᾶς', 'μηδείς', 'ὑπάρχω', 'μέρος', 'πᾶς', 'ὑπάρχω', 'ἀδιόριστον', 'ὑπάρχω', 'ὑπάρχω', 'ἄνευ', 'καθόλου', 'μέρος', 'οἷος', 'ἐναντίος', 'αὐτην', 'ἐπιστήμη', 'ἡδονην', 'ἀγαθός']]
```

All lemmatization functions can be further parametrized by several arguments:

* `all_lemmata=False`: if `True`, each word is sent to `return_all_unique_lemmata()` instead of `return_first_lemma()`
* `filter_by_postag=["n","a","v"]`: returns only nouns ("n"), adjectives ("a"), and verbs ("v")
* `involve_unknown=True`: if `False`, returns only words found in the dictionary

Thus, you can run:

```python
lemmatized_sentences = gr.get_lemmatized_sentences(string, all_lemmata=False, filter_by_postag=["n","a","v"], involve_unknown=False)
print(lemmatized_sentences)
# > [['λόγος'], ['μέρος'], ['πᾶς', 'μηδείς', 'ὑπάρχω', 'μέρος', 'πᾶς', 'ὑπάρχω', 'ὑπάρχω', 'ὑπάρχω', 'ἄνω/ἀνίημι', 'μέρος', 'οἷος', 'ἐναντίος', 'ἐπιστήμη', 'ἀγαθός']]
```

`get_lemmatized_sentences(string, all_lemmata=False, filter_by_postag=None, involve_unknown=False)` receives a raw Greek text of any kind and length as its input. The input is processed by a series of nested functions, each of which can also be used independently:

(1) `get_sentences()` splits the string into sentences by common sentence separators.

(2) `lemmatize_string(sentence)` first calls `tokenize_string()`, which performs basic cleaning and stopword filtering on the sentence and returns a list of words.

(3) Each word from the tokenized sentence is then sent either to `return_first_lemma()` or to `return_all_unique_lemmata()`, depending on the value of the parameter `all_lemmata=` (`False` by default).

(4) `return_all_unique_lemmata()` looks the word up in the `morpheus_dict` values and returns all unique lemmata.

(5) The parameter `filter_by_postag=` (default `None`) selects chosen word types from the tokens, based on the first character of the tag stored under the key `"p"`. Thus, to keep only nouns, adjectives, and verbs, set `filter_by_postag=["n", "a", "v"]`.

Preference: if verb, noun, and adjective variants are all available, only the noun and adjective forms are kept. If both noun and adjective are available, only the noun is returned.

### Translation

Alongside lemmatization, there is also a series of functions for translation, such as `return_all_unique_translations(word, filter_by_postag=None, involve_unknown=False)`, which accepts any word form, and `lemma_translator(word)`, for cases where you already have a lemma.

```python
gr.return_all_unique_translations("ὑπάρχειν", filter_by_postag=None, involve_unknown=False)
# > 'to begin, make a beginning'
gr.lemma_translator("λόγος")
# > 'the word'
```

### Morphological analysis

You can also run a morphological analysis of a string:

```python
gr.morphological_analysis(string)[1:4]
# > [{'i': '564347', 'f': 'μέν', 'b': 'μεν', 'l': 'μέν', 'e': 'μεν', 'p': 'g--------', 'd': '20753', 's': 'on the one hand, on the other hand', 'a': None},
# >  {'i': '642363', 'f': 'οὖν', 'b': 'ουν', 'l': 'οὖν', 'e': 'ουν', 'p': 'g--------', 'd': '23870', 's': 'really, at all events', 'a': None},
# >  {'i': '264221', 'f': 'ἐστί', 'b': 'εστι', 'l': 'εἰμί', 'e': 'ειμι', 'p': 'v3spia---', 'd': '9722', 's': 'I have', 'a': None}]
```

## imda

This module will serve for importing various ancient Mediterranean resources. Most of them will be imported directly from open third-party online resources. However, some of them have been preprocessed as part of the SDAM project.

The intended usage looks like this:

```python
imda.list_datasets()
# >>> ['roman_provinces_117', 'EDH', 'roman_cities_hanson', 'orbis_network']
```

And:

```python
rp = imda.import_dataset("roman_provinces_117", "gdf")
type(rp)
# >>> geopandas.geodataframe
```

## concs

This module contains functions for working

## textnet

This module contains functions for generating, analyzing, and visualizing word co-occurrence networks. It has been designed especially for working with textual data in ancient Greek.

## Version history

* 0.0.8 - bugs removed
* 0.0.7 - `filter_by_postag` with preference of nouns and adjectives by default
* 0.0.6 - Greek dictionaries included within the package
* 0.0.5 - experimenting with data inclusion
* 0.0.4 - docs
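As an illustration of the `filter_by_postag` preference rule described in the lemmatization part of `anda.gr`, here is a minimal, self-contained sketch. It does not call `anda` and is not the package's implementation. The dictionary `TOY_DICT`, the helper `pick_lemma()`, and the tag values for the first two entries are hypothetical and exist only for this example. It assumes that the first character of the `"p"` tag encodes the word type and that nouns are preferred over adjectives, and adjectives over verbs.

```python
# Hypothetical toy dictionary (NOT the data shipped with anda): each word form
# maps to candidate analyses with a lemma ("l") and a POS tag ("p").
TOY_DICT = {
    "λόγος": [{"l": "λόγος", "p": "n-s---mn-"}],
    "ἀγαθόν": [
        {"l": "ἀγαθός", "p": "a-s---nn-"},  # adjective reading, listed first
        {"l": "ἀγαθόν", "p": "n-s---nn-"},  # noun reading
    ],
    "μέν": [{"l": "μέν", "p": "g--------"}],  # particle
}

# Assumed preference order: noun, then adjective, then verb.
PREFERENCE = ("n", "a", "v")


def _rank(analysis):
    """Return the position of the analysis' word type in PREFERENCE."""
    pos = analysis["p"][0]
    return PREFERENCE.index(pos) if pos in PREFERENCE else len(PREFERENCE)


def pick_lemma(word, filter_by_postag=None, involve_unknown=True):
    """Pick one lemma for a word form from the toy dictionary."""
    analyses = TOY_DICT.get(word)
    if analyses is None:
        # Unknown word: pass it through unchanged or drop it.
        return word if involve_unknown else None
    if filter_by_postag is None:
        # No filtering requested: take the first analysis as listed.
        return analyses[0]["l"]
    # Keep only analyses whose first tag character is an allowed word type.
    kept = [a for a in analyses if a["p"][0] in filter_by_postag]
    if not kept:
        return None
    # Among the remaining analyses, take the most preferred word type.
    return min(kept, key=_rank)["l"]


words = list(TOY_DICT) + ["unknown-form"]
filtered = [
    pick_lemma(w, filter_by_postag=["n", "a", "v"], involve_unknown=False)
    for w in words
]
print(filtered)
```

In this sketch, words that are filtered out or not found come back as `None`, so a caller would drop those entries before building the final list of lemmata.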


Required language

| Name | Value |
| --- | --- |
| Python | >=3.4 |


Installation


Install the whl package anda-0.0.8:

    pip install anda-0.0.8.whl


Install the tar.gz package anda-0.0.8:

    pip install anda-0.0.8.tar.gz