fasttext-github-0.8.22


Description

fastText Python bindings
Property            Value
Operating system    -
File name           fasttext-github-0.8.22
Name                fasttext-github
Library version     0.8.22
Maintainer          []
Maintainer email    []
Author              Christian Puhrsch
Author email        cpuhrsch@fb.com
Home page           https://github.com/facebookresearch/fastText
URL                 https://pypi.org/project/fasttext-github/
License             MIT

fastText
========

`fastText <https://fasttext.cc/>`__ is a library for efficient learning of word representations and sentence classification.

Requirements
------------

`fastText <https://fasttext.cc/>`__ builds on modern Mac OS and Linux distributions. Since it uses C++11 features, it requires a compiler with good C++11 support. These include:

-  (gcc-4.8 or newer) or (clang-3.3 or newer)

You will need:

-  `Python <https://www.python.org/>`__ version 2.7 or >=3.4
-  `NumPy <http://www.numpy.org/>`__ & `SciPy <https://www.scipy.org/>`__
-  `pybind11 <https://github.com/pybind/pybind11>`__

Building fastText
-----------------

The easiest way to get the latest version of `fastText is to use pip <https://pypi.python.org/pypi/fasttext>`__.

::

    $ pip install fasttext

If you want to use the latest unstable release you will need to build from source using setup.py.

Now you can import this library with

::

    import fastText

Examples
--------

In general it is assumed that the reader already has good knowledge of fastText. For this consider the main `README <https://github.com/facebookresearch/fastText/blob/master/README.md>`__ and in particular `the tutorials on our website <https://fasttext.cc/docs/en/supervised-tutorial.html>`__.

We recommend you look at the `examples within the doc folder <https://github.com/facebookresearch/fastText/tree/master/python/doc/examples>`__.

As with any package you can get help on any Python function using the help function.

For example

::

    >>> import fastText
    >>> help(fastText.FastText)

    Help on module fastText.FastText in fastText:

    NAME
        fastText.FastText

    DESCRIPTION
        # Copyright (c) 2017-present, Facebook, Inc.
        # All rights reserved.
        #
        # This source code is licensed under the MIT license found in the
        # LICENSE file in the root directory of this source tree.

    FUNCTIONS
        load_model(path)
            Load a model given a filepath and return a model object.

        tokenize(text)
            Given a string of text, tokenize it and return a list of tokens
    [...]

IMPORTANT: Preprocessing data / encoding conventions
----------------------------------------------------

In general it is important to properly preprocess your data. In particular our example scripts in the `root folder <https://github.com/facebookresearch/fastText>`__ do this.

fastText assumes UTF-8 encoded text. All text must be `unicode for Python2 <https://docs.python.org/2/library/functions.html#unicode>`__ and `str for Python3 <https://docs.python.org/3.5/library/stdtypes.html#textseq>`__. The passed text will be `encoded as UTF-8 by pybind11 <https://pybind11.readthedocs.io/en/master/advanced/cast/strings.html?highlight=utf-8#strings-bytes-and-unicode-conversions>`__ before being passed to the fastText C++ library. This means it is important to use UTF-8 encoded text when building a model. On Unix-like systems you can convert text using `iconv <https://en.wikipedia.org/wiki/Iconv>`__.

fastText will tokenize (split text into pieces) based on the following ASCII characters (bytes). In particular, it is not aware of UTF-8 whitespace. We advise the user to convert UTF-8 whitespace / word boundaries into one of the following symbols as appropriate.

-  space
-  tab
-  vertical tab
-  carriage return
-  formfeed
-  the null character

The newline character is used to delimit lines of text. In particular, the EOS token is appended to a line of text if a newline character is encountered. The only exception is if the number of tokens exceeds the MAX\_LINE\_SIZE constant as defined in the `Dictionary header <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.h>`__. This means if you have text that is not separated by newlines, such as the `fil9 dataset <http://mattmahoney.net/dc/textdata>`__, it will be broken into chunks of MAX\_LINE\_SIZE tokens and the EOS token is not appended.

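The whitespace advice above can be sketched in Python. This is only an illustration, not fastText's tokenizer: the names ``ASCII_SEPARATORS``, ``split_on_ascii_separators`` and ``normalize_whitespace`` are hypothetical, and using ``str.isspace()`` to decide what counts as "UTF-8 whitespace" is an assumption that may not match every word boundary in your data.

```python
# Illustration only: a Python model of the splitting rule described above,
# not code taken from fastText.

# The ASCII characters the text lists as token separators.
ASCII_SEPARATORS = " \t\v\r\f\0"

# Translation table that maps every listed separator to a plain space.
_TO_SPACE = str.maketrans({ch: " " for ch in ASCII_SEPARATORS})


def split_on_ascii_separators(line: str) -> list:
    """Split one line on the listed ASCII separators only."""
    # str.split(" ") is used on purpose: a bare str.split() would also split
    # on Unicode whitespace, which the text says fastText is not aware of.
    return [tok for tok in line.translate(_TO_SPACE).split(" ") if tok]


def normalize_whitespace(text: str) -> str:
    """Replace any other whitespace (per str.isspace) with an ASCII space.

    Newlines are kept because they delimit lines of text.
    """
    return "".join(
        " " if ch.isspace() and ch != "\n" and ch not in ASCII_SEPARATORS else ch
        for ch in text
    )


raw = "hello\u00a0world"  # NO-BREAK SPACE between the two words
print(split_on_ascii_separators(raw))
print(split_on_ascii_separators(normalize_whitespace(raw)))
```

Without normalization the no-break space stays inside a single token. After normalization the same input splits into two tokens, which is usually what you want before training.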
The length of a token is measured in UTF-8 characters, not bytes: the `leading two bits of a byte <https://en.wikipedia.org/wiki/UTF-8#Description>`__ are used to identify `subsequent bytes of a multi-byte sequence <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.cc>`__, which are not counted. Knowing this is especially important when choosing the minimum and maximum length of subwords. Further, the EOS token (as specified in the `Dictionary header <https://github.com/facebookresearch/fastText/blob/master/src/dictionary.h>`__) is considered a character and will not be broken into subwords.
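The counting rule can be reproduced in a few lines of Python. This is a minimal sketch under the assumption that a byte whose leading two bits are ``10`` is a continuation byte and is skipped; the helper name ``utf8_char_count`` is hypothetical and not part of the fastText API.

```python
def utf8_char_count(token: str) -> int:
    """Count UTF-8 characters by skipping continuation bytes."""
    data = token.encode("utf-8")
    # Continuation bytes have the bit pattern 10xxxxxx,
    # i.e. (byte & 0xC0) == 0x80. Every other byte starts a new character.
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


# Compare the byte length with the character count for a few tokens.
for token in ("hello", "h\u00e9llo", "\u65e5\u672c\u8a9e"):
    print(token, len(token.encode("utf-8")), utf8_char_count(token))
```

For a Python 3 ``str`` this count equals ``len(token)``. The byte-level view matters because subword lengths are counted in characters, so a three-character CJK token occupies nine bytes but still has length three.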


Installation


Installing the whl package for fasttext-github-0.8.22:

    pip install fasttext-github-0.8.22.whl


Installing the tar.gz package for fasttext-github-0.8.22:

    pip install fasttext-github-0.8.22.tar.gz