معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Simple dense retrieval using SciPy, spaCy, and Sentence-Transformers

ویژگی	مقدار
سیستم عامل	-
نام فایل	fasdr-0.0.6
نام	fasdr
نسخه کتابخانه	0.0.6
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	David Marx
ایمیل نویسنده	david.marx84@gmail.com
آدرس صفحه اصلی	https://github.com/dmarx/fast-and-simple-dense-retrieval/
آدرس اینترنتی	https://pypi.org/project/fasdr/
مجوز	-

# FASDR: Fast and Simple Dense Retrieval 🚧 WORK IN PROGRESS 🚧 FASDR is a simple and lightweight library for fast and efficient document retrieval. It is designed to be easy to setup and use, built on top of popular, trusted components (`scipy`, `spacy`, `transformers`) to ensure it can be seamlessly integrated into existing projects and "just work". It's especially well suited for small-to-medium corpora, such as retrieval-augmented prompting of FOSS documentation. ### Features * Fast and efficient dense retrieval using KDTree data structures. * Simple interface for indexing and searching documents and sentences. * Support for various file formats and customizable indexing options. * Integration with the SpaCy and Sentence-BERT libraries for natural language processing and sentence embeddings. ## Installation First, install `fasdr` via pip: ``` pip install fasdr ``` Next, download sentence tokenization language model for spacy:  ``` python -m spacy download en_core_web_trf ``` ## Quick Start ### Indexing documents ### Quick Start To get started with FASDR, you can create a `DocumentIndex` object by passing in the root directory containing the documents you want to index: ```python from fasdr import DocumentIndex index = DocumentIndex("/path/to/documents") ``` Once you have created the DocumentIndex object, you can search for documents or sentences using the search_documents and search_sentences methods: ```python # Find the top five documents relevant to the query "climate change" results = index.search_documents("climate change", k=5) # Find the top 10 sentences after filtering on the top 5 documents results = index.search_sentences_targeted("climate change", n_docs=5, n_sents=10) ``` You can customize the behavior of the DocumentIndex object by specifying options such as the model name and the file extensions to include in the index: ```python index = DocumentIndex( "/path/to/documents", model_name="all-MiniLM-L6-v2", extensions=[".txt", ".md", ".pdf"] ) ``` ## Design FASDR is designed to be fast and simple, with a focus on ease of use and minimal setup. It uses FAISS for similarity search, which is a highly optimized library for dense vector search, and SpaCy with the Sentence-BERT component for embedding text. The library is built around two main classes: * `Document`: Represents a single document and its embeddings. * `DocumentIndex`: Represents an index of documents and their embeddings. `Document` objects are created by passing in the path to the document file, and can be used to search for similar sentences within the document. `DocumentIndex` objects are created by passing in the root directory containing the documents to index, and can be used to search for similar documents or sentences across all the indexed documents.

نیازمندی

مقدار	نام
-	numpy
-	spacy
-	sentence-transformers
-	scipy
-	spacy-sentence-bert
-	rich

نحوه نصب

نصب پکیج whl fasdr-0.0.6:

pip install fasdr-0.0.6.whl

نصب پکیج tar.gz fasdr-0.0.6:

pip install fasdr-0.0.6.tar.gz