

auction-scraper-0.4.2



Description

Extensible auction house scraper for ebay, liveauctioneers, catawiki, and other platforms
Field	Value
Operating system	-
File name	auction-scraper-0.4.2
Name	auction-scraper
Library version	0.4.2
Maintainer	[]
Maintainer email	[]
Author	Edd Salkield
Author email	edd@salkield.uk
Home page	https://github.com/dreamingspires/auction-scraper
Project URL	https://pypi.org/project/auction-scraper/
License	-
# Auction Scraper

> Scrape auction data from auction sites into a sqlite database
> Currently supports: catawiki, ebay, liveauctioneers
> Can be used as a CLI tool, or interfaced with directly

## Installation

You can [install with pip](https://pypi.org/project/auction-scraper/):

```
pip install auction-scraper
```

## New backend support

Want to scrape an auction house not listed above? Fear not - through our partnership with [Dreaming Spires](https://dreamingspires.dev), you can request that we build additional backend scrapers to extend the functionality. Email contact@dreamingspires.dev for more info.

We also accept PRs, so feel free to write your own backend and submit it, if you require. Instructions for this can be found under the _Building new backends_ section.

## Usage

`auction-scraper` will scrape data from auctions, profiles, and searches on the specified auction site. Resulting textual data is written to a `sqlite3` database, with images and backup web pages optionally being written to a _data directory_.

The tool is invoked as:

```
Usage: auction-scraper [OPTIONS] DB_PATH BACKEND:[ebay|liveauctioneers] COMMAND [ARGS]...

Options:
  DB_PATH                           The path of the sqlite database file to be written to  [required]
  BACKEND:[ebay|liveauctioneers]    The auction scraping backend  [required]
  --data-location TEXT              The path additional image and html data is saved to
  --save-images / --no-save-images  Save images to data-location. Requires --data-location  [default: False]
  --save-pages / --no-save-pages    Save pages to data-location. Requires --data-location  [default: False]
  --verbose / --no-verbose          [default: False]
  --base-uri TEXT                   Override the base url used to resolve the auction site
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                    Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                    Show completion for the specified shell, to copy it or customize the installation.
  --help                            Show this message and exit.

Commands:
  auction  Scrapes an auction site auction page.
  profile  Scrapes an auction site profile page.
  search   Performs a search, returning the top n_results results for each...
```

### Auction mode

In auction mode, an auction must be specified as either a unique _auction ID_ or as a URL. The textual data is scraped into the `[BACKEND]_auctions` table of `DB_PATH`, the page is scraped into `[data-location]/[BACKEND]/auctions`, and the images into `[data-location]/[BACKEND]/images`.

The `--base-uri` option determines the base URL from which to resolve _auction IDs_, _profile IDs_, and search _query strings_ if specified, otherwise defaulting to the default for the specified backend.

Example usage:

```bash
# Scraping an auction by URL
auction-scraper db.db liveauctioneers auction https://www.liveauctioneers.com/item/88566418_cameroon-power-or-reliquary-figure

# Equivalently scraping from an auction ID
auction-scraper db.db liveauctioneers auction 88566418

# Scraping an auction, including all images and the page itself, into data-location
auction-scraper --data-location=./data --save-images --save-pages db.db liveauctioneers auction 88566418
```
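The scraped rows land in ordinary sqlite tables named after the backend (the `[BACKEND]_auctions` convention above), so they can be inspected directly with the standard library. A quick sketch, assuming the `db.db` file and `liveauctioneers` backend from the examples above; column names are not documented here, so only table-level queries are shown:

```python3
import sqlite3

con = sqlite3.connect('db.db')

# List the tables the scraper has created so far
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)

# Count scraped auctions; the table name follows the [BACKEND]_auctions convention
if 'liveauctioneers_auctions' in tables:
    (count,) = con.execute('SELECT COUNT(*) FROM liveauctioneers_auctions').fetchone()
    print(count, 'auctions scraped')

con.close()
```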
### Profile mode

In profile mode, a profile must be specified as either a unique _user ID_ or as a URL. The textual data is scraped into the `[BACKEND]_profiles` table of `DB_PATH`, and the page is scraped into `[data-location]/[BACKEND]/profiles`.

The `--base-uri` option determines the base URL from which to resolve _auction IDs_, _profile IDs_, and search _query strings_ if specified, otherwise defaulting to the default for the specified backend.

Example usage:

```bash
# Scraping a profile by URL
auction-scraper db.db liveauctioneers profile https://www.liveauctioneers.com/auctioneer/197/hindman/

# Equivalently scraping from a profile ID
auction-scraper db.db liveauctioneers profile 197

# Scraping a profile, including the page itself, into data-location
auction-scraper --data-location=./data --save-pages db.db liveauctioneers profile 197
```

### Search mode

In search mode, at least one `QUERY_STRING` must be provided alongside `N_RESULTS`. It scrapes the auctions pertaining to the top `N_RESULTS` results for each `QUERY_STRING`.

The `--base-uri` option determines the base URL from which to resolve the search if specified, otherwise defaulting to the default for the specified backend.

Example usage:

```bash
# Search one result by a single search term
auction-scraper db.db liveauctioneers search 1 "mambila art"

# Search ten results by two search terms, scraping images and pages into data-location
auction-scraper --data-location=./data --save-images --save-pages db.db liveauctioneers search 10 "mambila" "mambilla"
```

## Running continuously using systemd

`auction-scraper@.service` and `auction-scraper@.timer`, once loaded by systemd, can be used to run `auction-scraper` with user-given arguments according to a schedule.

### Running as a systemd root service

Copy `auction-scraper@.service` and `auction-scraper@.timer` to `/etc/systemd/system/`. Modify `auction-scraper@.timer` to specify the schedule you require.

Reload the system daemons. As root:

```bash
systemctl daemon-reload
```

Run (start now) and enable (restart on boot) the systemd timer, specifying the given arguments, within quotes, after the '@'. For example, as root:

```bash
systemctl enable --now auction-scraper@"db.db liveauctioneers search 10 mambila".timer
```

Find information about your running timers with:

```bash
systemctl list-timers
```

Stop your currently running timer with:

```bash
systemctl stop auction-scraper@"db.db liveauctioneers search 10 mambila".timer
```

Disable your currently running timer with:

```bash
systemctl disable auction-scraper@"db.db liveauctioneers search 10 mambila".timer
```

A new timer is created for each unique argument string, so the arguments must be specified when stopping or disabling the timer.

Some modification may be required to run as a user service, including placing the service and timer files in `~/.local/share/systemd/user/`.

## Building from source

Ensure poetry is [installed](https://python-poetry.org/docs/#installation). Then, from this directory, install the dependencies into the poetry virtual environment and build:

```bash
poetry install
poetry build
```

Source and wheel files are built into `auction_scraper/dist`. Install the wheel for your user with `pip`, outside the venv:

```bash
cd ./dist
python3 -m pip install --user ./auction_scraper-0.0.1-py3-none-any.whl
```

or

```bash
cd ./dist
pip install ./auction_scraper-0.0.1-py3-none-any.whl
```

Run `auction-scraper` to invoke the utility.

## Interfacing with the API

Each backend of `auction-scraper` can also be invoked as a Python library to automate its operation.
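As a rough, hypothetical sketch of what that can look like: the import path and constructor arguments below are assumptions about the package layout, not documented API; only the `scrape_*_to_db` method names and return values come from the interface described next.

```python3
# Hypothetical usage sketch -- the module path and constructor signature are
# assumptions; the scrape_*_to_db methods follow the documented interface.
from auction_scraper.scrapers.liveauctioneers.scraper import AuctionScraper

scraper = AuctionScraper('db.db')           # constructor arguments assumed

scraper.scrape_auction_to_db('88566418')    # by auction ID, returns a BaseAuction
scraper.scrape_profile_to_db('197')         # by profile ID, returns a BaseProfile

# Returns a tuple ([BaseAuction], [BaseProfile])
auctions, profiles = scraper.scrape_search_to_db(['mambila art'], n_results=10)
print(len(auctions), 'auctions scraped')
```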
The backends implement the abstract class `auction_scraper.abstract_scraper.AbstractAuctionScraper`, alongside the abstract SQLAlchemy models `auction_scraper.abstract_models.BaseAuction` and `auction_scraper.abstract_models.BaseProfile`. The resulting scraper exposes methods to scrape auction, profile, and search pages into these SQLAlchemy model objects, according to the following interface:

```
def scrape_auction(self, auction, save_page=False, save_images=False):
    """
    Scrapes an auction page, specified by either a unique auction ID or a URI.
    Returns an auction model containing the scraped data.
    If specified by auction ID, constructs the URI using self.base_uri.
    If self.page_save_path is set, writes out the downloaded pages to disk at
    the given path according to the naming convention specified by
    self.auction_save_name.
    Returns a BaseAuction
    """
```

```
def scrape_profile(self, profile, save_page=False):
    """
    Scrapes a profile page, specified by either a unique profile ID or a URI.
    Returns a profile model containing the scraped data.
    If specified by profile ID, constructs the URI using self.base_uri.
    If self.page_save_path is set, writes out the downloaded pages to disk at
    the given path according to the naming convention specified by
    self.profile_save_name.
    Returns a BaseProfile
    """
```

```
def scrape_search(self, query_string, n_results=None, save_page=False, save_images=False):
    """
    Scrapes a search page, specified by either a query_string and n_results,
    or by a unique URI.
    If specified by query_string, de-paginates the results and returns up to
    n_results results. If n_results is None, returns all results.
    If specified by a search_uri, returns just the results on the page.
    Returns a dict {auction_id: SearchResult}
    """
```

```
def scrape_auction_to_db(self, auction, save_page=False, save_images=False):
    """
    Scrape an auction page, writing the resulting page to the database.
    Returns a BaseAuction
    """
```

```
def scrape_profile_to_db(self, profile, save_page=False):
    """
    Scrape a profile page, writing the resulting profile to the database.
    Returns a BaseProfile
    """
```

```
def scrape_search_to_db(self, query_strings, n_results=None,
        save_page=False, save_images=False):
    """
    Scrape a set of query_strings, writing the resulting auctions and
    profiles to the database.
    Returns a tuple ([BaseAuction], [BaseProfile])
    """
```

## Building new backends

All backends live at `auction_scraper/scrapers`, each in its own directory. Each backend should implement the abstract class `auction_scraper.abstract_scraper.AbstractAuctionScraper` in a file `scraper.py`, and the abstract SQLAlchemy models `auction_scraper.abstract_models.BaseAuction` and `auction_scraper.abstract_models.BaseProfile` in `models.py`.

The `AuctionScraper` class must extend `AbstractAuctionScraper` and implement the following methods:

```python3
# Given a uri, scrape the auction page into an auction object (of type BaseAuction)
def _scrape_auction_page(self, uri)

# Given a uri, scrape the profile page into a profile object (of type BaseProfile)
def _scrape_profile_page(self, uri)

# Given a uri, scrape the search page into a list of results (of type {auction_id: SearchResult})
def _scrape_search_page(self, uri)
```

It must also supply defaults to the following variables:

```python3
auction_table
profile_table
base_uri
auction_suffix
profile_suffix
search_suffix
backend_name
```

A rough skeleton illustrating these requirements is sketched at the end of this README.

## Authors

Edd Salkield <edd@salkield.uk> - Main codebase
Mark Todd - Liveauctioneers scraper
Jonathan Tanner - Catawiki scraper
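Tying the _Building new backends_ requirements together, here is a rough, hypothetical skeleton of a new backend's `scraper.py`. Everything site-specific (the example site, URI suffix formats, and model class names) is invented for illustration; only the base class, the required class attributes, and the three `_scrape_*_page` methods come from the section above.

```python3
# Hypothetical skeleton for auction_scraper/scrapers/examplesite/scraper.py.
# Site name, suffix formats, and model names are placeholders.
from auction_scraper.abstract_scraper import AbstractAuctionScraper
from .models import ExampleSiteAuction, ExampleSiteProfile  # subclasses of BaseAuction / BaseProfile

class AuctionScraper(AbstractAuctionScraper):
    # Defaults the abstract scraper requires
    auction_table = ExampleSiteAuction
    profile_table = ExampleSiteProfile
    base_uri = 'https://www.example-auctions.com'
    auction_suffix = '/item/{}'
    profile_suffix = '/seller/{}'
    search_suffix = '/search?q={}'
    backend_name = 'examplesite'

    def _scrape_auction_page(self, uri):
        # Download and parse `uri`, returning a populated ExampleSiteAuction
        raise NotImplementedError

    def _scrape_profile_page(self, uri):
        # Download and parse `uri`, returning a populated ExampleSiteProfile
        raise NotImplementedError

    def _scrape_search_page(self, uri):
        # Download `uri` and return a dict of {auction_id: SearchResult}
        raise NotImplementedError
```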


Requirements

Name	Value
babel	>=2.8.0,<3.0.0
bs4	>=0.0.1,<0.0.2
datetime	>=4.3,<5.0
pathlib	>=1.0.1,<2.0.0
python-dateutil	>=2.8.1,<3.0.0
requests	>=2.24.0,<3.0.0
selenium	>=3.141.0,<4.0.0
slimit	>=0.8.1,<0.9.0
sqlalchemy	>=1.4.0,<2.0.0
sqlalchemy_utils	>=0.37.8,<0.38.0
termcolor	>=1.1.0,<2.0.0
typer	>=0.3.2,<0.4.0
validators	>=0.18.0,<0.19.0


Required language

Name	Value
Python	>=3.7,<4.0


How to install


Install the auction-scraper-0.4.2 whl package:

    pip install auction-scraper-0.4.2.whl


Install the auction-scraper-0.4.2 tar.gz package:

    pip install auction-scraper-0.4.2.tar.gz
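
To confirm which version was installed, one option is to query the package metadata from Python (standard library `importlib.metadata`, available on Python 3.8+; the distribution name is the PyPI project name shown above):

    from importlib.metadata import version

    # Distribution name as published on PyPI; expected output: 0.4.2
    print(version('auction-scraper'))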