DBB Ranking Parser
==================
Extract league rankings from the DBB_ (Deutscher Basketball Bund e.V.)
website.
This library has been extracted from the web application behind the
website of the `BTB Royals Oldenburg`_ (a basketball team from
Oldenburg, Germany) where it has proven itself for many, many years.
Requirements
------------
- Python_ 3.6+
- lxml_
Installation
------------
Install this package via pip_:
.. code:: sh
$ pip install dbb-ranking-parser
Because of the dependency on lxml_, this will also require the header
files for the targeted Python_ version as well as those for libxml2_ and
libxslt_.
On `Debian Linux`_, one should be able to install these from the
distribution's repositories (as the 'root' user):
.. code:: sh
# aptitude update
# aptitude install python3.7-dev libxml2-dev libxslt1-dev
Apart from that (for example, if those packages are not yet installed)
it might be easier to install Debian's pre-built binary packages for
lxml_ instead:
.. code:: sh
# aptitude update
# aptitude install python-lxml
Usage
-----
To fetch and parse a league ranking, the appropriate URL is required.
It can be obtained on the DBB_ website. On every league's ranking page
there should be a link to a (non-"XL") HTML print version.
Its target URL should look like this (assuming the league's ID is
12345):
``http://www.basketball-bund.net/public/tabelle.jsp?print=1&viewDescKey=sport.dbb.views.TabellePublicView/index.jsp_&liga_id=12345``
The league ID has to be identified manually in any of the URLs specific
for that league (ranking, schedule, stats).
For convenience, specifying only the league ID is sufficient; the URL
will be assembled automatically. (Obviously, this might break when the
URL structure changes on the DBB website.)
Programmatically
~~~~~~~~~~~~~~~~
.. code:: python
from dbbrankingparser import load_ranking_for_league
league_id = 12345
ranking = list(load_ranking_for_league(league_id))
top_team = ranking[0]
print('Top team:', top_team['name'])
The URL can be specified explicitly, too:
.. code:: python
from dbbrankingparser import load_ranking_from_url
URL = '<see example above>'
ranking = list(load_ranking_from_url(URL))
Note that a call to a ``load_ranking_*`` method returns a generator. To
keep its elements around, and also to access them by index, they can be
fed into a list (as shown above).
On the Command Line
~~~~~~~~~~~~~~~~~~~
The package includes a command line script to retrieve a league's
rankings non-programmatically, as JSON. It requires a league ID as its
sole argument:
.. code:: sh
$ dbb-ranking-parser get 12345
[{"name": "Team ACME", "rank": 1, …}]
Via HTTP
~~~~~~~~
Also included is an HTTP wrapper around the parser.
To spin up the server:
.. code:: sh
$ dbb-ranking-parser serve
Listening for HTTP requests on 127.0.0.1:8080 ...
The server will attempt to look up a ranking for requests with an URL
part of the form ``/<league id>``:
.. code:: sh
$ curl http://localhost:8080/12345
[{"name": "Team ACME", "rank": 1, …}]
Docker
------
DBB Ranking Parser can also be run in a Docker_ container. This avoids
the local creation of a virtual environment and the installation of the
packages, or be useful in a deployment where containers are used.
Building a Docker_ image requires:
- Docker_ being installed
- a source copy of the `dbb-ranking-parser` package
In the package path:
.. code:: sh
$ docker build -t dbb-ranking-parser .
This should build a Docker_ image based upon `Alpine Linux`_ and which
includes Python_ 3, lxml_ and the DBB Ranking Parser itself. It should
be roughly 70 MB in size.
Running the Docker container accepts the same arguments as the command
line script.
To fetch a single ranking:
.. code:: sh
$ docker run --rm dbb-ranking-parser get 12345
[{"name": "Team ACME", "rank": 1, …}]
To spin up the HTTP server on port 7000 of the host machine:
.. code:: sh
$ docker run -p 7000:8080 --rm dbb-ranking-parser serve --host 0.0.0.0 --port 8080
The ``--rm`` option causes a container (but not the image) to be removed
after it exits.
.. _DBB: https://www.basketball-bund.net/
.. _BTB Royals Oldenburg: http://www.btbroyals.de/
.. _Python: https://www.python.org/
.. _pip: http://www.pip-installer.org/
.. _lxml: https://lxml.de/
.. _libxml2: http://xmlsoft.org/XSLT/
.. _libxslt: http://xmlsoft.org/XSLT/
.. _Debian Linux: https://www.debian.org/
.. _Docker: https://www.docker.com/
.. _Alpine Linux: https://alpinelinux.org/
:Copyright: 2006-2021 Jochen Kupperschmidt
:License: MIT, see LICENSE for details.
:Website: http://homework.nwsnet.de/releases/4a51/#dbb-ranking-parser
DBB Ranking Parser Changelog
============================
Version 0.4.2
-------------
Released on February 20, 2021
- Fixed description of how to run the HTTP server in a Docker container.
Version 0.4.1
-------------
Released on February 13, 2021
- Fixed reStructuredText issues in changelog which prevented a release
on PyPI.
Version 0.4
-----------
Released on February 13, 2021
- Added support for Python 3.6, 3.7, 3.8, and 3.9.
- Dropped support for Python 3.4 and 3.5 (which are end-of-life).
- Updated lxml to at least version 4.6.2.
- Moved package metadata from ``setup.py`` to ``setup.cfg``.
- Switched to a ``src/`` project layout.
- Added type hints (PEP 484).
- Ported tests from ``unittest`` to pytest.
- Merged basic and HTTP server command line interfaces into a single
argument parser with subcommands ``get`` and ``serve``. Removed
``dbb-ranking-server`` entrypoint.
- Renamed command line entrypoint to ``dbb-ranking-parser``.
- Added command line option ``--version`` to show the application's
version.
- Merged the previous three ``Dockerfile`` files into a single one.
- Updated and simplified Docker image and build process by upgrading
Alpine Linux to 3.13 and installing lxml as a binary package,
removing the need for local compilation.
Version 0.3.1
-------------
Released March 10, 2016
- Allowed to specify the HTTP server's host and port on the command
line.
- Fixed ``Dockerfile`` for the HTTP server to bind it to a public address
instead of localhost so that exposing the port actually works.
Version 0.3
-----------
Released March 8, 2016
- Added HTTP server that wraps the parser and responds with rankings as
JSON.
- Added ``Dockerfile`` files for the command line script and the HTTP
server.
Version 0.2
-----------
Released March 6, 2016
- It is now sufficient to specify just the league ID instead of the full
URL. The latter is still possible, though.
- Added a command line script to retrieve a league's ranking as JSON.
- Return nothing when parsing irrelevant HTML table rows.
- Return extracted ranks as a generator instead of a list.
- Split code over several modules.
Version 0.1
-----------
Released March 5, 2016
- first official release