collective.contentgenerator 0.2
===============================
This package was written as part of the Bristol Plone Performance Sprint 2008 by
Ed Crewe (`ILRT
<http://www.ilrt.bris.ac.uk/>`_ at University of Bristol) and Matt Sital-Singh
(`Netsight
<http://www.netsight.co.uk/>`_) to deliver
consistent content profiles for performance testing Plone.
The purpose of this egg is to create populated Plone sites with dummy content
and users. RSS feeds are crawled to generate realistic content for testing
against. Textual content, links, images, groups, users and local roles are
generated. Content profiles can be specified, and two default ones are
supplied: intranet and public.
Overview
--------
This package creates populated Plone sites using dummy content and users.
It is a likely partner package to the buildouts in `collective.loadtesting
<https://svn.plone.org/svn/collective/collective.loadtesting>`_, which use `funkload
<http://funkload.nuxeo.org/>`_ for functional and load testing of Plone sites.
Ideally we want an egg that generates content as close to real-world sites as
possible for testing against, whilst keeping that content consistent per profile.
The starting point is an egg that installs a set of two profiles:
intranet and public website. These profiles can be adjusted for specific testing
purposes, but keeping a minimal set of base
standards makes relative testing between code sets more
achievable.
The hope is that this can serve three groups of people who test code: core Plone
developers, third-party developers and end
users with sites for which performance is an issue.
Consistent content profiles should also support buildbot load testing
comparisons over long timescales and across code versions,
for both an unrealistic empty Plone site and these two more real-world profiles.
It was felt that crawling newsfeed content, with its tags, links, real
sentences and natural language, is more
realistic than a bulk plain Latin text generator such as lorem ipsum.
Similarly, users and groups are generated with proper names, rather than the
user1-100 approach.
Highly consistent generated text has its place for load testing
specific parts of Plone such as the catalog, but
well-controlled harvested text can serve this purpose just as well and presents a
more realistic testing target.
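
As an illustration of the harvesting idea, the sketch below turns RSS items into
dictionaries of page fields using only the Python standard library. It is not
the package's own crawler: the sample feed, the ``items_to_pages`` function and
the field names are assumptions made for this example, and the feed is inlined
so that no network access is needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example news</title>
    <item>
      <title>Council approves new cycle path</title>
      <link>http://example.org/news/cycle-path</link>
      <description>The path will link the harbour to the station.</description>
      <category>transport</category>
      <category>local</category>
    </item>
    <item>
      <title>Library extends opening hours</title>
      <link>http://example.org/news/library</link>
      <description>Weekend opening starts next month.</description>
    </item>
  </channel>
</rss>
"""


def items_to_pages(rss_text):
    """Turn each RSS <item> into a dict of fields for a dummy page."""
    root = ET.fromstring(rss_text)
    pages = []
    for item in root.iter("item"):
        pages.append({
            "title": item.findtext("title", default=""),
            "text": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
            # Feed categories become tags; items without any get an empty list.
            "tags": [c.text for c in item.findall("category") if c.text],
        })
    return pages


pages = items_to_pages(SAMPLE_RSS)
```

Each dictionary carries a real title, sentence, link and tag set, which is the
kind of variety a lorem ipsum generator does not provide.
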
NB: The RSS content source paradigm makes it possible to deliver content
profiles with a large percentage of image or file content
as part of the same process, and it could also be adapted to point to foreign
language feeds etc.
Currently large images are used to add blobs to the database, a
common profile for intranets where a large number
of bespoke file formats may be uploaded. Clearly this and a number of other
performance issues can be dealt with by employing
the correct add-ons to remove blobs from the ZODB, cache the interface etc.
However, the aim of this egg is to generate
the content by default means. As a user of the egg you can then get relative
quantitative data on the performance difference
for the same content profile when applying any or all of these techniques.
Progress
--------
Currently the first release (0.2) of the egg is finished and released to PyPI,
with the latest source in the collective:
https://svn.plone.org/svn/collective/collective.contentgenerator/trunk/collective.contentgenerator
It has the levers to pull to generate the two profiles above, i.e. generate
content from crawls, add users, groups and local roles etc.
A buildbot and load testing setup can now use it with one of the Plone site
creation buildout recipes to select one of the two profiles.
Profiles
--------
See setuphandlers.py for the details:

1. Intranet: 200 MB, 25 thousand objects, about 10 minutes to generate.
   100 users, 10 groups, 45 folders of 10 pages and 10 images (half small and
   half large), local roles off.
2. Public website: 200 MB, 120 thousand objects, about 20 minutes to generate.
   2000 users, 500 groups, 150 folders of 20 pages and 6 images (one large),
   local roles on.

Having a number of profiles for the content generator caters for the selection
of different types of site.
The profiles could also be used to set related GenericSetup options, e.g. many
users optimisation, intranet workflow etc., but currently they
both use standard out-of-the-box Plone settings.
If a custom content profile is desired, copy the setupVarious settings
dictionaries and the call to runSetup
from setuphandlers and adjust the values accordingly for use by a profile in
your own theme egg.
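
The copy-and-adjust step might look something like the sketch below. The key
names, the ``INTRANET_SETTINGS`` dictionary and the ``make_profile`` helper are
hypothetical and chosen for illustration only; the actual dictionaries in
setuphandlers.py may be structured differently, and the runSetup call is left
out because its signature is not described here. The values are the intranet
figures quoted in the Profiles list.

```python
import copy

# Hypothetical settings layout: the key names are illustrative only.
# The values are the intranet figures quoted in the Profiles list.
INTRANET_SETTINGS = {
    "users": 100,
    "groups": 10,
    "folders": 45,
    "pages_per_folder": 10,
    "images_per_folder": 10,
    "local_roles": False,
}


def make_profile(base, **overrides):
    """Copy a base settings dict and override selected values."""
    unknown = set(overrides) - set(base)
    if unknown:
        # Reject misspelt keys rather than silently adding new settings.
        raise KeyError("unknown settings: %s" % ", ".join(sorted(unknown)))
    settings = copy.deepcopy(base)
    settings.update(overrides)
    return settings


# A smaller intranet with local roles switched on.
SMALL_INTRANET = make_profile(
    INTRANET_SETTINGS, users=20, folders=5, local_roles=True
)
```

Working on a copy leaves the base profile untouched, so results from the custom
profile can still be compared against the standard one.
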
Crawling optimisation
---------------------
The crawl and creation of content is only done once; the result is then exported to zexp
files in the import directory of a site build.
If these files are present, this step is skipped and the site generation time is
reduced.
If fresh data is desired for some reason, just delete the zexps in the import
directory.
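
The skip-if-cached behaviour described above can be sketched as follows. This
is a minimal illustration, not the package's own code: the function names are
hypothetical, and it assumes the exports are files ending in ``.zexp`` placed
directly in the import directory.

```python
from pathlib import Path


def cached_exports(import_dir):
    """Return the .zexp files found in the import directory, sorted by name."""
    return sorted(Path(import_dir).glob("*.zexp"))


def needs_crawl(import_dir):
    """A crawl is only needed when no exported content is present."""
    return not cached_exports(import_dir)


def clear_cached_exports(import_dir):
    """Delete cached .zexp files so the next site build crawls fresh data."""
    removed = cached_exports(import_dir)
    for path in removed:
        path.unlink()
    return [path.name for path in removed]
```

Calling ``clear_cached_exports`` on the import directory has the same effect as
deleting the zexps by hand: the next call to ``needs_crawl`` returns ``True``.
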
Documentation
-------------
Further documentation of the internal structure and workings of the package is
provided in the doctests within the `tests
<https://svn.plone.org/svn/collective/collective.contentgenerator/trunk/collective.contentgenerator/collective/contentgenerator/tests>`_
directory.
Dependencies
------------
This egg requires Plone 3.0 or later.
(NB: To cater for full backwards compatibility it would be useful to
rewrite
it as an old-style Zope Product. This would allow performance comparison
testing
of content profiles across all Plone versions.)
When a profile is first used, the system needs access to the internet to crawl
content from RSS feeds.
Changelog for collective.contentgenerator
-----------------------------------------
(name of developer listed in brackets)

* collective.contentgenerator 0.2 (2009-01-08)

  - Completion of first release functionality, tests and documentation
    [ed crewe]

* collective.contentgenerator - Unreleased (2008-12-15)

  - Initial package work as part of Bristol 2008 performance sprint
    [ed crewe, matt sital-singh]