PLDWWW: TODO/BuilderInfrastructure

mmazur mmazur at pld-linux.org
Mon Aug 14 18:38:05 CEST 2006


Author: mmazur   Date: Mon Aug 14 16:38:05 2006 GMT
Module: PLDWWW   URL: http://www.pld-linux.org/TODO/BuilderInfrastructure
---- Log message:
Toodoo

---- Page affected: TODO/BuilderInfrastructure

---- Diffs:

================================================================
New page:
#language en

= Builder Infrastructure TODO List =

It is advisable to read up on how the [http://cvs.pld-linux.org/cgi-bin/cvsweb/pld-builder.new/doc/ARCHITECTURE?rev=HEAD builder infrastructure works] before diving into this text.

Note: the whole system is written in Python.

== Atomic upgrades across builders ==

Currently automatic upgrades on builders work like this -- if a package is built successfully and the request has the 'upgrade' flag set, the newly built packages get upgraded over what was previously installed on a given builder. The status of an upgrade (OK/FAILED) gets mailed to the original requester and to the src.builder.

This has some really ugly side effects -- it is quite common for a package to build successfully on four builders out of five, and then for one of those four upgrades to fail for whatever reason. This leaves us with desynchronized builders, since three builders have the new version while the other two still have the old one.

The proposed solution goes something like this -- bin.builders should not react to the 'upgrade' flag by themselves. They should happily ignore it. It is the src.builder that should react to the build status messages sent in by bin.builders. After it knows that all builders managed to successfully build a given package, it should send out an upgrade request to the bin.builders, wait for them to send in the upgrade status (OK/FAILED) and then send back one message to the requester saying something like '4/5 upgrades successful'.

Should those 'partial upgrades' occur often, it should be possible to add a rollback mechanism -- that is, if the src.builder finds that some of the upgrades failed, it could issue a downgrade command to the bin.builders that did upgrade, to avoid desynchronization (it should use the rpm rollback mechanism).

Required changes:

Binary builders:

 * Stop responding to the 'upgrade' flag.
 * Add support for a new 'upgrade' type request.

Source builders:

 * Add support for the new 'upgrade' type request.
 * Introduce logic for issuing those upgrade requests and informing requesters about the results.
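
The src.builder side of this could be sketched roughly as below. This is only an illustration of the proposed flow, not code from pld-builder.new: the class name `UpgradeCoordinator`, its methods and the builder names in the usage note are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UpgradeCoordinator:
    """Hypothetical src.builder-side state for a single package request."""
    builders: list
    build_status: dict = field(default_factory=dict)    # builder -> True/False
    upgrade_status: dict = field(default_factory=dict)  # builder -> True/False

    def record_build(self, builder, ok):
        # Called when a bin.builder mails in its build status.
        self.build_status[builder] = ok

    def all_built(self):
        # Upgrade requests go out only once every builder reported success.
        return all(self.build_status.get(b) is True for b in self.builders)

    def record_upgrade(self, builder, ok):
        # Called when a bin.builder mails in its upgrade status (OK/FAILED).
        self.upgrade_status[builder] = ok

    def all_reported(self):
        return all(b in self.upgrade_status for b in self.builders)

    def summary(self):
        # The single message sent back to the requester.
        ok = sum(1 for b in self.builders if self.upgrade_status.get(b))
        return "%d/%d upgrades successful" % (ok, len(self.builders))

    def builders_to_downgrade(self):
        # Optional rollback: if any upgrade failed, the builders that did
        # upgrade are the ones that would need a downgrade request.
        if all(self.upgrade_status.get(b) for b in self.builders):
            return []
        return [b for b in self.builders if self.upgrade_status.get(b)]
```

With three hypothetical builders, two successful upgrades and one failure, `summary()` gives '2/3 upgrades successful' and `builders_to_downgrade()` names the two builders that would have to be rolled back.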

== Desync checker ==

After taking care of the main cause of desynchronizations among builders (described in the previous point) it'd be a good idea to have an automatic system check from time to time whether a desync has occurred. That would allow us to quickly notice any other causes of desyncs and maybe even have a system in place for fixing them automatically (it could simply send out upgrade requests when appropriate).

The builder part of it is quite simple. The bin.builders should be taught that every, let's say, 50 requests (on requests numbered 50, 100, 150...) they should send out a complete "rpm -qa" listing somewhere. The remote part of the system could then process that information and react appropriately.
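
A minimal sketch of both halves follows. The function names and the report format (a mapping from builder name to its package list) are assumptions made for illustration; nothing here is taken from the existing builder code, and where the listing gets sent is left open, as in the text. It also assumes that package strings are directly comparable across builders, which may not hold if the listing includes architecture suffixes.

```python
import subprocess

REPORT_EVERY = 50  # the example interval from the text


def should_report(request_number, every=REPORT_EVERY):
    # True on requests numbered 50, 100, 150...
    return request_number > 0 and request_number % every == 0


def installed_packages():
    # Builder side: the complete "rpm -qa" listing (needs rpm installed).
    result = subprocess.run(["rpm", "-qa"], capture_output=True,
                            text=True, check=True)
    return sorted(result.stdout.split())


def find_desync(reports):
    # Remote side. `reports` maps builder name -> list of package strings.
    # Returns package -> sorted list of builders that do not have it.
    sets = {builder: set(pkgs) for builder, pkgs in reports.items()}
    everything = set().union(*sets.values())
    desync = {}
    for pkg in sorted(everything):
        missing = sorted(b for b, s in sets.items() if pkg not in s)
        if missing:
            desync[pkg] = missing
    return desync
```

An empty result from `find_desync` means all reporting builders list exactly the same packages; any other result tells the remote side which builders would need an upgrade request.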

== General chroot cleaner ==

The system mentioned in the previous point could also check for other problems and take care of them, mostly by sending out requests for deinstallation and notices to developers. Examples are the presence of -static packages (which from time to time interfere with the build process, so it's usually better not to have them inside builders), the presence of duplicated packages (usually they shouldn't be there) and any other potential problems we can come up with.
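
The two checks named above could look something like this when run over a builder's package listing. The helper names are hypothetical, and the sketch assumes each entry has the plain name-version-release form; entries in any other form are treated as bare names.

```python
from collections import Counter


def package_name(nvr):
    # Strip the trailing version and release from name-version-release.
    parts = nvr.rsplit("-", 2)
    return parts[0] if len(parts) == 3 else nvr


def static_packages(nvrs):
    # Entries whose package name ends in -static.
    return sorted(p for p in nvrs if package_name(p).endswith("-static"))


def duplicated_packages(nvrs):
    # Package names installed in more than one version or release.
    counts = Counter(package_name(p) for p in nvrs)
    return sorted(name for name, count in counts.items() if count > 1)
```

The results of both checks would then feed the deinstallation requests and developer notices described above.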

== Write complete docs ==

They're partially done. Available [http://cvs.pld-linux.org/cgi-bin/cvsweb/pld-builder.new/doc/ here].

