14th Quattor Workshop Summary (October 2012)

Written
29 October 2012
Author
Michel Jouvin

Quattor Workshop - Univ. of Gent - 29-31/10/2012

Agenda

Site Reports

IHEP

2 annoying problems

  • lock error at installation due to ccm-fetch installing before ccm-initialize
  • AII –configurelist fails to configure nodes with different HW configs (mix up the disk names)

Aquilon at RAL

Achetype: highest grouping of hosts in Aquilon

  • Similar to SCDB sites

Personality: similar to QWG machine type

  • Built by assembling ‘‘features’’ (e.g. apache-httpd, nfs-server)
  • No possibility to cross-reference personalities (include a personality into another one)

Service: a feature with servers and clients

  • Allow to manage the binding between clients and servers

HW description based on:

  • Model: the generic HW description
  • Machine: a ‘‘model’’ instance

Host: a ‘‘machine’’ with an OS and a network config

Sandboxes: allow development of templates without impacting the production

  • Includes the possibility to deploy a machine from the sandbox for testing

Some work needed to integrate QWG into layout expected by Aquilon: mostly done by “hooks”

  • Would be great to have a wiki page summarizing what needs to be done that could be used as input for future QWG developments and restructuring

Quattor profile schema slightly modified to support “metadata” information describing where the profile is coming from

  • Should be pushed to the core repository

Currently, no history available for modifications done by aq command

  • Plans to add to Git the plenary templates (templates generated by the broker)

Internal authentication requires Kerberos

  • Can use Windows Krb

To foster adoption need both to package it and document the basic configuration

  • Documentation may be started on a wiki but the Pan book may be used as the model for the real documentation

Core Tools

QWG

See slides.

Main technical issue: integration with Aquilon

  • Need to better understand the changes that may be needed

Open questions

  • A new relase manager taking over from Michel: Guillaume? Jerome?
  • How to better organize QWG templates into well defined subsets?
  • Easier for users to start without pulling a lot of unneeded things
  • Easier sharing of release manager role between several people
  • Move the QWG repository to Git to ease integration of contributions?
  • Easier and more flexible merge
  • May need to forget about SVN history: previous attempt to migrate it failed: not necessarily a problem if we want to reorganize QWG templates into different subsets

ncm-metaconfig

A filecopy replacement, able to do handle content templating and to validate the configuration provided against a schema.

  • Template system is implemented in ncm-ncd
  • Same features as filecopy regarding ability to restart a service: would try to avoid arbitrary commands

See if it is possible to ensure that metaconfig is a superset for filecopy and download and obsolete them

panc

Current version is 9.2

9.3: development version

  • Significant parts now in clojure
  • Ready to generate multiple output formats
  • Bug fix: timestamp check on all output files
  • xmldb removal?
  • Bug fix: Java 1.7 support
  • Bug fix: catch invalid replace string exception
  • Allow negative values in range expressions
  • annotation/compilation split into 2 commands
  • Change of options for panc command to streamline use: still need to update Ant and Maven
  • panc.old still here and support the old options to ease the transition: should disappear at the next release
  • Ant and Maven will support both option sets
  • Release planned soon after the workshop

Future:

  • more clojure…
  • Plan to remove escaping as soon as xmlbdb support has been dropped off
  • First optional: through a switch
  • Requires an updated version of CCM (tbd by Luis)

SPMA + YUM

Good reasons for SPMA design but:

  • Controlling the version of each package is turning into a nightmare
  • Dependency resolution is a must

New ncm-spma developped uisng a YUM backend rather than spma

  • Use the same configuration information
  • But for most packages, it’s enough to have an empty nlist for the package: huge time saving if not executing pkcg_xxx()
  • No more need for repository resolution: huge improvement in compilation time
  • Small reduction in profile size (~10%)
  • Repository templates are reduced to the declaration of the YUM repository, not its contents
  • Updating a repository will trigger a node update: may need to use repository snapshots for deployment to avoid permanent upgrades
  • Clearly the area needing some efforts and experience feedback but they are standard tools available for that
  • Support for rollbacks with YUM history plugins

Additional benefits

  • No more use of spma and rpmt-py

Missing bit

  • AII update: do not run spma anymore
  • Use Kickstart ‘repository’ directive instead
  • GPG key support
  • Package blacklisting

Development Process

Scrum process - Releases

How useful are the weekly meeting?

  • Very small attendance
  • These meetings turn out to be too much a status report of what people are doing

Backlog is too long for the manpower available to work on non site specific issues

Cal’s proposal

  • Every monday: a check by email of what was done by everyone and decide from this if the meeting is worth
  • Cancel the meeting if there is nothing to discuss

Connected to the release discussion…

  • What do we call a release? When do we do a new one?
  • Feature based: significant new features? bunch of bug fixes?
  • Time based? May be more suitable for a best-effort project. In this case use YYYYMM as the base number
  • A separate release of each subset rather than 1 big fat release: same numbering for all components

Building a release means be confident that the different pieces work together

  • Need an infrastructure to do automatic testing

Specific problem of components: require an upgrade of the templates to provide the appropriate schema, explicit version if used…

How to start with releases

  • Cal will create/initialize a Git repository dedicated to the release management, based on StratusLab experience
  • Target date: end of month
  • Use wathever we have today: goal of first release is to test the release process

Still in SVN

  • ncm-cdispd/ncm-listend
  • gLite configuration component
  • SCDB
  • QWG templates

GitHub vs. SF

GitHub service looks much better and more performant

  • Agreement to move to GitHub
  • If MS has a legal problem, we’ll look for a workaround

Automatic testing of client code

A test framework developed for Quattor configuration modules (Perl)

  • Requires PERL5LIB env variable to be defined and contain CCM, NCM, CAF, LC
  • Add one line to disable actual execution of commands:

    use Test::Quattor;
    

Tests

  • Must not access the network
  • Any function in the module called by the tested function must be mocked

    ue Test::MockModule;
    
  • Must be unit tests: test only one specific feature/function
  • To test the Configure method, use a real profile passed as argument to Test::Quattor
  • Avoid constants in functions as they make testing difficult without impacting the production. Prefer passing these values as arguments, allowing to use different values in tests:

    my ($self,$const) = @_;
    

Result validation can be done on return value, file contents and properties, executed commands

After execution of the tests, it is possible to generate coverage information

  • For statementd, functions, loops, conditionals

Unit tests are easy to schedule with Jenkins

  • Jenkins at UGent does it for configuration modules, CCM and AII

Acceptance tests require deployment of test machines

  • Could be coordinated by Jenkins (in fact done by StratusLab)
  • May a StratusLab cloud help with this? Probably yes, except for AII

Quattor Data Warehouse - J. Adams

Background

  • CDB2SQL broken and requires Oracle
  • A quick and dirty dashboard created, based on grep: inefficient

New project started based on JSON output from panc

  • Input data is a git repository updated when a deploy is done
  • Track profile changes and reindex them
  • Allow very fast arbitrary searches with ElasticSearch
  • No access to Git history yet
  • Naive prototype written in Python for proof of concept

Need to package it and do the initial documentation.

Short-term Workplan

Quattor release: target mid-december

  • Probably not a public release: more to test the process
  • Without QWG for this first one
  • Repository initialization this afternoon…
  • Initial contents: base daemons, panc, core configuration components, AII
  • Testing: mostly manual this time
  • Install some test nodes with a minimal configuration for main components (Luis)

RPM naming

  • Currently we have only snapshots with the date in the name
  • Maven release plugin should be enough to tag releases, based on pom file

QWG

  • Move repo to Git: 1 for core templates, 1 for OS, 1 for grid, 1 for StratusLab
  • Write a tool to produce a package (tar file?) from the different repositories
  • Remove the templates coming from somewhere else (core components) from the QWG repositories
  • Use Nexus to pull the appropriate version: the dependency plugin allows to untar all the things in one place