Posts in 'debian'

The upstream ontologist

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

The upstream ontologist is a project that extracts metadata about upstream projects in a consistent format. It does this with a combination of heuristics and reading ecosystem-specific metadata files, such as Python’s setup.py, rust’s Cargo.toml as well as e.g. scanning README files.

Supported Data Sources

It will extract information from a wide variety of sources, including:

Supported Fields

Fields that it currently provides include:

  • Homepage: homepage URL
  • Name: name of the upstream project
  • Contact: contact address of some sort of the upstream (e-mail, mailing list URL)
  • Repository: VCS URL
  • Repository-Browse: Web URL for viewing the VCS
  • Bug-Database: Bug database URL (for web viewing, generally)
  • Bug-Submit: URL to use to submit new bugs (either on the web or an e-mail address)
  • Screenshots: List of URLs with screenshots
  • Archive: Archive used - e.g. SourceForge
  • Security-Contact: e-mail or URL with instructions for reporting security issues
  • Documentation: Link to documentation on the web:
  • Wiki: Wiki URL
  • Summary: one-line description of the project
  • Description: longer description of the project
  • License: Single line license description (e.g. “GPL 2.0”) as declared in the metadata[1]
  • Copyright: List of copyright holders
  • Version: Current upstream version
  • Security-MD: URL to markdown file with security policy

All data fields have a “certainty” associated with them (“certain”, “confident”, “likely” or “possible”), which gets set depending on how the data was derived or where it was found. If multiple possible values were found for a specific field, then the value with the highest certainty is taken.

Interface

The ontologist provides a high-level Python API as well as two command-line tools that can write output in two different formats:

For example, running guess-upstream-metadata on dulwich:

 % guess-upstream-metadata
 <string>:2: (INFO/1) Duplicate implicit target name: "contributing".
 Name: dulwich
 Repository: https://www.dulwich.io/code/
 X-Security-MD: https://github.com/dulwich/dulwich/tree/HEAD/SECURITY.md
 X-Version: 0.20.21
 Bug-Database: https://github.com/dulwich/dulwich/issues
 X-Summary: Python Git Library
 X-Description: |
   This is the Dulwich project.
   It aims to provide an interface to git repos (both local and remote) that
   doesn't call out to git directly but instead uses pure Python.
 X-License: Apache License, version 2 or GNU General Public License, version 2 or later.
 Bug-Submit: https://github.com/dulwich/dulwich/issues/new

Lintian-Brush

lintian-brush can update DEP-12-style debian/upstream/metadata files that hold information about the upstream project that is packaged as well as the Homepage in the debian/control file based on information provided by the upstream ontologist. By default, it only imports data with the highest certainty - you can override this by specifying the —uncertain command-line flag.

[1]Obviously this won’t be able to describe the full licensing situation for many projects. Projects like scancode-toolkit are more appropriate for that.

comments.

Automatic Fixing of Debian Build Dependencies

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

In my last blogpost, I introduced the buildlog consultant - a tool that can identify many reasons why a Debian build failed.

For example, here’s a fragment of a build log where the Build-Depends lack python3-setuptools:

849
850
851
852
853
854
855
856
857
858
 dpkg-buildpackage: info: host architecture amd64
  fakeroot debian/rules clean
 dh clean --with python3,sphinxdoc --buildsystem=pybuild
    dh_auto_clean -O--buildsystem=pybuild
 I: pybuild base:232: python3.9 setup.py clean
 Traceback (most recent call last):
   File "/<<PKGBUILDDIR>>/setup.py", line 2, in <module>
     from setuptools import setup
 ModuleNotFoundError: No module named 'setuptools'
 E: pybuild pybuild:353: clean: plugin distutils failed with: exit code=1: python3.9 setup.py clean

The buildlog consultant can identify the line in bold as being key, and interprets it:

 % analyse-sbuild-log --json ~/build.log

 {
    "stage": "build",
    "section": "Build",
    "lineno": 857,
    "kind": "missing-python-module",
    "details": {"module": "setuptools", "python_version": 3, "minimum_version": null}
 }

Automatically acting on buildlog problems

A common reason why Debian builds fail is missing dependencies or incorrect versions of dependencies declared in the package build depends.

Based on the output of the buildlog consultant, it is possible in many cases to determine what dependency needs to be added to Build-Depends. In the example given above, we can use apt-file to look for the package that contains the path /usr/lib/python3/dist-packages/setuptools/__init__.py - and voila, we find python3-setuptools:

 % apt-file search /usr/lib/python3/dist-packages/setuptools/__init__.py
 python3-setuptools: /usr/lib/python3/dist-packages/setuptools/__init__.py

The deb-fix-build command automates these steps:

  1. It builds the package using sbuild; if the package successfully builds then it just exits successfully
  2. It tries to identify the problem by looking through the build log; if it can’t or if it’s a problem it has seen before (but apparently failed to resolve), then it exits with a non-zero exit code
  3. It tries to find a dependency that can address the problem
  4. It updates Build-Depends in debian/control or Depends in debian/tests/control
  5. Go to step 1

This takes away the tedious manual process of building a package, discovering that a dependency is missing, updating Build-Depends and trying again.

For example, when I ran deb-fix-build while packaging saneyaml, the output looks something like this:

 % deb-fix-build
 Using output directory /tmp/tmpyz0nkgqq
 Using sbuild chroot unstable-amd64-sbuild
 Using fixers: …
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('setuptools_scm', python_version=3, minimum_version='4')
 Using apt-file to search apt contents
 Adding build dependency: python3-setuptools-scm (>= 4)
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('toml', python_version=3, minimum_version=None)
 Adding build dependency: python3-toml
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Built 0.5.2-1- changes files at [‘saneyaml_0.5.2-1_amd64.changes’].

And in our Git repository, we see these changes as well:

% git log -p
 commit 5a1715f4c7273b042818fc75702f2284034c7277 (HEAD -> master)
 Author: Jelmer Vernooij <jelmer@jelmer.uk>
 Date:   Sun Apr 4 02:35:56 2021 +0100

     Add missing build dependency on python3-toml.

 diff --git a/debian/control b/debian/control
 index 5b854dc..3b27b73 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4), python3-toml
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional

 commit f03047da80fcd8468ee231fbc4cf8488d7a0acd1
 Author: Jelmer Vernooij <jelmer@jelmer.uk>
 Date:   Sun Apr 4 02:35:34 2021 +0100

     Add missing build dependency on python3-setuptools-scm (>= 4).

 diff --git a/debian/control b/debian/control
 index a476cc2..5b854dc 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional

Using deb-fix-build

You can run deb-fix-build by installing the ognibuild package from unstable. The only requirements for using it are that:

  • The package is maintained in Git
  • A sbuild schroot is available for use

Caveats

deb-fix-build is fairly easy to understand, and if it doesn’t work then you’re no worse off than you were without it - you’ll have to add your own Build-Depends.

That said, there are a couple of things to keep in mind:

  • At the moment, it doesn’t distinguish between general, Arch or Indep Build-Depends.
  • It can only add dependencies for things that are actually in the archive
  • Sometimes there are multiple packages that can provide a file, command or python package - it tries to find the right one with heuristics but doesn’t always get it right

comments.

The Buildlog Consultant

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

Reading build logs

Build logs for Debian packages can be quite long and difficult for a human to read. Anybody who has looked at these logs trying to figure out why a build failed will have spent time scrolling through them and skimming for certain phrases (lines starting with “error:” for example). In many cases, you can spot the problem in the last 10 or 20 lines of output – but it’s also quite common that the error is somewhere at the beginning of many pages of error output.

The buildlog consultant

The buildlog consultant project attempts to aid in this process by parsing sbuild and non-Debian (e.g. the output of “make”) build logs and trying to identify the key line that explains why a build failed. It can then either display this specific line, or a fragment of the log around surrounding the key line.

Classification

In addition to finding the key line explaining the failure, it can also classify and parse the error in many cases and return a result code and some metadata.

For example, in a failed build of gnss-sdr that has produced 2119 lines of output, the reason for the failure is that log4cpp is missing – which is on line 641:

634
635
636
637
638
639
640
641
642
643
644
645
646
647
 -- Required GNU Radio Component: ANALOG missing!
 -- Could NOT find GNURADIO (missing: GNURADIO_RUNTIME_FOUND)
 -- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
 -- Could NOT find LOG4CPP (missing: LOG4CPP_INCLUDE_DIRS
 LOG4CPP_LIBRARIES)

 CMake Error at CMakeLists.txt:593 (message):
   *** Log4cpp is required to build gnss-sdr

 -- Configuring incomplete, errors occurred!
 See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
 CMakeOutput.log".
 See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
 CMakeError.log".

In this case, the buildlog consultant can both figure out line was problematic and what the problem was:

 % analyse-sbuild-log build.log
 Failed stage: build
 Section: build
 Failed line: 641:
   *** Log4cpp is required to build gnss-sdr
 Error: Missing dependency: Log4cpp

Or, if you’d like to do something else with the output, use JSON output:

 % analyse-sbuild-log --json build.log
 {"stage": "build", "section": "Build", "lineno": 641, "kind": "missing-dependency", "details": {"name": "Log4cpp""}}

How it works

The consultant does some structured parsing (most notably it can parse the sections from a sbuild log), but otherwise is a large set of carefully crafted regular expressions and heuristics. It doesn’t always find the problem, but has proven to be fairly accurate. It is constantly improved as part of the Debian Janitor project, and that exposes it to a wide variety of different errors.

You can see the classification and error detection in action on the result codes page of the Janitor.

Using the buildlog consultant

You can get the buildlog consultant from either pip or Debian unstable (package: python3-buildlog-consultant ).

The buildlog consultant comes with two scripts – analyse-build-log and analyse-sbuild-log, for analysing build logs and sbuild logs respectively.

comments.

Debian Janitor: Hosters used by Debian packages

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

The Janitor knows how to talk to different hosting platforms. For each hosting platform, it needs to support the platform- specific API for creating and managing merge proposals. For each hoster it also needs to have credentials.

At the moment, it supports the GitHub API, Launchpad API and GitLab API. Both GitHub and Launchpad have only a single instance; the GitLab instances it supports are gitlab.com and salsa.debian.org.

This provides coverage for the vast majority of Debian packages that can be accessed using Git. More than 75% of all packages are available on salsa - although in some cases, the Vcs-Git header has not yet been updated.

Of the other 25%, the majority either does not declare where it is hosted using a Vcs-* header (10.5%), or have not yet migrated from alioth to another hosting platform (9.7%). A further 2.3% are hosted somewhere on GitHub (2%), Launchpad (0.18%) or GitLab.com (0.15%), in many cases in the same repository as the upstream code.

The remaining 1.6% are hosted on many other hosts, primarily people’s personal servers (which usually don’t have an API for creating pull requests).

Packages per hoster

Outdated Vcs-* headers

It is possible that the 20% of packages that do not have a Vcs-* header or have a Vcs header that say there on alioth are actually hosted elsewhere. However, it is hard to know where they are until a version with an updated Vcs-Git header is uploaded.

The Janitor primarily relies on vcswatch to find the correct locations of repositories. vcswatch looks at Vcs-* headers but has its own heuristics as well. For about 2,000 packages (6%) that still have Vcs-* headers that point to alioth, vcswatch successfully finds their new home on salsa.

Merge Proposals by Hoster

These proportions are also visible in the number of pull requests created by the Janitor on various hosters. The vast majority so far has been created on Salsa.

Hoster Open Merged & Applied Closed
github.com921685
gitlab.com1230
code.launchpad.net24511
salsa.debian.org1,3605,657126
Merge Proposal statistics

In this graph, “Open” means that the pull request has been created but likely nobody has looked at it yet. Merged means that the pull request has been marked as merged on the hoster, and applied means that the changes have ended up in the packaging branch but via a different route (e.g. cherry-picked or manually applied). Closed means that the pull request was closed without the changes being incorporated.

Note that this excludes ~5,600 direct pushes, all of which were to salsa-hosted repositories.

See also:

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: How to Contribute Lintian-Brush Fixers

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

lintian-brush can currently fix about 150 different issues that lintian can report, but that’s still a small fraction of the more than thousand different types of issue that lintian can detect.

If you’re interested in contributing a fixer script to lintian-brush, there is now a guide that describes all steps of the process:

  1. how to identify lintian tags that are good candidates for automated fixing
  2. creating test cases
  3. writing the actual fixer

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: Expanding Into Improving Multi-Arch

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

As of dpkg 1.16.2 and apt 0.8.13, Debian has full support for multi-arch. To quote from the multi-arch implementation page:

Multiarch lets you install library packages from multiple architectures on the same machine. This is useful in various ways, but the most common is installing both 64 and 32- bit software on the same machine and having dependencies correctly resolved automatically. In general you can have libraries of more than one architecture installed together and applications from one architecture or another installed as alternatives.

The Multi-Arch specification describes a new Multi-Arch header which can be used to indicate how to resolve cross-architecture dependencies.

The existing Debian Multi-Arch hinter is a version of dedup.debian.net that compares binary packages between architectures and suggests fixes to resolve multi-arch problems. It provides hints as to what Multi- Arch fields can be set, allowing the packages to be safely installed in a Multi-Arch world. The full list of almost 10,000 hints generated by the hinter is available at https://dedup.debian.net/static/multiarch-hints.yaml.

Recent versions of lintian-brush now include a command called apply-multiarch-hints that downloads and locally caches the hints and can apply them to a package maintained in Git. For example, to apply multi-arch hints to autosize.js:

 $ debcheckout autosize.js
 declared git repository at https://salsa.debian.org/js-team/autosize.js.git
 git clone https://salsa.debian.org/js-team/autosize.js.git autosize.js ...
 Cloning into 'autosize.js'...
 [...]
 $ cd autosize.js
 $ apply-multiarch-hints
 Downloading new version of multi-arch hints.
 libjs-autosize: Add Multi-Arch: foreign.
 node-autosize: Add Multi-Arch: foreign.
 $ git log -p
 commit 3f8d1db5af4a87e6ebb08f46ddf79f6adf4e95ae (HEAD -> master)
 Author: Jelmer Vernooij <jelmer@debian.org>
 Date:   Fri Sep 18 23:37:14 2020 +0000

     Apply multi-arch hints.
     + libjs-autosize, node-autosize: Add Multi-Arch: foreign.

     Changes-By: apply-multiarch-hints

 diff --git a/debian/changelog b/debian/changelog
 index e7fa120..09af4a7 100644
 --- a/debian/changelog
 +++ b/debian/changelog
 @@ -1,3 +1,10 @@
 +autosize.js (4.0.2~dfsg1-5) UNRELEASED; urgency=medium
 +
 +  * Apply multi-arch hints.
 +    + libjs-autosize, node-autosize: Add Multi-Arch: foreign.
 +
 + -- Jelmer Vernooij <jelmer@debian.org>  Fri, 18 Sep 2020 23:37:14 -0000
 +
  autosize.js (4.0.2~dfsg1-4) unstable; urgency=medium

    * Team upload
 diff --git a/debian/control b/debian/control
 index 01ca968..fbba1ae 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -20,6 +20,7 @@ Architecture: all
  Depends: ${misc:Depends}
  Recommends: javascript-common
  Breaks: ruby-rails-assets-autosize (<< 4.0)
 +Multi-Arch: foreign
  Description: script to automatically adjust textarea height to fit text - NodeJS
   Autosize is a small, stand-alone script to automatically adjust textarea
   height to fit text. The autosize function accepts a single textarea element,
 @@ -32,6 +33,7 @@ Package: node-autosize
  Architecture: all
  Depends: ${misc:Depends}
   , nodejs
 +Multi-Arch: foreign
  Description: script to automatically adjust textarea height to fit text - Javascript
   Autosize is a small, stand-alone script to automatically adjust textarea
   height to fit text. The autosize function accepts a single textarea element,

The Debian Janitor also has a new multiarch-fixes suite that runs apply-multiarch-hints across packages in the archive and proposes merge requests. For example, you can see the merge request against autosize.js here.

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: All Packages Processed with Lintian-Brush

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

On 12 July 2019, the Janitor started fixing lintian issues in packages in the Debian archive. Now, a year and a half later, it has processed every one of the almost 28,000 packages at least once.

Graph with Lintian Fixes Burndown

As discussed two weeks ago, this has resulted in roughly 65,000 total changes. These 65,000 changes were made to a total of almost 17,000 packages. Of the remaining packages, for about 4,500 lintian-brush could not make any improvements. The rest (about 6,500) failed to be processed for one of many reasons – they are e.g. not yet migrated off alioth, use uncommon formatting that can’t be preserved or failed to build for one reason or another.

Graph with runs by status (success, failed, nothing-to-do)

Now that the entire archive has been processed, packages are prioritized based on the likelihood of a change being made to them successfully.

Over the course of its existence, the Janitor has slowly gained support for a wider variety of packaging methods. For example, it can now edit the templates for some of the generated control files. Many of the packages that the janitor was unable to propose changes for the first time around are expected to be correctly handled when they are reprocessed.

If you’re a Debian developer, you can find the list of improvements made by the janitor in your packages by going to https://janitor.debian.net/m/.

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: The Slow Trickle from Git Repositories to the Debian Archive

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

Last week’s blog post documented how there are now over 30,000 lintian issues that have been fixed in git packaging repositories by the Janitor.

It’s important to note that any fixes from the Janitor that make it into a Git packaging repository will also need to be uploaded to the Debian archive. This currently requires that a Debian packager clones the repository and builds and uploads the package.

Until a change makes it into the archive, users of Debian will unfortunately not see the benefits of improvements made by the Janitor.

82% of the 30,000 changes from the Janitor that have made it into a Git repository have not yet been uploaded, although changes do slowly trickle in as maintainers make other changes to packages and upload them along with the lintian fixes from the Janitor. This is not just true for changes from the Janitor, but for all sorts of other smaller improvements as well.

However, the process of cloning and building git repositories and uploading the resulting packages to the Debian archive is fairly time-consuming – and it’s probably not worth the time of developers to follow up every change from the Janitor with a labour-intensive upload to the archive.

It would be great if it was easier to trigger uploads from git commits. Projects like tag2upload will hopefully help, and make it more likely that changes end up in the Debian archive.

The majority packages do get at least one new source version upload per release, so most changes will eventually make it into the archive.

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: > 60,000 Lintian Issues Automatically Fixed

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

Scheduling Lintian Fixes

To determine which packages to process, the Janitor looks at the import of lintian output across the archive that is available in UDD [1]. It will prioritize those packages with the most and more severe issues that it has fixers for.

Once a package is selected, it will clone the packaging repository and run lintian-brush on it. Lintian-brush provides a framework for applying a set of “fixers” to a package. It will run each of a set of “fixers” in a pristine version of the repository, and handles most of the heavy lifting.

The Inner Workings of a Fixer

Each fixer is just an executable which gets run in a clean checkout of the package, and can make changes there. Most of the fixers are written in Python or shell, but they can be in any language.

The contract for fixers is pretty simple:

  • If the fixer exits with non-zero, the changes are reverted and fixer is considered to have failed
  • If it exits with zero and made changes, then it should write a summary of its changes to standard out

If a fixer is uncertain about the changes it has made, it should report so on standard output using a pseudo-header. By default, lintian-brush will discard any changes with uncertainty but if you are running it locally you can still apply them by specifying --uncertain.

The summary message on standard out will be used for the commit message and (possibly) the changelog message, if the package doesn’t use gbp dch.

Example Fixer

Let’s look at an example. The package priority “extra” is deprecated since Debian Policy 4.0.1 (released August 2 017) – see Policy 2.5 “Priorities”. Instead, most packages should use the “optional” priority.

Lintian will warn when a package uses the deprecated “extra” value for the “Priority” - the associated tag is priority-extra-is-replaced-by-priority-optional. Lintian-brush has a fixer script that can automatically replace “extra” with “optional”.

On systems that have lintian-brush installed, the source for the fixer lives in /usr/share/lintian-brush/fixers/priority-extra-is-replaced-by-priority-optional.py, but here is a copy of it for reference:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
#!/usr/bin/python3

from debmutate.control import ControlEditor
from lintian_brush.fixer import report_result, fixed_lintian_tag

with ControlEditor() as updater:
    for para in updater.paragraphs:
        if para.get("Priority") == "extra":
            para["Priority"] = "optional"
            fixed_lintian_tag(
                para, 'priority-extra-is-replaced-by-priority-optional')

report_result("Change priority extra to priority optional.")

This fixer is written in Python and uses the debmutate library to easily modify control files while preserving formatting — or back out if it is not possible to preserve formatting.

All the current fixers come with tests, e.g. for this particular fixer the tests can be found here: https://salsa.debian.org/jelmer/lintian-brush/-/tree/master/tests/priority-extra-is-replaced-by-priority-optional.

For more details on writing new fixers, see the README for lintian-brush.

For more details on debugging them, see the manual page.

Successes by fixer

Here is a list of the fixers currently available, with the number of successful merges/pushes per fixer:

Lintian Tag Previously merged/pushed Ready but not yet merged/pushed
uses-debhelper-compat-file 4906 4161
upstream-metadata-file-is-missing 4281 3841
package-uses-old-debhelper-compat-version 4256 3617
upstream-metadata-missing-bug-tracking 2438 2995
out-of-date-standards-version 2062 2936
upstream-metadata-missing-repository 1936 2987
trailing-whitespace 1720 2295
insecure-copyright-format-uri 1791 1093
package-uses-deprecated-debhelper-compat-version 1391 1287
vcs-obsolete-in-debian-infrastructure 872 782
homepage-field-uses-insecure-uri 527 1111
vcs-field-not-canonical 850 655
debian-changelog-has-wrong-day-of-week 224 376
debian-watch-uses-insecure-uri 314 242
useless-autoreconf-build-depends 112 428
priority-extra-is-replaced-by-priority-optional 315 194
debian-rules-contains-unnecessary-get-orig-source-target 35 428
tab-in-license-text 125 320
debian-changelog-line-too-long 186 190
debian-rules-sets-dpkg-architecture-variable 69 166
debian-rules-uses-unnecessary-dh-argument 42 182
package-lacks-versioned-build-depends-on-debhelper 125 95
unversioned-copyright-format-uri 43 136
package-needs-versioned-debhelper-build-depends 127 50
binary-control-field-duplicates-source 34 134
renamed-tag 73 69
vcs-field-uses-insecure-uri 14 109
uses-deprecated-adttmp 13 91
debug-symbol-migration-possibly-complete 12 88
copyright-refers-to-symlink-license 51 48
debian-control-has-unusual-field-spacing 33 66
old-source-override-location 32 62
out-of-date-copyright-format 20 62
public-upstream-key-not-minimal 43 30
older-source-format 17 54
custom-compression-in-debian-source-options 12 57
copyright-refers-to-versionless-license-file 29 39
tab-in-licence-text 33 31
global-files-wildcard-not-first-paragraph-in-dep5-copyright 28 33
out-of-date-copyright-format-uri 9 50
field-name-typo-dep5-copyright 29 29
copyright-does-not-refer-to-common-license-file 13 42
debhelper-but-no-misc-depends 9 45
debian-watch-file-is-missing 11 41
debian-control-has-obsolete-dbg-package 8 40
possible-missing-colon-in-closes 31 13
unnecessary-testsuite-autopkgtest-field 32 9
missing-debian-source-format 7 33
debhelper-tools-from-autotools-dev-are-deprecated 9 29
vcs-field-mismatch 8 29
debian-changelog-file-contains-obsolete-user-emacs-setting 33 0
patch-file-present-but-not-mentioned-in-series 24 9
copyright-refers-to-versionless-license-file 22 9
debian-control-has-empty-field 25 6
missing-build-dependency-for-dh-addon 10 20
obsolete-field-in-dep5-copyright 15 13
xs-testsuite-field-in-debian-control 20 7
ancient-python-version-field 13 12
unnecessary-team-upload 19 5
misspelled-closes-bug 6 16
field-name-typo-in-dep5-copyright 1 20
transitional-package-not-oldlibs-optional 4 17
maintainer-script-without-set-e 9 11
dh-clean-k-is-deprecated 4 14
no-dh-sequencer 14 4
missing-vcs-browser-field 5 12
space-in-std-shortname-in-dep5-copyright 6 10
xc-package-type-in-debian-control 4 11
debian-rules-missing-recommended-target 4 10
desktop-entry-contains-encoding-key 1 13
build-depends-on-obsolete-package 4 9
license-file-listed-in-debian-copyright 1 12
missing-built-using-field-for-golang-package 9 4
unused-license-paragraph-in-dep5-copyright 4 7
missing-build-dependency-for-dh_command 6 4
comma-separated-files-in-dep5-copyright 3 6
systemd-service-file-refers-to-var-run 4 5
copyright-not-using-common-license-for-apache2 3 5
debian-tests-control-autodep8-is-obsolete 2 6
dh-quilt-addon-but-quilt-source-format 2 6
no-homepage-field 3 5
font-packge-not-multi-arch-foreign 1 6
homepage-in-binary-package 1 4
vcs-field-bitrotted 1 3
built-using-field-on-arch-all-package 2 1
copyright-should-refer-to-common-license-file-for-apache-2 1 2
debian-pyversions-is-obsolete 3 0
debian-watch-file-uses-deprecated-githubredir 1 1
executable-desktop-file 1 1
skip-systemd-native-flag-missing-pre-depends 1 1
vcs-field-uses-not-recommended-uri-format 1 1
init.d-script-needs-depends-on-lsb-base 1 0
maintainer-also-in-uploaders 1 0
public-upstream-keys-in-multiple-locations 1 0
wrong-debian-qa-group-name 1 0
Total 29656 32209

Footnotes

[1]temporarily unavailable due to Debian bug #960156 – but the Janitor is relying on historical data

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Debian Janitor: 8,200 landed changes landed so far

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

The bot has been submitting merge requests for about seven months now. The rollout has happened gradually across the Debian archive, and the bot is now enabled for all packages maintained on Salsa, GitLab, GitHub and Launchpad.

There are currently over 1,000 open merge requests, and close to 3,400 merge requests have been merged so far. Direct pushes are enabled for a number of large Debian teams, with about 5,000 direct pushes to date. That covers about 11,000 lintian tags of varying severities (about 75 different varieties) fixed across Debian.

Janitor pushes over time Janitor merges over time

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

Improvements to Merge Proposals by the Janitor

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

Since the original post, merge proposals created by the janitor now include the debdiff between a build with and without the changes (showing the impact to the binary packages), in addition to the merge proposal diff (which shows the impact to the source package).

New merge proposals also include a link to the diffoscope diff between a vanilla build and the build with changes. Unfortunately these can be a bit noisy for packages that are not reproducible yet, due to the difference in build environment between the two builds.

This is part of the effort to keep the changes from the janitor high-quality.

The rollout surfaced some bugs in lintian-brush; these have been either fixed or mitigated (e.g. by disabling specified fixers).

For more information about the Janitor’s lintian-fixes efforts, see the landing page.

comments.

The Debian Janitor

There are a lot of small changes that can be made to the Debian archive to increase the overall quality. Many of these changes are small and have just minor benefits if they are applied to just a single package. Lintian encourages maintainers to fix these problems by pointing out the common ones.

Most of these issues are often trivially fixable; they are in general an inefficient use of human time, and it takes a lot of effort to keep up with. This is something that can clearly be automated.

Several tools (e.g. onovy’s mass tool, and the lintian-brush tool that I’ve been working on) go a step further and (for a subset of the issues reported by lintian) fix the problems for you, where they can. Lintian-brush can currently fix most instances of close to 100 lintian tags.

Thanks to the Vcs-* fields set by many packages and the APIs provided by hosting platforms like Salsa, it is now possible to proactively attempt to fix these issues.

The Debian Janitor is a tool that will run lintian-brush across the entire archive, and propose fixes to lintian issues via pull request.

Objectives

The aim of Debian Janitor is to take some drudge work away from Debian maintainers where possible, so they can spend their time on more important packaging work. Its purpose is to make automated changes quick and easy to apply, with minimal overhead for package maintainers. It is essentially a bit of infrastructure to run lintian-brush across all of the archive.

The actions of the bot are restricted to a limited set of problems for which obviously correct actions can be taken. It is not meant to automate all packaging, or even to cover automating all instances of the issues it knows about.

The bot is designed to be conservative and delight with consistently correct fixes instead of proposing possibly incorrect fixes and hoping for the best. Considerable effort has been made to avoid the janitor creating pull requests with incorrect changes, as these take valuable time away from maintainers, the package doesn’t actually improve (since the merge request is rejected) and it makes it likelier that future pull requests from the Debian Janitor bot are ignored or rejected.

In short: The janitor is meant to propose correct changes if it can, and back off otherwise.

Design

The Janitor finds package sources in version control systems from the Vcs*- control field in Debian source packages. If the packaging branch is hosted on a hosting platform that the Janitor has a presence on, it will attempt to run lintian-brush on the packaging branch and (if there are any changes made) build the package and propose a merge. It is based on silver-platter and currently has support for:

The Janitor is driven from the lintian and vcswatch tables in UDD. It queries for packages that are affected by any of the lintian tags that lintian-brush has a fixer script for. This way it can limit the number of repositories it has to process.

Ensuring quality

There are a couple of things I am doing to make sure that the Debian Janitor delights rather than annoys.

High quality changes

Lintian-brush has end-to-end tests for its fixers.

In order to make sure that merge requests are useful and high-value, the bot will only propose changes from lintian-brush that:

  • successfully build in a chroot and pass autopkgtest and piuparts;
  • are not completely trivial - e.g. only stripping whitespace

Changes for a package will also be reviewed by a human before they make it into a pull request.

One open pull request per package

If the bot created a pull request previously, it will attempt to update the current request by adding new commits (and updating the pull request description). It will remove and fix the branch when the pull request conflicts because of new upstream changes.

In other words, it will only create a single pull request per package and will attempt to keep that pull request up to date.

Gradual rollout

I’m slowly adding interested maintainers to receiving pull requests, before opening it up to the entire archive. This should help catch any widespread issues early.

Providing control

The bot will be upfront about its pull requests and try to avoid overwhelming maintainers with pull requests by:

  • Clearly identifying any merge requests it creates as being made by a bot. This should allow maintainers to prioritize contributions from humans.
  • Limiting the number of open proposals per maintainer. It starts by opening a single merge request and won’t open additional merge requests until the first proposal has a response
  • Providing a way to opt out of future merge requests; just a reply on the merge request is sufficient.

Any comments on merge requests will also still be reviewed by a human.

Current state

Debian janitor is running, generating changes and already creating merge requests (albeit under close review). Some examples of merge requests it has created:

Using the janitor

The janitor can process any package that’s maintained in Git and has its Vcs-Git header set correctly (you can use vcswatch to check this).

If you’re interested in receiving pull requests early, leave a comment below. Eventually, the janitor should get to all packages, though it may take a while with the current number of source packages in the archive.

By default, salsa does not send notifications when a new merge request for one of the repositories you’re a maintainer for is created. Make sure you have notifications enabled in your Salsa profile, by ticking “New Merge Requests” for the packages you care about.

You can also see the number of open merge requests for a package repository on QA - it’s the ! followed by a number in the pull request column.

It is also possible to download the diff for a particular package (if it’s been generated) ahead of the janitor publishing it:

 $ curl https://janitor.debian.net/api/lintian-fixes/pkg/PACKAGE/diff

E.g. for i3-wm, look at https://janitor.debian.net/api/lintian-fixes/pkg/i3-wm/diff.

Future Plans

The current set of supported hosting platforms covers the bulk of packages in Debian that is maintained in a VCS. The only other 100+ package platform that’s unsupported is dgit. If you have suggestions on how best to submit git changes to dgit repositories (BTS bugs with patches? or would that be too much overhead?), let me know.

The next platform that is currently missing is bitbucket, but there are only about 15 packages in unstable hosted there.

At the moment, lintian-brush can fix close to 100 lintian tags. It would be great to add fixers for more common issues.

The janitor should probably be more tightly integrated with other pieces of Debian infrastructure, e.g. Jenkins for running jobs or linked to from the tracker or lintian.debian.org.

More information

See the FAQ on the homepage.

If you have any concerns about these roll-out plans, have other ideas or questions, please let me know in the comments.

comments.

Lintian Brush

With Debian packages now widely being maintained in Git repositories, there has been an uptick in the number of bulk changes made to Debian packages. Several maintainers are running commands over many packages (e.g. all packages owned by a specific team) to fix common issues in packages.

Examples of changes being made include:

  • Updating the Vcs-Git and Vcs-Browser URLs after migrating from alioth to salsa
  • Stripping trailing whitespace in various control files
  • Updating e.g. homepage URLs to use https rather than http

Most of these can be fixed with simple sed or perl one-liners.

Some of these scripts are publically available, for example:

Lintian-Brush

Lintian-Brush is both a simple wrapper around a set of these kinds of scripts and a repository for these scripts, with the goal of making it easy for any Debian maintainer to run them.

The lintian-brush command-line tool is a simple wrapper that runs a set of “fixer scripts”, and for each:

  • Reverts the changes made by the script if it failed with an error
  • Commits the changes to the VCS with an appropriate commit message
  • Adds a changelog entry (if desired)

The tool also provides some basic infrastructure for testing that these scripts do what they should, and e.g. don’t have unintended side-effects.

The idea is that it should be safe, quick and unobtrusive to run lintian-brush, and get it to opportunistically fix lintian issues and to leave the source tree alone when it can’t.

Example

For example, running lintian-brush on the package talloc fixes two minor lintian issues:

 % debcheckout talloc
 declared git repository at https://salsa.debian.org/samba-team/talloc.git
 git clone https://salsa.debian.org/samba-team/talloc.git talloc ...
 Cloning into 'talloc'...
 remote: Enumerating objects: 2702, done.
 remote: Counting objects: 100% (2702/2702), done.
 remote: Compressing objects: 100% (996/996), done.
 remote: Total 2702 (delta 1627), reused 2601 (delta 1550)
 Receiving objects: 100% (2702/2702), 1.70 MiB | 565.00 KiB/s, done.
 Resolving deltas: 100% (1627/1627), done.
 % cd talloc
 talloc% lintian-brush
 Lintian tags fixed: {'insecure-copyright-format-uri', 'public-upstream-key-not-minimal'}
 % git log
 commit 0ea35f4bb76f6bca3132a9506189ef7531e5c680 (HEAD -> master)
 Author: Jelmer Vernooij <jelmer@debian.org>
 Date:   Tue Dec 4 16:42:35 2018 +0000

     Re-export upstream signing key without extra signatures.

     Fixes lintian: public-upstream-key-not-minimal
     See https://lintian.debian.org/tags/public-upstream-key-not-minimal.html for more details.

  debian/changelog                |   1 +
  debian/upstream/signing-key.asc | 102 +++++++++++++++---------------------------------------------------------------------------------------
  2 files changed, 16 insertions(+), 87 deletions(-)

 commit feebce3147df561aa51a385c53d8759b4520c67f
 Author: Jelmer Vernooij <jelmer@debian.org>
 Date:   Tue Dec 4 16:42:28 2018 +0000

     Use secure copyright file specification URI.

     Fixes lintian: insecure-copyright-format-uri
     See https://lintian.debian.org/tags/insecure-copyright-format-uri.html for more details.

  debian/changelog | 3 +++
  debian/copyright | 2 +-
  2 files changed, 4 insertions(+), 1 deletion(-)

Script Interface

A fixer script is run in the root directory of a package, where it can make changes it deems necessary, and write a summary of what it’s done for the changelog (and commit message) to standard out.

If a fixer can not provide any improvements, it can simply leave the working tree untouched - lintian-brush will not create any commits for it or update the changelog. If it exits with a non-zero exit code, then it is assumed that it failed to run and it will be listed as such and its changes reset rather than committed.

In addition, tests can be added for fixers by providing various before and after source package trees, to verify that a fixer script makes the expected changes.

For more details, see the documentation on writing new fixers.

Availability

lintian-brush is currently available in unstable and testing. See man lintian-brush(1) for an explanation of the command-line options.

Fixer scripts are included that can fix (some of the instances of) 34 lintian tags.

Feedback would be great if you try lintian-brush - please file bugs in the BTS, or propose pull requests with new fixers on salsa.

comments.

Migrating packaging from Bazaar to Git

A while ago I migrated most of my packages from Bazaar to Git. The rest of the world has decided to use Git for version control, and I don’t have enough reason to stubbornly stick with Bazaar and make it harder for myself to collaborate with others.

So I’m moving away from a workflow I know and have polished over the last few years - including the various bzr plugins and other tools involved. Trying to do the same thing using git is frustrating and time-consuming, but I’m sure that will improve with time. In particular, I haven’t found a good way to merge in a new upstream release (from a tarball) while referencing the relevant upstream commits, like bzr merge-upstream can. Is there a good way to do this? What helper tools can you recommend for maintaining a Debian package in git?

Having been upstream for bzr-git earlier, I used its git-remote-bzr implementation to do the conversions of the commits and tags:

% git clone bzr::/path/to/bzr/foo.bzr /path/to/git/foo.git

One of my last contributions to bzr-git was a bzr git-push-pristine-tar-deltas subcommand, which will export all bzr-builddeb-style pristine-tar metadata to a pristine-tar branch in a Git repository that can be used by pristine-tar directly or through something like git-buildpackage.

Once you have created a git clone of your bzr branch, it should be a matter of running bzr git-push-pristine-tar-deltas with the target git repository and the Debian package name:

% cd /path/to/bzr/foo.bzr
% bzr git-push-pristine-tar-deltas /path/to/git/foo.git foo
% cd /path/to/git/foo.git foo
% git branch
*  master
   pristine-tar

comments.

Samba 4 and OpenChange daily Ubuntu packages

Daily builds

As of a month ago there are Ubuntu archives with fresh packages of Samba 4 and OpenChange, built on a daily basis day from the latest upstream revision.

This means that it is now possible to run a version of Samba 4 that is less than 24 hours old, without having to know how to extract source code from the version control system that upstream is using, without having to know how to build and install an application from source, but perhaps most importantly: without having to go through the tedious process of manually updating the source code and rebuilding.

OpenChange is tightly coupled to Samba 4, so installing a new version of OpenChange usually involves installing a new version of Samba 4 as well. To make matters more confusing, the two projects use different version control systems (Samba 4 is in Git, while OpenChange is in Subversion) and different build systems (Samba 4 uses waf, OpenChange uses autoconf and make).

I have been involved in Samba 4 and OpenChange as an upstream developer and more recently also as a packager for both Debian and Ubuntu.

As an upstream developer for both these projects it is important for me that users can easily run the development versions. It makes it possible for interested users to confirm the fixes for issues they have reported and to test new features. The more users run the development version, the more confident I can be as a developer that doing a release will not cause any unexpected surprises.

As a packager it is useful to know when there are upstream changes that are going to break my package with the next release.

Recipes

The daily builds work using so-called recipes which describe how to build a Debian source package from a set of Bazaar branches. For example, the Samba 4 recipe looks like this:

1
2
3
4
# bzr-builder format 0.2 deb-version 4.0.0~alpha14~bzr{revno}~ppa{revno:packaging}+{revno:debian}
lp:samba
merge debian lp:~samba-team/samba/unstable
merge packaging lp:~samba-team/samba/4.0-ppa-maverick

This dictates that a source package should be built by taking the upstream Samba branch and merging the Debian packaging and some recipe-specific tweaking. The last bit on the first line indicates the version string to be used when generating a changelog entry for the daily build.

Every night Launchpad (through bzr-builder) merges these branches and attempts to build the resulting source package, e-mailing me in case of build problems. Generally I fix issues that come up by committing directly to upstream VCS or to the Debian packaging branch. There is no overhead in maintaining the daily build after I’ve set it up.

For more information on creating source package recipes, see getting started.

Toolchain

The entire toolchain that does the daily package builds for Ubuntu is Free Software, and I have contributed to various bits of that toolchain over the years. It’s exciting to see everything come together.

Soyuz

Launchpad consists of multiple pillars - one of those pillars is Soyuz, which I hack on as part of my day job at Canonical. Soyuz is responsible for the archive management and package building. Debian source packages (a combination of upstream source code and packaging metadata) get uploaded by users and then built for various architectures on our buildfarm and published to the Ubuntu archive or to users personal package archives.

Launchpad-code

Another pillar of Launchpad is Launchpad-code, which is responsible for the hosting and management of version control branches. Launchpad users can either host their branches on Launchpad directly or mirror branches (either native Bazaar branches or branches in a foreign format such as Subversion, Git or Mercurial). The mirrorring of native and foreign branches happens using standard Bazaar API’s. In the case of Samba and OpenChange we import the branches of the upstream projects (Samba is in Git, OpenChange is in Subversion) and the packaging for both projects is in Bazaar.

Launchad-code calls out to Bazaar to do the actual mirrorring. Over the last few years I have done a lot of work to improve Bazaars support for foreign branches, in particular on supporting Subversion, Git and Mercurial. As the code mirrorring in Launchpad is one of the biggest users of bzr-svn and bzr-git it has helped find some of the more obscure bugs in those plugins over the last few years, to the point where there are only a handful of issues with Git imports and Subversion imports left.

bzr-git and dulwich

bzr-git provides transparent access to Git repositories from within Bazaar and is built on top of Dulwich. Dulwich is a Python library that provides access to the Git file formats and protocols that is completely independent of Bazaar. James Westby originally started it and I adopted it for bzr-git and further extended it. There are now several other projects that use it as well, including hg-git, and rabbitvcs. Apart from James and myself, almost two dozen other people have contributed it so far.

bzr-svn and subvertpy

bzr-svn provides transparant access to Subversion repositories in Bazaar. When I grew frustrated with the existing Subversion Python bindings for various reasons, I decided to create independent Python bindings for Subversion from scratch. These bindings have since been split out into a separate project - subvertpy - and other projects have since also started using them, e.g. hgsubversion and basie.

Using the daily builds

To use the Samba 4 and OpenChange daily builds (Ubuntu Maverick only for now), run:

1
2
$ apt-add-repository ppa:samba-team/ppa
$ apt-add-repository ppa:openchange/daily-builds

Currently Playing: Karnivool - Themata

comments.