Merge ~lvoytek/ubuntu/+source/bind9:revert-inline-signing-enforcement-jammy into ubuntu/+source/bind9:ubuntu/jammy-devel

Proposed by Lena Voytek
Status: Work in progress
Proposed branch: ~lvoytek/ubuntu/+source/bind9:revert-inline-signing-enforcement-jammy
Merge into: ubuntu/+source/bind9:ubuntu/jammy-devel
Diff against target: 354 lines (+290/-3)
6 files modified
debian/README.Debian (+14/-1)
debian/changelog (+12/-0)
debian/control (+5/-1)
debian/patches/lp2015793-revert-inline-signing-enforcement.patch (+253/-0)
debian/patches/series (+1/-0)
debian/rules (+5/-1)
Reviewer Review Type Date Requested Status
Athos Ribeiro Pending
Canonical Server Reporter Pending
Review via email: mp+440776@code.staging.launchpad.net

Description of the change

Reverted the inline-signing line requirement added in the MRE.

PPA: https://launchpad.net/~lvoytek/+archive/ubuntu/bind9-revert-inline-signing

Tests to confirm

Install from scratch:
# lxc launch images:ubuntu/jammy test-bind9
# lxc exec test-bind9 bash
# apt update && apt dist-upgrade -y

# apt install bind9 software-properties-common -y
# add-apt-repository ppa:lvoytek/bind9-revert-inline-signing
# apt update && apt upgrade -y

# cat <<EOF >/etc/bind/named.conf.options
options { directory "/var/cache/bind"; listen-on port 53 { 127.0.0.1; }; allow-query { any; }; recursion yes; };
EOF

# cat <<EOF >/etc/bind/named.conf.local
zone "localdomain.test" { type master; file "/etc/bind/zones/forward.localdomain.test"; dnssec-policy default;};
EOF

# mkdir -p /etc/bind/zones/

# cat <<EOF >/etc/bind/zones/forward.localdomain.test
\$TTL 604800
@ IN SOA localdomain.test. root.localdomain.test. (
         2 ; Serial
    604800 ; Refresh
     86400 ; Retry
   2419200 ; Expire
    604800 ) ; Negative Cache TTL
;
@ IN NS ns.localdomain.test.
ns IN A 127.0.0.1
EOF

# systemctl restart bind9
> no output

Update from broken:
# lxc launch images:ubuntu/jammy test-bind9
# lxc exec test-bind9 bash
# apt update && apt dist-upgrade -y

# apt install bind9 -y

# cat <<EOF >/etc/bind/named.conf.options
options { directory "/var/cache/bind"; listen-on port 53 { 127.0.0.1; }; allow-query { any; }; recursion yes; };
EOF

# cat <<EOF >/etc/bind/named.conf.local
zone "localdomain.test" { type master; file "/etc/bind/zones/forward.localdomain.test"; dnssec-policy default;};
EOF

# mkdir -p /etc/bind/zones/

# cat <<EOF >/etc/bind/zones/forward.localdomain.test
\$TTL 604800
@ IN SOA localdomain.test. root.localdomain.test. (
         2 ; Serial
    604800 ; Refresh
     86400 ; Retry
   2419200 ; Expire
    604800 ) ; Negative Cache TTL
;
@ IN NS ns.localdomain.test.
ns IN A 127.0.0.1
EOF

# systemctl restart bind9

Job for named.service failed because the control process exited with error code.
See "systemctl status named.service" and "journalctl -xeu named.service" for details.

# apt install software-properties-common -y
# add-apt-repository ppa:lvoytek/bind9-revert-inline-signing
# apt update && apt upgrade -y

# systemctl restart bind9
> no output

To post a comment you must log in.
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Did you confirm that named-checkzone (as is, from upstream), actually checked for this misconfiguration? We saw yesterday that it did in your lxd, so I must have done something wrong on my test, or in focal (which is what I used), it really didn't do that check.

Revision history for this message
Lena Voytek (lvoytek) wrote :

I haven't been able to catch it with named-checkzone but named-checkconf does:

# named-checkconf
/etc/bind/named.conf.local:1: 'inline-signing yes;' must also be configured explicitly for zones using dnssec-policy without a configured 'allow-update' or 'update-policy'. See https://kb.isc.org/docs/dnssec-policy-requires-dynamic-dns-or-inline-signing

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Ok, here are my thoughts for now, and I welcome other reviewers to chime in:

- this revert (and by that, I also mean the original upstream commit), is a bit complex, as I thought

- I wonder if other commits after this one depend on the behavior that was introduced by the upstream commit. I spotted this one which touches the inline-signing setting, but don't know yet if it depends on the behavior we are reverting:

commit b12572b4af12ee4c5531603bdeb80840b9b48134
Author: Matthijs Mekking <email address hidden>
Date: Tue Oct 11 11:07:43 2022 +0200

    If a zone is not reusable, trigger full sign

    If after a reconfig a zone is not reusable because inline-signing
    was turned on/off, trigger a full resign. This is necessary because
    otherwise the zone maintenance may decide to only apply the changes
    in the journal, leaving the zone in an inconsistent DNSSEC state.

    (cherry picked from commit 4d143f2cc46663e6a7935b3d650c361ed630e03a)

- can you check if we could run (manually, somewhere) the upstream test suite? We would have to revert a commit in there too, because of this MP here, but it could give us more confidence we are not breaking anything else. I have never ran the upstream test suite before.

- do we need to adjust the documentation after this revert?

- the declarations of named_zone_reusable() and named_zone_inlinesigning() are changing. I checked the code (with grep) and they are not used elsewhere, so we should be fine, but it's a bit scary. It's quite the complicated handling of config options that we have there, and pointers all over the place :/

- good that named-checkconf catches the problematic configuration, but I'm afraid we can't use that in preinst as I originally thought, because we need the *new* named-checkconf, and that is not yet unpacked at preinst time :/. If we want to add a preinst check, it will have to rely on manual parsing of the config file(s), and, given what we can see in this patch, that could become very complicated.

What I would like to do still is to reproduce the original bug (https://gitlab.isc.org/isc-projects/bind9/-/issues/3381) in order to understand it well, and its consequences, and also to understand what it means for a zone to be marked as reusable or not, which I see mentioned in other commits. Because the code after the fix can definitely be assuming that the change is there, but we are now reverting it.

Revision history for this message
Andreas Hasenack (ahasenack) :
Revision history for this message
Andreas Hasenack (ahasenack) wrote (last edit ):

Some experiments, and learning.

First, some concepts:

journal files: are files recording changes to the "parent" file. For example, you have db.example.fake and db.example.fake.jnl. The actual *current* content of db.example.fake is the merge of the two. Bind won't write to db.example.fake, but will create and update db.example.fake.jnl. It's kind of like an overlay filesystem: you have the read-only "base", and the read-write portion on top of it.

inline-signing yes: plain text zone and signed zone are actual separate files on disk. So two files. (db.example.fake and db.example.fake.signed). The signed one is maintained by bind.

inline-signing no: all in one file. Plain text zone gets the signature records (via a journal file, so the original plain text file of the zone is not modified).

Now the zone definition samples:
a) This one gets an implied "inline-signing yes" config set:
zone "example.fake" {
  type primary;
  file "/var/lib/bind/db.example.fake";
  dnssec-policy default;
};

Here we will have db.example.fake supplied by the administrator, and db.example.fake.signed managed by bind (due to dnssec-policy "default").
If you bump the serial of db.example.fake two 2, for example, bind will resign the zone and produce db.example.fake.signed with a serial of 3, and serve that.
Along those two, we will also have the normal journal files (.jnl).

b) This one is invalid:
zone "example.fake" {
  type primary;
  file "/var/lib/bind/db.example.fake";
  dnssec-policy default;
  inline-signing no; // invalid: must be absent (which means "yes"), or "yes" explicitly
};

c) This one gets no implicit inline-signing, so it resumes its default value which is "no":
zone "example.fake" {
  type primary;
  file "/var/lib/bind/db.example.fake";
  dnssec-policy default;
  allow-update { 1.2.3.4/32; };
};

Here we will have just db.example.fake, and db.example.fake.jnl. The journal one will have the DNSSEC records, bump of serial number, etc, and is what will be served.

Crucially, if you go from (a) to (c), that's when the bug happens, because it flips inline-signing from the implicit "yes" to the default "no".

Revision history for this message
Athos Ribeiro (athos-ribeiro) wrote :

Hi Lena, I am reviewing this one. Thanks for working on it :)

While I gather more information and properly analyze the changes, I have 2 initial comments in line with Andreas suggestions:

- There are documentation changes that could be patched as well. Moreover, it would be nice to add an entry in the d/README.Debian describing the discrepant behavior between the upstream version and the jammy/kinetic version of the package and why the difference is there.

- It would be of great benefit to be able to run the upstream test suite indeed. From the package changelog, I see it has been disabled a while ago because we need some networking privileges to be able to setup the test environment.

I believe we should be able to at least run the unit tests of the package during the package build, and it should (hopefully) be possible to run the system tests during an autopkgtest run. The patch below is a kick-off towards that effort.

I also verified that fedora has some mechanisms to run the tests in controlled environments: https://src.fedoraproject.org/rpms/bind/blob/rawhide/f/bind.spec. While the do run the unit tests, they are skipping the system ones (maybe they do run in some CI somewhere, but I could not find STI tests for fedora's bind): https://kojipkgs.fedoraproject.org//packages/bind/9.18.13/1.fc39/data/logs/x86_64/build.log

diff --git a/debian/control b/debian/control
index 671206a01..65a1d2dbd 100644
--- a/debian/control
+++ b/debian/control
@@ -29,7 +29,11 @@ Build-Depends: bison,
                libxml2-dev,
                pkg-config,
                python3,
- zlib1g-dev
+ zlib1g-dev,
+ kyua,
+ iproute2,
+ python3-dnspython,
+ libnet-dns-perl
 Build-Depends-Indep: fonts-freefont-otf,
                      latexmk,
                      python3-sphinx,
diff --git a/debian/rules b/debian/rules
index 8b9f41227..414de0701 100755
--- a/debian/rules
+++ b/debian/rules
@@ -78,6 +78,7 @@ override_dh_auto_configure:
   --with-gssapi=yes \
   --with-libidn2 \
   --with-json-c \
+ --with-cmocka \
   --with-lmdb=/usr \
   --with-gnu-ld \
   --with-maxminddb \
@@ -100,7 +101,10 @@ override_dh_auto_install:
  dh_auto_install --destdir=$(CURDIR)/debian/tmp -- install install-html

 override_dh_auto_test:
- :
+ make unit
+ bin/tests/system/ifconfig.sh up
+ dh_auto_test $@
+ bin/tests/system/ifconfig.sh down

 override_dh_installinit:
  dh_installinit -pbind9 --name=named

Revision history for this message
Lena Voytek (lvoytek) wrote :

Thanks for the review Athos!

I added a changelog entry and the changes to run the tests at buildtime. Currently, two of the tests are failing- netmgr_test and doh_test. I'm investigating why that might be the case. Here is the buildlog in the PPA with the changes in place:

https://launchpadlibrarian.net/662280251/buildlog_ubuntu-jammy-amd64.bind9_1%3A9.18.12-0ubuntu0.22.04.2~ppa3_BUILDING.txt.gz

Revision history for this message
Lena Voytek (lvoytek) wrote :

Still looking into enabling upstream unit tests. I managed to disable netmgr_test and doh_test because they both create virtual interfaces on lo which breaks them. However, now the test suite is running an additional 106 tests that also have this problem. Do you think I should keep working on enabling the tests or should we just focus on reverting inline-signing for now? Here's the current build log:

https://launchpadlibrarian.net/663280763/buildlog_ubuntu-jammy-amd64.bind9_1%3A9.18.12-0ubuntu0.22.04.2~ppa8_BUILDING.txt.gz

Also the current test suite can't be run on i386 since kyua hasn't been build for it in Kinetic and Jammy for some reason

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Is there a log of an upstream run of this test suite somewhere? Maybe there we can see what trick they use to run them? What environment? Maybe they don't even run them in ubuntu?

Revision history for this message
Lena Voytek (lvoytek) wrote :

I haven't found any passing ones for 9.18.12 upstream as the pipelines are often cancelled but I can find some examples of ones passing for 9.18.x on Jammy, such as today:

https://gitlab.isc.org/isc-projects/bind9/-/jobs/3348923

Not sure what's special about their environment but I can try digging a bit

There was an error fetching revisions from git servers. Please try again in a few minutes. If the problem persists, contact Launchpad support.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches