Merge ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic into ubuntu/+source/qemu:ubuntu/bionic-devel
- Git
- lp:~rafaeldtinoco/ubuntu/+source/qemu
- lp1805256-bionic
- Merge into ubuntu/bionic-devel
Status: | Merged |
---|---|
Approved by: | Rafael David Tinoco |
Approved revision: | 4972897a218dbed489d24bd9c6446241b771be1f |
Merged at revision: | 4972897a218dbed489d24bd9c6446241b771be1f |
Proposed branch: | ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic |
Merge into: | ubuntu/+source/qemu:ubuntu/bionic-devel |
Diff against target: |
585 lines (+533/-0) 8 files modified
debian/changelog (+12/-0) debian/patches/series (+6/-0) debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch (+116/-0) debian/patches/ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch (+41/-0) debian/patches/ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch (+62/-0) debian/patches/ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch (+107/-0) debian/patches/ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch (+164/-0) debian/patches/ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch (+25/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Christian Ehrhardt (community) | Approve | ||
Rafael David Tinoco (community) | Approve | ||
Ike Panhc (community) | Approve | ||
dann frazier (community) | Needs Fixing | ||
Canonical Server | Pending | ||
Review via email: mp+383566@code.staging.launchpad.net |
Description of the change
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Finishing up small build fixes related to implicit declarations, will give an ok soon here.
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Okay, after finishing up the declarations issue I faced a new building problem:
https:/
for Ubuntu Bionic:
/usr/include/
/home/rafaeldti
#define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
/home/rafaeldti
#define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
Working on it
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Already spoken to you about the previous comment. You pointed me:
https:/
So we're good there.
Christian Ehrhardt (paelzer) wrote : | # |
This looks well prepared, 7.24 will eventually also get CVEs added and the kernel bug that makes it an FTFBS needs to be fixed. So +1 for now and the steps from here:
1. wait for bug 1877123 (kernel team)
2. test current SRU (my job)
3. release my SRU with security fixes (mdeslaur)
4. rebase this MP and the others (rafael)
5. hand over to the SRU Team to accept it
dann frazier (dannf) wrote : | # |
I self-built the package from the PPA w/ -proposed disabled to avoid the build failure. Unfortunately, this fails testing on a Cavium Sabre (ThunderX2):
ubuntu@starbuck:~$ while :; do i=$((i + 1)); echo "=== $i ==="; qemu-img convert -p -f qcow2 -O qcow2 bionic-
=== 1 ===
(100.00/100%)
=== 2 ===
(100.00/100%)
=== 3 ===
(100.00/100%)
=== 4 ===
(100.00/100%)
=== 5 ===
(100.00/100%)
=== 6 ===
(100.00/100%)
=== 7 ===
(100.00/100%)
=== 8 ===
(100.00/100%)
=== 9 ===
(100.00/100%)
=== 10 ===
(100.00/100%)
=== 11 ===
(100.00/100%)
=== 12 ===
(100.00/100%)
=== 13 ===
(100.00/100%)
=== 14 ===
(100.00/100%)
=== 15 ===
(0.00/100%)
<HANG>
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
@dannf,
Have you tested the other ones as well ? Is the bionic version the only one hanging ? My intent with this question is to check whether this was intermittent enough to be seen in bionic by a coincidence (frequency is not *that high*, and, by accident it happened with bionic backport) OR it definitely does not happen in the other backports, just with this one.
@paelzer,
Should I get a dump and analyse it with elf+dwarf symbols ? I can update upstream on the issue if needed.
dann frazier (dannf) wrote : | # |
@rafaeldtinoco - I tested focal (groovy by proxy) and eoan, and those worked fine. See my +1s in those merge proposals. bionic is the only one hanging for me. That reflects my own "clean-room" backport testing as well - my eoan/focal backports were clean cherry picks that worked fine. My bionic one however required some additional backporting, and that still hit hangs. However, Ike's backport worked fine for me.
dann frazier (dannf) wrote : | # |
@rafaeldtinoco: The relevant difference between your backport and Ike's is below. After applying this patch, my reproducer no longer fails. This undoes a change that came in w/ the extra backport of debian/
Index: qemu-2.
=======
--- qemu-2.
+++ qemu-2.
@@ -647,7 +647,6 @@ bool aio_poll(AioContext *ctx, bool bloc
if (blocking) {
/* Finish the poll before clearing the flag. */
- aio_notify_
}
/* Adjust polling time */
@@ -691,6 +690,8 @@ bool aio_poll(AioContext *ctx, bool bloc
}
}
+ aio_notify_
+
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Hey Dann,
This, per se, is not a change backed by an upstream change (not an entire change at least). The diff brought by you is part of a bigger commit:
----
Message-Id: <email address hidden>
Signed-off-by: Fam Zheng <email address hidden>
Signed-off-by: Rafael David Tinoco <email address hidden>
Author: Paolo Bonzini <email address hidden>
Origin: upstream, https:/
Bug: https:/
Bug-Ubuntu: https:/
Reviewed-By: Rafael David Tinoco <email address hidden>
Last-Update: 2020-05-07
----
also added, by me, to the Bionic SRU, just like all further SRUs, with the following backport note:
[Rafael David Tinoco]
I have cherry-picked this fix because of the AIO primitives fix
dependency, nevertheless, it is good to notice it fixes the case
above as well. Per se it would be difficult to SRU this, but,
since we're stressing AIO mechanism with the other Aarch64 needed
fix, this one fits the purpose as well.
and the filename:
debian/
containing:
https:/
and the *most important* part of the code change you referred to.
----
The thing is.. changes to AIO code can make the issue to appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (specially in the case where the 2 variables that need syncing are sharing the same cache line <- which is hardly ever the truth).
It's hard to consider something fixed because of a change in binaries not backed by upstream in this case (remember when we added a local - to a function - variable and caused the issue to temporarily disappear because of diff cache lines having the 2 addresses that had to be synced ?).
Looks like there are no available TX2 machines for me to play with: I provisioned one and after a bit of work there (couldn't reproduce) I realized it wasn't a TX2 =\. I'll keep it there so if I can get a TX2 this week I can just move my files from one machine to another.
----
Current Status in my POV:
- Either we fixed the issue and the patch above is causing the same regression for the version we have in Bionic (which seems less likely)
- Or the issue is still there but being reproduced in a few builds only (like the one we have in my PPA) and not in others (like the one you had without the patch above).
----
Id like to have access to TX2 machine so I can test the same build with and without commit
lp-1805256-
It could be the case that we did not fix any version but we can only reproduce the issue in Bionic, for some reason, for example. Then I would have to get back to Paolo showing that by using a dump analysis to back us up.
Thoughts ?
dann frazier (dannf) wrote : | # |
[ + Ike ]
On Tue, May 19, 2020 at 9:00 PM Rafael David Tinoco
<email address hidden> wrote:
>
> Hey Dann,
>
> This, per se, is not a change backed by an upstream change (not an entire change at least). The diff brought by you is part of a bigger commit:
>
> ----
> Message-Id: <email address hidden>
> Signed-off-by: Fam Zheng <email address hidden>
> Signed-off-by: Rafael David Tinoco <email address hidden>
>
> Author: Paolo Bonzini <email address hidden>
> Origin: upstream, https:/
> Bug: https:/
> Bug-Ubuntu: https:/
> Reviewed-By: Rafael David Tinoco <email address hidden>
> Last-Update: 2020-05-07
> ----
>
> also added, by me, to the Bionic SRU, just like all further SRUs, with the following backport note:
>
> [Rafael David Tinoco]
>
> I have cherry-picked this fix because of the AIO primitives fix
> dependency, nevertheless, it is good to notice it fixes the case
> above as well. Per se it would be difficult to SRU this, but,
> since we're stressing AIO mechanism with the other Aarch64 needed
> fix, this one fits the purpose as well.
>
> and the filename:
>
> debian/
>
> containing:
>
> https:/
>
> and the *most important* part of the code change you referred to.
Right, I noticed and mentioned that. In fact, when I did a clean room
backport myself, I pulled the same change in with the same result.
> ----
>
> The thing is.. changes to AIO code can make the issue to appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (specially in the case where the 2 variables that need syncing are sharing the same cache line <- which is hardly ever the truth).
>
To be clear, I'm not suggesting we blindly drop that change w/o
understanding why we should.
>
> It's hard to consider something fixed because of a change in binaries not backed by upstream in this case (remember when we added a local - to a function - variable and caused the issue to temporarily disappear because of diff cache lines having the 2 addresses that had to be synced ?).
>
>
> Looks like there are no available TX2 machines for me to play with: I provisioned one and after a bit of work there (couldn't reproduce) I realized it wasn't a TX2 =\. I'll keep it there so if I can get a TX2 this week I can just move my files from one machine to another.
>
I've released one I was using (apollo), feel free to grab. You can
also use any Hi1620 chip - filter for MAAS tags soc-thunderx2 and
soc-hi1620, respectively.
>
> Current Status in my POV:
>
> - Either we fixed the issue and the patch above is causing the same regression for the version we have in Bionic (which seems less likely)
>
> - Or the issue is still there but being reproduced in a few builds only (like the one we have in my PPA) and not in others (like the one you had without the patch above).
>
> ----
>
> Id like to have access to TX2 machine so I can test the same build with a...
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
>> debian/
>>
>> containing:
>>
>> https:/
>>
>> and the *most important* part of the code change you referred to.
> Right, I noticed and mentioned that. In fact, when I did a clean room
> backport myself, I pulled the same change in with the same result.
Great! We're synced then!
>> ----
>>
>> The thing is.. changes to AIO code can make the issue to appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (specially in the case where the 2 variables that need syncing are sharing the same cache line <- which is hardly ever the truth).
> To be clear, I'm not suggesting we blindly drop that change w/o
> understanding why we should.
> Ah gotcha. Sorry for misunderstanding.
> I've released one I was using (apollo), feel free to grab. You can
> also use any Hi1620 chip - filter for MAAS tags soc-thunderx2 and
> soc-hi1620, respectively.
Will do!
>> It could be the case that we did not fix any version but we can only reproduce the issue in Bionic, for some reason, for example. Then I would have to get back to Paolo showing that by using a dump analysis to back us up.
> We could easily reproduce in eoan and focal before backporting the
> upstream fix. My hypothesis is that something changed after bionic
> that made lp-1805256-
> safe/correct to apply.
Yes! I thought the same! I'm just afraid that we can be thinking that
because the builds (each new major/minor version) are not 100%
reproducible and could be generating different objects with the cache
line main issue. But I share the same opinion as you on the hypothesis!
dann frazier (dannf) wrote : | # |
I think you may need the following commit as well:
70232b5253 aio-posix: Don't count ctx->notifier as progress when polling
I added that to your package and started a test - so far > 100 iterations OK. But I won't be confident until at least 500.
dann frazier (dannf) wrote : | # |
At 2214 iterations and still going, so I'm confident :)
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Very nice indeed! Nice finding!
I have not replied to you before because unfortunately had a massive I/O issue with my nvme card #). Let me know if you're confident enough so I can include that patch in the official SRU, please!
dann frazier (dannf) wrote : | # |
On Fri, May 22, 2020 at 6:56 PM Rafael David Tinoco
<email address hidden> wrote:
>
> Very nice indeed! Nice finding!
>
> I have not replied to you before because unfortunately had a massive I/O issue with my nvme card #). Let me know if you're confident enough so I can include that patch in the official SRU, please!
+1 from my POV. It made it to 3093 successful tests, which is when my
connection dropped. btw, the way I found it was to bisect, applying
the set of patches you selected at each stage (those of which weren't
already applied at the chosen commit).
dann frazier (dannf) wrote : | # |
fyi, there are merge conflict markers in this version
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Hey Dann, will sort them out..
Actually I stopped in the middle of my previous attempt - thus the leftovers - and discussed this case with @cpaelzer. It turns out that the commit you mention, like I predicted, just added entropy - likely flushing local/remote cache lines - in between 2 primitives/barriers manipulation inside AIO synchronization code.
That explains why I did not see it before, it does not play a role in the locking primitives. With that in mind, it is very likely that the real original issue has not been fixed and is been only seeing in current bionic binary and code base.
After speaking to cpaelzer: I'll try to create a small code that mimics the issue and see if we can reproduce it at will to demonstrate to both: HW designers and Paolo. This effort will be done "boxed" in some of my schedule hours.
Meanwhile, after fixing the markers, @cpaelzer will likely accept this SRU as is cause it won't hurt anyway.. BUT we *could*, eventually, face the same issue again. Be aware :o).
Last case scenario, we can also just add some padding in between the 2 locking variables and make sure they never, ever, are loaded in the same cacheline <- that would mitigate the original issue for sure.
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
BTW the leftovers were there just because I did not rebase newer bionic versions. I'm pushing a rebase to latest bionic-devel version.
dann frazier (dannf) wrote : | # |
On Tue, May 26, 2020 at 11:26 AM Rafael David Tinoco
<email address hidden> wrote:
>
> Hey Dann, will sort them out..
>
> Actually I stopped in the middle of my previous attempt - thus the leftovers - and discussed this case with @cpaelzer. It turns out that the commit you mention, like I predicted, just added entropy - likely flushing local/remote cache lines - in between 2 primitives/barriers manipulation inside AIO synchronization code.
Could it just be that there are 2 different issues here?
A) One that impacts all releases, which is fixed by "async: use
explicit memory barriers"
B) One that only impacts bionic, which requires both "aio-posix: Don't
count ctx->notifier as progress when polling" and "aio: Do
aio_notify_accept only during blocking aio_poll"? These 2 patches were
submitted together to fix an IOThread hang:
https:/
I'll see if I can collect a backtrace w/ the fix for A) and not B) to
see how it compares to the one we've attributed to B).
-dann
dann frazier (dannf) wrote : | # |
Here's a backtrace using the current bionic PPA package, which includes:
async: use explicit memory barriers
aio: Do aio_notify_accept only during blocking aio_poll
But does *not* include:
aio-posix: Don't count ctx->notifier as progress when polling
(gdb) thread apply all bt
Thread 2 (Thread 0xffff9d787ff0 (LWP 70291)):
#0 0x0000ffff9dc022cc in __GI___sigtimedwait (set=<optimized out>, set@entry=0x
aaaaebac2a20, info=info@
try=0x0) at ../sysdeps/
#1 0x0000ffff9dd39fac in __sigwait (set=set@
=0xffff9d787764) at ../sysdeps/
#2 0x0000aaaac2311d70 in sigwait_compat (opaque=
tfd.c:36
#3 0x0000ffff9dd2f088 in start_thread (arg=0xffffdd0d
463
#4 0x0000ffff9dc9f4ec in thread_start () at ../sysdeps/
clone.S:78
Thread 1 (Thread 0xffff9d789010 (LWP 70290)):
#0 0x0000aaaac230e36c in aio_bh_poll (ctx=ctx@
sync.c:120
#1 0x0000aaaac2311818 in aio_poll (ctx=ctx@
mized out>) at ./util/
#2 0x0000aaaac229b2f0 in bdrv_prwv_co (child=
=offset@
lse, flags=flags@
#3 0x0000aaaac229b654 in bdrv_preadv (qiov=0xffffdd0
=0xaaaaebace2f0) at ./block/io.c:764
#4 bdrv_pread (child=
<optimized out>) at ./block/io.c:785
#5 0x0000aaaac226c514 in qcow2_do_open (bs=0xaaaaebaf3f90, options=
e0, flags=25090, errp=0xffffdd0d
#6 0x0000aaaac224d834 in bdrv_open_driver (bs=bs@
aac2375458 <bdrv_qcow2>, drv@entry=
tions=options@
entry=0x4202) at ./block.c:1146
#7 0x0000aaaac224eb90 in bdrv_open_common (errp=0x4202, options=
file=0xaaaaeba
#8 bdrv_open_inherit (filename=
#9 0x0000aaaac224f9d8 in bdrv_open (filename=
#10 0x0000aaaac228b3c4 in blk_new_open (filename=
#11 0x0000aaaac22415b0 in img_open_file (filename=
#12 0x0000aaaac224437c in img...
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
> Could it just be that there are 2 different issues here?
>
> A) One that impacts all releases, which is fixed by "async: use
> explicit memory barriers"
>
> B) One that only impacts bionic, which requires both "aio-posix: Don't
> count ctx->notifier as progress when polling" and "aio: Do
> aio_notify_accept only during blocking aio_poll"? These 2 patches were
> submitted together to fix an IOThread hang:
> https:/
>
> I'll see if I can collect a backtrace w/ the fix for A) and not B) to
> see how it compares to the one we've attributed to B).
A more plausible scenario, discussed with @cpaelzer as well, would be
that the patch we just placed in the patchset, has added entropy in
between 2 lock acquire/releases. This way, just like adding variables in
between the 2 lock variables, we are mitigating the issue.
Considering the initial root cause:
- 2 locking variables adjacent in address space
- thus sharing the same cache line
- when being manipulated in 2 x different CPUs
- each CPU with its own cache line
- CPUs should invalidate each others cacheline
We have the following mitigations:
- replacing locking primitives by full mutex implementation
- removing variables from the same cache line
- changing entropy reducing/postponing lock reads/writes
- this one is here
So I think its more of the same... explaining to @cpaelzer he felt that
this was also plausible. Only way to confirm would be to isolate
everything else and stick with a small reproducer.
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Your backtrace looks weird to me..
If I just unapply:
+ - aio-posix: Don't count ctx->notifier as progress when polling
I have this as instruction pointer at the time of the dump:
if (bh->deleted) { -> line 120
deleted = true;
}
inside a bigger loop, stepping through existing "bottom halves" (scheduled work in the form of callbacks).
It does not look like the original issue:
- we don't have multiple threads running for the locking variables
- sine qua non condition for the race to happen
- we have a single thread environment ?
- are you specifying 1 couroutine only ? (this was our first mitigation idea)
- it does not look like it is "hang"
- we are not waiting on locking anywhere as it appears
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
I pinged you on IRC, maybe we can have a hangout/meet session to help with the back and forth.
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
Okay, as we spoke in the hangout, wrapping up:
- Paolo patches backported to bionic likely fixed the original issue
- ... and the issue in all other versions Eoan, Focal, Groovy
- This "new" issue that appeared in Bionic in unrelated with the original one
- That explains why having patch:
"- aio-posix: Don't count ctx->notifier as progress when polling"
fixes "the issue" but it was a "loop forever inside RCU foreach macro"
- Original issue was not reproduced any longer
Bionic is good to go with all the patches:
* d/p/ubuntu/
- aio: rename aio_context_
- aio: Do aio_notify_accept only during blocking aio_poll
- aio-posix: Assert that aio_poll() is always called in home thread
- async: use explicit memory barriers (LP: #1805256)
- aio-wait: delegate polling of main AioContext if BQL not held
- aio-posix: Don't count ctx->notifier as progress when polling
All patches but the latest one are fixing the original (to LP: #1805256) issue and the last patch is fixing an issue observed by your tests with those backports (perhaps something that exited before or not, but we are sure it fixes the issue).
With this summary we're good to go for the SRU @cpaelzer.
Christian Ehrhardt (paelzer) wrote : | # |
Thanks for all the discussion and tests already!
This one is already rebased for the CVE-2019-
The code is a bit more than on >=Eoan as discussed in the past I think it is ok that way to get the things to apply and work as expected. ALso e.g. the assert (fix #3) will help in case assumptions are not true on bionics qemu (things should be ok, but if not better a crash than silent corruption).
Re-read the patches LGTM to me (still).
Can you update the PPA (https:/
Also this time the former FTBFS in Bionic is resolved, so it will actually build.
I'll then run a regression test against all four versions and if ok sponsor it into the -unapproved queue after the groovy upload completed.
Ike Panhc (ikepanhc) wrote : | # |
Thanks for picking up those patches from upstream. Build the qemu-img on 96cores arm64 system and it pass 100 times of qemu-img convert. The qemu-img in bionic-update will hang in less then 10 times run.
Rafael David Tinoco (rafaeldtinoco) wrote : | # |
@paelzer:
Done!
Ike Panhc (ikepanhc) wrote : | # |
Overnight run (>7000 qemu-img convert) and do not hang on any run.
Christian Ehrhardt (paelzer) wrote : | # |
Thanks for the rebase, non case specific regression testing is a few thousand tests in and still working.
Approving the MP now.
If no issue shows up late I'll sponsor this to -unapproved later on.
Christian Ehrhardt (paelzer) wrote : | # |
Regression tests completed - sponsoring once groovy is complete
Christian Ehrhardt (paelzer) wrote : | # |
Groovy is completed, SRU sponsored into -unapproved
To ssh://git.
* [new tag] upload/
Some issues as discussed with rbasak a while ago with the new: ui/keycodemapdb
Not sure if the tag will ever be able to match in Bionic :-/
But eventually
Uploading to ubuntu (via ftp to upload.ubuntu.com):
Uploading qemu_2.
Uploading qemu_2.
Uploading qemu_2.
Uploading qemu_2.
Successfully uploaded packages.
There was an error fetching revisions from git servers. Please try again in a few minutes. If the problem persists, contact Launchpad support.
@paelzer,
Here the same thing I said in:
https:/ /code.launchpad .net/~rafaeldti noco/ubuntu/ +source/ qemu/+git/ qemu/+merge/ 383530/ comments/ 1006812
applies. All the other cherry-picks and/or backports were needed to make a smoother (and better maintained later) SRU, I added [] notes in each of those patches I considered needed a note from me to you.