curtin

Code review comment for lp://staging/~wesley-wiedenmeier/curtin/partial-testing

partial-testing
Merge into trunk

Christian Ehrhardt  (paelzer) wrote on 2016-07-12:

As a speed check I tried to see whether it has issues running concurrently.
I think I didn't find the right way to call it - I tried:
rm -rf output/; CURTIN_VMTEST_KEEP_DATA_FAIL=all ./tools/jenkins-runner -vv --nologcapture --processes=10 --process-timeout=3600 tests/storagetest_runner/

I found that with this call the following section tries to write an infinitely huge file:
Building tarball of curtin: /mnt/nvme/curtin-wesley/partial-testing

Wanted to let you know just in case that infinite write is a bug.

Later on I found this in the doc:
nosetests3 --processes=-1 tests/storagetest_runner

That gave me a stuck system - maybe too many CPUs (6x2 threads) and therefore too much output?
In any case the output indentation was totally broken - I had to reset my console to be able to scroll again.

That then failed with errors; I am not sure whether that is a real problem or just a wrong call - here is the log: http://paste.ubuntu.com/19157752/
I realized that since this first "hanging" --processes=-1 run, all tests were now failing this way.
All of them fail on:
Traceback (most recent call last):
  File "/mnt/nvme/curtin-wesley/partial-testing/tests/storagetest_runner/__init__.py", line 385, in test_reported_results
    self.assertTrue(os.path.isfile(self.storage_log))
AssertionError: False is not true

Debugging gave me: "(qemu) qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory"
That was likely also the cause of my first hang - but it could be fixed by freeing up some memory :-)
But I wonder if we need some sort of "is enough mem avail" check prior to calling qemu?
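
Such a check could look roughly like the sketch below. It is only an illustration, not curtin code: it assumes a Linux host whose /proc/meminfo exposes MemAvailable (kernel 3.14 or later), and the names available_memory_mb, enough_memory and headroom_mb are made up for this example.

```python
def available_memory_mb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in MiB, or None if it cannot be determined."""
    try:
        with open(meminfo_path) as fp:
            for line in fp:
                if line.startswith("MemAvailable:"):
                    # Line format: "MemAvailable:   12345678 kB"
                    return int(line.split()[1]) // 1024
    except (OSError, ValueError, IndexError):
        pass
    return None


def enough_memory(required_mb, headroom_mb=256, meminfo_path="/proc/meminfo"):
    """Hypothetical pre-flight check to run before launching a guest.

    required_mb would be the amount of RAM the guest is started with;
    headroom_mb is an arbitrary safety margin chosen for this sketch.
    """
    avail = available_memory_mb(meminfo_path)
    if avail is None:
        # Unknown amount of free memory: do not block the run.
        return True
    return avail >= required_mb + headroom_mb
```

A test could then skip (or wait) instead of launching qemu when enough_memory() returns False. With several workers starting at once the check is still racy, so it would only reduce the problem rather than remove it.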

In the following retry it again gave me a good return code, but left plenty of qemu processes still running.
That really needs some hardening.
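
One possible direction for that hardening is to tie the lifetime of the child process to a context manager, so it is terminated even when the test body raises. This is a generic sketch using only the Python standard library; managed_process and term_timeout are names invented here, and I have not checked how the branch actually launches qemu.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def managed_process(cmd, term_timeout=10):
    """Start cmd and guarantee it is gone when the with-block exits."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        if proc.poll() is None:
            # Ask politely first (SIGTERM) ...
            proc.terminate()
            try:
                proc.wait(timeout=term_timeout)
            except subprocess.TimeoutExpired:
                # ... then force the issue (SIGKILL) and reap the child.
                proc.kill()
                proc.wait()
```

This only covers the case where the Python code in the worker gets a chance to run its cleanup; it would not help if the test worker itself is killed outright.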

Then I wanted to step down and did only:
nosetests3 --processes=2 tests/storagetest_runner
To check if it works at all.
I got the same console output that becomes misformatted after a while, ending with:
Ran 0 tests in 112.494s

Since all failures keep the logs around, here is the log file: http://paste.ubuntu.com/19158435/

TL;DR: concurrent execution needs some fixes and probably a bit of hardening against shooting itself in the foot :-)