Merge lp://staging/~brendan-donegan/checkbox/story871_session_restore_dbus into lp://staging/checkbox

Proposed by Brendan Donegan
Status: Merged
Approved by: Brendan Donegan
Approved revision: 2289
Merged at revision: 2289
Proposed branch: lp://staging/~brendan-donegan/checkbox/story871_session_restore_dbus
Merge into: lp://staging/checkbox
Diff against target: 177 lines (+121/-0)
4 files modified
plainbox/contrib/dbus-mini-client.py (+19/-0)
plainbox/plainbox/impl/service.py (+86/-0)
plainbox/plainbox/impl/session/jobs.py (+12/-0)
plainbox/plainbox/impl/session/test_jobs.py (+4/-0)
To merge this branch: bzr merge lp://staging/~brendan-donegan/checkbox/story871_session_restore_dbus
Reviewer Review Type Date Requested Status
Brendan Donegan (community) Approve
Daniel Manrique (community) Approve
Review via email: mp+178284@code.staging.launchpad.net

Description of the change

Adds methods to the DBus API to allow saving and resuming of sessions. The main methods added are PreviousSessionFile which can be used to determine if there was a previous session saved, PersistentSave which actually saves the session, Resume which will restore the session to the saved state and Clean() which wipes the previously saved state.

There appear to be some issues integrating this functionality as running a resumed session seems to result in various kinds if crashes in the mini client.

To post a comment you must log in.
Revision history for this message
Daniel Manrique (roadmr) wrote :

Hi!

OK, long comment here, because the reason why this code didn't entirely work was more intricate than we thought.

First, this will need rebasing after the recent changes to the miniclient to run local jobs first. It seems to be enough to run session_object.PersistentSave() after run_local_jobs() (see the merge conflict, it'll be pretty obvious).

Second, I was able to replicate an "unable to lookup object wrapper" error. It's similar to the problem I worked on yesterday, but backwards, somehow! Luckily that gave me exactly the knowledge I needed to help debug this.

When starting a brand-new session, JobStateWrappers get created for each object in the session's job_state_map, these are based on existing JobDefinitions that are supplied by the provider.

Then, when local jobs run, they may create new JobDefinitions outside the provider. The code we pushed yesterday creates the JobDefinitionWrapper and JobStateWrapper for these.

Now, what happens with the session resume code is as follows:

1- The session has to be created. Upon creating the session with a job_list, which is usually a set of all the jobs exposed by the provider, we will have a SessionState object, as well as many JobStateWrapper objects, one for each of the jobs the session knows about. Each one points in turn to a jobDefinitionWrapper (the /plainbox/job/xxxxx objects).
2- When session_object.Resume is called, the native session overwrites the old job_list and updates the job_state_map with a new one, taken from the saved session data. This contains a bunch of JobState objects that are defined in the job_list. However, on the DBus service side, the wrappers for the JobState objects never get created, meaning that the SessionStateWrapper's state map points to the old, and by now deleted, set of JobState wrappers. That's why when trying to load those JobStates you get the "unable to lookup object wrapper" problem.

The solution I came up with is similar to the existing check_and_wrap_new_jobs method. Once the resumed session is loaded, we need to look at the native job_list, and wrap and publish any objects that didn't exist before.

The code is somewhat different than check_and_wrap_new_jobs, because, while I only need to create JobDefinitionWrappers for jobs that don't already exist, we need to create new JobStateWrappers for *all* the jobs in the job_state_map.

I'll propose a merge request to your branch, this at least runs correctly but I didn't validate all the behavior. Even if you want to implement things a bit differently, this should give you an idea of where to look.

I'll set to "Needs Fixing" but hopefully this information plus the upcoming merge request should help get things working easily.

review: Needs Fixing
Revision history for this message
Daniel Manrique (roadmr) wrote :

Ugh, this complicated more than expected... working on it, I'll have something soon.

Revision history for this message
Daniel Manrique (roadmr) wrote :

OK, this is complicated. I wrote some code to create the new JobStateWrappers when restoring the session, and with this code in place, the tests can actually be run from a restored session.

But then I ran into a second problem. When trying to generate the report, we need to access each JobState's "job" property, which should point to an existing JobDefinition over dBus. However, since the restored session also contains "copies" of job definitions, the 'job's actually point to these "restored" copies (which exist in memory as native objects), rather than to the known-over-dbus job definitions.

I tried the approach I used for the local job problem, where I simply create new JobDefinitionWrappers for these, but since the resulting object has the same DBus name as the existing ones (as given by the job's checksum), publishing it over DBus fails for obvious reasons.

I also tried taking the old session's job definitions prior to restoring it, then replacing each job (for each item in the native job_state_map) with its old equivalent, which should point to the already-known job and solve things. The problem here is that a JobState's "job" property is not writable, so this approach again failed. Plus I'm not too happy with this solution which seems kludgy.

Anyway, this needs some more thinking, I'll submit the proposed merge request so you can see where my ideas were going but it's still not working' it gets a bit farther in the running process but fails when trying to get the results.

Revision history for this message
Brendan Donegan (brendan-donegan) wrote :

With all of Daniel's good work, this now finally works when resuming a session. Let's get this out there so people can use it.

review: Needs Resubmitting
Revision history for this message
Daniel Manrique (roadmr) wrote :

yes, of course +1 since all this code has been reviewed and ping-ponged to death.

For the record, there are horrible KLUDGES in this to make things work, the logic behind them seems to be sound but the real, clean solution would be architecturally more involved. However for the purposes of getting the API out there, this is good enough.

Thanks for working on this!

review: Approve
Revision history for this message
Daniel Manrique (roadmr) wrote :
Download full text (5.1 KiB)

The attempt to merge lp:~brendan-donegan/checkbox/story871_session_restore_dbus into lp:checkbox failed. Below is the output from the failed tests.

[precise] Bringing VM 'up'
[precise] (timing) 9.37user 4.16system 3:28.95elapsed 6%CPU (0avgtext+0avgdata 21388maxresident)k
[precise] (timing) 8inputs+136outputs (0major+178952minor)pagefaults 0swaps
[precise] Starting tests...
[precise] CheckBox test suite: PASS
[precise] (timing) 0.81user 0.30system 0:37.48elapsed 2%CPU (0avgtext+0avgdata 20552maxresident)k
[precise] (timing) 0inputs+32outputs (0major+49330minor)pagefaults 0swaps
[precise] (timing) 0.75user 0.31system 0:05.94elapsed 17%CPU (0avgtext+0avgdata 20844maxresident)k
[precise] (timing) 2176inputs+16outputs (34major+49206minor)pagefaults 0swaps
[precise] PlainBox test suite: FAIL
[precise] stdout: http://paste.ubuntu.com/5951812/
[precise] stderr: http://paste.ubuntu.com/5951813/
[precise] (timing) Command exited with non-zero status 1
[precise] (timing) 0.92user 0.31system 0:12.98elapsed 9%CPU (0avgtext+0avgdata 20728maxresident)k
[precise] (timing) 0inputs+192outputs (0major+47569minor)pagefaults 0swaps
[precise] PlainBox documentation build: PASS
[precise] (timing) 0.83user 0.20system 0:18.82elapsed 5%CPU (0avgtext+0avgdata 20780maxresident)k
[precise] (timing) 0inputs+16outputs (0major+47512minor)pagefaults 0swaps
[precise] CheckBoxNG test suite: PASS
[precise] (timing) 0.82user 0.21system 0:06.62elapsed 15%CPU (0avgtext+0avgdata 20608maxresident)k
[precise] (timing) 0inputs+16outputs (0major+47443minor)pagefaults 0swaps
[precise] Integration tests: PASS
[precise] (timing) 0.82user 0.22system 0:06.91elapsed 15%CPU (0avgtext+0avgdata 20332maxresident)k
[precise] (timing) 0inputs+8outputs (0major+47999minor)pagefaults 0swaps
[precise] Destroying VM
[quantal] Bringing VM 'up'
[quantal] (timing) 7.38user 3.40system 2:34.18elapsed 6%CPU (0avgtext+0avgdata 20980maxresident)k
[quantal] (timing) 8inputs+112outputs (1major+193503minor)pagefaults 0swaps
[quantal] Starting tests...
[quantal] CheckBox test suite: PASS
[quantal] (timing) 0.88user 0.28system 0:40.07elapsed 2%CPU (0avgtext+0avgdata 19864maxresident)k
[quantal] (timing) 0inputs+32outputs (0major+47399minor)pagefaults 0swaps
[quantal] (timing) 0.79user 0.27system 0:07.43elapsed 14%CPU (0avgtext+0avgdata 19724maxresident)k
[quantal] (timing) 0inputs+16outputs (0major+47438minor)pagefaults 0swaps
[quantal] PlainBox test suite: FAIL
[quantal] stdout: http://paste.ubuntu.com/5951828/
[quantal] stderr: http://paste.ubuntu.com/5951829/
[quantal] (timing) Command exited with non-zero status 1
[quantal] (timing) 0.97user 0.32system 0:17.91elapsed 7%CPU (0avgtext+0avgdata 19984maxresident)k
[quantal] (timing) 0inputs+192outputs (0major+47407minor)pagefaults 0swaps
[quantal] PlainBox documentation build: PASS
[quantal] (timing) 0.77user 0.30system 0:22.05elapsed 4%CPU (0avgtext+0avgdata 19912maxresident)k
[quantal] (timing) 0inputs+16outputs (0major+47277minor)pagefaults 0swaps
[quantal] CheckBoxNG test suite: PASS
[quantal] (timing) 0.79user 0.26system 0:09.95elapse...

Read more...

Revision history for this message
Daniel Manrique (roadmr) wrote :

I proposed a merge to Brendan's branch to fix the failing test. Once that's approved and this branch updated, we can re-approve this one.

I could also request merging from my copy of the branch but I don't want to steal credit from Brendan :)

2289. By Brendan Donegan

Fix test from Daniel

Revision history for this message
Brendan Donegan (brendan-donegan) wrote :

Frankly, given the amount of input you've had on this branch, you probably deserve more of the credit than I do at this point! But anyway I've merged the changes, and re-approving now.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches