Merge lp://staging/~sjakthol/update-manager/fix-slow-calculation into lp://staging/update-manager

Proposed by Sami Jaktholm
Status: Merged
Merged at revision: 2621
Proposed branch: lp://staging/~sjakthol/update-manager/fix-slow-calculation
Merge into: lp://staging/update-manager
Diff against target: 286 lines (+127/-68)
1 file modified
UpdateManager/Core/UpdateList.py (+127/-68)
To merge this branch: bzr merge lp://staging/~sjakthol/update-manager/fix-slow-calculation
Reviewer Review Type Date Requested Status
Michael Terry Approve
Review via email: mp+161284@code.staging.launchpad.net

Commit message

Implement various solutions to speed up the group calculation.

1. Create a set of recursive dependencies for each app group.
   The previous implementation traversed the dependency tree until a dependency
   was found (if there was one) and repeated the same procedure for every
   package. Software Updater spent most of it's time doing the same thing again
   and again.

   This branch traverses the dependency tree once and stores the data in a set
   (for automatic deduplication). The cache is initiated during first call to
   is_dependency and expanded with new items as new packages are added to the
   group. Once the cache is created, it's just a matter of simple "pkg in deps"
   test to see if the package is a dependency of an application in that group.

2. Don't glob /usr/share/app-install/desktop/ for every package.
   Instead, glob the directory once and save the URIs of desktop files that
   have an upgradeable package associated with them. These URIs are later added
   to the list of desktop_files in _get_application_for_package.

3. Cache package dependencies.
   When the package dependencies are calculated for the first time, the data is
   saved to a dict shared between groups. The cached data is used in the
   consequent queries for that same package (in other app groups). It's faster
   to fetch a list of strings from a dict than it's to iterate Dependency
   objects to get the package names.

4. Speed up _file_is_application.
   Make _file_is_application to return false right away if the file name
   doesn't have a .desktop extension.

   The old version was spending a lot of time checking whether file was located
   in a known application directory and then returned false because it didn't
   have a .desktop extension. As this is called for every file in every
   upgradeable package, the performance impact was substantial.

5. Don't spin main loop on every call to _add_deps.
   _add_deps is invoked thousands of times in a few seconds and the UI
   doesn't need to be updated every time. As _add_deps doesn't take ages to
   complete it's enough to spin the loop once in a while. That'll be enough to
   keep the UI alive.

6. Remove useless sorting by source package in _make_groups.
   The list of packages was first sorted by source but that information was
   never used. The useless nested loops were removed.

Description of the change

The tests are still passing and the packages are grouped the same way as they are now.

Inside VirtualBox it took nearly five minutes to calculate groups for 269 updates. This version was able to do it in 0.67 seconds (varied between 0.67 and 2 second for a cold start).

Without VirtualBox overhead the time it took to calculate groups for 308 upgrades went from 2 minutes to 0.75 seconds. However, this was measured before some final changes were made and I don't have access to that system anymore. Inside VirtualBox the changes shaved nearly 1 second off the group calculation time so those should also cut that 0.75 seconds down considerably.

The total time for update calculation is still long, but group calculation shouldn't be a problem anymore. This version spends most of its time opening the cache, which is a different story...

To post a comment you must log in.
Revision history for this message
Michael Terry (mterry) wrote :

Sami, love the work. Thank you so much!

60 + if callable(eventloop_callback) and random.random() < 0.01:

I'm a little uncomfortable using random(). What about if (len(self._deps) % 1000 == 0) or something similar?

Revision history for this message
Sami Jaktholm (sjakthol) wrote :

I didn't even thought of that. That sounds a lot better. I chose to use 200 instead of 1000. That spins the loop 61 times on my test system which is more than enough to keep the UI alive.

2622. By Sami Jaktholm

Use modulo instead of random() to determine if we should spin the loop.

Revision history for this message
Michael Terry (mterry) wrote :

One more thing. If you run nosetests3 in your checkout, you'll see that the pep8 and pyflakes tests fail. Please fix those errors. But beyond that, I think this is good to approve.

(Don't worry if test_is_port_listening fails for you; that just means you aren't running ssh. It's not a good test.)

2623. By Sami Jaktholm

Fix pyflakes and pep8 errors.

Revision history for this message
Sami Jaktholm (sjakthol) wrote :

Fixed those errors. Apparently no warning or error message is shown if you don't have pep8 or pyflakes installed...

Revision history for this message
Michael Terry (mterry) wrote :

Hum, seems bad. But thanks for fixing these! Looks good. Will merge shortly.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to status/vote changes: