Merge lp://staging/~jml/lazr.restfulclient/multiple-instance-safe into lp://staging/lazr.restfulclient

Proposed by Jonathan Lange
Status: Merged
Approved by: j.c.sackett
Approved revision: 141
Merged at revision: 122
Proposed branch: lp://staging/~jml/lazr.restfulclient/multiple-instance-safe
Merge into: lp://staging/lazr.restfulclient
Diff against target: 319 lines (+260/-6)
3 files modified
buildout.cfg (+1/-0)
src/lazr/restfulclient/_browser.py (+101/-6)
src/lazr/restfulclient/tests/test_atomicfilecache.py (+158/-0)
To merge this branch: bzr merge lp://staging/~jml/lazr.restfulclient/multiple-instance-safe
Reviewer Review Type Date Requested Status
j.c.sackett (community) Approve
Richard Harding code* Approve
Review via email: mp+98873@code.staging.launchpad.net

Commit message

Creates AtomicFileCache as a parent class for MultipleRepresentationCache to handle concurrent use issues.

Description of the change

Hello,

I now work on a couple of projects that carry workarounds for bug 459418. It's not economical for my employer to be constantly maintaining these workarounds and to keep hitting this bug when writing new code that uses launchpadlib. As such, I offer this patch to make the file cache used by lazr.restfulclient safely shareable by multiple processes/threads.

It's an adaptation of the patch at http://code.google.com/p/httplib2/issues/detail?id=125.

It adds a new class, AtomicFileCache, and makes the MultipleRepresentationCache subclass that instead of the one that comes with httplib2.

Essentially, there are two components:

 1. Change look-before-you-leap to try/except. This prevents races where a thing exists (or doesn't exist) when we do the check, but then doesn't exist (or exists) when we do the operation.

 2. Do an atomic write in set, relying on the filesystem's atomic rename feature. This is not actually atomic on Windows, but will not be any buggier for them than the current code.

If you are interested in the subject of atomic writes, you might like to read <http://twistedmatrix.com/documents/current/api/twisted.python.filepath.FilePath.html#setContent>. The implementation of the method is much the same as set() here.

There are no tests. It's really hard to produce reliable tests for the race conditions that you find here without substantially changing the public API of the file cache. httplib2 doesn't have any tests for the interface of FileCache, otherwise I would have tried to re-use those. It wouldn't be too draconian to insist on adding some basic interface tests (e.g. does get() get what set() sets?) here, I think.

I look forward to hearing from you.

jml

To post a comment you must log in.
Revision history for this message
Richard Harding (rharding) wrote :

Thanks for this.

review: Approve (code*)
Revision history for this message
j.c.sackett (jcsackett) wrote :

Jml:

This looks really great. I'm going to be that not-too-draconian reviewer and ask for the basic tests you've suggested.

I absolutely concur that creating a test for the race condition is out of consideration--the test case would like be sufficiently complicated that failures might be from a change to the test, not to the code being tested.

Next, I note several imports being whacked--I assume those are just not in use in the code anymore, correct?

Lastly, do you have landing rights for this, or will you need someone to push it along for you?

review: Needs Information
Revision history for this message
Martin Pool (mbp) wrote :

I'm very grateful for you addressing this bug.

This is very likely to fail on Windows due to the file being unable to be overwritten if it's in use. I know people have at least tried to use lazr.restfulclient and lplib on Windows. I don't know if it's actually supported or working. If it does work a bit, you might want to try not to regress it.

Revision history for this message
Robert Collins (lifeless) wrote :

+1 for the LoC increase per
https://dev.launchpad.net/PolicyAndProcess/MaintenanceCosts - this is
a significant source of friction in using lazr.restful.

-Rob

131. By Jonathan Lange

Remove the destination file if the platform is windows (thanks mbp)

132. By Jonathan Lange

Initial tests for atomic file cache.

133. By Jonathan Lange

Interface tests for FileCache.

134. By Jonathan Lange

Extract out how we make the file cache.

135. By Jonathan Lange

Rename the test class.

136. By Jonathan Lange

Test things not being strings.

137. By Jonathan Lange

Tests for unicode behaviour.

138. By Jonathan Lange

Run the tests against httplib2.FileCache in order to prevent interface drift.

139. By Jonathan Lange

Copyright update.

140. By Jonathan Lange

Implementation-specific tests.

141. By Jonathan Lange

Docstrings.

Revision history for this message
Jonathan Lange (jml) wrote :

Hello all,

Thanks for the reviews, and for the free pass on increasing LoC.

I've added interface tests for FileCache and have set these up to run against both httplib2.FileCache and AtomicFileCache. The difference it makes to test run time is pretty small relative to current run time, and it will prevent interface drift.

Adding these tests revealed one difference between FileCache and AtomicFileCache: if FileCache fails to set(), say, because of a TypeError, then it will return '' on get() for that key; AtomicFileCache will return None, as if the key were never set. I think the difference in behaviour doesn't actually matter for our purposes, but I've left the tests in as documentation anyway. I also think that AtomicFileCache behaves correctly here and that FileCache itself is wrong.

I have also added docstrings, which I hope will help.

Per Martin's comment, I've updated set() to remove the file before renaming it, following the pattern laid out in twisted.python.filepath (which has had extensive usage and testing on Windows).

JC, yes, the removed imports are unused. If you don't believe me, well, I guess the automated test suite will catch me out ;)

I don't have sufficient permissions to land this myself, nor do I actually know how to land it. (Is it PQM-managed? Tarmac? etc.) I would be grateful if you would land it for me.

cheers,
jml

Revision history for this message
j.c.sackett (jcsackett) wrote :

> I've added interface tests for FileCache and have set these up to run against
> both httplib2.FileCache and AtomicFileCache. The difference it makes to test
> run time is pretty small relative to current run time, and it will prevent
> interface drift.
>
> Adding these tests revealed one difference between FileCache and
> AtomicFileCache: if FileCache fails to set(), say, because of a TypeError,
> then it will return '' on get() for that key; AtomicFileCache will return
> None, as if the key were never set. I think the difference in behaviour
> doesn't actually matter for our purposes, but I've left the tests in as
> documentation anyway. I also think that AtomicFileCache behaves correctly
> here and that FileCache itself is wrong.
>
> I have also added docstrings, which I hope will help.

Fantastic, thanks for those changes.

> JC, yes, the removed imports are unused. If you don't believe me, well, I
> guess the automated test suite will catch me out ;)

I believe you; I couldn't get at the code myself on Thursday for reasons best attributed to massive ISP fail, so fell back on the old method of asking. :-p

> I don't have sufficient permissions to land this myself, nor do I actually
> know how to land it. (Is it PQM-managed? Tarmac? etc.) I would be grateful if
> you would land it for me.

I will see what I can do; I think this is something we certainly want landed swiftly.
> cheers,
> jml

review: Approve
Revision history for this message
Jonathan Lange (jml) wrote :

On 26 March 2012 15:16, j.c.sackett <email address hidden> wrote:
...
>> I don't have sufficient permissions to land this myself, nor do I actually
>> know how to land it. (Is it PQM-managed? Tarmac? etc.)  I would be grateful if
>> you would land it for me.
>
> I will see what I can do; I think this is something we certainly want landed swiftly.

Thanks! A release would help a lot too.

jml

Revision history for this message
Jonathan Lange (jml) wrote :

How's it going getting this landed? (It's filling up a WIP slot on our kanban).

Revision history for this message
Martin Pool (mbp) wrote :

I don't want to stall this, but I do think that

    os.unlink(cache_full_path)

may still fail on Windows if the existing file is in use by some other client. Solutions could include:

 - just log it (perhaps silent by default) and continue
 - warn
 - retry for a bit (not great because it slows down the writer)
 - change to using win32 apis that don't lock against deletion

I don't know how well this works on Windows at the moment, but if people are using it this might be a regression.

Revision history for this message
Robert Collins (lifeless) wrote :

On Mon, Apr 2, 2012 at 5:24 PM, Martin Pool <email address hidden> wrote:
> I don't want to stall this, but I do think that
>
>    os.unlink(cache_full_path)
>
> may still fail on Windows if the existing file is in use by some other client.  Solutions could include:
>
>  - just log it (perhaps silent by default) and continue
>  - warn
>  - retry for a bit (not great because it slows down the writer)
>  - change to using win32 apis that don't lock against deletion
>
> I don't know how well this works on Windows at the moment, but if people are using it this might be a regression.

It will die horribly at the moment, this isn't going to be a
regression: we don't currently support concurrent use. This patch will
give us concurrent use on Linux. A future iteration can do concurrent
use on win32.

-Rob

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches