Merge lp://staging/~fo0bar/swoffsite/generator into lp://staging/swoffsite
Proposed by: Ryan Finnie
Status: Merged
Approved by: Stuart Bishop
Approved revision: 43
Merged at revision: 43
Proposed branch: lp://staging/~fo0bar/swoffsite/generator
Merge into: lp://staging/swoffsite
Diff against target: 39 lines (+7/-4), 1 file modified: swoffsite/mirror.py (+7/-4)
To merge this branch: bzr merge lp://staging/~fo0bar/swoffsite/generator
Related bugs: (none)

| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Stuart Bishop (community) | | | Approve |
| Canonical IS Reviewers | | | Pending |

Review via email: mp+367817@code.staging.launchpad.net
Commit message
Set full_listing=False, revert r16, change SWIFT_BATCHSIZE to 1000
Description of the change
- full_listing=False allows the rest of swift_walk's generator
functionality to work, as full_listing=True internally walks the entire
container before returning the full data set, causing memory exhaustion
on large containers (see the swift_walk sketch after this list).
- Change SWIFT_BATCHSIZE to 1000 to avoid Swift server timeouts. Comments
updated to point out why >10000 is futile: Swift will not return more
than 10000 objects in a single listing request.
- Reverting r16 allows s3_walk to be used as a generator, fixing crashes
on large S3 buckets (see the s3_walk sketch below).
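
A minimal sketch of the batched, generator-style container listing described above, not the swoffsite code itself. The names swift_walk and SWIFT_BATCHSIZE are taken from the proposal; the Connection setup and exact implementation details are assumptions.

```python
import swiftclient

# Swift will not return more than 10000 objects per listing request,
# so a batch size above that gains nothing; 1000 keeps each request
# fast enough to avoid server timeouts.
SWIFT_BATCHSIZE = 1000

def swift_walk(conn, container):
    """Yield object entries one batch at a time.

    With full_listing=True, swiftclient follows the listing markers
    itself and returns the entire container as one list, exhausting
    memory on large containers. With full_listing=False we do the
    marker loop here and stay a true generator.
    """
    marker = ''
    while True:
        _headers, objects = conn.get_container(
            container, marker=marker, limit=SWIFT_BATCHSIZE,
            full_listing=False)
        if not objects:
            return
        for obj in objects:
            yield obj
        marker = objects[-1]['name']

# Example use (credentials are placeholders):
# conn = swiftclient.Connection(authurl='https://auth.example.com/v1.0',
#                               user='account:user', key='secret')
# for obj in swift_walk(conn, 'mycontainer'):
#     print(obj['name'])
```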
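A hedged sketch of what a generator-style s3_walk might look like. The actual swoffsite implementation, and which S3 library it uses, is not shown in this proposal; boto3 and these names are assumptions for illustration.

```python
import boto3

def s3_walk(bucket_name):
    """Yield S3 object keys page by page so the caller never holds the
    whole bucket listing in memory, avoiding crashes on large buckets."""
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket_name):
        for obj in page.get('Contents', []):
            yield obj['Key']

# Example use:
# for key in s3_walk('my-offsite-bucket'):
#     print(key)
```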
Yup.
Reducing SWIFT_BATCHSIZE probably does nothing: it just makes an order of magnitude more calls that each return an order of magnitude less information. I think that if pauses between batches are causing timeouts, those pauses come from waiting for large file uploads to S3 to complete (at least on the librarian deploy, it is not uncommon to be blocked on 3 or 4 GB files uploading, provided there is enough disk space to continue, because I was unable to stream them directly from PGP). But we should certainly go with what you have been testing.