Merge lp://staging/~abentley/storm/executemany into lp://staging/storm

Proposed by Aaron Bentley
Status: Rejected
Rejected by: Kevin McDermott
Proposed branch: lp://staging/~abentley/storm/executemany
Merge into: lp://staging/storm
Diff against target: 875 lines (+442/-81)
8 files modified
storm/database.py (+39/-6)
storm/store.py (+7/-0)
storm/tracer.py (+50/-7)
tests/database.py (+109/-4)
tests/databases/postgres.py (+14/-0)
tests/mocker.py (+2/-2)
tests/store/base.py (+20/-0)
tests/tracer.py (+201/-62)
To merge this branch: bzr merge lp://staging/~abentley/storm/executemany
Reviewer Review Type Date Requested Status
Aaron Bentley (community) Disapprove
Stuart Bishop (community) Approve
Storm Developers Pending
Review via email: mp+93653@code.staging.launchpad.net

Commit message

Support cursor.executemany all through the stack.

Description of the change

This branch adds support for cursor.executemany to Storm.

My use case is the Launchpad's branch scanner, and even compared to a single execute using multiple escaped rows, executemany completes in 1/3 the time.

I've done my best to make sure executemany is tested just as rigourously as execute, but please let me know if there are any ways you'd like the tests done differently.

test_capture_with_exception uses Python's 'with' statement, so I changed it to use ExpectedException for clarity.

if action.kind was "getattr", Patcher.execute would raise an AttributeError referring to the wrong variable, so I changed the bare raise to "raise e".

To post a comment you must log in.
Revision history for this message
Stuart Bishop (stub) wrote :

> This branch adds support for cursor.executemany to Storm.
>
> My use case is the Launchpad's branch scanner, and even compared to a single execute using multiple escaped rows, executemany completes in 1/3 the time.

There is another branch going through review that implements
multivalued INSERT. For this use case, I'm not sure which will end up
faster. Still, this branch has other uses such as multiple UPDATEs,
and possibly multiple SELECTs where a stored procedure is being
invoked (possibly this will fail, as each SELECT will return a result
set and the DB-API lets the driver fail in this case).

I'm fine with this. I found one typo, and have a few comments that
don't require any changes.

> +    def _raw_execute(self, statement, params, executemany):

Did you try having this split into _raw_execute and _raw_execute_many?
I suspect it will be better despite some code duplication. Similarly
for _execute. I'm fine with this design though if you feel this is the
more maintainable variant.

> +    def connection_raw_executemany_error(self, connection, raw_cursor,
> +                                     statement, params, error):
> +        """Raise TimeoutError if the given error was a timeout issue.
> +
> +        Must be specialized in the backend.
> +        """
> +        raise NotImplementedError("%s.connection_raw_execute_error() must be "
> +                                  "implemented" % self.__class__.__name__)

Typo here in the error message - execute__error instead of executemany_error.

>     def test_execute_convert_param_style(self):
> -        self.connection.execute("'?' ? '?' ? '?'")
> -        self.assertEquals(self.executed, [("'?' ? '?' ? '?'", marker)])

Should this be removed, or was this a valid test with an invalid name?

> +    def test_unicode_many(self):
> +        raw_str = "\xc3\xa1\xc3\xa9\xc3\xad\xc3\xb3\xc3\xba"
> +        uni_str = raw_str.decode("UTF-8")

I see you are cribbing from existing tests so this is fine. I always
prefer to use u'\N{TRADE MARK SIGN}' style notation to construct valid
Unicode strings without polluting ASCII source files.

> +    def prepare_raise_timeout_on_granularity(self):
> +        tracer_mock = self.mocker.patch(self.tracer)
> +
> +        self.mocker.order()
> +
> +        tracer_mock.get_remaining_time()
> +        self.mocker.result(self.tracer.granularity)
> +        tracer_mock.set_statement_timeout(self.raw_cursor,
> +                                          self.tracer.granularity)
> +        tracer_mock.get_remaining_time()
> +        self.mocker.result(0)
> +        self.mocker.replay()
> +
>     def test_raise_timeout_on_granularity(self):

Instead of these prepare + 2 * execute tests, it might have been
better to subclass the TestCase and override the self.execute method
to do executemany.

--
Stuart Bishop <email address hidden>

Revision history for this message
Stuart Bishop (stub) :
review: Approve
Revision history for this message
Aaron Bentley (abentley) wrote :

wgrant and I have determined that this strategy of performing bulk insertions does not perform as well as lp:~wgrant/storm/bulk-insert. wgrant has further determined that psycopg's executemany works by repeatedly executing statements, so single-statement approaches should always be superiour to executemany. In other words, executemany is not and will not be useful for psycopg-based projects.

I believe we should not land this change on that basis.

review: Disapprove
Revision history for this message
Free Ekanayaka (free.ekanayaka) wrote :

Aaron, since bulk insert covers your use case well and we didn't receive any feature request for supporting executemany, I agree that there's no need to land this branch (though it's nice to have it in LP in case of future reference). I'd suggest to mark this Merge Proposal as Rejected.

Revision history for this message
Aaron Bentley (abentley) wrote :

I don't have the authority to mark this proposal rejected, so could someone please do that for me?

Unmerged revisions

440. By Aaron Bentley

Cleanup.

439. By Aaron Bentley

Test executemany unicode handling on Postgres.

438. By Aaron Bentley

Add and test Store.executemany

437. By Aaron Bentley

executeall -> executemany

436. By Aaron Bentley

Test parameter conversion with executeall.

435. By Aaron Bentley

Remove duplicate test.

434. By Aaron Bentley

Test tracing executeall.

433. By Aaron Bentley

Update all tracer tests for executeall.

432. By Aaron Bentley

Add raw_executeall support to DebugTracer.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches

to status/vote changes: