Merge into trunk : executemany : Code : Storm

Status:	Rejected
Rejected by:	Kevin McDermott on 2012-02-29
Proposed branch:	lp://staging/~abentley/storm/executemany
Merge into:	lp://staging/storm
Diff against target:	875 lines (+442/-81) 8 files modified storm/database.py (+39/-6) storm/store.py (+7/-0) storm/tracer.py (+50/-7) tests/database.py (+109/-4) tests/databases/postgres.py (+14/-0) tests/mocker.py (+2/-2) tests/store/base.py (+20/-0) tests/tracer.py (+201/-62)
To merge this branch:	bzr merge lp://staging/~abentley/storm/executemany
Related bugs:	Link a bug report

Reviewer	Date Requested	Status
Aaron Bentley (community)		Disapprove on 2012-02-21
Stuart Bishop (community)	2012-02-17	Approve on 2012-02-20
Storm Developers	2012-02-20	Pending
Review via email: mp+93653@code.staging.launchpad.net

Commit message

Support cursor.executemany all through the stack.

Description of the change

This branch adds support for cursor.executemany to Storm.

My use case is the Launchpad's branch scanner, and even compared to a single execute using multiple escaped rows, executemany completes in 1/3 the time.

I've done my best to make sure executemany is tested just as rigourously as execute, but please let me know if there are any ways you'd like the tests done differently.

test_capture_with_exception uses Python's 'with' statement, so I changed it to use ExpectedException for clarity.

if action.kind was "getattr", Patcher.execute would raise an AttributeError referring to the wrong variable, so I changed the bare raise to "raise e".

Revision history for this message

Stuart Bishop (stub) wrote on 2012-02-20:

#

> This branch adds support for cursor.executemany to Storm.
>
> My use case is the Launchpad's branch scanner, and even compared to a single execute using multiple escaped rows, executemany completes in 1/3 the time.

There is another branch going through review that implements
multivalued INSERT. For this use case, I'm not sure which will end up
faster. Still, this branch has other uses such as multiple UPDATEs,
and possibly multiple SELECTs where a stored procedure is being
invoked (possibly this will fail, as each SELECT will return a result
set and the DB-API lets the driver fail in this case).

I'm fine with this. I found one typo, and have a few comments that
don't require any changes.

> + def _raw_execute(self, statement, params, executemany):

Did you try having this split into _raw_execute and _raw_execute_many?
I suspect it will be better despite some code duplication. Similarly
for _execute. I'm fine with this design though if you feel this is the
more maintainable variant.

> + def connection_raw_executemany_error(self, connection, raw_cursor,
> + statement, params, error):
> + """Raise TimeoutError if the given error was a timeout issue.
> +
> + Must be specialized in the backend.
> + """
> + raise NotImplementedError("%s.connection_raw_execute_error() must be "
> + "implemented" % self.__class__.__name__)

Typo here in the error message - execute__error instead of executemany_error.

> def test_execute_convert_param_style(self):
> - self.connection.execute("'?' ? '?' ? '?'")
> - self.assertEquals(self.executed, [("'?' ? '?' ? '?'", marker)])

Should this be removed, or was this a valid test with an invalid name?

> + def test_unicode_many(self):
> + raw_str = "\xc3\xa1\xc3\xa9\xc3\xad\xc3\xb3\xc3\xba"
> + uni_str = raw_str.decode("UTF-8")

I see you are cribbing from existing tests so this is fine. I always
prefer to use u'\N{TRADE MARK SIGN}' style notation to construct valid
Unicode strings without polluting ASCII source files.

> + def prepare_raise_timeout_on_granularity(self):
> + tracer_mock = self.mocker.patch(self.tracer)
> +
> + self.mocker.order()
> +
> + tracer_mock.get_remaining_time()
> + self.mocker.result(self.tracer.granularity)
> + tracer_mock.set_statement_timeout(self.raw_cursor,
> + self.tracer.granularity)
> + tracer_mock.get_remaining_time()
> + self.mocker.result(0)
> + self.mocker.replay()
> +
> def test_raise_timeout_on_granularity(self):

Instead of these prepare + 2 * execute tests, it might have been
better to subclass the TestCase and override the self.execute method
to do executemany.

--
Stuart Bishop <email address hidden>

> This branch adds support for cursor.executemany to Storm.
>
> My use case is the Launchpad's branch scanner, and even compared to a single execute using multiple escaped rows, executemany completes in 1/3 the time.

There is another branch going through review that implements
multivalued INSERT. For this use case, I'm not sure which will end up
faster. Still, this branch has other uses such as multiple UPDATEs,
and possibly multiple SELECTs where a stored procedure is being
invoked (possibly this will fail, as each SELECT will return a result
set and the DB-API lets the driver fail in this case).

I'm fine with this. I found one typo, and have a few comments that
don't require any changes.

> +    def _raw_execute(self, statement, params, executemany):

Did you try having this split into _raw_execute and _raw_execute_many?
I suspect it will be better despite some code duplication. Similarly
for _execute. I'm fine with this design though if you feel this is the
more maintainable variant.

> +    def connection_raw_executemany_error(self, connection, raw_cursor,
> +                                     statement, params, error):
> +        """Raise TimeoutError if the given error was a timeout issue.
> +
> +        Must be specialized in the backend.
> +        """
> +        raise NotImplementedError("%s.connection_raw_execute_error() must be "
> +                                  "implemented" % self.__class__.__name__)

Typo here in the error message - execute__error instead of executemany_error.

>     def test_execute_convert_param_style(self):
> -        self.connection.execute("'?' ? '?' ? '?'")
> -        self.assertEquals(self.executed, [("'?' ? '?' ? '?'", marker)])

Should this be removed, or was this a valid test with an invalid name?

> +    def test_unicode_many(self):
> +        raw_str = "\xc3\xa1\xc3\xa9\xc3\xad\xc3\xb3\xc3\xba"
> +        uni_str = raw_str.decode("UTF-8")

I see you are cribbing from existing tests so this is fine. I always
prefer to use u'\N{TRADE MARK SIGN}' style notation to construct valid
Unicode strings without polluting ASCII source files.

> +    def prepare_raise_timeout_on_granularity(self):
> +        tracer_mock = self.mocker.patch(self.tracer)
> +
> +        self.mocker.order()
> +
> +        tracer_mock.get_remaining_time()
> +        self.mocker.result(self.tracer.granularity)
> +        tracer_mock.set_statement_timeout(self.raw_cursor,
> +                                          self.tracer.granularity)
> +        tracer_mock.get_remaining_time()
> +        self.mocker.result(0)
> +        self.mocker.replay()
> +
>     def test_raise_timeout_on_granularity(self):

Instead of these prepare + 2 * execute tests, it might have been
better to subclass the TestCase and override the self.execute method
to do executemany.

-- 
Stuart Bishop <stuart.bishop@canonical.com>

Revision history for this message

Stuart Bishop (stub) on 2012-02-20:

#

review: Approve

Revision history for this message

Aaron Bentley (abentley) wrote on 2012-02-21:

#

wgrant and I have determined that this strategy of performing bulk insertions does not perform as well as lp:~wgrant/storm/bulk-insert. wgrant has further determined that psycopg's executemany works by repeatedly executing statements, so single-statement approaches should always be superiour to executemany. In other words, executemany is not and will not be useful for psycopg-based projects.

I believe we should not land this change on that basis.

review: Disapprove

Revision history for this message

Free Ekanayaka (free.ekanayaka) wrote on 2012-02-29:

#

Aaron, since bulk insert covers your use case well and we didn't receive any feature request for supporting executemany, I agree that there's no need to land this branch (though it's nice to have it in LP in case of future reference). I'd suggest to mark this Merge Proposal as Rejected.

Revision history for this message

Aaron Bentley (abentley) wrote on 2012-02-29:

#

I don't have the authority to mark this proposal rejected, so could someone please do that for me?

Unmerged revisions

440. By Aaron Bentley on 2012-02-17: Cleanup.
439. By Aaron Bentley on 2012-02-17: Test executemany unicode handling on Postgres.
438. By Aaron Bentley on 2012-02-16: Add and test Store.executemany
437. By Aaron Bentley on 2012-02-16: executeall -> executemany
436. By Aaron Bentley on 2012-02-16: Test parameter conversion with executeall.
435. By Aaron Bentley on 2012-02-16: Remove duplicate test.
434. By Aaron Bentley on 2012-02-16: Test tracing executeall.
433. By Aaron Bentley on 2012-02-15: Update all tracer tests for executeall.
432. By Aaron Bentley on 2012-02-15: Add raw_executeall support to DebugTracer.

Storm

Merge lp://staging/~abentley/storm/executemany into lp://staging/storm

Commit message

Description of the change

Unmerged revisions

Preview Diff

Subscribers