BeeSeek

Merge lp://staging/~mascha/beeseek/fixesLP505596 into lp://staging/beeseek/1.0

fixesLP505596
Merge into 1.0

Proposed by Martin Schaaf on 2010-01-21

Status:

Merged

Approved by:

Andrea Corbellini on 2010-01-23

Approved revision:

not available

Merged at revision:

not available

Proposed branch:

lp://staging/~mascha/beeseek/fixesLP505596

Merge into:

lp://staging/beeseek/1.0

Diff against target:

21 lines (+2/-2)

1 file modified

beeseek/ui/resultspage.py (+2/-2)

To merge this branch:

bzr merge lp://staging/~mascha/beeseek/fixesLP505596

Medium

Fix Released

Link a bug report

Reviewer	Date Requested	Status
Andrea Corbellini	2010-01-21	Approve on 2010-01-23
Martin Schaaf (community)		Needs Resubmitting on 2010-01-22
Review via email: mp+17845@code.staging.launchpad.net

Revision history for this message

Martin Schaaf (mascha) wrote on 2010-01-21:

Fixes lp #505596

Revision history for this message

Andrea Corbellini (andrea.corbellini) wrote on 2010-01-22:

Hi and thanks again for your contribution.

Your branch will work and your patch does its job correctly, but I think that using a regular expression is too much complicated for this bug. I think that the better way to fix it is catching ValueError. The reason why I think this is that catching an exception is faster than compiling a matching a regular expression.

Here's what `timeit` says on my computer:

>>> timeit("""
... try:
... index = int(value)
... except ValueError:
... index = 0
... """, "value = '123'")
0.83609890937805176
>>> timeit("""
... if re.match('\d+$', value):
... index = int(value)
... else:
... index = 0
... """, "import re; value = '123'")
3.7652208805084229

In this case, where the index given by the user is a valid number, the difference is big (~2.929 seconds).

>>> timeit("""
... try:
... index = int(value)
... except ValueError:
... index = 0
... """, "value = 'abc'")
3.6588859558105469
>>> timeit("""
... if re.match('\d+$', value):
... index = int(value)
... else:
... index = 0
... """, "import re; value = 'abc'")
2.3073050975799561

Here the index is not valid, and the faster code is the one that uses the regex (~1.351 seconds). However: 1. the delta is smaller than the first case and 2. this case is more rare (if a non-digit is added to the URL, it's because the user did it intentionally).

Of course, this is my opinion and you are free to reply if you thing I'm wrong or if I missed something.

lp://staging/~mascha/beeseek/fixesLP505596 updated on 2010-01-22

416. By Martin Schaaf on 2010-01-22: o for better performance pre-compile the regular expression

Revision history for this message

Martin Schaaf (mascha) wrote on 2010-01-22:

Fix performance concerns by pre compiling the regular expression.

review: Needs Resubmitting

Revision history for this message

Andrea Corbellini (andrea.corbellini) wrote on 2010-01-23:

Pre-compiling the regular expression gives a great advantage:

>>> timeit("""
... if NUMBER_MATCH.match(value):
... index = int(value)
... else:
... index = 0
... """, "import re; NUMBER_MATCH = re.compile(r'^\d+$'); value = '123'")
1.5664610862731934
>>> timeit("""
... if NUMBER_MATCH.match(value):
... index = int(value)
... else:
... index = 0
... """, "import re; NUMBER_MATCH = re.compile(r'^\d+$'); value = 'abc'")
0.54535198211669922

It's still slower than the try..except block, but the difference is not as big as before.
Thank you for your fix!

review: Approve

Revision history for this message

Martin Schaaf (mascha) wrote on 2010-01-23:

Thank you for accepting. The reason why I prefer the solution with the regular expression is, that now the time for the worst case is much lower.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Andrea Colangelo

Andrea Corbellini

Jacopo Moretto

Martin Schaaf

1	=== modified file 'beeseek/ui/resultspage.py'
2	--- beeseek/ui/resultspage.py 2010-01-06 12:51:25 +0000
3	+++ beeseek/ui/resultspage.py 2010-01-22 20:10:24 +0000
4	@@ -25,6 +25,7 @@
5
6	WORD_SEPARATOR = re.compile(r'\W', re.UNICODE)
7	RESULTS_PER_PAGE = 10
8	+NUMBER_MATCH = re.compile('^\d+$')
9
10	def show_results(client, page):
11	"""Send to the client an HTML page with the results."""
12	@@ -35,8 +36,7 @@
13	url_chunks = page.split('/')
14	query = unquote_plus(url_chunks[2])
15	keywords = WORD_SEPARATOR.split(query.lower())
16	- if len(url_chunks) >= 4:
17	- # XXX This will raise an exception if the page index is not a number.
18	+ if len(url_chunks) >= 4 and NUMBER_MATCH.match(url_chunks[3]):
19	page_index = int(url_chunks[3])
20	else:
21	page_index = 0

BeeSeek

Merge lp://staging/~mascha/beeseek/fixesLP505596 into lp://staging/beeseek/1.0

Commit message

Description of the change

Preview Diff

Subscribers