There is no need for a database patch - the stored procedure should be added to trusted.sql (which is becoming unwieldy - I think I need to split it up).
> +CREATE OR REPLACE FUNCTION calculate_bug_heat(bug_id integer) RETURNS integer
> +LANGUAGE plpythonu AS $$
> + from datetime import datetime
> +
> + class BugHeatConstants:
> + PRIVACY = 150
> + SECURITY = 250
> + DUPLICATE = 6
> + AFFECTED_USER = 4
> + SUBSCRIBER = 2
> +
> +
> + def get_max_heat_for_bug(bug_id):
> + bug_tasks = plpy.execute(
> + "SELECT * FROM BugTask WHERE bug = %s" % bug_id)
> +
> + max_heats = []
> + for bug_task in bug_tasks:
> + if bug_task['product'] is not None:
> + product = plpy.execute(
> + "SELECT max_bug_heat FROM Product WHERE id = %s" %
> + bug_task['product'])[0]
> + max_heats.append(product['max_bug_heat'])
> + elif bug_task['distribution']:
> + distribution = plpy.execute(
> + "SELECT max_bug_heat FROM Distribution WHERE id = %s" %
> + bug_task['distribution'])[0]
> + max_heats.append(distribution['max_bug_heat'])
> + elif bug_task['productseries'] is not None:
> + product_series = plpy.execute("""
> + SELECT Product.max_bug_heat
> + FROM ProductSeries, Product
> + WHERE ProductSeries.Product = Product.id
> + AND ProductSeries.id = %s"""%
> + bug_task['productseries'])[0]
> + max_heats.append(product_series['max_bug_heat'])
> + elif bug_task['distroseries']:
> + distro_series = plpy.execute("""
> + SELECT Distribution.max_bug_heat
> + FROM DistroSeries, Distribution
> + WHERE DistroSeries.Distribution = Distribution.id
> + AND DistroSeries.id = %s"""%
> + bug_task['distroseries'])[0]
> + max_heats.append(distro_series['max_bug_heat'])
> + else:
> + pass
> +
> + return max(max_heats)
You can do this in a single query rather than one per task:
SELECT MAX(GREATEST(Product.max_bug_heat, Distribution.max_bug_heat))
FROM BugTask
LEFT OUTER JOIN ProductSeries ON BugTask.productseries = ProductSeries.id
LEFT OUTER JOIN Product ON (
BugTask.product = Product.id
OR ProductSeries.product = Product.id)
LEFT OUTER JOIN DistroSeries ON BugTask.distroseries = DistroSeries.id
LEFT OUTER JOIN Distribution ON (
BugTask.distribution = Distribution.id
OR DistroSeries.distribution = Distribution.id)
WHERE
BugTask.bug = 1
> + # It would be nice to be able to just SELECT * here, but we need to
> + # format the timestamps so that datetime.strptime() won't choke on
> + # them.
Using epoch time would avoid the need for parsing strings (use 'EXTRACT(epoch FROM datecreated)' instead of 'TO_CHAR(...)').
It probably doesn't save any significant time though.
> + if bug_data.nrows() == 0:
> + return 0
I'd return None or raise an exception here - the bug doesn't exist so the heat is undefined.
> + bug = bug_data[0]
> + if bug['duplicateof'] is not None:
> + return 0
Again, None or the heat of the master bug. We should catch cases where we are attempting to calculate the bug heat of non-existent or duplicate bugs.
> + if bug['private']:
> + heat['privacy'] = BugHeatConstants.PRIVACY
> + if bug['security_related']:
> + heat['security'] = BugHeatConstants.SECURITY
> +
> + # Get the heat from subscribers.
> + sub_count = plpy.execute("""
> + SELECT bug, COUNT(id) AS sub_count
> + FROM BugSubscription
> + WHERE bug = %s
> + GROUP BY bug
> + """ % bug_id)
This is unnecessarily complex. The simpler query is just "SELECT COUNT(*) FROM BugSubscription WHERE bug=%s".
You don't need the nrows() guard, as the query will always return one row.
> + # Get the heat from subscribers via duplicates.
> + subs_from_dupes = plpy.execute("""
> + SELECT bug.duplicateof AS dupe_of,
> + COUNT(bugsubscription.id) AS dupe_sub_count
> + FROM bugsubscription, bug
> + WHERE bug.duplicateof = %s
> + AND Bugsubscription.bug = bug.id
> + GROUP BY Bug.duplicateof""" % bug_id)
> + if subs_from_dupes.nrows() > 0:
> + heat['subcribers_from_dupes'] = (
> + BugHeatConstants.SUBSCRIBER * sub_count[0]['dupe_sub_count'])
Similar here. nrows() guard is unnecessary. The query is also overly complex:
SELECT COUNT(*) FROM BugSubscription, Bug WHERE Bug.duplicateof = %s AND BugSubscription.bug = Bug.id
I notice that if a person is subscribed to both a duplicate and the master job, they are counted twice. If this is not correct, we can collapse the above two sections into one using the query:
SELECT COUNT(DISTINCT BugSubscription.person)
FROM BugSubscription, Bug
WHERE Bug.id = BugSubscription.bug
AND (Bug.id = %s OR Bug.duplicateof = %s)
If you want to expand teams to get the genuine number of people subscribed:
SELECT COUNT(DISTINCT TeamParticipation.person)
FROM TeamParticipation, BugSubscription, Bug
WHERE
TeamParticipation.team = BugSubscription.person
AND Bug.id = BugSubscription.bug
AND (Bug.id = %s OR Bug.duplicateof = %s)
Bug #1 for instance has 313 subscribers without expanding the teams, or 678 if you do expand the teams.
> + # Bugs decay over time. Every day the bug isn't touched its heat
> + # decreases by 1%.
> + pg_datetime_fmt = "%Y-%m-%d %H:%M:%S"
> + date_last_updated = datetime.strptime(
> + bug['formatted_date_last_updated'], pg_datetime_fmt)
> + days_since_last_update = (datetime.utcnow() - date_last_updated).days
> + total_heat = int(total_heat * (0.99 ** days_since_last_update))
Yer... this would be simpler and clearer using epoch times.
> +INSERT INTO LaunchpadDatabaseRevision VALUES (2207, 99, 0);
As before, this should go into trusted.sql rather than a DB patch. We get to treat stored procedures as any other code, unlike database patches which cannot be reverted once applied.
> -Sometimes, when a bug changes, we want to see the changes reflected in the bug's
> -heat value immidiately, without waiting for heat to be recalculated. Currently
> -we adjust heat immidiately for bug privacy and security.
immediately, not immidiately.
> + def updateHeat(self):
> + """See `IBug`."""
> + # We need to flush the store first to ensure that changes are
> + # reflected in the new bug heat total.
> + store = Store.of(self)
> + store.flush()
You sure about that? Storm should be applying updates in order, so when calculate_bug_heat is actually executed (at commit time or when heat is next accessed) changes will be implicitly flushed.
On Wed, May 26, 2010 at 10:26 PM, Graham Binns <email address hidden> wrote:
> === added file 'database/ schema/ patch-2207- 99-0.sql'
There is no need for a database patch - the stored procedure should be added to trusted.sql (which is becoming unwieldy - I think I need to split it up).
> +CREATE OR REPLACE FUNCTION calculate_ bug_heat( bug_id integer) RETURNS integer heat_for_ bug(bug_ id): 'product' ])[0] append( product[ 'max_bug_ heat']) 'distribution' ]: 'distribution' ])[0] append( distribution[ 'max_bug_ heat']) 'productseries' ] is not None: max_bug_ heat Product = Product.id 'productseries' ])[0] append( product_ series[ 'max_bug_ heat']) 'distroseries' ]: max_bug_ heat Distribution = Distribution.id 'distroseries' ])[0] append( distro_ series[ 'max_bug_ heat'])
> +LANGUAGE plpythonu AS $$
> + from datetime import datetime
> +
> + class BugHeatConstants:
> + PRIVACY = 150
> + SECURITY = 250
> + DUPLICATE = 6
> + AFFECTED_USER = 4
> + SUBSCRIBER = 2
> +
> +
> + def get_max_
> + bug_tasks = plpy.execute(
> + "SELECT * FROM BugTask WHERE bug = %s" % bug_id)
> +
> + max_heats = []
> + for bug_task in bug_tasks:
> + if bug_task['product'] is not None:
> + product = plpy.execute(
> + "SELECT max_bug_heat FROM Product WHERE id = %s" %
> + bug_task[
> + max_heats.
> + elif bug_task[
> + distribution = plpy.execute(
> + "SELECT max_bug_heat FROM Distribution WHERE id = %s" %
> + bug_task[
> + max_heats.
> + elif bug_task[
> + product_series = plpy.execute("""
> + SELECT Product.
> + FROM ProductSeries, Product
> + WHERE ProductSeries.
> + AND ProductSeries.id = %s"""%
> + bug_task[
> + max_heats.
> + elif bug_task[
> + distro_series = plpy.execute("""
> + SELECT Distribution.
> + FROM DistroSeries, Distribution
> + WHERE DistroSeries.
> + AND DistroSeries.id = %s"""%
> + bug_task[
> + max_heats.
> + else:
> + pass
> +
> + return max(max_heats)
You can do this in a single query rather than one per task:
SELECT MAX(GREATEST( Product. max_bug_ heat, Distribution. max_bug_ heat)) productseries = ProductSeries.id product = Product.id) distroseries = DistroSeries.id distribution = Distribution.id distribution = Distribution.id)
FROM BugTask
LEFT OUTER JOIN ProductSeries ON BugTask.
LEFT OUTER JOIN Product ON (
BugTask.product = Product.id
OR ProductSeries.
LEFT OUTER JOIN DistroSeries ON BugTask.
LEFT OUTER JOIN Distribution ON (
BugTask.
OR DistroSeries.
WHERE
BugTask.bug = 1
> + # It would be nice to be able to just SELECT * here, but we need to
> + # format the timestamps so that datetime.strptime() won't choke on
> + # them.
Using epoch time would avoid the need for parsing strings (use 'EXTRACT(epoch FROM datecreated)' instead of 'TO_CHAR(...)').
It probably doesn't save any significant time though.
> + if bug_data.nrows() == 0:
> + return 0
I'd return None or raise an exception here - the bug doesn't exist so the heat is undefined.
> + bug = bug_data[0]
> + if bug['duplicateof'] is not None:
> + return 0
Again, None or the heat of the master bug. We should catch cases where we are attempting to calculate the bug heat of non-existent or duplicate bugs.
> + if bug['private']: s.PRIVACY related' ]: s.SECURITY
> + heat['privacy'] = BugHeatConstant
> + if bug['security_
> + heat['security'] = BugHeatConstant
> +
> + # Get the heat from subscribers.
> + sub_count = plpy.execute("""
> + SELECT bug, COUNT(id) AS sub_count
> + FROM BugSubscription
> + WHERE bug = %s
> + GROUP BY bug
> + """ % bug_id)
This is unnecessarily complex. The simpler query is just "SELECT COUNT(*) FROM BugSubscription WHERE bug=%s".
> + if sub_count.nrows() > 0: s.SUBSCRIBER * sub_count[ 0]['sub_ count'] )
> + heat['subscribers'] = (
> + BugHeatConstant
You don't need the nrows() guard, as the query will always return one row.
> + # Get the heat from subscribers via duplicates. iption. id) AS dupe_sub_count dupes.nrows( ) > 0: s_from_ dupes'] = ( s.SUBSCRIBER * sub_count[ 0]['dupe_ sub_count' ])
> + subs_from_dupes = plpy.execute("""
> + SELECT bug.duplicateof AS dupe_of,
> + COUNT(bugsubscr
> + FROM bugsubscription, bug
> + WHERE bug.duplicateof = %s
> + AND Bugsubscription.bug = bug.id
> + GROUP BY Bug.duplicateof""" % bug_id)
> + if subs_from_
> + heat['subcriber
> + BugHeatConstant
Similar here. nrows() guard is unnecessary. The query is also overly complex:
SELECT COUNT(*) FROM BugSubscription, Bug WHERE Bug.duplicateof = %s AND BugSubscription.bug = Bug.id
I notice that if a person is subscribed to both a duplicate and the master job, they are counted twice. If this is not correct, we can collapse the above two sections into one using the query:
SELECT COUNT(DISTINCT BugSubscription .person)
FROM BugSubscription, Bug
WHERE Bug.id = BugSubscription.bug
AND (Bug.id = %s OR Bug.duplicateof = %s)
If you want to expand teams to get the genuine number of people subscribed:
SELECT COUNT(DISTINCT TeamParticipati on.person) pation. team = BugSubscription .person
FROM TeamParticipation, BugSubscription, Bug
WHERE
TeamPartici
AND Bug.id = BugSubscription.bug
AND (Bug.id = %s OR Bug.duplicateof = %s)
Bug #1 for instance has 313 subscribers without expanding the teams, or 678 if you do expand the teams.
> + # Bugs decay over time. Every day the bug isn't touched its heat date_last_ updated' ], pg_datetime_fmt) last_update = (datetime.utcnow() - date_last_ updated) .days last_update) )
> + # decreases by 1%.
> + pg_datetime_fmt = "%Y-%m-%d %H:%M:%S"
> + date_last_updated = datetime.strptime(
> + bug['formatted_
> + days_since_
> + total_heat = int(total_heat * (0.99 ** days_since_
Yer... this would be simpler and clearer using epoch times.
> +INSERT INTO LaunchpadDataba seRevision VALUES (2207, 99, 0);
As before, this should go into trusted.sql rather than a DB patch. We get to treat stored procedures as any other code, unlike database patches which cannot be reverted once applied.
> -Sometimes, when a bug changes, we want to see the changes reflected in the bug's
> -heat value immidiately, without waiting for heat to be recalculated. Currently
> -we adjust heat immidiately for bug privacy and security.
immediately, not immidiately.
> + def updateHeat(self):
> + """See `IBug`."""
> + # We need to flush the store first to ensure that changes are
> + # reflected in the new bug heat total.
> + store = Store.of(self)
> + store.flush()
You sure about that? Storm should be applying updates in order, so when calculate_bug_heat is actually executed (at commit time or when heat is next accessed) changes will be implicitly flushed.