forgejo/models/issues
Chl 544cbc6f01 Optimization of labels handling in issue_search (#4228)
This PR optimizes the SQL query and deduplicates the label ids when generating the query string on the issue page.

<hr/>

### Background

Some time ago, BingBot and some other crawlers brought my instance to its knees with requests containing a lot of label ids, like this one:

```
[07/Aug/2023:11:28:37 +0200] "GET /Dolibarr/sendrecurringinvoicebymail/issues?q=&type=all&sort=&state=closed&labels=1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c2%2c10%2c2%2c1%2c1%2c10%2c10%2c7%2c6%2c10%2c10%2c3%2c2%2c1%2c5%2c10%2c1%2c6%2c2%2c7%2c3%2c7%2c6%2c10%2c1%2c10%2c1%2c1%2c7%2c7%2c1%2c1%2c1%2c1%2c10%2c10%2c1%2c2%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c2%2c1%2c12%2c6%2c6%2c10&milestone=0&project=-1&poster=0 HTTP/1.1" 499 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/103.0.5060.134 Safari/537.36"
```

Since each label id adds a join to the query, the cost grows exponentially for the database engine (at least on PostgreSQL, though SQLite suffers a little too).

Thus, this PR proposes two enhancements:

* rewrite the database query to use only one squashed condition,
* deduplicate the label ids when generating the URL.

### Performance comparison

Here are some timings on PostgreSQL-backed Forgejo 7.0.4 instances, first without the patch:
```sh
$ time curl -s -o /dev/null "http://localhost:3000/toto/tata/issues?q=&type=all&sort=&labels=19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25&state=open&milestone=0&project=0&assignee=0&poster=0"

real    0m10,491s
user    0m0,017s
sys     0m0,008s
```
...and with the patch:
```sh
$ time curl -s -o /dev/null "http://localhost:3000/toto/tata/issues?q=&type=all&sort=&labels=19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25&state=open&milestone=0&project=0&assignee=0&poster=0"

real    0m0,094s
user    0m0,012s
sys     0m0,013s
```

### Annex

This change was originally proposed to [Gitea](https://github.com/go-gitea/gitea/pull/26460) but didn't get much attention, and I switched to Forgejo in the meantime :)

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/4228
Reviewed-by: Earl Warren <earl-warren@noreply.codeberg.org>
Co-authored-by: Chl <chl@xlii.si>
Co-committed-by: Chl <chl@xlii.si>
2024-06-28 05:11:57 +00:00
..
assignees.go Performance improvements for pull request list API (#30490) 2024-06-02 16:26:54 +02:00
assignees_test.go
comment.go Prevent simultaneous editing of comments and issues (#31053) 2024-06-02 16:26:54 +02:00
comment_code.go Do some performance optimize for issues list and view issue/pull (gitea#29515) 2024-04-08 14:47:31 +02:00
comment_list.go Only update poster in issue/comment list if it has been loaded (#31216) 2024-06-02 16:26:54 +02:00
comment_test.go
content_history.go
content_history_test.go
dependency.go
dependency_test.go
issue.go Performance improvements for pull request list API (#30490) 2024-06-02 16:26:54 +02:00
issue_index.go Do not update PRs based on events that happened before they existed 2024-04-11 11:16:23 +02:00
issue_index_test.go Do not update PRs based on events that happened before they existed 2024-04-11 11:16:23 +02:00
issue_label.go Performance improvements for pull request list API (#30490) 2024-06-02 16:26:54 +02:00
issue_label_test.go
issue_list.go Only update poster in issue/comment list if it has been loaded (#31216) 2024-06-02 16:26:54 +02:00
issue_list_test.go Fix: missing value for In() condition 2024-04-08 15:16:40 +02:00
issue_lock.go
issue_project.go Rename project board -> column to make the UI less confusing (#30170) 2024-06-02 09:42:39 +02:00
issue_search.go Optimization of labels handling in issue_search (#4228) 2024-06-28 05:11:57 +00:00
issue_stats.go Fix bug in GetIssueStats 2024-06-13 10:25:26 +02:00
issue_stats_test.go Run make fmt 2024-06-16 15:59:59 +02:00
issue_test.go Optimization of labels handling in issue_search (#4228) 2024-06-28 05:11:57 +00:00
issue_update.go Prevent simultaneous editing of comments and issues (#31053) 2024-06-02 16:26:54 +02:00
issue_user.go
issue_user_test.go
issue_watch.go
issue_watch_test.go
issue_xref.go
issue_xref_test.go Resolve lint for unused parameter and unnecessary type arguments (#30750) 2024-05-05 08:38:16 +01:00
label.go Optimization of labels handling in issue_search (#4228) 2024-06-28 05:11:57 +00:00
label_test.go Optimization of labels handling in issue_search (#4228) 2024-06-28 05:11:57 +00:00
main_test.go
milestone.go
milestone_list.go
milestone_test.go
pull.go Make gitea webhooks openproject compatible (gitea#28435) 2024-06-05 15:58:51 +02:00
pull_list.go Fix PullRequestList.GetIssueIDs's logic (#31352) 2024-06-16 13:42:58 +02:00
pull_test.go Do not update PRs based on events that happened before they existed 2024-04-11 11:16:23 +02:00
reaction.go Add container.FilterSlice function (gitea#30339) 2024-04-16 11:49:44 +02:00
reaction_test.go
review.go Fix automerge will not work because of some events haven't been triggered (#30780) 2024-05-26 19:01:36 +02:00
review_list.go Add container.FilterSlice function (gitea#30339) 2024-04-16 11:49:44 +02:00
review_test.go Prevent re-review and dismiss review actions on closed and merged PRs (#30065) 2024-03-30 07:17:32 +01:00
stopwatch.go
stopwatch_test.go
tracked_time.go Add codespell support and fix a good number of typos with its help (#3270) 2024-05-09 13:49:37 +00:00
tracked_time_test.go