mirror of
1
Fork 0
Commit Graph

20 Commits

Author SHA1 Message Date
Daenney 1bf40e755c
feat: Relax URL matching (#3925)
* feat: Relax URL matching

Instead of only linkifying things with an explicit http or https scheme,
the xurls.Relaxed also matches links with known TLDs. This means that
text like 'banana.com' will also be matched, despite the missing
http/https scheme. This also works to linkify email addresses, which is
handy.

This should also ensure we catch links without a scheme for the purpose
of spam checking.
2025-03-24 14:13:32 +01:00
tobi d8113c11e4
[feature] Parse content warning to HTML, serialize via client API as plaintext (#3876)
* [feature] Parse content warning as HTML, serialize via API to plaintext

* tidy up some cruft

* whoops

* oops

* i'm da joker baybee

* clemency muy lorde

* rename some of the text functions for clarity

* jiggle the opts

* fiddle de deee

* hopefully the last test fix i ever have to do in my beautiful life
2025-03-07 14:04:34 +00:00
tobi 0362d49da0
[bugfix] Parse links that contain non-ascii characters (#2762) 2024-03-15 17:26:53 +00:00
kim c6e00afc7c
[feature] tentatively start adding polls support (#2249) 2023-10-04 13:09:42 +01:00
tobi 536d9e482d
[chore/bugfix] Deinterface text.Formatter, allow underscores in hashtags (#2233) 2023-09-29 10:39:56 +02:00
tobi dc96562b40
[bugfix] Use custom bluemonday policy to disallow inline img tags (#2100) 2023-08-11 14:40:11 +02:00
tobi 0e29f1f5bb
[feature] Enable federation in/out of profile PropertyValue fields (#1722)
Co-authored-by: kim <grufwub@gmail.com>
Co-authored-by: kim <89579420+NyaaaWhatsUpDoc@users.noreply.github.com>
2023-05-09 11:16:10 +01:00
Daenney 5e2bf0bdca
[chore] Improve copyright header handling (#1608)
* [chore] Remove years from all license headers

Years or year ranges aren't required in license headers. Many projects
have removed them in recent years and it avoids a bit of yearly toil.

In many cases our copyright claim was also a bit dodgy since we added
the 2021-2023 header to files created after 2021 but you can't claim
copyright into the past that way.

* [chore] Add license header check

This ensures a license header is always added to any new file. This
avoids maintainers/reviewers needing to remember to check for and ask
for it in case a contribution doesn't include it.

* [chore] Add missing license headers

* [chore] Further updates to license header

* Use the more common // indentend comment format
* Remove the hack we had for the linter now that we use the // format
* Add SPDX license identifier
2023-03-12 16:00:57 +01:00
Daenney 68e6d08c76
[feature] Add a request ID and include it in logs (#1476)
This adds a lightweight form of tracing to GTS. Each incoming request is
assigned a Request ID which we then pass on and log in all our log
lines. Any function that gets called downstream from an HTTP handler
should now emit a requestID=value pair whenever it logs something.

Co-authored-by: kim <grufwub@gmail.com>
2023-02-17 12:02:29 +01:00
Autumn! 49beb17a8f
[chore] Text formatting overhaul (#1406)
* Implement goldmark debug print for hashtags and mentions

* Minify HTML in FromPlain

* Convert plaintext status parser to goldmark

* Move mention/tag/emoji finding logic into formatter

* Combine mention and hashtag boundary characters

* Normalize unicode when rendering hashtags
2023-02-03 11:58:58 +01:00
tobi 0dbe6c514f
[chore] Update/add license headers for 2023 (#1304) 2023-01-05 12:43:00 +01:00
tobi c84384e660
[bugfix] html escape special characters in text instead of totally removing them (#719)
* remove minify dependency

* tidy up some tests

* remove pre + postformat funcs

* rework sanitization + formatting

* update tests

* add some more markdown tests
2022-07-19 15:21:17 +02:00
tobi 5668ce1ec7
[bugfix] Fix HTML escaping in instance title (#607)
* move caption sanitization -> sanitize.go

* use sanitizeplaintext rather than removehtml

* rename sanitizecaption to sanitizeplaintext

* avoid removing html twice from statuses

* unexport remoteHTML
it's no longer used outside the text package so this
makes it less confusing

* test instance PATCH
2022-05-26 11:37:13 +02:00
kim 26b74aefaf
[bugfix] Fix existing bio text showing as HTML (#531)
* fix existing bio text showing as HTML

- updated replaced mentions to include instance
- strips HTML from account source note in Verify handler
- update text formatter to use buffers for string writes

Signed-off-by: kim <grufwub@gmail.com>

* go away linter

Signed-off-by: kim <grufwub@gmail.com>

* change buf reset location, change html mention tags

Signed-off-by: kim <grufwub@gmail.com>

* reduce FindLinks code complexity

Signed-off-by: kim <grufwub@gmail.com>

* fix HTML to text conversion

Signed-off-by: kim <grufwub@gmail.com>

* Update internal/regexes/regexes.go

Co-authored-by: Mina Galić <mina.galic@puppet.com>

* use improved html2text lib with more options

Signed-off-by: kim <grufwub@gmail.com>

* fix to produce actual plaintext from html

Signed-off-by: kim <grufwub@gmail.com>

* fix span tags instead written as space

Signed-off-by: kim <grufwub@gmail.com>

* performance improvements to regex replacements, fix link replace logic for un-html-ing in the future

Signed-off-by: kim <grufwub@gmail.com>

* fix tag/mention replacements to use input string, fix link replace to not include scheme

Signed-off-by: kim <grufwub@gmail.com>

* use matched input string for link replace href text

Signed-off-by: kim <grufwub@gmail.com>

* remove unused code (to appease linter :sobs:)

Signed-off-by: kim <grufwub@gmail.com>

* improve hashtagFinger regex to be more compliant

Signed-off-by: kim <grufwub@gmail.com>

* update breakReplacer to include both unix and windows line endings

Signed-off-by: kim <grufwub@gmail.com>

* add NoteRaw field to Account to store plaintext account bio, add migration for this, set for sensitive accounts

Signed-off-by: kim <grufwub@gmail.com>

* drop unnecessary code

Signed-off-by: kim <grufwub@gmail.com>

* update text package tests to fix logic changes

Signed-off-by: kim <grufwub@gmail.com>

* add raw note content testing to account update and account verify

Signed-off-by: kim <grufwub@gmail.com>

* remove unused modules

Signed-off-by: kim <grufwub@gmail.com>

* fix emoji regex

Signed-off-by: kim <grufwub@gmail.com>

* fix replacement of hashtags

Signed-off-by: kim <grufwub@gmail.com>

* update code comment

Signed-off-by: kim <grufwub@gmail.com>

Co-authored-by: Mina Galić <mina.galic@puppet.com>
2022-05-07 17:55:27 +02:00
tobi ef5a9256a8
Extend license notices to 2022 (#354) 2021-12-20 18:42:19 +01:00
tobi 2dc9fc1626
Pg to bun (#148)
* start moving to bun

* changing more stuff

* more

* and yet more

* tests passing

* seems stable now

* more big changes

* small fix

* little fixes
2021-08-25 15:34:33 +02:00
Tobi Smethurst ce190d867c
Text/status parsing fixes (#141)
* aaaaaa

* vendor minify

* update + test markdown parsing
2021-08-16 19:17:56 +02:00
Tobi Smethurst a940a520d3
Link hashtag bug (#121)
* link + hashtag bug

* remove printlns

* tidy up some duplicated code
2021-07-29 13:18:22 +02:00
Tobi Smethurst ea8ad8b346
Link parsing (#120)
* add link parsing + formatting functionality

* refinement + docs

* add missing test

* credit url library
2021-07-28 11:42:26 +02:00
Tobi Smethurst ad0e26dc04
Markdown Statuses (#116)
* parse markdown statuses if desired

* add some preliminary docs for writing posts
2021-07-26 20:25:54 +02:00