Closes#1159
This uses the binaryen tool wasm2js to compile the Anubis WASM blobs
to JavaScript. This produces biblically large (520Ki) outputs when you
inline both hashx and sha256 solvers, but this is a tradeoff that I'm
willing to accept. The performance is good enough in my testing with
JIT enabled. I fear that this may end up being terrible with JIT
disabled. I have no idea if this will work on big endian or not.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(worker): constrain nonce value to be a whole integer
Closes#1043
Sometimes the worker could get into a strange state where it has a
decimal nonce, but the server assumes that the nonce can only be a whole
number. This patch constrains the nonce to be a whole number on the
worker end by detecting if the nonce is a decimal number and then
truncating away the decimal portion.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(algorithms/fast): truncate decimal place on number of threads
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* refactor: make challenge pages return the challenge component
This means that challenge pages will return only the little bit that
actually matters, not the entire component.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): move Anubis version info to be implicitly in the footer
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): embed challenge ID into generated pages
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): make tests pass
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/policy/config): amend tests
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib): fix tests again
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
* chore(web/js): delete proof-of-work-slow.mjs
This code has served its purpose and now needs to be retired to the
great beyond. There is no replacement for this, the fast implementation
will be used instead.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(web): handle building multiple JS entrypoints and web workers
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(web): rewrite frontend worker handling
This completely rewrites how the proof of work challenge works based on
feedback from browser engine developers and starts the process of making
the proof of work function easier to change out.
- Import @aws-crypto/sha256-js to use in Firefox as its implementation
of WebCrypto doesn't jump directly from highly optimized browser
internals to JIT-ed JavaScript like Chrome's seems to.
- Move the worker code to `web/js/worker/*` with each worker named after
the hashing method and hash method implementation it uses.
- Update bench.mjs to import algorithms the new way.
- Delete video.mjs, it was part of a legacy experiment that I never had
time to finish.
- Update LibreJS comment to add info about the use of
@aws-crypto/sha256-js.
- Also update my email to my @techaro.lol address.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): don't hard dep webcrypto anymore
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(lib/policy): start the deprecation process for slow
This mostly adds a warning, but the "slow" method is in the process of
being removed. Warn admins with slog.Warn.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(web/js): allow running Anubis in non-secure contexts
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update metadata
check-spelling run (pull_request) for Xe/purge-slow
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Fixes#877
Continued from #879, event loop thrashing can cause stack space
exhaustion on ia32 systems. Previously this would thrash the event loop
in Firefox and Firefox derived browsers such as Pale Moon. I suspect
that this is the ultimate root cause of the bizarre irreproducible bugs
that Pale Moon (and maybe Cromite) users have been reporting since at
least #87 was merged.
The root cause is an invalid boolean statement:
```js
// send a progress update every 1024 iterations. since each thread checks
// separate values, one simple way to do this is by bit masking the
// nonce for multiples of 1024. unfortunately, if the number of threads
// is not prime, only some of the threads will be sending the status
// update and they will get behind the others. this is slightly more
// complicated but ensures an even distribution between threads.
if (
(nonce > oldNonce) | 1023 && // we've wrapped past 1024
(nonce >> 10) % threads === threadId // and it's our turn
) {
postMessage(nonce);
}
```
The logic here looks fine but is subtly wrong as was reported in #877
by a user in the Pale Moon community. Consider the following scenario:
`nonce` is a counter that increments by the worker count every loop.
This is intended to spread the load between CPU cores as such:
| Iteration | Worker ID | Nonce |
| :-------- | :-------- | :---- |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
| 2 | 0 | 2 |
| 3 | 1 | 3 |
And so on.
The incorrect part of this is the boolean logic, specifically the part
with the bitwise or `|`. I think the intent was to use a logical or
(`||`), but this had the effect of making the `postMessage` handler fire
on every iteration. The intent of this snippet (as the comment clearly
indicates) is to make sure that the main event loop is only updated with
the worker status every 1024 iterations per worker. This had the
opposite effect, causing a lot of messages to be sent from workers to
the parent JavaScript context.
This is bad for the event loop.
Instead, I have ripped out that statement and replaced it with a much
simpler increment only counter that fires every 1024 iterations.
Additionally, only the first thread communicates back to the parent
process. This does mean that in theory the other workers could be ahead
of the first thread (posting a message out of a worker has a nonzero
cost), but in practice I don't think this will be as much of an issue as
the current behaviour is.
The root cause of the stack exhaustion is likely the pressure caused by
all of the postMessage futures piling up. Maybe the larger stack size in
64 bit environments is causing this to be fine there, maybe it's some
combination of newer hardware in 64 bit systems making this not be as
much of a problem due to it being able to handle events fast enough to
keep up with the pressure.
Either way, thanks much to @wolfbeast and the Pale Moon community for
finding this. This will make Anubis faster for everyone!
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
Possible fix for #877
In some cases, the parallel solution finder in Anubis could cause
all of the worker promises to leak due to the fact the promises
were being improperly terminated. A recursion bomb happens in the
following scenario:
1. A worker sends a message indicating it found a solution to the proof
of work challenge.
2. The `onmessage` handler for that worker calls `terminate()`
3. Inside `terminate()`, the parent process loops through all other
workers and calls `w.terminate()` on them.
4. It's possible that terminating a worker could lead to the `onerror`
event handler.
5. This would create a recursive loop of `onmessage` -> `terminate` ->
`onerror` -> `terminate` -> `onerror` and so on.
This infinite recursion quickly consumes all available stack space, but
this has never been noticed in development because all of my computers
have at least 64Gi of ram provisioned to them under the axiom paying for
more ram is cheaper than paying in my time spent having to work around
not having enough ram. Additionally, ia32 has a smaller base stack size,
which means that they will run into this issue much sooner than users on
other CPU architectures will.
The fix adds a boolean `settled` flag to prevent termination from
running more than once.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(anubis): add /healthz route to metrics server
Also add health check test for Docker Compose and update documentation
for health checking Anubis with Docker Compose.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): fix race condition when rendering multiple challenge pages at once
Closes#832
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): make try again button work
Looks like the intent of this was "try the solution again". This fix
makes the client try the challenge again.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): don't block a user if they have an invalid challenge cookie
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Fix centered-div class usage in index.templ
There was a redundant <center> tag around a div with centered-div class. Well, not so redundant because a typo in the class attribute caused it to not apply.
Removed another <center> tag and replaced by a div.centered-div for consistency.
Signed-off-by: Jesús Martínez Novo <martineznovo@gmail.com>
* Fix centered-div class usage in index.templ (continuation)
Template needed to be compiled into go code...
---------
Signed-off-by: Jesús Martínez Novo <martineznovo@gmail.com>
I'm gonna be totally honest here, I'm still not sure why #564 is still
an issue. This is really confusing and I'm going to totally throw out
how Anubis issues challenges and redo it with Valkey (#201, #622).
The problem seems to be that I assume that the makeChallenge function in
package lib is idempotent for the same client. I have no idea why this
would be inconsistent, but for some reason it is and I'm just at a loss
for words as to why this is happening.
This stops the bleeding by improving the UX as a stopgap.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* lib/localization: implement localization system
Locale files are placed in lib/localization/locales/. If you add a
locale, update manifest.json with available locales.
* Exclude locales from check spelling
* tests(lib/localization): add comprehensive translations test
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(challenge/metarefresh): enable localization
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix: use simple syntax for localization in templ
Also localize CELPHASE into French according to the wishes of the
artist.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore:(js): fix forbidden patterns
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: add goi18n to tools
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/localization): dynamically determine the list of supported languages
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* fix(bench): await benchmark loop and adjust outline styles in templates
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor: remove unused showContinueBar function and clean up video error handling
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: format code for consistency and readability using prettier
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Closes#565
The page already had the version number embedded into it, but that was
not printed to the page. This prints the version number set at compile
time to the page.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* refactor: reorder import statements in fetch.go and fetch_test.go
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: optimize struct field alignment to reduce memory usage
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(js): use pure JS SHA256 library, refactor
Closes#458
Additionally, I made a horrifying discovery: Firefox seems to actively
hinder performance if you are using more than one Worker per page. It
does not spread the load out across cores like I expected. Instead it
seems to make that one Worker thrash and have to constantly context
switch, which caused a lot of slowdown.
The benchmarks in #155 continue to be the best contribution ever made to
Anubis. What clued me into there being a problem here was the fact that
the "slow" algorithm was faster than the "fast" algorithm on my laptop.
This made no intuitive sense to me so I dug further.
Either way I think this is a Firefox bug at its core, but for now we
have to work around it by doing the hacky terrible thing that I hate.
I also swapped the SHA256 operations to @aws-crypto/sha256-js on the
advice of a trusted cryptography expert. I don't know what performance
differences this makes, but I'm getting 150-225 kilohashes per second,
which is pretty dang good.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(js): apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(js): use fast algo for fast worker
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>