andrewducker: (my brain)
[personal profile] andrewducker
Reading this article on advice to teachers in the UK about using AI, I see they suggest using it for things like "marking quizzes" and "generating routine letters".

And what really annoys me about this is that it's a perfect example of where simple automation could be used without the need for AI.

The precise example in the article is "Generate a letter to parents about a head lice outbreak." - which is a fairly common thing to happen in schools. So why on earth isn't there one standard letter per school, if not one standard letter for the whole country, that can be reused by absolutely everyone whenever this standard event happens? Why does this require AI to generate a new one every time, rather than just being a standard email that gets sent?

Same with marking quizzes. If children get multiple-choice quizzes regularly across all schools, and marking them uses precious teacher time, why is there not a standard piece of software, paid for once (or written once internally) which enables all children to do quizzes in a standard way, and get them marked automatically?

If we're investing a bunch of money into automating the various processes that teachers spend far too much time on, start with simple automation, which is cheap, easy, and reliable.

Also, wouldn't it be sensible to do some research into how accurately AI marks homework *before* you tell teachers to use it to do that? Here's some research from February which shows that its agreement with examiners was only 0.61 (where 1.00 would be perfect agreement). So I'm sceptical about the quality of the marking it's going to be doing...

performance of random floats (amended)

Monday, 9 June 2025 10:00 pm
fanf: (Default)
[personal profile] fanf

After I found some issues with my benchmark which invalidated my previous results, I have substantially revised my previous blog entry. There are two main differences:

  • A proper baseline revealed that my amd64 numbers were nonsense because I wasn’t fencing enough, and after tearing my hair out and eventually fixing that I found that the bithack conversion is one or two cycles faster.

  • A newer compiler can radically improve the multiply conversion on arm64 so it’s the same speed as the bithack conversion; I've added some source and assembly snippets to the blog post to highlight how nice arm64 is compared to amd64 for this task.

To-read pile, 2025, May

Monday, 9 June 2025 07:31 pm
rmc28: (reading)
[personal profile] rmc28

Books on pre-order:

  1. Queen Demon (Rising World 2) by Martha Wells (7 Oct 2025)

Books acquired in May:

  • and read:
    1. Copper Script by KJ Charles
    2. Red Boar's Baby by Lauren Esker
  • and unread:
    1. The Wrath & The Dawn by Renée Ahdieh [3]
    2. The Unexpected Inheritance of Inspector Chopra by Vaseem Khan [3]
    3. Kidnap on the California Comet by M.G. Leonard & Sam Sedgman [3]
    4. Betrayal (Trinity 1) by Fiona McIntosh [3]

Borrowed books read in May:

  1. The Good Thieves by Katherine Rundell
  2. One Christmas Wish by Katherine Rundell
  3. You Have a Match by Emma Lord [2][6]

I continue to not read much (by my standards). I did not manage to read any of the physical books I had out of the library until they needed to be returned, and I've got several half-finished books in progress. (Oh, and in writing this I've realised I already have the Renée Ahdieh book in ebook, and haven't read it there either!)

[1] Pre-order
[2] Audiobook
[3] Physical book
[4] Crowdfunding
[5] Goodbye read
[6] Cambridgeshire Reads/Listens
[7] FaRoFeb / FaRoCation / Bookmas / HRBC
[8] Prime Reading / Kindle Unlimited

EHRC nonsense

Monday, 9 June 2025 11:14 am
lnr: Halloween 2023 (Default)
[personal profile] lnr

We still haven't met with Senior Management: it's now due tomorrow, in person. I'm gently trying not to panic.

There's still been no message of support to all members of staff and students from the University, and nothing at all from the department. Though I understand they're still in discussions in the background. This is frustrating.

The subject was raised at a recent All Staff meeting (in which people submit questions as text, and senior management attempt to answer them). We were given broad assurances that the university values and supports trans people, but nothing actually useful or genuinely supportive was said.

In the meantime a new EHRC chair is due to be appointed, and they're considering a person with a known anti-trans background. There's an Open Letter available to sign in protest, written by a very good friend and colleague: https://docs.google.com/forms/d/e/1FAIpQLSe_Y77t7CQqKjdGifNa0lE3HKjDAb1UoJdjuLAbInhIQsRMhw/viewform

I've also seen a good template if you want to write directly: https://docs.google.com/document/d/1865KMfu24JgmwnWmYXaVc3jlzj5uQFEq69hXMxKP6BU/edit?tab=t.0

And I wrote my own version:

9th June 2025
Dear Women’s and Equalities Select Committee and Joint Committee on Human Rights,
Cc: Pippa Heylings, as my MP

I am writing to express my grave concern about the proposed appointment of Dr Mary-Ann Stephenson as the Chair of the Equality and Human Rights Commission.

I won't include a string of references here, because I think you will have seen them all already, but I think it is imperative that the next person appointed as Head of the EHRC should not be seen to have a strong anti-trans background. Trans people are currently scared. Scared for their jobs, if they cannot access their workplace in safety and dignity. Scared of being assaulted if they go to the "wrong" toilet. Scared of being outed as trans in public if they try to follow the new guidelines.

And I am scared as a cis woman, a woman who is not trans, at what is happening in our country, and what this means for my friends and colleagues and for trans people in general. For intersex people, non-binary people, and any woman who might be mistaken for being trans. Other women need to feel safe too, but excluding trans people is not the way to do this.

The EHRC needs to stand up for the rights of everyone, and to be seen to do so. I sincerely hope you will take this into account.

Kind Regards,

Eleanor Blair
Great Shelford, Cambridge, CB22

I'm not even going to attempt to get into the member of the EHRC who was quoted as effectively saying that trans people have been misled about their rights under the Equality Act for the last 15 years, and there will now be a period of adjustment, but they should just get used to having fewer rights than they thought they did. The Guardian changed their headline and reporting three times as a result of her protesting about being misquoted, but that seems to have been the gist of it. Not mentioning that the "misleading" guidance came from the EHRC themselves, and was based on the previous understanding of the Equalities Act and entirely consistent with it. FFS

performance of random floats

Sunday, 8 June 2025 03:15 am
fanf: (Default)
[personal profile] fanf

https://dotat.at/@/2025-06-08-floats.html

A couple of years ago I wrote about random floating point numbers. In that article I was mainly concerned about how neat the code is, and I didn't pay attention to its performance.

Recently, a comment from Oliver Hunt and a blog post from Alisa Sireneva prompted me to wonder if I made an unwarranted assumption. So I wrote a little benchmark, which you can find in pcg-dxsm.git.

(Note 2025-06-09: I've edited this post substantially after discovering some problems with the results.)

recap

Briefly, there are two basic ways to convert a random integer to a floating point number between 0.0 and 1.0:

  • Use bit fiddling to construct an integer whose format matches a float between 1.0 and 2.0; this is the same span as the result but with a simpler exponent. Bitcast the integer to a float and subtract 1.0 to get the result.

  • Shift the integer down to the same range as the mantissa, convert to float, then multiply by a scaling factor that reduces it to the desired range. This produces one more bit of randomness than the bithacking conversion.

(There are other less basic ways.)

code

The double precision code for the two kinds of conversion is below. (Single precision is very similar so I'll leave it out.)

It's mostly as I expect, but there are a couple of ARM instructions that surprised me.

bithack

The bithack function looks like:

double bithack52(uint64_t u) {
    u = ((uint64_t)(1023) << 52) | (u >> 12);
    return(bitcast(double, u) - 1.0);
}
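
(The bitcast macro isn't defined in that snippet. A typical way to write it - a sketch, not necessarily the exact macro in pcg-dxsm.git - uses memcpy, which GCC and Clang turn into a plain register move:)

#include <string.h>

/* sketch of one way to define bitcast(): reinterpret the bits of a
   value as another type of the same size; memcpy avoids undefined
   type-punning and compiles down to a single move (GNU statement
   expression, so GCC/Clang only) */
#define bitcast(type, value) ({                              \
        type bitcast_dst;                                    \
        memcpy(&bitcast_dst, &(value), sizeof(bitcast_dst)); \
        bitcast_dst;                                         \
    })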

The bithack function translates fairly directly to amd64 like this:

bithack52:
    shr     rdi, 12
    movabs  rax, 0x3ff0000000000000
    or      rax, rdi
    movq    xmm0, rax
    addsd   xmm0, qword ptr [rip + .number]
    ret
.number:
    .quad   0xbff0000000000000

On arm64 the shift-and-or becomes one bfxil instruction (which is a kind of bitfield move), and the constant -1.0 is encoded more briefly. Very neat!

bithack52:
    mov     x8, #0x3ff0000000000000
    fmov    d0, #-1.00000000
    bfxil   x8, x0, #12, #52
    fmov    d1, x8
    fadd    d0, d1, d0
    ret

multiply

The shift-convert-multiply function looks like this:

double multiply53(uint64_t u) {
    return ((double)(u >> 11) * 0x1.0p-53);
}

It translates directly to amd64 like this:

multiply53:
    shr       rdi, 11
    cvtsi2sd  xmm0, rdi
    mulsd     xmm0, qword ptr [rip + .number]
    ret
.number:
    .quad     0x3ca0000000000000

GCC and earlier versions of Clang produce the following arm64 code, which is similar though it requires more faff to get the constant into the right register.

multiply53:
    lsr     x8, x0, #11
    mov     x9, #0x3ca0000000000000
    ucvtf   d0, x8
    fmov    d1, x9
    fmul    d0, d0, d1
    ret

Recent versions of Clang produce this astonishingly brief two-instruction translation: apparently you can convert fixed-point to floating point in one instruction, which gives us the power-of-two scale factor for free!

multiply53:
    lsr     x8, x0, #11
    ucvtf   d0, x8, #53
    ret
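
(For reference, the single-precision analogues - which are what the 23- and 24-bit rows in the benchmark below measure - would look roughly like this. This is my sketch of the obvious translation, not code copied from the repository.)

float bithack23(uint32_t u) {
    /* 23-bit mantissa; 127 is the exponent of 1.0f */
    u = ((uint32_t)(127) << 23) | (u >> 9);
    return(bitcast(float, u) - 1.0f);
}

float multiply24(uint32_t u) {
    /* keep 24 bits, then scale by 2^-24 */
    return((float)(u >> 8) * 0x1.0p-24f);
}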

benchmark

My benchmark has 2 x 2 x 2 tests:

  • bithacking vs multiplying

  • 32 bit vs 64 bit

  • sequential integers vs random integers

I ran the benchmark on my Apple M1 Pro and my AMD Ryzen 7950X.

These functions are very small and work entirely in registers so it has been tricky to measure them properly.

To prevent the compiler from inlining and optimizing the benchmark loop to nothing, the functions are compiled in a separate translation unit from the test harness. This is not enough to get plausible measurements because the CPU overlaps successive iterations of the loop, so we also use fence instructions.

On arm64, a single ISB (instruction synchronization barrier) in the loop is enough to get reasonable measurements.

I have not found an equivalent of ISB on amd64, so I'm using MFENCE. It isn't effective unless I pass the argument and return values via pointers (because it's a memory fence) and place MFENCE instructions just before reading the argument and just after writing the result.
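
To make the shape of this concrete, the amd64 measurement loop ends up looking roughly like the sketch below (the real harness is in pcg-dxsm.git; the function and variable names here are illustrative):

#include <stddef.h>
#include <stdint.h>
#include <immintrin.h>          /* _mm_mfence() */

/* compiled in a separate translation unit so it can't be inlined */
double multiply53(uint64_t u);

/* rough sketch of the amd64 timing loop: the argument and result go
   via memory so that MFENCE (a memory fence) actually orders them */
void bench_multiply53(const uint64_t *in, double *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        _mm_mfence();    /* fence just before reading the argument */
        out[i] = multiply53(in[i]);
        _mm_mfence();    /* fence just after writing the result */
    }
}

On arm64 the equivalent, per the note above, would be a single __asm__ volatile("isb" ::: "memory") in the loop body instead of the pair of fences.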

results

In the table below, the leftmost column is the number of random bits; "old" is arm64 with older clang, "arm" is newer clang, "amd" is gcc.

The first line is a baseline do-nothing function, showing the overheads of the benchmark loop, function call, load argument, store return, and fences.

The upper half measures sequential numbers, the bottom half is random numbers. The times are nanoseconds per operation.

  bits    old    arm    amd

    00  21.44  21.41  21.42   (baseline)

  sequential:
    23  24.28  24.31  22.19
    24  25.24  24.31  22.94
    52  24.31  24.28  21.98
    53  25.32  24.35  22.25

  random:
    23  25.59  25.56  22.86
    24  26.55  25.55  23.03
    52  27.83  27.81  23.93
    53  28.57  27.84  25.01

The times vary a little from run to run but the difference in speed of the various loops is reasonably consistent.

The numbers on arm64 are reasonably plausible. The most notable thing is that the "old" multiply conversion is about 3 or 4 clock cycles slower, but with a newer compiler that can eliminate the multiply, it's the same speed as the bithacking conversion.

On amd64 the multiply conversion is about 1 or 2 clock cycles slower than the bithacking conversion.

conclusion

The folklore says that bithacking floats is faster than normal integer to float conversion, and my results generally agree with that, apart from on arm64 with a good compiler. It would be interesting to compare other CPUs to get a better idea of when the folklore is right or wrong -- or if any CPUs perform the other way round!

Photo cross-post

Saturday, 7 June 2025 12:29 pm
andrewducker: (Default)
[personal profile] andrewducker


My brother Mike got me this for my birthday, and it just takes a weight off my mind being able to say "bring the steam temperature up to 95 degrees and hold it there"

(Control over oil temperature when frying eggs is also awesome.)
Original is here on Pixelfed.scot.

A mostly-free day

Saturday, 7 June 2025 10:31 am
rmc28: Rachel post-game, slumped sideways in a chair eyes closed (tired)
[personal profile] rmc28

I'm playing an ice hockey game tonight in Cambridge, a charity fundraiser between Warbirds and Tri-Base Lightning. But until then I have a strangely unscheduled day. I might sleep or read or something.

I could post about what I've been up to lately!

Work:

  • spoke on a panel about effective 1:1s, it seemed to go well
  • played my usual Senior Tech Woman role for a colleague's recruitment panel, and am happy that our preferred candidate has apparently just accepted. (a frustrating number of timewasting applicants more or less obviously using LLMs to write their applications and generate their free-text statements on suitability for the role; I really resent having to wade through paragraphs of verbose buzzword bilge to ... fail to find any evidence they actually know how to do the job)

Hockey:

  • KODIAKS WON PLAYOFFS on the bank holiday weekend oh yes they did. So proud of the players, and definitely earned my share of reflected glory managing the team this season and running around half the weekend. League winners, Cup winners, Playoff winners, promotion to Division 1 next season, utter delight.
  • Very much an Insufficient Sleep weekend, we topped off the playoff win with a night out in Sheffield, I got back to my hotel as the sky was getting light, good times.
  • Kodiaks awards evening last night: lots of celebration of the hard work and lovely camaraderie of this group of players, A and B teams both. I got to announce and hand out the B team awards, and I received a really nice pair of gifts for me as manager: a canvas print of a post-final winners photo, and a personalised insulated travel mug (club logo and MANAGER on it). I love this team.
  • I'm still enjoying also playing with Warbirds, and have now been to a few summer Friday scrimmages run by Tri-Base. I went to a couple of Friday scrims at the end of last summer and felt everyone was very kind but I was pretty outclassed. I'm pleased to feel like I'm keeping up a bit better now after training a lot harder this last season.
  • I trained three days in a row this week (Warbirds Monday, Haringey Greyhounds tryouts in Alexandra Palace on Tuesday, Kodiaks Wednesday) and that was Too Much and I was pretty sore Wednesday evening and Thursday. Rest days are important even if I am much improved in fitness compared to this time last year.

Other:

  • I did a formal hall at my old College! Using my alumna rights and having a nice evening hanging out with old friends (who were the ones to suggest the plan). Good times, will do again but probably not this term.
  • I had an excessive number of books out from Suffolk libraries that needed returning, so I did a flying visit to Newmarket by bus last Saturday, this turned out to be the cheapest/quickest way across the county border. I managed to stick to my resolution not to borrow any more physical books but slipped and fell on the "withdrawn books for sale" stand. Managed to only come home with four.
  • I did a little indoor cricket the Friday before playoffs (it's now finished due to exam period), and some nets practice last Sunday, but I keep being too busy to actually play any of my team's games. I'd like to do more nets practice though, that was intense but also felt like I was beginning to improve.
  • I did a little table tennis with Active Staff but that's also now suspended for exams. I'm considering getting a cheap set of bats and balls for me and the family to go use at the local rec ground, or in the free indoor tables at the Grafton Centre.

Coming up: my summer is full of ice hockey camps and tournaments (Prague, Hull, Sheffield, Biarritz) and my old club Streatham have just announced all their summer training sessions will be "Summer Skills Camps" open to all interested WNIHL players, so I'm looking at going to London regularly again in July and August.

The Sickening Has Me

Friday, 6 June 2025 08:20 pm
andrewducker: (xkcd boomdeyada)
[personal profile] andrewducker
I spent the day feeling bad for lacking focus, and wondering why I couldn't get anything done.
And then I slept for an hour on no notice.
And now I'm very wobbly and all of my muscles gently ache.
So I think I'm going to chalk it up as "The Plague" and hope I feel better tomorrow.
andrewducker: (Default)
[personal profile] andrewducker
I see we're back at the "Labour attempt to introduce a mandatory ID card" stage of history*.

My feeling last time was that the main problem they always have is that they *start* with the cards being mandatory.

If you start with "Here is a thing that makes your life much easier, that you can carry about if you like." then that will get you 85% of the way there. And then, once you have a voluntary ID card that's not causing any problems for anyone, and that 85% of the population is using to make their life easier, *then* you move in and say "The only people who don't carry an ID card are weirdos and troublemakers, and they're causing friction in the system, we could make it all run more smoothly if only they *had* to carry one."

But no, they always try to go instantly from "Nobody has an ID card." to "Everyone must carry one at all times." - which unites all sorts of people from across the political spectrum against it, and ends up being far more politically costly to them than if they'd just boiled their frog slowly.

(None of which should be taken as me taking a position on ID cards. I'm just constantly bemused by their inability to get things done by trying to rush them through in the most authoritarian manner possible.)

*Younger readers may not remember the fuss over the Identity Cards Act 2006 (repealed in 2011)
[personal profile] mjg59
As I wrote in my last post, Twitter's new encrypted DM infrastructure is pretty awful. But the amount of work required to make it somewhat better isn't large.

When Juicebox is used with HSMs, it supports encrypting the communication between the client and the backend. This is handled by generating a unique keypair for each HSM. The public key is provided to the client, while the private key remains within the HSM. Even if you can see the traffic sent to the HSM, it's encrypted using the Noise protocol and so the user's encrypted secret data can't be retrieved.

But this is only useful if you know that the public key corresponds to a private key in the HSM! Right now there's no way to know this, and it gets worse: the client doesn't have the public key built into it; it's supplied in response to an API request made to Twitter's servers. Even if the current keys are associated with the HSMs, Twitter could swap them out with ones that aren't, terminate the encrypted connection at their endpoint, and then fake your query to the HSM and get the encrypted data that way. Worse, this could be done for specific targeted users, without any indication to the user that this has happened, making it almost impossible to detect in general.

This is at least partially fixable. Twitter could prove to a third party that their Juicebox keys were generated in an HSM, and the key material could be moved into clients. This makes attacking individual users more difficult (the backdoor code would need to be shipped in the public client), but can't easily help with the website version[1] even if a framework exists to analyse the clients and verify that the correct public keys are in use.
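
(As a sketch of what moving the key material into clients would look like - the names and structure here are mine, not Juicebox's or Twitter's actual API - the client ships with the expected HSM public keys and refuses to start a session with anything else:)

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hypothetical example: HSM public keys baked into the client at
   build time instead of fetched from an API call at runtime */
static const uint8_t pinned_hsm_keys[][32] = {
    {0 /* ... 32-byte public key of HSM 1 ... */},
    {0 /* ... 32-byte public key of HSM 2 ... */},
};

/* refuse to talk to an "HSM" whose key isn't one we compiled in; this
   is what stops the server substituting its own key and terminating
   the encryption at its own endpoint */
static bool hsm_key_is_pinned(const uint8_t offered[32]) {
    for (size_t i = 0;
         i < sizeof(pinned_hsm_keys) / sizeof(pinned_hsm_keys[0]); i++)
        if (memcmp(pinned_hsm_keys[i], offered, 32) == 0)
            return true;
    return false;
}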

It's still worse than Signal. Use Signal.

[1] Since they could still just serve backdoored Javascript to specific users. This is, unfortunately, kind of an inherent problem when it comes to web-based clients - we don't have good frameworks to detect whether the site itself is malicious.
[syndicated profile] simont_quasiblog_feed

Posted by Simon Tatham

A collection of semi-connected rants about context-free grammars, parser generators, and the ways in which they aren’t quite as useful as I’d like them to be.
[personal profile] mjg59
(Edit: Twitter could improve this significantly with very few changes - I wrote about that here. It's unclear why they'd launch without doing that, since it entirely defeats the point of using HSMs)

When Twitter[1] launched encrypted DMs a couple of years ago, it was the worst kind of end-to-end encrypted - technically e2ee, but in a way that made it relatively easy for Twitter to inject new encryption keys and get everyone's messages anyway. It was also lacking a whole bunch of features such as "sending pictures", so the entire thing was largely a waste of time. But a couple of days ago, Elon announced the arrival of "XChat", a new encrypted message platform "built on Rust with (Bitcoin style) encryption, whole new architecture". Maybe this time they've got it right?

tl;dr - no. Use Signal. Twitter can probably obtain your private keys, and admit that they can MITM you and have full access to your metadata.

The new approach is pretty similar to the old one in that it's based on pretty straightforward and well tested cryptographic primitives, but merely using good cryptography doesn't mean you end up with a good solution. This time they've pivoted away from using the underlying cryptographic primitives directly and into higher level abstractions, which is probably a good thing. They're using Libsodium's boxes for message encryption, which is, well, fine? It doesn't offer forward secrecy (if someone's private key is leaked then all existing messages can be decrypted) so it's a long way from the state of the art for a messaging client (Signal's had forward secrecy for over a decade!), but it's not inherently broken or anything. It is, however, written in C, not Rust[2].
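
For illustration (this is the public libsodium API, not Twitter's code, and the function name here is mine): a box is sealed with the sender's long-term secret key and the recipient's long-term public key, with no ratcheting, which is exactly why there's no forward secrecy.

#include <sodium.h>

/* sketch: encrypt one message with libsodium's crypto_box. Both keys
   are long-term, so if recipient_sk ever leaks, every past ciphertext
   encrypted to it can be decrypted - no forward secrecy. */
int encrypt_dm(unsigned char *ciphertext,  /* mlen + crypto_box_MACBYTES bytes */
               const unsigned char *msg, unsigned long long mlen,
               const unsigned char *nonce,         /* crypto_box_NONCEBYTES */
               const unsigned char *recipient_pk,  /* crypto_box_PUBLICKEYBYTES */
               const unsigned char *sender_sk)     /* crypto_box_SECRETKEYBYTES */
{
    return crypto_box_easy(ciphertext, msg, mlen, nonce,
                           recipient_pk, sender_sk);
}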

That's about the extent of the good news. Twitter's old implementation involved clients generating keypairs and pushing the public key to Twitter. Each client (a physical device or a browser instance) had its own private key, and messages were simply encrypted to every public key associated with an account. This meant that new devices couldn't decrypt old messages, and also meant there was a maximum number of supported devices and terrible scaling issues and it was pretty bad. The new approach generates a keypair and then stores the private key using the Juicebox protocol. Other devices can then retrieve the private key.

Doesn't this mean Twitter has the private key? Well, no. There's a PIN involved, and the PIN is used to generate an encryption key. The stored copy of the private key is encrypted with that key, so if you don't know the PIN you can't decrypt the key. So we brute force the PIN, right? Juicebox actually protects against that - before the backend will hand over the encrypted key, you have to prove knowledge of the PIN to it (this is done in a clever way that doesn't directly reveal the PIN to the backend). If you ask for the key too many times while providing the wrong PIN, access is locked down.

But this is true only if the Juicebox backend is trustworthy. If the backend is controlled by someone untrustworthy[3] then they're going to be able to obtain the encrypted key material (even if it's in an HSM, they can simply watch what comes out of the HSM when the user authenticates if there's no validation of the HSM's keys). And now all they need is the PIN. Turning the PIN into an encryption key is done using the Argon2id key derivation function, using 32 iterations and a memory cost of 16MB (the Juicebox white paper says 16KB, but (a) that's laughably small and (b) the code says 16 * 1024 in an argument that takes kilobytes), which makes it computationally and moderately memory expensive to generate the encryption key used to decrypt the private key. How expensive? Well, on my (not very fast) laptop, that takes less than 0.2 seconds. How many attempts do I need to crack the PIN? Twitter's chosen to fix that to 4 digits, so a maximum of 10,000. You aren't going to need many machines running in parallel to bring this down to a very small amount of time, at which point private keys can, to a first approximation, be extracted at will.
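
To put numbers on that, here's a sketch using libsodium's Argon2id binding for illustration (not Juicebox's actual code; the helper name and callback are made up). Iterating over every possible 4-digit PIN at under 0.2 seconds per derivation is around half an hour of single-core work, before you even parallelise.

#include <sodium.h>
#include <stdio.h>
#include <string.h>

/* sketch of the brute force: derive a key from each possible 4-digit
   PIN with Argon2id (32 iterations, 16 MB memory, as described above)
   and hand it to a caller-supplied test against the stolen encrypted
   private key. Assumes sodium_init() has already been called. */
int crack_pin(const unsigned char *salt,  /* crypto_pwhash_SALTBYTES bytes */
              int (*key_decrypts_secret)(const unsigned char key[32])) {
    for (int pin = 0; pin < 10000; pin++) {
        char pin_str[5];
        unsigned char key[32];
        snprintf(pin_str, sizeof pin_str, "%04d", pin);
        if (crypto_pwhash(key, sizeof key, pin_str, strlen(pin_str), salt,
                          32,                /* opslimit: 32 iterations */
                          16 * 1024 * 1024,  /* memlimit: 16 MB */
                          crypto_pwhash_ALG_ARGON2ID13) != 0)
            return -1;                       /* ran out of memory */
        if (key_decrypts_secret(key))
            return pin;                      /* PIN recovered */
    }
    return -1;
}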

Juicebox attempts to defend against this by supporting sharding your key over multiple backends, and only requiring a subset of those to recover the original. Twitter does seem to be making use of this - it uses three backends and requires data from at least two - but all the backends used are under x.com so are presumably under Twitter's direct control. Trusting the keystore without needing to trust whoever's hosting it requires a trustworthy communications mechanism between the client and the keystore. If the device you're talking to can prove that it's an HSM that implements the attempt limiting protocol and has no other mechanism to export the data, this can be made to work. Signal makes use of something along these lines using Intel SGX for contact list and settings storage and recovery, and Google and Apple also have documentation about how they handle this in ways that make it difficult for them to obtain backed up key material. Twitter has no documentation of this, and as far as I can tell does nothing to prove that the backend is in any way trustworthy. (Edit to add: The Juicebox API does support authenticated communication between the client and the HSM, but that relies on you having some way to prove that the public key you're presented with corresponds to a private key that only exists in the HSM. Twitter gives you the public key whenever you communicate with them, so even if they've implemented this properly you can't prove they haven't made up a new key and MITMed you the next time you retrieve your key)

On the plus side, Juicebox is written in Rust, so Elon's not 100% wrong. Just mostly wrong.

But ok, at least you've got viable end-to-end encryption even if someone can put in some (not all that much, really) effort to obtain your private key and render it all pointless? Actually no, since you're still relying on the Twitter server to give you the public key of the other party and there's no out of band mechanism to do that or verify the authenticity of that public key at present. Twitter can simply give you a public key where they control the private key, decrypt the message, and then reencrypt it with the intended recipient's key and pass it on. The support page makes it clear that this is a known shortcoming and that it'll be fixed at some point, but they said that about the original encrypted DM support and it never was, so that's probably dependent on whether Elon gets distracted by something else again. And the server knows who and when you're messaging even if they haven't bothered to break your private key, so there's a lot of metadata leakage.

Signal doesn't have these shortcomings. Use Signal.

[1] I'll respect their name change once Elon respects his daughter

[2] There are implementations written in Rust, but Twitter's using the C one with these JNI bindings

[3] Or someone nominally trustworthy but who's been compelled to act against your interests - even if Elon were absolutely committed to protecting all his users, his overarching goals for Twitter require him to have legal presence in multiple jurisdictions that are not necessarily above placing employees in physical danger if there's a perception that they could obtain someone's encryption keys