A couple of days ago I realized that I could use the 32 GB of swap that I had allocated on the NVMe drives to debug this. Since the swap isn't normally in use, I deactivated it and trimmed those swap partitions in their entirety. Immediately afterward the lag spikes decreased substantially, but since then they seem to have been coming back, so it wasn't exactly a resounding success.
Having looked at more I/O traces since then, however, I can't help feeling that the problem is TRIM-related: the only I/O operations that seem to see high latency are writes, and when it happens, large batches of writes often complete all at once after having been stalled together. I'm not sure what to make of that, but the trim did seem to do something, and it could be that 32 GB is simply not enough (though that reading does sound rather optimistic). Perhaps I should reserve more unused space on the drives and try again. That will require some downtime, however, so I think I'll ponder it a bit more.
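
For reference, here is a rough sketch of how that "writes stall together and then complete in a batch" pattern could be picked out of a trace. It assumes the trace has already been parsed into records of operation type, completion time and latency. The `IoEvent` type, the function name and the thresholds are all made up for illustration and don't come from any particular tracing tool.

```python
from dataclasses import dataclass


@dataclass
class IoEvent:
    op: str              # "R" for read, "W" for write (hypothetical encoding)
    completed_at: float  # completion timestamp in seconds
    latency: float       # time from issue to completion in seconds


def stalled_write_batches(events, latency_threshold=0.1, window=0.005, min_batch=4):
    """Return groups of slow writes that completed close together in time.

    A write counts as slow if its latency is at least `latency_threshold`.
    Consecutive slow writes belong to the same group while the gap between
    their completion times is at most `window`. Groups smaller than
    `min_batch` are dropped. All three defaults are arbitrary guesses.
    """
    slow = sorted(
        (e for e in events if e.op == "W" and e.latency >= latency_threshold),
        key=lambda e: e.completed_at,
    )
    batches, current = [], []
    for event in slow:
        # A large gap since the previous slow write closes the current group.
        if current and event.completed_at - current[-1].completed_at > window:
            if len(current) >= min_batch:
                batches.append(current)
            current = []
        current.append(event)
    if len(current) >= min_batch:
        batches.append(current)
    return batches
```

If most of the slow writes end up in a handful of such groups while reads never cross the threshold, that would at least be consistent with the drive stalling on writes as a whole, rather than with individual requests being slow.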