libera/#sicl - IRC Chatlog
Search
8:16:06
Duuqnd
Any suggestions for how I should get the initial seed? Normally I'd want to use the OS's randomness source and use the current time as a fallback, but I'm not sure how to do OS-specific things with SICL (should I even?).
8:19:30
Duuqnd
Well yeah but I'm not sure how that would work when bootstrapping. I've only read one paper from 2019 about how it works and I'm still not sure I got it all. Is using #+ for OS detection fine in SICL modules?
8:20:20
hayley
I don't think any other module uses RANDOM, so you can assume that your code will only be loaded while producing the final environment?
8:22:32
Duuqnd
no-defun-allowed: It feels a bit wrong to do it like that but I guess I will. Also, since random-state is a class and I'm making the MT19937 generator a subclass of that, I could make another subclass of random-state that uses the OS's randomness source and use that as the default seed source.
8:23:01
Duuqnd
Wait, how do I respond to your messages when using matrix, it turns the asterisks into italic?
8:24:04
hayley
Because some Matrix clients aren't careful with internal and external representations, and decide the asterisks (yes, my Matrix nick is still *no-defun-allowed*) are italic syntax.
8:25:38
mfiano
Good. I would much rather have something with better properties, like something in the PCG family.
8:26:50
hayley
I have implemented worse but faster random number generators for fuzz testing before.
8:27:24
Duuqnd
I'm making the MT19937 generator is a subclass of random-state so adding something else should be as easy as adding a new subclass to it and changing which class make-random-state uses.
11:59:40
lonjil
I wonder if it might be a good idea to have SipHash or some other keyable hash function available for hash tables.
12:00:45
hayley
I had considered SipHash, but wasn't too keen on implementing it. So I went with FNV-1a which is easier to implement, but still has an initial state that you can change.
12:01:38
hayley
(See section 11.4 "Hashing hash table protocol" of the SICL specification, especially HASH-TABLE-HASH-FUNCTION)
12:14:22
lonjil
hayley: hm, are you sure that actually achieves the same properties as a keyed hash? I can't find anything about it.
12:16:22
hayley
Good question. To my untrained eye, multiplication and XOR are pretty good at spreading entropy, and the "offset" provided is totally arbitrary. But apparently Python ditched FNV-1a for SipHash.
12:18:24
hayley
https://www.python.org/dev/peps/pep-0456/ claims the seed can be recovered, but Python used a different keying technique...I suppose what is in SICL could be classified as rolling one's own cryptography.
12:22:03
hayley
Once I convinced the Z3 solver to find the inverse of the Java random number generator, so that I could create a RNG with a specific seed, and getting the first six bytes would return the bytes that encode "Hayley". I could try something similar to tell if this will be a problem, but it probably is.
12:24:54
Duuqnd
I've more or less finished my implementation of the RNG stuff. The one thing I'm still not sure about is how to deal with floats.
12:28:00
Duuqnd
Numbers bigger than the RNG's output (including bignums) was pretty easy but floats I'm not sure about.
12:33:40
Bike
"the procedure ugfsr_CreateMT19937_98() based on the Mersenne Twister [15] MT19937 to generate floats in TestU01 [10] passes the Small Crush battery of tests without failure, even though all double precision numbers produced have the 32nd bit of their fractional part always set to '1,' and their 31st bit has a 75% chance of being '0'" god forbid
12:48:34
scymtym
ACTION is still recovering from https://github.com/sbcl/sbcl/commit/3f6e9bc23 , https://github.com/sbcl/sbcl/commit/97ce63359 , https://github.com/sbcl/sbcl/commit/ef1ff011f
12:51:33
Duuqnd
I did it in a sorta lazy way and just used two random numbers, one for the integer part and one for the decimal part.
12:52:21
Duuqnd
It works but I'm sure it's easy to find problems with it for someone more knowledgeable about floats
13:01:32
Bike
i guess the basic issue is the float exponent should actually follow a geometric distribution
13:01:54
Bike
https://allendowney.com/research/rand/downey07randfloat.pdf has a simple algorithm but might not be fast
13:10:56
Duuqnd
I could use that but it seems to require poking around the bits of the float and I don't know what the proper way for a SICL module to do that would be.
13:21:31
Duuqnd
My crappy method should work for now but I'll make sure to replace it before opening a pull request