freenode/#sicl - IRC Chatlog
5:50:08
no-defun-allowed
I was going to say that I had succeeded in porting Kulukundis's hash table, then I ran the tests and found I have a #<SICL-LINEAR-PROBING-HASH-TABLE:LINEAR-PROBING-HASH-TABLE :test EQL size -1/64 {100E29E5A3}>.
5:53:09
jackdaniel
now you could advertise sicl as the only implementation with negative consing - imagine the possibilities
5:54:46
no-defun-allowed
Just checked: Haskell already has snoc, which appears to negative-cons already. So I will probably have to fix that, as we aren't the first to invent it, and we'd look like weirdo Haskell users.
6:03:00
no-defun-allowed
Okay, now it works. I had accidentally broken the metadata scanning functions. However, the REHASHING tests provided are not necessarily testing the specification; the HyperSpec states "the actual amount by which the hash-table will grow upon rehash is implementation-dependent." and that :rehash-size is just "advice".
6:12:45
no-defun-allowed
In the case of this design, hash table sizes have to be powers of 2 and larger than one group, and resizing must always multiply the size by another (positive) power of 2 to preserve the size invariants.
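That invariant can be sketched as follows (a Python sketch with hypothetical names, not SICL's actual code). Per the HyperSpec, :rehash-size is either an integer increment or a float growth factor, but here it is only advice: the actual growth rounds the advised size up to the next power-of-2 multiple of the current size.

```python
# Sketch of the resize rule described above (hypothetical helper, not SICL code).
# Sizes must be powers of 2 and at least one metadata group wide, and growth
# multiplies the current size by a positive power of 2, so :rehash-size is
# honoured only as "advice".

GROUP_SIZE = 16  # assumed metadata group width

def next_size(current_size, rehash_size=2.0):
    """Smallest power-of-2 multiple of current_size >= the advised new size."""
    assert current_size >= GROUP_SIZE
    assert current_size & (current_size - 1) == 0  # power of 2
    # Float :rehash-size is a factor; integer :rehash-size is an increment.
    advised = (current_size * rehash_size if isinstance(rehash_size, float)
               else current_size + rehash_size)
    new_size = current_size * 2
    while new_size < advised:
        new_size *= 2
    return new_size
```

So a :rehash-size of 3.0 on a table of size 16 advises 48 entries, which rounds up to 64.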
6:36:42
beach
So I guess my next task is MIR-to-LIR, aside from the FIXMEs in HIR-to-MIR. The current MIR-to-LIR is very error-prone because I essentially put every operand in a register before issuing the operation, and then store the resulting register on the stack.
6:36:43
beach
There is no guarantee that I am not using the same register for more than one thing at the same time. So perhaps this is a good time to start working on register allocation.
6:39:41
no-defun-allowed
I found GETHASH is currently 30% or so faster on linear-probing hash tables than bucket hash tables, but only up to a certain size. When it is larger than that size, the linear-probing hash table becomes much slower, and I'm not immediately sure why.
6:42:35
no-defun-allowed
I hope it's not intrinsic; the graphs in the CppCon presentation didn't show the original table becoming much slower with larger tables.
7:20:24
no-defun-allowed
I ran the benchmark with smaller increments in hash table counts (using the E6 series instead of powers of 10), and there appears to be a sawtooth pattern on a plot of GETHASH time versus hash table count.
7:28:51
no-defun-allowed
My first guess is that it has to do with when we decide to resize, if it is periodic. Also, the graph looks like <https://applied-langua.ge/~hayley/gethash-versus-count.png>
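A toy model of that guess (Python, with illustrative constants only; not the SICL implementation): if lookups slow down as the load factor rises, and the table doubles whenever an insertion pushes the load factor past a fixed threshold, then the load factor plotted against entry count traces exactly this kind of sawtooth.

```python
# Toy model of periodic resizing, not SICL code. THRESHOLD is an assumed
# resize threshold; any lookup cost proportional to load factor will show
# the same sawtooth against entry count.

THRESHOLD = 0.8

def load_factor_after(n, initial_size=16):
    """Load factor of a table that doubles when an insertion crosses THRESHOLD."""
    size = initial_size
    for i in range(n):
        if (i + 1) / size > THRESHOLD:
            size *= 2
    return n / size

factors = [load_factor_after(n) for n in range(1, 200)]
# The sequence climbs toward THRESHOLD, then roughly halves at every doubling.
```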
7:30:32
beach
The purpose of a hash table is to have relatively constant access time, independent of the size.
7:34:34
no-defun-allowed
I changed the resize-threshold (from 0.8 to 0.5) and got another sawtooth pattern, but almost at a 180° phase offset. (A strange way to describe a graph.)
7:39:24
no-defun-allowed
Okay, I replaced the hash function with (the moral equivalent of) #'identity and it works perfectly. Now I realise my hash function only samples the bottom 16 bits of an integer.
7:39:57
no-defun-allowed
It sounds stupid, but FNV-1a takes in a "stream" of bytes and I decided to only provide two, giving us a whole 16 bits of entropy. That doesn't make it sound any less stupid :)
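For illustration, a Python sketch of 64-bit FNV-1a over a byte stream, together with the mistake described above (names are mine, not SICL's): feeding only the low two bytes means any two keys that agree in their bottom 16 bits collide, no matter how wide the hash output is.

```python
# 64-bit FNV-1a: xor in each byte, then multiply by the FNV prime.
FNV_OFFSET_BASIS_64 = 0xcbf29ce484222325
FNV_PRIME_64 = 0x100000001b3

def fnv1a_64(data: bytes) -> int:
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & 0xFFFFFFFFFFFFFFFF
    return h

def bad_hash(n: int) -> int:
    # The bug described above: only the bottom 16 bits of n reach the hash.
    return fnv1a_64(bytes([n & 0xFF, (n >> 8) & 0xFF]))

# bad_hash collides for any two integers that agree in their low 16 bits:
assert bad_hash(1) == bad_hash(1 + (1 << 16))
```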
7:41:55
no-defun-allowed
My bad, the hash function used to mix in the stuff we hash. It is described in https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1_hash
7:43:01
no-defun-allowed
And that page suggests I shouldn't use that, in order to avoid complexity attacks: "For many years, the Python programming language used a slightly modified version of the FNV hash for its default hash function. This was changed in Python 3.4 in order to protect from DoS attacks to Python applications." That's great.
7:45:50
no-defun-allowed
I said "That's great" because one of my initial goals was to avoid such attacks on hash tables. Guess I'll fix it for now, then go look for a safer hash function.
9:01:38
no-defun-allowed
I haven't found any documents describing how FNV-1a is attacked; all I found was that someone attempted to fix CPython by xor-ing the output of the old hash function (which also was a xor-multiply loop apparently) with some random value, which doesn't fix anything.
9:03:09
no-defun-allowed
Or, more accurately, any document describing how the hash function is attacked when it uses a random initial value. With no randomization, it is just a matter of time before someone finds a partial collision.
10:36:55
no-defun-allowed
So, for now, I am not worried about the choice of hash function. Here is the performance graph of tonight's work: <https://applied-langua.ge/~hayley/sicl-hash-tables.png> linear-probing-hash-table beats out bucket-hash-table at GETHASHing existing mappings, and the graph looks basically the same for misses.
11:41:05
beach
My idea for a register-allocation technique based on the page replacement algorithm OPT depends on a metric that I call "Estimated Distance to Use" or EDU, and so far I have been unable to come up with a technique for computing the EDU.
11:41:11
beach
But this morning I noticed that it is a linear-equation system. The problem, though, is that the system is large and very sparse, so it would be too costly to solve it with something like Gaussian elimination.
11:41:21
beach
Otherwise, the idea is as follows: The EDU is defined for each variable at program points before each instruction.
11:41:30
beach
The EDU of a variable V before instruction I (i.e., EDU[V,I]) is 0 if I has V as an input.
11:41:44
beach
If I has two successors, and V is live in only one successor J, and we estimate the probability that the branch from I to J is taken to be p, then EDU[V,I] = 1/p * EDU[V,J].
11:41:46
beach
Otherwise, let J and K be the successors of I and p the probability that the branch from I to J is taken; then EDU[V,I] = p * EDU[V,J] + (1-p) * EDU[V,K].
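The recurrence can be solved without Gaussian elimination by fixed-point iteration over the sparse system. Below is a Python sketch on a tiny hand-made CFG; the encoding and all names are hypothetical, and the +1 rule for a single successor is my assumption (the chat only gives the branch cases). The rules iterated: EDU is 0 where V is used, scales by 1/p when V is live in only the successor taken with probability p, and is the probability-weighted average when V is live in both successors.

```python
# Fixed-point sketch of the EDU recurrence on a tiny hand-made CFG.
# Encoding and names are hypothetical, not Cleavir's actual formulation.
# Each instruction maps to: (uses V?, [(successor, probability, V live there?)])
cfg = {
    "I0": (False, [("I1", 1.00, True)]),   # straight line toward the use
    "I1": (False, [("I2", 0.25, True),     # branch: V live only along I2
                   ("I3", 0.75, False)]),
    "I2": (False, [("I4", 1.00, True)]),   # straight line
    "I4": (True,  []),                     # uses V, so EDU = 0 here
    "I3": (False, []),                     # V dead here
}

def compute_edu(cfg, iterations=100):
    """Iterate the EDU rules until they converge (Gauss-Seidel style)."""
    edu = {i: 0.0 for i in cfg}
    for _ in range(iterations):
        for i, (uses_v, succs) in cfg.items():
            if uses_v:
                edu[i] = 0.0
            elif len(succs) == 1:
                # Assumed rule for one successor: one step closer to the use.
                edu[i] = 1.0 + edu[succs[0][0]]
            elif len(succs) == 2:
                (j, p, live_j), (k, _, live_k) = succs
                if live_j and live_k:
                    edu[i] = p * edu[j] + (1 - p) * edu[k]
                elif live_j:
                    edu[i] = (1.0 / p) * edu[j]        # live in J only
                else:
                    edu[i] = (1.0 / (1 - p)) * edu[k]  # live in K only
    return edu
```

On this example the branch taken with probability 1/4 quadruples the distance through it, so EDU at I1 is 4 and at I0 is 5. A loop in which V is live but unused would make the 1/p rule feed back into itself, which is exactly the hard case mentioned below.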
13:10:23
beach
The only real problem is with loops inside of which some variable is live but not used. The other cases are simple.
13:11:14
beach
However, I have also hinted in the past that register allocation should be done in loops first, from the innermost to the outermost.
14:27:40
beach
So, anyway, there are two completely separate phases to my idea. Phase 1 is to compute the EDUs and phase 2 is to use those EDUs to allocate registers. So I'll put phase 1 on hold for the time being and start thinking about phase 2.
15:14:26
heisig
Solving such a linear system efficiently should not be too hard. Especially since we don't need an exact solution - an approximation would be fine.
15:14:46
beach
It is also in chapter 10 of the Cleavir documentation in SICL/Code/Cleavir/Documentation.
15:17:58
beach
In the Cleavir documentation, the equations are not linear, but I think a linear system is a fine approximation.
15:52:58
beach
... which is too bad, because I am not really that interested in that particular architecture. But that's what we have for the time being. Oh, well.
15:59:03
heisig
Hmm, x86 could actually be gone soon. They will never be able to decode more than five instructions per cycle, and that is starting to hurt.
16:09:41
beach
Bike: What semantics do we attribute to UNBOX instructions? Are they responsible for checking that the input is of one of the right types (and signaling an error otherwise) and then applying the relevant unboxing operation? If so, is there any way that the type inferencer could annotate such an instruction so that some steps could be skipped in some cases?
16:11:54
Bike
boxing and unboxing is kind of still in flux. HIR unbox definitely checked types, though.
16:13:01
beach
That's what I thought. Yes, maybe. But it's tricky since more than one object type could work, and the unboxing operation would be different for each one.
16:14:25
beach
I guess it's perhaps only for ((un)signed-byte 64) that both a fixnum and a bignum might work.