freenode/#sicl - IRC Chatlog
5:50:08
no-defun-allowed
I was going to say that I had succeeded in porting Kulukundis's hash table, then I ran the tests and found I have a #<SICL-LINEAR-PROBING-HASH-TABLE:LINEAR-PROBING-HASH-TABLE :test EQL size -1/64 {100E29E5A3}>.
5:53:09
jackdaniel
now you could advertise sicl as the only implementation with negative consing - imagine the possibilities
5:54:46
no-defun-allowed
Just checked: Haskell already has snoc, which appears to negative-cons already. So I will probably have to fix that, as we aren't the first to invent it, and we'd look like weirdo Haskell users.
6:03:00
no-defun-allowed
Okay, now it works. I had accidentally broken the metadata scanning functions. However, the REHASHING tests provided are not necessarily testing the specification; the HyperSpec states "the actual amount by which the hash-table will grow upon rehash is implementation-dependent." and that :rehash-size is just "advice".
6:12:45
no-defun-allowed
In the case of this design, hash table sizes have to be powers of 2 and larger than one group, and resizing must always multiply the size by another (positive) power of 2 to preserve the size invariants.
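That invariant can be sketched as follows (a Python sketch with hypothetical names, not SICL's actual code). Per the HyperSpec, :rehash-size is either an integer increment or a float growth factor, but here it is only advice: the actual growth rounds the advised size up to the next power-of-2 multiple of the current size.

```python
# Sketch of the resize rule described above (hypothetical helper, not SICL code).
# Sizes must be powers of 2 and at least one metadata group wide, and growth
# multiplies the current size by a positive power of 2, so :rehash-size is
# honoured only as "advice".

GROUP_SIZE = 16  # assumed metadata group width

def next_size(current_size, rehash_size=2.0):
    """Smallest power-of-2 multiple of current_size >= the advised new size."""
    assert current_size >= GROUP_SIZE
    assert current_size & (current_size - 1) == 0  # power of 2
    # Float :rehash-size is a factor; integer :rehash-size is an increment.
    advised = (current_size * rehash_size if isinstance(rehash_size, float)
               else current_size + rehash_size)
    new_size = current_size * 2
    while new_size < advised:
        new_size *= 2
    return new_size
```

So a :rehash-size of 3.0 on a table of size 16 advises 48 entries, which rounds up to 64.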
6:36:42
beach
So I guess my next task is MIR-to-LIR, aside from the FIXMEs in HIR-to-MIR. The current MIR-to-LIR is very error-prone because I essentially put every operand in a register before issuing the operation, and then store the resulting register on the stack.
6:36:43
beach
There is no guarantee that I am not using the same register for more than one thing at the same time. So perhaps this is a good time to start working on register allocation.
6:39:41
no-defun-allowed
I found GETHASH is currently 30% or so faster on linear-probing hash tables than bucket hash tables, but only up to a certain size. When it is larger than that size, the linear-probing hash table becomes much slower, and I'm not immediately sure why.
6:42:35
no-defun-allowed
I hope it's not intrinsic; the graphs in the CppCon presentation didn't show the original table becoming much slower with larger tables.
7:20:24
no-defun-allowed
I ran the benchmark with smaller increments in hash table counts (using the E6 series instead of powers of 10), and there appears to be a sawtooth pattern on a plot of GETHASH time versus hash table count.
7:28:51
no-defun-allowed
My first guess is that it has to do with when we decide to resize, if it is periodic. Also, the graph looks like <https://applied-langua.ge/~hayley/gethash-versus-count.png>
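A toy model of that guess (Python, with illustrative constants only; not the SICL implementation): if lookups slow down as the load factor rises, and the table doubles whenever an insertion pushes the load factor past a fixed threshold, then the load factor plotted against entry count traces exactly this kind of sawtooth.

```python
# Toy model of periodic resizing, not SICL code. THRESHOLD is an assumed
# resize threshold; any lookup cost proportional to load factor will show
# the same sawtooth against entry count.

THRESHOLD = 0.8

def load_factor_after(n, initial_size=16):
    """Load factor of a table that doubles when an insertion crosses THRESHOLD."""
    size = initial_size
    for i in range(n):
        if (i + 1) / size > THRESHOLD:
            size *= 2
    return n / size

factors = [load_factor_after(n) for n in range(1, 200)]
# The sequence climbs toward THRESHOLD, then roughly halves at every doubling.
```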
7:30:32
beach
The purpose of a hash table is to have relatively constant access time, independent of the size.
7:34:34
no-defun-allowed
I changed the resize-threshold (from 0.8 to 0.5) and got another sawtooth pattern, but almost at a 180° phase offset. (A strange way to describe a graph.)
7:39:24
no-defun-allowed
Okay, I replaced the hash function with (the moral equivalent of) #'identity and it works perfectly. Now I realise my hash function only samples the bottom 16 bits of an integer.
7:39:57
no-defun-allowed
It sounds stupid, but FNV-1a takes in a "stream" of bytes and I decided to only provide two, giving us a whole 16 bits of entropy. That doesn't make it sound any less stupid :)
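For illustration, a Python sketch of 64-bit FNV-1a over a byte stream, together with the mistake described above (names are mine, not SICL's): feeding only the low two bytes means any two keys that agree in their bottom 16 bits collide, no matter how wide the hash output is.

```python
# 64-bit FNV-1a: xor in each byte, then multiply by the FNV prime.
FNV_OFFSET_BASIS_64 = 0xcbf29ce484222325
FNV_PRIME_64 = 0x100000001b3

def fnv1a_64(data: bytes) -> int:
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & 0xFFFFFFFFFFFFFFFF
    return h

def bad_hash(n: int) -> int:
    # The bug described above: only the bottom 16 bits of n reach the hash.
    return fnv1a_64(bytes([n & 0xFF, (n >> 8) & 0xFF]))

# bad_hash collides for any two integers that agree in their low 16 bits:
assert bad_hash(1) == bad_hash(1 + (1 << 16))
```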
7:41:55
no-defun-allowed
My bad, the hash function used to mix in the stuff we hash. It is described in https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1_hash
7:43:01
no-defun-allowed
And that page suggests I shouldn't use that, in order to avoid complexity attacks: "For many years, the Python programming language used a slightly modified version of the FNV hash for its default hash function. This was changed in Python 3.4 in order to protect from DoS attacks to Python applications." That's great.
7:45:50
no-defun-allowed
I said "That's great" because one of my initial goals was to avoid such attacks on hash tables. Guess I'll fix it for now, then go look for a safer hash function.
9:01:38
no-defun-allowed
I haven't found any documents describing how FNV-1a is attacked; all I found was that someone attempted to fix CPython by xor-ing the output of the old hash function (which also was a xor-multiply loop apparently) with some random value, which doesn't fix anything.
9:03:09
no-defun-allowed
Or, more accurately, any document describing how the hash function is attacked when it uses a random initial value. With no randomization, it is just a matter of time before someone finds a partial collision.
10:36:55
no-defun-allowed
So, for now, I am not worried about the choice of hash function. Here is the performance graph of tonight's work: <https://applied-langua.ge/~hayley/sicl-hash-tables.png> linear-probing-hash-table beats out bucket-hash-table at GETHASHing existing mappings, and the graph looks basically the same for misses.
11:41:05
beach
My idea for a register-allocation technique based on the page replacement algorithm OPT depends on a metric that I call "Estimated Distance to Use" or EDU, and so far I have been unable to come up with a technique for computing the EDU.
11:41:11
beach
But this morning I noticed that it is a linear-equation system. The problem, though, is that the system is large and very sparse, so it would be too costly to solve it with something like Gaussian elimination.
11:41:21
beach
Otherwise, the idea is as follows: The EDU is defined for each variable at program points before each instruction.
11:41:30
beach
The EDU of a variable V before instruction I (i.e., EDU[V,I]) is 0 if I has V as an input.
11:41:44
beach
If I has two successors, and V is live in only one successor J, and we estimate the probability that the branch from I to J is taken to be p, then EDU[V,I] = 1/p * EDU[V,J].
11:41:46
beach
Otherwise, let J and K be the successors of I and p the probability that the branch from I to J is taken; then EDU[V,I] = p * EDU[V,J] + (1-p) * EDU[V,K].
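The recurrence can be solved without Gaussian elimination by fixed-point iteration over the sparse system. Below is a Python sketch on a tiny hand-made CFG; the encoding and all names are hypothetical, and the +1 rule for a single successor is my assumption (the chat only gives the branch cases). The rules iterated: EDU is 0 where V is used, scales by 1/p when V is live in only the successor taken with probability p, and is the probability-weighted average when V is live in both successors.

```python
# Fixed-point sketch of the EDU recurrence on a tiny hand-made CFG.
# Encoding and names are hypothetical, not Cleavir's actual formulation.
# Each instruction maps to: (uses V?, [(successor, probability, V live there?)])
cfg = {
    "I0": (False, [("I1", 1.00, True)]),   # straight line toward the use
    "I1": (False, [("I2", 0.25, True),     # branch: V live only along I2
                   ("I3", 0.75, False)]),
    "I2": (False, [("I4", 1.00, True)]),   # straight line
    "I4": (True,  []),                     # uses V, so EDU = 0 here
    "I3": (False, []),                     # V dead here
}

def compute_edu(cfg, iterations=100):
    """Iterate the EDU rules until they converge (Gauss-Seidel style)."""
    edu = {i: 0.0 for i in cfg}
    for _ in range(iterations):
        for i, (uses_v, succs) in cfg.items():
            if uses_v:
                edu[i] = 0.0
            elif len(succs) == 1:
                # Assumed rule for one successor: one step closer to the use.
                edu[i] = 1.0 + edu[succs[0][0]]
            elif len(succs) == 2:
                (j, p, live_j), (k, _, live_k) = succs
                if live_j and live_k:
                    edu[i] = p * edu[j] + (1 - p) * edu[k]
                elif live_j:
                    edu[i] = (1.0 / p) * edu[j]        # live in J only
                else:
                    edu[i] = (1.0 / (1 - p)) * edu[k]  # live in K only
    return edu
```

On this example the branch taken with probability 1/4 quadruples the distance through it, so EDU at I1 is 4 and at I0 is 5. A loop in which V is live but unused would make the 1/p rule feed back into itself, which is exactly the hard case mentioned below.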
13:10:23
beach
The only real problem is with loops inside of which some variable is live but not used. The other cases are simple.
13:11:14
beach
However, I have also hinted in the past that register allocation should be done in loops first, from the innermost to the outermost.
14:27:40
beach
So, anyway, there are two completely separate phases to my idea. Phase 1 is to compute the EDUs and phase 2 is to use those EDUs to allocate registers. So I'll put phase 1 on hold for the time being and start thinking about phase 2.
15:14:26
heisig
Solving such a linear system efficiently should not be too hard. Especially since we don't need an exact solution - an approximation would be fine.
15:14:46
beach
It is also in chapter 10 of the Cleavir documentation in SICL/Code/Cleavir/Documentation.
15:17:58
beach
In the Cleavir documentation, the equations are not linear, but I think a linear system is a fine approximation.
15:52:58
beach
... which is too bad, because I am not really that interested in that particular architecture. But that's what we have for the time being. Oh, well.
15:59:03
heisig
Hmm, x86 could actually be gone soon. They will never be able to decode more than five instructions per cycle, and that is starting to hurt.
16:09:41
beach
Bike: What semantics do we attribute to UNBOX instructions? Are they responsible for checking that the input is of one of the right types (and signaling an error otherwise) and then applying the relevant unboxing operation? If so, is there any way that the type inferencer could annotate such an instruction so that some steps could be skipped in some cases?
16:11:54
Bike
boxing and unboxing is kind of still in flux. HIR unbox definitely checked types, though.
16:13:01
beach
That's what I thought. Yes, maybe. But it's tricky since more than one object type could work, and the unboxing operation would be different for each one.
16:14:25
beach
I guess it's perhaps only for ((un)signed-byte 64) that both a fixnum and a bignum might work.