freenode/#clasp - IRC Chatlog
Search
13:09:50
kpoeck
Looking at the latest flamegraph for cl-bench (compiler) the widest blocks seem to be map-instructions-with-owner and map-instructions-arbitrary-order with 20% and 15%
13:11:48
kpoeck
I wonder whether this is faster with generic functions for the 2 instructions instead of the typep
13:23:25
kpoeck
drmeister: does generating the flamegraph slows the execution substantially down or is that just noise?
13:35:53
beach
drmeister: Can you explain in a few short phrases how Clasp hash tables are implemented?
13:36:52
beach
drmeister: I was just thinking, if that implementation can be improved, there might not be any need to avoid using hash tables in the compiler, and Clasp would be faster in general as well.
13:58:13
drmeister
kpoeck: I don't know if dtrace slows down the program that you are tracing - I think it's designed to minimize that as much as possible.
14:07:56
drmeister
beach: I have tried in the map-instructions-xxx functions to add :rehash-size 4.0 to the make-hash-table and that helps - about 10%
14:08:53
drmeister
General: I suspect that there is a problem with generic functions and multi-threading - if anyone sees hangs when building asdf - please tell me.
14:09:51
drmeister
I'm building with the serial compile-file - and I'm looking at set-funcallable-instance-function - I think it needs a per-generic function spin-lock.
14:11:41
kpoeck
Is that the thing with (find ',slot-name (clos:class-slots ,class) :key #'clos:slot-definition-name)
14:15:55
drmeister
Bike: I'll try it in about an hour. I'm building with compile-file-serial to see if that doesn't hang.
14:17:58
Bike
i know i said s-f-i-f needs a lock before, but i've been thinking about it and i can't come up with a way that it'll go wrong without one
14:18:29
Bike
it always sets the GFUN_DISPATCHER to something valid. it sets the entry point to either funcallable_entry_point (which is valid if the GFUN_DISPATCHER is valid), or to some other entry point but it won't do that if the entry point needs the GFUN_DISPATCHER.
14:18:53
drmeister
It hung twice now in DtreeInterpreter_O::entry_point and the second time it was definitely looking up the dtree of the interpreter
14:19:07
Bike
you might end up with a case where something is going through funcallable_entry_point that doesn't have to, but that won't actually be a break...
14:20:18
Bike
we should probably take out the special dtree interpreter o thing, too. since it's just a closure.
14:27:45
kpoeck
drmeister: there are about 140 CL_DEFUN T_mv in the codebase but a lot of them don't seem to actually return multiple-values, but just (Values(single-value))
14:28:26
Bike
it could end up in a situation where the function and the entry point are out of sync, but the only operational effect is get-funcallable-instance-function returning a function that's not actually being used, and who cares about that
14:40:46
beach
drmeister: How about instead of a vector of CONS cells, you just make every other element a key and every other element a value?
15:04:30
Bike
https://github.com/clasp-developers/clasp/blob/master/src/core/hashTable.cc#L736-L739 am i reading this right that if it needs to rehash it actually prints something??
15:05:40
drmeister
Bike: That only happens if the hash-table runs out of slots - that should never happen.
15:07:42
drmeister
I put that in there when I was fixing the rehash-thresholds throughout clasp so that they were all less than 1.0 - then I just left it in.
15:09:51
drmeister
There is this... https://github.com/clasp-developers/clasp/blob/dev/src/core/hashTable.cc#L338
15:09:53
Bike
i think we're sort of ignoring some of the finer aspects of rehashing control, but we're allowed to so whatever
15:10:23
Bike
strictly speaking it could be a type error but we don't have to and it's kind of unlikely
15:10:34
beach
Other than the two elements represented in the vector itself rather than as a CONS, there is no difference.
15:12:37
Bike
not hasing twice would be good too, but hashing is just arithmetic so it probably doesn't matter relatively speaking
15:15:26
drmeister
https://github.com/clasp-developers/clasp/blob/dev/include/clasp/core/hashTable.h#L87
15:16:35
drmeister
The actual table is a Vec0<Cons_O> - it's not a simple-vector of Cons_sp cells like I said earlier. I forgot I made it a vector of Cons_O cells - so there is no allocation unless there is a rehash.
15:20:57
Bike
we still have a few issues like "[x] conses" and having an easy way to check that would be good
15:22:32
drmeister
(time (loop for x below 1000 do (cons 'a 'b))) -> Time real(0.000 secs) run(0.000 secs) consed(16000 bytes)
15:22:48
drmeister
(time (loop for x below 10000 do (cons 'a 'b))) . -> Time real(0.000 secs) run(0.000 secs) consed(160000 bytes)
15:47:37
Bike
make sure it can build slime and some quicklisp libraries. that's where i started hitting problems before.
15:48:41
beach
OK, so no consing. That's good. Do you have statistics about how many iterations are required on the average when adding an element and when searching for one?
16:10:25
Bike
https://github.com/clasp-developers/clasp/blob/dev/src/core/hashTable.cc#L630-L640 here, for the read
16:12:12
beach
OK, so if you look for something and you find such an empty marker, you have to iterate, right?
16:12:54
beach
So if the hash table is nearly empty, and you are looking for something that isn't there, you must iterate over half the elements on the average?
16:14:27
beach
Because an element can be anywhere, due to there being collisions when it was inserted.
16:14:30
Bike
well, there's also an unbound marker, and if the iteration hits that it returns not found.
16:14:56
drmeister
I haven't written a function to tell me that about a hash table - but I have visually inspected them - it's not bad. Here's an example:
16:16:04
Bike
which makes sense because you could have two values hash to the same number, and then delete the earlier one in the table
16:16:28
beach
Ah, so it is in fact when the table is close to full that searching for a non-existing item is expensive.
16:18:31
beach
It would be interesting to compare the number of iterations on the average for open hashing compared to hasing per bucket.
16:18:33
drmeister
What I'm not doing is caching the previous searched for entry as sbcl does - stassats says that is a good thing to do.
16:19:10
kpoeck
Perhaps you could have a lot at the benchmark and verify that nothing important is missing
16:19:10
beach
drmeister: Do you have reasons to believe that this would apply to the use case in Cleavir?
16:20:55
beach
kpoeck: Well, execution time depends on lots of things, like how good the compiler is.
16:20:56
Bike
https://github.com/robert-strandh/SICL/blob/master/Code/Cleavir/Intermediate-representation/map-instructions-arbitrary-order.lisp#L17-L18 bam.
16:21:03
drmeister
stassats indicated the use case where it was useful. Traversing graphs wasn't it.
16:22:39
Bike
i guess you could do both? if you find the item, cache its location. if you don't find the item, cache the first place it could go
16:22:52
drmeister
The (sys:next-number) returns a fixnum that is incremented atomically and guaranteed never to be zero.
16:25:19
beach
I am trying to figure out whether there is something that could be done to improve the performance of hash tables in Clasp, so that no special case is needed.
16:26:24
beach
It appears to me that, if no consing is done, the main performance issue is the number of iterations to find (or not find) an item being looked for.
16:26:55
beach
Such numbers could then be compared to what is expected for a hash table with buckets.
16:30:16
beach
Or, let me ask this: what prompted the decision to use open hashing rather than bucket hashing?
16:30:44
beach
Presumably, you either did some comparison, or you had a hunch that open hashing would be a better choice.
16:37:13
drmeister
Ideas: If we knew roughly how many instructions we are dealing with we could pass :size to make-hash-table in the map-instructions-xxx functions.
16:38:22
drmeister
I think the problem is that inlining is mapping over the instructions again and again and again. So hash-tables are ballooning over and over again.
16:39:17
Bike
we could have the enter-instruction or something keep a list of all the instructions in that function, and have all the instructions refer to the enter or the list, and then mapping with arbitrary order would just be a linear loop
16:40:06
drmeister
Or decide that iterating over graphs is something that the graphs need to be good at and add a 'touched' slot to each instruction - but deal with the extra complexity.
16:41:44
drmeister
Or at the very least - add :rehash-size 4.0 to map-instructions-xxx make-hash-table to rehash less often.
16:42:43
drmeister
Or keep improving the inlining algorithm to iterate less over the instruction graph - we have been doing very well with this - how much improvement is there to be had?
16:53:06
drmeister
Sure - and there is some overhead to remove in the single-dispatch-generic-function call - but where else?
16:53:36
Bike
we can see whether we're going through more iterations than we need to like beach was saying
16:55:35
drmeister
Oh - but visual inspection of the example I posted above shows that it's usually at the index and less often one away.
16:56:49
drmeister
In other news - I cannot get compile-file-parallel to fail on ASDF when I do it by hand. Grrrr.
17:32:38
kpoeck
Defining the hashtables in (map-instructions-* ) like (make-hash-table :test #'eq :size 64 :rehash-size 4.0) seems to bring a 10% improvement in compilation time
17:47:34
karlosz
im hoping that these changes will make the map-instruction changes unnecessary, since those big chunks of map instructions are mostly coming from those
17:52:07
karlosz
kpoeck: i have a patch which entirely removes the map-instructions-owner block you are looking at, so it might not be necessary to try to hard to speed it up
17:52:24
drmeister
Here is some more data on clasp's hash-tables - the average search length isn't too bad.
17:52:55
drmeister
I create a 10000 element hash-table and add 1000 elements keyed on fresh cons cells
17:55:38
drmeister
I don't think there are a lot of improvements to be had in improving the hash table in terms of search length.
17:56:32
karlosz
i just updated my pull request here: https://github.com/robert-strandh/SICL/pull/131. do you think you could get a flamegraph for the second set of patches as well? kpoeck
17:57:24
drmeister
Bike: Can I push my sicl changes to a sicl branch? I'd like to move them to my linux machine.
18:09:01
drmeister
I added a 'touched' slot to instructions and I changed the map-instructions-xxx functions.
18:12:37
karlosz
Bike: by the way, i did the function-dag incremental thing as an update to the pull request. i noticed you had done something similar before and reverted. just wanted to make sure i didn't run into the same pitfall. do you remember what the issue was before?
18:13:07
Bike
karlosz: i don't think i ever worked out what the exact problem was, but i hit issues compiling slime.
18:13:47
Bike
clasp dev uses the clasp-changes branch of my repo. the master branch of my repo is unused
18:14:27
Bike
i mean, clasp actually ignores the branches, the commit just needs to be available somewhere
18:17:29
Bike
but if you make a forced change to clasp-changes, or something like that, the commit won't be up any more and clasp configure won't be able to download sicl
18:37:21
karlosz
Bike: did you ever try incrementally doing compute-destinies? i think ive got something worked out for it but its a little messy. after that goes away then the main bars for map-instructions in the flamegraphs will go away
18:40:59
karlosz
but now that my previously 12 hour build takes less than 6 hours, i can spare a few more self builds now. less inclined to believe its timing unreliability :)
19:13:43
karlosz
drmeister: you mean the linux perf stuff? i have no idea. ive never used any of those types of tools that the os provides before
20:51:52
drmeister
As a baseline - I built clasp 'dev' using the default everything on my linux workstation in 124min
22:11:46
drmeister
Bike: what was that SSA-like form that we ran into recently, where instead of PHI-nodes you use jumps with arguments.
22:12:10
Bike
i don't think it had a name. it was part of the extended blocks used by that jit thing you were looking at.