freenode/#clasp - IRC Chatlog
Search
16:14:27
beach
Because an element can be anywhere, due to there being collisions when it was inserted.
16:14:30
Bike
well, there's also an unbound marker, and if the iteration hits that it returns not found.
16:14:56
drmeister
I haven't written a function to tell me that about a hash table - but I have visually inspected them - it's not bad. Here's an example:
16:16:04
Bike
which makes sense because you could have two values hash to the same number, and then delete the earlier one in the table
16:16:28
beach
Ah, so it is in fact when the table is close to full that searching for a non-existing item is expensive.
16:18:31
beach
It would be interesting to compare the number of iterations on the average for open hashing compared to hasing per bucket.
16:18:33
drmeister
What I'm not doing is caching the previous searched for entry as sbcl does - stassats says that is a good thing to do.
16:19:10
kpoeck
Perhaps you could have a lot at the benchmark and verify that nothing important is missing
16:19:10
beach
drmeister: Do you have reasons to believe that this would apply to the use case in Cleavir?
16:20:55
beach
kpoeck: Well, execution time depends on lots of things, like how good the compiler is.
16:20:56
Bike
https://github.com/robert-strandh/SICL/blob/master/Code/Cleavir/Intermediate-representation/map-instructions-arbitrary-order.lisp#L17-L18 bam.
16:21:03
drmeister
stassats indicated the use case where it was useful. Traversing graphs wasn't it.
16:22:39
Bike
i guess you could do both? if you find the item, cache its location. if you don't find the item, cache the first place it could go
16:22:52
drmeister
The (sys:next-number) returns a fixnum that is incremented atomically and guaranteed never to be zero.
16:25:19
beach
I am trying to figure out whether there is something that could be done to improve the performance of hash tables in Clasp, so that no special case is needed.
16:26:24
beach
It appears to me that, if no consing is done, the main performance issue is the number of iterations to find (or not find) an item being looked for.
16:26:55
beach
Such numbers could then be compared to what is expected for a hash table with buckets.
16:30:16
beach
Or, let me ask this: what prompted the decision to use open hashing rather than bucket hashing?
16:30:44
beach
Presumably, you either did some comparison, or you had a hunch that open hashing would be a better choice.
16:37:13
drmeister
Ideas: If we knew roughly how many instructions we are dealing with we could pass :size to make-hash-table in the map-instructions-xxx functions.
16:38:22
drmeister
I think the problem is that inlining is mapping over the instructions again and again and again. So hash-tables are ballooning over and over again.
16:39:17
Bike
we could have the enter-instruction or something keep a list of all the instructions in that function, and have all the instructions refer to the enter or the list, and then mapping with arbitrary order would just be a linear loop
16:40:06
drmeister
Or decide that iterating over graphs is something that the graphs need to be good at and add a 'touched' slot to each instruction - but deal with the extra complexity.
16:41:44
drmeister
Or at the very least - add :rehash-size 4.0 to map-instructions-xxx make-hash-table to rehash less often.
16:42:43
drmeister
Or keep improving the inlining algorithm to iterate less over the instruction graph - we have been doing very well with this - how much improvement is there to be had?
16:53:06
drmeister
Sure - and there is some overhead to remove in the single-dispatch-generic-function call - but where else?
16:53:36
Bike
we can see whether we're going through more iterations than we need to like beach was saying
16:55:35
drmeister
Oh - but visual inspection of the example I posted above shows that it's usually at the index and less often one away.
16:56:49
drmeister
In other news - I cannot get compile-file-parallel to fail on ASDF when I do it by hand. Grrrr.
17:32:38
kpoeck
Defining the hashtables in (map-instructions-* ) like (make-hash-table :test #'eq :size 64 :rehash-size 4.0) seems to bring a 10% improvement in compilation time
17:47:34
karlosz
im hoping that these changes will make the map-instruction changes unnecessary, since those big chunks of map instructions are mostly coming from those
17:52:07
karlosz
kpoeck: i have a patch which entirely removes the map-instructions-owner block you are looking at, so it might not be necessary to try to hard to speed it up
17:52:24
drmeister
Here is some more data on clasp's hash-tables - the average search length isn't too bad.
17:52:55
drmeister
I create a 10000 element hash-table and add 1000 elements keyed on fresh cons cells
17:55:38
drmeister
I don't think there are a lot of improvements to be had in improving the hash table in terms of search length.
17:56:32
karlosz
i just updated my pull request here: https://github.com/robert-strandh/SICL/pull/131. do you think you could get a flamegraph for the second set of patches as well? kpoeck
17:57:24
drmeister
Bike: Can I push my sicl changes to a sicl branch? I'd like to move them to my linux machine.
18:09:01
drmeister
I added a 'touched' slot to instructions and I changed the map-instructions-xxx functions.
18:12:37
karlosz
Bike: by the way, i did the function-dag incremental thing as an update to the pull request. i noticed you had done something similar before and reverted. just wanted to make sure i didn't run into the same pitfall. do you remember what the issue was before?
18:13:07
Bike
karlosz: i don't think i ever worked out what the exact problem was, but i hit issues compiling slime.
18:13:47
Bike
clasp dev uses the clasp-changes branch of my repo. the master branch of my repo is unused
18:14:27
Bike
i mean, clasp actually ignores the branches, the commit just needs to be available somewhere
18:17:29
Bike
but if you make a forced change to clasp-changes, or something like that, the commit won't be up any more and clasp configure won't be able to download sicl
18:37:21
karlosz
Bike: did you ever try incrementally doing compute-destinies? i think ive got something worked out for it but its a little messy. after that goes away then the main bars for map-instructions in the flamegraphs will go away
18:40:59
karlosz
but now that my previously 12 hour build takes less than 6 hours, i can spare a few more self builds now. less inclined to believe its timing unreliability :)
19:13:43
karlosz
drmeister: you mean the linux perf stuff? i have no idea. ive never used any of those types of tools that the os provides before
20:51:52
drmeister
As a baseline - I built clasp 'dev' using the default everything on my linux workstation in 124min
22:11:46
drmeister
Bike: what was that SSA-like form that we ran into recently, where instead of PHI-nodes you use jumps with arguments.
22:12:10
Bike
i don't think it had a name. it was part of the extended blocks used by that jit thing you were looking at.
22:49:42
drmeister
I just finished building with karlosz' latest PR on my linux workstation. The time dropped from 124m10s to 83m
22:52:54
kpoeck
yes, the CL_DEFMETHOD T_sp HashTable_O::hash_table_setf_gethash(T_sp key, T_sp value) seems to be relevant
23:03:54
selwyn
The symbol HASH-TABLE-SETF-GETHASH already has a function bound to it and it is not a SingleDispatchGenericFunction - it cannot become a SingleDispatchGenericFunction
23:06:10
kpoeck
CL_DEFUN T_mv cl__gethash(T_sp key, HashTableBase_sp hashTable, T_sp default_value) { return hashTable->gethash(key, default_value); };
23:19:17
karlosz
are you all looking at the gethash call in compute-destinies? i plan to get rid of the bar entirely so it may be worthwhile to see if it is still expensive after that chunk is gone
23:27:58
drmeister
karlosz: We are looking at the (setf gethash) call in compute-destinies - if you get rid of that bar that would be great. Then it won't depend on hash-table's like it was. But it's still a good idea to get rid of overhead for (setf gethash).
23:30:14
drmeister
I am running into a problem though. When I try to load quicklisp/setup.lisp with your latest PR I get an error.
23:36:48
karlosz
but it certainly *looks* like an inlining problem. maybe you have some merge damage?
23:42:25
drmeister
git remote add karlosz http://github.com/karlosz/SICL.git ; git pull karlosz rui-v2
23:46:34
drmeister
The only other change is I (setf *compile-file-parallel* nil) in compile-file-parallel.lsp
23:55:05
karlosz
but without any of my patches, quicklisp wouldn't load in a reasonable amount of time
1:11:19
drmeister
The change to (setf gethash) is not perceptible. It built in 83min previously and now 82m10s
1:14:39
drmeister
Hmm, I'm not sure if that is in both runs or just this one. What is definitely just in the latest is converting hash-table-setf-gethash from a CL_DEFMETHOD to a CL_DEFUN
1:26:21
karlosz
drmeister: what does the backtrace look like? i have never seen an issue with asdf, but i also have a 3 day old clasp
2:07:46
karlosz
drmeister: did the build start failing after you pulled clasp? Bike made some inline.lisp changes yesterday that look relevant
2:10:46
karlosz
might want to try building with commit 727f6da7779a1ec4a4b330410e64ee04f4cfdae7 and see if that fixes things, since i don't have and i suspect kpoeck doesnt have the absolute latest clasp
3:20:23
karlosz
its interesting that cloning instructions for partial inlining takes more time than sparse conditional constant propagation
4:01:16
karlosz
https://github.com/robert-strandh/SICL/pull/131 updated to kill the compute-destinies bar in the flame graph (hopefully)
4:08:19
karlosz
if you wanted to go back to a particular commit in that branch, you can git checkout