freenode/#clasp - IRC Chatlog
19:11:01
drmeister
In memory - it's just a word - returning a reference to an atomic should just be the compiler treating it as a std::atomic<T>
19:13:19
Bike
it makes sense that it would work, i just got kind of frustrated at the C++ atomics interface
19:14:24
Colleen
karlosz: Bike said 4 hours, 13 minutes ago: local call idea: 1) lose local-call class 2) add slot to abstract-call that's either NIL or the FUNCTION being called 3) inline etc use that slot but leave enclose in place 4) a later, possibly client specific pass eliminates the enclose instruction if the calls don't need it
19:14:38
drmeister
In my second mission in Hyrule Warriors I destroyed an Ancient Guardian - then I had an ironsmith level up a soup ladle to level 6 and with it - I am leveling armies. I won't be going much farther with this.
19:19:23
drmeister
So now it searches the call-history alist for the receiver-class and calls the associated method. If it doesn't find one it searches the class-precedence-list of the receiver-class looking for methods that handle one of those classes and updates the call history and then dispatches to the method.
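The lookup described in this message can be sketched roughly as follows. This is a minimal Python illustration, not Clasp's implementation: the class name `SingleDispatchGF`, its fields, and the use of Python's MRO as a stand-in for the class-precedence-list are all assumptions made for the example.

```python
class SingleDispatchGF:
    """Hypothetical single-dispatch generic function with a call-history cache."""

    def __init__(self, name):
        self.name = name
        self.methods = {}        # specializer class -> method function
        self.call_history = []   # alist of (receiver class, method)
        self.slow_lookups = 0    # how often the slow path was taken

    def add_method(self, cls, function):
        self.methods[cls] = function
        self.call_history.clear()  # cached entries may now be stale

    def __call__(self, receiver, *args):
        receiver_class = type(receiver)
        # Fast path: linear search of the call-history alist.
        for cls, method in self.call_history:
            if cls is receiver_class:
                return method(receiver, *args)
        # Slow path: walk the class precedence list (Python's MRO here).
        self.slow_lookups += 1
        for cls in receiver_class.__mro__:
            method = self.methods.get(cls)
            if method is not None:
                # Key the new entry on the receiver's own class so the
                # next call with the same class hits the fast path.
                self.call_history.append((receiver_class, method))
                return method(receiver, *args)
        raise TypeError(f"no applicable method for {self.name} on {receiver_class.__name__}")
```

In this sketch, if the history entry were keyed on anything other than the receiver's class, every call would miss the cache and take the slow path again.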
19:21:53
drmeister
Hmmm, this does look kind of slow. aclasp building is getting bogged down on some of the files.
19:28:58
karlosz
Bike: what do we gain from doing that? the idea with deleting encloses as part of the analysis was so that we have an easy way of querying whether a function escapes or is only used in call position
19:29:59
karlosz
also i don't really understand why get rid of the class, since it seems like a good use of using a different class
19:30:22
Bike
i want to be able to do that for functions that have involved lambda lists and/or are mv-called
19:31:38
Bike
the question of whether it's only called seems orthogonal to the question of whether a closure needs to be allocated
19:32:41
karlosz
i don't mean to say enclose necessarily means closure allocation - closurettes prove that
19:33:35
Bike
i mean right now if we have a function that's only called, we'll transform calls into local calls if and only if its lambda list doesn't have &rest or &key, right?
19:35:35
karlosz
the interpolator can't handle hairier lambda lists, and clasp can't make direct calls out of local calls with hairier lambda lists, but that's fine
19:36:00
Bike
well yes, i mean i want to do it for any kind of lambda list, but clasp and possibly other clients might need a closure for that
19:36:04
karlosz
we can have the local call translator just fall back to a normal enclose + call style translation
19:36:27
karlosz
like, the local call translator just allocates the closure and calls it in the fallback case
19:36:53
karlosz
there's no issue with the local call translator allocating the closure because it's not being used first class so you don't get EQ problems
19:38:04
karlosz
translate-instruction for local call in the fallback case just does whatever enclose would have done and then calls it
19:38:48
karlosz
i mean, local call analysis is already optional, but this gives clients a way to choose the granularity of what it can handle directly
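The fallback karlosz describes could look roughly like the sketch below. It is written in Python purely for illustration; `Function`, `handles_directly`, `translate_local_call`, and the tuple "instructions" are hypothetical names, not Cleavir or Clasp APIs.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    lambda_list: tuple  # e.g. ("x", "&rest", "more")


def handles_directly(function):
    """Example client policy (an assumption): only plain lambda lists
    without &rest or &key can become direct calls."""
    return not ({"&rest", "&key"} & set(function.lambda_list))


def translate_local_call(function, args, can_call_directly):
    """Translate one local call into a list of pseudo-instructions."""
    if can_call_directly(function):
        # The client can emit a direct call with no closure at all.
        return [("direct-call", function.name, args)]
    # Fallback: do what enclose would have done, then call the result.
    # The closure is never used first class, so no EQ problems arise.
    return [("enclose", function.name), ("call", function.name, args)]
```

Passing the policy in as a parameter mirrors the point about granularity: each client decides which lambda lists it can handle directly.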
20:04:28
karlosz
Bike: the other reason for this suggestion is that we have way more flexibility if we don't use the enclose instruction in BIR
20:04:52
karlosz
like, with the strategy you proposed of reusing the enclose instruction, i don't think it would be possible to separate out the action of closure parsing and arg parsing
20:05:06
karlosz
but no matter how hairy the lambda list is, we can still skip closure parsing with local calls
20:07:34
drmeister
I was wrong about using the call-history being too slow to be useful. I had a bug in my call-history update - I didn't use the argument class and so the call-histories were screwed up and it kept falling back to the slow path. It's as fast as our old dispatching method.
20:10:40
drmeister
So that's good - we can use this and not have to come up with some Frankenstein data structures to get aclasp and bclasp built in a reasonable amount of time. After bclasp is up we can compile discriminating functions - that should just require a bit of tweaking.
20:18:18
karlosz
i mean, the enclose instruction encloses the entry point which loads out of the closure vector and parses arguments, right?
20:18:44
karlosz
we can just direct call with the environment augmented as arguments in the beginning
20:19:43
Bike
i can just have it allocate a closure first and then worry about more entry points as an improvement
20:29:19
karlosz
asdf compiles faster and conses less, i'm guessing because of tighter code generation
20:30:31
karlosz
right now im struggling with pass management stuff, since meta evaluation is the first "real" lisp level optimization that can apply over and over again which we do in bir
20:30:57
karlosz
right now i just run it 3 times over the graph, but that doesn't catch everything and is sometimes unnecessary
20:31:17
karlosz
since eliminating an IF IF and then folding an unreachable branch can cause more optimizations
20:32:16
karlosz
since we "binary" merge currently (things like (if ... ... (if ...)) generate a "redundant" merge block), something like (if (cond ... ... ...) ...) takes as many passes as there are cond branches to IF-IF eliminate everything
20:32:42
karlosz
so rather than hardwiring the number of passes to grovel over the graph we need to find a smarter way to fixpoint to optimality
20:33:31
karlosz
Python does it with reoptimize slots on nodes, blocks, and components, but introducing slots on BIR objects for a bir transformation optimization pass is of course meh
20:34:21
karlosz
i'm currently thinking about a work queue which carefully keeps things in forward flow so we don't need to examine more than we need to, but that requires some coordination with everything
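One way such a work queue could be organized is sketched below, in Python for illustration only. The function `run_to_fixpoint` and its parameters (`order`, `dependents`, `transform`) are hypothetical, not part of Cleavir; the sketch assumes `transform` eventually stops reporting changes.

```python
import heapq


def run_to_fixpoint(order, dependents, transform):
    """Apply `transform` until nothing changes, visiting nodes in flow order.

    order      -- nodes listed in forward-flow order
    dependents -- dict: node -> nodes worth re-examining when it changes
    transform  -- callable(node) -> True if it changed something
    Returns the number of node visits performed.
    """
    position = {node: i for i, node in enumerate(order)}
    # (position, node) pairs in ascending order already form a valid heap.
    heap = [(i, node) for i, node in enumerate(order)]
    queued = set(order)
    visits = 0
    while heap:
        _, node = heapq.heappop(heap)
        queued.discard(node)
        visits += 1
        if transform(node):
            # Only requeue what the change could have affected.
            for dep in dependents.get(node, ()):
                if dep not in queued:
                    queued.add(dep)
                    heapq.heappush(heap, (position[dep], dep))
    return visits
```

Compared with a hardwired number of whole-graph passes, the queue only revisits nodes whose inputs actually changed, and the priority order keeps earlier nodes ahead of later ones.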
20:34:59
karlosz
also we probably want to push types in conjunction with the other abstract domain types in the metaeval "phase" too, so gotta keep that in mind
20:37:54
drmeister
I mean good going all around - this is really an impressive team effort bringing BIR online and now using it to improve things.
20:46:20
drmeister
I pushed the single dispatch generic function changes to the 'master' branch. It will remove those annoying debug messages.
20:56:49
drmeister
::notify selwyn Sorry to bring this up but the new-rebuild-dist doesn't work like rebuild-dist for quickclasp. new-rebuild-dist fails and dumps me in the sbcl command line.
21:10:04
drmeister
Because I think I just improved quicklisp compilation by about 20% and I wanted to know if that was me or you.
21:11:23
karlosz
drmeister: it's sitting as this PR... https://github.com/clasp-developers/clasp/pull/1094
23:23:04
drmeister
Time real(148.430 secs) run(148.432 secs) consed(9568808064 bytes) interps(68) unwinds(1767)
23:23:33
drmeister
Time real(157.104 secs) run(157.105 secs) consed(9902079664 bytes) interps(5) unwinds(1767)
0:16:13
karlosz
drmeister: well, it should show in the disassembly, but an easy way to see if it's firing is to uncomment this https://github.com/s-expressionists/Cleavir/blob/ae7cd58d00f6ed586c5c0f752091d6809a864f03/BIR-transformations/meta-evaluate.lisp#L112
0:17:30
karlosz
i don't know if your new single dispatch changes are supposed to cause consing to go down significantly, but the if-if changes did cause reduced consing because fewer llvm instructions got consed up
3:25:37
drmeister
That is probably your stuff. The single dispatch changes shouldn't change how much consing is going on - not 250 mb
3:26:27
drmeister
::notify Bike We are getting a weird crash on the buildbot with the deploy script. Not anything else, not macOS, not my linux machine - just the buildbot. The error and some of the backtrace is here: https://gist.github.com/drmeister/8232a97e40b55f477bcbdef56fba3fa1
3:28:29
drmeister
It's very reproducible - it's happened five times in a row with different commits. Any ideas?
4:58:14
karlosz
wow, some incredibly simple changes now get ironclad compiling like: Time real(125.112 secs) run(125.114 secs) consed(8553760464 bytes) interps(2) unwinds(1540)
5:00:53
karlosz
::notify Bike you're not going to believe this, but a 3 line change causes an entire gigabyte less of consing while compiling ironclad and also shaves 20 seconds off the whole kaboodle from 2 mins 20 seconds to 2 mins. the change looks like this... https://paste.gnome.org/pryehpgkv
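As a rough check of the "gigabyte less" figure, the arithmetic can be worked through as below. This assumes the baseline is the 23:23:04 timing line quoted earlier in the log, which the log does not confirm; the quoted "2 mins 20 seconds" suggests the actual baseline run may have been a different one.

```python
# Assumed baseline: the 23:23:04 line (not confirmed to be the same benchmark).
before = {"real_secs": 148.430, "consed_bytes": 9_568_808_064}
# The 4:58:14 ironclad line.
after = {"real_secs": 125.112, "consed_bytes": 8_553_760_464}

saved_bytes = before["consed_bytes"] - after["consed_bytes"]
saved_secs = before["real_secs"] - after["real_secs"]

print(f"consing saved: {saved_bytes} bytes ({saved_bytes / 1e9:.3f} GB)")
print(f"time saved:    {saved_secs:.3f} secs")
```

Under that assumption the consing difference is a little over 10^9 bytes, consistent with "an entire gigabyte".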
5:01:55
karlosz
since (cl:car ...) and (cl:null ...) are so frequently inlined, making the expansions easier for the compiler to digest (not having to bring in the definition of EQ in, for example), makes the compiler's job a lot easier
5:02:38
karlosz
so i think rethinking how we implement primitive operations/transform stuff is going to be pretty important, since we want to have a better handle on the type directed machinery anyway
5:05:54
karlosz
how i noticed this was i just did https://paste.gnome.org/pmrzyzftg and noticed that's a shitton of initial BIR code just to compile CAR which we need to crunch through on every inline
6:49:26
karlosz
drmeister: it's just code corresponding to this: https://github.com/clasp-developers/clasp/blob/3a758559ed3ac3b2868a97743d3ede111d47aefb/src/lisp/kernel/cleavir/inline.lisp#L310
6:52:02
karlosz
hence why removing the call to eq (which needs to get inlined) reduces the amount the compiler has to process
6:52:26
karlosz
it only makes such a dramatic effect because these things like CAR are so ubiquitous