freenode/#clasp - IRC Chatlog
13:34:14
Colleen
Bike: karlosz said 8 hours, 42 minutes ago: there's a problem with the way arguments are checked in interpolate-function. this will for example interpolate if you have 2 invalid calls but 1 valid call: https://github.com/s-expressionists/Cleavir/blob/7166084dc669413f9031a773cfa46436d20aba8a/BIR-transformations/interpolate-function.lisp#L308
14:58:59
drmeister
A thing where you can run icando-boehm to start slime and THEN build all of the cando quicklisp code.
14:59:25
drmeister
So compilation problems of the quicklisp code can be debugged with all the facilities of slime.
15:01:18
drmeister
Bike: If you are developing with clasp you can clone cando into the clasp/extensions directory and build as normal. There are more C++ files - but you do those once and they are done.
15:02:13
drmeister
<path-to-clasp>/build/boehm/icando-boehm -f dont-start-cando-user -f force-compile-file-serial
15:03:02
drmeister
It will start up just like you are used to and do a little more work setting up directories and load quicklisp automatically.
15:03:42
drmeister
Then you can do all your normal development stuff but when you want to see if cando's quicklisp code all builds you evaluate (start-cando-user)
15:04:29
drmeister
Right - then you have sldb and inspectors and all that for debugging code generation problems.
15:05:48
drmeister
So while you were out "living your life and exercising good work/life balance" - we were here slaving over a hot compiler.
15:49:23
yitzi
drmeister: Not sure if you got my response on the Cando arch stuff on gchat. Did you need some clarification?
15:49:52
Bike
karlosz: https://github.com/s-expressionists/Cleavir/blob/master/BIR-transformations/inline.lisp#L49 why is this "unless" rather than "when"?
16:39:06
Bike
well i ask because if i make it a when, it makes the compilation failure in the tests go away
16:43:27
karlosz
i mean the idea is that we'll clean up encloses if all non-local call references disappear
16:53:17
karlosz
the interpolation thing with the "generalized" local call thing is less broken now, but still broken
16:54:36
karlosz
we could of course solve it by defining local calls as "legal calls to functions in the same module"
16:54:56
karlosz
but to be honest it's not even that important because you'll only see problems if you write obviously invalid code
16:59:29
Bike
the main thing was separating determination of what can be interpolated (lambdalist wise) versus what's a local call. if we put something else back in to check legality i don't mind too much
17:20:27
karlosz
er, at least i don't see why keywords will be less weird if we do the legality checking in the client vs straight away in cleavir
18:16:13
Bike
if we contify local mv calls, and why not, it would be more efficient to have a single values/arguments processor, unlike with local calls we have now where each call site feeds into a shared phi (if i'm reading correctly)
18:37:18
Bike
actually i suppose there's no reason we can't contify mv calls and regular calls together, not that it would come up much in practice i imagine
18:41:05
Bike
okay, i mean, for a multiple-value-call, we look at the number of values in the values location, and based on that number assign whether options are taken from the values vector or have suppliedp nil, etc., that kind of thing.
18:42:21
karlosz
well if the phi had the type annotated onto it like (&values t t t t) or something, then that info would still be there
18:42:46
karlosz
but i guess we'd want to duplicate the call site for each phi def like with if-if elim and then contify that way
18:43:21
karlosz
so you're probably right that it would be more efficient if we could avoid the phi feeding
18:44:25
Bike
i was imagining some kind of more involved multiple-to-fixed instruction, which would be complicated
18:45:47
Bike
well, right now i'm thinking in terms of how to make multiple-value-call primitive kind of like sbcl does
18:45:48
karlosz
the only thing i was hoping for was that we could delay these coercion instructions till later, but maybe with mv-bind that doesn't make sense
18:48:00
Bike
it doesn't leave closures around for multiple-value-bind, right? no way. does it leave a (mv) call?
18:50:39
karlosz
https://github.com/sbcl/sbcl/blob/1c10a449acf5f892e55e9bf94f85f291cfbacc74/src/compiler/ir2tran.lisp#L963
18:52:23
Bike
i mean in bir terms, if we leave in the call that means we have variables that are now shared, etc
18:55:08
karlosz
DX analysis knows that if the function doesn't have an enclose (i.e. doesn't escape)
18:57:51
karlosz
in sbcl there's no such thing as interpolation, clambdas (our bir:function) just always stay around and the backend knows how to compile them like lets or whatever
18:59:31
karlosz
the body of a clambda (mv-call or not) never gets interpolated into a caller at the ir level
19:01:08
karlosz
yeah i mean in the backend what it does is that it just loads the arguments from the environment and then does a jump into the function body
19:03:25
karlosz
(flet ((f () (print x))) (mv-call #'f (whatever) (whatever))) not allocating a closure for f would be nice
19:06:44
karlosz
Bike: by the way, could you merge the clasp PR? it's just a simple thing and doesn't require cleavir changes
19:14:33
Bike
maybe we should have the cmacros but also keep the inline definitions, for more exotic uses? i guess that's a little far off to worry about
19:19:03
karlosz
anything wrong with the trick of just doing (defun not (x) (not x)) so that the cmacro def is used?
19:20:21
Bike
i don't like sbcl's stub definitions. like, i understand why and how they work, but still, i prefer how cleavir breaks out the primops
19:28:12
karlosz
this does surprise me a little: sbcl actually heap allocates a value cell for mv-bind https://github.com/sbcl/sbcl/blob/1c10a449acf5f892e55e9bf94f85f291cfbacc74/src/compiler/ir2tran.lisp#L1581
19:33:05
Bike
i think our arguments.lsp code might be generalized enough already that doing it to make local-mv-calls use just the main entry point might not be... totally terrible
19:45:29
karlosz
i actually tried to do rest list allocation in the caller but couldn't figure out how
19:46:30
karlosz
*sigh* it would be great if we could just have one unified lambda list parser once and for all
19:48:22
karlosz
lambda list parsing is one of those things where i really don't think each client needs to have their own hundreds of lines parser which all will end up pretty much looking the same
19:48:46
Bike
hmm. right. the arguments.lsp code expects an actual va-list because it's passed to the &rest list allocator function. which is why we do that horrible rewinding thing...
19:55:06
karlosz
is there any way we can make eclector get compiled first by cleavir in the bootstrap process?
19:55:55
karlosz
since the reader is so performance critical and benefits from all the unwinding optimizations i wonder if we could get an easy bootstrap time win just from hoisting it up in the build process
19:58:13
drmeister
Note: eclector is load/compiled first using bclasp. Then it gets compile-file'd but not loaded until you run cclasp.
19:59:19
karlosz
right. i was thinking if we could use cclasp'd compiled eclector as soon as possible while bootstrapping we'd bootstrap faster
20:00:11
drmeister
Yeah - that's been a problem forever. I think unless we implement a Common Lisp compiler in C++ we are stuck with that.
20:00:43
Bike
but maybe we could do eclector, since we don't actually need to load eclector until after the cclasp compiler is fully loaded, right?
20:01:36
karlosz
oh yeah, i mean i guess i don't have evidence besides the fact that we've seen unwinding really slows down the reader
20:03:12
drmeister
But half of the build time is spent compiling C++, then we build aclasp, then we build bclasp and then we build cclasp.
20:03:47
drmeister
The bclasp compiled eclector may be doing a lot of unwinding - but maybe not. We should check that first.
20:04:41
drmeister
bclasp used to nest try/catch blocks and that led to less unwinding than cclasp. But I don't recall what it's doing these days.
20:06:02
drmeister
If we really want to move the needle - we could write a clisp/ecl-like Common Lisp bytecode compiler in C++.
20:06:40
Bike
drmeister: another question. did we have a reason to do the va list unwinding besides argument parsing? something about the register save area maybe?
20:07:48
drmeister
I think we should move to multiple entry points before we do any more screwing around with va list unwinding.
20:07:51
Bike
i'm looking at arguments.lsp again and wondering. god help me. i know i tried it before
20:08:23
drmeister
We should set up the multiple entry points in the FunctionDescription objects. We can do it and test it without breaking everything.
20:08:44
Bike
well, karlosz's addition of the "main" entry point already opens things up. i'm wondering about extending the "local call" mechanism such that we never allocate closures for local functions that are only called
20:08:54
drmeister
We can set things up to switch to it on the fly and test it within a working system.
20:10:46
drmeister
Yeah - we need to start analyzing what will be involved. We allocate a vector of entry points in FunctionDescription and we update the FunctionDescription struct in functor.h and in cmpIntrinsics.lsp (I think)
20:11:20
drmeister
From Common Lisp we need to generate those entry point functions and we need to do it in C++.
20:13:40
karlosz
oh, hey, look at this: i just profiled 60 seconds of the initial cclasp bootstrap (after cleavir is loaded into bclasp before eclector is compiled with cclasp) ocf.io/~karlos/60s-cclasp-bootstrap.svg
20:14:55
drmeister
I don't see exactly how we will do this from C++. Maybe a template struct that stores the real function entry point and a bunch of static functions entry_point0(T_O* closure), entry_point1(T_O* closure, T_O* arg0), entry_point2(T_O* closure, T_O* arg0, T_O* arg1) ... entry_pointN(T_O* closure, T_O* arg0, T_O* arg1, T_O* arg2, T_O* arg3, T_O* arg4, ...)
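A minimal sketch of the template-struct idea drmeister describes, under stated assumptions: the `T_O` here is a hypothetical stand-in (not Clasp's real type), and `GeneralEntry`, `EntryPoints`, and `sum_args` are names invented for illustration. It only shows how fixed-arity static entry points could forward to one general entry point; it is not how Clasp implements it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Clasp's T_O; the real type lives in Clasp's headers.
struct T_O { int value; };

// Assumed shape of the "real" entry point: closure, argument count, argument array.
using GeneralEntry = T_O* (*)(T_O* closure, std::size_t nargs, T_O** args);

// One instantiation per wrapped function. Fn is a compile-time constant,
// so each fixed-arity entry point forwards to it directly.
template <GeneralEntry Fn>
struct EntryPoints {
  static T_O* entry_point0(T_O* closure) {
    return Fn(closure, 0, nullptr);
  }
  static T_O* entry_point1(T_O* closure, T_O* arg0) {
    T_O* args[] = {arg0};
    return Fn(closure, 1, args);
  }
  static T_O* entry_point2(T_O* closure, T_O* arg0, T_O* arg1) {
    T_O* args[] = {arg0, arg1};
    return Fn(closure, 2, args);
  }
};

// Example general function (illustration only): sums the argument values
// into a static result object and returns a pointer to it.
inline T_O* sum_args(T_O* /*closure*/, std::size_t nargs, T_O** args) {
  static T_O result{0};
  result.value = 0;
  for (std::size_t i = 0; i < nargs; ++i) result.value += args[i]->value;
  return &result;
}
```

A caller that knows the arity at the call site would pick the matching member, e.g. `EntryPoints<sum_args>::entry_point2(closure, a, b)`, and skip building a va_list.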
20:15:28
Bike
when you say from C++, you mean like, constructing lisp closures for C++ functions we call?
20:16:38
drmeister
So what do you think? load all the cleavir code - compile-file eclector and then load the eclector code?
20:17:44
karlosz
yeah, if we can get a cclasp compiled eclector running as early as possible it should mitigate this
20:18:10
drmeister
Bike: I mean calling a function. We have things like core__foo(int x) - we need to call it through this multiple entry point approach.
20:20:39
karlosz
we just need to load the eclector stuff again during the loading/compiling ... stage but after we've switched to using cclasp
20:20:46
drmeister
I dunno - it's a little fussy. Right now the list of Common Lisp files that cclasp compiles are passed as command line arguments.
20:20:48
drmeister
https://github.com/clasp-developers/clasp/blob/master/src/lisp/kernel/clasp-builder.lsp#L830
20:22:40
drmeister
Now, if you just want to compile-file a few of them and then load them - things get tricky.
20:23:02
drmeister
I'm doing a little bit of mucking around by moving some files earlier in the compilation.
20:23:02
drmeister
https://github.com/clasp-developers/clasp/blob/master/src/lisp/kernel/clasp-builder.lsp#L826
20:23:24
drmeister
The order of the files is important because it is used to both compile-file the files and link the files.
20:24:10
karlosz
i guess it's actually just a few files that have the offending functions with lots of bclasp unwinding
20:24:39
drmeister
You know what? You are a great engineer - go to town on it - just don't commit it until I can take a look at it, ok?
20:28:12
drmeister
I ask because I started bypassing eclector in the chemistry file readers when unwinding was bad.
20:32:13
Bike
the issue-982-place or whatever waf test error doesn't seem to happen with the newest cleavir/clasp. instead there's something due to invalid local calls. mrm.
21:01:12
Bike
we're not going to get rid of it, it's just a question of using the compiled faster version
21:05:24
Bike
i mean we can't get rid of it. the C++ reader, besides being buggy, doesn't produce csts, and we're not gonna put in the effort to make it do so
21:12:07
karlosz
but then i get an error when it's done compiling everything and loading the new image
21:12:55
Bike
like, it dumps an inline ast, and then while reconstructing it at load time it fucks up
21:13:45
karlosz
the good news is that it actually compiles much faster and the new flamegraph has eclector at 1% instead of 85%
21:15:37
karlosz
Bike: i'm not sure why it would mess up though. i mean all i'm doing is recompiling and loading eclector after inline.lisp and then transform.lisp get loaded
21:19:31
karlosz
i.e. Eclector is now double loaded by cclasp right after inline.lisp and transform.lisp so that the rest of cclasp bootstrapping can use the no unwinding eclector
21:19:50
karlosz
yeah, i'm not sure why Eclector would get loaded in the earlier image just given the diff i put
21:19:51
drmeister
karlosz: We've been around the block on these boot strapping issues: can you try... ./build/boehm/iclasp-boehm -f debug-startup
21:21:58
drmeister
That should print a message for every top level form that is evaluated at startup.
21:27:15
karlosz
my suspicion is that https://paste.gnome.org/p3zwaoq3q is the reason why it works with the parallel build and not with the serial build
22:15:39
Bike
drmeister: so what we could do instead of &va-rest is just pass arguments on the stack, right? and then directly access them with an index. if we have a count of args passed that should be fine
22:16:05
Bike
(also, i think rather than specify &more sbcl tries to derive it? maybe? flipping through sources here)
22:22:45
drmeister
The &va-rest was a way to access variable numbers of arguments in cases where you don't need to cons a &rest list. Does what you suggest do that?
22:23:59
Bike
the extra arguments would just be on the stack. there'd be no data structure connecting them.
22:35:13
drmeister
Because if so - then if it escapes - then we cons it. If it doesn't we allocate 8 cons cells on the stack and copy the first 8 &rest arguments into that and cons the rest. Badda bing - you optimized 90% of the cases.
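A rough sketch of the non-escaping case drmeister outlines: the first 8 &rest arguments go into cons cells that live in the caller's frame, and only the overflow is heap-allocated. Everything here is an assumption for illustration: `Cons`, `RestBuffer`, `build_rest_list`, and `list_length` are made-up names, arguments are plain `int`s, and `std::unique_ptr` stands in for GC allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy cons cell holding an int; a real one would hold tagged Lisp objects.
struct Cons {
  int car;
  Cons* cdr;
};

constexpr std::size_t kStackCells = 8;

// Storage the caller places in its own frame. The list built from it must
// not outlive this object, which is why this only suits non-escaping &rest.
struct RestBuffer {
  Cons stack_cells[kStackCells];
  std::vector<std::unique_ptr<Cons>> heap_cells;  // stands in for GC conses
};

// Build a list of the arguments: cells 0..7 come from the stack buffer,
// any further cells are heap-allocated one by one.
inline Cons* build_rest_list(const int* args, std::size_t nargs, RestBuffer& buf) {
  Cons* head = nullptr;
  Cons** tail = &head;
  for (std::size_t i = 0; i < nargs; ++i) {
    Cons* cell = nullptr;
    if (i < kStackCells) {
      cell = &buf.stack_cells[i];
    } else {
      buf.heap_cells.push_back(std::unique_ptr<Cons>(new Cons()));
      cell = buf.heap_cells.back().get();
    }
    cell->car = args[i];
    cell->cdr = nullptr;
    *tail = cell;
    tail = &cell->cdr;
  }
  return head;
}

// Count the cells in a list.
inline std::size_t list_length(const Cons* list) {
  std::size_t n = 0;
  for (; list != nullptr; list = list->cdr) ++n;
  return n;
}
```

With 10 arguments this touches the heap twice instead of ten times; with 8 or fewer it does not touch it at all.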
22:38:28
Bike
so i'm looking at what sbcl does and it's interesting. i think it automatically simplifies certain uses of &rest.
22:38:42
Bike
for example if a &rest argument is only used as an argument to nth, it doesn't bother consing a list, i think.
22:53:37
karlosz
i'm running a serial build with inline.lisp hoisted to the front of the build like with parallel in hopes that will fix the serial build
22:56:07
Bike
what's the load order? i mean these are all eclector files, right, has inline.lisp been loaded at this point?
22:59:36
Bike
inline.lisp is what turns on the compiler saving inline definitions. if you compile something before inline.lisp is compiled, during build, no inline definitions are saved.
22:59:49
Bike
so if you move this to be compiled afterward, now the inline definition is saved and will need to be loaded.
23:00:49
karlosz
we basically have that inline.lisp is compiled during build => inline definitions are now saved => eclector is compiled and loaded again, with saved inline definitions
23:01:09
karlosz
then since eclector is also loaded before inline.lisp, now we're trying to load it without the inline definitions
23:01:58
karlosz
so i think hoisting the compilation/loading of inline.lisp as with the fork build is probably the right thing to do, so we avoid problems in the future like this
23:02:46
karlosz
now i get: boehm/fasl/cclasp-boehm-bitcode/src/lisp/modules/serve-event/serve-event.fasp
23:08:06
drmeister
I don't know the answer - I'm rusty. But in the fork build every compile-file that happens after the source is loaded/compile'd is compiled from a clasp with no side effects from the other compile-file's.
23:09:11
drmeister
It's like you load/compile every cleavir source file and THEN compile-file one of them, shut the whole thing down, start it up again load/compile every cleavir source file and then compile-file another source file.
23:10:26
drmeister
No - the final image load order is the order of the files passed on the command-line from waf.
23:11:23
drmeister
Now, confession time - I don't really, REALLY understand what is going on. There may be some mongo huge bootstrapping problems here that I am blissfully unaware of.
23:11:56
drmeister
It works as well as it has. My plan is to implement save-image-and-die and finally not have to worry about this.
23:12:45
drmeister
beach has been wrestling with bootstrapping for years as well - it's a crazy hard problem. I feel like I'm in good company.
23:14:35
drmeister
Wait - it only reorders it in that it moves the compilation so that it starts earlier.
23:15:18
drmeister
Because it takes a long time to compile - you get more parallelism by moving it earlier. I'm pretty sure it doesn't affect the outcome one iota.
23:16:07
drmeister
If you fork on 8 processors it's better to start the large one early than to do it late.
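The scheduling point can be illustrated with a toy model, assuming each job is handed, in list order, to whichever worker frees up first. `makespan` is a hypothetical helper and the durations in the usage note are invented numbers, not measurements from the clasp build.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand each job, in the given order, to the worker that becomes free first,
// and return the total wall-clock time until every job has finished.
inline int makespan(const std::vector<int>& jobs, int workers) {
  assert(workers > 0);
  std::vector<int> busy_until(workers, 0);
  for (int duration : jobs) {
    auto next_free = std::min_element(busy_until.begin(), busy_until.end());
    *next_free += duration;
  }
  return *std::max_element(busy_until.begin(), busy_until.end());
}
```

For example, with 2 workers and made-up durations, `makespan({1, 1, 1, 1, 4}, 2)` puts the long job last, while `makespan({4, 1, 1, 1, 1}, 2)` starts it first and lets the short jobs fill the other worker in the meantime. The compiled output is the same either way; only the wall time changes.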
23:17:55
drmeister
These AST things - we are saving the ASTs of inlined functions - somewhere - right?
23:18:53
drmeister
We can't save any of them until cleavir is fully loaded. So no ASTs get generated until after inline.lisp is loaded/compiled
23:19:36
drmeister
Say you have an inline function foo that is inlined in bar and bar is also inlined...
23:20:34
drmeister
Does it help to put together a scenario like that? You have a foo-file.lisp and a bar-file.lisp and if they get compiled in the fork build they can't see each other.
23:22:23
drmeister
Maybe in the serial version the one that is compile-file'd earlier gets inlined into the one that is later?
23:23:48
karlosz
well, i do think Bike hit the nail on the head earlier. that Eclector is now saving its inline definitions but during the final image load Eclector gets loaded before inline.lisp, i.e. before those classes get defined
23:24:43
drmeister
Oh - if you guys understand what is going on then I will quietly back away and exit the conversation. I don't want to confuse things.
0:10:26
karlosz
drmeister: this is all that was needed to make it work: https://github.com/clasp-developers/clasp/pull/1096 . i'm seeing 3x speed increase for cclasp with serial build and 2x speed increase for cclasp bootstrap with fork build
0:10:42
karlosz
i suspect that on linux it will be even more pronounced because unwinding is more pronounced there
0:11:21
karlosz
the key was to only compile and load eclector again while building cclasp and not while cclasp is building itself
0:11:44
karlosz
that way the inline definition doesn't get saved and there won't be problems when loading in the final image