freenode/#clasp - IRC Chatlog
18:19:13
yitzi
Still doing a local user install of cando dependencies since I don't want to wait for a complete clasp rebuild right now.
18:22:10
kpoeck
optimized define-unicode-tables.lisp (https://github.com/clasp-developers/clasp/pull/1013)
18:32:24
drmeister
kpoeck: Could you write an example of the previous code that will illustrate the problem?
19:32:25
karlosz
Bike: you suggested making the cleavir analysis weaker with respect to special bindings and stuff like that, right? because C++ unwind needs to take care of them. couldn't we reuse the same machinery that local unwind had?
19:33:08
karlosz
2. Could we maybe handle the mapc stuff and more by just testing the function for membership in the CL package? i suspect there won't be many ordinary functions in the CL package relying on C++ unwinding
19:33:41
Bike
C++ unwind doesn't take care of them, the compiler generates the code for it. it's just that that code is meant to run in the stack frame doing the binding; for example the old value of the variable is just a datum in the frame
19:33:47
karlosz
the problem with marking any kind of user function would be redefinition: you can't guarantee that the user won't put in some C++ function that requires unwinding
19:35:06
karlosz
Bike: i see. so i guess we'd have to "manually" unwind the stack and do every cleanup action in frame if that's the case
19:35:43
Bike
in other news, if i do map-ast-etc without consing, asdf with the ast-interpreter is now only slightly slower than using the compiler
19:36:08
Bike
without consing -> er, i mean without cleavir-ast:children and its consing, it still does the hash table thing
19:37:25
karlosz
cool. so just a bit more before it's hopefully faster. still wondrous to me that the hash tables are doing it
19:37:42
Bike
also on the topic of CL functions and unwinding, that might not actually be true. maphash takes a lock, for example
19:41:20
karlosz
ugh, doing specials binding with setjmp means we'd have to maintain a jmp buffer stack ourselves (sounding more and more like an ordinary machine code implementation)
19:42:06
karlosz
and cleavir doesn't really emit instructions to maintain the stack at that level, maybe that only happens in MIR, idk how sicl handles that
19:42:58
Bike
which is why i want to start with the cases where there are no bindings, cos yeah, could be a mess
19:43:17
Bike
dunno if you saw, i did a really dumb prototype to see how setjmp/longjmp performs and it was like twenty times faster than exceptions.
19:48:11
Bike
the builtin operators take differently typed arguments from gcc, so i worry how that's going to work out
19:51:30
karlosz
oh okay. i guess i was confused by this: "see how setjmp/longjmp performs and it was like twenty times faster than exceptions."
20:13:41
Bike
which was (block nil (mapc (lambda (x) (when x (return t))) list) nil) versus (core:call-with-setjmp (lambda (p) (mapc (lambda (x) (when x (core:longjmp p))) list) nil) (lambda () t))
20:13:55
kpoeck
drmeister the code that ran so slowly on the buildbot is here: https://gist.github.com/kpoeck/5e9b64834283dccff701b4fd45272a27
20:28:02
Bike
eclector read-from-string uses signal internally. kpoeck is reducing the use of read-from-string, so less signaling occurs, so things are faster. right?
20:28:21
drmeister
In the docker container - where I expected it to take 700x longer it's 7.5 seconds
20:29:28
drmeister
This code should replicate the bad code using read-from-string - or did I misunderstand what kpoeck provided?
20:36:21
drmeister
I'm going to have to futz around with the previous commit to reproduce the problem.
20:36:43
drmeister
This is why I wasn't that excited about changing the code - I had a really good example of the problem there.
20:38:54
scymtym
drmeister: not in the sense of being merged into eclector master anytime soon, but it passes all important tests in eclector if i remember correctly
20:39:46
scymtym
in any case, READ-FROM-STRING should not be very different compared to READ unless WITH-INPUT-FROM-STRING does something crazy in Clasp
20:41:22
kpoeck
drmeister I wonder whether we need "fork", as in our build process, to make the example really slow
20:42:08
scymtym
yes, i'm saying that 70,000 READ calls should be just as (or maybe almost as) bad unless WITH-INPUT-FROM-STRING does something crazy
20:43:11
kpoeck
Still believe that the pattern is different when compiling: we read source code, do transformations, generate code, read again ...
20:45:36
drmeister
This was the commit with terrible performance in the docker container and the buildbot.
20:47:44
scymtym
kpoeck: the equivalent would be READing 70,000 toplevel forms, because READ-FROM-STRING is like a non-recursive READ call. that entails setting up circularity tracking and other pre- and post-processing
20:48:41
scymtym
kpoeck: i have plans to delay the circularity tracking setup until the first #N= is encountered. that should help with READ-FROM-STRING and non-recursive READ calls
20:55:30
drmeister
I don't know what the buildbot is. I hit the problem in docker and that was the worst case I've ever seen.
20:56:22
drmeister
kpoeck: You pointed this problem out a long time ago. One of the cl-bench examples runs 100x slower than expected.
20:58:50
Bike
and the relation with signaling was what, having to far call to get to __cxa_throw or something?
20:59:54
drmeister
It's just that all compiled code uses the large code model - because library functions and data may be too far away to access with a 32bit relative address.
21:01:29
drmeister
I'm building in the docker container with the commit just before kpoeck's improvement.
21:02:29
drmeister
But you see why I wasn't in a hurry to incorporate your improvement kpoeck. This problem is so ephemeral, and then I had a really good test case.
21:09:38
kpoeck
I'd bet if I write a test file with a loop reading all of clasp source code with eclector we have the problem again
21:10:36
karlosz
okay, i updated the analysis code to include doing a coarse check for bind-instruction and unwind-protect-instruction
21:11:32
karlosz
Bike: that should be all the analysis needs to check for, right? the presence of bind-instruction and unwind-protect-instruction in every [catch->unwind] "intermediate" function
21:17:41
karlosz
checking the nested cleanups in a function itself is fine, but then you have to look deeper by relating the function-dynenvs together in a DAG
21:18:04
karlosz
because if you have different calls to the same function, you have to go up all the dynenvs
21:19:03
karlosz
Bike: is just having an escaping-catch-instruction enough interface for you to hook it up to the translator stuff you're doing?
21:19:39
Bike
i'd do it the other way though. make catch-instruction the general case and add an instruction for the simple case
21:25:50
Bike
i guess for a start i could do it the stupid way of defining intrinsics to call that return the value of setjmp
21:33:17
karlosz
if you paste in that code somewhere and call analyze-catches from my-hir-transformations before do-inlining it should just work
21:33:42
karlosz
it should just work after do-inlining as well, but the dag can get more complicated
21:35:38
karlosz
it shouldn't matter, since i work off of the DAG, but big interactions with inlining are a bit harder to graph
21:36:29
karlosz
definitely has to happen before PCV though, because cells block proper def-use chains
21:38:36
Bike
to see if an unwind is a non-escape unwind, you just check if the destination is an escape-catch
21:39:30
Bike
anyway, that reminds me of a tangentially related issue. can copy propagation or inlining coupled with PCV result in extra reads from the closure vector? like if you have (defun mcar (x) (if (primop:typeq x cons) (primop:car x) nil)) and inline that, it looks like it does a separate read for each x
21:45:22
Bike
i only looked at a quick disassembly but it seemed like there were already two reads. maybe i should look closer though.
21:45:36
Bike
anyway, besides being inefficient this would have thread safety problems, so we should avoid it
21:47:32
karlosz
yes. there is no general copy-prop going on at the moment before P-C-V, so you must be seeing an existing p-c-v thing
21:50:16
Bike
like if you have "read -> typeq -> read -> car", another thread could write to the cell after the typeq but before the next read
21:51:12
drmeister
kpoeck: Building with the old generated-encodings.lsp code in the docker container is taking a looooooooong time.
21:54:15
Bike
maybe you could have a binding assignment for the parameter of car, and not copy prop through it?
21:54:19
karlosz
yes, i know how to fix it. the solution is to keep the redundant assignments around and make p-c-v isolate X
21:54:36
karlosz
the copy prop in full-inlining-pass when converting binding assignments is the problem
21:55:02
frgo__
kpoeck: There's also https://sysdig.com/partners/docker/ - but I don't have experience with sysdig
21:55:22
karlosz
i made a change recently that broke that, since i wasn't thinking of that case, i'll send a PR to revert it
21:56:29
karlosz
as long as this lexical-location/binding-assignment/p-c-v stuff is thought out a bit differently
21:59:28
karlosz
but this is a problem because some lexical locations should not be subject to copy prop
21:59:55
karlosz
you're right we can just find every load value form and copy prop out from there and it should be fine though
22:02:40
karlosz
i just meant that there are obviously some useless assignments lying around and lexical locations that serve no purpose other than to copy
22:03:01
karlosz
but sometimes they are actually important, and i introduced binding assignments to make it more clear when that should be the case
22:03:12
karlosz
but the problem is there are more cases like that where it's dangerous to copy prop
22:03:27
karlosz
like when you had assignments introduced between dynenvs by eliminate-catches
22:03:49
karlosz
it's just not obvious at all from looking at the class of the datum whether it's actually important in some way
22:04:22
karlosz
but all the different types of lexical locations get lumped together (why is dynenv a lexical location and not some other class of datum?)
22:06:13
karlosz
it doesn't know it has to do some special logic because DE2 looks like a plain lexical location
22:08:19
karlosz
things like the lexical locations introduced by binding assignments are also special
22:09:11
karlosz
that's the lambda-var/lvar distinction in sbcl - one can be closed over, the other can't
22:10:09
karlosz
it's not a real issue right now, but i think once optimizations start being used more aggressively, treating everything as a lexical location will sort of be annoying
22:15:01
Bike
i've been thinking about things because at some point i want to introduce compiler support for CAS of lexical variables, and that will have to tie in to PCV somehow or another
22:15:52
Bike
possibly also assignments with an atomic order mark, though it'll only matter if it's closed over
22:20:45
karlosz
yeah. the thing with variables introduced by lambda is that they can become memory cells
22:22:17
karlosz
P-C-V won't distinguish between temporaries and lambda variables and just close over anything willy nilly
22:22:52
Bike
we also close over return values at the moment. it's dumb. i think that's removed in cleavir2 tho.
23:15:29
drmeister
::notify yitzi The docker image is working great. I think we can get started prototyping with this.
23:18:14
drmeister
::notify yitzi Ping me when you are online - I have some questions about portability of code from jupyter notebook to jupyter lab
23:25:40
Colleen
yitzi: drmeister said 10 minutes, 11 seconds ago: The docker image is working great. I think we can get started prototyping with this.
23:25:40
Colleen
yitzi: drmeister said 7 minutes, 26 seconds ago: Ping me when you are online - I have some questions about portability of code from jupyter notebook to jupyter lab
0:18:15
karlosz
Bike: i flamed the new ast-interpreter code here: ocf.io/~karlos/asdf-60s-new-ast-int.svg
0:21:41
karlosz
there's some consing for augment environment (maybe a list makes more sense than a hash table), but it seems like maybe it's interpreting more than it should be?
0:22:16
karlosz
i don't think it makes sense for the ast-interpreter to walk the same closure more than once, for example
0:23:36
karlosz
i find firefox is much better than chrome at actually rendering these svgs in particular
0:28:03
karlosz
well, it's not obvious to me from looking at asdf.lisp whether that's actually what is happening
0:47:04
karlosz
i've been seeing a pretty big build time regression (even since before kpoeck's changes) which i'm going to try and track down. The recent ast-interpreter changes and generated-encodings fix are saving 2 minutes, but it's still slower than it was last week
0:47:47
Bike
i made some pretty wide changes to how standard objects are allocated but i didn't see a slowdown with my commits