freenode/#clasp - IRC Chatlog
14:16:20
Bike
maybe C++ has some kind of annotation so the compiler can check that. but that would probably be too convenient
14:22:18
yitzi
drmeister: Sorry, but your email about the jupyter mockup and problems with radio buttons was in my spam. just saw it
14:22:34
drmeister
Do you think this will work? An optimization to detect when we can replace C++ exception handling with setjmp/longjmp?
14:23:02
drmeister
yitzi: No worries. You demonstrated radio buttons work in your latest docker image.
14:23:46
drmeister
yitzi: Take a look at my fork of cando-clj and I'll take a look at your Dockerfile - we can merge them.
14:24:11
yitzi
The issue is with the :options and :value keys. It's not part of the model spec. ipython simulates some of that stuff for select boxes. I've added some code to do that recently.
14:25:00
drmeister
Cando can start a swank server allowing an external slime to connect into it and modify code. So I build slime into the docker image.
14:25:05
yitzi
And yes, I am looking at your file. I think I have most of the stuff in mine with the exception of slime.
14:25:47
drmeister
So - what we did in the past is turn a docker image like this into a portable development environment that worked on macOS and Windows.
14:26:23
drmeister
You run the docker container and map in directories of source code from github on top of the existing source directories.
14:26:41
drmeister
You also map the quicklisp cache directory from the host into the docker container.
14:27:28
drmeister
It lets you edit source code and commit it to github and generally do software development using everything built within the docker container.
14:28:03
drmeister
One problem we had though is that a couple of years ago when we did this mounted directories were significantly slower than internal directories.
14:28:33
drmeister
So compilation slowed way down and it was slow already. It sometimes took hours for the docker container/development system to get ready for development.
14:31:56
Bike
alright, i tried a very dumb test, with a simple and inefficient use of setjmp/longjmp, and the version with block/return-from runs 20 times slower
14:32:22
drmeister
I'll need to do some digging into the Dockerfile to see which of my assumptions were wrong.
14:32:46
drmeister
Bike: Yeah - in the docker image it's single threaded compilation of generated-encodings.lisp.
14:33:58
yitzi
drmeister: Also, you won't have a cando kernel till I finish with some stuff on my end. Currently building on my machine...
14:36:33
drmeister
kpoeck: Is 'enconding-strong-to-encoding-symbol' supposed to be 'encoding-string-to-encoding-symbol' ?
14:37:54
drmeister
Thank you. And thank you for making these changes. The "slow" one was valuable to push us to fix this problem and the "fast" one to get through the next couple of days.
14:39:24
drmeister
It's just on some machines it's slower - and not a few percent slower, 700x slower!
14:40:29
Bike
what we've learned is that C++ implementors do not consider using exceptions to be reasonable.
14:47:18
yitzi
drmeister: Is there a simple one liner that I can use to test out the non-lisp cando syntax? Is it called leap?
14:50:22
Bike
well, __builtin_setjmp on clang has a different signature from the gcc one, on top of being undocumented
14:51:18
Bike
and i found a dev thread saying "It's not like this is a core language feature; it's completely acceptable to just not provide the builtins on certain targets that don't support it"
15:03:57
yitzi
The kernel crashes right afterward, which probably means the evaluation result isn't getting properly wrapped for the handoff to common-lisp-jupyter
15:04:58
yitzi
I stuck in a naive 'make-lisp-result' which probably isn't gonna work for your cando stuff.
15:06:28
yitzi
I pushed the docker image to yitzchak/cando-clj:nglview if you want to see it. No slime and a faulty cando kernel, but ... making progress.
16:23:04
Bike
i don't know. those are specifically listed as being for exception handling, and are generated internally rather than used in frontends. it might be unstable.
16:24:14
Bike
also when i tried to use the equivalent __builtin_setjmp it said it wanted a void**, whereas this says it wants an i64* in the text but an i8* in the signature
16:28:41
Bike
you can also see these jmp_bufs are heavier than the intrinsic ones, like they have a sigset_t, which i don't think we need
16:28:53
drmeister
attributes #3 = { nounwind returns_twice "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false"
16:32:30
Bike
setjmp is a macro and its return value can't be used normally, so calling out to a C++ function would be annoying
16:32:47
Bike
what i did in my test code is define a C++ function that takes two thunks, but that won't work for catch-instruction, really
16:33:05
Bike
i guess it could return a value and then code branches on the value. that kind of sucks though.
16:33:52
drmeister
kpoeck: I incorporated your change and docker is sitting compiling generated-encodings.lsp right now.
16:36:38
kpoeck
--35357(out)--> Writing :OBJECT kernel fasl file to: #P"/home/cracauer/work/clasp/deploy/02-build-cando/build/clasp/build/boehm/fasl/cclasp-boehm-bitcode/src/lisp/kernel/lsp/generated-encodings.fasl" --35357(out)--> Time run(21.027 secs) consed(483162960 bytes)
17:28:17
cracauer
That file is odd. I have one machine taking 26 seconds and another taking 2 seconds.
17:33:23
drmeister
cracauer: I'll call you to fill you in. We had a file that compiled 700x slower on one machine than another.
18:16:49
drmeister
I incorporated kpoeck's fix for faster compilation of that one file and I'm trying another fix for the no output in post_install.
18:19:13
yitzi
Still doing a local user install of cando dependencies since I don't want to wait for a complete clasp rebuild right now.
18:22:10
kpoeck
optimized define-unicode-tables.lisp (https://github.com/clasp-developers/clasp/pull/1013)
18:32:24
drmeister
kpoeck: Could you write an example of the previous code that will illustrate the problem?
19:32:25
karlosz
Bike: you suggested making the cleavir analysis weaker with respect to special bindings and stuff like that, right? because C++ unwind needs to take care of them. couldn't we reuse the same machinery that local unwind had?
19:33:08
karlosz
2. Could we maybe handle the mapc stuff and more by just testing the function for membership in the CL package? i suspect there won't be many ordinary functions in the CL package relying on C++ unwinding
19:33:41
Bike
C++ unwind doesn't take care of them, the compiler generates the code for it. it's just that that code is meant to run in the stack frame doing the binding; for example the old value of the variable is just a datum in the frame
19:33:47
karlosz
the problem with marking any kind of user function would be redefinition: you can't guarantee that the user won't put in some C++ function that requires unwinding
19:35:06
karlosz
Bike: i see. so i guess we'd have to "manually" unwind the stack and do every cleanup action in the frame if that's the case
19:35:43
Bike
in other news, if i do map-ast-etc without consing, asdf with the ast-interpreter is now only slightly slower than using the compiler
19:36:08
Bike
without consing -> er, i mean without cleavir-ast:children and its consing, it still does the hash table thing
19:37:25
karlosz
cool. so just a bit more before it's hopefully faster. still wondrous to me that the hash tables are doing it
19:37:42
Bike
also on the topic of CL functions and unwinding, that might not actually be true. maphash takes a lock, for example
19:41:20
karlosz
ugh, doing specials binding with setjmp means we'd have to maintain a jmp buffer stack ourselves (sounding more and more like an ordinary machine code implementation)
19:42:06
karlosz
and cleavir doesn't really emit instructions to maintain the stack at that level, maybe that only happens in MIR, idk how sicl handles that
19:42:58
Bike
which is why i want to start with the cases where there are no bindings, cos yeah, could be a mess
19:43:17
Bike
dunno if you saw, i did a really dumb prototype to see how setjmp/longjmp performs and it was like twenty times faster than exceptions.
19:48:11
Bike
the builtin operators take differently typed arguments from gcc, so i worry how that's going to work out
19:51:30
karlosz
oh okay. i guess i was confused by this: "see how setjmp/longjmp performs and it was like twenty times faster than exceptions."
20:13:41
Bike
which was (block nil (mapc (lambda (x) (when x (return t))) list) nil) versus (core:call-with-setjmp (lambda (p) (mapc (lambda (x) (when x (core:longjmp p))) list) nil) (lambda () t))
20:13:55
kpoeck
drmeister the code that ran so slow on the buildbot is here: https://gist.github.com/kpoeck/5e9b64834283dccff701b4fd45272a27
20:28:02
Bike
eclector read-from-string uses signal internally. kpoeck is reducing the use of read-from-string, so less signaling occurs, so things are faster. right?
20:28:21
drmeister
In the docker container - where I expected it to take 700x longer it's 7.5 seconds
20:29:28
drmeister
This code should replicate the bad code using read-from-string - or did I misunderstand what kpoeck provided?
20:36:21
drmeister
I'm going to have to futz around with the previous commit to reproduce the problem.
20:36:43
drmeister
This is why I wasn't that excited about changing the code - I had a really good example of the problem there.
20:38:54
scymtym
drmeister: not in the sense of being merged into eclector master anytime soon, but it passes all important tests in eclector if i remember correctly
20:39:46
scymtym
in any case, READ-FROM-STRING should not be very different compared to READ unless WITH-INPUT-FROM-STRING does something crazy in Clasp
20:41:22
kpoeck
drmeister I wonder whether we need "fork" as in our build process, to make the example real slow
20:42:08
scymtym
yes, i'm saying that 70.000 READ calls should be just as (or maybe almost as) bad unless WITH-INPUT-FROM-STRING does something crazy
20:43:11
kpoeck
Still believe that the pattern is different when compiling: we read source code, do transformations, generate code, read again ...
20:45:36
drmeister
This was the commit with terrible performance in the docker container and the buildbot.
20:47:44
scymtym
kpoeck: the equivalent would be READing 70.000 toplevel forms. because READ-FROM-STRING is like a non-recursive READ call. that entails setting up circularity tracking and other pre- and post-processing
20:48:41
scymtym
kpoeck: i have plans to delay the circularity tracking setup until the first #N= is encountered. that should help with READ-FROM-STRING and non-recursive READ calls
20:55:30
drmeister
I don't know what the buildbot is. I hit the problem in docker and that was the worst case I've ever seen.
20:56:22
drmeister
kpoeck: You pointed this problem out a long time ago. One of the cl-bench examples runs 100x slower than expected.
20:58:50
Bike
and the relation with signaling was what, having to do a far call to get to __cxa_throw or something?
20:59:54
drmeister
It's just that all compiled code uses the large code model - because library functions and data may be too far away to access with a 32bit relative address.
21:01:29
drmeister
I'm building in the docker container with the commit just before kpoeck's improvement.
21:02:29
drmeister
But you see why I wasn't in a hurry to incorporate your improvement kpoeck. This problem is so ephemeral, and then I had a really good test case.
21:09:38
kpoeck
I'd bet if I write a test file with a loop reading all of clasp source code with eclector we have the problem again
21:10:36
karlosz
okay, i updated the analysis code to include doing a coarse check for bind-instruction and unwind-protect-instruction
21:11:32
karlosz
Bike: that should be all the analysis needs to check for, right? the presence of bind-instruction and unwind-protect-instruction in every [catch->unwind] "intermediate" function
21:17:41
karlosz
checking the nested cleanups in a function itself is fine, but then you have to look deeper by relating the function-dynenvs together in a DAG
21:18:04
karlosz
because if you have different calls to the same function, you have to go up all the dynenvs
21:19:03
karlosz
Bike: is just having an escaping-catch-instruction enough interface for you to hook it up to the translator stuff you're doing?
21:19:39
Bike
i'd do it the other way though. make catch-instruction the general and add an instruction for the simple case
21:25:50
Bike
i guess for a start i could do it the stupid way of defining intrinsics to call that return the value of setjmp
21:33:17
karlosz
if you paste in that code somewhere and call analyze-catches from my-hir-transformations before do-inlining it should just work
21:33:42
karlosz
it should just work after do-inlining as well, but the dag can get more complicated
21:35:38
karlosz
it shouldnt matter, since i work off of the DAG, but big interactions with inlining are a bit harder to graph
21:36:29
karlosz
definitely has to happen before PCV though, because cells block proper def-use chains
21:38:36
Bike
to see if an unwind is a non-escape unwind, you just check if the destination is an escape-catch
21:39:30
Bike
anyway, that reminds me of a tangentially related issue. can copy propagation or inlining coupled with PCV result in extra reads from the closure vector? like if you have (defun mcar (x) (if (primop:typeq x cons) (primop:car x) nil)) and inline that, it looks like it does a separate read for each x
21:45:22
Bike
i only looked at a quick disassembly but it seemed like there were already two reads. maybe i should look closer though.
21:45:36
Bike
anyway, besides being inefficient this would have thread safety problems, so we should avoid it
21:47:32
karlosz
yes. there is no general copy-prop going on at the moment before P-C-V, so you must be seeing an existing p-c-v thing
21:50:16
Bike
like if you have "read -> typeq -> read -> car", another thread could write to the cell after the typeq but before the next read
21:51:12
drmeister
kpoeck: Building with the old generated-encodings.lsp code in the docker container is taking a looooooooong time.
21:54:15
Bike
maybe you could have a binding assignment for the parameter of car, and not copy prop through it?
21:54:19
karlosz
yes, i know how to fix it. the solution is to keep the redundant assignments around and make p-c-v isolate X
21:54:36
karlosz
the copy prop in full-inlining-pass when converting binding assignments is the problem
21:55:02
frgo__
kpoeck: There's also https://sysdig.com/partners/docker/ - but I don't have experience with sysdig
21:55:22
karlosz
i made a change recently that broke that, since i wasn't thinking of that case, i'll send a PR to revert it
21:56:29
karlosz
as long as this lexical-location/binding-assignment/p-c-v stuff is thought out a bit differently
21:59:28
karlosz
but this is a problem because some lexical locations should not be subject to copy prop
21:59:55
karlosz
you're right we can just find every load value form and copy prop out from there and it should be fine though
22:02:40
karlosz
i just meant that there are obviously some useless assignments lying around and lexical locations that serve no purpose other than to copy
22:03:01
karlosz
but sometimes they are actually important, and i introduced binding assignments to make it more clear when that should be the case
22:03:12
karlosz
but the problem is there are more cases like that where it's dangerous to copy prop
22:03:27
karlosz
like how when you had assignments introduced between dynenvs from eliminate-catches
22:03:49
karlosz
it's just not obvious at all from looking at the class of the datum whether it's actually important in some way
22:04:22
karlosz
but all the different types of lexical locations get lumped together (why is dynenv a lexical location and not some other class of datum?)
22:06:13
karlosz
it doesn't know it has to do some special logic because DE2 looks like a plain lexical location
22:08:19
karlosz
things like the lexical locations introduced by binding assignments are also special
22:09:11
karlosz
that's the lambda-var/lvar distinction in sbcl - one can be closed over, the other can't
22:10:09
karlosz
it's not a real issue right now, but i think once optimizations start being used more aggressively, treating everything as a lexical location will sort of be annoying
22:15:01
Bike
i've been thinking about things because at some point i want to introduce compiler support for CAS of lexical variables, and that will have to tie in to PCV somehow or another
22:15:52
Bike
possibly also assignments with an atomic order mark, though it'll only matter if it's closed over
22:20:45
karlosz
yeah. the thing with variables introduced by lambda is that they can become memory cells
22:22:17
karlosz
P-C-V won't distinguish between temporaries and lambda variables and just close over anything willy nilly
22:22:52
Bike
we also close over return values at the moment. it's dumb. i think that's removed in cleavir2 tho.
23:15:29
drmeister
::notify yitzi The docker image is working great. I think we can get started prototyping with this.
23:18:14
drmeister
::notify yitzi Ping me when you are online - I have some questions about portability of code from jupyter notebook to jupyter lab
23:25:40
Colleen
yitzi: drmeister said 10 minutes, 11 seconds ago: The docker image is working great. I think we can get started prototyping with this.
23:25:40
Colleen
yitzi: drmeister said 7 minutes, 26 seconds ago: Ping me when you are online - I have some questions about portability of code from jupyter notebook to jupyter lab
0:18:15
karlosz
Bike: i flamed the new ast-interpreter code here: ocf.io/~karlos/asdf-60s-new-ast-int.svg
0:21:41
karlosz
there's some consing for augment environment (maybe a list makes more sense than a hash table), but it seems like maybe it's interpreting more than it should be?
0:22:16
karlosz
i don't think it makes sense for the ast-interpreter to walk the same closure more than once, for example
0:23:36
karlosz
i find firefox is much better than chrome at actually rendering these svgs in particular
0:28:03
karlosz
well, it's not obvious to me from looking at asdf.lisp whether that's actually what is happening
0:47:04
karlosz
i've been seeing a pretty big build time regression (even since before kpoeck's changes) which i'm going to try and track down. The recent ast-interpreter changes and the generated-encodings fix are saving 2 minutes, but it's still slower than it was last week
0:47:47
Bike
i made some pretty wide changes to how standard objects are allocated but i didn't see a slowdown with my commits