freenode/#clasp - IRC Chatlog
14:15:42
kpoeck
Bike cleavir-env:cst-eval seems to be called for every top-level form, see https://gist.github.com/kpoeck/c7124d975e99806570262041940b9f9c
14:16:24
Bike
that's not every toplevel form, it's the eval-whens in the toplevel forms in the file you picked
14:19:52
Bike
all of these hopefully don't even get to the ast interpreter, since they're just progn, quote, and function calls
14:20:37
kpoeck
Didn't take implicit eval-when into account, so now I finally understand what you said
14:56:50
drmeister
It looks like we haven't submitted a pull request for bordeaux-threads. I'm going to submit one.
14:59:26
Bike
alright, if i have the ast interpreter check for interpretability ahead of time, the compile time drops from "100s-120s" according to karlosz to like 80
14:59:50
Bike
for asdf. still more than the compiler, so it's still stupid. if i can get the compiler to work on an ast directly that ought to be fixable
15:10:38
kpoeck
drmeister I honestly believe you made a pull request to bordeaux-threads with things that are already in there
15:14:16
kpoeck
drmeister your pull request only touches the file bordeaux-threads.asd adding clauses that are already there
15:14:48
kpoeck
Please look at https://github.com/sionescu/bordeaux-threads/blob/master/bordeaux-threads.asd
15:14:56
drmeister
Yeah - I'm looking at it and I don't get why github thinks those changes need to be made.
15:19:16
Bike
i don't understand. the clasp change is in sionescu's repo, right? and that's what quicklisp uses.
15:19:50
kpoeck
But in quicklisp for every single component one can specify whether a release is taken, or current git
15:20:47
drmeister
Excellent - something good came out of this. As kpoeck said - Stelian said one minute ago that he'd do a release.
15:21:19
yitzi
drmeister: so moving the clone of bordeaux-threads earlier in the Docker should fix it?
15:21:45
kpoeck
Look at https://github.com/quicklisp/quicklisp-projects/blob/master/projects/bordeaux-threads/source.txt
15:22:01
kpoeck
There it says latest-github-release https://github.com/sionescu/bordeaux-threads.git
15:22:20
drmeister
yitzi: The issue is that bordeaux-threads on quicklisp is out of date and so we always have to install a fork of it. I. GET. BIT. BY. THIS. EVERYTIME. I. INSTALL. CANDO. ANYWHERE.
15:24:35
Bike
"release" being like a specific thing sionescu has to do. quicklisp isn't just using the latest commit in the repo.
15:25:27
kpoeck
Shinmera's software is configured to use the latest commit on git, I like that better, e.g. https://github.com/quicklisp/quicklisp-projects/blob/master/projects/dissect/source.txt
15:26:01
drmeister
Ok. Now I understand. How do we figure out what the latest release looks like other than wiping out my local-projects/bordeaux-threads and pulling one from quicklisp.
15:26:19
drmeister
It's still going to take days or weeks for this to resolve - even if he releases - right?
15:28:18
kpoeck
in my local project of bordeaux-threads i did git remote add upstream https://github.com/sionescu/bordeaux-threads.git
15:29:43
drmeister
yitzi: Right now I'm putting everything after building cando because that takes a couple of hours.
15:30:23
kpoeck
Unfortunately no, works fine on macosx and your linux box, but is freaking slow on the buildbot
15:33:11
drmeister
Yes - I'm working on a Dockerfile to build cando with jupyter notebook/jupyterlab
15:34:23
yitzi
drmeister: I've started rebuilding on my end by moving the clone of bordeaux-threads earlier to see if that works.
16:28:28
drmeister
yitzi: Excellent - thank you. I tend to knock these together to avoid long delays and then rearrange them once it does what I want.
16:51:45
kpoeck
drmeister Bike as an immediate solution to the "generated-encodings.lsp" problem I could lazily load the data. That should make the build much faster, but would be a 3 seconds delay on first use
19:37:02
kpoeck
drmeister could you please review https://github.com/slime/slime/pull/562 for slime? Adds profiling for clasp in slime
19:37:40
drmeister
yitzi: I'm back - I'm slowly trying to get cando to build in the docker image. I've had trouble with the quicklisp compilation step - I'm trying to figure out what is going on.
19:38:55
drmeister
There were two problems. One - it can't find /usr/local/bin/cclasp-boehm that is surprising.
19:39:44
drmeister
I've made a change to clasp that forces that step at the end of install that didn't show output to actually show output - I'm testing it in the docker container.
19:40:04
drmeister
It's really, really, really annoying that waf suppresses output - it drives me insane.
19:40:47
drmeister
Huh - ok - I thought I saw it trying to run /usr/local/bin/ccando-boehm maybe I have the wrong path.
19:42:01
drmeister
kpoeck: How do I review the slime pull request? Can I pull it into my local slime?
19:42:29
drmeister
yitzi: I've moved to a faster machine and I gave docker 18 cores. Hopefully I can get through this faster.
19:46:04
kpoeck
drmeister git remote add karsten https://github.com/kpoeck/slime.git && git pull karsten master
19:46:15
selwyn
seems to me that bordeaux-threads can be put into quickclasp? you would only have to install quickclasp when installing cando, which is one line of common lisp. i could do this myself if it helps
19:47:04
drmeister
selwyn: A new release of bordeaux-threads is about to be made. I think using quickclasp for this will push the problem into the future.
20:05:45
karlosz
generated-encodings.lsp started freezing up for me as well on mac, so i suspect it might actually be a problem with the wscript.config maybe
20:07:28
karlosz
i'm thinking it would be better right now to not inline the arithmetic operators if there is no associated type information with any of the operands
20:07:54
karlosz
since i don't really see what inlining adds in that case, besides bloating the instruction cache and giving llvm much more to chew on
20:08:15
karlosz
this would be a first step to having something like a monomorphized approach to inlining
20:09:18
karlosz
as is, i don't think the compiler-macro machinery is able to detect the presence of types though, so making the decision would have to be at the HIR level, but by then the possibility to inline is gone, correct?
20:20:05
karlosz
breaking down into binary operations is standard, that's a good normalization to make
20:20:32
karlosz
you only need the full &rest shebang if you're calling indirectly like with mapcar #'+ or something
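Note: a minimal sketch of the normalization karlosz describes, where an n-ary call is rewritten into nested binary calls and the &rest version is kept only for indirect calls. MY+, TWO-ARG-+ and EXPAND-TO-BINARY are hypothetical names for illustration, not Clasp or Cleavir functions.

```lisp
;; Hypothetical names throughout; this is not Clasp's actual machinery.
(defun two-arg-+ (x y)
  "Binary addition: the one case a compiler would need to specialize."
  (+ x y))

(defun my+ (&rest args)
  "Full n-ary version, still needed for indirect calls like (apply #'my+ ...)."
  (reduce #'two-arg-+ args :initial-value 0))

;; The helper must exist at compile time because the compiler macro calls it.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun expand-to-binary (forms)
    "Rewrite a list of argument forms into nested TWO-ARG-+ calls."
    (cond ((null forms) 0)
          ((null (rest forms)) `(two-arg-+ 0 ,(first forms)))
          (t (reduce (lambda (acc form) `(two-arg-+ ,acc ,form)) forms)))))

;; Direct calls such as (my+ a b c) are rewritten; #'my+ is left alone.
(define-compiler-macro my+ (&rest forms)
  (expand-to-binary forms))
```

Argument forms keep their left-to-right evaluation order because the nesting grows to the left.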
20:25:32
drmeister
kpoeck: But how does the profiling get timing data? What is the resolution of the clock?
20:47:11
karlosz
i mean, either we do like introduce the specialized arithmetic hir instructions in a separate pass in HIR, or we provide a way to roll back inlining somehow
20:47:56
karlosz
because i do really like the cleavir approach for the other stuff like car and cdr with typeq's just getting inlined in
20:49:28
kpoeck
drmeister this is a profiler in mostly portable lisp with some exceptions. So it is the resolution of the underlying lisp implementation. Profiling works by encapsulation
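Note: a rough sketch of what profiling by encapsulation can look like in portable Common Lisp. It is not the code from the slime pull request; ENCAPSULATE-FOR-PROFILING and *CALL-STATS* are made-up names. The clock is the standard GET-INTERNAL-REAL-TIME, whose resolution is given by INTERNAL-TIME-UNITS-PER-SECOND in each implementation.

```lisp
(defvar *call-stats* (make-hash-table :test #'eq)
  "Maps a function name to a list (CALL-COUNT TOTAL-INTERNAL-TIME-UNITS).")

(defun encapsulate-for-profiling (name)
  "Replace the global definition of NAME with a timing wrapper.
Returns the original function so the caller can restore it later."
  (let ((original (fdefinition name)))
    (setf (gethash name *call-stats*) (list 0 0))
    (setf (fdefinition name)
          (lambda (&rest args)
            (let ((start (get-internal-real-time)))
              ;; UNWIND-PROTECT so non-local exits are still counted.
              (unwind-protect (apply original args)
                (let ((entry (gethash name *call-stats*)))
                  (incf (first entry))
                  (incf (second entry)
                        (- (get-internal-real-time) start)))))))
    original))

(defun seconds-spent (name)
  "Total time recorded for NAME, converted to seconds."
  (/ (second (gethash name *call-stats*))
     internal-time-units-per-second))
```

Only calls that go through the global function name see the wrapper, so calls the compiler has inlined would not be measured.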
20:49:34
karlosz
the stuff in transform.lisp looks sort of like what we want, but like the comment says, it should happen at least as late as the HIR stage to leverage type information
20:50:38
kpoeck
karlosz used the profiler with clisp to meter some cleavir stuff, so better allow that in clasp too
21:01:26
kpoeck
can you also try from a repl with (time (compile-file "sys:kernel;lsp;generated-encodings.lsp"))?
21:03:17
drmeister
kpoeck: The buildbot hasn't successfully built clasp for the last day or so - first because generated-encodings.lsp was causing it to timeout and now because of some kind of bordeaux-threads issue? Argh
21:04:06
drmeister
Note - using (compile-file ...) uses compile-file-parallel. The build system uses compile-file-serial
21:05:09
kpoeck
(time (cmp:compile-file-serial "sys:kernel;lsp;generated-encodings.lsp")) takes 6 seconds on my mac
21:05:55
kpoeck
But in order to allow building again. I proposed earlier that I convert generated-encodings to lazy loading at runtime
21:06:38
kpoeck
Would that be fine with you? That should bring back the compile-times to few seconds at the cost of 3 seconds wait for the first use on a non-standard encoding
21:08:47
kpoeck
no: only load the encoding tables when it is first used, so pass the problem to runtime
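Note: the lazy-loading idea could be sketched roughly as below. This is an illustration under assumptions, not the change kpoeck proposed: LOAD-ENCODING-TABLE is a hypothetical stand-in for the expensive step, and *TABLE-LOADS* exists only to show that the load happens once.

```lisp
(defvar *encoding-tables* (make-hash-table :test #'eq)
  "Cache of encoding tables that have already been loaded.")

(defvar *table-loads* 0
  "Counts how often the expensive loader ran (illustration only).")

(defun load-encoding-table (name)
  "Hypothetical stand-in for the expensive step, e.g. reading a data file."
  (declare (ignore name))
  (incf *table-loads*)
  (let ((table (make-array 256)))
    (dotimes (code 256 table)
      (setf (aref table code) code))))

(defun encoding-table (name)
  "Return the table for NAME, paying the loading cost on first use only."
  (multiple-value-bind (table foundp) (gethash name *encoding-tables*)
    (if foundp
        table
        (setf (gethash name *encoding-tables*)
              (load-encoding-table name)))))
```

Nothing is computed at compile time here, which is the trade-off discussed above: the build gets cheaper and the first lookup of each encoding gets slower.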
21:09:56
drmeister
I can't replicate the problem yet on the one machine that really shows the problem because of other noise.
21:14:28
drmeister
This isn't surprising anymore - it's just sad. We need to reduce throwing C++ exceptions.
21:16:45
drmeister
https://github.com/clasp-developers/Eclector/blob/master/code/reader/read-common.lisp#L108
21:17:44
drmeister
It's a return-from - it's nothing fancy. It's a straightforward return-from from an inner function returning from an outer scope.
21:18:31
karlosz
the analysis isn't powerful enough to handle when terminate-token happens in multiple inner functions
21:18:40
drmeister
That's not to put the problem on you. I'm just looking at the code and wondering why contify didn't deal with this.
21:19:44
karlosz
you can declare it explicitly inline. it will cause about 9 copies of the inner function to be duplicated, but it's worth it because return-from is much slower than that
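Note: a small sketch of the shape being discussed, with local functions that exit the outer function via return-from and are declared inline. COUNT-LEADING-DIGITS is a made-up example, not the Eclector code; whether a compiler honors the inline declaration is implementation-dependent.

```lisp
(defun count-leading-digits (string)
  "Return the number of digit characters at the start of STRING."
  (let ((index 0))
    (labels ((finish ()
               ;; Non-local exit from the enclosing function.
               (return-from count-leading-digits index))
             (next-char ()
               (if (< index (length string))
                   (char string index)
                   (finish))))
      ;; Request inlining of both local functions. If the compiler
      ;; complies, each RETURN-FROM ends up in the body of
      ;; COUNT-LEADING-DIGITS and needs no unwinding at all.
      (declare (inline finish next-char))
      (loop
        (unless (digit-char-p (next-char))
          (finish))
        (incf index)))))
```

FINISH is called from two places, which is the multiple-call-site situation that contification alone does not handle according to the discussion.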
21:21:20
drmeister
What about converting this into a setjmp/longjmp - could we detect that there can be no C++ code between the read-token and terminate-token?
21:22:30
karlosz
because while contify can get rid of a lot of cases, ultimately it can't handle every return-from, because it only applies to single return-site stuff
21:23:18
drmeister
kpoeck: There is something going on with stack unwinding - on some systems it's not too bad, on others it's TERRIBLE.
21:24:15
drmeister
It's a massive, unpredictable delay thrown into the code - it makes our lives very difficult.
21:25:27
kpoeck
but perhaps the problem is really (eval-when (:compile-toplevel) (process-encodings-file))
21:26:58
kpoeck
although (time (ext::process-encodings-file)) -> Time real(2.774 secs) on my machine
21:27:45
drmeister
kpoeck: It's not a problem on your machine though. It's a highly variable, machine dependent, operating system dependent, unpredictable cost of C++ stack unwinding that can vary from an insignificant to a MASSIVE delay.
21:28:43
drmeister
Note yesterday we were seeing a 3000 second compile time on the build bot and low double digit (3-20) seconds on my machine and yours.
21:29:44
drmeister
https://github.com/clasp-developers/Eclector/blob/master/code/reader/read-common.lisp#L70
21:31:11
drmeister
Here's another example - I'm building in docker on the iMacPro with 18 cores given to docker. It's sitting for the last half hour compiling one file.
21:37:11
drmeister
With compile-file-serial it is 34.6 seconds. I think I also need to inline read-char-handling-eof.
21:38:15
drmeister
It's still terrible performance - now with compile-file-serial it is 33.9 seconds.
21:40:13
drmeister
I'm using C-c k to compile the entire file and then (time (cmp:compile-file-serial "sys:kernel;lsp;generated-encodings.lsp"))
21:42:53
drmeister
Is it missing inlining because of nested functions being created by the compiler?
21:43:03
karlosz
you were right to also inline read-char-handling-eof, but as far as i can tell that should do the trick
21:47:08
karlosz
drmeister: what if you tried putting the inline right after the defun into the body of the labels?
21:48:39
karlosz
there will always be cases where contification and inlining can't remove that, for example if you passed a (lambda () (return-from)) somewhere
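Note: a tiny example of the case karlosz mentions, where the closure containing the return-from is handed to another function as a first-class value. FIND-FIRST-NEGATIVE is a made-up name. Unless the callee (here MAPC) is itself inlined, the exit has to cross MAPC's frame, so it stays a real non-local exit.

```lisp
(defun find-first-negative (list)
  "Return the first negative number in LIST, or NIL if there is none."
  (mapc (lambda (x)
          (when (minusp x)
            ;; Exits FIND-FIRST-NEGATIVE from inside a closure that
            ;; was passed to MAPC as an argument.
            (return-from find-first-negative x)))
        list)
  nil)
```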
21:49:38
karlosz
and if someone declares a function in quicklisp somewhere as explicitly notinline, and there's a return-from, neither contification nor inlining will ever kick in, so kablooey
21:51:16
karlosz
yes, i think the real problem will be guaranteeing that there is no intervening C++ code
21:52:27
drmeister
Compiling a defun like this you get a big HIR graph with UNWIND-INSTRUCTION in there.
21:52:51
drmeister
We don't have call-with-variable-bound or funwind-protect anymore. There can't be any intervening C++.
21:55:43
drmeister
With call-with-variable-bound and funwind-protect I think it was a problem. But now...
21:58:28
drmeister
Hmm simple counter example - let's say you have (defun foo (x) (some-cxx-function (lambda () (return-from foo nil))))
22:01:25
karlosz
and it would solve cases like these, where you don't pass random lambda's around, you are only using return-from in a context that setjmp and longjmp work
22:01:50
drmeister
Yeah. Is that the same as analyzing the paths between UNWIND and its destination and checking if there is a closure/FUNCTION between them. I might be talking nonsense here.
22:02:39
karlosz
nope, there's no need to do that. the same machinery that checks whether a function is eligible to be inlined should work
22:03:53
karlosz
we can make this a clasp specific optimization, because return-from is only really a clasp issue
22:03:56
drmeister
What about (defun foo () (cxx-function (lambda () (flet ((bar () (return-from foo nil))) (bar))))) ?
22:07:16
karlosz
so, as long as no function that contains the return-from is passed around as a first class value...
22:07:54
karlosz
which, still applies in this case. I think a return-from would have to be enclosed by some kind of local function for C++ to get a handle on it. thinking... for a potential counterexample
22:09:22
karlosz
i.e. check all local functions that contain return-from foo are not passed around as first class values
22:13:08
drmeister
Right - think on this. Let's see what Bike's thoughts are - he might have thought this through already.
22:14:12
drmeister
With cases like mapc we might be able to flag special functions as being safe in that they don't use CFFI
22:40:59
kpoeck
drmeister could you please check https://github.com/clasp-developers/clasp/pull/1012
22:53:29
karlosz
yeah, there really is no way to contify that terminate-token function, since it's not really possible to have two functions share the same body in HIR or LLVM IR
22:53:57
Bike
there are some lisp special operators that still use intervening C++ functions, like progv and complicated multiple-value-call
22:56:17
karlosz
because, if i understand correctly, you pass first class local functions to progv and m-v-c
22:56:55
Bike
you say "call position", but things like (flet ((foo ...)) (bar (lambda () ... (foo ...)))) have to be ruled out, right?
22:57:22
karlosz
call-position, as in transitively including every local function that encloses a RETURN-FROM
22:58:39
Bike
i mean in this case i'd say foo is in "call position" but i don't think we could use set/longjmp
22:59:10
karlosz
yeah, foo is in call-position, but it's "inside" a local function that's not in call-position
23:02:10
karlosz
i guess it's harder for HIR to know what "higher" than a BLOCK means, but should still be possible by analyzing dynenvs
23:02:47
Bike
yeah, i mean, for this case probably all you need to know is that the dynenv of the call is a child of the dynenv of the block
23:03:00
Bike
so like, in (lambda () (foo ...)) it terminates at the lambda's dynamic environment output instead
23:03:55
Bike
and depending how sophisticated it is, either there are no bind or unwind protect environments, or that they're all the same
23:04:32
karlosz
but maybe it blocks too much, since looking at the "terminate-token", case, it would also block that
23:06:19
Bike
hm, well in this case you could go a bit deeper with the analysis. like, terminate-token call dynenvs terminate with read-char-handling-eof's, now how about read-char-handling-eof? and it looks like in that case it does get back to the read-token environment
23:07:27
karlosz
yeah, and you'd need to pair that with the passed-as-first-class-value thing to not conflate it with the example you gave
23:08:06
Bike
i mean yeah you'd also need to know that the only instructions it's fed to are assignments and bla bla funcall as only the callee.
23:09:19
karlosz
i'm thinking it would be pretty straightforward to do this as a pass somewhere in hir or mir, that just change-class's the catch-cont
23:09:56
karlosz
then translate-instruction just does its normal unwind thing in the general case, but setjmp/longjmp if the catch-cont is eligible for it
23:11:34
Bike
it might still be interesting in some cases. for example if you know there are no intervening unwind-protects/binds, in sicl you could skip calling the unwinder function
23:12:06
Bike
anyway, implementing this for the runtime might be annoying. we'll have to figure out what type a jmp_buf has and then probably actually call setjmp and longjmp the C functions
23:13:33
Bike
i mean, even with the C++ unwinding there's no actual unwind instruction, you're supposed to call __cxa_throw or whatever
23:15:37
Bike
quick check on godbolt says jmp_buf is {[8 x i64], i32, %struct.__sigset_t} which is not a five word buffer
23:16:41
Bike
we also might not have the destination address... i think llvm lets you refer to the addresses of code blocks sometimes, but i don't know if that works outside of the jump table thing
23:18:07
karlosz
"For SJLJ based exception handling, this intrinsic forces register saving for the current function and stores the address of the following instruction for use as a destination address by llvm.eh.sjlj.longjmp."
23:21:22
Bike
i'm just a little antsy about using something that's "used internally within LLVM's backend" and is specifically for SJLJ exception handling, which we are not doing
23:28:37
Bike
"returns_twice: This attribute indicates that this function can return twice. The C setjmp is an example of such a function. The compiler disables some optimizations (like tail calls) in the caller of these functions." that's a point i hadn't considered
23:28:57
Bike
i guess you can't reuse the stack space for a tail call since the code after the second return will still want access to the frame
23:32:26
Bike
yeah ok, since returns_twice is the only annotation for this and it's only for functions... yeah, that's difficult, since you'd want the same mark for the equivalent to catch-instruction
23:35:55
Bike
i mean that returns_twice on a function is the only way llvm has to indicate that it can't do tail calls and stuff, as far as i can tell
23:36:15
Bike
so you'd need anything that checks for a returns_twice call to check for this instruction too, i guess?
23:38:16
karlosz
oh, you mean because it's just a bare instruction and not a function that you can annotate returns_twice on
1:09:35
yitzi
I removed the line that executed cando in the root user and forced all quicklisp activities to be in the user account
1:12:04
yitzi
I'll have to write a custom installer or tweak cl-jupyter's installer since it thinks that cando is clasp.
1:17:16
yitzi
I'm probably done tonight. I'll detangle the dockerfile and fix the kernel installer tomorrow. Just thought you'd like to know that progress is being made.
1:21:09
drmeister
It means I have to go through the whole rigamarole to set up /opt/clasp and set permissions and so on.
1:36:59
karlosz
okay, i coded up a slightly more conservative analysis for when to do setjmp that should be correct
1:38:34
karlosz
Look at every catch-instruction. Look at every unwind that uses that catch-instruction. If every intermediate function of the catch and the unwind are in call-position, there is no problem
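Note: one possible reading of this rule, written against a toy model rather than Cleavir's HIR classes. All structure and function names below are hypothetical, and treating "intermediate functions" as everything reachable through callers between the unwind and the catch is an assumption made for this sketch.

```lisp
;; Toy model only; none of these names come from Cleavir or Clasp.
(defstruct local-function
  name
  callers               ; functions whose bodies contain a call to this one
  call-position-only-p) ; true if it is only ever used as a callee

(defstruct catch-point owner unwinds) ; OWNER: function holding the block
(defstruct unwind-point owner)        ; OWNER: function holding the return-from

(defun intermediate-functions (unwind catch)
  "Functions reachable from the unwind's owner by following callers,
stopping at the function that owns the catch."
  (let ((seen '())
        (stop (catch-point-owner catch)))
    (labels ((visit (fn)
               (unless (or (eq fn stop) (member fn seen))
                 (push fn seen)
                 (mapc #'visit (local-function-callers fn)))))
      (visit (unwind-point-owner unwind)))
    seen))

(defun setjmp-eligible-p (catch)
  "True if every unwind to CATCH passes only through call-position functions."
  (every (lambda (unwind)
           (every #'local-function-call-position-only-p
                  (intermediate-functions unwind catch)))
         (catch-point-unwinds catch)))
```

In this model a lambda handed to another function has CALL-POSITION-ONLY-P false, so any unwind that passes through it makes the catch ineligible and the general unwinding path would be used instead.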
1:39:32
karlosz
it's simple and can be improved later, but now the next step is how to actually emit setjmp and longjmp
1:41:59
karlosz
so, anything that is of the class "escaping instruction" should fall back to using c++ unwinding