freenode/#clasp - IRC Chatlog
16:00:30
drmeister
Bike: I think there are issues with build times with the new discriminating function compiler.
16:02:02
drmeister
The compiler relies on unwinding. The parallel compiler runs in multiple threads.
16:03:23
drmeister
I'd like to test this. Can you turn on and off the new discriminating function compiler? Switch between the old code and the new code?
16:03:52
Bike
yeah, like in that file i gave you at some point, you can define a discriminate macro with the new code under a different name
16:04:07
Bike
and then swap it in and out like (rotatef (macro-function 'discriminate) (macro-function 'discriminate2)) or the like
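A minimal sketch of the swap Bike describes. The two macro bodies below are placeholders assumed for illustration, not Clasp's actual discriminating function compiler; only the `rotatef` over `macro-function` comes from the conversation, and `swap-discriminate` is a hypothetical helper name.

```lisp
;; Placeholder "old" implementation (hypothetical body).
(defmacro discriminate (form)
  `(list :old ,form))

;; Placeholder "new" implementation under a different name (hypothetical body).
(defmacro discriminate2 (form)
  `(list :new ,form))

(defun swap-discriminate ()
  "Exchange the expanders of DISCRIMINATE and DISCRIMINATE2."
  (rotatef (macro-function 'discriminate)
           (macro-function 'discriminate2)))

;; After (swap-discriminate), forms using DISCRIMINATE expand with the
;; body that was defined as DISCRIMINATE2; calling it again swaps back.
```

The swap only affects code expanded afterwards: callers that were already compiled keep the old expansion until they are recompiled.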
16:23:00
Bike
i just don't understand what could be happening to slow down the build just from a few clos functions having... something... happen to them
17:01:35
drmeister
Do you want an account on a linux machine? Or do you have access already to 'hermes'? It's the linux machine that is in my house.
17:02:35
drmeister
We could also turn off the parallel compiler and build with that and compare timing on linux. That is less sensitive to multithreaded unwinding issues.
17:04:01
drmeister
I conjecture that it's multithreaded unwinding because on macOS there is little difference between the new code and old code and on macOS they have better multithreaded unwinding. On linux multithreaded unwinding has terrible problems.
17:08:33
drmeister
If the discriminating function compiler is only activating for satiation - and the interpreter is being used for GF dispatch - it doesn't make sense that quicklisp compilation would slow down.
17:09:22
Bike
some of the functions in clos have their discriminating functions put together by cleavir at build time, using the discriminating function compiler. if there's a dispatch miss they'll be replaced with the interpreted version.
17:09:42
Bike
so it would have to be something like, some of those functions are extremely slower, but also they never dispatch miss.
17:22:08
Bike
by the way, i'm also seeing that build time increased from around 18min to 22min independent of my changes
18:35:00
Bike
the change i just pushed will cause some bizarre unbound function errors until you purge fasls, jsyk
18:45:17
drmeister
yitzi: Do you have a Dockerfile that builds clasp with common-lisp-jupyter and the jupyter-widgets?
18:45:46
drmeister
This is a repeat of the question I just asked in the PR you submitted for cl-nglview.
18:48:01
drmeister
FYI - and stop me if you know this already - we have clasp and cando. Clasp is a Common Lisp implementation. Cando is an extension that is cloned into the clasp directory hierarchy and it adds a lot of computational chemistry tools.
18:49:05
yitzi
I got it. There is a line in the Docker file with the cando extension clone commented out. Trying to get other stuff to work first.
18:53:54
yitzi
The clasp kernel installation line is commented out because that is what I am working on right now
18:55:22
drmeister
yitzi: You don't specify the port when you invoke jupyter-notebook - is it not necessary?
18:56:27
yitzi
Just for local stuff I use "docker build --network=host --tag=cando-clj ." and "docker run --network=host -t cando-clj"
18:57:30
drmeister
Yeah - neither am I - I tend to fall into patterns and not deviate too much. In that vein - why jupyter-notebook and not jupyterlab?
18:58:20
yitzi
This is the testing harness for nglview since that is what I was working on. If you remove the nglview stuff you should be able to change to jupyter-lab
18:59:01
drmeister
https://github.com/clasp-developers/clasp/blob/master/tools/dockerfiles/cando-deploy/Dockerfile#L116
19:01:46
yitzi
Mine is just about done compiling clasp...so should know in about 15 minutes if the installation problem still exists.
19:21:50
drmeister
karlosz: Ok, basically - on linux, stack unwinding via c++ exceptions has a serious problem when exceptions are thrown in multiple threads.
19:23:27
drmeister
It's a known problem - and it keeps biting us because if we add even a little bit of extra compiler complexity - on linux we can see large slowdowns.
19:24:11
drmeister
The compile-file-parallel was slower than the serial compiler until we put some work into reducing the amount of stack unwinding in the cclasp compiler.
19:25:35
drmeister
It's a known problem and it appears to be a calculated choice to speed up single threaded stack unwinding at the expense of multithreaded stack unwinding.
19:26:47
drmeister
Let's see if this works: https://discordapp.com/channels/636084430946959380/636732894974312448/681528670606852107
19:29:15
Bike
karlosz and i talked a bit about the possibility of bypassing C++ exception handling when it's statically known that there are no intervening C++ cleanups to worry about, which covers some cases
19:29:29
Bike
lots of compiler support needed, though, and i don't know how good llvm's setjmp and longjmp primitives are
19:40:48
karlosz
like Bike said, i think a good approach is to have the compiler optimize away as many unwind cases as possible, either by deleting return-from (with contification or inlining), or by more sophisticated analysis in the general case
19:47:10
drmeister
The ABI states that two registers can contain return values. We use one for the first return value and the second for the number of return values.
19:49:04
drmeister
foo calls bar and if bar unwinds into foo we could have bar return to the caller with a special value in the #return-values register and test for it in the caller.
19:51:24
Bike
because if clasp is supposed to work with anybody's C++ code that's probably not possible
19:52:33
Bike
i mean i don't understand how this is supposed to work. bar is supposed to unwind to foo. you have foo return to its caller, and then i guess the caller immediately returns, and the caller's caller immediately returns, until it ends up back in foo?
20:00:55
yitzi
drmeister: https://github.com/yitzchak/common-lisp-jupyter/pull/44#issuecomment-634242652
20:02:09
karlosz
Bike: do you have an idea why cst->ast is slow? it seems like it's being dominated by gf dispatch related stuff, but did anything else strange about the profile there jump out at you?
20:02:12
yitzi
drmeister: docker image is still pushing. I haven't spent too much time trying to find the reason for the problem. Just where it is coming from.
20:02:59
Bike
i don't remember seeing dispatch be a problem. i would guess the main problem is that it recursively invokes the compiler as a whole for eval-when, local macros, etc
20:05:14
drmeister
karlosz: I wrote a special profiling tool that turns the flame graph on its head for a particular function. With it I can ask - what are the most common paths that enter cc_unwind - for instance.
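The log does not show drmeister's tool. As a rough sketch of the idea only: given sampled stacks, collect the caller chains that lead into one target function and count them. The function name `paths-into` and the sample format (a list of function names, outermost first) are assumptions made for illustration.

```lisp
(defun paths-into (target samples)
  "Count the caller chains leading into TARGET.
SAMPLES is a list of stacks, each a list of function names ordered
outermost-first. Returns an alist of (callers . count), where CALLERS
lists the immediate caller first, sorted by descending count."
  (let ((counts (make-hash-table :test #'equal)))
    (dolist (stack samples)
      (let ((pos (position target stack)))
        (when pos
          ;; Everything above TARGET, reversed so the immediate caller leads.
          (incf (gethash (reverse (subseq stack 0 pos)) counts 0)))))
    (let ((result '()))
      (maphash (lambda (path n) (push (cons path n) result)) counts)
      (sort result #'> :key #'cdr))))
```

Calling `(paths-into 'cc-unwind samples)` then answers the question "what are the most common paths that enter cc_unwind" for whatever samples were collected.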
20:06:54
karlosz
yeah, i mean cst-to-ast does have to do a recursive walk, that's unavoidable, but then i'd expect it to perform more like ast->hir if that's all it was
20:08:33
Bike
because of all the recursivity flame graphs make it kind of hard to see what's happening, i think
20:12:10
karlosz
let's see, i may be able to get a tabular output of what cst->ast is doing in another implementation
20:12:38
karlosz
if it's not some low level issue in clasp like unwind, then the same problem should occur
20:17:23
karlosz
okay, got a simple tabular profile here by precisely profiling cleavir-cst-to-ast https://paste.gnome.org/prd2ahepd
20:17:44
yitzi
drmeister: I'm putting the nglview stuff on a separate branch and will eventually update that tag to run the latest jupyter-lab vs the notebook
20:20:03
Bike
could you throw in a couple cst functions? probably... cst:parse-ordinary-lambda-list, cst:separate-function-body, cst:canonicalize-declaration-specifiers, cst:reconstruct
20:25:50
Bike
those -action things are part of the parser, as is item-equal. add-atoms and cons-table are part of reconstruct, i think.
20:27:32
Bike
what kind of file did you compile for this? since the expander is taking all that time
20:31:16
Bike
the time in reconstruct is worrying me. that's not really an expense other compilers are going to have
20:33:08
Bike
then it makes a new CST with the expanded form as the raw, and attempts to assign source info in it based on the original form's
20:33:30
Bike
so that e.g. if a subform of the macro form appears again, it has the same source info
20:33:45
Bike
https://github.com/s-expressionists/Concrete-Syntax-Tree/blob/master/reconstruct.lisp it's all in this file, which has a long explanation
20:38:37
karlosz
just that, hash table allocation and access have never been a noticeable hotspot, since i think they're rather optimized
20:39:25
karlosz
although it's a bytecode compiler to a stack machine, i think all the runtime functions are actually in C
20:40:06
Bike
maybe we could save repeated work by building a cons to cst mapping at the top, and then just using it repeatedly
20:40:32
Bike
since i think each macroexpansion will build its own cons table, which will be redundant for macro forms with macro forms in them, which is of course common
20:41:54
Bike
i guess this is a reason that some kind of hash table with source information might be preferable to the cst objects sometimes
20:42:01
karlosz
yeah, that might help a lot with the consing, and also i don't think it messes up the code structure too much
20:42:46
Bike
seems like we could just have an optional cons-table argument to reconstruct, and then bind it as a special variable or whatever
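A minimal sketch of that suggestion, assuming nothing about the real Concrete-Syntax-Tree API: the names `*cons-table*`, `build-cons-table` and `call-with-cons-table`, and the alist input format, are all hypothetical. The point is only that a nested call reuses the table its caller already built instead of building its own.

```lisp
(defvar *cons-table* nil
  "Hypothetical: the cons -> source-info table shared by nested calls.")

(defun build-cons-table (cst-alist)
  "Stand-in for the expensive walk over a CST.
CST-ALIST is a list of (cons . info) pairs, an illustration-only format."
  (let ((table (make-hash-table :test #'eq)))
    (loop for (key . info) in cst-alist
          do (setf (gethash key table) info))
    table))

(defun call-with-cons-table (cst-alist thunk &optional cons-table)
  "Call THUNK with *CONS-TABLE* bound.
Use CONS-TABLE if supplied, else the table already bound by an outer
call, and only build a fresh one when neither exists."
  (let ((*cons-table* (or cons-table
                          *cons-table*
                          (build-cons-table cst-alist))))
    (funcall thunk)))
```

With this shape, a macro form nested inside another macro form would pick up the outer binding, which is the redundancy Bike points out above.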
20:44:15
karlosz
yeah, i mean most of the functions in reconstruct.lisp seem to already work in that style
20:45:20
karlosz
and also, lambda-list-parsing seems to be bad. i'm going to see if it's doing something particularly wacky there...
20:49:07
karlosz
there's no point in walking the parser rules on every invocation, if i'm reading this right
20:53:15
karlosz
yeah, and in the profile, you clearly see that the rule munging is taking up a ton of time
22:05:59
kpoeck
The clisp manual is very detailed, but can't find anything regarding profiling, strange
22:10:09
kpoeck
looking at the swank clisp code it smells like a clone of metering in the file suspiciously called metering.lisp
22:12:32
kpoeck
So in my humble opinion I believe your answer is wrong for clisp (sbcl seems to have specific profiling code though)
22:13:17
kpoeck
But my main point, since metering works for clasp we could hack up the metering file in slime to support clasp
22:14:25
kpoeck
in swank/clisp the profiling implementation simply uses swank-monitor:monitor and friends
22:57:52
drmeister
yitzi: I can reproduce the crash in the docker image on macOS - that's easier to debug.
23:10:03
drmeister
I'm trying to get some info - but it clobbered the stack and the backtrace is still printing.
23:48:18
drmeister
I mean - I'm not disagreeing, and I'm not looking at your code. I'm trying to develop a tool so that I can figure this out going forward.
23:48:42
drmeister
What I really need to do is protect a page on the stack to catch stack overflows.
23:53:34
yitzi
Yeah, so I just put a format statement at the top of each method. And it looks like it is calling the :prefix specialization even though :root is specified.
0:25:21
karlosz
okay, not sure if i can really turn the parser into a parser generator as is, but at least the grammars can be precomputed, saving a ton of consing during lambda list parsing
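A rough sketch of what precomputing could look like, with a toy rule set and hypothetical names (`make-grammar`, `*ordinary-lambda-list-rules*`, `grammar-for-parse`); it is not the Concrete-Syntax-Tree parser. The grammar object is built once when the file is loaded, so each parse reuses it instead of allocating a new one.

```lisp
(defun make-grammar (rules)
  "Stand-in for the costly rule processing: index RULES by left-hand side."
  (let ((table (make-hash-table :test #'eq)))
    (dolist (rule rules table)
      (push (rest rule) (gethash (first rule) table)))))

(defvar *ordinary-lambda-list-rules*
  '((lambda-list required-params optional-params)
    (required-params variable)
    (optional-params &optional variable))
  "Hypothetical toy rule set, for illustration only.")

;; Built once at load time rather than on every parser invocation.
(defvar *ordinary-lambda-list-grammar*
  (make-grammar *ordinary-lambda-list-rules*))

(defun grammar-for-parse ()
  "Return the precomputed grammar; no per-parse allocation."
  *ordinary-lambda-list-grammar*)
```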
0:26:51
drmeister
For compile-file-parallel reducing consing is good. Reducing consing in the multiple threads that convert AST->HIR->llvm-ir is really good.
0:28:48
karlosz
since an average lambda list is extremely small, i'm not so much concerned about the algorithm, but the huge amount of object creation every time one is parsed
0:29:19
karlosz
though maybe people use macros in a way that gives you large lambda lists, i'm not sure