freenode/#clasp - IRC Chatlog
19:34:36
Bike
the change i made to condition variables was giving them a __repr__ that displayed their name
19:38:35
Bike
well, it's a basic condition variable. It hangs until the condition variable is notified (with `pthread_cond_signal` or `pthread_cond_broadcast`, I think)
19:42:41
kpoeck
There is a bug report that the test for condition variable is bad https://github.com/sionescu/bordeaux-threads/issues/65
19:45:06
kpoeck
https://github.com/sionescu/bordeaux-threads/blob/master/test/bordeaux-threads-test.lisp#L129
19:45:41
kpoeck
This (loop until (= i *shared*) do (condition-wait *condition-variable* *lock*)) looks wrong
19:46:21
Bike
that's the usual pattern for using condition variables. the problem is that the test expects the threads to operate in a particular order
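The "usual pattern" mentioned here (re-check the predicate in a loop around the wait) can be sketched in C++ with `std::condition_variable`. This is only an illustration of the idiom, not the bordeaux-threads test or Clasp's `mp:` implementation; the function name `run_in_turn` and the turn-taking scheme are assumptions made for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration: worker i waits until `shared` equals i,
// records its index, then increments `shared` and wakes all waiters.
std::vector<int> run_in_turn(int n_threads) {
    std::mutex lock;
    std::condition_variable cv;
    int shared = 0;
    std::vector<int> order;
    std::vector<std::thread> workers;

    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([&, i] {
            std::unique_lock<std::mutex> guard(lock);
            // Loop on the predicate: a wakeup may be spurious, or it may
            // have been intended for a different waiting thread.
            while (shared != i) {
                cv.wait(guard);
            }
            order.push_back(i);
            ++shared;
            // Broadcast, because the thread whose turn is next could be
            // any of the waiters.
            cv.notify_all();
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    return order;
}
```

Because every waiter re-checks its own predicate, the result does not depend on the order in which the threads happen to start or wake up, which is the property a test of this kind should rely on.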
20:03:53
Bike
i have the personality function working to the point i can start the system, but it's still kinda borked
20:07:26
kpoeck
So MP:CONDITION-VARIABLE-WAIT only returns if mp:condition-variable-signal is called?
20:09:30
Bike
::notify kpoeck our condition-variable-wait, -signal, and -broadcast functions implement this https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
20:12:00
Colleen
kpoeck: Bike said 2 minutes, 30 seconds ago: our condition-variable-wait, -signal, and -broadcast functions implement this https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
20:40:16
drmeister
We are probably already there in terms of what compile-file-parallel can get us. On Mac it’s 2.5-3.5x speed up
21:43:51
drmeister
dtrace is also a great tool here. If you have a situation where there are thousands of calls to something and then it fails, you can have it store backtraces for each call and then look at just the last one.
21:45:40
Bike
if std::terminate gets called thousands of times we're in a pretty weird situation, i think
21:45:47
drmeister
clasp/src/profiler/scripts/profile_throw shows you how to do it. Change the *cxx_throw*:entry to *whatever*:entry and it will save a backtrace every time any function that contains "whatever" in the name is entered.
21:46:29
drmeister
I was thinking of the situation where the stack unwinds and you lose the information about whatever was doing a throw.
21:47:16
drmeister
I've set up wscript now so that we can build clasp in "object" mode but compile-file-parallel takes over once Cleavir is loaded.
21:57:12
drmeister
https://github.com/clasp-developers/clasp/blob/dev/src/lisp/kernel/lsp/setf.lsp#L581
22:01:22
Bike
multiple-value-bind shouldn't make a function boundary. with the macroexpansion it will, but in cleavir we use a compiler macroexpansion instead.
22:01:31
drmeister
It did unwind during macroexpansion. I was thinking about babel's monster macroexpansion.
22:03:18
Bike
So you're saying the push macroexpander is unwinding. Not multiple-value-bind's or anything.
22:03:47
Bike
But it shouldn't be, because like I said, multiple-value-bind compiler macroexpands into something that doesn't involve any function boundaries.
22:06:29
drmeister
No - I'm building bclasp and for yuks I pointed do-flame-throw at it. I saw what I pasted above and then I started thinking about macroexpanders throwing exceptions and now I see that this wouldn't be a problem for babel for the reason you just described.
22:07:50
drmeister
I am poking around with this new tool to see if there are hidden speed bumps due to too much unwinding. Unwinding is bad in parallel - but it's also slow in serial code. You are working to improve that situation.
22:10:05
Bike
which i guess suggests that funwind protect is trying to rethrow an exception that doesn't exist
22:11:56
Bike
"Nested foreign exceptions, or rethrowing a foreign exception, result in undefined behaviour." oh.
22:12:35
Bike
Guess I'll have to do unwind protect with destructors instead. wasn't that how it went before? i'll check the commit log first
22:16:34
drmeister
If I remember correctly getting the semantics right was tricky. Destructors didn't... quite... work? I don't want to freak you out though - I may be wrong.
22:17:05
Bike
well i mean the existing unwind protect implementation works fine with the new throw, for whatever reason
22:17:32
Bike
oh, right. destructors wouldn't work since the unwind protect cleanup might itself throw, but a destructor throwing is N.G.
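The point about destructors can be made concrete with a small sketch. Since C++11, destructors are implicitly `noexcept`, so an exception escaping one calls `std::terminate`; a cleanup that is allowed to throw therefore has to run from a `catch (...)` block instead. The helper `unwind_protect` below is a hypothetical illustration of that shape, not Clasp's `core__funwind_protect`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>

// A user-written destructor is still noexcept unless declared otherwise,
// so cleanup code that might throw cannot safely live in one.
struct Guard {
    ~Guard() {}
};
static_assert(std::is_nothrow_destructible<Guard>::value,
              "destructors are noexcept by default");

// Hypothetical helper: run `body`, then always run `cleanup`.
// If `body` throws, `cleanup` runs inside the handler and the original
// exception is rethrown afterwards. If `cleanup` itself throws there,
// its exception replaces the original one, which is legal in a handler.
template <typename Body, typename Cleanup>
void unwind_protect(Body body, Cleanup cleanup) {
    try {
        body();
    } catch (...) {
        cleanup();
        throw;  // continue the original unwind
    }
    cleanup();  // normal exit path
}
```

A call such as `unwind_protect([] { throw std::runtime_error("body"); }, [] { /* release resources */ });` runs the cleanup and then propagates the `runtime_error` to the caller.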
22:23:08
drmeister
Itanium exception handling is a bit more flexible than C++ exception handling. You need a 'finally' clause - right?
22:24:04
Bike
I don't know what I need. I don't understand why things are working or not working. The Itanium ABI doc seems to say rethrow of a foreign exception (e.g. by "catch (...) { ... throw; ... }") is illegal, but (a) that's not what I observe, and (b) that would be really stupid
22:26:01
Bike
Maybe. I already dumped in a big block of text describing our problems and nobody read it.
22:27:59
Bike
But since I don't actually understand what the problem is here I'm not sure what to ask
22:30:10
Bike
It's not just because the exception is foreign now - because then unwind-protect would never work at all. But it is.
22:31:17
drmeister
You mean core__funwind_protect is catching the foreign exception - but it looks like it just can't rethrow it.
22:32:42
drmeister
Could you drop by my office and explain this? I've lost track of what case is failing.
22:43:04
drmeister
Building cl-ppcre on linux with compile-file-parallel and I'm getting between 150% and 450% CPU.
22:47:12
drmeister
So I still didn't get it - can we use the SJLJ stuff? Can we make it work with C++?
22:49:31
Bike
there's the SJLJ that means the setjmp and longjmp operators, and there's the SJLJ that means implementing C++ try/catch using setjmp/longjmp
22:49:49
Bike
the latter is kind of irrelevant to us, but it's what that guy was referring to yesterday.
22:51:06
drmeister
How does one implement C++ try/catch with setjmp/longjmp? That's a compiler switch?
22:51:32
Bike
yeah, -fsjlj-exceptions, and you also have to rebuild a bunch of system libraries like libunwind
22:52:17
drmeister
I see - you can't load a dynamic library that was not built with -fsjlj-exceptions.
22:53:11
Bike
i mean, probably. but this is basically irrelevant to us because we're not implementing C++
22:54:33
drmeister
I thought it could be relevant if we restricted ourselves to only use code that was compiled from C++ source using -fsjlj-exceptions and we changed our exception handling to use llvm sjlj intrinsics.
22:55:39
Bike
It's not relevant to us because we'd still need to call exception handlers and everything.
23:00:56
cracauer
Bike: did you ever remove or rename those files with '@' in the filename in your copy of the SICL tree? That keeps us from running with some customers.
23:01:48
drmeister
cracauer: We are seeing the 2-3x speedup with compile-file-parallel now. Once Bike pushes a change to sicl we should be able to start the buildbot with the new changes.
23:04:39
cracauer
The @ is actually not the dealbreaker, but either way it would be good to have the filenames normalized with no special characters, and maybe even no spaces.
23:05:09
Bike
alright. right now i'm building with the sicl changes to double check everything's fine. I'll just delete that entire directory in my branch
23:14:16
drmeister
cracauer: Do you have a wscript.config file on the buildbot or are you using the default settings?
23:20:56
drmeister
Ok. I've made the default on linux and freebsd CLASP_BUILD_MODE = "object" and USE_COMPILE_FILE_PARALLEL = True
23:21:59
drmeister
Lang Hames is taking the linker in a very nice direction. The runtime linker is going to behave like the system linker.
23:26:16
drmeister
What it's building now will be the first time we had the buildbot building the quicklisp code with compile-file-parallel in a way that could speed up the build.
1:52:38
Bike
alright so the libcxxabi code is definitely written to allow rethrowing foreign exceptions, so that's something.
1:52:51
Bike
i'm probably going to have to do something dumb like step through __cxa_rethrow to figure out the problem though.
2:05:08
Bike
wait, i know. we throw the exception. it's actually handled by the unwind protect's catch. it rethrows the exception. nothing handles it higher up, so __cxa_rethrow terminates.
2:27:02
Bike
we could fix unwind protect to do something that's like rethrow but doesn't terminate stupidly, but this would still apply for any C++ frame in the way that does catch (...)
2:28:40
Bike
so yeah, custom personality is a nonstarter. sweet. guess i'll ask about it on the discord tomorrow to see if i'm missing something, though.
3:18:21
Bike
we don't want to catch it. we want the throw code to signal an error instead of calling std::terminate
3:20:41
Bike
C++ is exciting because not only can I shoot myself in the foot, complete strangers can shoot me in the foot as well
3:23:35
drmeister
That's why I prefer talking to the llvm people. They have the right attitude. These things are language features and they should be efficient.
3:24:28
Bike
honestly i think it is efficient. the dwarf eh stuff is a good solution. it's just that the problem is stupid
3:25:45
Bike
that's not even true. i like the unwind library design. but it's in service to C++ and C++ has stupid semantics
3:27:24
Bike
like what the hell, you know? there's even a terminate handler thing to customize death behavior, but you can't go "oh i'd actually rather not shut down my entire process"
3:38:04
drmeister
This is on a per-thread basis - so it's reporting the main thread, the AST compilation part
3:44:03
drmeister
https://github.com/robert-strandh/Eclector/blob/master/code/reader/read-common.lisp#L65
3:53:02
drmeister
When I time (block FOO (let ((*special* 1234)) (return-from FOO 'nil))) I can only do about 850/second.
3:57:03
drmeister
(time (foo 2200)) --> Time real(0.976 secs) run(0.976 secs) consed(123200 bytes) unwinds(2200)
3:58:42
drmeister
Stack unwinding using C++ exception handling is very slow and it involves a mutex - so it slows down a lot when doing it in multiple threads.
4:00:13
drmeister
Bike is working on incorporating cleavir2 unwinding constructs (I probably mischaracterized that) into cleavir1 so we can get rid of a lot of unwinding and get rid of call-with-variable-bound especially.
4:01:29
Bike
it's the fact call-with-variable-bound induces a function boundary. the function itself doesn't unwind, no.
4:01:40
drmeister
Right - it's that there are a lot of return-from's and go's that cross the c-w-v-b boundary.
4:02:41
Bike
for example, every restart-case involved an unwind even in the normal case, though we could fix that one with a bit of rearrangement
4:02:47
drmeister
We had two functions in cleavir that were doing this in a gratuitous way. We rearranged the code so that they used the normal control flow to return values and now our parallel compiler gives us a 2-3x speedup.
4:03:36
Bike
the one i changed in cleavir was because i wrote it to rely on destructuring-bind signaling type errors. it was just bad.
4:04:24
drmeister
And we are at that limit. Could the eclector reader be determining the speed of the compiler?
4:05:17
Bike
and yeah, i think handler-case will unwind, but we could make it not by doing the same sort of rearrangement
4:06:04
drmeister
I don't know. The regular profiling has not revealed this problem. We only got this far because of that new do-flame-throw tool. But it doesn't tell us where time is being spent.
4:06:54
drmeister
I'm just rationalizing based on that I can only do about 2200 (block foo (let ((*special* ...)) (return-from foo ...))) a second and we are seeing about 2200 unwinds a second with eclector.
4:07:24
Bike
given that twenty minutes ago you said you could only get 850 i'm not sure i trust that logic...
4:09:53
Bike
I've been seeing if I could work around the C++ runtime to improve our performance and my conclusion is that C++ sucks and I hate it.
4:10:22
Bike
Well, I don't think they mind it for errors so much, since errors are usually pretty cold code
4:11:03
Bike
but yeah, it is an influence... plenty of LLVM functions return something like a haskell Maybe instead of signaling
4:11:48
Bike
I'm not totally clear on the details unfortunately. It has to do with how the unwinding information is stored with the code.
4:12:11
Bike
On Linux it seems to iterate over all loaded shared objects to find the code for a given return address.
4:12:26
Bike
Which is like, not very fast, and also requires synchronization because you could load a new object while doing that.
4:14:14
Bike
I might look deeper into it. If this kind of thing is actually necessary it's a pretty important caveat as concerns the "zero cost" stuff. But it's also possible they just haven't bothered trying to improve it.
4:15:59
Bike
Since we might have C++ frames hanging around between Lisp functions, we have to clean those frames up C++ style, which will involve these kinds of lookups, I think
4:16:15
drmeister
https://github.com/s-expressionists/Eclector/blob/master/code/reader/read-common.lisp#L38
4:17:02
drmeister
I think we should be able to rewrite any code that does this so that the common case uses the normal return path.
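As a rough C++ analogue of this rewrite, the sketch below contrasts a helper that reports a miss by throwing (so every miss unwinds the stack) with one that reports it as an ordinary return value. The names `MaybeInt`, `digit_value`, and the two `sum_digits` functions are made up for the example and do not correspond to Eclector or Cleavir code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Exception-based: every non-digit character causes an unwind.
int digit_value_throwing(char c) {
    if (c < '0' || c > '9') {
        throw std::invalid_argument("not a digit");
    }
    return c - '0';
}

// Return-based: a miss is just a value on the normal return path.
struct MaybeInt {
    bool ok;
    int value;
};

MaybeInt digit_value(char c) {
    if (c < '0' || c > '9') {
        return MaybeInt{false, 0};
    }
    return MaybeInt{true, c - '0'};
}

// Sums the digits in `s`, unwinding once per non-digit character.
int sum_digits_throwing(const std::string& s) {
    int sum = 0;
    for (char c : s) {
        try {
            sum += digit_value_throwing(c);
        } catch (const std::invalid_argument&) {
            // skip non-digits
        }
    }
    return sum;
}

// Same result, with no unwinding at all.
int sum_digits(const std::string& s) {
    int sum = 0;
    for (char c : s) {
        MaybeInt v = digit_value(c);
        if (v.ok) {
            sum += v.value;
        }
    }
    return sum;
}
```

Both versions compute the same sum; the difference is only in how often the unwinder is entered, which is the cost being discussed here.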
4:20:24
Bike
i spent, what, like a week on this personality function thing? digging around in the shadowy complex runtime for the "close to the metal" language? and in the end i can't do it because of how c++ is. i can't help but be kind of pissed
4:23:09
Bike
in C++ you have the try/catch language constructs, but in actual code they call out to a runtime library, an unwinder function like you have in sicl.
4:23:51
Bike
This library has an ABI so it can be used by other language runtimes with different exception semantics, at least in theory. So the unwinder is supposed to be able to unwind through frames that conceptually belong to some other non-C++ language.
4:24:35
Bike
To accomplish this, what the unwinder does for each frame is call an associated "personality function", which handles the exception semantics for that language in relation to the current unwinding.
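A toy model may help picture this. The sketch below is not the Itanium ABI interface (the real unwinder works in two phases, a handler search followed by cleanup, and personalities have a different signature); it collapses that into one pass over a vector of hypothetical `Frame` records, each with an optional personality function pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Action { ContinueUnwind, HandlerFound };

struct Frame;
// Toy personality: inspects one frame for a given exception tag.
using Personality = Action (*)(const Frame& frame, int exception_tag,
                               std::vector<std::string>& log);

struct Frame {
    std::string name;
    Personality personality;  // nullptr: frame is irrelevant to unwinding
    int catches_tag;          // only used by catching_personality
};

// Like an unwind-protect: run cleanup, then let the unwind continue.
Action cleanup_personality(const Frame& f, int, std::vector<std::string>& log) {
    log.push_back(f.name);
    return Action::ContinueUnwind;
}

// Like a catch clause: stop the unwind if the tag matches.
Action catching_personality(const Frame& f, int tag, std::vector<std::string>&) {
    return tag == f.catches_tag ? Action::HandlerFound : Action::ContinueUnwind;
}

// Walk from the innermost frame (back) outwards. Returns the index of the
// frame that handles the exception, or -1 if no frame does.
int unwind(std::vector<Frame>& stack, int exception_tag,
           std::vector<std::string>& log) {
    while (!stack.empty()) {
        const Frame& f = stack.back();
        if (f.personality != nullptr &&
            f.personality(f, exception_tag, log) == Action::HandlerFound) {
            return static_cast<int>(stack.size()) - 1;
        }
        stack.pop_back();  // frame abandoned
    }
    return -1;
}
```

Each language runtime would supply its own personality, so the same walking loop can cross frames with different exception semantics.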
4:25:24
beach
So unwinding also has to process each stack frame, unlike what you would do in Common Lisp?
4:26:33
Bike
well if a frame has no relation to unwinding - like in lisp, no blocks or unwind-protects or anything - the personality for that frame can just be blank, so the unwinder ignores it
4:28:02
Bike
it'll do more iteration than going through a stack of just relevant entries like in sicl, i imagine.
4:28:42
beach
Whereas in Common Lisp, an exit point would just not rely on callee-saves registers, so that the rest of the stack can just be abandoned.
4:28:56
Bike
I should also mention that there's an alternate implementation of C++ exceptions that uses chains of setjmp/longjmp buffers, which conceptually seems much more like what SICL does
4:29:22
Bike
since there's a small runtime cost whenever you enter a frame that's exception-relevant
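The chain-of-buffers idea can be sketched with plain `setjmp`/`longjmp`. This is a simplified illustration, not what a compiler emits for `-fsjlj-exceptions`: the names `Handler`, `toy_throw`, and `call_with_handler` are invented, the chain is a single global rather than per-thread, and no frame here has a non-trivial destructor (jumping over one with `longjmp` is undefined behaviour in C++).

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// One record per exception-relevant frame, linked into a chain.
struct Handler {
    std::jmp_buf env;
    Handler* previous;
};

static Handler* current_handler = nullptr;  // head of the chain
static int thrown_code = 0;                 // payload of the toy "exception"

// Toy throw: pop the innermost handler and jump to it.
[[noreturn]] void toy_throw(int code) {
    Handler* h = current_handler;
    if (h == nullptr) {
        std::abort();  // nothing registered; loosely analogous to std::terminate
    }
    current_handler = h->previous;
    thrown_code = code;
    std::longjmp(h->env, 1);
}

int risky(int x) {
    if (x < 0) {
        toy_throw(42);
    }
    return x * 2;
}

int call_with_handler(int x) {
    Handler h;
    h.previous = current_handler;
    current_handler = &h;  // the small cost paid on every entry
    if (setjmp(h.env) != 0) {
        // Reached via longjmp; toy_throw already unlinked this handler.
        return -thrown_code;
    }
    int result = risky(x);
    current_handler = h.previous;  // unlink on normal exit
    return result;
}
```

The registration and unlinking happen whether or not anything is thrown, which is the per-entry cost mentioned above; in exchange, the throw itself never has to search tables for each frame.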
4:30:48
drmeister
We could use that Bike. We could compile all of our C++ code using SJLJ and generate SJLJ code in clasp - couldn't we?
4:31:33
Bike
We would also need anyone who links any extension C++ code to use SJLJ for it too, right? Including libraries they may not be able to compile because they're part of the system or proprietary.
4:31:48
drmeister
Hmm, we would have to compile everything with that - llvm, clang, GMP, libz... anything in C++
4:32:55
drmeister
Yeah - we don't want to go that direction. We just need to reduce the number of unwinds we do.
4:34:23
Bike
i think avoiding c-w-v-b is probably going to be more of a win than rewriting code, generally, so i'm going to focus on that
4:34:45
drmeister
I'm not convinced that it has to. It seems to me that the regular return path will always be faster than unwinding - so we should use it for the common case. It would surprise me if that makes the code less readable.
4:35:22
Bike
if you could find examples that don't involve c-w-v-b that would be interesting, drmeister. like the one in inlining
4:36:41
Bike
presumably the full thing is (block ... more code here (let ((*mumble* ...)) (return ...)) yet more code...)
4:37:40
beach
The RETURN could involve the value of *MUMBLE*, and the code after the LET must be abandoned.
4:37:51
drmeister
Doesn't it? I'm not convinced that I'm right - and you are a more subtle common lisp programmer than I.
4:38:07
Bike
i don't really see any reason to rewrite ones with special variable bindings since we should be able to just eliminate c-w-v-b
4:40:15
drmeister
Really deep backtraces, excessive unwinding, and forcing lexical variables into closures are three problems with it.
4:44:18
drmeister
So we are going to move in the direction of eliminating c-w-v-b. We'd like to do it in a way that segues into how cleavir2 does things. I don't know what that is - but Bike does - so I'll leave it to him.
4:54:32
drmeister
The backtraces are really, really deep and I have to truncate them to get anything to display.