freenode/#clasp - IRC Chatlog
21:47:53
drmeister
kpoeck: We had a discussion on this end - it is difficult to unwind that one merge commit - we tried and got something that didn't build - although it should have because later commits didn't refer to the merge commit.
21:59:40
drmeister
(sigh) I had turned on cmp::*compile-file-parallel* in the wscript file - so even when I turned it off in compile-file-parallel the build system turned it back on.
22:21:42
drmeister
I put (compile-file-parallel <asdf>) in a loop - to try and reproduce a crash we are seeing.
23:51:25
drmeister
Bike: The changes to the compiler that were merged into dev - did they change the LLVM IR that was generated?
23:54:13
drmeister
Ok - I thought maybe dynamic environment changes might not affect the code generation
2:01:47
drmeister
I'll put it back and build it over and over again tonight - in another shell I'll do it with the (handler-case ...) wrapping it.
2:02:49
drmeister
I did notice that the _Binding slot of Symbol_O was not atomic - that could create a race condition if multiple threads try to write to it at the same time.
2:05:21
drmeister
It's ..........possible.......... that this could be happening here because I start up a bunch of threads to service the compile-file-parallel.
2:31:01
drmeister
gensyms are not in packages - so they get deleted once nothing refers to them. But I guess they remain as roots because the code will reference them.
2:32:06
drmeister
I'm wondering if I have some fault in my thinking about how symbols turn over - in Clasp the multithreading code for symbols gets sketchy if symbols are bound and then collected.
2:34:13
Bike
practically speaking, nearly all dynamic variable names will be referenced from packages, so those are fine
2:37:43
Bike
anyway that doesn't seem to be what's happening here given that the symbols are things like *BASIC-BLOCKS* that aren't being collected
2:39:56
drmeister
I'm thinking more that the machinery for making symbol bindings thread safe isn't itself thread-safe.
2:40:53
drmeister
Since _Binding wasn't atomic it was possible that it could be set by two threads in a race condition - that would lead to weird errors not unlike what we are seeing.
2:41:49
drmeister
It's by no means clear that this is the problem but compile-file-parallel could expose a problem because it's a bunch of threads that start up and end at about the same time.
2:45:04
drmeister
The _Binding slot stores a size_t variable that holds an index into a thread-local vector that stores bindings for symbols
2:46:01
drmeister
IIRC the first time a thread binds a dynamic variable the _Binding slot is set and that is the symbol's index for life.
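[Editor's sketch] The layout described above (a per-symbol index into a thread-local vector of bindings) might look roughly like the following. This is a minimal illustration, not Clasp's code: `BindingSlot`, `tl_bindings`, `bind_value`, and `lookup_value` are hypothetical names, and values are plain `int`s rather than Lisp objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one entry in a thread's binding table.
struct BindingSlot {
  bool bound = false;  // false means "unbound in this thread"
  int value = 0;       // placeholder for the bound Lisp value
};

// Each thread gets its own copy of this vector; a symbol's index
// (the value kept in its _Binding slot) selects the entry.
thread_local std::vector<BindingSlot> tl_bindings;

// Bind a value at the given symbol index in the calling thread.
void bind_value(std::size_t index, int value) {
  if (tl_bindings.size() <= index) tl_bindings.resize(index + 1);
  tl_bindings[index].bound = true;
  tl_bindings[index].value = value;
}

// Return a pointer to the value, or nullptr if the index has no
// binding in the calling thread.
const int* lookup_value(std::size_t index) {
  if (index >= tl_bindings.size() || !tl_bindings[index].bound) return nullptr;
  return &tl_bindings[index].value;
}
```

Under these assumptions, a lookup through an index that differs from the one used when binding finds no entry and would be reported as unbound, which is why the index has to be agreed on once and never change.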
2:47:34
drmeister
The errors that we have been seeing when compile-file-parallel fails are "this variable is unbound" or "that variable is unbound", IIRC
2:49:24
drmeister
I've changed the _Binding to an atomic variable and I'm going to put in a print statement that will print if a race condition is ever detected.
2:50:02
drmeister
I'll have it do the right thing - but print a message - then we can watch for that message.
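[Editor's sketch] One way to "do the right thing but print a message" is a compare-and-swap on the atomic slot: the first thread to install an index wins, and a losing thread adopts the winner's index and logs the event. This is an assumption about the approach, not the actual Clasp change; `SymbolSketch`, `kNoIndex`, `g_next_index`, and `claim_binding_index` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Sentinel meaning "no index assigned yet" (hypothetical).
constexpr std::size_t kNoIndex = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a symbol with an atomic _Binding-like slot.
struct SymbolSketch {
  std::atomic<std::size_t> binding{kNoIndex};
};

// Global counter that hands out fresh indices.
std::atomic<std::size_t> g_next_index{0};

// Return the symbol's index, assigning one on first use.
std::size_t claim_binding_index(SymbolSketch& sym) {
  std::size_t current = sym.binding.load();
  if (current != kNoIndex) return current;  // already assigned

  std::size_t ours = g_next_index.fetch_add(1);
  std::size_t expected = kNoIndex;
  // Install our index only if the slot is still unassigned.
  if (sym.binding.compare_exchange_strong(expected, ours)) return ours;

  // Lost the race: 'expected' now holds the index the other thread set.
  std::fprintf(stderr,
               "WARN: another thread set the binding slot to %zu "
               "before we could set ours to %zu\n",
               expected, ours);
  return expected;
}
```

In this sketch a losing thread's freshly allocated index is simply left unused, which wastes one table entry per lost race but keeps every thread agreeing on a single index per symbol.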
3:24:27
drmeister
I printed the _Binding of symbols as they are assigned - and it's very possible that there is a race condition...
3:29:19
drmeister
Once the compiler starts somewhere around line 40 there are many more symbols bound.
3:31:50
drmeister
Here I printed the thread id - there are threads racing to bind symbols at the same time.
3:43:47
drmeister
! WARN WARN WARN !!!! ANOTHER THREAD SET A SYMBOL _Binding slot to 79 before we could set ours to 80!!!!
3:45:03
drmeister
! WARN WARN WARN !!!! ANOTHER THREAD SET A SYMBOL _Binding slot to 122 before we could set ours to 123!!!!
6:42:41
drmeister
Everyone - I pushed to the 'work' branch a version of clasp that I think fixes this very random problem with compile-file-parallel.