freenode/#clasp - IRC Chatlog
Search
13:50:57
Colleen
drmeister: frgo said 1 hour, 25 minutes ago: Re MPS and signals: I think we need to change behavior in file src/gctools/interrupt.cc: ADD_SIGNAL( SIGSEGV, "+SIGSEGV+", ext::_sym_segmentation_violation); - Am I right that this estalishes a handler represented by the symbol ext::_sym_segmentation_violation? If so: when using MPS on Linux, we're not allowed to do that. If not: I'd like to know what this line actually does.
13:54:36
drmeister
My friend at Ravenbrook got back to me and wants to see a backtrace. The machine I'm using is in the Amazon Cloud - so I can give him access as well.
13:58:48
drmeister
It doesn't reproduce the problem with the cases I tried last night - it works on simple cases.
13:59:57
drmeister
Nope - it does - I was using cclasp - it behaves differently. In aclasp it fails like it did last night.
14:06:28
drmeister
I can reproduce the problem and I passed it on to David along with stassats' observation.
14:12:41
frgo
Error >>>>>>>> In file included from /opt/common-lisp/lang/clasp/src/clasp/src/gctools/interrupt.cc:2:
14:12:42
frgo
In file included from /opt/common-lisp/lang/clasp/src/externals-clasp/llvm50/include/llvm/Support/ErrorHandling.h:18:
14:13:31
drmeister
I just realized something - I can create Amazon Cloud machines with Clasp running and give people access to them. Great for debugging.
14:14:01
drmeister
frgo: That was the same problem that you had last night - this is with the new externals-clasp build?
14:16:03
frgo
LLVM_CONFIG_BINARY = "/opt/common-lisp/lang/clasp/src/externals-clasp/llvm50/build-release/bin/llvm-config"
14:17:39
drmeister
Yes - that is all fine - you can remove the EXTERNALS_CLASP_DIR line - that's not used anymore.
14:19:20
drmeister
This is the contents of the llvm/Config directory that I think your system wants:
14:19:49
drmeister
The peculiar thing is that I don't have an llvm-config.h file and I don't see the problem that you do.
14:21:14
frgo
AsmParsers.def.in AsmPrinters.def.in Disassemblers.def.in Targets.def.in abi-breaking.h.cmake config.h.cmake llvm-config.h.cmake
14:22:59
frgo
As soon as you actually build LLVM there is a llvm-config.h there... - in build-release/include/llvm/Config/llvm-config.h
14:24:43
drmeister
That's my /externals-clasp/llvm50/build-release/include/llvm/Config/ directory - and yes there is an llvm-config.h
14:25:24
frgo
No - it's there: AsmParsers.def AsmPrinters.def Disassemblers.def Targets.def abi-breaking.h config.h llvm-config.h
14:26:07
frgo
It's just that the directory ".../externals-clasp/llvm50/include" is not set as an include dir by wscript.
14:29:21
Bike
it seems that the include has to be built. the source only has whatever kind of pre file.
15:32:15
drmeister
I see problems when I try to create >50 threads on OS X and then there are the problems that we ran into on Linux.
15:39:56
drmeister
frgo: I can give you access to the Linux machine that has Clasp built and exhibits the problem - would that help?
15:47:04
frgo
Later on - I am setting up a small app that helps demonstrate the issue. For not havimg to build clasp over and over ;-)
17:25:45
drmeister
It writes out JITted symbols and their addresses and sizes to symbolicate backtraces.
18:58:12
clasper
drmeister: according to this video: https://vimeo.com/216547984 Azul has done a lot of work on the llvm for jited languages
18:58:52
clasper
it is nice to know that they are making the llvm mare amenable to managed languages
19:18:52
drmeister
printf("%s:%d caught signal sig=%d SEGV_ACCERR=%d siginfo_t=%p context=%p \n", __FILE__, __LINE__, sig, SEGV_ACCERR, info, context);
19:22:50
drmeister
So, what am I looking at here? Is sigHandler called because the program touched memory that had a barrier over it?
19:23:22
frgo
and it is ok to get a SIGSEGV. What's the backtrace at that point? As we have a core we should be able to see what happened.
19:31:13
stassats
but they always change, so it's something not doing enough book-keeping, and not search_cache being broken
19:34:17
drmeister
I'm checking to see if that cache is still thread local and being set up properly.
19:34:21
stassats
now that i'm just aborting on bad addresses, i'm always getting faults in search_cache
19:36:34
drmeister
Could it be that search_cache is doing something that is not MPS, moving garbage collection, safe?
19:37:24
drmeister
It's an open hashed hash table of selector keys to effective method functions - for single dispatch C++ methods.
19:37:41
drmeister
It is not location aware (yet) - but I was hoping to get rid of it and use fastgf.
19:37:42
stassats
now i'm faulting at core::DynamicBindingStack::pop_binding , tried a different test case
19:38:43
drmeister
Could you explain a bit more about these fault addresses? These are addresses that the system is trying to read and causing a SIGSEGV?
19:39:32
drmeister
Where do you find the address? In the siginfo_t structure or the context passed to sigHandle in protli.c?
19:40:57
stassats
most of the addresses are pretty low, i'm thinking you are accessing zeroed memory with some offsets
19:43:48
drmeister
The pointer to it (a root) is stored at the top of the stack. This is certainly true for the main thread. I'll recheck the code for threads.
19:43:58
stassats
this->_ThreadLocalBindings.resize(index+1,_NoThreadLocalBinding<T_O>()); does that still happen in mps memory?
19:45:54
drmeister
https://github.com/drmeister/clasp/blob/dev/include/clasp/gctools/threadLocalStacks.h#L33
19:46:19
drmeister
It's a gctools::Vec0<T_sp>. That's a stretchy vector that is maintained with the GC managed memory.
19:47:06
drmeister
https://github.com/drmeister/clasp/blob/dev/include/clasp/gctools/threadlocal.h#L14
19:47:33
drmeister
The entire ThreadLocalState is stored in the stack of the thread, right when the thread starts up.
19:50:01
stassats
drmeister: if ThreadLocalState is allocated before mps_root_create_thread_tagged, what happens?
19:53:11
drmeister
Oh sh*t - I don't even have the allocation points initialized when I declare the ThreadLocalState on the stack.
20:12:09
stassats
i'm clearly seeing three threads calling core::MDArray_O::vectorPushExtend with this being 0x7fc24a957020
20:14:14
drmeister
Could you give me the test case that you are using to generate this problem? But it's not completely reproducible - is it?
20:14:54
stassats
(loop repeat 1000 do (mp:process-run-function nil #'(lambda () (core:bformat nil "module%s" 10))))
20:25:23
drmeister
I'm not however - I've been doing fine with this. Could it be the guards that I have in place?
20:31:48
drmeister
I don't understand how three threads can have the same _BFormatStringOutputStream - each thread gets its own.
20:33:58
drmeister
I put in a printf statement right after the _BFormatStringOutputStream is initialized - this is what I get when I run your testcase:
20:38:22
drmeister
next-run-time-module-name is accessing a dynamic variable in a thread unsafe way.
20:41:04
drmeister
I should get rid of that special variable then and every module will have the same name.
20:41:32
drmeister
Although this can't be the true problem because it was happening in aclasp as well - which doesn't do dispatch.
20:42:18
drmeister
What would you recommend for the naming scheme? A thread local name with a counter that is thread local? Or put a lock around the name calculation?
20:46:19
drmeister
So, it's a function that takes a symbol value and does an atomic incf on its symbol-value
20:47:49
drmeister
Now that you mention it, I will check the exact sequence against the one in mpPackage.cc
20:59:25
stassats
my_thread->_BFormatStringOutputStream seems to be indeed different, but the vector to which it push is the same
21:01:30
drmeister
And the main thread doesn't appear to call my_thread->initialize_thread() searching...
21:11:18
drmeister
Hmm, no - that's not it. The ThreadLocalState does not invoke the MPS allocators - so the order is fine and the order I had in start_thread previously was fine.
21:21:32
drmeister
With this: (loop repeat 1000 do (mp:process-run-function nil #'(lambda () (core:bformat nil "module%s" 10))))
21:23:55
stassats
i'm currently using (loop repeat 100 do (mp:process-run-function nil #'(lambda () (catch 'x (eval '(throw 'x 10))))))
21:26:23
drmeister
There are differences - but allocation points are created before allocations take place.
21:29:28
drmeister
That takes more work - and I have to share the start_thread code - that is possibly giving us trouble.
21:30:51
drmeister
The main thread code isn't a simple function like start_thread - it's more scattered and comes in from main(...)
21:44:40
stassats
just need to restrict the number of backtrace frames printed, otherwise i'm losing the numbers out of sight
21:45:19
stassats
crashing at this->_Data->rowMajorAset(idx+this->_DisplacedIndexOffset,newElement);
21:59:27
drmeister
I'm single stepping through the initialize_thread method where the _BFormatStringOutputStream is created - I noticed there is a thread unsafe increment of a counter that I put in that counts whenever a class is allocated - I will make that thread safe using an atomic variable
22:18:00
stassats
(mp:process-run-function nil #'(lambda () (let ((x (make-string-output-stream))) (write-char #\a x) (loop (assert (find #\a (get-output-stream-string x)))))))
22:30:00
stassats
(mp:process-run-function nil #'(lambda () (loop (core:bformat "a%p" 10)))) crashes on its own
22:51:39
drmeister
If you want a simpler test case - I can get it to crash in the interpreter - with no loaded Common Lisp.
22:52:07
drmeister
(let ((c 1000)) (tagbody top (mp:process-run-function nil #'(lambda () (eval '(list 1 2 3 4)))) (setq c (- c 1)) (if (> c 0) (go top))))
22:54:51
drmeister
This means the fault happened in frame #4? Or can the signal handler be caused by a fault in one of the other 6 threads?
22:56:26
drmeister
Interesting - now it created about 100 threads and then failed with a different error
22:59:07
stassats
the thread struct, with local bindings, bformatStringOutputStream, should be a root itself
23:02:29
stassats
even then, &stack_base isn't guaranteed to be the bottom of the stack where you put your other stuff
23:03:42
drmeister
Right - it's in the frame - but where in the frame relative to the ThreadLocalState is anybodies guess.
23:07:21
stassats
https://stackoverflow.com/questions/1102049/order-of-local-variable-allocation-on-the-stack?answertab=votes#tab-top
23:09:52
drmeister
Sorry - the Process_O::make_process call is here: https://github.com/drmeister/clasp/blob/dev/src/core/mpPackage.cc#L283
23:10:13
drmeister
Process_O::enable() is what calls pthread_create: https://github.com/drmeister/clasp/blob/dev/include/clasp/core/mpPackage.h#L140
23:11:38
drmeister
In start thread I declare stack_base on the stack - but who knows where that ends up on the stack relative to the ThreadLocalState - which must, MUST be above it (on X86_64 below it because the stack grows down).
23:21:03
drmeister
They say you find it the way I do it for the main thread - in a function that calls the function that calls mps_root_create_thread_tagged and that you should ensure that the inner function that calls mps_root_create_thread_tagged is not inlined.
23:25:13
drmeister
I'll add __attribute__((noinline)) to initializeMemoryPoolSystem and initializeBoehm
23:27:29
drmeister
I have all of the Boehm code that would call the API to set the stack base commented out and I recall that Boehm was pretty smart about this stuff.
23:29:18
stassats
so, what i think should happen to be totally robust, no allocation before mps_root_create_thread is called
23:29:20
drmeister
So what did you do to find the cold end of the stack? I was girding my loins to copy that code you pasted from sbcl.
23:29:45
stassats
all allocations is then coming from a noninlined function called after mps_root_create_thread
23:29:48
drmeister
Ok, I am certain that I don't do any allocation before mps_root_create_thread is called.
23:33:37
stassats
since you can probably avoid allocating claspProcess, but you can't really avoid passing the function
23:35:49
drmeister
Ok, so (1) copy the code from sbcl to find the cold end of the stack. (2) move everything that allocates into a separate noinline function and call that from start_thread
23:37:11
drmeister
Just having a pointer on the stack to a thing pins it down. So I just need to make sure the thread has initialized before I leave the Process_O::enable()
23:37:44
drmeister
Right - option (2) is like what the MPS docs suggest and then I don't need to do (1).
23:39:01
drmeister
A mutex? Or a condition variable? I need some way of telling the parent thread that the child thread is ready to go.
23:40:00
drmeister
Besides getting this working - is it a reasonable idea to put the ThreadLocalState struct at the top of the stack like this?
23:40:26
drmeister
Or should I declare it as a collection of roots. I know I need to get the cold end of the stack correctly regardless.
23:41:25
drmeister
Is it a reasonable idea to use conservative GC to keep the ThreadLocalState alive.
0:26:27
drmeister
I went with option (2) and I declared: std::atomic<ProcessPhase> _Phase; and I use a spin lock in Process_O::enable() to check for when the process becomes Active.
0:32:47
drmeister
I will - I'm just going through them - I did some on the linux machine and some on my machine - I have to get everything resolved.
1:10:33
drmeister
Yeah - you need to register the "cold end of the stack" and all roots need to be above it.
1:11:25
drmeister
I'd forgotten that to do this properly you need to do it carefully, (1) get the address of a local variable on the stack, (2) pass it to a noinline function that does all allocation.
1:12:15
drmeister
I was declaring the local variable and allocating objects in the same function and the compiler rearranged them so that roots were below what it thought was the cold end of the stack.
1:12:54
drmeister
And so very important thread local data structures were being collected and everything broke.
1:16:35
stassats
a function call should be separating the point of registering stack roots and using them
1:16:59
drmeister
The function that set the stack_base/cold_end_of_the_stack also allocated the ThreadLocalState on the stack - the order is determined by the compiler and not the order I expected or wanted.
1:17:29
Bike
oh, so it reordered the stack frame so the stack wasn't where you thought it was, i really do see.
1:17:45
drmeister
Yes - and I did that for the main thread for years - but the multithreading is new and I forgot that rule.
1:18:52
drmeister
stassats: An atomic-incf function - it should take a symbol and operate on the symbol-value of that symbol - correct?
1:20:09
drmeister
A place - how do I do it on a place? That needs a macro and I do atomic things in C++.
1:21:15
Bike
you could have a macro that expands into whatever based on the place. like setf. i think sbcl does this.
1:22:10
drmeister
So we need a new macro facility - like setf. I can see the value of it. We should steal sbcl's