freenode/#clasp - IRC Chatlog
3:51:56
drmeister
When I do this in aclasp: (mp:process-run-function nil #'(lambda () (catch 'x (eval '(throw 'x 10)))))
4:01:40
drmeister
"You may need to make sure that the debugger isn’t entered on barrier(1) hits (because the MPS uses barriers to protect parts of memory, and barrier hits are common and expected)."
4:14:18
drmeister
start_thread is called when the thread is started and I create thread local allocation points, register the thread and register the thread stack.
4:16:41
stassats
if it's a valid address, then it knows to protect the region, but not that it belongs to it
4:23:19
drmeister
x/8xg 0x3e800005860 --> 0x3e800005860: Cannot access memory at address 0x3e800005860
4:23:51
drmeister
MPS puts hardware barriers on memory - when the program touches that memory, the resulting fault signals to the MPS that the memory needs to be fixed.
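The barrier mechanism described above can be illustrated with a minimal sketch. This is not MPS code and makes no claim about how MPS is implemented. It assumes Linux, where touching a PROT_NONE page raises SIGSEGV, and all names (barrier_page, on_fault, barrier_demo) are hypothetical. A page is protected with mprotect, the handler checks whether si_addr falls inside that page, and if so it lifts the protection so the faulting store can be retried.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names; one protected page stands in for collector-managed memory. */
static char *barrier_page;
static size_t barrier_size;
static volatile sig_atomic_t barrier_hits;

/* Handle a fault only if the address lies inside the protected page. */
static void on_fault(int sig, siginfo_t *info, void *context) {
    (void)sig;
    (void)context;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)barrier_page;
    if (addr >= base && addr < base + barrier_size) {
        barrier_hits++;
        /* Lift the barrier; returning restarts the faulting instruction.
           mprotect is not on the POSIX async-signal-safe list, but this
           pattern is commonly relied on under Linux. */
        mprotect(barrier_page, barrier_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Not our memory: restore the default action so the retried access
       terminates the process as a genuine segmentation fault. */
    signal(SIGSEGV, SIG_DFL);
}

/* Returns the number of barrier hits observed, or -1 on setup failure. */
int barrier_demo(void) {
    barrier_size = (size_t)sysconf(_SC_PAGESIZE);
    barrier_page = mmap(NULL, barrier_size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (barrier_page == MAP_FAILED) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;

    /* Raise the barrier, then touch the page through a volatile pointer
       so the compiler cannot reorder or drop the store. */
    if (mprotect(barrier_page, barrier_size, PROT_NONE) != 0) return -1;
    volatile char *p = barrier_page;
    p[0] = 42; /* faults once, handler unprotects, store is retried */
    assert(p[0] == 42);
    return (int)barrier_hits;
}
```

Calling barrier_demo() from main should report a single hit on Linux. Under gdb, a fault of this kind is what "handle SIGSEGV pass nostop noprint" is usually used to silence.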
4:27:52
drmeister
I don't see that in any of the registers in the frame above the <signal handler called> frame #2 -- although maybe I shouldn't expect to.
4:28:38
drmeister
Does info registers give you the values of registers as they were in the frame that I'm currently looking at - or only the current values of the registers in the top frame?
4:34:27
drmeister
In my morning (7 hours from now) one of my friends at Ravenbrook will be up - I'll ask them about this. They might have some advice. The problem is easy to reproduce.
4:38:47
drmeister
There is some stuff in here that I'm not sure if we are doing (or not doing). frgo set up a lot of signal handling code for clasp.
4:46:54
drmeister
It is a tagged pointer - how does it end up in $_siginfo._sifields._sigfault.si_addr and where is $_siginfo._sifields._sigfault.si_addr located in memory? Is it in kernel space (I'm guessing)?
4:50:01
drmeister
MPS has this function: mps_bool_t mps_arena_has_addr(mps_arena_t arena, mps_addr_t addr)
4:55:05
drmeister
Unless you have some further insight I was going to ask my friend at Ravenbrook about it in the morning.
4:55:44
drmeister
There's clearly something going on in Linux when we do non-trivial things in a child thread.
4:56:43
drmeister
I have one thread-local pointer that points to a data structure at the top of each thread's stack.
4:57:18
drmeister
With MPS I have a second thread-local structure that contains half a dozen MPS allocation points.
4:57:40
drmeister
The first thread-local data structure stored at the top of each stack is described here:
4:57:41
stassats
looking at the actual faulting instruction and the memory it uses, it doesn't look like 0x3e800005860
4:57:56
drmeister
https://github.com/drmeister/clasp/blob/dev/include/clasp/gctools/threadlocal.h#L7
5:02:48
drmeister
You could use nm to check if the symbol for the function you are editing is in there.
5:03:12
drmeister
The mygc.c.4.o is a bitcode file. You could also llvm-dis it and look at the human-readable .ll file.
5:16:54
drmeister
I wasn't clear. ./waf build_imps will rebuild iclasp-mps - you could run that. Alternatively you can use ./waf build_cmps and that will relink cclasp-mps - but that takes longer because it recompiles some Common Lisp and relinks everything with everything to make cclasp-mps.
5:18:18
drmeister
It shouldn't be different - but it may be - I don't understand this error and I don't know if iclasp-boehm will reproduce the same error as cclasp-boehm.
5:19:20
stassats
Add support for unknown (immediate?) object to lisp_instance_class obj = 0xffffffffffffffff
5:20:31
drmeister
My knee-jerk reaction is to turn on the guards and rebuild and see if we can track it down then. It's a lot more fun tracking down GC problems with the guards on.
5:22:54
drmeister
CONFIG_VAR_COOL turns on assertions in MPS and the other three cause clasp to check objects for validity. They don't slow things down too much.
5:27:20
stassats
sigHandle is being hit multiple times with 0x3e800004132, refuses to handle it and ultimately is hit with 0
12:25:46
frgo
::notify drmeister Re MPS and signals: I think we need to change behavior in file src/gctools/interrupt.cc: ADD_SIGNAL( SIGSEGV, "+SIGSEGV+", ext::_sym_segmentation_violation); - Am I right that this establishes a handler represented by the symbol ext::_sym_segmentation_violation? If so: when using MPS on Linux, we're not allowed to do that. If not: I'd like to know what this line actually does.
13:05:04
frgo
Because MPS on Linux relies on SIGSEGV not being handled by anyone else. It uses SIGSEGV to manage memory.
13:06:44
frgo
Yes, it does. But if you install another handler, then this leads to MPS being prevented from doing its job.
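The conflict frgo describes, a later handler displacing an earlier one, can be sketched in plain POSIX C. This is an illustration under assumptions, not Clasp's or MPS's code: first_handler merely stands in for a handler installed earlier (for example by a collector), chaining_handler stands in for one installed later, and SIGUSR1 is used instead of SIGSEGV so the demo can be triggered safely with raise. The later handler saves the previous sigaction and forwards to it rather than swallowing the signal. Whether forwarding of this kind is sufficient for MPS is not established here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical and exist only for this illustration. */
static struct sigaction previous_action;
static volatile sig_atomic_t first_handler_calls;
static volatile sig_atomic_t chaining_handler_calls;

/* Stand-in for a handler that was installed first. */
static void first_handler(int sig, siginfo_t *info, void *context) {
    (void)sig; (void)info; (void)context;
    first_handler_calls++;
}

/* Installed second: does its own work, then forwards to the saved handler. */
static void chaining_handler(int sig, siginfo_t *info, void *context) {
    chaining_handler_calls++;
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL)
            previous_action.sa_sigaction(sig, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    }
}

/* Install fn for signo; if old is non-NULL, the displaced action is saved there. */
static int install_handler(int signo,
                           void (*fn)(int, siginfo_t *, void *),
                           struct sigaction *old) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fn;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, old);
}

/* Returns first_handler_calls * 10 + chaining_handler_calls, or -1 on failure. */
int chain_demo(void) {
    if (install_handler(SIGUSR1, first_handler, NULL) != 0) return -1;
    if (install_handler(SIGUSR1, chaining_handler, &previous_action) != 0) return -1;
    if (raise(SIGUSR1) != 0) return -1; /* handler runs before raise returns */
    return (int)first_handler_calls * 10 + (int)chaining_handler_calls;
}
```

If the second install passed NULL instead of &previous_action and never forwarded, first_handler would no longer run at all, which is the failure mode being discussed for SIGSEGV.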
13:50:57
Colleen
drmeister: frgo said 1 hour, 25 minutes ago: Re MPS and signals: I think we need to change behavior in file src/gctools/interrupt.cc: ADD_SIGNAL( SIGSEGV, "+SIGSEGV+", ext::_sym_segmentation_violation); - Am I right that this establishes a handler represented by the symbol ext::_sym_segmentation_violation? If so: when using MPS on Linux, we're not allowed to do that. If not: I'd like to know what this line actually does.
13:54:36
drmeister
My friend at Ravenbrook got back to me and wants to see a backtrace. The machine I'm using is in the Amazon Cloud - so I can give him access as well.
13:58:48
drmeister
It doesn't reproduce the problem with the cases I tried last night - it works on simple cases.
13:59:57
drmeister
Nope - it does - I was using cclasp - it behaves differently. In aclasp it fails like it did last night.
14:06:28
drmeister
I can reproduce the problem and I passed it on to David along with stassats' observation.
14:12:41
frgo
Error >>>>>>>> In file included from /opt/common-lisp/lang/clasp/src/clasp/src/gctools/interrupt.cc:2:
14:12:42
frgo
In file included from /opt/common-lisp/lang/clasp/src/externals-clasp/llvm50/include/llvm/Support/ErrorHandling.h:18:
14:13:31
drmeister
I just realized something - I can create Amazon Cloud machines with Clasp running and give people access to them. Great for debugging.
14:14:01
drmeister
frgo: That was the same problem that you had last night - this is with the new externals-clasp build?
14:16:03
frgo
LLVM_CONFIG_BINARY = "/opt/common-lisp/lang/clasp/src/externals-clasp/llvm50/build-release/bin/llvm-config"
14:17:39
drmeister
Yes - that is all fine - you can remove the EXTERNALS_CLASP_DIR line - that's not used anymore.
14:19:20
drmeister
This is the contents of the llvm/Config directory that I think your system wants:
14:19:49
drmeister
The peculiar thing is that I don't have an llvm-config.h file and I don't see the problem that you do.
14:21:14
frgo
AsmParsers.def.in AsmPrinters.def.in Disassemblers.def.in Targets.def.in abi-breaking.h.cmake config.h.cmake llvm-config.h.cmake
14:22:59
frgo
As soon as you actually build LLVM there is an llvm-config.h there... - in build-release/include/llvm/Config/llvm-config.h
14:24:43
drmeister
That's my /externals-clasp/llvm50/build-release/include/llvm/Config/ directory - and yes there is an llvm-config.h
14:25:24
frgo
No - it's there: AsmParsers.def AsmPrinters.def Disassemblers.def Targets.def abi-breaking.h config.h llvm-config.h
14:26:07
frgo
It's just that the directory ".../externals-clasp/llvm50/include" is not set as an include dir by wscript.
14:29:21
Bike
it seems that the include has to be built. the source only has whatever kind of pre file.
15:32:15
drmeister
I see problems when I try to create >50 threads on OS X and then there are the problems that we ran into on Linux.
15:39:56
drmeister
frgo: I can give you access to the Linux machine that has Clasp built and exhibits the problem - would that help?