freenode/#clasp - IRC Chatlog
14:38:21
drmeister
I forgot to upgrade to BigSur on my macmini - so I'll bring in an extra power supply and do it at work.
14:38:51
drmeister
Lang Hames is working on the exception handling support for JITLink on llvm12 for linux.
14:45:45
drmeister
Then there is a pass in llvm that converts address space(1) annotations into stackmap entries. Badda bing.
14:47:55
drmeister
This means that our tagged_ptr and smart_ptr - which we use everywhere to indicate GC managed pointers - can be annotated to contain an address_space(1) pointer.
14:48:43
drmeister
There is a hitch. I haven't been completely dogmatic about using smart_ptr/tagged_ptr to indicate GC managed pointers.
14:49:28
drmeister
There are also internal pointers for things like strings. There has been some leakage into regular pointers.
14:50:21
Bike
i think i've mentioned this before, but i really don't like the array operators we have that let C++ get a regular pointer to the data
14:50:54
drmeister
I'm looking forward to incorporating MMTk where we can take more control of the stack scanning.
14:51:57
drmeister
We might be able to do something like precise GC from the stack - but pin objects that are on the stack.
14:52:58
drmeister
Rely on precise pointers to keep things alive and pinned from the stack. It's going to be a while to get there.
14:53:15
Bike
yeah, i'm not totally clear on what the statepoints stuff is doing with the addrspace markings
14:54:16
drmeister
The statepoints are independent. We define a @gc.safepoint_poll() function and provide the code for inlining in llvm.
16:09:56
drmeister
We can provide the @gc.safepoint_poll() code (it needs to be tight) and then run the pass to add safe-points to the llvm-IR. Then we can take a look at how they are distributed. Once MMTk 0.3 comes out in February (or whenever, no hurry) we can work through the tutorial and implement bindings for clasp that work with MMTk. I'm hoping that in the coming months to a year MMTk will grow in power as it gets a port of immix
16:09:56
drmeister
and we can transition to it fully. We have maintained a discipline of manipulating GC managed objects using a C++ template class that wraps a tagged pointer. We can annotate that tagged pointer with __attribute__((address_space (1) )). Then we can run the pass that converts those pointers to GC stackmap entries. With this we may be able to do better than fully conservative stack scanning. I'm thinking MAYBE we can rely
16:09:56
drmeister
on precise pointers (in stackmaps) from the stack to keep objects alive and pin them and use exclusively precise pointers on the heap.
16:12:00
drmeister
The thought about doing better than fully conservative stack scanning may be naive. We may not have been disciplined enough with smart_ptr::raw_() and internal pointers to clasp objects.
16:12:50
drmeister
From my reading and what people have told me, internal pointers are the big bugaboo - it's expensive to maintain and search the data structures that convert internal pointers to base pointers.
16:13:33
drmeister
If you can avoid allowing internal pointers to exclusively keep an object alive and pinned - that's a win.
16:14:01
Bike
for that purpose, do "internal pointers" include tagged pointers, or just actual internal pointers (e.g. to the middle of a vector)? because it seems like tag bits can just be masked off
16:14:10
drmeister
I'm thinking if we have a precise pointer on the stack in a stackmap that keeps the object alive and pinned - then we can have any number of additional internal pointers to address things inside of it.
16:15:02
drmeister
I'm thinking the hard kind of internal pointers - the ones that point to the 1029th entry of a 2000 element vector.
16:15:27
drmeister
Tagged pointers are an easy kind of internal pointer - you just mask off the tag and you have the base pointer.
16:16:16
Bike
i mean like i mentioned we do just get unmanaged pointers into the middle of managed arrays sometimes, which seems bad, but that's in C++ code
16:17:18
drmeister
But I don't think I let them propagate far in calls and I think I worked hard not to let them escape. They definitely aren't stored in object slots. I hope all of this is true. We are going to have to do some analysis of the code.
16:17:20
Bike
when you told me about this before you mentioned stuff like CDR being a problem, but that's a pretty transient pointer
16:18:06
drmeister
In C++ code. In lisp code we can add new discipline by changing the code. You wrote the inlined array/vector code - we could fix it so that we keep a base pointer around - right?
16:18:51
Bike
i don't see why not. i guess as far as that goes i'd be more worried about llvm removing a pointer because it's basically dead, but i guess all this statepoint stuff should have ways of dealing with that too.
16:20:13
drmeister
C/C++ guarantee that a reference won't be optimized away if you take the address of it. The address_space(1) attribute might give us a way to automatically force llvm to NOT optimize things away.
16:21:07
Bike
let's see, refreshing my memory here, for cdr we do basically (load (gep cons-pointer 5)), so between the gep completing and the load completing we have an extant pointer into the cons, where cons-pointer could be optimized away if it's not used thereafter
16:21:59
drmeister
With llvm the way to deal with this stuff is to add unit tests to llvm that test for certain behavior - like not optimizing away references to things. With that we can take these mechanisms that may or may not have been developed for this purpose and future proof them.
16:22:03
Bike
but that seems like the kind of access that's very common in pretty much any system, garbage collected or not
16:23:27
Bike
i'm not sure the generated code really keeps interior pointers around for any substantial length of time ever. we don't do the kind of optimizations that would make that happen, i don't think
16:23:58
drmeister
Right - but the CDR function is passed a CONS cell. If we annotate the CONS cell pointer that it cannot be optimized away while the CDR function is being evaluated then the CDR pointer won't be left with the responsibility of keeping the CONS cell alive. Does that make sense?
16:24:44
Bike
yeah, i mean, i get it. i think that hopefully in this particular case the load is so trivial no actual keeping alive is necessary, though
16:25:53
Bike
i'm thinking all the generated code might only do trivial accesses like that. we don't, like, keep pointers to subarrays within loops, or something, like optimized C code might
16:26:30
drmeister
For context - my question was not "you aren't getting what I'm saying - so does this new explanation make sense to you?" It was more like "this thing that I just said - you and I get it - but if I explained it to someone else, would it capture the idea?"
16:27:23
Bike
i'm kind of thinking out loud about what might or might not need changing in our compiler is all
16:28:00
Bike
flipping through translate.lisp to see if we do anything dicey... would stuff get weird with llvm.frameaddress? i guess it's just a pointer into the stack rather than the heap
16:28:21
drmeister
Yeah - that's a good thing to think about - because I'd like to take control of memory management in Clasp - and MMTk looks like the way to do that.
16:29:29
Bike
though i guess the object also doesn't go on the heap. except maybe the C++ runtime would put it on the heap? but not gc-managed heap
16:29:49
drmeister
Stackmaps would treat each stack frame like an instance of an object in memory. We would know the offsets of GC managed pointers in each stackframe. We can generate them from Clasp - and now it looks like we can generate them from C++ code as well.
16:30:52
Bike
right now we're actually leaving svref and aref and stuff as function calls, so no generated code problems there i guess
16:31:15
drmeister
So we will have the heap full of objects that contain GC pointers at certain offsets and a stack full of objects that contain GC pointers at certain offsets.
16:32:47
Bike
oh, this is a stupid question, but the whole idea with yieldpoints is that the GC will only run at yieldpoints we define, right? not anywhere
16:32:47
drmeister
If we can rely on just the GC managed pointers that we know about to keep everything alive and we pin the objects that are pointed to by the GC managed pointers on the stack - then I think we are doing better than boehm or MPS right now.
16:33:04
Bike
seems like it would be bad if the gc started and we're in the middle of __cxa_throw or something and god knows what's in any register
16:34:55
drmeister
We could NOT put them in C++ code at all in general and add them only at specific points - like MAP
16:35:54
drmeister
If we use patchpoints we can know where they all are and add and replace them with NOP dynamically.
16:46:09
cracauer
No yield points at all in C++ code can backfire. It could delay GC while stalling all other threads.
16:47:41
cracauer
I think internal pointers into cons cells are the only easy ones. If cons cells live in a separate space. You can always find the beginning of the object via alignment.
16:56:11
drmeister
I think that's the data structure that the boehm GC uses to identify base pointers.
16:58:36
drmeister
I would be tempted to say "what - that old thing?" but boehmgc is surprisingly efficient.
16:59:22
drmeister
What I'd like to move to is where we implement the stack scanner so we can run it in a "safe" mode and a "precise" mode.
17:01:04
drmeister
Safe mode would scan conservatively and use a data structure like what is above and look for conservative pointers that point to objects that aren't backed by a precise pointer in the same stack frame.
17:03:52
drmeister
It would just be C++ code - we can make all Common Lisp code safe by changing the generated code.
17:04:44
Bike
i've been reading a bunch of papers about memory models in the last few days and apparently hans boehm is pretty involved in the C++ standardization process about it. surprised me a little.
17:05:32
Bike
if we didn't have any yield points in C++, then besides what martin said, couldn't we actually run out of memory if they allocate lisp objects? because there'd be no gc running
17:06:09
drmeister
Yes. Functions that run for a long time and do lots of allocation with no yieldpoints would blow up.
17:07:21
drmeister
Oh wait - if you call a function with yield points then the callee needs to have its stack frame be consistent.
23:33:14
drmeister
::notify frgo Could you describe the crash that you experience building quicklisp with clasp on BigSur? I cleared my quicklisp cache and loaded iclasp-boehm and used (load "~/quicklisp/setup.lisp") and it worked.
23:37:28
drmeister
::notify frgo Could you give me the particular quicklisp systems that you were trying to build?
23:51:31
drmeister
Ironclad with alexandria and bordeaux-threads takes 212 seconds - this is on a macbook air 1.1 GHz Quad-Core Intel Core i5
23:54:29
drmeister
::notify frgo Don't use cclasp-boehm - use iclasp-boehm. cclasp-boehm won't work for a few more months until we upgrade to llvm12. Otherwise I don't see what you are doing that would crash. Ping me when you get this and I'll dig deeper.
0:52:06
kpoeck
drmeister frgo was calling cclasp-boehm and still compiling quicklisp, see https://gist.github.com/dg1sbg/afb4c951bcdd1a8e1e3eaeb9ec549450
1:19:11
drmeister
Ah - my memory is all jumbled up. cclasp-boehm is a symbolic link to iclasp-boehm
1:20:54
drmeister
I think there must be something different about frgo's environment relative to mine that is breaking his clasp. Mine just built and builds quicklisp and ironclad with no problems.
1:21:38
drmeister
It used to be (and will be again) that cclasp-boehm links together all of the Common Lisp and C++ code into one executable.
1:22:14
drmeister
iclasp-boehm is just the C++ code and it loads an image file that is a big faso file.