freenode/#clasp - IRC Chatlog
15:43:54
Bike
i have ext:source-location returning what it ought to, but if i try M-. i get "Containing expression ends prematurely"...
16:17:36
Bike
drmeister: doing peek-char seems to update the result of core:input-stream-source-pos-info. I am confused.
17:16:52
drmeister
Continuing the conversation from yesterday (because I need to tell the Ravenbrook people something).
17:17:25
drmeister
I can't figure out how we use core dumping to build Clasp. We don't GC code - so we can't dump the core.
17:22:50
drmeister
So we JIT code into particular memory locations and then dump that along with the memory? For cclasp - we dump the core at the end of loading inline.lisp rather than compile-filing everything.
17:24:08
drmeister
We have C++ memory that we would need to fix up when we reload that core - but this is a clear plan forward.
17:25:53
drmeister
The Ravenbrook folks do have a way of GC'ing JITted code - they developed it for their other commercial client.
17:34:56
drmeister
So - currently the way we build clasp - compile-filing everything and linking it together into a single fasl/executable has problems that hamper our ability to implement inlining.
17:35:31
drmeister
At startup, these fasl/executables evaluate each toplevel form one at a time and build the environment up that way. It's very modular - but the modularity brings problems.
17:36:25
drmeister
An alternative is to essentially load a full environment into memory and then "dump core".
17:36:51
drmeister
We need GC support for that. We can get it from Ravenbrook. We probably can't get it from Boehm.
17:37:32
Bike
i don't think it's really "modular", the whole problem with inlining is interdependencies
17:38:46
cracauermob
The only quick way here without original research is to switch the GC to collecting into one contiguous region from now on
17:39:56
drmeister
Well, the Ravenbrook folks could give us something more sophisticated - we could specify the pools and then they create a serialized version of those pools. At load time the pools could be reconstituted from the serialized version.
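A minimal sketch of the pool-serialization idea above. Everything here is invented for illustration (the header layout, `pool_kind`, `dump_pool`); it is not an actual MPS format, just one way each pool could be written out with enough metadata to reconstitute it at load time.

```c
/* Hypothetical on-disk layout for "serialize each pool, reconstitute at
   load": each pool is preceded by a small header recording which pool it
   is and how many bytes of pool data follow. Not an actual MPS format. */
#include <stdint.h>
#include <string.h>

typedef enum { POOL_CONS, POOL_LEAF, POOL_GENERAL } pool_kind;

typedef struct {
    uint32_t kind;   /* a pool_kind value */
    uint64_t size;   /* bytes of pool data that follow the header */
} pool_header;

/* Serialize one pool into buf: header followed by the raw pool bytes.
   Returns the number of bytes written. */
static size_t dump_pool(unsigned char *buf, pool_kind k,
                        const void *data, uint64_t size) {
    pool_header h = { (uint32_t)k, size };
    memcpy(buf, &h, sizeof h);
    memcpy(buf + sizeof h, data, (size_t)size);
    return sizeof h + (size_t)size;
}
```

The load side would walk the headers, hand each pool's bytes back to the collector, and let it rebuild the pool structure.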
17:40:53
drmeister
cracauermob: What you are proposing is straightforward from the point of view of writing the data out to disk.
17:41:36
cracauermob
Unless it is with the sole purpose of adjusting pointers so that you can mmap anywhere
17:42:54
cracauermob
After restart you can then GC again and split objects by pools as they move, if that is considered important.
17:43:35
cracauermob
Or that one mapped corefile has annotations to tell it which parts were which region
17:44:25
drmeister
The memory layout in Clasp is pretty rich. Cons cells are in their own pool, objects without internal pointers are in their own pool.
17:45:06
cracauermob
So on that gc-tosave-core you would write regions one by one, also dumping their metadata
17:47:08
drmeister
Yes - let's say we ask them to handle the serialization of their MPS memory structures to a single contiguous block of memory. The gc-tosave-core function would take all of the MPS memory in an Arena and turn it into a contiguous block of memory.
17:47:38
drmeister
Then at startup we mmap that block into memory somewhere and we call another MPS function that recreates the Arena.
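The dump/reload cycle described in the last two messages can be sketched in plain C. `save_image` and `load_image` are invented names; a real version would have MPS serialize the Arena and would still need pointer fixups after the mmap, which this sketch ignores.

```c
/* Hypothetical sketch of "serialize the Arena to one contiguous block,
   then mmap it back at startup". No pointer fixups are attempted here. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write a contiguous block of memory to a file. */
static int save_image(const char *path, const void *base, size_t len) {
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) return -1;
    ssize_t n = write(fd, base, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/* Map the saved block back into memory at an address of the kernel's
   choosing; reconstructing a live Arena from it is the hard part. */
static void *load_image(const char *path, size_t len) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}
```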
17:49:42
cracauermob
I mean the source code is still there, but what about closures that had specific scope?
17:49:44
drmeister
They solved this problem for their other commercial client. They keep track of relocation information in the code and they can move the code around.
17:51:21
Bike
well, a dynamic linker is what a core load is, yeah? sbcl has to do fixups and stuff too
17:52:28
cracauermob
Sbcl's image loading does no linking or fix-up because it maps at the same address
17:52:58
drmeister
JITting happens at the llvm Module level; there is a table of roots inside the module that the code in the module refers to with IP-relative addressing. If the whole module is moved around as a unit, what in the code needs to be relocated?
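The point about IP-relative references can be illustrated with offsets instead of machine code. In this invented sketch, a "module" refers to its roots table by an offset from the module base rather than by an absolute pointer, so moving the whole block leaves the reference valid; `module_t` and `module_roots` are made-up names, not Clasp or LLVM API.

```c
/* Offset-based (analogous to IP-relative) intra-module references survive
   moving the whole module; absolute pointers would not. */
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t roots_offset;       /* offset of roots table from module base */
    unsigned char payload[64];   /* stands in for code plus the roots table */
} module_t;

/* Resolve the roots table relative to wherever the module currently lives. */
static unsigned char *module_roots(module_t *m) {
    return (unsigned char *)m + m->roots_offset;
}
```

Copying a `module_t` elsewhere and calling `module_roots` on the copy yields the copy's own roots table, with no relocation step.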
17:56:36
drmeister
If Ravenbrook has a way to relocate llvm Modules - then Modules become just another llvm object that is part of the Arena.
18:02:42
drmeister
I don't think of it so much as doing relocations when we load a core - more like convert some blob of bytes into a working Arena. Relocations will need to be taken care of by the serialization/deserialization.
18:05:11
drmeister
cracauer: If we make the implementation as dumb as possible - we need to throw out the garbage collector.
18:05:52
drmeister
The point here is we have several experts on the garbage collector available and waiting for instructions on what we need.
18:06:16
cracauer
The way I would do it, with what I know, is to start a full-heap GC into the area I want to save.
18:07:28
drmeister
cracauer: That sounds like the serialization process I'm envisioning; I want to ask the Ravenbrook folks to figure out how much work it would take to implement.
18:08:06
cracauer
But the heap data is not changed in any way by the image saving/loading processes.
18:10:18
cracauer
Sorry I missed that SBCL implemented varying the base address. But that is a minor detail.
18:11:49
cracauer
I am very afraid of new code doing pointer adjustments. Debugging would be a nightmare.
18:12:47
drmeister
We need to incorporate whatever solution they developed for GC'ing llvm JITted code, and then ask them for an estimate of what it would take to serialize an Arena to a single contiguous block of memory and reconstruct the Arena from that block.
18:15:30
cracauer
I suppose you could, instead of doing a final GC into one contiguous area at one address, just dump things where they are and re-map them at the same addresses. But that means lots of opportunities for clashes.
18:15:58
cracauer
You can do that final GC into one VM region with existing GC code. So it would be safe and likely to work.
18:17:12
cracauer
stassats: I suppose the best way is to write the mechanism in a way that supports relocation, but happens to map at the same address at first for safety.
18:17:27
drmeister
The Ravenbrook folks want me to write this up in an email - I chafe at that because I don't really know what I'm asking for - or what is possible. I prefer the back and forth and immediacy of chats like this one.
18:17:32
cracauer
If you save regions individually you have more opportunity for VM mapping clashes.
18:18:34
drmeister
cracauer: It's hard to be specific because I don't know much about the underlying machinery in mps - I don't know what is hard and what is easy for them.
18:20:27
cracauer
Yeah, I just don't know about that. They have a GC spec, but that doesn't mention moving code, jit or otherwise.
18:22:44
cracauer
In fact, nothing forces us to pick final-gc versus original-address for any region.
18:23:21
drmeister
The Ravenbrook folks came up with another way. From the brief description they gave me they use a large code model and they fix up pointers.
18:24:00
cracauer
For any region, we can decide freely whether to re-map it at the original location or have it compacted into one VM region.
18:24:54
cracauer
Their spec says clearly that code might keep copies of pointers around that you don't know about.
18:27:03
cracauer
Whatever Ravenbrook does, or in fact what we do right now, might only work because of not-yet-implemented optimizations in LLVM.