libera/#sicl - IRC Chatlog
3:00:23
beach
nij-: The plan is for the initial executable to contain everything. There is no particular reason to load a small executable.
3:02:41
nij-
Does SICL (the builder) load Constrictor into the image that will produce the executable?
3:04:26
beach
It is more complicated than that. The main difference between SICL and other Common Lisp implementations is that we can execute target code during bootstrapping.
3:05:15
nij-
Does that mean one can interact with that baby (still in the image) before it's serialized to x86 code?
3:06:54
beach
So in phase 1 of bootstrapping, we use host generic functions and host classes to build CLOS. Then, we use those to build CLOS again in phase 2, but this time with something I call bridge generic functions and bridge classes. Then we use those to build CLOS again in phase 3, this time with something I call ersatz generic functions and ersatz classes.
3:07:09
nij-
Basically, you have a running CL in another CL. Later, you serialize that running CL..
3:07:34
beach
Those ersatz objects have a structure that is isomorphic to the target objects, so we can generate an executable from them.
3:08:51
beach
There will be an initial Clostrum environment that contains everything to put in the executable.
3:10:28
nij-
small CL script => tiny executable that only does what that script does, but doesn't contain things that it doesn't use.
3:12:24
beach
I don't know. Like I said, I have enough other things to think about, so I don't worry about that use case.
3:14:24
beach
The SICL bootstrapping procedure needs a host that is a conforming Common Lisp implementation that also has the CLOSER-MOP library.
3:17:35
beach
The ersatz functions are executable in the host, but they also contain executable code created by the SICL compiler, and that the host is not concerned about.
3:19:42
beach
The difference with SICL is the first part, i.e., that those ersatz functions are executable in the host.
3:22:30
beach
That makes no difference. There will be a function executing in the host that creates a SICL executable. That function won't kill the host automatically.
3:23:17
beach
An ersatz function contains (or will contain) a vector of bytes that is x86 code, or some other binary code.
3:24:43
beach
Nope, the ersatz functions are host executable with the host mechanism. It does not touch the x86 code.
3:26:34
nij-
In the beginning, it is a minimal core. And when it loads, say, Constrictor, some mechanism has to compile constrictor into a format that can be "added on" that minimal core?
3:27:10
beach
In the beginning, there is the full host Common Lisp implementation and the CLOSER-MOP library.
3:27:58
beach
In the beginning, the SICL compiler runs as a host function, operating on a first-class global environment.
3:29:12
beach
The SICL compiler takes a first-class global environment as one of its arguments, and that's where it looks up definitions of functions, classes etc. and that's where it stores the objects it creates.
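A minimal Python sketch of the idea in the message above, under loud assumptions: `Environment` and `compile_definition` are hypothetical names invented for illustration, not the SICL or Clostrum API, and the "compiler" is a toy. It only shows that when the environment is an explicit argument, definitions are looked up and stored there, so nothing in the host's own global namespace is touched.

```python
# Hypothetical illustration only: these names are not SICL or Clostrum API.
class Environment:
    """A first-class global environment: an ordinary object holding bindings."""

    def __init__(self, name):
        self.name = name
        self.functions = {}

    def fdefinition(self, fname):
        return self.functions[fname]

    def set_fdefinition(self, fname, function):
        self.functions[fname] = function


def compile_definition(fname, body, env):
    """Toy 'compiler'. BODY receives a lookup function plus the call arguments.

    Function names are resolved in ENV at call time, and the result is
    stored in ENV, never in the host (Python) global namespace.
    """
    def compiled(*args):
        return body(env.fdefinition, *args)

    env.set_fdefinition(fname, compiled)
    return compiled


e1 = Environment("E1")
e2 = Environment("E2")
e1.set_fdefinition("length", len)  # borrow a host function
e2.set_fdefinition("length", len)

for env in (e1, e2):
    compile_definition(
        "double-length",
        lambda lookup, seq: 2 * lookup("length")(seq),
        env,
    )

# Redefining "length" in E2 affects only code that lives in E2.
e2.set_fdefinition("length", lambda seq: -1)

result_e1 = e1.fdefinition("double-length")([1, 2, 3])
result_e2 = e2.fdefinition("double-length")([1, 2, 3])
```

Here `result_e1` still uses the borrowed host `len`, while `result_e2` sees the redefinition, and the host's own `len` is unchanged throughout.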
3:30:19
beach
So in the beginning (phase 1), ENSURE-CLASS and ENSURE-GENERIC-FUNCTION are defined to create host objects and store them in the first Clostrum environment.
3:31:09
beach
And the first Clostrum environment contains loads of host functions that we use initially, such as the CONS functions, the SEQUENCE functions, and more.
3:32:13
beach
Constrictor is not needed until at the end, because we can use host functions that do the same thing.
3:33:26
beach
The host-executable code created at bootstrapping time is not necessarily very fast, so it is faster to use host functions when possible during bootstrapping.
3:34:43
beach
The SICL-specific aspect here is that we always have a (nearly) complete Common Lisp system into which we can load arbitrary code.
3:35:09
beach
Initially, that system contains lots of host code, but in the end, it contains only SICL code.
3:36:33
beach
So the reason I was so happy about ctype yesterday was that ctype requires a very complete Common Lisp system, and I had to make that happen in order to load ctype.
3:37:09
beach
But the same thing can be said about other systems we need to load, so the work had to be done anyway at some point.
3:40:43
nij-
I know it's not of your interest yet. But imagine if the last executable has to be in other form, would it be very difficult (once the x86 version sicl is done)?
3:42:16
beach
That's what I want the backend library for (that younder might be working on) so that this code doesn't have to be in SICL.
3:43:32
beach
It would have to be executable I think, since what is generated is an ELF executable file.
3:45:05
beach
The instructions in the ELF files would have to be native instructions I think, unless there is a mechanism that I am unaware of.
3:47:48
nij-
Lemme read the bootstrap paper again. I think I can get much more from that now.. and I can ask slightly better questions.)
3:48:04
beach
Before the very end of bootstrapping there are still no files. There is just a graph of objects that is isomorphic to the one that will be put in the executable file. But the vector of bytes in the ersatz functions I think must be native code.
3:51:03
beach
An ersatz function is a host instance of host FUNCALLABLE-STANDARD-CLASS so it is executable in the host. But that instance is otherwise just a HEADER object as required by SICL. The HEADER object contains a RACK object that contains all the SICL-required stuff.
3:52:08
beach
The end of the SICL bootstrapping procedure does not look at the executable code in the header, and is only concerned with its structure and the contents of the RACK.
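A rough Python analogy for the two messages above, with every name (`Rack`, `Header`, `ErsatzFunction`, `dump`, the slot names) made up for illustration and not taken from SICL. The object is callable through the host's own mechanism, while the bytes stored in its rack are only carried along as data; the toy `dump` step looks at the structure and the rack contents and never at the host-callable part.

```python
# Hypothetical analogy: class and slot names are illustrative, not SICL's.
class Rack:
    """Holds the target-required contents of an object."""

    def __init__(self, **slots):
        self.slots = dict(slots)


class Header:
    """A target object: a class reference plus a rack."""

    def __init__(self, cls, rack):
        self.cls = cls
        self.rack = rack


class ErsatzFunction(Header):
    """A header that the host can also call, via a host-level closure."""

    def __init__(self, cls, rack, host_callable):
        super().__init__(cls, rack)
        self._host_callable = host_callable

    def __call__(self, *args):
        # Host execution path: the native bytes in the rack are not touched.
        return self._host_callable(*args)


def dump(obj):
    """Toy stand-in for the final step: structure and rack contents only."""
    return (obj.cls, sorted(obj.rack.slots.items()))


# Placeholder bytes standing in for target native code; never executed here.
native_code = bytes([0x48, 0x89, 0xF8, 0xC3])

add1 = ErsatzFunction(
    "function",
    Rack(name="1+", code=native_code),
    lambda x: x + 1,
)

called = add1(41)
dumped = dump(add1)
```

In this sketch `called` comes purely from the host closure, and `dumped` contains the class marker and the rack slots, including the byte vector, but nothing about the closure.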
3:57:20
beach
nij-: You might be the first person to take a serious interest in the SICL bootstrapping procedure. :)
3:57:56
beach
I mean, some people, like bike, probably understand the underlying principles, but I doubt that even bike has kept up with the details.
4:00:51
nij-
But yeah.. I'm still not convinced (?) that this is going to work. That's why I want to read the paper.
4:02:06
beach
As I often put it, the graph of objects (classes, generic functions, methods) defined by the AMOP could be created "manually", but it would be tedious, error prone, and unmaintainable. The best way to create this graph, is by executing CLOS code, like DEFGENERIC, DEFCLASS, and DEFMETHOD forms, so that's what I had to make possible in order to create CLOS first during bootstrapping.
4:03:09
beach
And I had to create CLOS first so that I could load arbitrary Common Lisp code after that, because such code would be much nicer if it can use CLOS.
4:04:49
beach
In particular, we can now extract lots of code that was previously SICL specific into libraries that can be maintained separately.
4:05:46
beach
And even that... We have already extracted the Common Boot library that makes it possible to execute host code during bootstrapping.
4:06:52
beach
The desired outcome is to make it much easier to create a new Common Lisp implementation, and SICL will be the first one to benefit.
4:17:00
nij-
And if that's the case, that ruins the point of having the thing being executable in the host, right?
4:17:30
beach
Indeed. But in both cases, the source code is translated to an AST, and the host code is a very simple translation of that AST to simple host code.
4:19:03
beach
If the translation to host code is buggy, the more likely outcome is that the problem will show up during bootstrapping. This has already happened a few times.
4:19:35
nij-
Oh, and also, if (2) doesn't have to be dumped, and all (2)'s point is to make a proxy of (1) before dump, why don't we just use ordinary CL code? Why bother compiling it into CL-subset?
4:20:18
beach
It has no name. It is located in the Common Boot library. First an AST is created, then there is a generic function named CPS that translates the AST into CPS code using a trampoline.
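To make "CPS code using a trampoline" concrete, here is a small Python sketch. It is not the Common Boot implementation: the tuple-based AST, the `Bounce` class, and the `cps`, `trampoline`, and `run` functions are all assumptions made for this example, and variables and functions share one environment dictionary for brevity. Each AST node is translated once into a host closure that takes an environment and a continuation; every step returns a `Bounce` instead of calling the continuation directly, and the trampoline loop runs the bounces one at a time.

```python
import operator


class Bounce:
    """A suspended step; the trampoline runs it later."""

    def __init__(self, thunk):
        self.thunk = thunk


def trampoline(result):
    # Keep running suspended steps until a plain value comes out.
    while isinstance(result, Bounce):
        result = result.thunk()
    return result


def cps(ast):
    """Translate a toy AST into a host closure code(env, k).

    AST forms (hypothetical): an int constant, a str variable name,
    ("if", test, then, else), or ("call", fname, arg...).
    """
    if isinstance(ast, int):
        return lambda env, k: Bounce(lambda: k(ast))
    if isinstance(ast, str):
        return lambda env, k: Bounce(lambda: k(env[ast]))
    op = ast[0]
    if op == "if":
        test, then, other = (cps(a) for a in ast[1:])
        return lambda env, k: test(
            env, lambda v: (then if v else other)(env, k)
        )
    if op == "call":
        fname = ast[1]
        arg_codes = [cps(a) for a in ast[2:]]

        def code(env, k):
            def step(i, done):
                if i == len(arg_codes):
                    # All arguments evaluated: apply, then continue with k.
                    return Bounce(lambda: k(env[fname](*done)))
                return arg_codes[i](env, lambda v: step(i + 1, done + [v]))

            return step(0, [])

        return code
    raise ValueError(f"unknown AST node: {ast!r}")


def run(ast, env):
    return trampoline(cps(ast)(env, lambda v: v))


env = {"x": 41, "<": operator.lt, "+": operator.add}
ast = ("if", ("call", "<", "x", 100), ("call", "+", "x", 1), 0)
result = run(ast, env)
```

Names are looked up only in the `env` dictionary passed to `run`, which is how this kind of translation avoids consulting the host's global definitions.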
4:22:27
nij-
Why do we risk compiling it into CL-subset? It can be done, but if there's no point, then that's a pure risk.
4:23:11
beach
I am not expressing myself very well. It uses all the Common Lisp features that it needs, so there is no particular effort in making it a subset.
4:26:36
beach
From the AST, 2 is obtained by the CPS translator. 1 is obtained by turning the AST into intermediate representation and then to native code.
4:26:49
nij-
Code (e.g. Constrictor) -> AST -> (1: almost x86 code, but vector) + (2: CL-subset code)
4:28:23
beach
No it can't. We don't assume any particular Common Lisp implementation in the host. It may be running in ARM, or it may be an interpreter.
4:28:54
beach
And even if it is an x86 Common Lisp implementation, the native code won't be the same in the host and the target.
4:30:01
nij-
Why can't the host execute that CL code by itself? And instead why do we have to translate that CL code to AST and to CL-subset?
4:30:37
beach
But if it is executed directly by the host, then the host compiler will consult the host global environment, and it will define things in the host global environment, thereby clobbering host functions.
4:31:55
beach
We could do that by using a source translation of the code, but that requires the same analysis as the first pass of a compiler.
4:32:22
beach
So we might as well use the first pass of the SICL compiler, which analyses that code and turns it into an AST.
4:32:55
nij-
I see. The host is an *ordinary* CL implementation. It doesn't know how to talk to the 1st class global env.
4:34:44
nij-
Why would it be useful to be able to execute the end result in the host before dump?
4:40:24
beach
The compiler is not a particularly complicated Common Lisp program, in terms of the resources it needs.
4:40:34
nij-
In phase 1, the compiler is an ordinary function in the host, and of course it can use the host CLOS.
4:43:21
beach
SBCL and other Common Lisp implementations have a problem in that they load CLOS last.
4:43:50
beach
This is for historical reasons. These implementations were created before the standard was written so before CLOS was part of the language.
4:47:40
beach
The SBCL compiler is written in portable Common Lisp that does not use CLOS, whether it executes in the host or in the target.
4:49:51
beach
But that's not enough to make it execute code in the host. You would also need first-class global environments, and the SBCL compiler is not written that way.
4:51:01
nij-
In SICL, the (cross?) compiler (before the end), as a host function, after phase one, always uses the 1st class global env?
4:53:30
beach
Well, I don't buy the logic: 1. Don't make CLOS faster by using my generic dispatch technique. 2. As a result, don't improve the compiler by using CLOS. 3. Suffer.
10:30:31
nij-
2. How do we effectively ensure that there's no difference between ersatz(1) and ersatz(2)?
10:30:31
nij-
3. Why can't ersatz(1) contain LLVM IR (or other IR) in general? After all in order to test them in the host, we only have to execute stuff in ersatz(2).
10:30:33
nij-
4. How will the end product (SICL-produced x86 executable) do garbage management? Do we also have a module that does that?
10:30:36
nij-
5. If we want to add support (e.g. OS threads, weak pointers, local packages, etc.) to the final executable, would it be possible without writing C bindings?
10:30:39
nij-
6. How many phases are there (E0~E5 in the paper?) in total, and what are the goals and (roughly all) steps in each phase?
10:32:39
beach
1. The host may cease to work if polluted with target code. Like the host function ENSURE-GENERIC-FUNCTION creates a generic function in the host global environment. But that's not what the target function does.
10:35:04
beach
5. Anything is possible without C bindings. You can do it either directly in machine code, or in some kind of assembly, like Cluster input.
10:35:41
beach
6. I don't know how many phases there are, and it is kind of arbitrary how many you add beyond 4.
10:36:27
beach
7. Right now for the new bootstrapping procedure? There is no longer an E0, so it starts with E1, and I am currently working in E2.
10:43:51
nij-
5.1. Can we do it in common lisp, and can we observe their effects in the host (before dump)?
10:46:03
beach
Sure, it can be done in Common Lisp since we can always create Cluster instructions from Common Lisp. But we won't be able to execute target native code in the host. Again, that would require an x86 emulator, and the host probably would not allow it.
10:47:48
beach
nij-: I am off for a lunch break. Further questions will be answered when I get back, or perhaps by some other participants.
10:52:49
nij-
Say after SICL is done and got extension of OS threads - before dump, we can interact with SICL executable (predumped) in the host, and that's going to have OS threads effectively?!?!
10:53:23
nij-
Will that predumped SICL also have a working garbage collector and a working debugger?!
10:58:57
nij-
6.1. Any plans on laying them out first? It may be easier when implementing, and it would allow more people to join the discussion. Higher levels can be cast out, and perhaps we can discover issues or the possible need of large-scale rewrite ahead of time.
12:39:03
beach
No, I don't see a way that you can interact with OS aspects of SICL before it is dumped.
12:41:16
beach
The debugger might work. In fact, we already have a primitive "debugger". I am not sure how to organize the debugger yet. I would like to see the preferred way to use SICL to be through a CLIM-based IDE.
12:56:21
yitzi
beach: I know you haven't gotten to khazern, etc. ... but I don't remember if I told you that I did some significant work on Cyclosis. It can be loaded extrinsically now which means there will definitely be changes when you get to that stage in bootstrapping.
12:57:24
beach
Yes, I pulled it and noticed. I'll deal with it when I get to it. Thanks for letting me know. I do use Khazern already.
12:58:25
yitzi
Ok. Please let me know if I can help in any way. One of the things I added to Cyclosis was a "transcoding" api that takes care of stream-external-format.