libera/#sicl - IRC Chatlog
11:41:25
beach
So if BOCL is implemented using the Boehm garbage collector and the GNU multiple precision library, then a lot of the complexity of a Common Lisp implementation vanishes.
11:42:42
beach
Those are both "legitimate" choices, given the objective that it should be possible to build BOCL with only the C tool chain.
11:46:25
mfiano
I don't know if there are multiple Boehm implementations in C that are worthwhile, but I can say that the one used by Crystal is very bad.
11:47:11
hayley
I don't think BOCL is intended to be used in production, and Boehm can be made more "precise" to an extent, which helps.
11:47:18
beach
It doesn't really matter if it is somewhat bad, as long as it doesn't run out of memory entirely.
11:48:02
mfiano
It exhausts all 32GB of my memory quite easily with small workloads, and according to the authors this is expected behavior.
11:48:22
hayley
See https://github.com/vlang/v/pull/9716 (yes, V somehow is relevant) for some numbers.
11:49:13
jackdaniel
but, given that bocl is meant primarily for bootstrapping, limiting bigint to the biggest integer type (in c language) and then signaling a serious condition 'storage-exhausted'; and not expecting it to consume more than 32GB seem to be sane limitations
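[Editor's sketch] The limitation jackdaniel describes could look roughly like the following: integers live in the widest C integer type, and an operation that would overflow reports a failure that the Lisp side would turn into a `storage-exhausted` condition. The names `bocl_status`, `BOCL_STORAGE_EXHAUSTED`, and `bocl_add` are hypothetical and do not come from any BOCL source; this is only one way the idea might be expressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes. BOCL_STORAGE_EXHAUSTED stands in for
   signaling the serious condition mentioned in the message above. */
typedef enum { BOCL_OK, BOCL_STORAGE_EXHAUSTED } bocl_status;

/* Add two integers held in the widest C integer type.
   Signed overflow is undefined behavior in C, so the range check is
   done before the addition rather than after it. */
bocl_status bocl_add(intmax_t a, intmax_t b, intmax_t *result)
{
    assert(result != NULL);
    if ((b > 0 && a > INTMAX_MAX - b) ||
        (b < 0 && a < INTMAX_MIN - b))
        return BOCL_STORAGE_EXHAUSTED;   /* would not fit: report it */
    *result = a + b;
    return BOCL_OK;
}
```

A caller would check the returned status after every arithmetic operation and signal the condition instead of continuing with a wrapped value.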
11:49:14
mfiano
"Since the collector does not require pointers to be tagged, it does not attempt to ensure that all inaccessible storage is reclaimed. However, in our experience, it is typically more successful at reclaiming unused memory than most C programs using explicit deallocation."
11:50:14
hayley
If it makes you feel any better, there's enough address space that other data rarely looks like a pointer, and Boehm tries to allocate in a way that avoids such false positives.
11:50:15
jackdaniel
mfiano: in "precise" mode it allows specifying necessary information to ensure a complete tree
11:51:50
mfiano
Crystal (language) frequently uses many GB by just running a language server for a little while, and OOM kills things after a few days with 32GB capacity. (it uses the above library for garbage collection).
11:51:54
beach
So maybe pjb's idea is better, i.e., write a garbage collector, but make it very simple since performance is not an issue.
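[Editor's sketch] To give a sense of how small a collector can be when performance is not a concern, here is a minimal precise mark-and-sweep collector over cons-like cells. It is not pjb's design or anything taken from BOCL; the names `cell`, `gc_alloc`, and `gc_collect` are hypothetical, and the sketch assumes every object has exactly two pointer fields and that the caller supplies the roots explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct cell {
    struct cell *car, *cdr;  /* children; NULL means "no child" */
    struct cell *next;       /* links every allocated cell together */
    int marked;              /* set during the mark phase */
} cell;

static cell *all_cells = NULL;  /* list of every cell still allocated */
static size_t live_count = 0;   /* number of cells on that list */

/* Allocate a cell and register it with the collector. */
cell *gc_alloc(cell *car, cell *cdr)
{
    cell *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->car = car; c->cdr = cdr; c->marked = 0;
    c->next = all_cells; all_cells = c;
    live_count++;
    return c;
}

/* Mark everything reachable from c: recurse on car, iterate on cdr.
   The marked check also stops the walk on cyclic structures. */
static void mark(cell *c)
{
    while (c != NULL && !c->marked) {
        c->marked = 1;
        mark(c->car);
        c = c->cdr;
    }
}

/* Free every cell not reachable from the roots; return how many were freed. */
size_t gc_collect(cell **roots, size_t nroots)
{
    size_t freed = 0;
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
    cell **link = &all_cells;
    while (*link != NULL) {
        cell *c = *link;
        if (c->marked) {            /* survivor: clear mark for next cycle */
            c->marked = 0;
            link = &c->next;
        } else {                    /* garbage: unlink and free */
            *link = c->next;
            free(c);
            live_count--;
            freed++;
        }
    }
    return freed;
}
```

Because the collector knows exactly which fields are pointers, it never mistakes an integer for a reference, which is the property the "precise" discussion above is about.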
11:53:22
hayley
Well, I don't know anything about Crystal, but Boehm seems to work fairly well for ECL, Clasp, and other languages.
11:54:06
mfiano
Like I said, I don't claim to know much about Boehm or even if there are multiple worthwhile implementations. I am only speaking about the above library in the context of Crystal's use of it.
11:54:07
beach
The idea I had for BOCL was to use essentially the same object representation as SICL has with a two-word header and a rack. But for BOCL it would also be used for other objects like CONS cells and integers.
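[Editor's sketch] The representation beach describes, a two-word header plus a separately allocated rack, might be written in C along the following lines. The field names, the slot layout of the racks, and the helper functions `make_object`, `make_cons`, and `make_integer` are all assumptions made for illustration; they are not taken from SICL or BOCL sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct object object;

/* Two-word header: one word for the class, one for the rack. */
struct object {
    object    *class_;  /* hypothetical field name */
    uintptr_t *rack;    /* separately allocated vector of slot words */
};

/* Allocate a header and a zeroed rack of nslots words; NULL on failure. */
object *make_object(object *class_, size_t nslots)
{
    object *obj = malloc(sizeof *obj);
    if (obj == NULL) return NULL;
    obj->rack = calloc(nslots ? nslots : 1, sizeof *obj->rack);
    if (obj->rack == NULL) { free(obj); return NULL; }
    obj->class_ = class_;
    return obj;
}

/* A CONS cell uses the same layout: its rack holds CAR and CDR. */
object *make_cons(object *cons_class, object *car, object *cdr)
{
    object *c = make_object(cons_class, 2);
    if (c != NULL) {
        c->rack[0] = (uintptr_t)car;
        c->rack[1] = (uintptr_t)cdr;
    }
    return c;
}

/* An integer also gets a header; its rack holds the raw value. */
object *make_integer(object *integer_class, intptr_t value)
{
    object *i = make_object(integer_class, 1);
    if (i != NULL) i->rack[0] = (uintptr_t)value;
    return i;
}
```

Giving every object the same shape costs memory and an extra indirection, but it means the rest of the system, including a collector, only ever has to understand one layout.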
11:57:12
jackdaniel
certainly a simple precise gc would be more suitable for bocl; avoiding libgmp would (imo) also fall in that category (either by limiting the size as I've mentioned before or by implementing some kind of naive algorithms)
11:58:00
jackdaniel
especially since libgmp has some licensing implications -> up to the version 4.x it is lgpl-2.1+, and from then onward it is lgpl-3.0+
11:58:23
mfiano
I guess my main point is, a GC is extremely important. Don't use/implement a GC that will give GCs an even worse reputation than they already have. It's bad enough we have so many productivity-breaking languages against GC because of the performance myth.
12:00:43
hayley
I agree in principle, but BOCL is not supposed to be used in production, and Boehm doesn't sound that bad in the other uses I'm aware of.
12:01:08
jackdaniel
(while both licenses are fine, there are always people who are ready to defend rights of big corpo to take things for free without strings attached ;-)
12:01:51
jackdaniel
boehm gc is a very sane choice for ecl and clasp that heavily interoperate with the c world
12:02:17
jackdaniel
but if you are not concerned with heavy-lifted ffi then having a ground-up precise gc makes more sense
12:03:02
Bike
in clasp we're looking at non boehm GCs, and even now we're using boehm with precise collection
12:03:47
jackdaniel
Bike: what happens if you pass something from Lisp world to C++ world and the latter "saves" the pointer for later use?
12:03:52
heisig
My experience is that boehm is essentially as good as it gets, under the constraints imposed by the C language.
12:04:37
jackdaniel
I think that bocl is an old idea of his, for bootstrapping on debian without relying on clisp or ecl (:
12:04:40
beach
heisig: This is a project I started some time ago. I just give it some thought from time to time.
12:05:27
jackdaniel
obvious way of crashing is better, because it may be easily tackled when identified. dangling pointers are another story
12:07:24
beach
heisig: There are some real concerns as well. Some Linux distributions apparently require that their packages can be built using only the C tool chain.
12:09:27
beach
heisig: BOCL is also an element in the debate about the best way of writing a Common Lisp implementation.
12:10:29
hayley
I remember ECL being easier to bootstrap from, mostly because I couldn't find a copy of libsigsegv which the CLISP build scripts liked, and because ECL is a fair bit faster.
12:11:14
beach
heisig: With a very simple, but conforming implementation like BOCL, there are fewer reasons to write Common Lisp implementations in any language other than Common Lisp.
12:11:24
jackdaniel
right. otoh clisp is more portable because libgc requires a few lines of code that are platform-specific
12:12:32
jackdaniel
beach: there are other benefits - libgmp is by far the fastest bignum implementation afaik
12:12:52
heisig
My hunch is that BOCL wouldn't be much different from Clisp in the end. Maybe one could simply fork Clisp as a starting point and simplify it further.
12:14:30
beach
heisig: You apparently haven't done the exercise of thinking about simplifications that can be made when performance is not an issue.
12:15:35
jackdaniel
that said, if bocl is, say, slower than clisp, then it would be quite troublesome to bootstrap from it
12:16:21
beach
It would be used only to build the packages of those particular Linux distributions that impose this restriction.
12:20:03
beach
jackdaniel: Also, I just read up a bit on the GNU multiple precision library. I think it would be fairly simple to incorporate it into SICL for someone who would want that. Recall that the global GC in SICL is essentially malloc()/free(), and it is possible to configure the GNU multiple precision library to use any memory allocator.
12:21:02
beach
The rack of a bignum or a ratio would just contain an instance of the appropriate C object.
12:21:12
jackdaniel
I'm not denying a technical possibility, I was referring to writing the implementation in not "any language other than Common Lisp"
12:22:11
jackdaniel
in fact libgmp is being incorporated by ecl (and I think - optionally - by sbcl)
12:22:32
beach
jackdaniel: You are confusing two things here I think. One is my desire to have no C code whatsoever. The other is the implementation of the Common Lisp evaluator and "library".
12:23:35
Bike
as enticing as the prospect to try to compete with a schonhage-strassen implementation written by someone who knows what they're doing was
12:23:38
jackdaniel
beach: I was convinced you were aiming at a system that would minimize the dependency on a C compiler
12:24:59
beach
Apparently, I am not doing a good job of expressing myself clearly today, so I'll go do something else instead.
12:28:52
pjb
jackdaniel: The current implementation choices made for my bocl will probably make it slower than clisp: it's an interpreter, not a compiler.
12:28:56
jackdaniel
(or, to be more precise, dependency on C libraries at runtime); sorry for upsetting you beach
12:29:37
jackdaniel
pjb: the argument that it won't be normally used is sound to me, so I'm already convinced that this is a non-issue from the bocl standpoint
12:30:59
jackdaniel
pjb: fwiw ecl has a bytecodes compiler/interpreter and performs a minimal compilation of the code (i.e., it doesn't interpret it, but doesn't optimize it either)
12:31:50
pjb
The problem has more to do with the C code, and how standard and compatible with a large range of platforms it is (how little implementation-specific or undefined behavior it contains).
12:32:31
pjb
Any CL implementation could be adopted if we can remove dubious C code from it and adapt to a large range of platforms.
12:33:44
pjb
That said, once the minimal compilation is performed, interpreting the function calls and the special operators by walking a sexp cannot be much slower than interpreting the byte code of a VM.
12:35:10
pjb
So, using clisp as a base could be a possibility. The question is how intricate the clisp C code is and how much it relies on non-standard C behaviors.
13:10:10
lonjil
I predict that most or all Linux distros will build SICL with Clisp or SBCL, regardless of BOCL being available or not.
14:21:45
beach
lonjil: Why do you think that? I mean, if (say) SICL were to be distributed as a package for some Linux distribution with instructions to just type `make', why would they do something different?
14:27:26
lonjil
Whether they decide *not* to do what you said would probably depend on how slow bootstrapping with BOCL ends up being. If it is very very slow they may use SBCL instead to save CPU time on their build servers.