libera/#sicl - IRC Chatlog
4:41:40
mfiano
beach: It might be me, but I noticed something odd with CLHS, that may or may not be good for WSCL to clarify.
4:46:46
mfiano
Does this have to do with how SICL doesn't store things in the symbol object directly?
4:48:41
beach
Somewhat. It has to do with the fact that the term suggests a particular implementation technique that used to be the only one in the past. But nowadays, it is not even a viable implementation, given threads and such.
4:50:43
mfiano
Well it may be hard to delete such a thing from the standard for WSCL. Clarification on the original intent is another story.
4:52:16
beach
The goal is mainly to specify things that, for no good reason, are unspecified in the standard.
4:53:07
beach
Especially "things" where all major implementations do specify them, roughly in the same way.
4:56:28
mfiano
The page on CASE says it matches keys based on their "identity". This can be misleading, and only "identical" is a glossary term.
4:59:09
mfiano
It's quite amazing how well the standard held up over time considering all these goofs
5:00:52
mfiano
That's the confusing part. All implementations (that I know of) do, but afaik, it is unspecified.
5:02:31
mfiano
That's the part I was forgetting. I knew that when I first discovered this years ago.
5:04:31
Mondenkind
even if it didn't mention that, ccase and ecase mention an expected type of (member key1 key2 key3), which is strong evidence for EQL
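The EQL reading of CASE discussed here can be checked at a REPL with a minimal sketch like the following. The function name CLASSIFY and its keys are made up for illustration; only CASE, TYPEP, and the MEMBER type specifier come from the standard.

```lisp
;; CASE compares the test key with each clause key using EQL
;; (the glossary's unqualified "same"), not EQ and not EQUAL.
;; CLASSIFY is a hypothetical name used only for this illustration.
(defun classify (x)
  (case x
    (1         :integer-one)
    (1.0       :float-one)
    (#\a       :char-a)
    ("foo"     :string-foo)   ; matches only this very literal object
    (otherwise :no-match)))

(classify 1)                  ; => :INTEGER-ONE
(classify 1.0)                ; => :FLOAT-ONE, since 1 and 1.0 are not EQL
(classify #\a)                ; => :CHAR-A
(classify (copy-seq "foo"))   ; => :NO-MATCH, a fresh string is EQUAL but not EQL

;; The expected type reported by CCASE and ECASE is a MEMBER type,
;; and membership in a MEMBER type is itself defined in terms of EQL.
(typep 2   '(member 1 2 3))   ; => T
(typep 2.0 '(member 1 2 3))   ; => NIL
```

The fresh string is created with COPY-SEQ because a compiler may coalesce similar literal strings, which would make a literal argument an unreliable test.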
5:08:53
mfiano
Yeah, one cannot just read the summary to get an idea of how CASE behaves; if you do you may mistake that word for a similar one in the language lexicon, rather than its intended English definition. So, you are expected to catch this, and read past the summary.
5:12:30
mfiano
Why it bothers me so much is that this was actually the first thing I read when I came to Common Lisp many years ago, and right then I knew how carefully you had to read it as an average user of the language (because we don't have good user documentation yet)
5:12:35
hayley
Would it be difficult to write "identical under EQL" or "identical under EQ", rather than specifying that "same" without qualification refers to EQL, and "identical" to EQ? The glossary definition for "same" mentions qualifying with "under <an equality predicate>", as well as just "same" with EQL.
5:14:49
beach
Well, "identical under EQL" would be a contradiction, but "same under EQL" would be good. And yes, repeating that information rather than assuming the reader will go to the glossary would be a good thing.
5:15:33
mfiano
My thoughts on the matter are deeper than that. I think the glossary is over-used, and things should be "spelled out" more (at the expense of duplication).
5:16:10
hayley
Such a reform would remove the definition of "identical" to mean "same under EQ", but that change could be pushing it too far.
5:20:26
Mondenkind
I think one should be careful about assigning specific technical meanings to words which also have common english meanings
5:22:58
mfiano
We can just choose names like ersatz, that apparently many people have not heard of before :)
5:23:45
hayley
The two hardest problems in Netfarm: 1. what to call a "system" (most of a server, notably excluding networking code) and 2. how to shorten the name UPDATE-SYSTEM-FOR-NEW-INTERESTING-OBJECT-PREDICATE
5:25:56
hayley
Somehow choosing uncommon names seems harder than choosing names, but that is why we have thesauruses.
5:27:16
beach
Mondenkind: I think that, as long as the use of the word is unique to the domain in question, it is fine to use ordinary words.
5:30:51
hayley
Well, that is what I learnt in primary school, where they really didn't want you to write "said" too often, e.g. "beach said 'How is SB-SIMD going?' heisig said 'I found the instruction that made my processor catch fire.' hayley said 'Does it work on 16-bit integers?'"
5:31:31
hayley
Mondenkind: I guess coming up with less common synonyms is something you can do with a thesaurus, but not with a dictionary. You could look through a dictionary to find words (assuming you know a prefix of the word).
5:32:44
Mondenkind
hayley: 'coming up with less common synonyms' is bullshit. Words have meanings. If my meaning is closest to the meaning of one word, but I choose another word instead, then I have not communicated effectively
5:33:52
hayley
Mondenkind: I agree, but we're playing the stupid game of "picking names that many people have not heard of before", and that's the stupid prize.
7:22:01
heisig
hayley: I wouldn't be surprised at all if x86 had an instruction for setting the processor on fire :)
7:22:56
heisig
And given how sensibly the instruction set is laid out, it would differ only in a single prefix bit from some commonly used MOV instruction :)
7:24:27
hayley
Though I recall people do fuzz x86-64 processors, and I don't think there have been HCF instructions yet.
7:27:08
heisig
ACTION thinks of how he could implement HCF in sb-simd - and about a proper German name for it.
7:29:28
jackdaniel
it's funny how they lump "LISP variants" with scheme and say that list is the only data type (https://users.monash.edu/~damian/papers/PDF/SevenDeadlySins.pdf)
7:30:52
hayley
No, they're not cache friendly, pointers are bad (but let me tell you about b-trees). You only have vectors.
7:34:34
hayley
If you want a language where lists are the only data type, then you probably want NILP <https://gist.github.com/Goheeca/b4ce549203e52d8d33dc3eaecc173ee5>
7:36:41
heisig
Except the part where they complain about the 'paradigmatic purity' of Lisp. That made me giggle.
7:37:04
hayley
I didn't mind the Eno song Seven Deadly Finns, though it doesn't sound like the art rock stuff or the ambient stuff.
7:39:26
jackdaniel
words shape the reality, in accordance with that we could say that there is a lot of offtopic -- still bad but smells better ;p /me gets back to typing code
7:39:42
hayley
heisig: It's fun seeing people try to categorise Lisp in general. Never seems to go well.
7:45:52
Mondenkind
hayley: https://www.youtube.com/watch?v=jmTwlEh8L7g mentions a halt and catch fire instruction :)
11:41:25
beach
So if BOCL is implemented using the Boehm garbage collector and the GNU multiple precision library, then a lot of the complexity of a Common Lisp implementation vanishes.
11:42:42
beach
Those are both "legitimate" choices, given the objective that it should be possible to build BOCL with only the C tool chain.
11:46:25
mfiano
I don't know if there are multiple Boehm implementations in C that are worthwhile, but I can say that the one used by Crystal is very bad.
11:47:11
hayley
I don't think BOCL is intended to be used in production, and Boehm can be made more "precise" to an extent, which helps.
11:47:18
beach
It doesn't really matter if it is somewhat bad, as long as it doesn't run out of memory entirely.
11:48:02
mfiano
It exhausts all 32GB of my memory quite easily with small workloads, and this is expected behavior to the authors.
11:48:22
hayley
See https://github.com/vlang/v/pull/9716 (yes, V somehow is relevant) for some numbers.
11:49:13
jackdaniel
but, given that bocl is meant primarily for bootstrapping, limiting bigint to the biggest integer type (in the C language) and then signaling a serious condition 'storage-exhausted'; and not expecting it to consume more than 32GB seem to be sane limitations
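The "limit integers and signal storage-exhausted" idea can be modelled in Common Lisp as a rough sketch. BOCL itself would do this in C, and everything named here (STORAGE-EXHAUSTED as a condition class, *WIDEST-INTEGER-BITS*, BOUNDED-INTEGER-OP) is a hypothetical name chosen for illustration. The 64-bit width is likewise an assumption. Only STORAGE-CONDITION, a standard subtype of SERIOUS-CONDITION, comes from the standard.

```lisp
;; Hypothetical condition: signaled when a result would need a bignum.
(define-condition storage-exhausted (storage-condition)
  ((operation :initarg :operation :reader storage-exhausted-operation))
  (:report (lambda (condition stream)
             (format stream "Result of ~S does not fit in the widest native integer."
                     (storage-exhausted-operation condition)))))

;; Assumption: the widest C integer type is a 64-bit signed integer.
(defparameter *widest-integer-bits* 64)

(defun fits-in-native-integer-p (n)
  "True if the integer N is representable in the assumed signed native width."
  (let ((limit (expt 2 (1- *widest-integer-bits*))))
    (<= (- limit) n (1- limit))))

(defun bounded-integer-op (operation &rest arguments)
  "Apply OPERATION to integer ARGUMENTS, refusing results outside the native range.
This model leans on the host's bignums to detect the overflow; a C runtime
would have to detect it before or during the operation instead."
  (let ((result (apply operation arguments)))
    (if (fits-in-native-integer-p result)
        result
        (error 'storage-exhausted :operation (cons operation arguments)))))

(bounded-integer-op '+ 1 2)   ; => 3

;; 2^80 does not fit in 64 bits, so the condition is signaled and handled.
(handler-case (bounded-integer-op '* (expt 2 40) (expt 2 40))
  (storage-exhausted () :exhausted))   ; => :EXHAUSTED
```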
11:49:14
mfiano
"Since the collector does not require pointers to be tagged, it does not attempt to ensure that all inaccessible storage is reclaimed. However, in our experience, it is typically more successful at reclaiming unused memory than most C programs using explicit deallocation."
11:50:14
hayley
If it makes you feel any better, there's enough address space that other data rarely looks like a pointer, and Boehm tries to allocate in a way that avoids such false positives.
11:50:15
jackdaniel
mfiano: in "precise" mode it allows specifying necessary information to ensure a complete tree
11:51:50
mfiano
Crystal (language) frequently uses many GB by just running a language server for a little while, and OOM kills things after a few days with 32GB capacity. (it uses the above library for garbage collection).
11:51:54
beach
So maybe pjb's idea is better, i.e., write a garbage collector, but make it very simple since performance is not an issue.
11:53:22
hayley
Well, I don't know anything about Crystal, but Boehm seems to work fairly well for ECL, Clasp, and other languages.
11:54:06
mfiano
Like I said, I don't claim to know much about Boehm or even if there are multiple worthwhile implementations. I am only speaking about the above library in the context of Crystal's use of it.
11:54:07
beach
The idea I had for BOCL was to use essentially the same object representation as SICL has with a two-word header and a rack. But for BOCL it would also be used for other objects like CONS cells and integers.
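A uniform "header plus rack" representation might look roughly like the sketch below. It is a model in Common Lisp, not SICL's or BOCL's actual layout: the structure name HEADER, its slot names, and the BOXED- functions are all hypothetical, and the rack contents shown are an assumption.

```lisp
;; Hypothetical model: every object is a two-word header. One word says
;; what kind of object it is, the other refers to a rack with the data.
(defstruct (header (:constructor make-header (kind rack)))
  kind   ; word 1: identifies the class of the object
  rack)  ; word 2: simple vector holding the object's contents

;; Even CONS cells and integers use the same layout in this model.
(defun boxed-cons (car cdr)
  (make-header 'cons (vector car cdr)))

(defun boxed-car (object)
  (assert (eq (header-kind object) 'cons))
  (svref (header-rack object) 0))

(defun boxed-cdr (object)
  (assert (eq (header-kind object) 'cons))
  (svref (header-rack object) 1))

(defun boxed-integer (value)
  (make-header 'integer (vector value)))

(defun boxed-integer-value (object)
  (assert (eq (header-kind object) 'integer))
  (svref (header-rack object) 0))

;; Usage: a cons whose CAR is a boxed integer.
(let ((cell (boxed-cons (boxed-integer 42) nil)))
  (boxed-integer-value (boxed-car cell)))   ; => 42
```

The point of the single layout is simplicity: a garbage collector or printer only needs to understand one object shape, at the cost of an extra indirection for small objects.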
11:57:12
jackdaniel
certainly a simple precise gc would be more suitable for bocl; avoiding libgmp would (imo) also fall in that category (either by limiting the size as I've mentioned before or by implementing some kind of naive algorithms)
11:58:00
jackdaniel
especially since libgmp has some licensing implications -> up to version 4.x it is lgpl-2.1+, and from then onward it is lgpl-3.0+
11:58:23
mfiano
I guess my main point is, a GC is extremely important. Don't use/implement a GC that will give GCs an even worse reputation than they already have. It's bad enough we have so many productivity-breaking languages against GC because of the performance myth.
12:00:43
hayley
I agree in principle, but BOCL is not supposed to be used in production, and Boehm doesn't sound that bad in the other uses I'm aware of.
12:01:08
jackdaniel
(while both licenses are fine, there are always people who are ready to defend rights of big corpo to take things for free without strings attached ;-)
12:01:51
jackdaniel
boehm gc is a very sane choice for ecl and clasp that heavily interoperate with the c world
12:02:17
jackdaniel
but if you are not concerned with heavy-lifted ffi then having a ground-up precise gc makes more sense
12:03:02
Bike
in clasp we're looking at non boehm GCs, and even now we're using boehm with precise collection
12:03:47
jackdaniel
Bike: what happens if you pass something from Lisp world to C++ world and the latter "saves" the pointer for later use?
12:03:52
heisig
My experience is that boehm is essentially as good as it gets, under the constraints imposed by the C language.
12:04:37
jackdaniel
I think that bocl is an old idea of his, for bootstrapping on debian without relying on clisp or ecl (:
12:04:40
beach
heisig: This is a project I started some time ago. I just give it some thought from time to time.
12:05:27
jackdaniel
obvious way of crashing is better, because it may be easily tackled when identified. dangling pointers are another story
12:07:24
beach
heisig: There are some real concerns as well. Some Linux distributions apparently require that their packages can be built using only the C tool chain.
12:09:27
beach
heisig: BOCL is also an element in the debate about the best way of writing a Common Lisp implementation.
12:10:29
hayley
I remember ECL being easier to bootstrap from, mostly because I couldn't find a copy of libsigsegv which the CLISP build scripts liked, and that ECL is a fair bit faster.
12:11:14
beach
heisig: With a very simple, but conforming implementation like BOCL, there are fewer reasons to write Common Lisp implementations in any language other than Common Lisp.
12:11:24
jackdaniel
right. otoh clisp is more portable because libgc requires a few lines of code that are platform-specific
12:12:32
jackdaniel
beach: there are other benefits - libgmp is by far the fastest bignum implementation afaik
12:12:52
heisig
My hunch is that BOCL wouldn't be much different from Clisp in the end. Maybe one could simply fork Clisp as a starting point and simplify it further.
12:14:30
beach
heisig: You apparently haven't done the exercise of thinking about simplifications that can be made when performance is not an issue.
12:15:35
jackdaniel
that said, if bocl is say slower than clisp, then it would be quite troublesome to bootstrap from it
12:16:21
beach
It would be used only to build the packages of those particular Linux distributions that impose this restriction.
12:20:03
beach
jackdaniel: Also, I just read up a bit on the GNU multiple precision library. I think it would be fairly simple to incorporate it into SICL for someone who would want that. Recall that the global GC in SICL is essentially malloc()/free(), and it is possible to configure the GNU multiple precision library to use any memory allocator.
12:21:02
beach
The rack of a bignum or a ratio would just contain an instance of the appropriate C object.
12:21:12
jackdaniel
I'm not denying a technical possibility, I was referring to writing the implementation in not "any language other than Common Lisp"
12:22:11
jackdaniel
in fact libgmp is being incorporated by ecl (and I think - optionally - by sbcl)
12:22:32
beach
jackdaniel: You are confusing two things here I think. One is my desire to have no C code whatsoever. The other is the implementation of the Common Lisp evaluator and "library".
12:23:35
Bike
as enticing as the prospect to try to compete with a Schönhage-Strassen implementation written by someone who knows what they're doing was
12:23:38
jackdaniel
beach: I was convinced you were aiming at a system that keeps the dependency on a C compiler to a minimum
12:24:59
beach
Apparently, I am not doing a good job of expressing myself clearly today, so I'll go do something else instead.
12:28:52
pjb
jackdaniel: The current implementation choices made for my bocl will probably make it slower than clisp: it's an interpreter, not a compiler.
12:28:56
jackdaniel
(or, to be more precise, dependency on C libraries at runtime); sorry for upsetting you beach
12:29:37
jackdaniel
pjb: the argument that it won't be normally used is sound to me, so I'm already convinced that this is a non-issue from the bocl standpoint
12:30:59
jackdaniel
pjb: fwiw ecl has a bytecodes compiler/interpreter and performs a minimal compilation of the code (i.e., it doesn't interpret it, but doesn't optimize it either)
12:31:50
pjb
The problem has more to do with the C code, and how standard and compatible with a large range of platforms it is (how little implementation-specific or undefined behavior it contains).
12:32:31
pjb
Any CL implementation could be adopted if we can remove dubious C code from it and adapt to a large range of platforms.
12:33:44
pjb
That said, once the minimal compilation is performed, interpreting the function calls and the special operators walking a sexp cannot be much slower than interpreting the byte code of a VM.
12:35:10
pjb
So, using clisp as a base could be a possibility. The question is how intricate clisp C code is and how much it relies on non-purely standard C behaviors.
13:10:10
lonjil
I predict that most or all Linux distros will build SICL with Clisp or SBCL, regardless of BOCL being available or not.
14:21:45
beach
lonjil: Why do you think that? I mean, if (say) SICL were to be distributed as a package for some Linux distribution with instructions to just type `make', why would they do something different?
14:27:26
lonjil
Whether they decide *not* to do what you said would probably depend on how slow bootstrapping with BOCL ends up being. If it is very very slow they may use SBCL instead to save CPU time on their build servers.