libera/#sicl - IRC Chatlog
8:04:05
beach
bike: I still don't understand exactly how it works, and at some point, I would like you to walk me through the steps the machine takes and what global cells are gotten from which environment, and how that is done.
8:04:06
beach
So in one environment E1, I have code in some function (say F) that (simplified) does (LET ((*SOME-VARIABLE* ...)) *SOME-VARIABLE*). Then I call F from a function G in a different environment E2. Evaluation takes place in environment E2. What global cell for *SOME-VARIABLE* is used when that variable is bound, and what global cell for it is used when its value is asked for?
8:05:22
beach
bike: I realize you are not awake, and I have to be somewhere else in a little while. So no rush. When you are awake and you have time.
8:08:19
beach
Since the symbol *SOME-VARIABLE* is not at all involved in environment E2, I can't figure out why a global cell from E2 would be involved at all either.
11:46:39
bike
beach: in that case the cell for both the binding and the read should be from E1, if I'm not mistaken
11:54:50
bike
here's how maclina works. when you compile some Lisp, it builds up the bytecode as well as a vector of literals referenced by the code. This literals vector includes literal constants, but it also includes cell indicators.
11:55:23
bike
For example if you compile (lambda (x) (foo x)), the literals vector during compilation will include a kind of note saying "FOO's cell should go here"
11:56:48
bike
Then, at link time, which if you're using maclina.compile:compile takes place immediately after this compilation process of building up bytecode and literals, maclina resolves these notes into actual cells.
11:58:20
bike
For variable and function cells it does this by calling the link-variable and link-function generics respectively. These generics get the client, the run time environment COMPILE got, and the name of the variable/function, and they're supposed to return a cell object, the nature of which is client defined.
11:59:12
bike
This cell goes into the bytecode function's literals vector. Then when you run the function, the various instructions (special-bind, fdefinition, etc) get this cell from the literals vector and deal with it in a client-dependent way.
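bike's compile-then-link description can be modeled in miniature. The following is a hypothetical Python sketch (none of these class or function names are maclina's actual API) of how placeholder notes in the literals vector get resolved into environment-owned cells at link time:

```python
# Hypothetical sketch of link-time cell resolution: compilation leaves
# placeholder notes in the literals vector, and linking replaces each
# note with a cell obtained from the chosen run-time environment.

class CellNote:
    """Placeholder emitted at compile time: 'FOO's cell should go here'."""
    def __init__(self, kind, name):
        self.kind = kind   # "variable" or "function"
        self.name = name

class Environment:
    """Toy environment mapping names to mutable one-slot cells."""
    def __init__(self):
        self.cells = {}
    def ensure_cell(self, kind, name):
        # Always hand back the same cell object for a given name.
        return self.cells.setdefault((kind, name), [None])

def link(literals, env):
    """Resolve every CellNote into an actual cell from `env`;
    plain constants pass through unchanged."""
    return [env.ensure_cell(lit.kind, lit.name)
            if isinstance(lit, CellNote) else lit
            for lit in literals]

e1 = Environment()
literals = [42, CellNote("function", "FOO")]
linked = link(literals, e1)
# Redefining FOO in E1 mutates the same cell, so already-linked
# code sees the new definition without relinking.
e1.ensure_cell("function", "FOO")[0] = lambda x: x + 1
assert linked[1][0](2) == 3
```

Because the cell is fetched from the environment passed at link time, code compiled in E1 keeps using E1's cells even when it is later called from E2, matching bike's answer above.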
12:25:27
beach
I found the problem. I called INITIALIZE-VM between the compilation of the code in E1 and the call in E2.
12:53:31
bike
well, that's why i'm confused. the constants are stored in the module objects, which are referenced from the function objects. the vm itself shouldn't really affect them.
12:56:02
bike
but, well, I can definitely imagine something going funky if you initialize partway through.
12:59:51
beach
The next issue resembles the previous one I had and that I never figured out. It was calling a different version of MAKE-INSTANCE from the one called by the AST evaluator, and now I think a different version of FIND-METHOD-COMBINATION is called. So I think I really need to figure out the source of that problem, rather than trying to patch every occurrence.
13:21:52
beach
bike: Do you "nullify" the stack element when you pop the stack? It looks like you don't, so then lots of objects might be kept alive.
13:45:57
bike
I don't think so, no. It would be nice if we could tell the GC that anything past a given point is garbage, but I don't think there are any interfaces for that, so it probably has to be nulling
14:07:34
karlosz
when we wrote the c++ vm we could just tell boehm not to scan past the stack pointer
14:08:07
karlosz
just having pop make things null doesn't do the trick, because i think there are other instructions that logically decrement the stack pointer
14:09:40
karlosz
you can also periodically clear the stack space after the stack pointer, i think it's enough to do this at function call boundaries and backward jumps
14:26:56
beach
Why don't you just have a function decrement the stack pointer and call that function each time you decrement?
14:28:03
beach
Also, if you look at the stack in an inspector, it is nicer if there are (say) 0s where elements are not valid.
14:30:16
karlosz
i think it was done that way for speed; a memcpy and a block decrement is faster than calling a function each time. say, you close over 100 values and 100 values need to be popped off the stack at once and shoved into a vector
14:31:05
karlosz
anyway, i don't remember what it was, but i think bike made it so that most things that do that in bulk call gather, so you just need to do a zero-fill there as well as in the pop instruction
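The two clearing strategies discussed above can be sketched together: nulling the slot inside the pop instruction itself, and zero-filling in bulk when many slots are released at once (as when a closure gathers a batch of values off the stack). This is a hypothetical Python model, not maclina's actual VM code:

```python
# Hypothetical VM stack sketch showing GC-friendly clearing:
# pop() nulls the vacated slot; gather() releases a whole region
# and zero-fills it in one bulk operation.

class VMStack:
    def __init__(self, size):
        self.slots = [None] * size
        self.sp = 0   # stack pointer: index of the next free slot

    def push(self, value):
        self.slots[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        value = self.slots[self.sp]
        self.slots[self.sp] = None   # drop the reference so the GC can reclaim it
        return value

    def gather(self, n):
        """Pop n values at once into a list, then zero-fill the freed region,
        so a bulk decrement does not leave stale references behind."""
        values = self.slots[self.sp - n:self.sp]
        self.sp -= n
        self.slots[self.sp:self.sp + n] = [None] * n
        return values

s = VMStack(8)
for v in "abc":
    s.push(v)
assert s.pop() == "c"
assert s.gather(2) == ["a", "b"]
# Nothing past the stack pointer keeps an object alive.
assert all(slot is None for slot in s.slots)
```

The bulk zero-fill in `gather` is the compromise karlosz describes: instead of a per-element function call on every decrement, the region is cleared once per bulk operation.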
14:31:15
beach
That sounds like SBCL philosophy. Do anything possible for performance, even if the result is worse in terms of debugging.
14:32:30
karlosz
beach: the bytecode vm was written because the evaluator it was replacing was too slow... also, the portable vm was not meant to be used for real, it was just to check the compiler output so that a real vm could be written against it
14:33:01
karlosz
i am surprised it is being used as a real thing now; as you can see, that's why there are issues like not zero-filling for gc
14:34:18
beach
I see. Even the performance argument doesn't ring true. There can't be more than a tiny slowdown if the stack element is cleared when the stack is popped. Nothing like the difference between Maclina and the evaluator it replaced.
14:35:27
karlosz
there's nothing wrong with clearing stack elements (that won't be much of a hit), but calling a function n times instead of zero-filling it at once will be slower
14:35:47
bike
well, i can tell you when i didn't nil out entries, it wasn't for speed, it was because it was a simple reference implementation, like karlosz said. something to fix up now.
14:36:24
bike
i have done a bit of actual profiling of the lisp VM, and I don't think niling out the stack will particularly make a dent.
14:36:47
beach
karlosz: I didn't say it wouldn't be slower. I said the slowdown will likely be very modest. And should not be compared to the difference in performance between Maclina and the evaluator it replaced.
14:37:08
bike
the main problem performancewise is actually argument and multiple value handling, so maybe karlosz's ideas can help there along with just improving the design.
14:42:37
karlosz
i'm working on a design (for a simpler scheme-targeted vm) that separates out the control stack from the eval stack, which should eliminate copying for arg passing and values return
14:43:16
karlosz
so no copying, just pointer frobbing with an occasional listify-rest-args thrown in
14:53:01
karlosz
i mean, i think the success of the bytecode stuff in clasp is the amount of control you have over it and it's tailored exactly for the implementation at hand, and there's no need to chase external apis
14:55:01
karlosz
the other thing with the bytecode vm is that calling from bytecode to bytecode entails some overhead
14:56:14
karlosz
i'd like to do a redesign where the equivalent of lambda lifting for local calls like we do in sbcl or bir is also done in the bytecode vm in the same compiler pass
15:06:10
beach
When a DEFINE-METHOD-COMBINATION form is evaluated in environment E3, when the AST evaluator is used, (SETF FIND-METHOD-COMBINATION-TEMPLATE) in E3 is called, but with Maclina, (SETF FIND-METHOD-COMBINATION-TEMPLATE) in E2 is called.
15:06:14
beach
But with both evaluators, FIND-METHOD-COMBINATION-TEMPLATE is called in E3, so with the AST evaluator, the hash table contains all useful templates, but with Maclina, it is empty.
16:06:12
kingcons
bike: Apologies if this seems out of left field, but I saw the GC discussion in the logs and wondered if you had seen / are aware of whippet as an option? https://github.com/wingo/whippet
16:06:12
Colleen
kingcons: scymtym said at 2023.12.15 15:15:39: it is true that EQL specializers work in both cases, however instances of standard classes also allow classes as specializers and thus specialization and generalization of "kinds"
16:06:12
Colleen
kingcons: scymtym said at 2023.12.15 15:16:24: for example, a method could be specialized to a "superkind" of multiple "kinds" instead of having to write multiple EQL-specialized methods
16:06:49
kingcons
Heh, I haven't signed in for a while. Also, that makes sense, should've considered that.
16:07:31
kingcons
Anyway, whippet may not be a better fit than MMTk for clasp / maclina, but given that it was designed to be easy to integrate with an existing Scheme system, I thought it might be interesting.
17:23:08
yitzi
scymtym: I'm not sure if I made it clear before, but Quaviver should signal floating-point-underflow for exponents that are too small....for your experiments using Quaviver from Eclector.
17:33:05
scymtym
yitzi: i don't think i have a test for that yet but i assume it would work like the overflow case for too large exponents?
17:43:39
yitzi
I would think so. You could just try to read 1e-1000000, etc. If you want the tests to be very fine grained you could extract the exponent from LEAST-POSITIVE-SINGLE-FLOAT, or use (quaviver:min-exponent 'single-float)
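The underflow behavior yitzi describes can be illustrated with a small Python analogue (a hypothetical sketch; Quaviver's real interface and Common Lisp's `floating-point-underflow` condition differ, and `parse_float_checked` is an invented name). Python's `float()` silently flushes a too-small exponent to zero, so the check has to be explicit:

```python
def parse_float_checked(text):
    """Parse a decimal float, raising on silent underflow to zero.
    Python has no floating-point-underflow condition, so OverflowError
    stands in for it here."""
    value = float(text)
    mantissa = text.lower().split("e")[0]
    # If the digits were nonzero but the result is 0.0, the decimal
    # exponent was below the representable range: underflow.
    if value == 0.0 and any(c in "123456789" for c in mantissa):
        raise OverflowError("floating-point underflow: " + text)
    return value

assert parse_float_checked("1e-3") == 1e-3
assert parse_float_checked("0.0") == 0.0
try:
    parse_float_checked("1e-1000000")   # far below the minimum exponent
    raised = False
except OverflowError:
    raised = True
assert raised
```

The `1e-1000000` probe is exactly the coarse-grained test yitzi suggests; a finer-grained test would derive the boundary exponent from the float type's parameters rather than hard-coding a huge value.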
17:46:44
yitzi
paulapatience: I have the basic outline of a stream interface (read-digits, write-digits, read-number, write-number) worked out. It's still very rough around the edges, but I think it has potential.
17:52:02
scymtym
i did some benchmarks with "my" jaffer implementation vs. liebler vs. the current Eclector code as a baseline. i had to tweak liebler a bit to be on par with the jaffer implementation. if there was a way to get rid of the dispatch overhead, liebler could be the fastest implementation
18:02:54
scymtym
i didn't change anything about the dispatch. i added type declarations and further split some cases so that what was previously a full call to ASH could be optimized due to the tighter inferred types
18:07:36
yitzi
Hmmm... how did you benchmark them? For me liebler is already 4 times faster than jaffer.
18:13:33
scymtym
i called the token conversion function many times for each implementation. i should probably have included quaviver/jaffer in the comparison. when i say "my" jaffer, i mean the one from the eclector github issue
18:18:08
yitzi
I started off with one of the jaffer implementations from that issue, but then ended up writing a version from scratch via the paper. One thing I am a little concerned about in some of the versions on the Eclector issue is that there are places where FLOAT is being called on what could be a bignum, which to me seems like it is just resorting to whatever algorithm the underlying implementation is using.