libera/#sicl - IRC Chatlog
8:04:05
beach
bike: I still don't understand exactly how it works, and at some point, I would like you to walk me through the steps the machine takes and what global cells are gotten from which environment, and how that is done.
8:04:06
beach
So in one environment E1, I have code in some function (say F) that (simplified) does (LET ((*SOME-VARIABLE* ...)) *SOME-VARIABLE*). Then I call F from a function G in a different environment E2. Evaluation takes place in environment E2. What global cell for *SOME-VARIABLE* is used when that variable is bound, and what global cell for it is used when its value is asked for?
8:05:22
beach
bike: I realize you are not awake, and I have to be somewhere else in a little while. So no rush. When you are awake and you have time.
8:08:19
beach
Since the symbol *SOME-VARIABLE* is not at all involved in environment E2, I can't figure out why a global cell from E2 would be involved at all either.
11:46:39
bike
beach: in that case the cell for both the binding and the read should be from E1, if I'm not mistaken
11:54:50
bike
here's how maclina works. when you compile some Lisp, it builds up the bytecode as well as a vector of literals referenced by the code. This literals vector includes literal constants, but it also includes cell indicators.
11:55:23
bike
For example if you compile (lambda (x) (foo x)), the literals vector during compilation will include a kind of note saying "FOO's cell should go here"
11:56:48
bike
Then, at link time, which if you're using maclina.compile:compile takes place immediately after this compilation process of building up bytecode and literals, maclina resolves these notes into actual cells.
11:58:20
bike
For variable and function cells it does this by calling the link-variable and link-function generics respectively. These generics get the client, the run time environment COMPILE got, and the name of the variable/function, and they're supposed to return a cell object, the nature of which is client defined.
11:59:12
bike
This cell goes into the bytecode function's literals vector. Then when you run the function, the various instructions (special-bind, fdefinition, etc) get this cell from the literals vector and deal with it in a client-dependent way.
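bike's compile-then-link description can be modeled in miniature. The following is a hypothetical Python sketch (none of these class or function names are maclina's actual API) of how placeholder notes in the literals vector get resolved into environment-owned cells at link time:

```python
# Hypothetical sketch of link-time cell resolution: compilation leaves
# placeholder notes in the literals vector, and linking replaces each
# note with a cell obtained from the chosen run-time environment.

class CellNote:
    """Placeholder emitted at compile time: 'FOO's cell should go here'."""
    def __init__(self, kind, name):
        self.kind = kind   # "variable" or "function"
        self.name = name

class Environment:
    """Toy environment mapping names to mutable one-slot cells."""
    def __init__(self):
        self.cells = {}
    def ensure_cell(self, kind, name):
        # Always hand back the same cell object for a given name.
        return self.cells.setdefault((kind, name), [None])

def link(literals, env):
    """Resolve every CellNote into an actual cell from `env`;
    plain constants pass through unchanged."""
    return [env.ensure_cell(lit.kind, lit.name)
            if isinstance(lit, CellNote) else lit
            for lit in literals]

e1 = Environment()
literals = [42, CellNote("function", "FOO")]
linked = link(literals, e1)
# Redefining FOO in E1 mutates the same cell, so already-linked
# code sees the new definition without relinking.
e1.ensure_cell("function", "FOO")[0] = lambda x: x + 1
assert linked[1][0](2) == 3
```

Because the cell is fetched from the environment passed at link time, code compiled in E1 keeps using E1's cells even when it is later called from E2, matching bike's answer above.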
12:25:27
beach
I found the problem. I called INITIALIZE-VM between the compilation of the code in E1 and the call in E2.
12:53:31
bike
well, that's why i'm confused. the constants are stored in the module objects, which are referenced from the function objects. the vm itself shouldn't really affect them.
12:56:02
bike
but, well, I can definitely imagine something going funky if you initialize partway through.
12:59:51
beach
The next issue resembles the previous one I had and that I never figured out. It was calling a different version of MAKE-INSTANCE from the one called by the AST evaluator, and now I think a different version of FIND-METHOD-COMBINATION is called. So I think I really need to figure out the source of that problem, rather than trying to patch every occurrence.
13:21:52
beach
bike: Do you "nullify" the stack element when you pop the stack? It looks like you don't, so then lots of objects might be kept alive.
13:45:57
bike
I don't think so, no. It would be nice if we could tell the GC that anything past a given point is garbage, but I don't think there are any interfaces for that, so it probably has to be nulling
14:07:34
karlosz
when we wrote the c++ vm we could just tell boehm not to scan past the stack pointer
14:08:07
karlosz
just having pop make things null doesn't do the trick, because i think there are other instructions that logically decrement the stack pointer
14:09:40
karlosz
you can also periodically clear the stack space after the stack pointer, i think it's enough to do this at function call boundaries and backward jumps
14:26:56
beach
Why don't you just have a function decrement the stack pointer and call that function each time you decrement?
14:28:03
beach
Also, if you look at the stack in an inspector, it is nicer if there are (say) 0s where elements are not valid.
14:30:16
karlosz
i think it was done that way for speed; a memcpy and a block decrement is faster than calling a function each time. say, you close over 100 values and 100 values need to be popped off the stack at once and shoved into a vector
14:31:05
karlosz
anyway, i don't remember what it was, but i think bike made it so that most things that do that in bulk call gather, so you just need to do a zero-fill there as well as in the pop instruction
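The two clearing strategies discussed above can be sketched together: nulling the slot inside the pop instruction itself, and zero-filling in bulk when many slots are released at once (as when a closure gathers a batch of values off the stack). This is a hypothetical Python model, not maclina's actual VM code:

```python
# Hypothetical VM stack sketch showing GC-friendly clearing:
# pop() nulls the vacated slot; gather() releases a whole region
# and zero-fills it in one bulk operation.

class VMStack:
    def __init__(self, size):
        self.slots = [None] * size
        self.sp = 0   # stack pointer: index of the next free slot

    def push(self, value):
        self.slots[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        value = self.slots[self.sp]
        self.slots[self.sp] = None   # drop the reference so the GC can reclaim it
        return value

    def gather(self, n):
        """Pop n values at once into a list, then zero-fill the freed region,
        so a bulk decrement does not leave stale references behind."""
        values = self.slots[self.sp - n:self.sp]
        self.sp -= n
        self.slots[self.sp:self.sp + n] = [None] * n
        return values

s = VMStack(8)
for v in "abc":
    s.push(v)
assert s.pop() == "c"
assert s.gather(2) == ["a", "b"]
# Nothing past the stack pointer keeps an object alive.
assert all(slot is None for slot in s.slots)
```

The bulk zero-fill in `gather` is the compromise karlosz describes: instead of a per-element function call on every decrement, the region is cleared once per bulk operation.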
14:31:15
beach
That sounds like SBCL philosophy. Do anything possible for performance, even if the result is worse in terms of debugging.
14:32:30
karlosz
beach: the bytecode vm was written because the evaluator it was replacing was too slow... also, the portable vm was not meant to be used for real, it was just to check the compiler output so that a real vm could be written against it
14:33:01
karlosz
i am surprised it is being used as a real thing now; as you can see, that's why there are issues like not zero-filling for gc
14:34:18
beach
I see. Even the performance argument doesn't ring true. There can't be more than a tiny slowdown if the stack element is cleared when the stack is popped. Nothing like the difference between Maclina and the evaluator it replaced.
14:35:27
karlosz
there's nothing wrong with clearing stack elements (that won't be much of a hit), but calling a function n times instead of zero-filling it at once will be slower
14:35:47
bike
well, i can tell you when i didn't nil out entries, it wasn't for speed, it was because it was a simple reference implementation, like karlosz said. something to fix up now.
14:36:24
bike
i have done a bit of actual profiling of the lisp VM, and I don't think niling out the stack will particularly make a dent.
14:36:47
beach
karlosz: I didn't say it wouldn't be slower. I said the slowdown will likely be very modest. And should not be compared to the difference in performance between Maclina and the evaluator it replaced.
14:37:08
bike
the main problem performancewise is actually argument and multiple value handling, so maybe karlosz's ideas can help there along with just improving the design.
14:42:37
karlosz
i'm working on a design (for a simpler scheme-targeted vm) that separates out the control stack from the eval stack, which should eliminate copying for arg passing and values return
14:43:16
karlosz
so no copying, just pointer frobbing with an occasional listify-rest-args thrown in
14:53:01
karlosz
i mean, i think the success of the bytecode stuff in clasp is the amount of control you have over it and it's tailored exactly for the implementation at hand, and there's no need to chase external apis
14:55:01
karlosz
the other thing with the bytecode vm is that calling from bytecode to bytecode entails some overhead
14:56:14
karlosz
i'd like to do a redesign where the equivalent of lambda lifting for local calls like we do in sbcl or bir is also done in the bytecode vm in the same compiler pass
15:06:10
beach
When a DEFINE-METHOD-COMBINATION form is evaluated in environment E3, when the AST evaluator is used, (SETF FIND-METHOD-COMBINATION-TEMPLATE) in E3 is called, but with Maclina, (SETF FIND-METHOD-COMBINATION-TEMPLATE) in E2 is called.
15:06:14
beach
But with both evaluators, FIND-METHOD-COMBINATION-TEMPLATE is called in E3, so with the AST evaluator, the hash table contains all useful templates, but with Maclina, it is empty.
16:06:12
kingcons
bike: Apologies if this seems out of left field, but I saw the GC discussion in the logs and wondered if you had seen / are aware of whippet as an option? https://github.com/wingo/whippet
16:06:12
Colleen
kingcons: scymtym said at 2023.12.15 15:15:39: it is true that EQL specializers work in both cases, however instances of standard classes also allow classes as specializers and thus specialization and generalization of "kinds"
16:06:12
Colleen
kingcons: scymtym said at 2023.12.15 15:16:24: for example, a method could be specialized to a "superkind" of multiple "kinds" instead of having to write multiple EQL-specialized methods
16:06:49
kingcons
Heh, I haven't signed in for a while. Also, that makes sense, should've considered that.
16:07:31
kingcons
Anyway, whippet may not be a better fit than MMTk for clasp / maclina, but given that it was designed to be easy to integrate with an existing Scheme system, I thought it might be interesting.
17:23:08
yitzi
scymtym: I'm not sure if I made it clear before, but Quaviver should signal floating-point-underflow for exponents that are too small....for your experiments using Quaviver from Eclector.
17:33:05
scymtym
yitzi: i don't think i have a test for that yet but i assume it would work like the overflow case for too large exponents?
17:43:39
yitzi
I would think so. You could just try to read 1e-1000000, etc. If you want the tests to be very fine grained you could extract the exponent from LEAST-POSITIVE-SINGLE-FLOAT, or use (quaviver:min-exponent 'single-float)
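The underflow behavior yitzi describes can be illustrated with a small Python analogue (a hypothetical sketch; Quaviver's real interface and Common Lisp's `floating-point-underflow` condition differ, and `parse_float_checked` is an invented name). Python's `float()` silently flushes a too-small exponent to zero, so the check has to be explicit:

```python
def parse_float_checked(text):
    """Parse a decimal float, raising on silent underflow to zero.
    Python has no floating-point-underflow condition, so OverflowError
    stands in for it here."""
    value = float(text)
    mantissa = text.lower().split("e")[0]
    # If the digits were nonzero but the result is 0.0, the decimal
    # exponent was below the representable range: underflow.
    if value == 0.0 and any(c in "123456789" for c in mantissa):
        raise OverflowError("floating-point underflow: " + text)
    return value

assert parse_float_checked("1e-3") == 1e-3
assert parse_float_checked("0.0") == 0.0
try:
    parse_float_checked("1e-1000000")   # far below the minimum exponent
    raised = False
except OverflowError:
    raised = True
assert raised
```

The `1e-1000000` probe is exactly the coarse-grained test yitzi suggests; a finer-grained test would derive the boundary exponent from the float type's parameters rather than hard-coding a huge value.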
17:46:44
yitzi
paulapatience: I have the basic outline of a stream interface (read-digits, write-digits, read-number, write-number) worked out. It's still very rough around the edges, but I think it has potential.
17:52:02
scymtym
i did some benchmarks with "my" jaffer implementation vs. liebler vs. the current Eclector code as a baseline. i had to tweak liebler a bit to be on par with the jaffer implementation. if there was a way to get rid of the dispatch overhead, liebler could be the fastest implementation
18:02:54
scymtym
i didn't change anything about the dispatch. i added type declarations and further split some cases so that what was previously a full call to ASH could be optimized due to the tighter inferred types
18:07:36
yitzi
Hmmm... how did you benchmark them? For me liebler is already 4 times faster than jaffer.
18:13:33
scymtym
i called the token conversion function many times for each implementation. i should probably have included quaviver/jaffer in the comparison. when i say "my" jaffer, i mean the one from the eclector github issue
18:18:08
yitzi
I started off with one of the jaffer implementations from that issue, but then ended up writing a version from scratch via the paper. One thing I am a little concerned about in some of the versions on the Eclector issue is that there are places where FLOAT is being called on what could be a bignum, which to me seems like it is just resorting to whatever algorithm the underlying implementation is using.