libera/#clasp - IRC Chatlog
16:31:21
drmeister
We evaluate the prog1 form, copy multiple-values into a VLA array, evaluate the remaining forms, copy from the VLA to the multiple-values, return.
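A toy C++ version of that save/restore pattern, for illustration only; MV, mv_prog1, and the fixed 64-slot register are made-up stand-ins, not Clasp's actual multiple-values machinery:

    #include <algorithm>
    #include <cstdio>

    // Stand-in for the runtime's multiple-values register (assumed shape).
    struct MV { std::size_t count; long values[64]; };
    static MV mv;

    // The mv-prog1 pattern: run `first`, stash its values in a stack-allocated
    // buffer, run `rest`, then restore the stash before returning.
    template <class F, class G>
    void mv_prog1(F first, G rest) {
      first();                                     // fills mv
      std::size_t n = mv.count;
      long saved[n];                               // the VLA in question (GCC/Clang extension)
      std::copy(mv.values, mv.values + n, saved);  // copy the values out
      rest();                                      // clobbers mv; those values are discarded
      std::copy(saved, saved + n, mv.values);      // copy them back
      mv.count = n;
    }

    int main() {
      mv_prog1([] { mv = {2, {1, 2}}; },           // "first form" returns (values 1 2)
               [] { mv = {1, {99}}; });            // "remaining forms"
      std::printf("%zu values, first = %ld\n", mv.count, mv.values[0]);
    }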
16:31:41
karlosz
i see. does adopting that for the bytecode interpreter sufficiently solve your GC worries?
16:33:14
drmeister
Maybe. The devil is in the details. I'd like to understand how the bytecode interpreter works and try to implement the mv-prog1 to see how it plays out.
16:34:46
drmeister
I understand how it works currently (https://github.com/clasp-developers/clasp/blob/main/src/core/evaluator.cc#L1722) but it depends on being able to VLA-allocate a buffer
16:39:22
karlosz
that would indeed be hard to emulate with the bytecode vm model. i think i understand what you need now
17:21:03
frgo
Hi - I see you are working on a VM. I read the logs and wondered why it is desirable to have a VM for clasp. Also, have there been any thoughts about using an existing VM like the BEAM? I know, quite a different domain, but I was thinking about BEAM and Lisp and VMs last year but never got to an answer for myself.
17:23:37
karlosz
frgo: 2 reasons: bootstrapping clasp is slow, partly because an expression interpreter is used to bootstrap the first stages. a bytecode vm would significantly speed up code execution during bootstrap, while making it easy to transform the C++ interpreter into a bytecode compiler in C++. the bytecode vm needs to be simple as a result and hand
17:24:21
karlosz
the other reason is that compiling to LLVM is super heavy and not always desirable (think macroexpanders, compile-time computations that don't get run very frequently but cost a lot of compile time only to be thrown away)
17:26:31
frgo
Making both (the LLVM JIT compiler and the VM approach) coexist might be a challenge, no?
17:27:09
karlosz
Bike, drmeister: I did a very rough sketch of what the VM structures themselves would look like on the wiki page (stacks, pointers, mv register, function representation, and so on). maybe you guys can look at it and revise it to use the actual C++ Clasp types/runtime object representations
17:28:40
karlosz
frgo: well, bytecode functions and native functions need to coexist at runtime. we were thinking that trampolines to translate across the call boundary are probably all that's needed to get calls working. otherwise they are mostly independent entities
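For illustration, a minimal sketch of such a trampoline, assuming made-up names (BytecodeFunction, vm_execute) rather than Clasp's real types:

    #include <cstddef>
    #include <cstdio>

    // Assumed shapes; not Clasp's actual function representation.
    struct BytecodeFunction { const unsigned char* pc; /* + literals, module, ... */ };

    // Stand-in for the VM loop entry.
    void* vm_execute(BytecodeFunction* fn, std::size_t nargs, void** args) {
      std::printf("interpreting %p with %zu args\n", static_cast<void*>(fn), nargs);
      return nargs ? args[0] : nullptr;
    }

    // The trampoline: an ordinary native function whose body just enters the VM,
    // so native code can call a bytecode function through a normal function pointer.
    void* bytecode_trampoline(BytecodeFunction* fn, std::size_t nargs, void** args) {
      return vm_execute(fn, nargs, args);
    }

    int main() {
      BytecodeFunction f{nullptr};
      void* arg = &f;
      bytecode_trampoline(&f, 1, &arg);
    }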
18:20:30
Bike
frgo: we probably wouldn't use BEAM (or e.g. the JVM) because we'd like the compilation from lisp to the VM target to be simple and fast. it looks like the BEAM book actually explains one reason why translation to BEAM might not be simple and fast:
18:21:21
Bike
"Stack machines are quite popular among virtual machine and programming language implementers since they are quite easy to generate code for, and the code becomes very compact. The compiler does not need to do any register allocation, and most operations do not need any arguments (in the instruction stream)."
18:22:23
Bike
and beyond that, BEAM probably doesn't have quick translations for basic lisp constructs like nonlocal exits or multiple value calls, any more than the JVM does
18:25:57
Bike
well ok, more concretely, we'd probably have a BLOCK-OPEN instruction with a label, and it creates a block environment with an exit to that label, and pushes it to our usual dynamic environment stack
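Roughly, and with made-up names (DynEnvEntry, dynenv_stack) standing in for whatever Clasp actually uses, the entry such an instruction would push might look like:

    #include <cstdint>
    #include <vector>

    // Assumed shapes for the sketch; not the real instruction set or runtime types.
    struct DynEnvEntry {
      enum Kind { Block, Tagbody } kind;
      std::uint32_t exit_pc;   // bytecode offset of the exit label
      std::size_t sp;          // operand-stack depth to unwind to
    };

    static std::vector<DynEnvEntry> dynenv_stack;

    // What a BLOCK-OPEN handler could do: decode its label operand and push an
    // entry. A later RETURN-FROM finds the entry, unwinds to it, and pops it
    // (a block can only be exited once); a tagbody entry would stay pushed.
    void block_open(std::uint32_t label, std::size_t current_sp) {
      dynenv_stack.push_back({DynEnvEntry::Block, label, current_sp});
    }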
18:26:31
karlosz
i was thinking that the distinction between tagbody and go tags might not be so relevant
18:27:13
Bike
mostly gone. there is a slight difference in that the unwinder tells them apart so that it can decide to pop the dynenv or not
18:27:25
Bike
since we want to pop it for a block (since the block can only be exited once) but not a tagbody
18:29:08
karlosz
i remember the clisp ones being kind of weird with respect to the closures and literals vector
18:30:00
karlosz
we can just have block open and tagbody open push the continuation object on the stack, then have make-closure package it up in the flat closure vector. return takes that
18:30:41
karlosz
it's probably fine to have that live in the "stack" (which will be rooted Lisp objects)
18:31:38
Bike
well previously the continuation was just the stack frame pointer, treated as an integer
18:36:40
Bike
er, or rather the runtime has it... well, i don't think it will present a big issue, anyway.
18:37:27
karlosz
it doesn't matter for tagbody and block, because there's no way for a lexical nlx to cross between machine-code-compiled and bytecode-compiled code
18:39:24
Bike
we basically use jmp_bufs for everything. the unwinding runtime just longjmps to wherever. longjmping into the vm code should be ok.
18:41:57
karlosz
longjmping into the vm code is good, but how will the vm know which pc to pick up at?
18:42:47
karlosz
i guess that will come when the vm structure is more ironed out. i'm oscillating a bit between how we want to do modules/functions/closures/literals at that level
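One conceivable answer to the pc question, sketched with invented names (VMFrame, resume_pc): the dynenv entry records where to resume, and the VM loop setjmps so a longjmp lands back in the loop, which then reloads pc from the entry:

    #include <csetjmp>
    #include <cstdio>

    // Assumed shapes; purely illustrative.
    struct VMFrame {
      std::jmp_buf reenter;   // established when the block/tagbody dynenv is opened
      unsigned resume_pc;     // bytecode offset to continue at after a nonlocal exit
    };

    static VMFrame frame;

    void unwinder_throw() {             // what the unwinding runtime would do
      std::longjmp(frame.reenter, 1);   // jump back into the VM loop
    }

    void vm_loop() {
      unsigned pc = 0;
      frame.resume_pc = 7;              // pretend BLOCK-OPEN recorded its label here
      if (setjmp(frame.reenter)) {
        pc = frame.resume_pc;           // the nlx landed: pick up at the recorded pc
      } else {
        unwinder_throw();               // pretend a callee did a RETURN-FROM
      }
      std::printf("resumed at pc=%u\n", pc);
    }

    int main() { vm_loop(); }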
19:32:40
Bike
sure they are, but i think we have a more fundamental problem with how you'd scope it in the vm
19:33:56
Bike
lifetime of a VLA ends when the variable goes out of scope, so we couldn't declare it in the scope of "case BYTECODE_SAVE_VALUES:" or whatever, but that's the first time we get the array size
20:04:28
drmeister
That's a good way to put it. So I was thinking we allocate them on the heap and push a pointer to them on the stack.
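A sketch of that heap variant, with invented opcode handlers (op_save_values / op_restore_values); in the real VM the buffer and stack slots would of course have to be GC-visible Lisp objects rather than raw malloc storage:

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    // Assumed shapes; purely illustrative.
    struct MV { std::size_t count; long values[64]; };
    static MV mv;
    static std::vector<void*> vm_stack;       // the VM's operand stack

    // case SAVE_VALUES: copy the current values to a heap buffer and push
    // the pointer (plus the count) instead of using a scope-bound VLA.
    void op_save_values() {
      std::size_t n = mv.count;
      long* buf = static_cast<long*>(std::malloc(n * sizeof(long)));
      std::copy(mv.values, mv.values + n, buf);
      vm_stack.push_back(buf);
      vm_stack.push_back(reinterpret_cast<void*>(n));
    }

    // case RESTORE_VALUES: pop the count and buffer, copy back, free.
    void op_restore_values() {
      std::size_t n = reinterpret_cast<std::size_t>(vm_stack.back()); vm_stack.pop_back();
      long* buf = static_cast<long*>(vm_stack.back()); vm_stack.pop_back();
      std::copy(buf, buf + n, mv.values);
      mv.count = n;
      std::free(buf);
    }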
20:08:05
Bike
It might be okay, especially if we compile the trivial case of mv-prog1 to not do it, like we do now
20:09:23
drmeister
You and karlosz appear to be making a lot of progress laying out the VM. I'm behind in understanding what you are up to because I've been tied up with other things. On the weekend I'll have time to catch up I hope.
20:12:46
drmeister
Or something like that. I was thinking a ByteCode_O class with dedicated slots for some of the things that are in the clisp VM function objects and a stretchy-vector for the bytecode.
20:13:37
drmeister
I'm not using stretchy the way we normally do. We need an adjustable vector of bytes.
20:13:57
drmeister
We don't know how much bytecode a particular function will have until we have finished compiling it.
20:14:41
Bike
sure, so while compiling we have an adjustable vector, but in the finished function we can just have a simple array
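Sketched in C++ with invented names (the slots and the Bytecode_O shape here are guesses, not Clasp's definitions), that split might look like:

    #include <cstdint>
    #include <vector>

    // Illustrative only.
    struct Bytecode_O {
      std::vector<std::uint8_t> bytes;   // final, fixed-size code vector
      // + literals vector, entry point, debug info, ...
    };

    struct BytecodeAssembler {
      std::vector<std::uint8_t> buffer;                 // adjustable while compiling
      std::size_t fill() const { return buffer.size(); }        // the "fill-pointer"
      void emit(std::uint8_t byte) { buffer.push_back(byte); }
      Bytecode_O finish() const { return Bytecode_O{buffer}; }  // freeze into the function
    };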
20:17:02
drmeister
We allocate 1024 bytes and a fill-pointer. Say you compile IF: you call a function to compile the test expression and it fills in a certain amount of bytecode. When it returns, the fill-pointer points to where the THEN expression will start; we compile that and then we compile the ELSE expression. When that returns, we fill in the addresses for the THEN, the ELSE, and the first bytecode after the ELSE expression into the IF bytecode.
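As a concrete (and invented) illustration of that backpatching scheme, with made-up opcodes and 2-byte jump operands:

    #include <cstdint>
    #include <vector>

    struct Asm {
      std::vector<std::uint8_t> code;
      void emit(std::uint8_t b) { code.push_back(b); }
      std::size_t emit_u16_placeholder() {              // reserve space for an address
        code.push_back(0); code.push_back(0);
        return code.size() - 2;
      }
      void patch_u16(std::size_t at, std::uint16_t target) {  // fill it in later
        code[at] = target & 0xff;
        code[at + 1] = target >> 8;
      }
    };

    enum : std::uint8_t { JUMP_IF_FALSE = 1, JUMP = 2 };   // invented opcodes

    void compile_if(Asm& a /*, test, then, else */) {
      /* compile_form(a, test); */                      // emits code for the test
      a.emit(JUMP_IF_FALSE);
      std::size_t to_else = a.emit_u16_placeholder();   // target not known yet
      /* compile_form(a, then); */
      a.emit(JUMP);
      std::size_t to_end = a.emit_u16_placeholder();
      a.patch_u16(to_else, static_cast<std::uint16_t>(a.code.size()));  // ELSE starts here
      /* compile_form(a, else); */
      a.patch_u16(to_end, static_cast<std::uint16_t>(a.code.size()));   // first byte after ELSE
    }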
20:21:39
Bike
I don't think it should pose a major issue (knock on wood). we were just talking about it here earlier, and i put some notes on the wiki page.
20:32:05
karlosz
we'd just use whatever is most convenient on the C++ side during compilation to contain the bytes, but i was thinking of just using a Lisp (UNSIGNED-BYTE 8) array object inside the actual bytecode_O
20:33:18
karlosz
the C code sketch on the wiki for bytecode functions and modules should ideally be made to look as much like the Clasp native machine code object definitions as possible
20:55:56
karlosz
i'm also not a big fan of treating single values specially like in clisp only to push them later. there's pretty much only one context where we care about that, which is CALL-RECEIVE-ONE.
23:53:04
Bike
(subtypep '(cons integer) '(cons t)) => NIL NIL. we have got to change this implementation at some point... but it's so early... aaaaagh
1:22:52
drmeister
Regarding the VM: CALL-XXX should be easy? The general entry point takes the number of arguments and a pointer to a vector of arguments.
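A guess at how a CALL handler could use an entry point with that shape (number of arguments plus a pointer to them); the names here are invented:

    #include <cstddef>
    #include <vector>

    // Illustrative shapes only.
    struct Function {
      void* (*entry)(Function* self, std::size_t nargs, void** args);
    };

    static std::vector<void*> vm_stack;   // operands pushed left to right

    // case CALL (operand: nargs): the arguments are already contiguous on the
    // VM stack, so pass a pointer into it, call the entry point, then pop.
    void* op_call(Function* callee, std::size_t nargs) {
      void** args = vm_stack.data() + (vm_stack.size() - nargs);
      void* result = callee->entry(callee, nargs, args);
      vm_stack.resize(vm_stack.size() - nargs);
      return result;
    }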
1:52:08
Bike
my compiler improvements branch is now at a point where some benchmarks are a lot better (than main) but some are kind of worse. it's a start i guess. i'll try to improve everything before i merge, but it's kind of difficult since the compiler changes mean more type checking generally
1:54:24
Bike
e.g. CRC40 goes from 2.58 to 1.34 and MANDELBROT/DFLOAT from 1.20 to 0.03 (cool!), but also FPRINT/PRETTY went 5.50 to 7.54
2:12:47
drmeister
I'll have to add an account when I get back. I tried to add an account for karlosz using the command line - but on MacOS that just leads to tears.
2:36:05
Bike
oh yeah, optimization-related question: do we have any desire to allow sequences with a length of more than most-positive-fixnum
2:36:35
Bike
probably not, since it's like 2^61 or something and it wouldn't be practical for any sequence whose elements actually exist
2:52:51
drmeister
No - I don't think sequences with a length of more than most-positive-fixnum make any sense. most-positive-fixnum is 61 bits of 1's
2:56:04
Bike
also we're not doing eliminate-if-if on variable reads. ack ack ack. have to fix that too
2:57:32
Bike
telling the compiler that LENGTH returns a fixnum should help it use fixnum arithmetic more, which is why i asked
3:36:23
Bike
i fixed something tricky with max/min and now most of the benchmarks are improved, or worse only by an amount that could be noise