libera/#clasp - IRC Chatlog
14:17:06
yitzi
drmeister: I could use your help when you have time. I am trying to set up a "non-clbind" version of lila, but I am getting some errors. I don't think I have the "project_headers" stuff defined correctly.
14:43:51
Bike
::notify karlosz it might be slightly more convenient if jumps were relative to the most significant (last) byte of the label rather than the least
14:44:53
Bike
::notify karlosz having the mv-call instructions handle value popping themselves might be better- as is I think I'll need to copy the mvalues into a VLA so the callee doesn't need to worry about them getting stomped
14:49:05
Bike
without copying i guess there'd be a little memory leak in that the callee would still be on the stack during the call, but it's going to be alive anyway given that it's running, so it's just a wasted stack slot
16:35:39
Bike
drmeister: i think the entry-point-as-function thing is a little broken - trying to get the function name of an entry point segfaults
16:36:00
Bike
drmeister: i don't quite follow why, if you did the self pointer thing, since it should go entry-point -> entry-point -> function description -> name
16:56:49
yitzi
I think that clbind's `def` might actually already be equivalent to CL_DEFMETHOD...just subtly broken.
17:20:15
Colleen
karlosz: Bike said 2 hours, 36 minutes ago: it might be slightly more convenient if jumps were relative to the most significant (last) byte of the label rather than the least
17:20:15
Colleen
karlosz: Bike said 2 hours, 35 minutes ago: having the mv-call instructions handle value popping themselves might be better- as is I think I'll need to copy the mvalues into a VLA so the callee doesn't need to worry about them getting stomped
17:23:13
karlosz
also: re: labels: i can change it to msb if it's more convenient. i'm working on short jump encoding "compression" (really "expansion" since i'll try to do it optimistically since that gives better code) because i have a good handle on how to do it i think
17:25:20
karlosz_
Bike: re: mv-call. i remember thinking about this issue and there was some reason why it's the way it is... maybe something about optimizing for mv-call with 1 arg value being faster that way?
17:25:44
karlosz_
im thinking (FDEFINITION n) (CALL) (MV-CALL) is more optimized that way if you had mv-call not do popping
17:27:19
Bike
because the callee might call some other function and then try to grab some of its own arguments, but that other function will have stomped the mv vector
17:28:30
Bike
that's true about fdefinition call mv-call though... i guess in that case there kind of needs to be a copy
17:32:07
Bike
okay, now that i have straightened out the disassembler it looks like the label for jump-if-supplied is not getting linked. probably it should not be zero
17:34:30
Bike
oh, wait, emit-jump-if-supplied just puts in zeroes instead of giving ASSEMBLE the label
17:43:25
karlosz
Bike: i don't understand... doesn't mv-call just copy the mv-vector into args before transferring control to the callee?
17:45:41
karlosz
you agree that everything is OK up until the MV-CALL? all the mv vectors so far have been saved and restored properly
17:47:07
Bike
yeah, i mean, in the implementation the "args register" is just a pointer, and for normal calls it can just be a pointer into the stack (below the callee frame pointer)
17:47:46
Bike
i already implemented it as it is and it works fine, i was just wondering about the copy
17:49:29
karlosz
i think if we did it the way you're saying it would be a strict win in terms of copying
17:50:07
karlosz
you'd just get an extra bytecode for mv-call with 1 arg form which is probably fine
17:52:31
karlosz
let's see. you'd have to put in a PUSH-VALUES for 1 arg forms but that's compensated by ditching a POP-VALUES on >1 arg forms
17:59:40
Bike
i am implementing more of the instructions. is there still a rest slot as a property of the bytecode function, like the wiki says?
18:00:38
karlosz
i realized that we could do better than clisp and just compile the lambda list processing stuff directly into bytecode
18:01:13
karlosz
clisp probably does that to save space but you're saving at most 1 or 2 bytes of redundant information for really complicated lambda lists
18:02:03
karlosz
Bike: also note just when reviewing your vm changes: i think we can ditch the read_uint16 stuff. definitely for arg processing i'm taking the stance we should just restrict the arg count limit to 255
18:03:30
karlosz
also for the read_labels stuff. i was imagining for the 1 byte and 2 byte variants we just directly use C's unsigned->signed coercion. 3 bytes you'll need to have the specialized function that can be hand rolled but for 1 and 2 bytes you can just do pointer casting to get the right signed value (that should translate to just one hardware instruction usually)
18:04:01
karlosz
and 3 bytes will be sooooo rare that it will be fast, unaligned accesses for load-byte and load-nibble be damned
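A minimal sketch of the label decoding karlosz describes above, assuming little-endian offsets (least significant byte first, as the earlier jump discussion implies). The name `read_label` and its signature are hypothetical, not Clasp's actual VM code. The 1- and 2-byte cases lean on the unsigned-to-signed conversion, which is implementation-defined before C++20 but wraps on mainstream compilers; the 3-byte case is sign-extended by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decode a signed, little-endian jump offset that is
// `width` bytes wide, starting at `p`.
inline std::int32_t read_label(const unsigned char* p, std::size_t width) {
  switch (width) {
  case 1:
    // Reinterpret the single byte as a signed 8-bit value.
    return static_cast<std::int8_t>(p[0]);
  case 2: {
    // Assemble the bytes explicitly so host endianness does not matter,
    // then reinterpret as a signed 16-bit value.
    std::uint16_t u = static_cast<std::uint16_t>(p[0] | (p[1] << 8));
    return static_cast<std::int16_t>(u);
  }
  case 3: {
    // No native 24-bit type exists, so sign-extend from bit 23 manually.
    std::int32_t v = p[0] | (p[1] << 8) | (p[2] << 16);
    if (v & 0x800000) v -= 0x1000000;
    return v;
  }
  default:
    assert(false && "unsupported label width");
    return 0;
  }
}
```

With this shape, the 1- and 2-byte paths typically compile down to a single sign-extending load, while only the rare 3-byte path pays for the extra arithmetic.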
18:05:57
Bike
i spent like twenty minutes staring at the C standard trying to remember how unsigned->signed conversions work before giving up and writing that, but yeah, obviously it's stupid for there to be any actual code for that
18:06:23
Bike
and i'm just doing the read_uint16 out of inertia until we decide how the long prefix should work (although i guess it's pretty simple)
18:11:08
karlosz
for LONG i think the best thing to do is like case LONG: long_dispatch(local variables...)
18:11:24
karlosz
and then long_dispatch does basically a switch case on REF CONST SET and friends that need it
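A rough sketch of the `case LONG: long_dispatch(...)` idea, under stated assumptions: the opcode numbering, the `run` and `long_dispatch` signatures, and the toy integer stack are all hypothetical and only illustrate the control flow. The point is that the common short instructions never test a flag; only the LONG prefix enters the slow path, which re-dispatches on the few opcodes that have a 16-bit operand form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode numbering for illustration only.
enum Op : std::uint8_t { OP_CONST = 0, OP_LONG = 1, OP_RETURN = 2 };

// Slow path, reached only after a LONG prefix. `pc` points at the prefixed
// opcode; its operand is a 16-bit little-endian index. Returns the new pc.
static std::size_t long_dispatch(const std::vector<std::uint8_t>& code,
                                 std::size_t pc,
                                 const std::vector<int>& literals,
                                 std::vector<int>& stack) {
  assert(pc + 2 < code.size());
  std::uint8_t op = code[pc];
  std::size_t index = static_cast<std::size_t>(code[pc + 1]) |
                      (static_cast<std::size_t>(code[pc + 2]) << 8);
  switch (op) {
  case OP_CONST:
    stack.push_back(literals.at(index));
    break;
  default:
    assert(false && "opcode has no long form");
  }
  return pc + 3;
}

// Main loop: short forms read a 1-byte operand and never check a flag.
int run(const std::vector<std::uint8_t>& code, const std::vector<int>& literals) {
  std::vector<int> stack;
  std::size_t pc = 0;
  for (;;) {
    switch (code.at(pc)) {
    case OP_CONST:
      stack.push_back(literals.at(code.at(pc + 1)));
      pc += 2;
      break;
    case OP_LONG:
      pc = long_dispatch(code, pc + 1, literals, stack);
      break;
    case OP_RETURN:
      assert(!stack.empty());
      return stack.back();
    default:
      assert(false && "unknown opcode");
      return 0;
    }
  }
}
```

The cost is a duplicated instruction body for each long-capable opcode, which is the trade-off raised later in the conversation.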
18:14:04
drmeister
I imagined a LONG-PREFIX would set a flag that the next instruction's arguments would be long and call functions to read long arguments.
18:14:15
yitzi
The first is with clbind's class def function ... it turns out that it is already equivalent to CL_DEFMETHOD in that it defines a single dispatch method. But I don't understand this: https://github.com/clasp-developers/clasp/blob/fbd9270ee0cca5c73eeb587a0cdb26351fff38c1/include/clasp/clbind/class.h#L974-L982
18:15:40
yitzi
add_method creates a method and automatically makes the entry function. What is `maybe_register_symbol...` doing there? Seems like that is defining a regular DEFUN only to be replaced by add_method
18:16:18
karlosz
drmeister: yeah, that's another approach, but i really think long-prefix would be so rare that we don't even want to pay the cost of reading the flag and dispatching in basic codes like REF and CONST
18:16:50
drmeister
yitzi: The clbind code was derived from luabind - a binding library for lua that I think predates pybind11. They do the binding in two stages. The .def methods create records to bind methods and those records are then interpreted to create the bindings after all the .def calls have been made.
18:17:37
drmeister
Perhaps we could write the VM in a template function or a template class and that is templated with another class that provides the argument reader.
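One way to picture the templated-reader suggestion, as a sketch only: `ShortReader`, `LongReader`, and `exec_const` are hypothetical names, and a real VM would template the whole dispatch loop rather than a single instruction. Each instruction body is written once and instantiated per operand width, so the compiler generates the duplicate code instead of a person copying it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical argument readers: each knows its operand width and how to
// decode an operand from the bytecode stream (little-endian assumed).
struct ShortReader {
  static constexpr std::size_t width = 1;
  static std::size_t read(const std::uint8_t* p) { return p[0]; }
};

struct LongReader {
  static constexpr std::size_t width = 2;
  static std::size_t read(const std::uint8_t* p) {
    return static_cast<std::size_t>(p[0]) |
           (static_cast<std::size_t>(p[1]) << 8);
  }
};

// A CONST-like instruction body written once. `pc` points at the operand
// and is advanced past it; the selected literal is returned.
template <typename Reader>
int exec_const(const std::uint8_t*& pc, const int* literals,
               std::size_t nliterals) {
  std::size_t index = Reader::read(pc);
  assert(index < nliterals);
  pc += Reader::width;
  return literals[index];
}
```

Calling `exec_const<ShortReader>` or `exec_const<LongReader>` selects the operand width at compile time, so neither instantiation checks a runtime flag.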
18:20:49
karlosz
pretty much just REF SET and the operations which have indices into the literal vector
18:21:06
drmeister
Ok, so we would have duplicate code for those LONG-PREFIX versions of those instructions.
18:21:30
drmeister
In that case - why not just have LONG-REF and LONG-CONST and LONG-SYMBOL-VALUE-SET ...
18:22:39
karlosz
we'll see what's worth compressing when it's all working - we might want to use that bytecode space to make shortcut bytecodes for more commonly used operations like REF 0 => REF0 bytecode
18:23:09
karlosz
i think we should see what available bytecode space we have and then some static instruction instrumentation to see what's most worthwhile to compress
18:24:25
drmeister
LONG-CONST means we are pulling a constant from the literals vector at an index beyond 255 - right?
18:26:07
Bike
also our "register allocation" is like linear scan but a little worse. but i don't think it'll come up super much anyway
18:26:54
karlosz
yeah. we can fold long prefix into the bytecodes themselves if it turns out generated code actually does hit it a lot
18:27:18
karlosz
that wouldn't be difficult, just a cut and paste in the vm from one section to another
18:28:21
Bike
well, whether we do prefixes or longs, we'll have a limit of 16 bits, which should be enough for any code that isn't pretty freaky
18:29:44
yitzi
drmeister: Basically, I am saying that your example here https://github.com/clasp-developers/seqan-clasp/blob/5caa2e1e6028525276a6b6ba770fa6e334563d58/src/seqan.lisp#L50-L63
18:30:19
karlosz
i think sbcl will break with 2^16 literals or 2^16 variables in the stack frame....
18:30:54
yitzi
drmeister: Can already be simplified using single dispatch b.c. `.def` is actually the same as CL_DEFMETHOD when called on a clbind class instance.
18:32:05
yitzi
And I think that the `maybe_register...` call isn't needed b.c. add_method takes care of that.
18:42:09
drmeister
yitzi: That may be so - when I wrote the seqan code I was trying out new ideas for integrating C++ code with CL code.
18:43:41
drmeister
https://github.com/clasp-developers/seqan-clasp/blob/5caa2e1e6028525276a6b6ba770fa6e334563d58/src/seqan.lisp#L152
18:44:40
drmeister
What I'd like to do next, is get rid of all the SingleDispatchXXX_O classes and replace them with generic function dispatch.
18:45:39
drmeister
And expose the generic function methods that use multiple dispatch directly from C++ code so we don't need to write things like:
18:45:45
drmeister
https://github.com/clasp-developers/seqan-clasp/blob/5caa2e1e6028525276a6b6ba770fa6e334563d58/src/seqan.lisp#L152
18:46:58
yitzi
At least with the single dispatch I was able to get rid of the `dispatch@r1v` functions and just do this https://plaster.tymoon.eu/view/3363#3363
18:47:46
yitzi
Then just call `(setf (documentation dimension) ...)` to set the documentation strings later.
18:50:48
drmeister
There - for the first time I constructed a concrete example on how to expose compile-time multiple dispatch as generic function methods from C++.
18:54:22
drmeister
We would use those defmethods to create a "call history" to dispatch on the provided types and the lambdas would be the effective methods.
18:56:27
drmeister
I didn't test the self-reference for entry-points. Let me see if I can get that to work.
19:03:26
drmeister
Here's another timing comparison that is useful. Compare the bytecode speed to compiled cclasp code speed.
19:08:11
karlosz
so not only will we squeeze out jumps but also i envision using it to get rid of cells
19:08:41
karlosz
i think we'll get a speed up once we stop using read_uint16 and short jumps + reading from literals smartly
19:45:45
drmeister
So get the bytecode compiler working and then get back to work on the llvm compiler. :-)
20:25:22
Bike
it also looks like (lambda (&key)) compiles the same as (lambda ()), which is wrong, since :allow-other-keys has to be acceptable
20:28:02
karlosz
Bike: oh yeah, &key is wrong, but i was careful with allowing :allow-other-keys in the correct circumstances
20:54:05
karlosz
long isn't even compiled yet, but at least we get an error at compile time if we hit a program that needs it
20:55:28
Bike
i don't mind leaving LONG for last, given that we very possibly don't even need it for booting clasp
22:20:06
drmeister
Now that EntryPointBase_O points back to itself and inherits from the class formerly known as Function_O
22:48:20
karlosz
one note though: the VM will have to be updated to make the pc offset work from the start of the instruction. which might be more natural anyway
22:50:16
Bike
nice. yeah that should be easy, in the clasp vm it will just mean changing some -2s to -3s
22:53:08
karlosz
still should be portable to C++ - i did use a single dispatch closure mechanism thingy
22:54:10
drmeister
Name changes done, cclasp-boehm builds. Running static analyzer and then I'll push everything.
23:21:57
karlosz
drmeister: the compiler needs to learn how to do unwind protect, but other than that, very close
23:38:57
drmeister
Great - once the compiler is done and we start translating it we can compare the output of the CL compiler with the C++ compiler and identify problems quickly.
23:43:39
karlosz
yes. the CL compiler doesn't do anything crazy - it should be translatable to C++ without much fanfare
1:15:05
Bike
loading bytecoded functions shouldn't be hard, right? all their components are dumpable (except maybe constants, but then you couldn't compile-file that anyway)
1:25:05
drmeister
Well, that's a bit of a puzzler isn't it? With compile-file we use the literal compiler to generate code that evaluates to initialize the literals vector.
1:26:59
drmeister
It loads and compiles all of the bclasp source code and clos and then the cleavir code.
1:28:40
drmeister
Later we can interface the literal compiler with the bytecode compiler - but will we need to translate the literal compiler into C++?
1:33:40
Bike
i kind of thought we'd be compile-filing with the bytecode compiler, but i haven't deeply thought it through