libera/#commonlisp - IRC Chatlog
21:49:12
paulapatience
aeth: In your vector DSL, do you walk the forms to find *, +, etc., or are the arithmetic operators bound via macrolet or something?
21:50:51
aeth
paulapatience: any *, +, etc., found at compile time is undone in a step, so yes, it walks
21:52:23
aeth
rather than optimizing some reduce/fold on &rest, it just explicitly hardcodes a removal of 0, 1, and 2 length into a %low-level, with 2+ reduced to 2 at compilation time. (This is because e.g. - at 0 length is NIL, i.e. an error, while - at 1 length is %negate, i.e. completely different than the binary/dyadic form)
21:52:50
aeth
(similarly, + at 0 length is the constant-returning-function %zero, different than the binary/dyadic form)
21:53:10
aeth
sort of like how CL implementations often have %false and %true functions as optimizations of (constantly nil) and (constantly t)
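The arity handling described above can be sketched roughly as follows. This is only an illustration under assumptions: it uses ordinary macros rather than aeth's tree walker, it works on plain numbers rather than vecs, and %ADD2, %SUB2, ADD-SKETCH and SUB-SKETCH are hypothetical names (only %negate and %zero are mentioned in the log).

```lisp
;; Hypothetical low-level primitives; plain numbers stand in for vecs.
(defun %add2 (a b) (+ a b))
(defun %sub2 (a b) (- a b))
(defun %negate (a) (- a))
(defun %zero () 0)

(defmacro add-sketch (&rest args)
  "Reduce an n-ary addition to nested binary %ADD2 calls at expansion time."
  (case (length args)
    (0 '(%zero))                 ; zero arguments: the constant-returning function
    (1 (first args))             ; one argument: the argument itself
    (t (reduce (lambda (acc x) `(%add2 ,acc ,x)) args))))

(defmacro sub-sketch (&rest args)
  "Reduce an n-ary subtraction; the unary case is negation, not subtraction."
  (case (length args)
    (0 (error "SUB-SKETCH needs at least one argument."))
    (1 `(%negate ,(first args)))
    (t (reduce (lambda (acc x) `(%sub2 ,acc ,x)) args))))
```

With these definitions, (add-sketch a b c) expands to (%add2 (%add2 a b) c), so no &rest list survives to run time.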
21:53:51
aeth
paulapatience: these are technically looking at zrvl:+, not cl:+, but since the API is identical, I can make the walker also work on CL:+ for lightly embedded vector stuff that seems more integrated into CL, in a separate sort of macro
21:55:35
aeth
Note that this can't always be done, considering e.g. (f #'+) which then calls #'+ inside of F. So I will have to implement an alternative, "proper" implementation in such cases, which I will do once I add higher order functions back in.
21:56:33
aeth
Well, I aggressively, aggressively, _aggressively_ inline and I will do so on higher order functions, considering the goal is efficient vector math. So (f #'+) might be handled, but you can just obfuscate that a few levels to get a place where a properly generic + is needed.
21:59:40
aeth
The three goals, and why it's a proper DSL, are (1) I want to just write things like (zrvl:+ a b) instead of having to think about the 5 different, more efficient ways of doing that (not heap-allocating a and b if possible) with a bunch of fallbacks. And (2) I want to be able to use higher order functions complete with currying, etc., where convenient without the typical performance losses. And (3) it
21:59:46
aeth
also needs to compile to SPIR-V. So the whole thing _has_ to tree walk and only permit a subset of CL inside of the tree-walker for simplicity of implementation. Especially for #3.
22:08:41
paulapatience
aeth: I see. I'm defining a mini-DSL for something, and I wanted to shadow cl:* and cl:+ because I found the package prefix to be too long, but then I might need to use cl:* and cl:+ in the subforms in the macro.
22:09:09
paulapatience
I was wondering whether to use :* and :+ instead, or to give an escape hatch to restore * and + to the usual meanings.
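One way the keyword option could look, as a minimal sketch: a deliberately naive rewriter turns forms headed by :+ and :* into DSL calls and leaves cl:+ and cl:* untouched, so no escape hatch is needed. All names here (VEC-ADD-SKETCH, VEC-MUL-SKETCH, REWRITE-KEYWORD-OPS-SKETCH, WITH-VEC-OPS-SKETCH) are hypothetical, the operators are binary only, and the rewriter does not understand QUOTE or other special forms, which a real code walker would have to handle.

```lisp
;; Hypothetical DSL operations: element-wise arithmetic on CL vectors.
(defun vec-add-sketch (a b) (map 'vector #'+ a b))
(defun vec-mul-sketch (a b) (map 'vector #'* a b))

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun rewrite-keyword-ops-sketch (form)
    "Replace (:+ a b) and (:* a b) with DSL calls; recurse into everything else."
    (cond ((atom form) form)
          ((eq (first form) :+)
           `(vec-add-sketch ,@(mapcar #'rewrite-keyword-ops-sketch (rest form))))
          ((eq (first form) :*)
           `(vec-mul-sketch ,@(mapcar #'rewrite-keyword-ops-sketch (rest form))))
          ;; Naive: treats every other list as an ordinary form.
          (t (mapcar #'rewrite-keyword-ops-sketch form)))))

(defmacro with-vec-ops-sketch (&body body)
  "Evaluate BODY with :+ and :* meaning the DSL operators."
  `(progn ,@(mapcar #'rewrite-keyword-ops-sketch body)))

;; Usage: cl:+ keeps its usual meaning inside the macro.
;; (with-vec-ops-sketch
;;   (:+ (vector 1 2) (:* (vector (+ 1 1) 3) (vector 2 2))))
```

The alternative of rebinding + and * locally only works on shadowed symbols: binding symbols of the COMMON-LISP package as functions or macros, even with FLET or MACROLET, has undefined consequences in conforming code.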
22:09:30
aeth
well, that's why I picked the letter combination "ZR" for my game engine > 10 years ago. So I can say, e.g. "ZRVL" (Zombie Raptor Vector Language) and know that nobody in the world is going to name clash it
22:09:55
aeth
But I can also, separately, turn CL:+ into ZRVL:+ in a separate macro if I choose to. One downside being not everything is in CL
22:10:15
aeth
For instance, I'm taking a few things from common libraries, with near-identical or identical APIs.
22:11:29
aeth
One advantage with this is that for the scalar version, I may be able to implement it with those functions. Because I am going to have three backends, scalar, non-portable (in implementation _or_ architecture) SB-SIMD, and SPIR-V
22:31:01
aeth
You can't do that... with floats, you have to assume that people know what they're doing, and changing things around with identities assuming floats are reals (they're not) can greatly change results, especially on single floats.
22:31:24
paulapatience
A higher level than that. I guess the question is, are you mapping the DSL operations directly to the underlying architecture, or are you doing some higher level optimization of the forms provided to the macro?
22:34:32
aeth
It might be able to prove that f and g are equivalent mathematical functions, and thus the straightforward f and the fast g are the same (when you're modeling reals) despite g being better for floating point error while f is better for readability
22:39:20
aeth
there's one intermediate step (+ a (* t (- a)) (* t b)) to see that it's equivalent (and, yes, you can't use t in CL because of CL:T for true... e.g. alexandria:lerp uses v)
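The messages that introduce f and g are not in this part of the log, so the following is a guess at the kind of pair being discussed: two ways of writing linear interpolation that are equal over the reals, connected by the intermediate step quoted above, but not equal in floating point. The function names are hypothetical, and V is used as the parameter name because T is taken.

```lisp
(defun lerp-delta-sketch (v a b)
  "a + v(b - a)"
  (+ a (* v (- b a))))

(defun lerp-weighted-sketch (v a b)
  "(1 - v)a + vb, reached via a + v(-a) + vb"
  (+ (* (- 1 v) a) (* v b)))

;; With exact rationals the two agree, as the algebra says they should:
;; (= (lerp-delta-sketch 1/4 2 10) (lerp-weighted-sketch 1/4 2 10))

;; With doubles of very different magnitude they need not agree at v = 1:
;; (lerp-delta-sketch    1d0 1d20 1d0)
;; (lerp-weighted-sketch 1d0 1d20 1d0)
```

In the double-float case, (- 1d0 1d20) rounds to -1d20, so the first form returns 0d0 at v = 1 while the second returns b exactly. That is the sort of difference an optimizer would silently introduce if it rewrote one form into the other.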
23:04:06
paulapatience
aeth: I was thinking more of like omitting intermediate vectors in some operations, to reduce consing
23:06:45
aeth
paulapatience: for simd-pack in SB-SIMD, any intermediate heap consing for simd-pack vectors that stay within the scope of the function _should_ be reported to #sbcl as bugs (and I even noticed one once)
23:07:20
aeth
For the CL scalar fallback, declare dynamic-extent can be used, but that probably only needs to be used for the longer stuff like matrices.
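A minimal sketch of the dynamic-extent idea for the scalar fallback, assuming a 4x4 matrix stored as a flat 16-element array. The function name is hypothetical, and the declaration is only a promise to the compiler: an implementation may stack-allocate the temporary or ignore the declaration entirely.

```lisp
(defun scaled-identity-trace-sketch (scale)
  "Build a temporary 4x4 matrix (flat, row-major) and return its trace."
  (declare (type single-float scale))
  (let ((m (make-array 16 :element-type 'single-float :initial-element 0f0)))
    ;; M never escapes this function, so it is safe to declare it
    ;; dynamic-extent; the implementation may then avoid heap allocation.
    (declare (dynamic-extent m))
    (dotimes (i 4)
      (setf (aref m (+ (* i 4) i)) scale))   ; fill the diagonal
    (loop for i below 4
          sum (aref m (+ (* i 4) i)))))      ; sum the diagonal
```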
23:08:12
aeth
Vectors don't have to actually exist at all because they can be converted to 2-4 (values) and handled with multiple-value-bind and multiple-value-call, which is one huge reason why a tree-walking transformer is easier than using it directly (m-v-c really makes things incredibly verbose, while m-v-b doesn't allow more than one binding, unlike let and let*, so it ruins the indentation levels)
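As an illustration of vecs that never exist as objects, here is a hand-written version of what a transformer might emit for something like (dot (+ a b) c) on 3-component vecs. The function names are hypothetical and this is not ZRVL's actual output.

```lisp
(declaim (inline vec3-add-sketch vec3-dot-sketch))

(defun vec3-add-sketch (ax ay az bx by bz)
  "Component-wise sum, returned as three values instead of an array."
  (values (+ ax bx) (+ ay by) (+ az bz)))

(defun vec3-dot-sketch (ax ay az bx by bz)
  "Dot product of two vecs passed as six scalars."
  (+ (* ax bx) (* ay by) (* az bz)))

(defun dot-of-sum-sketch (ax ay az bx by bz cx cy cz)
  "Hand expansion of (dot (+ a b) c) with no intermediate vector."
  (multiple-value-call #'vec3-dot-sketch
    (vec3-add-sketch ax ay az bx by bz)   ; contributes three values
    (values cx cy cz)))                   ; contributes three more
```

Even this small expression needs nine scalar parameters and an explicit multiple-value-call, which gives a sense of why writing this style by hand gets verbose quickly.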
23:09:40
aeth
For looping, I will have to use my own loop macro that _only_ becomes for loops because CL:LOOP is frequently while loops and is quite advanced, CL:DO is kind of a while loop, and CL:DOTIMES isn't powerful enough
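A rough sketch of what a deliberately restricted counting loop could look like: one counter, one bound, one step, all fixed before the first iteration, so the shape always corresponds to a structured for loop. FOR-RANGE-SKETCH is a hypothetical name and this is not aeth's macro; it assumes a positive step and always returns NIL.

```lisp
(defmacro for-range-sketch ((var start end &optional (step 1)) &body body)
  "Run BODY with VAR going from START (inclusive) to END (exclusive) by STEP."
  (let ((start-var (gensym "START"))
        (end-var (gensym "END"))
        (step-var (gensym "STEP")))
    ;; START, END and STEP are each evaluated exactly once, left to right.
    `(let* ((,start-var ,start)
            (,end-var ,end)
            (,step-var ,step))
       (do ((,var ,start-var (+ ,var ,step-var)))
           ((>= ,var ,end-var))
         ,@body))))

;; Usage:
;; (for-range-sketch (i 0 10 3)
;;   (print i))
```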
23:10:19
aeth
Not just for optimizations, also for easier targeting of SIMD without having to generate gotos and then turn those gotos into structured programming, which seems to be a common IR request these days, whether SPIR-V or WASM
23:12:27
paulapatience
Ah, of course, your vectors will be small, since you're writing this for your game engine.
23:13:18
aeth
Right, it's graphics vectors, which become SPIR-V or SIMD vectors (or not; portable fallback is key). Any higher level vectors have to be constructed from lower level vectors, or from "arrays"
23:13:35
aeth
Technically speaking, they're vecs (common to avoid the name collision), while vectors (1D arrays) are a separate data type
23:14:47
aeth
It's possible that the higher level interface will permit things like complex vectors (a real and a complex %VEC paired together?) or longer vectors
23:16:11
aeth
Similarly, matrices are strictly 2x2 through 4x4 in size and anything larger has to be made up of block matrices with blocks only up to 4x4. At least for %MATRIX, not necessarily for MATRIX
23:16:52
aeth
Permitting smaller (e.g. 1x4 and 4x1) is a separate issue, especially since they'd overlap with the (column) vectors
23:18:31
aeth
paulapatience: Another reason for tree-walking is creating types that don't exist at the CL level, but can be used for correctness. Since with DEFTYPE, a type defined for (simple-array single-float (3)) and another type defined for (simple-array single-float (3)) are the same, and if you wanted metadata, you'd have to put them in separate structs and have that extra cost at runtime. Similarly, the matrix
23:18:37
aeth
representation could vary between 1D row-major, 1D column-major, 2D row-major, 2D column-major, etc., without having different user-exposed code and just do that at the tree-walking transformation level
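The DEFTYPE point can be seen in a small sketch. The type and function names are hypothetical; the behaviour shown is just that of DEFTYPE, which defines an abbreviation for a type specifier rather than a new distinct type.

```lisp
;; Two names for the same underlying representation.
(deftype position-sketch () '(simple-array single-float (3)))
(deftype color-sketch () '(simple-array single-float (3)))

(defun make-position-sketch (x y z)
  "Make a 3-element single-float array intended to be a position."
  (make-array 3 :element-type 'single-float
                :initial-contents (list x y z)))

;; At the CL level nothing stops a position from being used as a color:
;; (typep (make-position-sketch 1f0 2f0 3f0) 'color-sketch)
```

The TYPEP call is true, so the distinction has to live somewhere other than the CL type system, such as in a tree walker's own compile-time bookkeeping, unless the data is wrapped in separate structs at a runtime cost.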