libera/#commonlisp - IRC Chatlog
18:39:24
hexology
hmm.. so if i have several separate lisp files in my program (with one package per file), i might need/want to declaim my desired optimization settings in each one?
18:48:54
jeosol
Bike: really, then I don't think the optimization was applied in my case. I just ran the (declaim ...) part at the top level and ql loaded the required system
18:49:58
jeosol
However in my case with the default speed 1, I think there probably isn't much to be had
19:40:15
pjb
hexology: AFAIK, declaimed declarations are global. PROCLAIM establishes the declaration specified by declaration-specifier in the global environment.
20:06:24
NotThatRPG
Pretty sure DECLAIM is under-specified. It's certainly been a problem in figuring out how ASDF should deal with optimizations
20:13:33
pjb
NotThatRPG: basically: (defmacro declaim (&rest decls) `(eval-when (:compile-toplevel :load-toplevel :execute) ,@(mapcar (lambda (decl) `(proclaim ',decl)) decls)))
20:22:00
Bike
yes, but that doesn't actually help for the question of whether the effects last past the file or not
20:26:15
pjb
I consider that libraries should not contain optimization proclamations. It should be the user who sets them depending on his needs. Therefore whether they're limited to the file or the compilation-unit doesn't matter for me.
20:28:22
NotThatRPG
Right: as far as we could tell it's up to the implementation to decide the scope of DECLAIM
20:29:58
White_Flame
for one of my large projects, I have a (optimize-file) macro invocation at the top of each file, defined early in config, to punt learning how this might differ per implementation & environment :-P
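[Editor's note: White_Flame's (optimize-file) macro isn't shown in the log. A minimal sketch of the idea, with hypothetical names, might look like this: the settings live in one config file, and each source file just invokes the macro.]

```lisp
;; Hypothetical sketch (not White_Flame's actual code): one central
;; variable holds the optimize settings, defined early in the config.
(defparameter *default-optimize-settings*
  '(optimize (speed 3) (safety 1) (debug 1)))

(defmacro optimize-file ()
  ;; Expands to a DECLAIM so the settings apply to the rest of the
  ;; file currently being compiled, regardless of how far the
  ;; implementation propagates DECLAIM beyond the file.
  `(declaim ,*default-optimize-settings*))

;; At the top of each file:
;; (optimize-file)
```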
20:42:19
White_Flame
yeah, on speed sensitive code, I have (fast-body . <body>) and (slow-body . <body>), which also interact with the global optimize-file config
20:43:48
aeth
in case anyone wants to do that, the code probably looks like this: (defmacro stfu-sbcl (&body body) `(locally (declare #+sbcl (sb-ext:muffle-conditions sb-ext:compiler-note)) ,@body))
20:45:29
White_Flame
actually I did a deftype optimization-note where I put the #+sbcl, for some reason
20:47:02
_death
for compiler notes, I learned to accept them.. they remind me that my data structures are probably suboptimal
20:53:05
_death
in some places, I do need to use potential bignums in code that needs to be performant, so muffling related notes could make sense
20:53:29
aeth
it'll be like (typecase x (double-float ...) (single-float ...) (integer ...) (t ...))
20:55:39
pjb
aeth: do you realize that (typecase x (double-float ...) (single-float ...) (integer ...) (t ...)) is not conforming? In some implementations, you may get a warning about duplicate clauses.
20:57:28
aeth
because there's no way to solve the general case of making sure that the second type in a typecase isn't entirely covered by the first, but ofc you sometimes can
20:57:40
aeth
(in the case where there's only one float type, in case someone doesn't get the issue)
20:57:54
aeth
(you can get this on almost all implementations if you do e.g. long-float and short-float)
20:58:12
pjb
aeth: strange, I solved the general case in com.informatimago.common-lisp.cesarum.utility:float-typecase.
20:58:46
aeth
I mean the general general case, not just for floats, which would require handling SATISFIES
21:03:42
_death
it's also being nice to macro writers, or when some type is changed and now overlaps another
21:14:07
Bike
duplicate typecase clauses would be a style warning at best, since the behavior of the typecase is totally defined, it's just that one clause will never happen
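[Editor's note: the point being debated is that on an implementation with only one float representation, SINGLE-FLOAT and DOUBLE-FLOAT denote the same type, so the second clause can never match. A hedged sketch (not pjb's actual FLOAT-TYPECASE) of pruning clauses already covered by an earlier one:]

```lisp
;; Sketch: drop any TYPECASE clause whose type is subsumed by an
;; earlier clause, using SUBTYPEP at macroexpansion time.
;; Caveats: OTHERWISE clauses aren't handled, and when SUBTYPEP
;; returns "unknown" (e.g. for SATISFIES types) the clause is
;; conservatively kept, which is the safe choice.
(defmacro pruned-typecase (keyform &rest clauses)
  (let ((seen '()))
    `(typecase ,keyform
       ,@(loop for (type . body) in clauses
               unless (some (lambda (prev) (subtypep type prev)) seen)
                 collect (progn (push type seen)
                                (cons type body))))))

;; On an implementation where SINGLE-FLOAT = DOUBLE-FLOAT, this
;; expands with the SINGLE-FLOAT clause silently removed:
;; (pruned-typecase x (double-float 1) (single-float 2) (t 3))
```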
21:20:17
resttime
How much does SBCL use SIMD intrinsics when compiling? I'm wondering whether the compiler works fine or it's necessary to learn how to use sb-simd to squeeze out max performance for numerical operations
21:25:02
White_Flame
I don't believe it uses them by default, but there is a simd compatibility library and some asm/vop examples for explicitly using them
21:26:47
kakuhen
and I'd strongly recommend against setting safety to 0 for performance; there's often a good reason the compiler will introduce a type or bounds check at runtime, and it's usually because you couldn't convince the compiler you'd consistently get X type or some array Y with exactly N elements
21:31:48
resttime
White_Flame: Hmmm, guess I'll look more into sb-simd, came across a paper: https://zenodo.org/record/6335627
21:34:53
resttime
kakuhen: Safety is good, but I'm willing to sacrifice it for performance since there's nothing critical. I'm trying to push the limits by writing a raytracer and learn stuff along the way
21:38:59
kakuhen
on my machine, deep-copy-1 on a 4096 element array will cost me about 200,000 processor cycles whereas deep-copy-2 would be about 13,000; the number drops to about 5,000 if you remove bounds checks, but I'd rather pay the cost of bounds checks anyway.
21:41:00
kakuhen
Notice how with a sufficiently "good" type declaration I was able to emit much more concise code and not have to compromise on safety
21:41:29
kakuhen
this is why imo (declare (optimize (speed 3) (safety 0))) is one of the last things you should try doing for extracting performance out of a function
21:42:01
kakuhen
note: if you know the length of your simple-array ahead of time, the disassembly becomes even shorter (and probably faster)
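[Editor's note: the DEEP-COPY-1/DEEP-COPY-2 functions kakuhen benchmarked aren't shown in the log. A hedged guess at their shape, illustrating how a type declaration lets the compiler specialize the array accesses without dropping safety:]

```lisp
;; Sketch only: untyped version forces generic AREF dispatch.
(defun deep-copy-1 (vec)
  (let ((out (make-array (length vec))))
    (dotimes (i (length vec) out)
      (setf (aref out i) (aref vec i)))))

;; A "good" declaration lets the compiler emit specialized,
;; still bounds-checked, accesses -- no (safety 0) needed.
(defun deep-copy-2 (vec)
  (declare (type (simple-array double-float (*)) vec))
  (let ((out (make-array (length vec) :element-type 'double-float)))
    (dotimes (i (length vec) out)
      (setf (aref out i) (aref vec i)))))
```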
21:51:42
resttime
Found it; the loopus library doesn't seem to be in there, and I'm in the sb-simd package when trying to run the code (was trying to disassemble the deep-copy functions), https://github.com/marcoheisig/Loopus/blob/main/code/packages.lisp
21:54:12
Shinmera
While I'm here and not asleep, here's an update on Kandria that may interest the particular set of individuals in this channel: https://twitter.com/Shinmera/status/1554585025789706241
21:54:50
resttime
kakuhen: haha well that's alright then, I assume it's some kind of looping macro, might have been superseded by loopus
21:55:30
kakuhen
yes, the intent of do-vectorized was that you used types exported by sb-simd and the compiler would attempt using the best available simd instruction set on your computer (in my case, avx2)
22:00:17
resttime
Ohhh, the sb-simd paper mentions an INSTRUCTION-SET-CASE macro for selecting the best available code at run time, might've been related to/used to implement do-vectorized
22:01:00
Shinmera
dunno if it's smart enough to eliminate the dispatch by rewiring at startup, though.
22:06:51
resttime
Wait found a DO-VECTORIZED used here https://github.com/marcoheisig/Loopus/blob/bd84132eb5d0e94b1fadcb9be734a3ff8b0c1aff/code/ir/sb-simd.lisp#L46
22:07:04
kakuhen
resttime: well, assuming you can get loopus to compile, you may be able to replace sb-simd:do-vectorized with loopus:for
22:07:15
kakuhen
Unfortunately it doesn't seem to compile on my machine due to linker errors with libisl.so
22:09:21
kakuhen
resttime: i assume that's because the vectorizers get generated for various simd instruction sets on your cpu, then you're supposed to invoke loopus:for and let the backend handle the instruction set
22:09:26
Kingsy
curious, with clip, isn't it possible to use templates that inherit from others? and if so how come the depp tutorial for radiance doesn't do it that way?
22:09:41
kakuhen
fwiw do-vectorized doesn't explicitly exist in the sb-simd codebase either, but it was provided to me when I was testing the example I sent you on SBCL 2.2.2
22:09:47
phantomics
From the sb-simd docs, I find it a bit unclear how the instructions map to the sb-vm primitive functions
22:10:41
Shinmera
As to why, idunno. It's been way too long since I wrote it. Probably thought it unnecessary to complicate matters for the two or so pages that it actually needs.
22:10:57
Kingsy
Shinmera: well with other frameworks, you get a base.twig or a base.whatever, which contains the <head> and all that other stuff, then you have a view.whatever which always uses the base.
22:11:44
Kingsy
no I mean, it makes sense if you wanted to include something in the head (which is common) you would only need to modify one file.
22:12:14
phantomics
https://sb-simd.common-lisp.dev/supported_sse.shtml lists the different add instructions that are supported, am I to assume that sb-vm::%sse-add/simple-array-single-float-1 implements ADDSS?
22:13:00
Shinmera
It's not clear to me whether it's better to have a master file that includes a subfile, or a subfile that includes a scaffold. You can do either, or none.
22:13:36
Kingsy
usually I like the subfile that includes the scaffold. but yeah. nice! good to hear.
22:14:53
resttime
kakuhen: https://github.com/marcoheisig/sb-simd/commit/67ff8cb36962a02e36f1eeba71a50c0c6d073ced there we go, mystery solved
22:15:32
kakuhen
I was wondering why cl-isl was really insistent on picking libisl.so rather than libisl.dylib on my system
22:16:38
kakuhen
resttime: yes, if you look at the paper, heisig seems to use loopus:for much like do-vectorized was once used
22:16:55
kakuhen
so right now I'm on a journey to eventually get cl-isl compiling on my mac, to get loopus.sb-simd compiling, and test loopus:for
22:17:43
resttime
The issue I see now is that loopus still tries to use this missing symbol & package, so it's probably a bit outdated: https://github.com/marcoheisig/Loopus/blob/bd84132eb5d0e94b1fadcb9be734a3ff8b0c1aff/code/ir/sb-simd.lisp#L46
22:21:30
kakuhen
for instance, sb-simd-avx2:f64.4 will not exist on computers building sb-simd on intel processors without avx2 instructions
22:26:42
resttime
kakuhen: Want me to open a PR to add (:darwin "libisl.dylib") to the prologue.lisp? That'd solve it for macOS systems in the future
22:28:50
kakuhen
I'm also having to make sure my copy of libisl is ABI compatible, since it looks like cl-isl needs version 22, but my system has version 23.
22:29:03
Kingsy
Shinmera: could I trouble you with a radiance question? I have asked in clschool, but as you are here I am wondering if I could be cheeky
22:31:16
kakuhen
looks like my SBCL is hanging in the middle of building cl-isl, so there may be ABI issues... I have no way of definitively telling, and I'm too lazy to install an outdated isl just to see if I can build this tbh
22:32:28
kakuhen
running in terminal now to see if I get dropped to LDB; I suspect that may be happening when SBCL hangs.
22:34:12
kakuhen
"READ error during COMPILE-FILE: Package SB-SIMD-VECTORIZER does not exist. Line: 3, Column: 71, File-Position: 96." in code/ir/sb-simd.lisp
22:37:28
kakuhen
With that said, I also am getting at the very bottom of the debugger output: "Bogus form-number: the source file has probably changed too much to cope with"
22:41:05
resttime
Went ahead and created the PR, should solve a minor annoyance with picking the right lib on OSX. I've written cross-platform CFFI bindings before and just specify the lib name explicitly depending on the platform
22:46:35
Kingsy
more generic question then if anyone might know, what is the best way of recompiling an .asd lisp module I am working on?
22:49:16
kakuhen
I'm guessing it's because :UNIX is in *FEATURES* as well, but I'm not sure since I've never used CFFI.
22:50:32
kakuhen
in any case, I proceeded building the package despite the warnings in loopus.sb-simd, and codegen became absolutely borked, so I think the presence of sb-simd-vectorizer is important :<
22:52:28
resttime
kakuhen: Yeah, the way DEFINE-FOREIGN-LIBRARY works is that it'll check the load clauses in order, so the one for :DARWIN will have to be before the :UNIX one
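[Editor's note: both :DARWIN and :UNIX are in *FEATURES* on macOS, and CFFI tries DEFINE-FOREIGN-LIBRARY clauses in order, so the more specific clause must come first. A minimal sketch of the fix being discussed (library and file names follow the log; the :or fallbacks are illustrative):]

```lisp
;; Clauses are checked top to bottom; putting :DARWIN before the
;; catch-all :UNIX clause makes macOS pick the .dylib.
(cffi:define-foreign-library libisl
  (:darwin (:or "libisl.dylib" "libisl"))
  (:unix   (:or "libisl.so" "libisl"))
  (t       (:default "libisl")))

(cffi:use-foreign-library libisl)
```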
22:56:28
resttime
Kingsy: Dunno best way, but I keep a symlink to the project repo in quicklisp/local-projects and quickload the system when needed
22:57:42
Kingsy
resttime: I think I did it within emacs using SPC-m-c -> f, no restart of radiance needed. after that I just hit the radiance server and the new endpoint was there.. pretty sweet.
22:58:38
Kingsy
resttime: ah that makes sense. my project is actually in there right now, so just a quickload would have worked too I suppose, that's probably what emacs is doing under the hood.
23:00:30
resttime
np, and it kinda depends. My context is if I'm working on a project that depends-on another, I'd make a local copy of the other repo, make changes, add symlink to the quicklisp/local-projects
23:01:10
resttime
Then every subsequent quickload will automatically prioritise the one in local-projects instead of the one in the quicklisp repo
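[Editor's note: the local-projects workflow resttime describes can also be done without a symlink, by pushing a directory onto Quicklisp's search list. A sketch, with a placeholder path:]

```lisp
;; Quicklisp consults *LOCAL-PROJECT-DIRECTORIES* before dist
;; systems, so a local checkout shadows the released copy.
(push #p"/path/to/my/fork/" ql:*local-project-directories*)
(ql:register-local-projects)   ; refresh the system index
(ql:quickload :some-system)    ; now loads the local copy
```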
23:04:22
resttime
If it's just my own project then I'd just take advantage of hotloading with the SLY/SLIME keybindings
23:09:39
Kingsy
hehe that's kinda over my head, but I am sure I will figure it out at some point. I have a depends-on and the sly keybinding worked. but we will see.
3:39:03
beach
I don't know the answer, but I wonder why that is important. I mean, you typically don't start your Common Lisp image very often.
3:40:40
smlckz
something like this: https://slime.common-lisp.dev/doc/html/Loading-Swank-faster.html
4:18:11
phantomics
Checking in again with a question about the finer points of threading. In lparallel is there a fast way to get a count of the number of active workers in the kernel? I'm working on finding a way to divide large, unpredictable workloads without causing delays due to shortages of available threads
4:22:15
phantomics
(lparallel:task-categories-running) will get the info, but running it millions of times will definitely cost
4:22:42
hayley
Though it'd be "intrusive" to your code, you could have your tasks atomically bump a counter?
4:22:54
phantomics
I've thought of keeping an integer count of active threads and incrementing/decrementing it when appropriate, but I'll have to handle many conditions
4:23:38
phantomics
Intrusive isn't much of a problem, since I'm concentrating all parallelism in the system in a single function to focus on optimizing
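[Editor's note: a hedged, SBCL-specific sketch of hayley's suggestion, wrapping each task so it atomically bumps a counter on entry and drops it on exit; reading the count then costs only a slot access. All names here are hypothetical.]

```lisp
;; SB-EXT:ATOMIC-INCF/-DECF work on structure slots of type
;; SB-EXT:WORD, so no lock is needed around the counter.
(defstruct (worker-count (:constructor make-worker-count ()))
  (n 0 :type sb-ext:word))

(defvar *active-workers* (make-worker-count))

(defmacro with-counted-task (&body body)
  `(progn
     (sb-ext:atomic-incf (worker-count-n *active-workers*))
     ;; UNWIND-PROTECT ensures the count drops even if the task
     ;; signals an error.
     (unwind-protect (progn ,@body)
       (sb-ext:atomic-decf (worker-count-n *active-workers*)))))

;; Submit tasks as (lparallel:future (with-counted-task ...)) and
;; poll (worker-count-n *active-workers*) cheaply.
```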
5:57:31
flip214
phantomics: is that more a question of "how many cores did I get assigned on startup", or "how busy is my system right now"?
6:01:00
flip214
the latter will be unpredictable to some degree... the first question is "just" a syscall, but could vary too