libera/#commonlisp - IRC Chatlog
14:16:28
hayley
On topic, I do a fair amount of model checking and fuzzing of my code, to help find bugs; there usually are a few bugs before I run a fuzzer for the first time. That sort of workload is embarrassingly parallel, so I decided throwing money at a nice CPU would improve my productivity enough to justify the cost.
14:19:27
jeosol
Doing something similar now - my code mostly runs, but there are some code paths that will result in problems every now and then. I am running stochastic optimizations, so some situations may arise that I have not accounted for nor written protective code around
14:20:45
hayley
Pratchett once said that "real stupidity beats artificial intelligence every time", but often artificial stupidity beats my intelligence ):
14:22:24
jeosol
beach: On the timing and performance issue, is there some resource (or maybe there needs to be one) that emphasizes CL style for writing more performant code? Perhaps some of the Lisp guides and/or some repos have this. Like for Python: because of multiple ways of doing things, some looping constructs run slower and others use C underneath
14:24:32
beach
I am unaware of any such resource. Also, I imagine that for Python, it's for a single implementation, whereas for Common Lisp, the recommendations might vary a lot between implementations.
14:26:25
jeosol
my bad, I am sorry for butchering the name. I get confused; not sure if I have seen the other spelling somewhere or my mind is playing tricks on me
14:26:50
hayley
I always thought the rule for Python was to just use a C module, or to hope PyPy fares better.
14:27:54
jeosol
beach: good point, beach. I was even thinking of things like how to organize LET forms, LOOP macro constructs that perform better, albeit slightly, etc. But I do agree that will definitely vary by implementation.
14:28:32
hayley
The instructor didn't like my sense of humour for some submitted work; we had to produce an algorithm that ran quickly, which was to be written in Python, so I made a C module which did the computation. (Better yet, we were allowed external libraries, and those often use C as well.)
14:28:35
jeosol
hayley: I don't claim to be a python expert, and don't know the language fully, just use it to get some things (in the ML/AI space) done
14:29:25
jeosol
hayley: yes, for serious computation, it was recommended to use a C module or libs with C modules behind
14:31:02
hayley
Later he admitted that my submission was fast even for a C program; perhaps I could have done it in Python properly. But I searched for how to perform a count-trailing-zeroes operation in Python, and was told to make a string and search for a '0' character in the string, which scared me.
14:36:38
hayley
As for an optimisation guide for Common Lisp, I think one might be tempted to write too much about optimising for SBCL. (Though it might be good advice to recommend you try hacking on the compiler of your favourite implementation, too.)
14:37:25
hayley
Oh, it was searching for the '1' in the binary representation, which isn't much better. Ideally this operation would just use the BSF instruction on x86, I think.
14:38:16
hayley
(Arguably I didn't write my code in C, as I used a GCC intrinsic function to perform count trailing/leading zeroes though.)
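Common Lisp has no standard count-trailing-zeroes operator either, but a portable sketch is straightforward with the operators discussed below; the function name here is illustrative, not a standard one:

```lisp
;; Portable count-trailing-zeroes sketch: (LOGAND x (- x)) isolates
;; the lowest set bit, and INTEGER-LENGTH of that power of two,
;; minus one, is the number of trailing zero bits.
(defun count-trailing-zeroes (x)
  "Number of trailing zero bits in the positive integer X."
  (1- (integer-length (logand x (- x)))))

(count-trailing-zeroes 8)  ; => 3
(count-trailing-zeroes 40) ; => 3
```

Whether an implementation compiles this down to a single BSF/TZCNT-style instruction is another matter, of course.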
14:41:53
hayley
Common Lisp is quite good for bit munging, in comparison. LDB is already amazing, INTEGER-LENGTH is find-last-set(?) and LOGCOUNT is Hamming weight.
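A quick REPL sketch of the operators just mentioned:

```lisp
;; LDB extracts a byte (bit field) at a given position and width.
(ldb (byte 4 4) #xAB)  ; => 10, bits 4-7 of #xAB, i.e. the #xA nibble

;; INTEGER-LENGTH behaves like find-last-set for positive integers.
(integer-length 255)   ; => 8

;; LOGCOUNT is the Hamming weight (population count).
(logcount #b10110)     ; => 3
```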
15:21:45
jeosol
TMA: that's where I have seen that then. I was confused when I saw it and then saw his name later on. Good to know
16:27:18
phantomics
Hey, quick question, are there any libraries for CL that optimize integer division, i.e. (floor), like libdivide for C/C++?
16:33:08
pjb
(time (floor 903128712708418 12312)) #| --> 73353534170 ; 7378 |# took 4 microseconds (0.000004 seconds) to run.
16:33:48
phantomics
The naive implementation of floor can be improved on a bit, libdivide uses some strategic bit shifts and other techniques to make it faster
16:34:29
phantomics
And April's row-major-index based array transformations do a lot of (floor)ing to convert indices
16:35:06
pjb
Yes. Again, you can report a bug asking vendors to use libdivide or better in their cl:floor implementation.
16:38:05
jeosol
I recall seeing a while back that PLN is now in SBCL. What is the current status? I tried it many moons ago and got some errors.
16:38:35
phantomics
Or just FFI it directly in April, but that would be inelegant, although it appears that some recent processors have much faster int division, like M1 and Ryzen
16:40:02
jackdaniel
afair commercial implementations either picked changes up, or will do that, but that would need a confirmation
16:40:19
jackdaniel
I think that Shinmera had some fancy page that summarizes support for various extensions
16:40:21
jeosol
jackdaniel: thanks, I only use SBCL. I tried it a while ago and remember seeing something about ASDF for SBCL not updating something (I don't remember exactly)
16:52:16
jeosol
jackdaniel: I think I recall my error. I was trying to use PLN with uiop:define-package rather than plain defpackage; that's why I wasn't able to get it to work
17:36:33
jeosol
Shinmera: thanks for that. It seems there is some issue using it with uiop:define-package
17:37:09
jeosol
jackdaniel: did you confirm that it gives an error too, or that I shouldn't be using it that way?
17:37:55
jeosol
Shinmera: interesting. So it means I'd have to update manually because I am using latest SBCL
17:38:24
jeosol
I am not using PLN now per se, but trying to make things more maintainable, so looking at options
17:39:18
Shinmera
I've been using PLNs in all my libraries published since 2019 and it's been great.
17:40:12
jeosol
your point about the dependency issue is to use plain defpackage instead of uiop:define-package
17:54:42
mfiano
Shinmera: Well PLNs didn't really originate in SBCL, but it was the first implementation of the external specification
17:55:29
jackdaniel
the specification was written by nsivola for sbcl with intent to propose it as CDR after some user feedback
18:00:38
mfiano
It's pretty amazing how that feature caught like wildfire in the world of opinionated Lispers.
18:32:07
Bike
phantomics: sbcl optimizes division by small constants into shifts+etc already https://github.com/sbcl/sbcl/blob/master/src/compiler/srctran.lisp#L3093-L3119. if you want to do something like libdivide you could call compile? like use (compile nil `(lambda (x) (truncate x ,d))) where libdivide has libdivide::divider<int>;
18:32:29
Bike
if the function call overhead is too much, you could probably leverage sbcl internals to produce appropriate forms yourself and rely on the modular arithmetic optimizations
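Bike's COMPILE suggestion can be sketched as a factory function that plays the role of libdivide's divider object; MAKE-DIVIDER is an illustrative name, not an existing API:

```lisp
;; Build a function specialized on one constant divisor, so the
;; compiler is free to turn the division into shifts and multiplies,
;; amortizing the cost over many divisions by the same D.
(defun make-divider (d)
  "Return a function computing (TRUNCATE x D) for fixnum X."
  (compile nil `(lambda (x)
                  (declare (type fixnum x)
                           (optimize speed))
                  (truncate x ,d))))

(let ((div3 (make-divider 3)))
  (funcall div3 10)) ; => 3 (remainder 1 as the second value)
```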
19:13:31
phantomics
By small constants does it have to be expressed as a constant in the code, like (floor x 3)? Will it work with for example (floor x (aref v y)), where v is a vector of positive integers?
0:43:55
Bike
phantomics: it will have to be a constant, which is why i suggested the compile thing for a case like libdivide's, where it's not a compile time constant but you're using it for the same division a lot where you want to amortize
0:44:44
Bike
converting to shifts and multiplications and all is probably slower than using hardware division for cases where the divisor varies each time, of course
0:46:09
Bike
"probably" shoudl be definitely actually, because generating new machine instructions and running them is absolutely going to be slower than one ALU operation
0:46:46
phantomics
I'm dividing by dimensional factors, so I may consistently have a need for numbers 0-10 but the factors for high dimensions could be anything
0:47:25
phantomics
I guess I could make a set of functions for low numbers and reference them from a LUT but I couldn't create a function in memory for just any number
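The LUT idea above could look roughly like this; the table size and names are hypothetical, and divisors outside the table fall back to plain TRUNCATE:

```lisp
;; Precompile specialized dividers for small divisors 1-10 and
;; dispatch through a look-up table.
(defparameter *dividers*
  (let ((table (make-array 11 :initial-element nil)))
    (loop for d from 1 to 10
          do (setf (aref table d)
                   (compile nil `(lambda (x)
                                   (declare (type fixnum x)
                                            (optimize speed))
                                   (truncate x ,d)))))
    table))

(defun fast-truncate (x d)
  "Use a precompiled divider when D is in the table, else TRUNCATE."
  (if (and (<= 1 d 10) (aref *dividers* d))
      (funcall (aref *dividers* d) x)
      (truncate x d)))
```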
0:54:26
Bike
you should also profile to see if this is actually a problem, of course. i just did a quick test to compare truncate with a constant divisor versus a variable one, where in either case they're declared fixnum and there's optimize speed. with a million iterations the constant version is 3 ms or ~22% faster, which given that it only took 16 ms in the variable version isn't too amazing
0:54:50
Bike
with a hundred million iterations it's ~48%. that's the kind of size you're gonna need for this to matter at all
0:56:27
phantomics
My use cases could get pretty big, I'm converting row-major indices between arrays
0:57:10
phantomics
So that means that I have to do a number of integer divisions equal to the rank of the array to convert its row-major index to integer coordinates, and then usually some kind of multiplication of each of those
0:58:11
phantomics
So for large arrays with many steps of index conversion, millions of iterations are not unlikely
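The conversion phantomics describes, one integer division per dimension, might be sketched like this (an illustrative stand-alone function, not April's actual code):

```lisp
;; Convert a row-major index to coordinates: for each dimension,
;; TRUNCATE by the "dimensional factor", the product of the sizes
;; of the trailing dimensions; the remainder feeds the next step.
(defun rmi-to-coords (index dims)
  "Convert row-major INDEX into a list of coordinates for DIMS."
  (let ((factors (maplist (lambda (ds) (reduce #'* (rest ds)))
                          dims)))
    (loop for f in factors
          collect (multiple-value-bind (q r) (truncate index f)
                    (setf index r)
                    q))))

(rmi-to-coords 7 '(2 3 2)) ; => (1 0 1)
```

Since each index depends only on itself, a batch of such conversions shares no state, which is what makes the multithreading phantomics mentions possible.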
0:58:33
Bike
oh, and in my example there's function call overhead in both cases, but in reality you wouldn't need it for the variable truncate
0:58:52
phantomics
The benefit of doing row-major index conversion for array processing is that it allows the transformations to be multithreaded with nothing shared, so that can give a significant speedup
1:03:33
phantomics
Appreciate the examples Bike, I'll try running something like that and see how it profiles
1:04:57
Bike
oh, and also i think sbcl can use range information about the dividend in its optimization, so if you can provide that information for your indices it could help
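Supplying that range information is just a type declaration; the bounds here are made up for illustration:

```lisp
;; Declaring a bounded integer range on the index may let the
;; compiler choose a cheaper division sequence than for a general
;; fixnum.
(defun scaled-index (i)
  (declare (type (integer 0 1000000) i)
           (optimize speed))
  (truncate i 12))

(scaled-index 120) ; => 10 (remainder 0 as the second value)
```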
1:10:08
phantomics
Right, the CPU will make a difference here, M1s apparently have very fast integer division, so I may want to try with Stas's SBCL port
1:11:25
Bike
you could even try to do something fancy where your thing tests which implementation is faster on the actual machine and then uses that
1:13:19
phantomics
That's something I've been considering, an autotuning system, it would be helpful to determine the cutoff point for when to add a new thread to a given transformation on an array of a given size
1:15:33
Bike
cl-jpeg tries to do this, but it's a little janky https://github.com/sharplispers/cl-jpeg/blob/master/jpeg.lisp#L312-L361
1:18:38
Bike
yeah, which is just a boolean. then the quantize-block macro decides what to expand into based on that variable.
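A minimal version of that boolean-flag autotuning pattern might look like this; the candidate implementations and names are hypothetical, and a real harness would want warm-up runs and more careful timing:

```lisp
;; Time two candidate implementations once, at load time, and set a
;; flag that later code (or a macro deciding what to expand into)
;; can consult.
(defun time-run (fn iterations)
  "Internal real-time units taken to call FN with 0..ITERATIONS-1."
  (let ((start (get-internal-real-time)))
    (dotimes (i iterations)
      (funcall fn i))
    (- (get-internal-real-time) start)))

(defparameter *use-shift-divide-p*
  (<= (time-run (lambda (i) (ash i -3)) 1000000)
      (time-run (lambda (i) (truncate i 8)) 1000000))
  "True when the shift-based divide-by-8 timed faster on this machine.")
```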
1:26:13
phantomics
Pretty simple, I was envisioning a big table of values for each operation. Though with my loop-fusion system such a technique may only matter for single-stage transformations of big arrays
1:46:03
lagash
I'm looking into coding some sort of parsing / generating library for LDraw (standard file format for LEGO designs) and I'm not entirely sure where to start. I've done stuff with PEGs and other parsing grammars before, but in other languages. What would be something I should look into for bidirectional LDraw<->Sexp?
1:53:31
phantomics
lagash: You could take a look at https://github.com/eugeneia/maxpc for building parsers, be advised it's AGPL-licensed though
1:59:19
lagash
I'm not too enthused about it not implementing packrat parsing as well as being AGPL'd
2:05:53
phantomics
As far as writing back to LDraw it looks like any option would involve significant work, I see the format is unique and not based on XML or anything like that
2:07:07
phantomics
I see there's https://github.com/JoshTheDerf/LDRParser that can convert LDR -to- JSON, but converting from JSON to LDR is still a todo; if that were implemented you could generate JSON from CL to give to it
2:12:13
lagash
And I meant more that the parser and the writer be integrated together, versus having two libraries: one for parsing, the other for writing / generating