libera/#commonlisp - IRC Chatlog
14:16:28
hayley
On topic, I do a fair amount of model checking and fuzzing of my code, to help find bugs; there usually are a few bugs before I run a fuzzer for the first time. That sort of workload is embarrassingly parallel, so I decided throwing money at a nice CPU would improve my productivity enough to justify the cost.
14:19:27
jeosol
Doing something similar now - my code mostly runs, but there are some code paths that will result in problems every now and then. I am running stochastic optimizations, so some situations may arise that I have not accounted for nor written protective code around
14:20:45
hayley
Pratchett once said that "real stupidity beats artificial intelligence every time", but often artificial stupidity beats my intelligence ):
14:22:24
jeosol
beach: On the timing and performance issue, is there some resource (or maybe there needs to be one) that emphasizes CL style for writing more performant code? Perhaps some of the Lisp guides and/or some repos have this. Like for Python: because of multiple ways of doing things, some looping constructs run slower and others use C underneath
14:24:32
beach
I am unaware of any such resource. Also, I imagine that for Python, it's for a single implementation, whereas for Common Lisp, the recommendations might vary a lot between implementations.
14:26:25
jeosol
my bad, I am sorry for butchering the name. I get confused; not sure if I have seen the other spelling somewhere or my mind is playing tricks on me
14:26:50
hayley
I always thought the rule for Python was to just use a C module, or to hope PyPy fares better.
14:27:54
jeosol
beach: good point, beach. I was even thinking of things like how to organize LET forms, LOOP macro constructs that perform better, albeit slightly, etc. But I do agree that will definitely vary by implementation.
14:28:32
hayley
The instructor didn't like my sense of humour for some submitted work; we had to produce an algorithm that ran quickly, which was to be written in Python, so I made a C module which did the computation. (Better yet, we were allowed external libraries, and those often use C as well.)
14:28:35
jeosol
hayley: I don't claim to be a python expert, and don't know the language fully, just use it to get some things (in the ML/AI space) done
14:29:25
jeosol
hayley: yes, for serious computation, it was recommended to use a C module or libs with C modules behind
14:31:02
hayley
Later he admitted that my submission was fast even for a C program; perhaps I could have done it in Python properly. But I searched for how to perform a count-trailing-zeroes operation in Python, and was told to make a string and search for a '0' character in the string, which scared me.
14:36:38
hayley
As for an optimisation guide for Common Lisp, I think one might be tempted to write too much about optimising for SBCL. (Though it might be good advice to recommend you try hacking on the compiler of your favourite implementation, too.)
14:37:25
hayley
Oh, it was searching for the '1' in the binary representation, which isn't much better. Ideally this operation would just use the BSF instruction on x86, I think.
14:38:16
hayley
(Arguably I didn't write my code in C, as I used a GCC intrinsic function to perform count trailing/leading zeroes though.)
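Common Lisp has no standard count-trailing-zeroes operator either, but a portable sketch is straightforward with the operators discussed below; the function name here is illustrative, not a standard one:

```lisp
;; Portable count-trailing-zeroes sketch: (LOGAND x (- x)) isolates
;; the lowest set bit, and INTEGER-LENGTH of that power of two,
;; minus one, is the number of trailing zero bits.
(defun count-trailing-zeroes (x)
  "Number of trailing zero bits in the positive integer X."
  (1- (integer-length (logand x (- x)))))

(count-trailing-zeroes 8)  ; => 3
(count-trailing-zeroes 40) ; => 3
```

Whether an implementation compiles this down to a single BSF/TZCNT-style instruction is another matter, of course.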
14:41:53
hayley
Common Lisp is quite good for bit munging, in comparison. LDB is already amazing, INTEGER-LENGTH is find-last-set(?) and LOGCOUNT is Hamming weight.
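A quick REPL sketch of the operators just mentioned:

```lisp
;; LDB extracts a byte (bit field) at a given position and width.
(ldb (byte 4 4) #xAB)  ; => 10, bits 4-7 of #xAB, i.e. the #xA nibble

;; INTEGER-LENGTH behaves like find-last-set for positive integers.
(integer-length 255)   ; => 8

;; LOGCOUNT is the Hamming weight (population count).
(logcount #b10110)     ; => 3
```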
15:21:45
jeosol
TMA: that's where I have seen that then. I was confused when I saw it and then saw his name later on. Good to know
16:27:18
phantomics
Hey, quick question, are there any libraries for CL that optimize integer division, i.e. (floor), like libdivide for C/C++?
16:33:08
pjb
(time (floor 903128712708418 12312)) #| --> 73353534170 ; 7378 |# took 4 microseconds (0.000004 seconds) to run.
16:33:48
phantomics
The naive implementation of floor can be improved on a bit, libdivide uses some strategic bit shifts and other techniques to make it faster
16:34:29
phantomics
And April's row-major-index based array transformations do a lot of (floor)ing to convert indices
16:35:06
pjb
Yes. Again, you can report a bug asking vendors to use libdivide or better in their cl:floor implementation.
16:38:05
jeosol
I recall seeing a while back that PLN is now in SBCL. What is the current status? I tried it many moons ago and got some errors.
16:38:35
phantomics
Or just FFI it directly in April, but that would be inelegant, although it appears that some recent processors have much faster int division, like M1 and Ryzen
16:40:02
jackdaniel
afair commercial implementations either picked changes up, or will do that, but that would need a confirmation
16:40:19
jackdaniel
I think that Shinmera had some fancy page that summarizes support for various extensions
16:40:21
jeosol
jackdaniel: thanks, I only use SBCL. I tried it a while ago and remember seeing something about ASDF for SBCL not updating something (I don't remember exactly)
16:52:16
jeosol
jackdaniel: I think I recall my error. I was trying to use PLN with uiop:define-package rather than plain defpackage; that's why I wasn't able to get it to work
17:36:33
jeosol
Shinmera: thanks for that. It seems there is some issue using it with uiop:define-package
17:37:09
jeosol
jackdaniel: did you confirm that it gives an error too, or that I shouldn't be using it that way?
17:37:55
jeosol
Shinmera: interesting. So it means I'd have to update manually because I am using latest SBCL
17:38:24
jeosol
I am not using PLN now per se, but trying to make things more maintainable, so looking at options
17:39:18
Shinmera
I've been using PLNs in all my libraries published since 2019 and it's been great.
17:40:12
jeosol
your point about the dependency issue is to use plain defpackage instead of uiop:define-package
17:54:42
mfiano
Shinmera: Well PLNs didn't really originate in SBCL, but it was the first implementation of the external specification
17:55:29
jackdaniel
the specification was written by nsivola for sbcl with intent to propose it as CDR after some user feedback
18:00:38
mfiano
It's pretty amazing how that feature caught like wildfire in the world of opinionated Lispers.
18:32:07
Bike
phantomics: sbcl optimizes division by small constants into shifts+etc already https://github.com/sbcl/sbcl/blob/master/src/compiler/srctran.lisp#L3093-L3119. if you want to do something like libdivide you could call compile? like use (compile nil `(lambda (x) (truncate x ,d))) where libdivide has libdivide::divider<int>;
18:32:29
Bike
if the function call overhead is too much, you could probably leverage sbcl internals to produce appropriate forms yourself and rely on the modular arithmetic optimizations
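Bike's COMPILE suggestion can be sketched as a factory function that plays the role of libdivide's divider object; MAKE-DIVIDER is an illustrative name, not an existing API:

```lisp
;; Build a function specialized on one constant divisor, so the
;; compiler is free to turn the division into shifts and multiplies,
;; amortizing the cost over many divisions by the same D.
(defun make-divider (d)
  "Return a function computing (TRUNCATE x D) for fixnum X."
  (compile nil `(lambda (x)
                  (declare (type fixnum x)
                           (optimize speed))
                  (truncate x ,d))))

(let ((div3 (make-divider 3)))
  (funcall div3 10)) ; => 3 (remainder 1 as the second value)
```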
19:13:31
phantomics
By small constants does it have to be expressed as a constant in the code, like (floor x 3)? Will it work with for example (floor x (aref v y)), where v is a vector of positive integers?
0:43:55
Bike
phantomics: it will have to be a constant, which is why i suggested the compile thing for a case like libdivide's, where it's not a compile time constant but you're using it for the same division a lot where you want to amortize
0:44:44
Bike
converting to shifts and multiplications and all is probably slower than using hardware division for cases where the divisor varies each time, of course
0:46:09
Bike
"probably" shoudl be definitely actually, because generating new machine instructions and running them is absolutely going to be slower than one ALU operation
0:46:46
phantomics
I'm dividing by dimensional factors, so I may consistently have a need for numbers 0-10 but the factors for high dimensions could be anything
0:47:25
phantomics
I guess I could make a set of functions for low numbers and reference them from a LUT but I couldn't create a function in memory for just any number
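The LUT idea above could look roughly like this; the table size and names are hypothetical, and divisors outside the table fall back to plain TRUNCATE:

```lisp
;; Precompile specialized dividers for small divisors 1-10 and
;; dispatch through a look-up table.
(defparameter *dividers*
  (let ((table (make-array 11 :initial-element nil)))
    (loop for d from 1 to 10
          do (setf (aref table d)
                   (compile nil `(lambda (x)
                                   (declare (type fixnum x)
                                            (optimize speed))
                                   (truncate x ,d)))))
    table))

(defun fast-truncate (x d)
  "Use a precompiled divider when D is in the table, else TRUNCATE."
  (if (and (<= 1 d 10) (aref *dividers* d))
      (funcall (aref *dividers* d) x)
      (truncate x d)))
```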
0:54:26
Bike
you should also profile to see if this is actually a problem, of course. i just did a quick test to compare truncate with a constant divisor versus a variable one, where in either case they're declared fixnum and there's optimize speed. with a million iterations the constant version is 3 ms or ~22% faster, which given that it only took 16 ms in the variable version isn't too amazing
0:54:50
Bike
with a hundred million iterations it's ~48%. that's the kind of size you're gonna need for this to matter at all
0:56:27
phantomics
My use cases could get pretty big, I'm converting row-major indices between arrays
0:57:10
phantomics
So that means that I have to do a number of integer divisions equal to the rank of the array to convert its row-major index to integer coordinates, and then usually some kind of multiplication of each of those
0:58:11
phantomics
So for large arrays with many steps of index conversion, millions of iterations are not unlikely
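The conversion phantomics describes, one integer division per dimension, might be sketched like this (an illustrative stand-alone function, not April's actual code):

```lisp
;; Convert a row-major index to coordinates: for each dimension,
;; TRUNCATE by the "dimensional factor", the product of the sizes
;; of the trailing dimensions; the remainder feeds the next step.
(defun rmi-to-coords (index dims)
  "Convert row-major INDEX into a list of coordinates for DIMS."
  (let ((factors (maplist (lambda (ds) (reduce #'* (rest ds)))
                          dims)))
    (loop for f in factors
          collect (multiple-value-bind (q r) (truncate index f)
                    (setf index r)
                    q))))

(rmi-to-coords 7 '(2 3 2)) ; => (1 0 1)
```

Since each index depends only on itself, a batch of such conversions shares no state, which is what makes the multithreading phantomics mentions possible.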
0:58:33
Bike
oh, and in my example there's function call overhead in both cases, but in reality you wouldn't need it for the variable truncate
0:58:52
phantomics
The benefit of doing row-major index conversion for array processing is that it allows the transformations to be multithreaded with nothing shared, so that can give a significant speedup
1:03:33
phantomics
Appreciate the examples Bike, I'll try running something like that and see how it profiles
1:04:57
Bike
oh, and also i think sbcl can use range information about the dividend in its optimization, so if you can provide that information for your indices it could help
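Supplying that range information is just a type declaration; the bounds here are made up for illustration:

```lisp
;; Declaring a bounded integer range on the index may let the
;; compiler choose a cheaper division sequence than for a general
;; fixnum.
(defun scaled-index (i)
  (declare (type (integer 0 1000000) i)
           (optimize speed))
  (truncate i 12))

(scaled-index 120) ; => 10 (remainder 0 as the second value)
```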
1:10:08
phantomics
Right, the CPU will make a difference here, M1s apparently have very fast integer division, so I may want to try with Stas's SBCL port
1:11:25
Bike
you could even try to do something fancy where your thing tests which implementation is faster on the actual machine and then uses that
1:13:19
phantomics
That's something I've been considering, an autotuning system, it would be helpful to determine the cutoff point for when to add a new thread to a given transformation on an array of a given size
1:15:33
Bike
cl-jpeg tries to do this, but it's a little janky https://github.com/sharplispers/cl-jpeg/blob/master/jpeg.lisp#L312-L361
1:18:38
Bike
yeah, which is just a boolean. then the quantize-block macro decides what to expand into based on that variable.
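A minimal version of that boolean-flag autotuning pattern might look like this; the candidate implementations and names are hypothetical, and a real harness would want warm-up runs and more careful timing:

```lisp
;; Time two candidate implementations once, at load time, and set a
;; flag that later code (or a macro deciding what to expand into)
;; can consult.
(defun time-run (fn iterations)
  "Internal real-time units taken to call FN with 0..ITERATIONS-1."
  (let ((start (get-internal-real-time)))
    (dotimes (i iterations)
      (funcall fn i))
    (- (get-internal-real-time) start)))

(defparameter *use-shift-divide-p*
  (<= (time-run (lambda (i) (ash i -3)) 1000000)
      (time-run (lambda (i) (truncate i 8)) 1000000))
  "True when the shift-based divide-by-8 timed faster on this machine.")
```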
1:26:13
phantomics
Pretty simple, I was envisioning a big table of values for each operation. Though with my loop-fusion system such a technique may only matter for single-stage transformations of big arrays
1:46:03
lagash
I'm looking into coding some sort of parsing / generating library for LDraw (standard file format for LEGO designs) and I'm not entirely sure where to start. I've done stuff with PEGs and other parsing grammars before, but in other languages. What would be something I should look into for bidirectional LDraw<->Sexp?
1:53:31
phantomics
lagash: You could take a look at https://github.com/eugeneia/maxpc for building parsers, be advised it's AGPL-licensed though
1:59:19
lagash
I'm not too enthused about it not implementing packrat parsing as well as being AGPL'd
2:05:53
phantomics
As far as writing back to LDraw it looks like any option would involve significant work, I see the format is unique and not based on XML or anything like that
2:07:07
phantomics
I see there's https://github.com/JoshTheDerf/LDRParser that can convert LDR -to- JSON, but converting from JSON to LDR is still a todo; if that were implemented you could generate JSON from CL to give to it
2:12:13
lagash
And I meant more that the parser and the writer be integrated together, versus having two libraries: one for parsing, the other for writing / generating