freenode/#sicl - IRC Chatlog

19:36:34 shka_ lol

19:37:11 shka_ i spent so much time on this damn shuffle that i was very surprised when it turned out that merge works correctly on the first attempt

19:37:32 shka_ what kind of treachery is this!

19:43:06 jcowan beach: rereading the Big SICL Book (at 443 pages it is a book) now

19:43:39 shka_ there is something bigger?

19:43:45 shka_ jcowan: link please?

19:44:20 jcowan sorry, 224 pages (brain fart); http://metamodular.com/sicl.pdf

19:44:30 jcowan still a book

19:45:51 shka_ ok

19:46:37 jcowan anyway, I have two new concerns: I think 8-bit and indeed 16-bit subtypes of string are indeed important, since a huge fraction of all text is within the Latin-1 repertoire and almost all of it is in the Plane 0 repertoire

19:46:48 jcowan I am not talking about UTF-8 or UTF-16 encoding here at all
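
A minimal Python sketch of the distinction jcowan draws here: the width is chosen per string from its largest code point, so every character occupies the same number of bits, unlike UTF-8 or UTF-16. The function name `narrowest_element_width` is hypothetical and is not taken from SICL.

```python
def narrowest_element_width(text: str) -> int:
    """Return 8, 16, or 32: the smallest fixed width in bits per character."""
    # Largest code point in the string; an empty string fits in 8 bits.
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 8   # Latin-1 repertoire
    if highest <= 0xFFFF:
        return 16  # Plane 0 repertoire
    return 32      # at least one character outside Plane 0


print(narrowest_element_width("café"))    # 8
print(narrowest_element_width("日本語"))  # 16
print(narrowest_element_width("😀"))      # 32
```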

19:48:55 jcowan second, I find the idea of 27-bit single-floats on 32-bit platforms to be a bad one, for two reasons: (a) rounding from a 32-bit computed result to a 27-bit stored one will have to be performed carefully and without hardware assist;

19:49:19 jcowan (b) I think it will be very surprising that arithmetic on arguments obtained from specialized float32 arrays (which are mentioned in the array chapter) will be done only to 27-bit precision.

0:33:50 fiddlerwoaroof beach, scymtym: on #slime, there's some talk about using eclector as a backend for some slime features

0:47:07 luis At this point, I'm wondering if it makes sense to grab user-defined reader macros from cl:*readtable* into the Eclector readtable to be able to play with code that uses such reader macros or if there's a better strategy. (E.g., do it the other way around and inject bits of Eclector into cl:*readtable*)

0:50:09 luis Also, is there a code walker that takes CSTs as input? It doesn't seem too hard to adapt an existing code walker to use CSTs, but I guess I'm fishing for code walker recommendations as well.

3:45:54 beach Good morning everyone!

3:46:07 no-defun-allowed morning beach!

3:46:40 beach jcowan: I removed any mention of 32-bit platforms for floats, and I removed the suggested non-IEEE formats. I have yet to decide about strings.

3:47:09 beach fiddlerwoaroof: Thanks. I heard rumors of that briefly in the past. Very interesting.

3:49:07 beach luis: The "code walker" I am thinking of for Second Climacs is Cleavir. It takes a CST and converts it to an AST, using a first-class global environment. I am thinking of making an incremental implementation of the first-class global environment protocol so that it would be possible to restart the parser after any top-level form in a file/buffer.

3:49:25 beach But that's probably more work than what you had in mind.

4:50:28 jcowan beach: you might also want to consider providing IEEE 16-bit floats as short-floats, as there is now some hardware support for them on x86_64 (specifically an instruction that properly rounds a 16-bit float to a 32-bit float). Every float16 value has an exact float32 counterpart.
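
The claim that every float16 value has an exact float32 counterpart can be checked exhaustively, since there are only 65536 bit patterns. The sketch below uses Python's `struct` module, whose `e` format is IEEE binary16 and whose `f` format is binary32; the helper names are hypothetical, and NaN patterns are skipped because NaN never compares equal to itself.

```python
import math
import struct


def half_bits_to_float(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def through_float32(x: float) -> float:
    """Round x to binary32 and read the result back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def all_halves_exact_in_float32() -> bool:
    """True if no non-NaN binary16 value changes when stored as binary32."""
    for bits in range(0x10000):
        x = half_bits_to_float(bits)
        if math.isnan(x):
            continue
        if through_float32(x) != x:
            return False
    return True


print(all_halves_exact_in_float32())  # True
```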

4:56:59 beach Good point.

4:57:27 beach Some domains use 16-bit floats I believe. Maybe machine learning?

4:57:59 jcowan yes, absolutely

4:58:07 jcowan they need a big dynamic range but not so much precision

4:59:21 |3b| also graphics

4:59:39 beach jcowan: I see, yes.

4:59:41 no-defun-allowed yeah, there's a lot of research into how many bits evaluation and training of neural networks requires

4:59:46 beach |3b|: Oh, I didn't know that.

4:59:51 beach Makes sense I guess.

4:59:55 no-defun-allowed i think 3 was the lowest i've seen

5:00:22 |3b| interesting light values range from candle to direct sun, but don't need much precision for the direct sun case

5:01:32 no-defun-allowed i don't believe there's hardware support except for custom logic, but 8 bits is the hip thing in NNs now

5:02:54 beach Y'all seem to know a lot. Do you happen to know whether there is an x86 instruction for multiplying two complex double floats?

5:03:34 no-defun-allowed i don't think x86 has complex support natively...

5:03:42 jcowan I doubt it also

5:03:53 beach OK.

5:04:15 no-defun-allowed https://stackoverflow.com/questions/10329903/efficient-complex-arithmetic-in-x86-assembly

5:04:50 jcowan this page looks relevant: https://stackoverflow.com/questions/10329903/efficient-complex-arithmetic-in-x86-assembly

5:05:00 no-defun-allowed admittedly, i don't know much complex arithmetic besides the output of bordeaux-fft so i can't really comment on that

5:05:01 beach Heh, thanks!

5:05:04 no-defun-allowed jcowan: jinks

5:05:22 |3b| yeah, "not explicitly but not hard to do with SIMD extensions" was my guess

5:05:35 beach Good enough.

5:05:55 beach Thanks.
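
For reference, a complex product written out in real arithmetic takes four multiplications, one subtraction, and one addition. The Python sketch below only illustrates that arithmetic, with a hypothetical function name; it does not show any particular SIMD instruction sequence.

```python
def complex_mul(x, y):
    """Multiply two complex numbers given as (real, imaginary) pairs."""
    a, b = x
    c, d = y
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return (a * c - b * d, a * d + b * c)


print(complex_mul((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0)
```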

5:06:06 |3b| ACTION mostly uses GPU for math, but similar answer there

5:06:38 |3b| though possibly not worth it for doubles depending on brand and price of the GPU, since that is a 'pro' feature on some brands :(

5:06:39 beach Oh, I guess the GPU would be ideal for computing with sound signals as well.

5:06:52 |3b| depends on how much latency you need, but yeah

5:07:04 |3b| (how much lack of latency rather)

5:07:11 beach They do single floats mostly, yes?

5:07:32 |3b| yeah, not hard to get tflops of single float on GPU

5:07:52 beach Wow. Very nice.

5:08:10 |3b| 10-15TFLOPS peak on high-end cards

5:08:18 jcowan particularly given the two pipeline stages, one to multiply and one to add

5:08:32 |3b| looks like ~1 on lower-end new standalone GPUs

5:08:33 jcowan on standard x86 chips nowadays

5:09:19 |3b| oddly 16bit float is a 'pro' feature too, since NN like them, so might not be faster than single :/

5:10:11 |3b| card that does 10TFLOPS single does 350GFLOPS double, 180G half

5:11:02 beach That should be plenty to produce a symphony orchestra using additive synthesis.

5:11:38 |3b| yeah, if latency isn't an issue, you can probably do a lot of sound processing on a GPU :)

5:12:13 beach Latency is an issue for real-time sound production. Much more so than for video.

5:12:40 beach Our ears are extremely sensitive to delay and also to small errors in computation.

5:12:41 |3b| right, which is why i mention it :)

5:12:53 beach Yes, thanks. Good to know.

5:13:22 beach I am not planning to use this information any time soon, but it changes how I think about some of my very low-priority projects.

5:14:34 |3b| GPU are optimized for working on large chunks of data at once, so you tend to want to work on longer segments, and you also have latency for transferring to/from GPU memory, and for going through the API
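
A rough way to see why longer segments cost latency: a segment cannot be handed over until all of its samples exist, so its duration is a lower bound on the delay, before any transfer or API overhead. The sample rate and segment sizes below are assumptions chosen for illustration, not figures from the discussion.

```python
def segment_duration_ms(frames: int, sample_rate_hz: int) -> float:
    """Duration of one segment of audio, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


# Assumed 48 kHz sample rate.
print(segment_duration_ms(480, 48000))  # 10.0
print(segment_duration_ms(48, 48000))   # 1.0
```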

5:14:42 beach Another thing that has changed the game is multi-core processors. Producing sound is a highly parallel procedure.

5:15:11 |3b| yeah, CPU are pretty decent at that sort of thing too, especially if you can use their SIMD features

5:15:27 beach Exactly.

5:18:22 |3b| (and just to be clear, when i mentioned latency i mean from starting processing to getting results, so delay when processing a live stream or playing live... once you start though, you should be able to produce a continuous stream without gaps assuming it runs in realtime to start with)

5:18:50 beach I think I understand.

5:20:36 |3b| so good for offline generation/processing of sound, or realtime generation/processing if an initial delay is OK

5:21:00 beach Got it.

5:21:16 |3b| for other cases, might be OK, but i'd suggest to do some tests before investing a bunch of effort into it :)

5:22:38 |3b| (graphics, particularly VR, is doing latencies in the 10s of ms range or less, but doesn't have to copy back to CPU to send to another API for output, or at worst goes through some optimized path in drivers)

5:23:16 |3b| "10s" = "tens" not "ten seconds"

5:24:16 beach I see. Tens of milliseconds would be unacceptable for many sound applications.

5:24:44 |3b| yeah, probably could get into the few ms range, but i'm not sure exactly without actually trying it

5:25:33 beach Like I said, I am not going to do anything about this soon, but I'll keep these things in mind when contemplating future work.

5:25:37 |3b| and also depends on the amount of work, have to be doing a lot of work per sample for a few ms of sound to be a good fit for GPU

5:26:40 beach With additive synthesis, that ought to be the case.
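
A minimal sketch of additive synthesis in plain Python, to show where the work per sample comes from: each output sample is a sum over all partials, and every sample can be computed independently of the others. The function name and the (frequency, amplitude) representation of a partial are assumptions made for illustration.

```python
import math


def additive_synth(partials, sample_rate, num_samples):
    """Sum sinusoidal partials, given as (frequency_hz, amplitude) pairs."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate  # time of sample n in seconds
        samples.append(
            sum(amp * math.sin(2.0 * math.pi * freq * t) for freq, amp in partials)
        )
    return samples


# Three harmonics of 220 Hz with decreasing amplitude, 10 ms at 48 kHz.
tone = additive_synth([(220.0, 0.5), (440.0, 0.25), (660.0, 0.125)], 48000, 480)
print(len(tone))  # 480
```

The cost grows as the number of partials times the sample rate, which is the kind of heavy per-sample workload |3b| describes.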