freenode/#sicl - IRC Chatlog

15:29:30 beach Pretty much.

15:29:35 beach Let me think...

15:30:03 beach The required ones, of course: bit, character.

15:31:05 beach (unsigned-byte 8), (unsigned-byte 32), (unsigned-byte 64), (signed-byte 32), (signed-byte 64)

15:31:22 beach single-float, double-float

15:31:54 beach (complex single-float), (complex double-float)

15:32:09 beach I can't think of any others now.

15:40:07 pfdietz There's that annoying NIL element type as well, that the letter of the standard requires. :)

15:40:46 beach Right.

15:43:29 pfdietz Also, if you support (signed-byte N) and (unsigned-byte N), the spec requires you support (unsigned-byte N-1)

15:43:44 pfdietz (the intersection of those types)
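
The intersection pfdietz mentions can be checked with plain integer ranges. This is a minimal sketch in Python rather than Lisp; the helper names `signed_byte`, `unsigned_byte`, and `intersect` are hypothetical and only model the ranges that the type specifiers denote, not any implementation's upgrading behavior.

```python
def signed_byte(n):
    """Inclusive integer range denoted by (signed-byte n)."""
    return (-(1 << (n - 1)), (1 << (n - 1)) - 1)


def unsigned_byte(n):
    """Inclusive integer range denoted by (unsigned-byte n)."""
    return (0, (1 << n) - 1)


def intersect(a, b):
    """Intersection of two inclusive ranges, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# The values common to (signed-byte 32) and (unsigned-byte 32)
# are exactly the values of (unsigned-byte 31).
n = 32
print(intersect(signed_byte(n), unsigned_byte(n)) == unsigned_byte(n - 1))
```

The same identity holds for any N of at least 2, which is why supporting both specialized types for a given N drags in the N-1 unsigned type under the lattice reading.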

15:44:16 beach I didn't know that. Do you have the relevant Common Lisp HyperSpec page?

15:44:32 pfdietz upgraded-array-element-type

15:44:35 Bike i guess it just follows from the "upgrading has to maintain the lattice" thing

15:44:39 pfdietz Yes

15:44:47 Bike that's pretty annoying

15:45:01 beach I see.

15:45:13 pfdietz No one will complain much if you just document a variance from the standard on that point.

15:45:30 beach It doesn't cost much to comply either.

15:45:43 pfdietz The situation with complex types is less happy. The spec is just inconsistent on upgrading there.

15:50:52 jcowan One additional possibility is fixnum

15:52:18 |3b| ACTION uses signed/unsigned-byte 16 once in a while

15:54:41 pfdietz Also strictly speaking, the "maintains the lattice" thing talks about subtypes, not "recognizable" subtypes. Which in the presence of SATISFIES types means it's undecidable. They probably meant it to be consistent with what the implementation does with SUBTYPEP, but that's not what they wrote.

15:57:03 Bike i'm not sure the requirement is really a good idea anyway

15:57:43 beach jcowan: Yes, fixnum is a possibility.

15:58:52 frodef fixnums, just to avoid type-testing? They tend to be boxed already..

15:59:57 beach It may not be that useful.

16:00:02 beach I would have to think about it.

16:00:11 jcowan That's the counterargument, yes, it doesn't save space but it does save time.

16:00:18 jcowan (relative to s64s0

16:00:31 jcowan s/s64s0/s64 values)

16:01:01 jcowan where you have to check whether you need to return a bignum, etc.

16:01:34 |3b| would you store them tagged or like sb64 with limited range?

16:01:40 jcowan and compared to general vectors it provides a guarantee that the value really is a fixnum.

16:01:42 jcowan Tagged, I would think

16:01:57 |3b| also saves GC compared to general vector

16:02:08 jcowan The other thing I wanted to bring up is immediate doubles via NaNboxing or some variant of it

16:04:01 jcowan with such a design, 32-bit floats are short-floats, 64-bit floats are single, double, and long-floats (unless you want to use arbitrary precision floats as long-floats)

16:05:03 |3b| single-float not being "ieee single precision" would be confusing :p

16:05:26 |3b| ACTION would rather have 16bit short floats anyway :p

16:06:13 jcowan I agree, but CL insisting that default float is always single-float makes it hard to avoid. Every other language I know of defaults to double-float except Fortran

16:06:39 |3b| ah, i guess it is "binary32" now rather than "single"

16:07:46 jcowan Historically the drag of 32-bit systems has held down NaNboxing as a strategy, but with 64-bit (almost) everywhere and all pointers still 48 bits or less (except on Slowlaris), it starts to look like the go-to strategy.

16:13:24 beach How would you encode a fixnum in such a setting?

16:13:48 beach I remember deciding against NaN boxing, but I don't remember why.

16:16:09 jcowan a fixnum would be a NaN with a low tag saying "fixnum", potentially just one bit

16:16:16 jcowan thus allowing 52-bit fixnums

16:17:02 jcowan You can rotate nanboxed values so that pointers look native and doubles have to be adjusted, which is called "punboxing"

16:17:32 beach So that will ruin the property that you can add and subtract fixnums using machine add and sub instructions?

16:17:33 jcowan https://wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations explains it, but incorrectly refers to punboxing as "nunboxing" (which is actually a different idea)

16:17:53 jcowan Not if the low tag for fixnums is 0

16:17:58 pfdietz ARM64 can do something special with the type byte of pointers; does anyone exploit that?

16:18:09 pfdietz top byte

16:18:36 beach jcowan: Won't the exponent-part be destroyed?

16:20:48 jcowan then pointers and fixnums have all high bits 0, and to interpret a double you exchange the high tags 0 and NaN before operating on it

16:21:24 jcowan which is surely cheaper than treating it as a SICL general object

16:21:37 beach For a float? Sure.

16:22:58 jcowan "So, when you choose to do nan-boxing, you basically choose to do one of two things: you favor pointers, or you favor doubles. To favor pointers means that you recognize pointers as having initial (most-significant) 0 bits; if the initial bits are not 0, then it's a double, and you have to add or subtract a bit pattern to get to the double value." --from the above link

16:23:05 jcowan Clearly Lisps need to favor pointers

16:23:27 jcowan where "pointer" here includes fixnums and other immediates distinguished by their low tag.

16:24:24 beach I still don't get it. So the fixnum 0 will be all 0s?

16:24:36 jcowan yes

16:24:47 beach And that's not a valid float?

16:26:18 jcowan Floats are, as I say, encoded such that the all-0s exponent is mapped into the all-1s exponent and vice versa.

16:26:26 frodef /me thinks floats should be the very bottom priority of runtime design.

16:26:31 jcowan this can be achieved by flipping the exponent bigs

16:26:32 jcowan bits
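
The exponent flip jcowan describes can be sketched as follows. This is an illustration in Python, assuming IEEE 754 binary64 and a plain XOR over the 11 exponent bits; the names `box_double`, `unbox_double`, and `exponent_field` are hypothetical, and nothing here is taken from SICL or any existing runtime.

```python
import struct

# The 11 exponent bits of an IEEE 754 binary64 value (bits 52..62).
EXPONENT_MASK = 0x7FF0_0000_0000_0000


def double_to_bits(x):
    """Raw 64-bit pattern of a double, as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_double(bits):
    """Double whose raw 64-bit pattern is the given unsigned integer."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_double(x):
    """Encode a double by flipping its exponent bits (assumed scheme)."""
    return double_to_bits(x) ^ EXPONENT_MASK


def unbox_double(word):
    """Undo the flip; XOR with the same mask is its own inverse."""
    return bits_to_double(word ^ EXPONENT_MASK)


def exponent_field(word):
    """The 11-bit exponent field of a 64-bit word."""
    return (word >> 52) & 0x7FF


# An ordinary double keeps a nonzero exponent field after the flip,
# while 0.0 (all-zero exponent) moves to the all-ones exponent.
print(hex(box_double(1.5)), hex(box_double(0.0)))
# Caveat: under this pure flip, +infinity encodes to the all-zero word.
print(box_double(float("inf")))
```

A word whose high bits are all zero, such as a 48-bit pointer or a low-tagged fixnum, then falls where NaN and infinity patterns used to be. The last line shows one case the discussion does not spell out: +infinity lands on the all-zero word, which is also where the fixnum 0 would sit, so a real design would presumably need a further adjustment.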

16:26:54 frodef ACTION checks if /me works.

16:26:56 jcowan Lisps have historically had very good float performance; today, not so much. IWBN to get back there

16:27:02 beach jcowan: I think I understand.

16:28:05 jcowan You actually want to occupy only the signaling NaN space for pointers/immediates because you do not know which quiet NaN various processor operations will return.

16:29:56 jcowan so with 53 mantissa bits and a 1-bit low tag of 0 for fixnums, you get 52-bit fixnums, which should be enough for practical use

16:30:26 jcowan sorry, 52 mantissa bits, 51-bit fixnums

16:34:27 beach Oh, but wait. How do you detect overflow in fixnum addition and subtraction?

16:34:42 beach You would have to add a comparison for that.

16:34:56 jcowan Yes, you can't use a hardware overflow bit

16:35:07 beach I see, yes.

16:36:21 Bike on the other hand, couldn't you add a bunch of fixnums together without checking overflow until the end, since it'll just fill up some high bits?

16:36:36 beach Sure.
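
The two overflow strategies discussed here can be compared in a small model. This is a sketch under stated assumptions: fixnums are 51 bits wide with a 1-bit low tag of 0, tagged words are modeled as signed Python integers, and how negative fixnums would be packed into the NaN payload is ignored. All names (`tag`, `untag`, `add_checked`, `sum_deferred`) are hypothetical.

```python
FIXNUM_BITS = 51  # assumed width, following the 51-bit figure in the discussion
TAG_BITS = 1      # a single low tag bit, 0 for fixnums
FIXNUM_MIN = -(1 << (FIXNUM_BITS - 1))
FIXNUM_MAX = (1 << (FIXNUM_BITS - 1)) - 1


def tag(n):
    """Tagged word for the fixnum n (the low tag bit is 0)."""
    if not FIXNUM_MIN <= n <= FIXNUM_MAX:
        raise OverflowError("not a fixnum")
    return n << TAG_BITS


def untag(word):
    """Integer value of a tagged word."""
    return word >> TAG_BITS


def fits_fixnum(word):
    """The explicit comparison that stands in for a hardware overflow flag."""
    return FIXNUM_MIN <= untag(word) <= FIXNUM_MAX


def add_checked(a, b):
    """Add two tagged words with a plain add, then range-check the result."""
    result = a + b  # tag bits add as 0 + 0 = 0, so the result is still tagged
    if not fits_fixnum(result):
        raise OverflowError("fixnum overflow")
    return result


def sum_deferred(words):
    """Add many tagged words and range-check only once, at the end."""
    total = 0
    for w in words:
        total += w
    return total, fits_fixnum(total)


total, ok = sum_deferred([tag(FIXNUM_MAX), tag(1), tag(-5)])
print(ok, untag(total) == FIXNUM_MAX - 4)
```

In the last example the intermediate sum leaves the fixnum range but the final result fits, so the deferred check accepts it where a per-step check would have signaled. On a real 64-bit register the deferral is only safe for a bounded number of addends, on the order of 2^12 with these widths, before the register itself could wrap.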

16:37:20 beach Another interesting piece of information would be to determine how many double floats need to be tagged in a typical program that uses floats.

16:38:39 Bike sbcl with high optimization settings notes when it has to do tagging and untagging. i've hit it a few times

16:38:42 Bike nice feature

16:45:20 frodef maybe with nan-boxing it will be easier to include a javascript compiler frontend? :)

16:46:34 Bike i remember hitting it mostly at function boundaries, which is kind of hard to fix while allowing recompilation if floats aren't immediate

16:46:45 Bike i mean you can inline of course

16:50:00 Shinmera beach: 24bit arrays can be important for audio processing

16:50:52 Shinmera and efficient floats are of course very important for a wide array of applications, many of which I personally care about :)

16:51:17 pfdietz What are the implications of making the representations work with GPUs?

16:52:08 Bike for GPUs i'd think the cost of transferring data back and forth is probably going to dominate tagging and untagging costs? dunno...

16:52:49 Shinmera yes, communicating with the GPU in any way quickly becomes the bottleneck

16:53:17 Shinmera both in terms of how much data you push and the frequency of it.

17:27:30 beach Shinmera: I'll think about it. When I did audio research, we decided to use double floats everywhere. They are fast and there is plenty of space these days.

17:29:21 Shinmera beach: really, double floats? single floats are becoming more widely used now and I agree that it's the best choice for processing it, but for storage in end formats, 24 bit integers unfortunately still happen.

17:29:57 beach I see.

17:30:04 jcowan That's the experience of the Big Data platform company I used to work for. The speed of transfer out of the GPU totally obscures any improvement in computation

17:30:43 Shinmera jcowan: GPU still has unique things to watch out for like the fact that branching is expensive again

17:42:59 jcowan Branching is very expensive on the CPU too

17:43:09 jcowan unless it is highly predictable

17:59:11 |3b| GPU are motivation for me wanting 16bit short floats

18:01:27 |3b| though most of the data in those formats is just getting shovelled from disk to GPU without much CPU interaction

18:02:38 |3b| (and also motivation for wanting 16bit specialized arrays, graphics data is still big enough that cutting size in half is nice)

18:41:46 Shinmera |3b|: next you'll want support for the R11FG11FB10F texture format!

21:21:34 Bike beach: cleavir loses information about when block/tagbody dynamic extents are exited. i think we could add an invalidate-catch instruction of some sort, and then have ast-to-hir maintain a stack of contexts so that we can generate the appropriate invalidations whenever we exit a block, without too much trouble

21:21:45 Bike cleavir loses information -> AST-to-HIR loses information, really