freenode/#lisp - IRC Chatlog

17:54:34 phantomics Hey, a question: in CL implementations supporting Unicode, does (char-code) always give the Unicode code point for a character?

17:56:47 Bike usually. obviously there's no standard.

17:58:29 Bike and of course with how unicode works there's not exactly a 1-1 mapping there

17:58:46 Bike in my sbcl, "̀a" has length 2

18:03:20 phantomics Thanks Bike, combining characters can be confusing like that

18:03:48 pyc what do you normally do for unicode support? do you code for a specific CL implementation? or is there a portability layer available for it too?

18:04:41 Bike http://edicl.github.io/cl-unicode/

18:07:04 phantomics I looked at cl-unicode but it doesn't seem to have a function that gets the code point for a given character

18:08:03 phantomics The lowercase-mapping and uppercase-mapping functions can get a char's code point, but letters have their case changed; there's no function I can see that just gets the code point

18:14:09 Bike the docs here imply that it assumes char-code is the code point

18:15:52 phantomics Ok, they must be trusting current implementations to do that

18:18:25 aeth other interpretations of what char-code and characters are in an implementation with Unicode wouldn't really be standards-compliant afaik.

18:18:40 aeth I guess unless they wanted to randomly shuffle which char-code corresponds to which thing in unicode for no reason

18:19:41 aeth So I think the main risk with char-code/code-char is that the implementation is using something other than Unicode.
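
The char-code discussion above can be checked at a REPL. The following is a minimal sketch, assuming an implementation whose characters are Unicode code points (which the standard does not guarantee). The helper names `code-points` and `plausibly-unicode-p` are hypothetical and not part of any library mentioned in the log, and the second one is only a rough heuristic.

```lisp
;; Sketch only: assumes characters are Unicode code points.

(defun code-points (string)
  "Return the list of char-codes of STRING, one per character."
  (map 'list #'char-code string))

(defun plausibly-unicode-p ()
  "Rough heuristic (hypothetical helper): the code space covers at least
the Basic Multilingual Plane and ASCII letters sit where Unicode puts them."
  (and (> char-code-limit #xFFFF)
       (= (char-code #\A) #x41)))

;; U+0300 COMBINING GRAVE ACCENT followed by the letter a, built with
;; code-char so the example does not depend on source file encoding.
(defparameter *grave-a*
  (coerce (list (code-char #x300) #\a) 'string))

(length *grave-a*)      ; => 2, each code point is its own character
(code-points *grave-a*) ; => (768 97)
```

The length of 2 is the same effect Bike reports: a combining character and its base letter are two characters as far as `length` and `char` are concerned, even though they render as one glyph.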

18:20:42 phantomics My concern is for April, which you need Unicode to use in the first place

18:22:04 aeth I think the main issue is that sb-unicode has a bunch of useful things that no portable library has. cl-unicode gets you only some of it, with a much worse API (and probably much slower performance on SBCL, too)

18:25:37 aeth So my Scheme will only handle Unicode 100% correctly on SBCL until someone resolves this issue. I set out to make a Scheme, not a huge Unicode library.

18:28:58 phantomics That's annoying

18:29:21 aeth That is, I use babel (for UTF8<->strings) and cl-unicode outside of SBCL, and I use SBCL's libraries for SBCL. Some things are only fully conforming on SBCL, namely where cl-unicode doesn't have a clear alternative to a thing in sb-unicode. And technically you can always use babel for the part that babel does, but that'll just hurt you on benchmarks since SBCL's internal octet conversion is faster.

18:29:57 aeth It's annoying, but it's only temporary. It will be resolved by someone later on. We had to work around floating point for a long time before float-features was released.

18:30:38 aeth (Now there's a bunch of libraries that are still only efficient with floating point in SBCL when they should be moving to float-features:with-float-traps-masked instead.)
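
The per-implementation split aeth describes (SBCL's built-in octet conversion on SBCL, babel elsewhere) might look roughly like the sketch below. The wrapper names `utf8-octets` and `utf8-string` are hypothetical, and outside SBCL the code assumes babel has already been loaded. It is not taken from aeth's project.

```lisp
;; Sketch only: reader conditionals pick SBCL's converters on SBCL and
;; fall back to babel (assumed to be loaded) everywhere else.

(defun utf8-octets (string)
  "Encode STRING as a vector of UTF-8 octets."
  #+sbcl (sb-ext:string-to-octets string :external-format :utf-8)
  #-sbcl (babel:string-to-octets string :encoding :utf-8))

(defun utf8-string (octets)
  "Decode a vector of UTF-8 OCTETS into a string."
  #+sbcl (sb-ext:octets-to-string octets :external-format :utf-8)
  #-sbcl (babel:octets-to-string octets :encoding :utf-8))

;; Round trip through both wrappers.
(utf8-string (utf8-octets "abc")) ; => "abc"
```

Keeping the conditionals inside two small wrappers means the rest of the code never mentions either library directly, so a future portable replacement would only need to change these two functions.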

21:07:11 shoshin5 ** NICK shoshin

22:37:00 Xach ACTION thinks some more

22:56:15 jasom aeth: I had a portable version of sb-unicode at one point

22:56:40 jasom aeth: but I do think it required that the char-code be the code-point

22:57:13 notzmv ** NICK Guest73794

22:58:17 Bike you have to make _some_ assumption, don't you? that char codes match, or char names match, or something

23:05:43 jasom well I just tried it on ccl and confusable-p worked, so it hasn't completely bit-rotted

23:05:50 jasom probably needs updated input files

23:08:05 Bike my utf8string thing assumed char-codes matched, i think...

23:08:17 Bike yeah.

23:09:21 jasom It did not work on abcl: (psb-unicode:confusable-p "pa" "р𝝰") ; => NIL

23:10:25 jasom (char "р𝝰" 1) ;=> #\?

23:11:10 jasom so that just looks like not accepting utf-8 from repl

23:11:14 Bike oh no

23:12:23 jasom but if I stuff an actual greek alpha in there it returns true

23:13:53 zmv- ** NICK notzmv

23:13:53 jasom "р𝝰" ;=> "р?��"

23:17:38 jasom (psb-unicode:confusable-p "p" (babel:octets-to-string (make-array 2 :element-type '(unsigned-byte 8) :initial-contents #(#xcf #x81)))) ; => T

23:20:21 jasom oh, clisp found some non-portable LOOP forms

23:22:13 jasom but that simple smoketest passed on ccl, clisp, and abcl

0:45:36 X-Scale` ** NICK X-Scale

1:26:56 nitrix_ ** NICK nitrix

2:30:02 sz0_ ** NICK sz0

2:37:47 zmv ** NICK Guest2945

2:42:55 Lord_of_Life_ ** NICK Lord_of_Life

3:19:39 susam Good morning everyone!

4:11:24 beach Good morning everyone!

4:14:14 mfiano Hello. I would like some help constructing a particular type specifier

4:16:15 mfiano I'm wondering if such a type declaration would be possible that satisfies the constraint mentioned in the comment: https://gist.github.com/mfiano/bab595782c93421cf8a97671d1e6d30f

4:32:59 Bike so ub8a is short for a simple-array of (unsigned-byte 8), and the optional parameter controls the dimension specification?

4:33:51 mfiano Yes

4:33:54 Bike the fact that you want to treat a bare integer as indicating a single-dimensional array, rather than as a rank, kind of complicates it. without that it would just be `(simple-array (unsigned-byte 8) ,length)

4:34:17 Bike with that, i suppose ,(if (integerp length) `(,length) ,length)?

4:34:40 Bike length should default to * which is what you want

4:34:57 Bike so, ub8a by itself expands to (simple-array (unsigned-byte 8) *)

4:35:47 mfiano I suppose that's good enough. Thank you

4:36:17 moon-child perhaps `(if (integerp ,length) (list ,length) ,length) ?

4:36:35 Bike well you don't want an if in the expansion

4:36:37 moon-child so length can be an arbitrary expression

4:36:48 Bike though i did screw it up, i meant ,(if (integerp length) `(,length) length)
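
Putting Bike's corrected expression into a full definition gives something like the sketch below. The name `ub8a` and the parameter `length` come from the discussion. The exact constraint in mfiano's gist is not quoted in the log, so this assumes it is the one Bike restates: a bare integer means a one-dimensional array of that length, not a rank.

```lisp
;; Sketch only, based on the expression Bike settles on above.
;; In DEFTYPE an &optional parameter with no init form defaults to the
;; symbol *, so a bare UB8A leaves the dimensions unconstrained.
(deftype ub8a (&optional length)
  `(simple-array (unsigned-byte 8)
                 ;; The IF runs at expansion time: wrap a bare integer in a
                 ;; list so it is read as a dimension list, not as a rank.
                 ,(if (integerp length) `(,length) length)))

(typep (make-array 4 :element-type '(unsigned-byte 8)) '(ub8a 4))          ; => T
(typep (make-array 4 :element-type '(unsigned-byte 8)) '(ub8a 5))          ; => NIL
(typep (make-array '(2 2) :element-type '(unsigned-byte 8)) '(ub8a (2 2))) ; => T
```

Because the `if` is evaluated while the type specifier is being expanded, the resulting specifier contains only the final dimension spec, which is why Bike does not want an `if` form in the expansion itself.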