freenode/#lisp - IRC Chatlog
17:54:34
phantomics
Hey, a question: in CL implementations supporting Unicode, does (char-code) always give the Unicode code point for a character?
18:03:48
pyc
what do you normally do for unicode support? do you code for a specific CL implementation? or is there a portability layer available for it too?
18:07:04
phantomics
I looked at cl-unicode but it doesn't seem to have a function that gets the code point for a given character
18:08:03
phantomics
The lowercase-mapping and uppercase-mapping functions can get a char's code point, but letters have their case changed; there's no function I can see that just gets the code point
18:18:25
aeth
other interpretations of what char-code and characters are in an implementation with Unicode wouldn't really be standards-compliant afaik.
18:18:40
aeth
I guess unless they wanted to randomly shuffle which char-code corresponds to which thing in unicode for no reason
18:19:41
aeth
So I think the main risk with char-code/code-char is that the implementation is using something other than Unicode.
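A minimal sketch of the sanity check implied here, assuming an implementation whose character codes are Unicode code points (this holds on SBCL and CCL, but the standard does not require it). The function names are hypothetical, and the second function is only a heuristic, not a proof that the implementation uses Unicode.

```lisp
;; Hypothetical helpers; not part of any library mentioned in the log.

(defun code-point-round-trip-p (char)
  "Return true if CHAR survives a CHAR-CODE / CODE-CHAR round trip."
  (let ((back (code-char (char-code char))))
    ;; CODE-CHAR may return NIL when no character has the given code.
    (and back (char= char back))))

(defun plausibly-unicode-p ()
  "Heuristic only: the code space covers all of Unicode and #\\A sits at 65."
  (and (>= char-code-limit #x110000)
       (= (char-code #\A) 65)))
```

If `plausibly-unicode-p` returns false, treating the result of `char-code` as a code point is probably unsafe on that implementation.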
18:22:04
aeth
I think the main issue is that sb-unicode has a bunch of useful things that no portable library has. cl-unicode gets you only some of it, with a much worse API (and probably much slower performance on SBCL, too)
18:25:37
aeth
So my Scheme will only handle Unicode 100% correctly on SBCL until someone resolves this issue. I set out to make a Scheme, not a huge Unicode library.
18:29:21
aeth
That is, I use babel (for UTF8<->strings) and cl-unicode outside of SBCL, and I use SBCL's libraries for SBCL. Some things only work fully conforming on SBCL if cl-unicode doesn't have a clear alternative to a thing in sb-unicode. And technically you can always use babel for the part that babel does, but that'll just hurt you on benchmarks since SBCL's internal octet conversion is faster.
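A minimal sketch of that split, using reader conditionals to pick SBCL's built-in octet conversion on SBCL and babel elsewhere. The wrapper names are hypothetical, and the sketch assumes the babel system is already loaded on non-SBCL implementations.

```lisp
;; Hypothetical wrappers. On SBCL the babel forms are skipped by the reader,
;; and on other implementations the sb-ext forms are skipped.

(defun utf8-octets-to-string (octets)
  "Decode a vector of (unsigned-byte 8) as UTF-8."
  #+sbcl (sb-ext:octets-to-string octets :external-format :utf-8)
  #-sbcl (babel:octets-to-string octets :encoding :utf-8))

(defun string-to-utf8-octets (string)
  "Encode STRING as a vector of UTF-8 octets."
  #+sbcl (sb-ext:string-to-octets string :external-format :utf-8)
  #-sbcl (babel:string-to-octets string :encoding :utf-8))
```

Keeping the conditional inside two small wrappers means the rest of the code never has to mention either package directly.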
18:29:57
aeth
It's annoying, but it's only temporary. It will be resolved by someone later on. We had to work around floating point for a long time before float-features was released.
18:30:38
aeth
(Now there's a bunch of libraries that are still only efficient with floating point in SBCL when they should be moving to float-features:with-float-traps-masked instead.)
22:58:17
Bike
you have to make _some_ assumption, don't you? that char codes match, or char names match, or something
23:05:43
jasom
well I just tried it on ccl and confusable-p worked, so it hasn't completely bit-rotted
23:17:38
jasom
(psb-unicode:confusable-p "p" (babel:octets-to-string (make-array 2 :element-type '(unsigned-byte 8) :initial-contents #(#xcf #x81)))) ; => T
4:16:15
mfiano
I'm wondering if such a type declaration would be possible that satisfies the constraint mentioned in the comment: https://gist.github.com/mfiano/bab595782c93421cf8a97671d1e6d30f
4:32:59
Bike
so ub8a is short for a simple-array of (unsigned-byte 8), and the optional parameter controls the dimension specification?
4:33:54
Bike
the fact that you want to treat a bare integer as indicating a single-dimensional array, rather than as a rank, kind of complicates it. without that it would just be `(simple-array (unsigned-byte 8) ,length)
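A minimal sketch of the kind of deftype being described, assuming the goal is only what Bike states: a bare integer means the length of a one-dimensional array, and anything else is passed through as an ordinary dimension specification. The gist's actual constraint is not quoted in the log, so this may not match it.

```lisp
;; Assumption: UB8A takes one optional argument, as Bike describes.
(deftype ub8a (&optional (dimensions '*))
  "Simple array of (unsigned-byte 8). A bare integer N means a vector of length N."
  `(simple-array (unsigned-byte 8)
                 ,(if (integerp dimensions)
                      (list dimensions)   ; 4      -> (4), a vector of length 4
                      dimensions)))       ; (2 2)  -> (2 2); * -> any dimensions
```

With this definition `(ub8a 4)` names vectors of exactly four octets, `(ub8a (2 2))` names 2x2 octet arrays, and a plain `ub8a` accepts any simple octet array.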