freenode/#clim - IRC Chatlog

1:02:32 loke jackdaniel: Did you forget a link to the sketch?

2:27:42 beach Good morning everyone!

2:51:07 7IZAAM9GH morning beach

2:53:15 slyrus1 that's better. morning beach.

2:55:42 beach Wow, yes indeed.

2:57:55 slyrus1 perhaps one day I'll go back to beirc and that crap won't happen, but beirc has its own (lack of) stability issues.

2:59:20 beach I'll consider using it when it acquires abbrevs and a spell checker.

3:04:27 loke slyrus: which irc client are you using?

3:09:43 slyrus1 thunderbird's chatzilla extension

4:32:22 slyrus1 OK, now PR up for the PDF ellipse stuff.

4:42:48 slyrus1 So back to what I was saying this morning... If I define a presentation-method on a presentation-type that does not have a corresponding CLOS class, we get the "Cannot find type for specializer" style-warnings. Is this something we should try to fix in define-presentation-method or should we fix all of our code such that there are CLOS classes for our presentation-types?

4:43:02 loke slyrus1: actually...

4:43:26 loke slyrus1: I think that only happens if you define the presentation-type and use it in the same file

4:43:52 loke try putting the define-presentation-type in a different file that is loaded before.

4:43:55 slyrus1 That could be -- but it doesn't happen if you have the corresponding CLOS class, as far as I can tell.

4:44:42 loke slyrus1: It does.

4:44:46 loke happens to me all the time.

4:45:04 loke I usually put the declarations in EVAL-WHEN to get around it.

4:49:29 slyrus1 Hmm... I've got a case here where it happens reliably without the defclass definition and no error with the defclass definition.

4:49:55 slyrus1 but, nevertheless, we should be able to define presentation types without the CLOS class, from what I gather the spec is saying.

4:58:53 slyrus1 I think the :compile-toplevel and :load-toplevel distinctions between record-presentation-type and %define-presentation-type are bogus.

5:03:48 jackdaniel slyrus1: we are, loke is right with eval-when

5:04:06 jackdaniel the gist of this behavior is this: we need to be able to attach to CLOS class of that name if it exists

5:04:46 slyrus1 OK, but the eval-when stuff should be handled under the covers by the define-presentation-type stuff, no?

5:04:55 jackdaniel that's why we expand it differently. if it does not exist (which may only be determined at load time), then we have a presentation type without a class backing it

5:05:14 jackdaniel well, sure, that could be improved if possible

5:06:05 jackdaniel I'm just saying that it is not bogus, just a little tedious (like defconstant on ccl)

5:06:28 jackdaniel loke: you are right, I forgot the link

5:06:36 jackdaniel https://github.com/McCLIM/McCLIM/commit/91d0279165ed7359393b0ad249cb6206d1d6e2b5#diff-9c90cd104f8f00c4e1074bb8b2d676cf

5:07:31 jackdaniel in the meantime I've added climb:font-face, climb:font-size and climb:font-fixed-width

5:10:36 loke jackdaniel: Is FONT-CHARACTER-WIDTH really necessary? A character is often not representable as a glyph (i.e. it doesn't have a “width” on its own). Usually, in Unicode, a smallest displayable unit is what is called a “grapheme cluster”

5:11:01 loke It would be more consistent to just settle with the STRING-WIDTH

5:11:29 jackdaniel loke: font-character-width is a mapping from text-style-character-width

5:12:13 loke jackdaniel: my argument against that one is the same, so can't that one simply map to FONT-STRING-WIDTH (with a single-character string)?

5:12:17 jackdaniel also font-string-width goes by advance-width, while character-width takes the left and right bearing

5:12:36 jackdaniel that said, default method is defined for font-character-width

5:13:00 loke jackdaniel: But that's very Latin-centric. In most other languages, things are much more complex.

5:13:33 jackdaniel in non-latin languages there is no such thing as the width of a standalone character?

5:13:55 loke Most of the time, you can't talk about a single “character” as something that has a visual representation. The way it's displayed depends on the other unicode characters that surround it (or are attached to it)

5:14:00 loke jackdaniel: Correct.

5:14:17 loke jackdaniel: Many languages don't even have a concept of a standalone character.

5:14:43 jackdaniel character width is meant for standalone characters. if there isn't such a thing, then you just won't call font-character-width

5:14:49 jackdaniel because you won't be able to pass the second argument

5:14:58 loke Or, they kinda do, but the “characters” in such languages are not represented by a single Unicode codepoint.

5:14:59 jackdaniel I'm not sure where the controversy comes from?

5:15:36 jackdaniel it is like complaining that you can't call (char string 1), because some languages don't have characters (?)

5:16:16 jackdaniel there is a property in ttf for the glyph bounding box, I'm sure these languages are representable in this format

5:16:17 loke Hang on, there is a Unicode TR that describes it...

5:16:21 jackdaniel language alphabets*

5:16:33 loke OK, look up how Hangul Jamos work

5:16:43 loke I think it's at least aprtically covered here:

5:16:48 loke http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries

5:16:55 loke “partially”

5:17:30 loke The first paragraph explains what I was trying to say:

5:17:34 loke “It is important to recognize that what the user thinks of as a “character”, a basic unit of a writing system for a language, may not be just a single Unicode code point. Instead, that basic unit may be made up of multiple Unicode code points. To avoid ambiguity with the computer use of the term character, this is called a user-perceived character. For example, “G” + grave-accent is a user-perceived character: users think of it as a single

5:17:34 loke character, yet is actually represented by two Unicode code points. These user-perceived characters are approximated by what is called a grapheme cluster, which can be determined programmatically.”
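
The quoted paragraph can be checked from a Lisp REPL. The following is a minimal sketch, assuming a Unicode-aware implementation (such as SBCL or CCL) in which a character is a single code point. The helpers COMBINING-MARK-P and NAIVE-CLUSTERS are hypothetical names, and the grouping rule only covers the Combining Diacritical Marks block (U+0300..U+036F); it is a crude stand-in for the real UAX #29 rules, not an implementation of them.

```lisp
;; Assumes CODE-CHAR accepts code points above 255 (true on SBCL, CCL).

(defun combining-mark-p (char)
  "True only for the Combining Diacritical Marks block, U+0300..U+036F."
  (<= #x300 (char-code char) #x36F))

(defun naive-clusters (string)
  "Split STRING into substrings, each a base character followed by the
combining marks attached to it.  Hypothetical helper, not UAX #29."
  (let ((clusters '())
        (start 0))
    ;; Every character that is not a combining mark starts a new cluster.
    (loop for i from 1 below (length string)
          unless (combining-mark-p (char string i))
            do (push (subseq string start i) clusters)
               (setf start i))
    ;; Flush the final cluster, if the string was not empty.
    (when (< start (length string))
      (push (subseq string start) clusters))
    (nreverse clusters)))

;; "G" followed by U+0300 COMBINING GRAVE ACCENT: one user-perceived
;; character, but two Lisp characters.
(defparameter *g-grave* (coerce (list #\G (code-char #x300)) 'string))
```

Here (length *g-grave*) counts two characters, while (naive-clusters *g-grave*) returns a single cluster, which is the gap between "character" and "grapheme cluster" being argued about.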

5:17:49 loke Ooops... sorry for the big paste

5:18:43 jackdaniel loke: I believe you that in some alphabets calling font-character-width doesn't make much sense, but it does for what we perceive as a character in common lisp

5:19:09 loke jackdaniel: I argue it doesn't.

5:19:25 loke Because a character in almost (all?) CL implementations is nothing more than a single Unicode codepoint.

5:19:36 jackdaniel and I really don't see reason, why we shouldn't include it, especially that we have text-style-character-width which *is* in clim standard

5:19:57 jackdaniel loke: so it will work for all characters which have a single unicode codepoint - that's good enough for me

5:20:06 jackdaniel there is plenty of characters like that

5:20:16 loke It is, because CLIM was entirely latin-centric and had no concept of combining characters, or anything of the kind.

5:20:28 loke jackdaniel: Good enough for you, but it won't even work for swedish.

5:20:54 jackdaniel so glyphs in swedish do not have a glyph width? or glyph left/right bearings?

5:20:58 jackdaniel I can't believe that

5:21:30 loke jackdaniel: Swedish has single characters made up of multiple code points.

5:22:09 jackdaniel and do they have representing glyphs?

5:22:28 loke The point is, that Unicode is complicated, but this is a well-researched topic and there are solutions that work. Implementing a new API to support characters and fonts and ignoring these topics is not the right thing to do.

5:23:04 loke jackdaniel: The smallest single unit of a “measurable” thing that you can draw is called a “grapheme cluster”. Anything smaller makes no sense.

5:23:06 jackdaniel implementing api which supports specification is the right thing to do - in contrary to saying: this part of the spec is not implemented, because it is too latin-centric

5:23:25 loke In Latin, all grapheme clusters can be represented by a single unicode codepoint, but this is an exception.

5:24:13 jackdaniel also, nothing prevents specializing this method on a character which is made of multiple codepoints, what really matters here is a glyph

5:25:12 loke jackdaniel: and how do you represent a glyph?

5:25:33 jackdaniel in truetype? as a structure. hash code is a number

5:25:52 loke I know that. I mean in the Lisp API.

5:25:56 jackdaniel in fact, codes contain two glyphs already because of kerning

5:26:27 jackdaniel I believe lisp api does not have concept of glyph, only of graphic character

5:27:08 loke jackdaniel: Right, and Unicode has one way of doing that, which is as a string of codepoints.

5:27:31 jackdaniel OK, and what part of this api prevents of using it that way?

5:27:33 loke However, as you say, that's not enough to uniquely identify a specific glyph from a font.

5:28:19 jackdaniel we are interested in character width, if you represent characters as standard-class objects - nothing prevents you from doing that

5:28:21 loke So are you saying that the reference to ‘character’ in FONT-CHARACTER-WIDTH refers to an abstract object that somehow describes a specific glyph in a font definition?

5:29:19 jackdaniel sure, it is not specialized whatsoever. I'm not saying I'm going to implement it for strings of codepoints, but nothing prevents such specialization

5:29:40 jackdaniel such specialization on string would signal an error for something what is not a single character

5:30:07 loke Because if that's the case, then I think my complaint reduces to the choice of words. “character” is ambiguous, and it should probably be renamed to GLYPH-SPECIFICATION or something like that. What do you think?

5:30:26 jackdaniel no, because it is meant to map well on text-styles

5:31:14 jackdaniel hm, was I mistaken? I can't find text-style-character-width in the spec

5:31:20 loke jackdaniel: the thing is that in font files, glyphs are numbered, and for basic-latin those glyphs map to the corresponding codepoint, but that's not necessarily the case, and is definitely not the case for most other languages.

5:31:33 jackdaniel either way, for glyph specification you have climb:font-glyph-* gfs

5:31:58 loke So while you can get away with using the glyph index and codepoints interchangeably for many european languages, that simply doesn't work elsewhere.

5:32:26 jackdaniel this api does not hint any of that, does it?

5:32:35 loke so in FONT-GLYPH-WIDTH for example, the argument CODE refers to a glyph index, not a unicode codepoint?

5:32:53 jackdaniel again, font-specific concept, not defined by this protocol

5:33:06 jackdaniel in ttf it is a cache index

5:33:17 loke When I read the document, I read references to both CODE and CHARACTER as referring to unicode codepoints. If that is not correct, then most of my complaints are invalid.

5:33:22 jackdaniel and cache index is composed of two consecutive character codes

5:34:04 jackdaniel this api is general enough that all these may be standard-classes with mixins whatsoever

5:45:22 loke jackdaniel: Is the FONT object also an abstract concept?

5:45:59 jackdaniel yes, it is opaque but should be specializable

5:46:25 jackdaniel so it can't be a list (:my-coolvetica "latin" "bold" 13)

5:46:46 loke jackdaniel: then what is needed is a way to transform a string into a sequence of glyph codes.

5:47:04 loke In the Freetype backend, that is what Harfbuzz does.

5:47:32 loke For latin, you can just use IDENTITY

5:47:44 loke (maybe?)

5:50:34 jackdaniel I was thinking about adding method generate-glyph to this protocol, but I've rejected the idea because it is very specific to the font used

5:50:42 jackdaniel otoh we have font-glyph-* functions

5:50:50 jackdaniel so we expect, that some code may be obtained

5:51:02 jackdaniel I'll think about it

5:51:45 jackdaniel regarding latin and identity - for a simpleton implementation it could be true, but for instance ttf has a separate code for each letter pair so kerning may be done faster

5:52:03 jackdaniel we don't need to probe kerning-offset each time, it is already part of the advance-width/advance-height of the glyph

5:52:37 loke jackdaniel: to expand a little... Coming at this from the outside, you have a string, containing a sequence of Unicode codepoints representing some text. The system needs to take this text, and convert this into individual things that can be drawn on specific locations on the screen. These “things” are glyphs that come from font files. The coordinates where to draw said glyphs are computed using some algorithm.

5:53:20 loke The algorithm can be very simplistic, as is the case with monospace latin fonts.

5:53:38 loke Or it can be highly complex, as is the case with arabic or hindi.

5:54:41 loke But that's really all there is to it. Boiled down to its essentials, all you need is a way to transform a string into a sequence of glyphs along with the coordinates where those glyphs should be placed relative to each other.
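
For the simplistic end of that spectrum (a monospace Latin font), the string-to-positioned-glyphs transformation might look like the sketch below. LAYOUT-MONOSPACE is a hypothetical name, not part of McCLIM or the proposed climb: protocol, and it assumes that the glyph code equals the character code and that every glyph has the same advance. Neither assumption holds for scripts that need real shaping, which is where something like Harfbuzz comes in.

```lisp
(defun layout-monospace (string advance)
  "Return a list of (GLYPH-CODE X Y) entries placing STRING on one baseline.
Assumes glyph code = character code and a fixed ADVANCE per glyph."
  (loop for char across string
        for x from 0 by advance        ; pen position before each glyph
        collect (list (char-code char) x 0)))
```

Calling (layout-monospace "ab" 8) gives one entry per character, with the second glyph placed 8 units to the right of the first on the same baseline.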

5:55:27 jackdaniel your perspective is very freetype-centric

5:55:52 loke jackdaniel: Yes.

5:56:02 loke But it's also as generic as can be.

5:56:08 jackdaniel for instance, in truetype you calculate only the first coordinate of the baseline and render the whole set of glyphs and xrender uses advance-width/height

5:56:54 jackdaniel *also* some backend may handle fonts on its own side, so positioning etc is not handled on lisp side at all, you may simply throw a string into its jaws

5:57:22 jackdaniel so these gfs are for us to figure some characteristics of the font, but not to prepare drawing ourself

5:57:30 loke jackdaniel: Actually, xrender is too specific. Xrender associates an “advance” with each glyph. This doesn't actually work, since the advance can be different for the same glyph depending on context. That's why I had to render each character individually instead of using render-draw-glyphs.

5:57:53 loke So I set the advance to 0 and manually reposition the next position based on context.

5:58:12 jackdaniel same glyphs may have different indexes depending on a context (and different properties)

5:58:42 jackdaniel that's why I say you take perspective which is too arbitrary-implementation-specific

5:58:57 loke jackdaniel: Hmm... we might be using different terminology. To me they are different glyphs since they have different glyph indexes in the font file.

5:59:37 loke But, you can have literally the same glyph index in the font file, but you have to use different advance depending on the context in which the character appears.

5:59:55 jackdaniel sure, and glyphs that have the same index in the font file may have different indexes in the drawn string

6:00:02 loke (I know the font has an advance attached to the glyph, but they can't be used as-is)

6:00:04 jackdaniel that's exactly what I do in truetype renderer

6:00:22 jackdaniel for instance glyph for letter A (first letter of alphabet)

6:00:37 jackdaniel may have many different entries in glyphset depending on the context

6:01:02 jackdaniel I take letter pairs, so indexes are [nextchar<<16 + thischar] (or something like that)
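
The pair indexing described here can be sketched as follows. This is an illustration of the stated formula only (jackdaniel hedges it with "or something like that"), not the actual McCLIM TrueType renderer code. PAIR-INDEX and PAIR-INDEX-CODES are hypothetical names, and the decoding step assumes that the first character's code fits in 16 bits, which is not true for every Unicode code point.

```lisp
(defun pair-index (this-char next-char)
  "Combine two character codes into one cache index: next<<16 + this."
  (logior (ash (char-code next-char) 16)
          (char-code this-char)))

(defun pair-index-codes (index)
  "Recover (values THIS-CODE NEXT-CODE) from INDEX.
Only valid when THIS-CODE was below #x10000."
  (values (ldb (byte 16 0) index)   ; low 16 bits: this character
          (ash index -16)))         ; remaining bits: next character
```

With this scheme the letter A followed by V and the letter A followed by T get different indexes, so each entry can carry its own kerned advance.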

6:01:12 loke jackdaniel: Yes, but even if you have the exact same entry in the glyphset you might still end up with different advances and offsets.

6:01:14 jackdaniel and advancement is different for different A's

6:01:28 slyrus1 loke: what kind of eval-when hackery do you use to get around the aforementioned warning?

6:01:48 jackdaniel my point is that you create different entry in glyphset for glyphs with different advances

6:01:59 loke jackdaniel: but that's not how it works.

6:02:04 jackdaniel but, that's irrelevant

6:02:22 jackdaniel nothing prevents fonts to have their own structure underneath

6:02:35 loke jackdaniel: If you want to be able to render Hindi, you cannot simply take a sequence of glyph entry indexes and be able to draw it.

6:02:44 loke jackdaniel: Oh wait...

6:03:02 loke jackdaniel: When you say “font”, you're not referring to, like, an OTF font or whatever?

6:03:23 jackdaniel no, I'm referring to lisp object

6:03:26 loke jackdaniel: So are you saying that I would put the Harfbuzz magic as part of the “font” itself?

6:03:40 jackdaniel our protocol works on lisp objects, doesn't it?

6:03:55 loke so from the point of view of the caller, the whole harfbuzz+font-replacement-magic+freetype still looks like a “font”?

6:03:56 jackdaniel whole point of this protocol is to unify *some* properties of opaque fonts, so core modules may use them

6:04:03 jackdaniel yes

6:04:10 loke OK, I'm starting to understand.

6:04:15 loke Thank you for clarifying.

6:04:26 jackdaniel sure

6:04:43 loke OK, I'm going out for lunch now, and I will think about this in terms of what I just now realised.

6:05:13 loke But before that, can you explain what part of CLIM would be calling the FONT-GLYPH-HEIGHT function for example?

6:06:41 jackdaniel this accessor is for completeness of the glyph description

6:08:20 jackdaniel otoh height is in fact (- top bottom)

6:09:00 jackdaniel but no, glyph-height is not that, it may be bigger

6:09:16 jackdaniel because it contains top/bottom bearings (if they exist)

6:09:37 jackdaniel *it doesn't contain*, not it contains

6:14:41 jackdaniel either way, I'm not saying that clim parts use it, only that it is there for completeness. I suppose it duplicates ascent+descent, but otoh I can imagine vertical fonts (not ttf I suppose), which have different character height

6:38:49 jackdaniel I think that adding climb:font-string-glyph-codes is justified. that will also make it possible to create a "default" method for font-text-extents

6:43:52 jackdaniel loke: and to address your doubts, here is improved default method for font-character-width

6:43:55 jackdaniel (defgeneric climb:font-character-width (font character)

6:43:57 jackdaniel   (:method (font character)

6:44:00 jackdaniel     (let ((code (alexandria:first-elt (climb:font-string-glyph-codes font (string character)))))

6:44:03 jackdaniel       (+ (climb:font-glyph-left font code)

6:44:05 jackdaniel          (climb:font-glyph-width font code)

6:44:08 jackdaniel          (climb:font-glyph-right font code)))))

6:44:11 loke I see

6:44:15 loke that was educational.

7:23:48 jackdaniel so imho font-text-extents should return 12 values: pixel-wise width,ascent,descent,left,height (multiline text!) font-wise: width*,ascent*,descent*,height* and cursor-dx,cursor-dy which takes into account size and the last glyph (i.e new line)

7:24:22 jackdaniel font-wise should also contain left for RTL direction

7:24:33 jackdaniel (it is missing in the above enumeration)

7:26:19 loke jackdaniel: Should FONT-TEXT-EXTENTS really have to deal with multiline?

7:26:28 loke Shouldn't the splitting into lines happen at a higher level?

7:27:56 jackdaniel splitting into lines may happen at a higher level, but in order to provide correct boxes font-text-extents should take that into account

7:28:11 jackdaniel unless you mean, that we should split lines on level of text-size and text-bounding-rectangle

7:28:22 loke Yes

7:28:33 loke And then each line is measured individually
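
The "split first, then measure each line" arrangement can be sketched as below. This is a minimal illustration, not McCLIM code: SPLIT-LINES and MULTILINE-SIZE are hypothetical names, the per-line measurement is passed in as a function so that any backend could supply it, and the height assumes a constant line height with no extra leading.

```lisp
(defun split-lines (text)
  "Split TEXT on #\\Newline into a list of lines (possibly empty strings)."
  (loop with start = 0
        for end = (position #\Newline text :start start)
        collect (subseq text start end)
        while end
        do (setf start (1+ end))))

(defun multiline-size (text line-width-fn line-height)
  "Return (values WIDTH HEIGHT) for TEXT.
LINE-WIDTH-FN measures a single line; WIDTH is the widest line and
HEIGHT is LINE-HEIGHT times the number of lines."
  (let ((lines (split-lines text)))
    (values (reduce #'max lines :key line-width-fn :initial-value 0)
            (* line-height (length lines)))))
```

For example, with a measuring function of 8 units per character and a line height of 16, a two-line text whose longer line has four characters comes out 32 wide and 32 high. The splitting logic lives in one place, and only the per-line measurement varies.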

7:29:44 jackdaniel and what is a rationale behind splitting such computation between higher and lower abstraction?

7:31:43 jackdaniel (in particular that means, that we'll duplicate algorithm for splitting by lines on both text-size and text-bounding-rectangle)

7:31:59 loke jackdaniel: That it's better to have it at a higher level, since the splitting behaviour should be identical between all backends. I cannot think of any case where one backend needs different behaviour compared to another.

7:32:05 jackdaniel that's not such big concern, but is a point for putting it in font-text-extents

7:34:39 jackdaniel in principle font-text-extents also will be the same for all backends (climb:font-glyph-* climb:font-* function-wise), other specializations will be only for speed

7:35:29 jackdaniel I'm not hard against moving it higher, I just don't see much merit in doing so

7:36:04 loke Well, I gave my opinion on the subject. I don't consider it that important.

7:36:12 loke :-)

7:51:36 jackdaniel as a side note: I can imagine interpretation, where supplied text is considered being a paragraph. In that case first line gets text-style-specific indent of the first line and/or cursor advance being bigger than line leading. but that's just a side-thought

7:53:14 loke jackdaniel: is that really the responsibility of the individual backends?

7:53:29 loke I guess it could be, but I can think of various other problems.

7:54:11 jackdaniel as I said, it is just a side-thought

7:54:28 jackdaniel I'm going to write a default method for font-text-extents operating on other verbs in the protocol

7:54:49 loke One related problem that I can think of is how to actually deal with word-wrapping. The problem is that to do wrapping you need to know the width of the “page”, but the text drawing functions do not have access to this information.

7:55:23 loke (besides, it's impossible to actually acquire that information since you might be drawing to an output record and you don't know where said output record will be placed)

7:55:24 jackdaniel wrapping is indeed done in higher level (and should be)

7:55:28 loke Yeah

7:55:50 jackdaniel I'm not arguing we should do everything in here, but some things may be done

7:55:54 loke So if WW is done at a higher level, I'm not sure I see why the lower level should have to deal with multiline.

7:56:23 jackdaniel because multiline may be dealt with in font-text-extents, wrapping can't

7:56:51 jackdaniel also text-size is something what is done on mediums, not streams

7:56:55 jackdaniel wrapping is a stream thing

7:57:21 loke Yeah, discussing WW might have been a bit too far removed from the issue at hand.

7:58:17 jackdaniel as I see font-text-extents is that it takes the text, and computes its extents

7:58:25 jackdaniel it does no modification to it whatsoever

7:58:36 loke Yes.

8:09:38 hhdave_ ** NICK hhdave