freenode/#clim - IRC Chatlog
2:57:55
slyrus1
perhaps one day I'll go back to beirc and that crap won't happen, but beirc has its own (lack of) stability issues.
4:42:48
slyrus1
So back to what I was saying this morning... If I define a presentation-method on a presentation-type that does not have a corresponding CLOS class, we get the "Cannot find type for specializer" style-warnings. Is this something we should try to fix in define-presentation-method or should we fix all of our code such that there are CLOS classes for our presentation-types?
4:43:26
loke
slyrus1: I think that only happens if you define the presentation-type and use it in the same file
4:43:55
slyrus1
That could be -- but it doesn't happen if you have the corresponding CLOS class, as far as I can tell.
4:49:29
slyrus1
Hmm... I've got a case here where it happens reliably without the defclass definition and gives no error with the defclass definition.
4:49:55
slyrus1
but, nevertheless, we should be able to define presentation types without the CLOS class, from what I gather the spec is saying.
4:58:53
slyrus1
I think the :compile-toplevel and :load-toplevel distinctions between record-presentation-type and %define-presentation-type are bogus.
5:04:06
jackdaniel
the gist of this behavior is this: we need to be able to attach to the CLOS class of that name if it exists
5:04:46
slyrus1
OK, but the eval-when stuff should be handled under the covers by the define-presentation-type stuff, no?
5:04:55
jackdaniel
that's why we expand it differently. if it does not exist (which may be determined at load time), then we have a presentation type without a class backing it
5:06:05
jackdaniel
I'm just saying that it is not bogus, just a little tedious (like defconstant on ccl)
5:06:36
jackdaniel
https://github.com/McCLIM/McCLIM/commit/91d0279165ed7359393b0ad249cb6206d1d6e2b5#diff-9c90cd104f8f00c4e1074bb8b2d676cf
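The compile-time versus load-time split discussed above can be sketched in plain Common Lisp. This is only an illustration of the idea, not McCLIM's actual expansion: `*toy-types*`, `define-toy-type` and `toy-type-status` are hypothetical names made up for this sketch. The macro records the name early enough for later forms in the same file to see it, and checks for a backing class only at load time.

```lisp
;; Hypothetical registry, available at compile time as well as load time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defvar *toy-types* (make-hash-table :test #'eq)))

(defmacro define-toy-type (name)
  "Record NAME at compile time; decide at load time whether a class backs it."
  `(progn
     ;; Seen by the compiler, so later forms in the same file know the name.
     (eval-when (:compile-toplevel :load-toplevel :execute)
       (setf (gethash ',name *toy-types*) :declared))
     ;; Only at load time do we know whether a class of that name exists.
     (eval-when (:load-toplevel :execute)
       (setf (gethash ',name *toy-types*)
             (if (find-class ',name nil) :class-backed :standalone)))
     ',name))

(defun toy-type-status (name)
  (values (gethash name *toy-types*)))

(defclass toy-point () ())
(define-toy-type toy-point)       ; a class of that name exists
(define-toy-type toy-temperature) ; no class of that name
```

With these definitions loaded, `(toy-type-status 'toy-point)` gives `:class-backed` and `(toy-type-status 'toy-temperature)` gives `:standalone`, so the caller never writes `eval-when` by hand.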
5:07:31
jackdaniel
I've added in the meantime climb:font-face, climb:font-size and climb:font-fixed-width
5:10:36
loke
jackdaniel: Is FONT-CHARACTER-WIDTH really necessary? A character is often not representable as a glyph (i.e. it doesn't have a “width” on its own). Usually, in Unicode, a smallest displayable unit is what is called a “grapheme cluster”
5:12:13
loke
jackdaniel: my argument against that one is the same, so can't that one simply map to FONT-STRING-WIDTH (with a single-character string)?
5:12:17
jackdaniel
also font-string-width goes after advance-width, while character-width takes left and right bearing
5:13:00
loke
jackdaniel: But that's very Latin-centric. In most other languages, things are much more complex.
5:13:33
jackdaniel
in non-Latin languages there is no such thing as the width of a standalone character?
5:13:55
loke
Most of the time, you can't talk about a single “character” as something that has a visual representation. The way it's displayed depends on the other unicode characters that surround it (or are attached to it)
5:14:43
jackdaniel
character width is meant for standalone characters. if there isn't such a thing, then you just won't call font-character-width
5:14:58
loke
Or, they kinda do, but the “character” in such languages is not represented by a single Unicode codepoint.
5:15:36
jackdaniel
it is like complaining that you can't call (char string 1), because some languages don't have characters (?)
5:16:16
jackdaniel
there is a property in ttf for the glyph bounding box, I'm sure these languages are representable in this format
5:17:34
loke
“It is important to recognize that what the user thinks of as a “character”, a basic unit of a writing system for a language, may not be just a single Unicode code point. Instead, that basic unit may be made up of multiple Unicode code points. To avoid ambiguity with the computer use of the term character, this is called a user-perceived character. For example, “G” + grave-accent is a user-perceived character: users think of it as a single
5:17:34
loke
character, yet is actually represented by two Unicode code points. These user-perceived characters are approximated by what is called a grapheme cluster, which can be determined programmatically.”
5:18:43
jackdaniel
loke: I believe you that in some alphabets calling font-character-width doesn't make much sense, but it does for what we perceive as a character in common lisp
5:19:25
loke
Because a character in almost (all?) CL implementations is nothing more than a single Unicode codepoint.
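The gap between a Lisp character and a user-perceived character can be seen with the “G” + grave-accent example from the quote. The sketch below assumes an implementation whose characters are Unicode code points (so that `(code-char #x0300)` returns a character), and it treats only the block U+0300..U+036F as combining marks, which is a deliberate simplification and not real grapheme cluster segmentation. The function names are hypothetical.

```lisp
;; Simplifying assumption: only U+0300..U+036F count as combining marks.
(defun combining-diacritic-p (char)
  (<= #x0300 (char-code char) #x036F))

(defun approximate-cluster-count (string)
  "Count the characters in STRING that are not combining diacritics."
  (count-if-not #'combining-diacritic-p string))

;; "G" followed by COMBINING GRAVE ACCENT.
(defparameter *g-grave*
  (coerce (list #\G (code-char #x0300)) 'string))

(length *g-grave*)                    ; counts Lisp characters (code points)
(approximate-cluster-count *g-grave*) ; counts approximate user-perceived characters
```

Here `length` sees two characters while the approximate count sees one, which is why `(char string 1)` on such a string yields the accent rather than a second letter.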
5:19:36
jackdaniel
and I really don't see a reason why we shouldn't include it, especially since we have text-style-character-width which *is* in the clim standard
5:19:57
jackdaniel
loke: so it will work for all characters which have a single unicode codepoint - that's good enough for me
5:20:16
loke
It is, because CLIM was entirely latin-centric and had no concept of combining characters, or anything of the kind.
5:22:28
loke
The point is, that Unicode is complicated, but this is a well-researched topic and there are solutions that work. Implementing a new API to support characters and fonts and ignoring these topics is not the right thing to do.
5:23:04
loke
jackdaniel: The smallest single unit of a “measurable” thing that you can draw is called a “grapheme cluster”. Anything smaller makes no sense.
5:23:06
jackdaniel
implementing an api which supports the specification is the right thing to do - as opposed to saying: this part of the spec is not implemented, because it is too latin-centric
5:23:25
loke
In Latin, all grapheme clusters can be represented by a single unicode codepoint, but this is an exception.
5:24:13
jackdaniel
also, nothing prevents specializing this method on a character which is made of multiple codepoints, what really matters here is a glyph
5:27:08
loke
jackdaniel: Right, and Unicode has one way of doing that, which is as a string of codepoints.
5:27:33
loke
However, as you say, that's not enough to uniquely identify a specific glyph from a font.
5:28:19
jackdaniel
we are interested in character width, if you represent characters as standard-class objects - nothing prevents you from doing that
5:28:21
loke
So are you saying that the reference to ‘character’ in FONT-CHARACTER-WIDTH refers to an abstract object that somehow describes a specific glyph in a font definition?
5:29:19
jackdaniel
sure, it is not specialized whatsoever. I'm not saying I'm going to implement it for strings of codepoints, but nothing prevents such specialization
5:29:40
jackdaniel
such specialization on string would signal an error for something that is not a single character
5:30:07
loke
Because if that's the case, then I think my complaint reduces to the choice of words. “character” is ambiguous, and it should probably be renamed to GLYPH-SPECIFICATION or something like that. What do you think?
5:31:20
loke
jackdaniel: the thing is that in font files, glyphs are numbered, and for basic-latin those glyphs map to the corresponding codepoint, but that's not necessarily the case, and is definitely not the case for most other languages.
5:31:58
loke
So while you can get away with using the glyph index and codepoints interchangeably for many European languages, that simply doesn't work elsewhere.
5:32:35
loke
so in FONT-GLYPH-WIDTH for example, the argument CODE refers to a glyph index, not a unicode codepoint?
5:33:17
loke
When I read the document, I read references to both CODE and CHARACTER as referring to unicode codepoints. If that is not correct, then most of my complaints are invalid.
5:34:04
jackdaniel
this api is general enough that all these may be standard-classes with mixins whatsoever
5:46:46
loke
jackdaniel: then what is needed is a way to transform a string into a sequence of glyph codes.
5:50:34
jackdaniel
I was thinking about adding method generate-glyph to this protocol, but I've rejected the idea because it is very specific to the font used
5:51:45
jackdaniel
regarding latin and identity - for a simple implementation it could be true, but for instance ttf has separate codes for each letter pair so kerning may be done faster
5:52:03
jackdaniel
we don't need to probe kerning-offset each time, it is already part of the advance-width/advance-height of the glyph
5:52:37
loke
jackdaniel: to expand a little... Coming at this from the outside, you have a string, containing a sequence of Unicode codepoints representing some text. The system needs to take this text, and convert this into individual things that can be drawn on specific locations on the screen. These “things” are glyphs that come from font files. The coordinates where to draw said glyphs are computed using some algorithm.
5:54:41
loke
But that's really all there is to it. Boiled down to its essentials, all you need is a way to transform a string into a sequence of glyphs along with the coordinates where those glyphs should be placed relative to each other.
5:56:08
jackdaniel
for instance, in truetype you calculate only the first coordinate of the baseline and render the whole set of glyphs, and xrender uses advance-width/height
5:56:54
jackdaniel
*also* some backend may handle fonts on its own side, so positioning etc is not handled on lisp side at all, you may simply throw a string into its jaws
5:57:22
jackdaniel
so these gfs are for us to figure out some characteristics of the font, but not to prepare drawing ourselves
5:57:30
loke
jackdaniel: Actually, xrender is too specific. Xrender associates an “advance” to each glyph. This doesn't actually work, since the advance can be different for the same glyph depending on context. That's why I had to render each character individually instead of using render-draw-glyphs.
5:58:12
jackdaniel
same glyphs may have different indexes depending on a context (and different properties)
5:58:42
jackdaniel
that's why I say you take perspective which is too arbitrary-implementation-specific
5:58:57
loke
jackdaniel: Hmm... we might be using different terminology. To me they are different glyphs since they have different glyph indexes in the font file.
5:59:37
loke
But, you can have literally the same glyph index in the font file, but you have to use different advance depending on the context in which the character appears.
5:59:55
jackdaniel
sure, and the same glyphs that have the same indexes in the font file may have different indexes in the drawn string
6:01:02
jackdaniel
I take letter pairs, so indexes are [nextchar<<16 + thischar] (or something like that)
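The letter-pair indexing mentioned above ("or something like that") could look roughly like the following. This is a guess at the scheme, not the actual McCLIM code: `pair-glyph-code` and `pair-glyph-chars` are hypothetical names, and the packing assumes both character codes fit in 16 bits, so it would collide for code points at or above #x10000.

```lisp
;; Assumption: both character codes are below #x10000 (fit in 16 bits).
(defun pair-glyph-code (this-char next-char)
  "Pack THIS-CHAR and the character following it into one integer code."
  (+ (ash (char-code next-char) 16)
     (char-code this-char)))

(defun pair-glyph-chars (code)
  "Recover the two characters packed by PAIR-GLYPH-CODE."
  (values (code-char (ldb (byte 16 0) code))
          (code-char (ldb (byte 16 16) code))))

;; The same letter gets a different code depending on what follows it,
;; so a pair-specific advance can be stored per code.
(pair-glyph-code #\A #\V)
(pair-glyph-code #\A #\B)
```

Under this scheme the "A" in "AV" and the "A" in "AB" are distinct entries, which is the sense in which the same glyph may have different indexes in the drawn string.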
6:01:12
loke
jackdaniel: Yes, but even if you have the exact same entry in the glyphset you might still end up with different advances and offsets.
6:01:28
slyrus1
loke: what kind of eval-when hackery do you use to get around the aforementioned warning?
6:01:48
jackdaniel
my point is that you create a different entry in the glyphset for glyphs with different advances
6:02:35
loke
jackdaniel: If you want to be able to render Hindi, you cannot simply take a sequence of glyph entry indexes and be able to draw it.
6:03:02
loke
jackdaniel: When you say “font”, you're not referring to, like, an OTF font or whatever?
6:03:26
loke
jackdaniel: So are you saying that I would put the Harfbuzz magic as part of the “font” itself?
6:03:55
loke
so from the point of view of the caller, the whole harfbuzz+font-replacement-magic+freetype still looks like a “font”?
6:03:56
jackdaniel
whole point of this protocol is to unify *some* properties of opaque fonts, so core modules may use them
6:04:43
loke
OK, I'm going out for lunch now, and I will think about this in terms of what I just now realised.
6:05:13
loke
But before that, can you explain what part of CLIM would be calling the FONT-GLYPH-HEIGHT function for example?
6:14:41
jackdaniel
either way, I'm not saying that clim parts use it, only that it is used for completeness. I suppose it duplicates ascent+descent, but otoh I can imagine vertical fonts (not ttf I suppose), which have different character height
6:38:49
jackdaniel
I think that adding climb:font-string-glyph-codes is justified. that will also make it possible to create a "default" method for font-text-extents
6:43:52
jackdaniel
loke: and to address your doubts, here is improved default method for font-character-width
6:44:00
jackdaniel
(let ((code (alexandria:first-elt (climb:font-string-glyph-codes (string character)))))
7:23:48
jackdaniel
so imho font-text-extents should return 12 values: pixel-wise width,ascent,descent,left,height (multiline text!) font-wise: width*,ascent*,descent*,height* and cursor-dx,cursor-dy which takes into account size and the last glyph (i.e new line)
7:27:56
jackdaniel
splitting into lines may happen at a higher level, but in order to provide correct boxes font-text-extents should take that into account
7:28:11
jackdaniel
unless you mean, that we should split lines on level of text-size and text-bounding-rectangle
7:29:44
jackdaniel
and what is a rationale behind splitting such computation between higher and lower abstraction?
7:31:43
jackdaniel
(in particular that means, that we'll duplicate algorithm for splitting by lines on both text-size and text-bounding-rectangle)
7:31:59
loke
jackdaniel: That it's better to have it at a higher level, since the splitting behaviour should be identical between all backends. I cannot think of any case where a backend needs different behaviour compared to another.
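A minimal sketch of the "split at a higher level" position, in plain Common Lisp: the line splitting is backend-independent, and the backend only supplies a single-line width function. `split-lines` and `multiline-extent` are hypothetical names for illustration, not part of the climb protocol, and the fixed line height ignores leading and per-line ascent or descent.

```lisp
(defun split-lines (text)
  "Split TEXT on #\\Newline into a list of strings."
  (loop with start = 0
        for end = (position #\Newline text :start start)
        collect (subseq text start end)
        while end
        do (setf start (1+ end))))

(defun multiline-extent (text line-width-fn line-height)
  "Return the width and height of TEXT as two values.
LINE-WIDTH-FN stands in for a backend call that measures one line."
  (let ((lines (split-lines text)))
    (values (loop for line in lines
                  maximize (funcall line-width-fn line))
            (* line-height (length lines)))))

;; Toy backend: every character is 8 units wide, every line 10 units high.
(multiline-extent "ab
cdef"
                  (lambda (line) (* 8 (length line)))
                  10)
```

With the toy width function, the widest line is "cdef" and there are two lines, so the two values are 32 and 20. Only the lambda would differ between backends in this arrangement.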
7:34:39
jackdaniel
in principle font-text-extents also will be the same for all backends (climb:font-glyph-* climb:font-* function-wise), other specializations will be only for speed
7:51:36
jackdaniel
as a side note: I can imagine an interpretation where the supplied text is considered to be a paragraph. In that case the first line gets a text-style-specific indent and/or a cursor advance bigger than the line leading. but that's just a side-thought
7:54:28
jackdaniel
I'm going to write a default method for font-text-extents operating on other verbs in the protocol
7:54:49
loke
One related problem that I can think of is how to actually deal with word-wrapping. The problem is that to do wrapping you need to know the width of the “page”, but the text drawing functions do not have access to this information.
7:55:23
loke
(besides, it's impossible to actually acquire that information since you might be drawing to an output record and you don't know where said output record will be placed)
7:55:54
loke
So if WW is done at a higher level, I'm not sure I see why the lower level should have to deal with multiline.