freenode/#sicl - IRC Chatlog
12:06:36
heisig
Yes, I do. (https://github.com/marcoheisig/Petalisp/tree/master/code/type-inference)
12:07:12
heisig
It is very fast and doesn't even cons. Obviously, SICL will have slightly different requirements than Petalisp.
12:08:09
beach
I suppose. I don't quite see how you can take advantage of it, since you are not in control of the underlying Common Lisp implementation.
12:08:50
beach
I suppose you could take advantage of how (say) SBCL optimizes certain declarations etc.
12:10:13
heisig
Petalisp uses the inferred types to store results in specialized arrays, when possible. And it uses the type inference to eliminate certain type checks.
12:11:26
heisig
But as I said, SICL has different requirements. I assume some amount of consing won't hurt, for example.
12:12:26
heisig
The crucial question is how we intend to infer the types of the values returned by a call to a particular function.
12:12:43
heisig
SBCL uses hand-written type inference functions for that, but that is error-prone and a lot of work.
12:13:56
heisig
Right. Standard functions and, possibly, functions that have been declared constant by other means.
12:14:38
heisig
I wonder whether Cleavir could generate type inference functions automatically from the AST or HIR of the original function.
12:16:04
beach
I think so, yes. I mean, the plan is to do type inference on HIR. That's partly why HIR exists after all. So if we are lucky, it will infer the types of the places that are ultimately returned.
12:17:13
beach
I don't think it is good enough to do type inference on source code if that is what you are asking.
12:17:39
heisig
One cool thing about such generated type inference functions would be that they could handle higher-order functions, too.
12:17:44
beach
Transformations that are not possible to express in source code will improve the precision of the inferred types.
12:18:14
heisig
I wonder if functions like MAP or REDUCE could be properly reasoned about automatically.
12:18:30
no-defun-allowed
But then one writes inference rules on HIR instructions. I suppose that could be easier as there are fewer HIR instructions?
12:22:32
beach
But, let me remind y'all that there are these two separate purposes of type inference. One is improved performance. The other is better compile-time messages to the programmer.
12:23:51
beach
Because then a number of possible inferred types won't necessarily make any difference, so could be put off until later, or abandoned altogether.
12:27:10
beach
So to start with for SICL, there are going to be tons of tests for standard-object/not-standard-object, simply because this test is required in order to determine whether to look for a stamp or not.
12:27:42
beach
So lots and lots of discriminating functions will contain some variation of this test.
12:30:20
heisig
But you also want to get (coerce X 'single-float) right, so suddenly you have to worry about symbols and EQL specifiers.
12:33:11
heisig
If the type inference understands EQL types, it can detect that the second argument to COERCE is the symbol SINGLE-FLOAT and hence the result type has to be single-float.
12:33:55
beach
So that would be an example of what the type inferencer would do on the function COERCE.
12:35:48
heisig
It would check whether the second argument is known to be a constant. If so, it could turn that constant into a type (or emit a warning if it is not a type specifier).
12:37:02
heisig
Ideally, that pass could also switch to a specialized version of COERCE. And in the best case where the first argument to COERCE already has the correct type, the call can be optimized away entirely.
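[A minimal sketch of what such a deriver might look like, in plain Common Lisp rather than any actual Cleavir API; DERIVE-COERCE-TYPE and the use of (EQL ...) types to represent known constants are illustrative assumptions, not existing code:]

```lisp
;; Hypothetical type deriver for COERCE, exploiting a constant second
;; argument.  If the argument is known to be a specific constant, e.g.
;; the symbol SINGLE-FLOAT, its inferred type is (EQL SINGLE-FLOAT).
(defun derive-coerce-type (second-arg-type)
  "Given the inferred type of COERCE's second argument, return the
inferred type of the result, or T if nothing can be deduced."
  (if (and (consp second-arg-type)
           (eq (first second-arg-type) 'eql))
      (second second-arg-type)  ; the constant type specifier itself
      t))                       ; unknown argument: fall back to T

;; (derive-coerce-type '(eql single-float)) => SINGLE-FLOAT
;; (derive-coerce-type 'symbol)             => T
```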
12:37:08
beach
Right. So the other thing that we absolutely must take into account is the frequency of occurrence of certain constructs vs the cost of not handling those constructs.
12:38:04
beach
I don't think I have a single occurrence of COERCE in any code I have written. But it might of course be used in some arithmetic function for numeric contagion.
12:40:49
heisig
A trick I use occasionally is (coerce sequence 'simple-vector). Because every sane CL compiler will then optimize the element access.
12:42:55
beach
And access to the vector elements would very likely be done in a loop, so lots of checks would be handled by loop invariants.
12:45:18
heisig
The 'simple' in 'simple-vector' has a different meaning than the 'simple' in 'simple-array'. It also implies that the element type is T.
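[Concretely, the standard defines SIMPLE-VECTOR as (SIMPLE-ARRAY T (*)), so a specialized simple array is not a SIMPLE-VECTOR:]

```lisp
;; SIMPLE-VECTOR constrains the element type to T, unlike SIMPLE-ARRAY:
(typep (make-array 3 :element-type 't)   'simple-vector)          ; => T
(typep (make-array 3 :element-type 'bit) 'simple-vector)          ; => NIL
(typep (make-array 3 :element-type 'bit) '(simple-array bit (*))) ; => T
```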
12:46:01
beach
I am saying that, if the elements of SEQUENCE are then accessed in a loop, there will be tests for that in AREF, and those tests will then become loop invariants, so the test will be done once anyway.
12:47:22
beach
It is possible that I am wrong in some such cases but that's not the point of what I am trying to say right now.
12:48:04
beach
The point is that we should not try to optimize cases that don't matter. Either because they don't happen frequently enough, or because optimizing them won't make a difference.
12:48:39
beach
And I think it is crucial to think that through before putting work into creating code that will then have to be maintained, even though it serves no purpose.
12:49:18
beach
And here "think that through" might involve some profiling to be convinced in some cases.
12:50:07
heisig
I get what you are saying. But I am still trying to come up with a solution that is easy to maintain, correct, and produces high-quality results.
13:10:00
beach
I am asking because, as we have seen in the past, we may need some simple value numbering to infer the types of some variables.
13:15:18
heisig
The current plan is to write a HIR transformation for turning functions that operate on values into functions that operate on types.
13:16:22
heisig
We also have to choose a representation for types. This is what's giving me a headache right now.
13:16:39
beach
I am not sure I understand that idea, but my understanding can wait until you have something more concrete.
13:17:17
beach
The other thing is control flow. We currently don't have a good idea of control flow in more complicated situations. And the type a variable can take on crucially depends on it.
13:19:05
MichaelRaskin
If you have an operation taking only type X and producing type Y, you use it both to infer the type of values dependent on the output and to infer what type the original parameters need to be.
13:19:57
beach
And what do you do with the latter kind of information that doesn't violate the semantics of the language?
13:21:41
beach
heisig: The problem with control flow is shared variables. If you have (f (lambda (y) (setf x y))), then x can be set to anything at any time, because F might create a thread.
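[Spelled out as code; EXAMPLE is a made-up function name, just to illustrate beach's point:]

```lisp
(defun example (f)
  (let ((x 0))
    ;; X starts out as a fixnum, but after this call the inferencer must
    ;; widen its type to T: F may invoke the closure with any value, and
    ;; if F creates a thread, at any time.
    (funcall f (lambda (y) (setf x y)))
    x))
```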
13:23:43
beach
heisig: Because it might be known that F does NOT create a thread. So then the damage is limited.
13:24:43
beach
But the problem is that we need to have as precise a control graph as possible, and I believe Bike gave up on that.
15:24:37
beach
OK, I think I understand how this-parsing-technology-that-can't-be-called-packrat-parsing-because-it-handles-left-recursion works. Next step is to read scymtym's code and documentation for the library named parse.packrat, whatever technique it may use.
20:45:08
Bike
"But the problem is that we need to have as precise a control graph as possible, and I believe Bike gave up on that."? if anything i made it more precise, since function attributes let the compiler know things like that SORT won't call its function in some long-lived thread
20:46:15
Bike
karlosz's local function stuff in general helps since it covers many common cases where you know all call sites for a given function
22:44:36
karlosz
i did a lot of type inference related work on Cleavir, but it's not anywhere close to the sophistication that exists in sbcl
22:45:35
karlosz
i wrote a path-sensitive analysis that worked on HIR, but which lacked bottom-up propagation (think interval type reasoning for arithmetic functions)
22:46:52
karlosz
and then a bottom-up propagation pass for BIR, which actually does okay on basic stuff (no type deriver method support, though that could be easy to add given a good database and function attribute representation) but with no path-sensitive analysis
22:47:28
karlosz
which simultaneously eliminates type checks and does compile-time type checking fairly quickly
22:48:44
karlosz
even just inferring how many values local functions return would be really helpful for better calling conventions
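[For instance, in a sketch like the following (hypothetical code, not from any of the projects discussed), every call site of the local function DOUBLE is visible, so a compiler could prove it always returns exactly one value and skip the general multiple-values protocol:]

```lisp
(defun sum-doubled (list)
  (labels ((double (x) (* 2 x)))
    ;; All call sites of DOUBLE are known, so a single-value calling
    ;; convention can be used instead of the general one.
    (reduce #'+ list :key #'double)))
```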
22:50:49
karlosz
this section of the manual of cmucl (which is missing in sbcl) has a list of all the optimizations i think should at least be present in SICL: https://common-lisp.net/project/cmucl/downloads/doc/cmu-user-2010-05-03/compiler-hint.html
22:51:38
Bike
in the mirtype branch i want to make all the datum types values types and then use that to fix the slowdown so that branch can be merged and we can move closer to good inference
22:54:46
karlosz
Bike: have you read the CMU CL section i linked before? i just found out it has a great explanation of the if-if optimization and a bunch of other stuff
22:56:49
karlosz
i wish i knew about this document sooner; it has literally everything i had to figure out from looking at source code
22:58:55
karlosz
5.3.3 (global function type inference) has the stuff heisig was talking about with deducing the type of a function automatically from its definition
22:59:36
Bike
mm, the fact it's just a flag is... i mean, i've been wondering how to handle the redefinition problem forever now
22:59:49
Bike
the snippets thing would help, but that's a lot of work just to get to where sbcl already is
23:00:01
karlosz
it's still not as powerful as a real type deriver, which actually has to be handwritten, since humans know more about math and stuff
23:02:11
Bike
yeah. i was thinking in mirtype i could shortcut some of the values stuff by doing something similar to how single values were distinguished before, but like, that's a lot of work just to avoid doing other better work
23:02:44
karlosz
like if you could integrate SSAing of lexical variables into the Python framework that would be huge for code with setq and loop analyses
23:04:25
karlosz
also 5.3.5 has this really cool optimization i hadn't seen before and also makes it clear why EQL constraints exist
23:19:04
karlosz
where you can turn (or null structure) tests into just a type test against nil in many situations
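[A hedged illustration of that situation; FOO and FOO-OR-LOSE are made-up names. Given the declaration, every non-NIL value of X is already known to be a FOO, so a compiler with EQL constraints, like CMUCL's Python, can reduce the TYPEP to a NIL comparison:]

```lisp
(defstruct foo)

(defun foo-or-lose (x)
  (declare (type (or null foo) x))
  ;; A full TYPEP would test the structure type; under the declaration
  ;; this can collapse to (NOT (EQ X NIL)).
  (if (typep x 'foo)
      x
      (error "~S is not a FOO" x)))
```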
23:19:36
karlosz
i think the bar we should try to clear is like the last example given in 5.3.6, where that function has NO type checks
23:20:15
karlosz
and notably "This sort of situation shows how precise type checking combined with precise declarations can actually result in reduced type checking."
23:20:18
karlosz
so that is a good counterpoint to the discussion about "approximate type lattices" beach and heisig were discussing